Hi there! As previously announced, WebRender has made it to the stable channel, and a couple of million users are now using it without having opted in manually. With this important milestone behind us, now is a good time to widen the scope of the newsletter and give credit to other projects being worked on by members of the graphics team.
The WebRender newsletter therefore becomes the gfx newsletter. This is still far from an exhaustive list of the work done by the team, just a few highlights in WebRender and graphics in general. I am hoping to keep the pace at around one post per month; we’ll see where things go from there.
What’s new in gfx
Async zoom for desktop Firefox
Botond has been working on desktop zooming:
- The work is currently focused on the ability to use pinch gestures to zoom (scaling only, no reflow, like on mobile) on desktop platforms.
- The initial focus is on touchscreens, with support for touchpads to follow.
- We hope to have this ready for some early adopters to try out in the coming weeks.
WebGL power usage
Jeff Gilbert has been working on power preference options for WebGL.
WebGL has three power preference options available during canvas context creation: “default”, “low-power”, and “high-performance”.
The vast majority of web content implicitly requests “default”. Since we don’t want to regress web content performance, we usually treat “default” like “high-performance”. On macOS with multiple GPUs (MacBook Pro), this means activating the power-hungry dedicated GPU for almost all WebGL contexts. While this keeps our performance high, it also means every last one-off or transient WebGL context from ads, trackers, and fingerprinters will keep the high-power GPU running until they are garbage-collected.
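As a minimal sketch of what requesting a power preference looks like from the content side: the `powerPreference` context attribute and its three values come from the WebGL specification, while `CanvasLike` and `createWebGLContext` are hypothetical names introduced here so the snippet stays self-contained.

```typescript
// The three values the WebGL specification defines for `powerPreference`.
type PowerPreference = "default" | "low-power" | "high-performance";

// Hypothetical minimal shape of a canvas, used so this sketch does not
// depend on DOM typings. In a page this would be an HTMLCanvasElement.
interface CanvasLike {
  getContext(
    contextId: "webgl",
    attributes?: { powerPreference?: PowerPreference },
  ): unknown;
}

// Hypothetical helper: creates a WebGL context with an explicit power
// preference. Leaving the attribute out is equivalent to "default".
function createWebGLContext(
  canvas: CanvasLike,
  powerPreference: PowerPreference = "default",
): unknown {
  return canvas.getContext("webgl", { powerPreference });
}
```

A page that only draws a small, occasional effect could pass "low-power" here. Most content passes nothing at all, which is why the handling of “default” matters so much.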
In bug 1562812, Jeff added a ramp-up/ramp-down behavior for “default”: for the first couple of seconds after WebGL context creation, things stay on the low-power GPU. After that grace period, if the context continues to render frames for presentation, we migrate to the high-power GPU. Then, if the context stops producing frames for a couple of seconds, we release our lock on the high-power GPU to try to migrate back to the low-power GPU.
What this means is that active WebGL content should fairly quickly end up ramped-up onto the high-power GPU, but inactive and orphaned WebGL contexts won’t keep the browser locked on to the high-power GPU anymore, which means better battery life for users on these machines as they wander the web.
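The ramp-up/ramp-down behavior can be pictured as a small state machine. The sketch below is only an illustration of the idea, not the code that landed in bug 1562812: the class name, method names, and the two 2000 ms thresholds are assumptions (the description above only says "a couple of seconds").

```typescript
type Gpu = "low-power" | "high-power";

// Hypothetical model of the "default" power preference ramp.
class DefaultPowerRamp {
  private gpu: Gpu = "low-power"; // new contexts start on the low-power GPU
  private readonly createdMs: number;
  private readonly graceMs: number; // assumed grace period after creation
  private readonly idleMs: number; // assumed idle time before ramping down
  private lastFrameMs: number;

  constructor(createdMs: number, graceMs = 2000, idleMs = 2000) {
    this.createdMs = createdMs;
    this.graceMs = graceMs;
    this.idleMs = idleMs;
    this.lastFrameMs = createdMs;
  }

  // Called whenever the context presents a frame.
  onFramePresented(nowMs: number): Gpu {
    this.lastFrameMs = nowMs;
    // Still rendering after the grace period: ramp up.
    if (nowMs - this.createdMs >= this.graceMs) {
      this.gpu = "high-power";
    }
    return this.gpu;
  }

  // Called periodically, whether or not frames are being produced.
  onTick(nowMs: number): Gpu {
    // No frames for a while: release the high-power GPU.
    if (this.gpu === "high-power" && nowMs - this.lastFrameMs >= this.idleMs) {
      this.gpu = "low-power";
    }
    return this.gpu;
  }
}
```

With these assumed thresholds, a context that draws one frame and goes quiet never leaves the low-power GPU, while one that keeps presenting frames is promoted once the grace period has elapsed and demoted again after it stops.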
DisplayList building optimization
Miko, Matt, Timothy and Dan have worked on improving display list build times:
- The two main areas of improvement have been avoiding unnecessary work during display list merging, and improving the memory access patterns during display list building.
- The improved display list merging algorithm utilizes the invalidation assumptions of the frame tree, and avoids preprocessing sub display lists that cannot have changed. (bug 1544948)
- Some commonly used display items have drastically shrunk in size, which has reduced memory usage and allocations. For example, the size of the transform display item went down from 1024 bytes to 512 bytes. (bug 1502049, bug 1526941)
- The display item size improvements have also tangentially helped with caching and prefetching performance. For example, the base display item state booleans were collapsed into a bit field and moved to the first cache line. (bug 1526972, bug 1540785)
- Retained display lists were enabled for Android devices and for the parent process. (bugs 1413567 and 1413546)
- Telemetry probes show that since the Orlando All Hands, the mean display list build time has gone down by 40%, from ~1.8ms to ~1.1ms. The 95th percentile has gone down by 30%, from ~6.2ms to ~4.4ms.
- While these numbers might seem low, they are still a considerable proportion of the target 16ms frame budget. There is more promising follow-up work scheduled in bugs 1539597 and 1554503.
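To put the telemetry figures above in perspective, here is a small sketch that recomputes the reductions and the share of the frame budget each figure represents. The helper names are hypothetical; the input numbers are the approximate ones quoted in the list, and the 16 ms budget is the one stated above.

```typescript
// Percentage by which a timing went down.
function percentReduction(beforeMs: number, afterMs: number): number {
  return ((beforeMs - afterMs) / beforeMs) * 100;
}

// Percentage of the frame budget consumed by a given duration.
function budgetShare(durationMs: number, budgetMs = 16): number {
  return (durationMs / budgetMs) * 100;
}

// Approximate figures quoted above.
const meanReduction = percentReduction(1.8, 1.1); // roughly 39%
const p95Reduction = percentReduction(6.2, 4.4); // roughly 29%
const meanShare = budgetShare(1.1); // roughly 7% of a 16 ms frame
const p95Share = budgetShare(4.4); // roughly 27.5% of a 16 ms frame

console.log(meanReduction.toFixed(1), p95Reduction.toFixed(1));
console.log(meanShare.toFixed(1), p95Share.toFixed(1));
```

Even after the improvements, display list building at the 95th percentile still takes more than a quarter of the frame budget, which is why the follow-up work matters.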
What’s new in WebRender
Software backend investigations
Glenn, Jeff Muizelaar and Jeff Gilbert are investigating running WebRender on top of SwiftShader or llvmpipe when the available GPU feature set is too small or too buggy. The hope is that these emulation layers will help us quickly migrate users that can’t have hardware acceleration to WebRender with a minimal amount of specific code (most likely some simple specialized blitting routines in a few hot spots where the emulation layer is unlikely to provide optimal speed).
It’s too early to tell at this point whether this experiment will pan out. We can see some (expected) regressions compared to the non-WebRender software backend, but with some amount of optimization it’s probable that performance will get close enough to provide an acceptable user experience.
This investigation is important since it will determine how much code we have to maintain specifically for non-accelerated users, a proportion of users that will decrease over time but will probably never quite get to zero.
Pathfinder 3 investigations
Nical spent some time in the last quarter familiarizing himself with Pathfinder’s internals and investigating its viability for rendering SVG paths in WebRender/Firefox. He wrote a blog post explaining in detail how Pathfinder fills paths on the GPU.
The outcome of this investigation is that we are optimistic about pathfinder’s approach and think it’s a good fit for integration in Firefox. In particular the approach is flexible and robust, and will let us improve or replace parts in the longer run.
Nical is also experimenting with a different tiling algorithm, aiming at reducing the CPU overhead.
The next steps are to prototype an integration of pathfinder 3 in WebRender, start using it for simple primitives such as CSS clip paths and gradually use it with more primitives.
Picture caching improvements
Glenn continues his work on picture caching. At the moment WebRender only triggers the optimization for a single scroll root. Glenn has landed a lot of infrastructure work to be able to benefit from picture caching at a more granular level (for example, one cached slice per scroll root), and to pave the way for rendering picture-cached slices directly into OS compositor surfaces.
The next steps are, roughly:
- Add a couple of follow-up optimizations, such as exact dirty rectangles for smaller updates, multi-resolution tiles, and detection of tiles that are solid colors.
- Enable caching on content + UI explicitly (as an intermediate step, to give caching on the UI).
- Implement the right support for multiple cached slices that have transparency and subpixel text anti-aliasing.
- Enable fine-grained caching on multiple slices.
- Expose those multiple cached slices to the OS compositor for power savings and better scrolling performance where supported.
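To illustrate the "exact dirty rectangles" idea from the list above, here is a rough sketch that clips a dirty rectangle against a grid of cached tiles, so that only the affected part of each tile would need to be redrawn. This is an assumption-laden illustration, not WebRender's implementation: the `Rect` type, the `dirtyRectsPerTile` function, and the fixed square tile grid are all hypothetical.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface TileDirtyRect {
  tileX: number; // tile column index
  tileY: number; // tile row index
  dirty: Rect; // part of the tile to redraw, in picture coordinates
}

// Hypothetical helper: for each tile overlapped by `dirty`, return the
// intersection of the dirty rectangle with that tile's bounds.
function dirtyRectsPerTile(
  dirty: Rect,
  tileSize: number,
  cols: number,
  rows: number,
): TileDirtyRect[] {
  const result: TileDirtyRect[] = [];
  if (dirty.width <= 0 || dirty.height <= 0) return result;

  // Range of tiles touched by the dirty rectangle, clamped to the grid.
  const x0 = Math.max(0, Math.floor(dirty.x / tileSize));
  const y0 = Math.max(0, Math.floor(dirty.y / tileSize));
  const x1 = Math.min(cols, Math.ceil((dirty.x + dirty.width) / tileSize));
  const y1 = Math.min(rows, Math.ceil((dirty.y + dirty.height) / tileSize));

  for (let ty = y0; ty < y1; ty++) {
    for (let tx = x0; tx < x1; tx++) {
      // Clip the dirty rectangle to this tile's bounds.
      const left = Math.max(dirty.x, tx * tileSize);
      const top = Math.max(dirty.y, ty * tileSize);
      const right = Math.min(dirty.x + dirty.width, (tx + 1) * tileSize);
      const bottom = Math.min(dirty.y + dirty.height, (ty + 1) * tileSize);
      result.push({
        tileX: tx,
        tileY: ty,
        dirty: { x: left, y: top, width: right - left, height: bottom - top },
      });
    }
  }
  return result;
}
```

For a small update such as a blinking cursor, this kind of clipping means redrawing a few pixels inside one tile rather than the whole tile.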
WebRender on Android
Thanks to the continuous efforts of Jamie and Kats, WebRender is starting to be pretty solid on Android. GeckoView is required to enable WebRender, so it isn’t currently possible to enable it in Firefox for Android, but we plan to make the option available in the Firefox Preview browser (which is powered by GeckoView) once it has nightly builds.