2019 Toronto Wednesday
End of 2019 goals:
- all platforms (a bit of everything)
- Linux (wayland prioritized)
- MVP Android
- Laptops
- more Windows (include Win7/8)
- Beta
Android:
- glyph zooming $
- PLS optimizations
- GLES 2.0
- RGBA/swizzling
- disable array textures $
- tests
- fix getScaledFont crash
Linux:
- shader binary support
- blacklisting
- Wayland support
- fix vsync
Mac:
- Core Animation (CA) presentation
- CA document splitting
- CA WebGL
- blob recording of native themes:
- don't rasterize themes in content process
- texture uploads
- testing coverage
Picture caching:
- version 2.0
- universal picture interning
- cache filter outputs (blur)
- cache and share clip masks
- use blits for tiles $
Direct composition:
- scrolling
- document rendering
- WebGL
- video
- Windows 7 presentation
- Angle subpx extension
- test suites run with WR
- replace D2D with WR/Canvas2D
Threading:
- use fewer threads
- multi-thread scene building
- parallel task scheduling that isn't Rayon $$$
- non-blocking hit-testing
- remove IPC channel support
Display list:
- delta encoding
- spatial/clip trees
- make items tighter $
- reduce scene building times
- data pipe:
- better way to pass data through IPC
- more animated properties $$
Blobs:
- recoordination
- bounds changing invalidation
- SVG filters
- (some form of) path rendering
- image font performance
- global locks in Skia
- single global context
- clip paths on GPU
Software WR:
- test LLVM pipe
- ship SwiftShader
- pick low-hanging fruit
Performance:
- enable document splitting
- gradient fast path
- make box shadows first-class primitives
- improve opaque pass fragment count
- optimize resource bindings
- optimize clip mask renderings
- local space raster scale $$
- SIMD optimizations
- render task graph 2.0
- proof of concept Vulkan/D3D12/Metal $$$
- better primitive culling
- animation jank at 60fps (frame scheduling)
- consider spatial culling structure
- optimize Intel GPU perf
- make FPS shooter fast
- BGRA8 and swizzling support
- glyph cache optimizations
- mipping
- size/scaling re-use
- sharing between windows
Tooling:
- Android mobile profiling tools
- multi-frame WR captures
- WR capture tiled blobs
- picture caching debugging infrastructure $
Correctness:
- WR 67 bugs
- WR 68 bugs
- snapping!
Refactor:
- remove Cairo
- rename Document and Pipeline terms
- tech debt cleanup
- rename some modules: tiling.rs, clip_scroll_tree.rs
Other:
- hire more engineers
- compile the list of websites we are good at (better than Chrome)
Fission:
- move ImageLib to WR process
- move font management to WR process
Security:
- fuzzing
- font sanitization
(??)
Goals:
- avoid doing work more than once (when a clip affects multiple primitives)
- avoid doing work on fully opaque areas of the clip
- simplify the cs_clip shaders
Ideas to explore:
- Clip mask inversion if we know that it's more 0 than 1.
- Use stencil. Potentially, test stencil for each clip.
- Share clips between items (under conditions).
Gather data about:
- number of clips affecting items
- average ratio of a clip area to the sum area of all clips
- how widely image masks are used
- how often the clip is shared between primitives
- what is the ratio of total primitive area versus the clip area
- are sub-pixel offsets of the primitives different?
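Two of the measurements listed above (number of clips affecting items, and how often a clip is shared between primitives) could be collected with a small pass over the primitive list. This is a minimal sketch, not WebRender code: `PrimClips`, `ClipStats`, and `gather_clip_stats` are hypothetical names, and it assumes each primitive lists a given clip id at most once.

```rust
use std::collections::HashMap;

/// Hypothetical record: the ids of the clips affecting one primitive.
/// Assumption: a clip id appears at most once per primitive.
struct PrimClips {
    clip_ids: Vec<u32>,
}

/// Hypothetical summary of the gathered data.
#[derive(Debug, PartialEq)]
struct ClipStats {
    /// Average number of clips affecting a primitive.
    avg_clips_per_prim: f32,
    /// Fraction of distinct clips used by more than one primitive.
    shared_clip_fraction: f32,
}

fn gather_clip_stats(prims: &[PrimClips]) -> ClipStats {
    // Maps a clip id to the number of primitives that reference it.
    let mut users: HashMap<u32, u32> = HashMap::new();
    let mut total_refs = 0usize;

    for prim in prims {
        total_refs += prim.clip_ids.len();
        for &id in &prim.clip_ids {
            *users.entry(id).or_insert(0) += 1;
        }
    }

    let shared = users.values().filter(|&&n| n > 1).count();

    ClipStats {
        avg_clips_per_prim: if prims.is_empty() {
            0.0
        } else {
            total_refs as f32 / prims.len() as f32
        },
        shared_clip_fraction: if users.is_empty() {
            0.0
        } else {
            shared as f32 / users.len() as f32
        },
    }
}
```

The area-based ratios in the list would need clip and primitive rectangles as additional input, which this sketch leaves out.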
We need to get back to a point where clips are fast-cleared to 1. This requires disabling scissor and re-evaluating performance against the current path that tries to render the first clip without blending. We can still render the corners of the first clip without blending. We don't need to render the opaque areas at all.
Need automated infrastructure that modifies the reftests:
- scaling both ref and image
- switch some of the pictures to have their own surfaces
- change the node in spatial tree where we switch to screen space rasterization
On Android, we don't always have BGRA8 internal format with glTexStorage. On macOS, we never have that.
Choices are:
- use glTexImage2D to make BGRA8 our internal format. Pay for mipmap allocation in VRAM. (Currently used on Android)
- use glTexStorage(RGBA8) and pay for conversion of data from BGRA8. (Currently used on Mac).
  - we can convince ImageLib to produce RGBA8 data in the first place
- use glTexStorage(RGBA8) and pretend the data is in RGBA8, but use a swizzling sampler state when reading from it.
  - as a follow up, we can make some of the cached render tasks to produce BGRA8 right away, so that texture cache entries have more consistent swizzling
- use texture rectangles with BGRA8 internal format. Requires us to remove texture arrays.
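The difference between converting the data up front and swizzling on read can be illustrated on the CPU. This is a minimal sketch under stated assumptions: `Swizzle`, `convert_bgra8_to_rgba8`, and `sample` are hypothetical names, and `sample` only stands in for what a GPU sampler with a swizzle state would do; none of this is WebRender's actual code.

```rust
/// Hypothetical channel order applied when reading a texel.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Swizzle {
    /// Bytes are returned in the order they are stored.
    Rgba,
    /// The first and third bytes are exchanged on read.
    Bgra,
}

/// Conversion choice: pay for a CPU pass before upload.
/// Swaps the first and third byte of every 4-byte pixel in place.
fn convert_bgra8_to_rgba8(pixels: &mut [u8]) {
    for px in pixels.chunks_exact_mut(4) {
        px.swap(0, 2);
    }
}

/// Swizzling choice: keep the stored bytes untouched and reorder on read.
/// Returns the texel at `index` as [r, g, b, a].
fn sample(pixels: &[u8], index: usize, swizzle: Swizzle) -> [u8; 4] {
    let px = &pixels[index * 4..index * 4 + 4];
    match swizzle {
        Swizzle::Rgba => [px[0], px[1], px[2], px[3]],
        Swizzle::Bgra => [px[2], px[1], px[0], px[3]],
    }
}
```

Both paths yield the same colors; the conversion costs a pass over the data once per upload, while the swizzle requires every texture cache entry to remember which order its bytes are in.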
Core idea:
- move interning logic from scene building to the API side
- the current DL builder would just intern everything as an implementation detail
- another DL builder would work with primitive handles and update vectors
- there is a benefit of providing structure of update arrays, especially if those don't have any variable-encoded enums inside
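The handle-and-update-vector idea above can be sketched as a tiny interner. This is an illustration only: `Handle`, `Insert`, and `Interner` are hypothetical names, the string key stands in for whatever a real primitive key would be, and the sketch covers insertions but not removals.

```rust
use std::collections::HashMap;

/// Hypothetical stable handle to an interned primitive.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Handle(u32);

/// Hypothetical update record: a flat struct rather than a variable-encoded enum.
#[derive(Clone, Debug, PartialEq)]
struct Insert {
    handle: Handle,
    key: String,
}

#[derive(Default)]
struct Interner {
    /// Maps a primitive key to its handle.
    map: HashMap<String, Handle>,
    /// Insertions recorded since the last `take_updates` call.
    updates: Vec<Insert>,
}

impl Interner {
    /// Returns the existing handle for `key`, or allocates a new one
    /// and records the insertion in the update vector.
    fn intern(&mut self, key: &str) -> Handle {
        if let Some(&handle) = self.map.get(key) {
            return handle;
        }
        let handle = Handle(self.map.len() as u32);
        self.map.insert(key.to_owned(), handle);
        self.updates.push(Insert { handle, key: key.to_owned() });
        handle
    }

    /// Hands over the accumulated updates and leaves an empty vector behind.
    fn take_updates(&mut self) -> Vec<Insert> {
        std::mem::take(&mut self.updates)
    }
}
```

A display list builder built on this would send only the handles plus the drained update vector, so a repeated primitive costs one handle instead of a full re-encoding.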
DL restructuring:
- provide spatial tree, clip tree, picture tree and potentially a hit test tree
Picture cache slices:
- Introduced by:
  - WebGL, canvas, video elements
  - Scroll roots (if using for performance within WR / low-end GPUs)
- Don't want to do component alpha blend, because:
  - It's not supported by OS compositors.
  - If we are doing slices for internal WR reasons (performance) we probably don't want to render twice anyway.
- For each slice:
  - Try to determine if opaque.
    - If yes, enable subpixel AA.
    - Otherwise, use grayscale AA.
- Various possible options for switching between subpx / gray AA:
  - Consider a sticky downgrade where an interned text run stays gray after downgrading.
  - Might be OK to switch between them.
- Consider using framebuffer fetch as a follow up.
  - Interpolate between subpx / grayscale based on fragment alpha.
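The per-slice AA choice, the sticky downgrade, and the interpolation idea above can be sketched as follows. All names here (`AaMode`, `select_aa`, `TextRun`, `blend_coverage`) are hypothetical, and the sketch assumes an alpha of 1 means full subpixel coverage and 0 means full grayscale coverage; the notes do not specify the exact blend.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum AaMode {
    Subpixel,
    Grayscale,
}

/// Per-slice choice: opaque slices get subpixel AA, others grayscale AA.
fn select_aa(slice_is_opaque: bool) -> AaMode {
    if slice_is_opaque {
        AaMode::Subpixel
    } else {
        AaMode::Grayscale
    }
}

/// Hypothetical interned text run with a sticky downgrade:
/// once it has been rendered gray, it stays gray.
struct TextRun {
    mode: AaMode,
}

impl TextRun {
    fn new() -> Self {
        TextRun { mode: AaMode::Subpixel }
    }

    /// Updates the run for the slice it currently lands on and
    /// returns the mode to render with.
    fn update(&mut self, slice_is_opaque: bool) -> AaMode {
        if select_aa(slice_is_opaque) == AaMode::Grayscale {
            self.mode = AaMode::Grayscale;
        }
        self.mode
    }
}

/// Framebuffer fetch follow-up: linear interpolation from grayscale
/// coverage (alpha = 0) to per-channel subpixel coverage (alpha = 1).
fn blend_coverage(subpx: [f32; 3], gray: f32, alpha: f32) -> [f32; 3] {
    let t = alpha.clamp(0.0, 1.0);
    [
        gray + (subpx[0] - gray) * t,
        gray + (subpx[1] - gray) * t,
        gray + (subpx[2] - gray) * t,
    ]
}
```

The sticky variant avoids visible flicker when a slice alternates between opaque and translucent, at the cost of never returning to subpixel AA for that run.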