fix low hanging fruits for render performance #4485
Compile Times benchmark
Note that these numbers may fluctuate on the CI servers, so take them with a grain of salt. All benchmark results are based on the mean time, and a negative percentage means faster than the base branch. Note that GLMakie + WGLMakie run on an emulated GPU, so the runtime benchmark is much slower. Results are from running:
using_time = @ctime using Backend
# Compile time
create_time = @ctime fig = scatter(1:4; color=1:4, colormap=:turbo, markersize=20, visible=true)
display_time = @ctime Makie.colorbuffer(display(fig))
# Runtime
create_time = @benchmark fig = scatter(1:4; color=1:4, colormap=:turbo, markersize=20, visible=true)
display_time = @benchmark Makie.colorbuffer(fig)
This is failing because lines drop the |
Benchmark Results
SHA: 96d592f9586ac63ff051fb7abc3db079f8db68a2
Warning: These results are subject to substantial noise because GitHub's CI runs on shared machines that are not ideally suited for benchmarking.
Same benchmark code, different sorting options:
Calling Moving around some clip planes code to make |
@@ -31,7 +31,12 @@ function render_frame(screen::Screen; resize_buffers=true)
    ShaderAbstractions.switch_context!(nw)

    function sortby(x)
        return x[3][:model][][3, 4]
        robj = x[3]
        plot = screen.cache2plot[robj.id]
The plot lookup was the expensive bit here, so would be nice if we could avoid it!
Dict lookups are O(1) and take somewhere around 10 ns. My benchmarks for the current solution show only a 3% difference from the old one.
Using robj[:model] is also what made this PR fail, because lines with linestyles apply the model matrix on the CPU, so it does not end up in the uniforms. I would also expect this to fail subtly with f32 converts, because those can change the model matrix (set it to I after applying it on the CPU).
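The failure mode described in this thread can be illustrated with a small, self-contained sketch. It assumes nothing about Makie internals: `PlotStub`, `RenderObjStub`, `translate_z`, `sortby_model` and `sortby_lookup` are hypothetical names invented for this example, and the depth stored on `PlotStub` only stands in for whatever the real plot object would report.

```julia
using LinearAlgebra  # provides the identity `I`

# Hypothetical stand-ins; these are NOT Makie's or GLMakie's real types.
struct PlotStub
    z::Float64                 # depth the plot should be sorted by
end

struct RenderObjStub
    id::Int
    model::Matrix{Float64}     # 4x4 model matrix as it would be uploaded as a uniform
end

# 4x4 matrix that translates geometry by `z` along the z axis.
translate_z(z) = [1.0 0 0 0; 0 1 0 0; 0 0 1 z; 0 0 0 1]

# Key 1: read the z translation directly from the uploaded model matrix.
sortby_model(robj::RenderObjStub) = robj.model[3, 4]

# Key 2: look the plot up by render object id and use its depth.
sortby_lookup(cache2plot::Dict{Int,PlotStub}, robj::RenderObjStub) = cache2plot[robj.id].z

cache2plot = Dict(1 => PlotStub(2.0), 2 => PlotStub(1.0))

robjs = [
    # Transform already applied on the CPU, so only the identity is uploaded.
    RenderObjStub(1, Matrix{Float64}(I, 4, 4)),
    RenderObjStub(2, translate_z(1.0)),
]

order_model  = [r.id for r in sort(robjs; by=sortby_model)]
order_lookup = [r.id for r in sort(robjs; by=r -> sortby_lookup(cache2plot, r))]

println("sorted by model matrix: ", order_model)
println("sorted by plot lookup:  ", order_lookup)
```

In this toy setup the two keys disagree: the object whose transform was applied on the CPU reports a z translation of zero through its model matrix, so it sorts in front of an object it should be behind. The lookup-based key is unaffected because it never reads the uploaded matrix.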
Description
Makie master
118.770 μs (1193 allocations: 30.41 KiB)
With sorting change
104.764 μs (729 allocations: 25.48 KiB)
With framebuffer_size optimization
85.648 μs (727 allocations: 25.45 KiB)
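To relate these timings to the convention used in the benchmark comments (negative percent means faster than the base), the relative changes can be computed as in the sketch below. The numbers are copied from the three measurements above; `percent_change` is a helper name introduced only for this example.

```julia
# Timings (in μs) and allocation counts copied from the Description above.
results = [
    ("Makie master",                       118.770, 1193),
    ("With sorting change",                104.764,  729),
    ("With framebuffer_size optimization",  85.648,  727),
]

# Percent change relative to a baseline; negative means faster (or fewer allocations).
percent_change(new, base) = 100 * (new - base) / base

base_time, base_allocs = results[1][2], results[1][3]

for (label, time_us, allocs) in results
    dt = round(percent_change(time_us, base_time); digits=1)
    da = round(percent_change(allocs, base_allocs); digits=1)
    println(rpad(label, 36), lpad(dt, 7), "% time", lpad(da, 7), "% allocations")
end
```

Relative to master, the sorting change comes out roughly 12% faster and the framebuffer_size optimization roughly 28% faster, with about 39% fewer allocations in both cases.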