-
-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Larger than memory heatmaps #973
Comments
from #443 we got an example, that shows how one can quite easily implement some calculating/fetching of data when zooming in/out: using GLMakie
fig = Figure()
ax = Axis(fig[1, 1])
function datashader(limits, pixelarea)
# return your heatmap data
# here, I just calculate a sine field as a demo
xpixels, ypixels = widths(pixelarea)
xmin, ymin = minimum(limits)
xmax, ymax = maximum(limits)
[cos(x) * sin(y) for x in LinRange(xmin, xmax, xpixels),
y in LinRange(ymin, ymax, ypixels)]
end
xrange = lift(x -> minimum(x)[1] .. maximum(x)[1], ax.finallimits)
yrange = lift(x -> minimum(x)[2] .. maximum(x)[2], ax.finallimits)
pixels = lift(datashader, ax.finallimits, ax.scene.px_area)
heatmap!(ax, xrange, yrange, pixels,
xautolimits = false, yautolimits = false)
display(fig) |
Thanks! Amazing that can be done already. Is it possible to put this in a recipe? It would be good if GeoData.jl could do this by default above a certain size. |
Yeah I'd be happy to... I can help with putting it in a recipe, if someone extends the above script with the required interpolation math etc ;) |
Oh that's ok I can do it, I just wondered if it was possible at all! But didn't make that so clear. I guess I'll let you know if I hit any snags. |
@rafaqz hey, so... how are you actually using |
Yep, it works fine in basic scripts. Im too busy this month and I'll wait for the recipes iterface to settle here, but the plan is that GeoData.jl will have lazy plot zooming leveraging DiskArrays.jl/Makie.jl, using something like this. Edit: With the small text on my phone I missed the word "how" and thought you were just asking if I actually used this! |
Would you mind already posting that basic script so others can play around with it? |
Ok, trying to clean up my example to post here, but it seems like the example above doesn't work anymore either? julia> fig = GLMakie.Figure()
ERROR: type GridLayout has no field needs_update
Stacktrace:
[1] getproperty(x::GridLayout, f::Symbol)
@ Base ./Base.jl:33
[2] (::AbstractPlotting.var"#831#832"{GridLayout})(al::Outside)
@ AbstractPlotting ~/.julia/packages/AbstractPlotting/M8Nlv/src/figures.jl:80 And to clarify, in my now-broken the code, the disk loading is abstracted DiskArrays.jl/GeoData.jl internals, it's not apparent that's happening, it's just used like an array. But I'll post that if we can get the above working and I can fix my code too. Here's a simple way to replace the sin/cos array used above (I think, can't test): dxmin, dymin = map(max, map(x -> trunc(Int, x), (xmin, ymin)), map(first, axes(A)))
dxmax, dymax = map(min, map(x -> trunc(Int, x), (xmax, ymax)), map(last, axes(A)))
dxpixels = round(Int, (dxmax - dxmin) / (xmax - xmin) * xpixels)
dypiyels = round(Int, (dymax - dymin) / (ymax - ymin) * ypixels)
xs = [trunc(Int, x) for x in LinRange(dxmin, dxmax, xpixels)]
ys = [trunc(Int, y) for y in LinRange(dymin, dymax, ypixels)]
A[xs, ys] You could replace The more complicated script uses DimensionalData.jl selectors to work in e.g. lat/lon instead of pixels, but I don't think that was finished and I also can't test it. |
This is a regression in relation to GridLayoutBase and is already fixed, try updating Makie. You seem to be on an older version if AbstractPlotting is loaded. |
Right. WebIO.jl needs to register a patch version, it's holding back Observables to 0.3.3. Thats why no new Makie... |
This works quite well for me, but I'm getting poor performance (unusable) when using |
It should be much slower than GLMakie ...Although not very optimized at the moment... So if you want to do some profiling, you may find some low hanging fruits to improve performance... |
With #2779 I'm wondering if Makie should do this by default for all heatmaps? Just plotting regular in-memory arrays as heatmaps is slow if they get too big. |
Hm we don't have axis recipes yet, and this would need access to the finallimits... But maybe we can prototype this in some package, and then figure out how to integrate it into Makie going forward? |
Isn't this what Tyler.jl does (more or less)? I think there's a PR there which would allow arbitrary tiles, which one could extend to tiles mapped or resampled from a backing DiskArray... |
Yeaah, I was thinking about that.... But didn't have a quick idea how to make it work nicely (I mean, something I could implement in ~30 min or so) |
isn't that similar to the stuff vlines etc do, this works with the camera observables we get via the parent scene already. In the future, that could just be part of the scene/plot interface so we don't have to hook the observables directly |
Tyler.jl has 2x resolution changes though, so there will be noticeable jumps when you zoom if heatmaps work like Tyler. I'm just doing a rounding interpolation currently. function datashader(A, limits, pixelarea; maxres = 500)
xmin, ymin = minimum(limits)
xmax, ymax = maximum(limits)
xpixels, ypixels = widths(pixelarea)
xres = round(Int, max(1, min(maxres, xpixels, xmax - xmin)))
yres = round(Int, max(1, min(maxres, ypixels, ymax - ymin)))
dxmin = max(xmin, 1)
dxmax = min(xmax, size(A, 1))
dymin = max(ymin, 1)
dymax = min(ymax, size(A, 2))
xrange = LinRange(dxmin, dxmax, xres)
yrange = LinRange(dymin, dymax, yres)
# Interpolate to vectors of indices
xs = trunc.(Int, xrange)
ys = trunc.(Int, yrange)
return A[xs, ys]
end
xrange = lift(ax.finallimits) do fl
max(minimum(fl)[1], 1) .. min(maximum(fl)[1], size(first(maps)[], 1))
end
yrange = lift(ax.finallimits) do fl
max(minimum(fl)[2], 1) .. min(maximum(fl)[2], size(first(maps)[], 2))
end
pixels = lift(datashader, m, ax.finallimits, ax.scene.px_area)
p = heatmap!(ax, xrange, yrange, pixels)
|
It could be that this implementation where x, y and the data is lifted from finallimits updates inefficiently because the heatmap args conversion is triggered three times per limit change. Might be better implemented with |
@rafaqz is this something we would like to look at next week during MakieCon? |
Yeah we could totally hack on this. I was thinking it would be good to have zoom level caching like Tyler.jl, which would need defining zoom levels... so @asinghvi17 is right its basically just generalising Tyler.jl. But we could build the tile pyramid efficiently somehow, just doing one pass to make e.g. the first 4 layers of tiles. The do the rest lazily as they need less disk reads anyway. We could also try using layers of COG tiffs and similar where the tile pyramid is already built. Rasters.jl doesnt access those currently so thats a little more work. |
If we can get a good recipe for layers, we could make it easy to create a layers object from file, then just plot that. Perhaps an alternate dispatch for heatmap would also serve this purpose. |
Interesting, yeah if there was some kind of plottable |
Regular heatmaps could also use it with a function like the one above for subsampling. |
FYI this is something that ImageView does (via |
Ah interesting ImageBase.jl I was imagining for quick plotting of large disk-based data we would just use a nearest pixel algorithm so its fast. Then with some chunking patterns we can probably skip reading some of the data at all for the top level images. But we could use |
Some light reading: @felixcremer from your other Rasters.jl plotting issue where you mosiac a lot of plots I guess you would want that for this too? We could build this so its efficient for multiple mosaic tiles. |
In my case the data of a single tile is 15000 x 15000 which already kills Julia if I try to plot all of it. I currently only plot every 10nth pixel. But yes, when we go to more tiles I would like to only plot a meaningful subset. |
We should be able to mosaic 20 of those and pan and zoom around with no lag (I mean it should be possible to write it so we can do that 😅 ) |
That was how I was reading this, yes. So lets do work on this on tuesday. |
Yeah, the problem with plotting every There's a certain sense in which I think the best option is to come up with a way of representing the extrema (min/max) of each superpixel, but I don't know of a good way of doing that. If you had to pick one, the mean seems as good as anything else I can think of. |
The mean is always nice. But it depends how much patience we have for building the pyramid when its so easy to just zoom in and see more detail of a particular area. Reading every Nth might mean reading 1 Nth of the data if e.g. the chunks are colums. So N times faster to get the first overview image. |
I agree. But the downsides should be acknowledged. Maybe a practical example would make it clearer: imagine a photograph from astronomy and imagine stars are pointlike and very sparse. The every-nth pixel solution might return a completely black image, missing every single star in the image. |
#4317 is an implementation of the every-nth-pixel method if I understand correctly, but it shouldn't be too hard to generalize that to some arbitrary accumulator. |
It also added a gaussian Pyramid, which helps to remove those problems... Interesting would be how this works with e.g. DiskArrays and how to interpolate larger than memory data... |
We will need to handle the windowing, but DiskArrayEngine should be able to do this cc @meggart |
Fabian is out of office for the next month unfortunately :D. Maybe @felixcremer has some idea? The ideas in generating a pyramid in PyramidScheme are probably similar. |
I tried to play around with the new Makie.Pyramid function, but this is killing my julia process with the example in the docs, because I get a memory usage of more than 15 GB. PyramidScheme.jl is using DiskArrayEngine to do the aggregation of the data. We are not doing any interpolation, but aggregating every two by two pixel to a coarser resolution pixel. One could use PyramidScheme.Pyramid to prepare a Pyramid for plotting. This would not load the high resolution data into memory if it is a DiskArray and therefore have a smaller memory footprint. |
One would surely have to allocate the pyramid using mmap. |
With spatial raster data it would be good to be able to plot larger-than-memory heatmaps and images, and zoom in to get detail.
I'm not sure how/where this could be implemented, just flagging this as something that would be very useful in future, and as a key requirement for some spatial plotting workflows.
See:
yeesian/ArchGDAL.jl#195
https://github.com/rafaqz/GeoData.jl/issues/172
The text was updated successfully, but these errors were encountered: