-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plate labels fix #207
base: master
Are you sure you want to change the base?
Plate labels fix #207
Conversation
ome_zarr/reader.py
Outdated
first_well_path = self.plate_data["wells"][0]["path"] | ||
image_zarr = self.zarr.create(self.get_image_path(first_well_path)) | ||
# Create a Node for image, with no 'root' | ||
self.first_well_image = Node(image_zarr, []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshmoore @sbesson Here I'm creating a Node for an Image as I want a Multiscales
spec to do the parsing of the datasets, to give me the sizes of each resolution of the pyramid. However, this is leading to recursion errors in the tests (although it seems to work fine for me viewing local data in napari).
I don't see a way to reuse that logic in Multiscales
spec, without creating a Node? But I don't need any Node traversing logic. I guess I want to create a Multiscales
spec with no node. Should I make the node
optional in the Spec class, or is there another way I should be thinking about this?
NB: get_image_path()
is overridden by the PlateLabels
subclass to point to a labels image, so this works both for images in Wells and their child labels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for having missed this. Without stepping through your code, I don't know offhand why you're getting the recursion. I assume that the seen
field is not getting updated and therefore it just keeps looping. Perhaps that's caused by not setting the root
.
What functionality do you want from the Spec
without a Node
? Could we just refactor that logic somewhere re-usable? (Perhaps statically?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I just need the data.shape
of each resolution of the multiscales pyramid, and the dtype
.
Before my last commit d731b87 I was passing in a root node (plate node) but that was also giving me recursion errors:
self.first_well_image = Node(image_zarr, node)
I'll try stepping though my code and see if I can work out the loop...
With the recent community interest (#65 (comment)), any thoughts on next steps here, @will-moore? |
@joshmoore Yes, I got it working, but tests are failing - See my question above at #207 (comment) |
creating a zarr, with an empty img_path, then creating a Node from that, resulted in a Plate node - recursively
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #207 +/- ##
==========================================
+ Coverage 84.79% 85.02% +0.23%
==========================================
Files 13 13
Lines 1473 1523 +50
==========================================
+ Hits 1249 1295 +46
- Misses 224 228 +4
... and 1 file with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As raised in https://forum.image.sc/t/hcs-ome-zarr-data-unable-to-download/57114, the re-enabling of PlateLabels
breaks the ome_zarr download
workflow for any plate
(conversion) [sbesson@pilot-zarr1-dev ~]$ ome_zarr download https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0013A/3451.zarr
downloading...
3451.zarr
3451.zarr
to .
Traceback (most recent call last):
File "/home/sbesson/miniconda3/envs/conversion/bin/ome_zarr", line 8, in <module>
sys.exit(main())
File "/home/sbesson/miniconda3/envs/conversion/lib/python3.9/site-packages/ome_zarr/cli.py", line 166, in main
ns.func(ns)
File "/home/sbesson/miniconda3/envs/conversion/lib/python3.9/site-packages/ome_zarr/cli.py", line 35, in download
zarr_download(args.path, args.output)
File "/home/sbesson/miniconda3/envs/conversion/lib/python3.9/site-packages/ome_zarr/utils.py", line 75, in download
for path, node in sorted(zip(paths, nodes)):
TypeError: '<' not supported between instances of 'Node' and 'Node'
More generally, I find the semantics of PlateLabels
confusing. I understand the driver fot a tool like napari to define a new top-level layer allowing to overlay the labels on top of the image. My concerns are rather about the API allowing to do so. Looking at the output of ome_zarr info
:
(conversion) [sbesson@pilot-zarr1-dev ~]$ ome_zarr info https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0013A/3451.zarr
https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0013A/3451.zarr/ [zgroup]
- metadata
- Plate
- data
- (93, 16384, 32256)
- (93, 8192, 16128)
- (93, 4096, 8064)
- (93, 2048, 4032)
- (93, 1024, 2016)
https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0013A/3451.zarr/ [zgroup]
- metadata
- PlateLabels
- data
a few comments and questions:
- having two separate nodes pointing to the same path in the hierarchy feels strange. Should it rather be a unique node with different specs as this is the case for label image which implement
Label
andMutliscales
? - the
PlateLabels
concept is not defined in the specification which makes it hard understand the expectation and the contract. Should this spec be defined for a plate with no label data? Otherwise, do any or all of the image need have some label? Are the labels expected to be consistent across the plate? - I assume the reader could register the
Label
(orLabels
?) node in the plate hierarchy graph for the first populated well. Would it possible for napari to detect the fact that a hierarhcy contains both aPlate
and aLabel
node ?
Unfortunately the path to the Plate and the PlateLabels is the same because the Labels are actually children of each Image, so there is not one path to a child Plate-Label. The reason for this is so that you can view an individual Image and it's labels will be below it's It's certainly expected that the labels for all Images in the plate will be the same size. I think the reading will fail if they are not. I guess it could be possible to handle that, but it seems like it might be a fair bit of work for an edge-case. If some wells are missing labels I think it should be fine, but if the first Well is missing labels, then I don't think they'll be shown. |
Would pointing a fictive node e.g.
In many ways, what this API attempts to achieve raises the question of whether there should be a lightweigth specification at the top-level allowing to register the labelsin addition to registering them at the individual image label? This would also help clarifying the assumptions about the existence/consistency of labels across wells. |
So, to determine whether the node at I could try to look at that... I'm not sure whether it will fix the download issue... |
The last commit tries to support a "fictive" node at
|
0b5c783
to
8ab3ad6
Compare
@sbesson That adds support for
|
Tested with the sample 2551.zarr IDR plate uploaded to https://idr.github.io/ome-ngff-samples/. Without ome/napari-ome-zarr#54,
This will probably mean some With both PRs installed,
Probably the most important feedback is that the label loading code introduces a significant performance overhead. The loading of the remote plate above increases from 13s to 50s before displaying the data. This performance issue is not specific to the napari workflow and also affects Tested briefly the
From my side this raises two high-level questions:
|
For the images at e.g. https://idr.openmicroscopy.org/webclient/img_detail/1229801/ the masks have no Z-index, and they are currently exported as a 2D labels (e.g. see https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/2551.zarr/A/1/0/labels/0/).
I don't know of a way for a napari reader plugin to add anything to the napari UI: https://napari.org/stable/plugins/contributions.html#contributions-readers Migrating all of the Performance: the only way I can see to avoid making so many requests is to make some assumptions: e.g. assume that the labels are all named the same for each image. This is similar to previous assumptions (in vizarr) that the path to each Image in all Wells would be the same (hms-dbmi/vizarr#118), which had to be reverted. We do assume that all labels are the same shape and dtype. |
Agreed on the size increase but as per https://www.openmicroscopy.org/Schemas/Documentation/Generated/OME-2016-06/ome_xsd.html#Shape_TheZ, 2D labels is an incorrect representation of the original data when TheZ is not specified.
The current delegation mechanism is attractive in terms of simplicity and allows to keep the Assuming we want to separate the data/metadata retrieval logic at the level of this library and forward it to napari-ome-zarr, could we see definining a new separate top-level concept e.g. node overlay/layer allowing to register these additional representations which are effectively composites? At the consumer level, https://github.com/ome/napari-ome-zarr/pull/54/files#diff-3ac23664593f6255d26806df61d2364f3d3f0b66c53a4fad561a70877093239aR126 would generically need to test the existing of such elements and display them using their data/metadata.
So your feeling is that in addition to the retrieval of additional data per label, the primary overhead arises from the handling of label names for each field of view? What would be the current behavior if e.g. all fields of view had a |
Maybe if when viewing a Plate with Labels (e.g in The only thing that would break if we store the labels in this way is when you view an Image, you don't automatically see the Labels - For that we'd need the |
Hmmm.... so what's the next step here, @will-moore ? |
@sbesson Now we're both back, do you want to discuss this PR sometime? |
Very cool to see that this discussion is ongoing, looking forward to seeing labels on full plates!
I was wondering about this topic: Is there already a discussion ongoing somewhere about mixed 2D/3D datasets? Having 2D labels is one use case. Another thing we often hit is doing maximum intensity projections along Z to get better overviews of large 3D datasets. So far, we've always saved them as separate OME-Zarr plate files. I've been meaning to create a image.sc forum post or open an issue on this topic, as it would be great if there were a way to have them in the same dataset and to have a way to specify that one is a 2D-only dataset, e.g. the 2D labels should belong to all planes.
While there is the interesting debate about what assumptions can be made and what metadata needs to be read per field, I had 2 additional questions on this topic:
|
@jluethi For 2D -> 3D transforms (e.g. 2D masks/projections and 3D data) there is a collection of "user stories" at ome/ngff#84 - Story 9 is probably most relevant to the discussion here. |
@will-moore Thanks for linking me to the user stories. I will be joining the community calls. Was mostly drawn by the tables discussion, but now I'm also looking forward to learn more about how the transformations can be used for this 2D -> 3D mapping. Do you think such transformations could then also be used on the 2D labels instead of duplicating them along 3D? Also, do you want some testing & feedback on this PR to display the labels? Happy to test on some different OME-Zarrs we have to check how things perform :) Don't have any large plates with labels ready just yet, but can hopefully generate them soon to test. |
@jluethi with this PR and ome/napari-ome-zarr#54 installed, you can try:
But remember you have to move to Z-index: 0 before you see any labels: |
@jluethi As you will see with that test command, there are a lot of requests to get the labels path for each Well (instead of assuming it is the same for every Well). It is already quite slow on a 96-well plate. I think that this is probably not scalable to a 384-well plate, particularly if you want to include multiple acquisitions! So maybe it needs a re-think? |
Hmm, I'm just preparing some example datasets with labels for a whole plate. In a first test for a plate that had labels for 6 wells (and each well consisting of 72 images fused into a single array per well) is actually very cool to browse! The initial loading is fairly slow (~ 40s, compared to loading of 25s for the normal OME-Zarr version that doesn't load labels). I'll retest again once I've managed to actually process a whole plate with labels (labels on 23 wells instead of 6). But thought I'd share this impression at least :) Video of plate with partial labels and this PR: https://www.dropbox.com/s/2ds0nxhe9btqaxk/napari_dev.mov?dl=0 Video of loading the same plate in standard napari 0.4.16, released napari-ome-zarr: https://www.dropbox.com/s/2ejnseb19dm8i70/standard_ome_Zarr_py.mov?dl=0 Maybe having the label layer as default visible=False for loading of the whole plate would be a good idea to speed up initial loading? That's also how it's handled for individual images and at a plate level, labels aren't that informative. They really become interesting when one starts zooming in :) Also, this was an MIP plate, thus everything was 2D only and the browsing of labels worked well for that :) Thanks in any case for the installation instructions @will-moore , I'm very impressed with being able to browse plates with labels in that way in napari! ❤️ |
I have now created a larger plate with complete labels and tested it there. For a plate with 23 wells à 19440 x 20480 pixels (combined 8x9 field of views into a single image per well), 3 channels and 1 label channel (cellpose nuclear segmentation), it loads very well! That's about 100 GBs of image data loading remotely (via VPN & a quite slow samba share that's mounted locally) I don't have any 384 well examples or test cases with many wells at the moment. So maybe number of label images loading slows things down a bit. But this dataset that was originally 1656 field of views can be stored in OME-Zarrs in a manner that allows for very responsive loading! :) Here are the example videos: |
@will-moore: can you check the conflict here? |
Anyone have a preference on whether or not to role this into 0.7.0 (#267)? |
Just tested this again with https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/2551.zarr inspired by the discussion at https://forum.image.sc/t/ome-zarr-hcs-with-labels-on-s3/97327. It works (using latest Using the current release of |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/ome-zarr-hcs-with-labels-on-s3/97327/8 |
Fixes #65.
This refactors
PlateLabels
:row:0 column:0
."0"
atimage/labels/0/
. Readsimage/labels/.zattrs
for every Well to find first name under{"labels": ["first_name"]}
.A/1/0/labels/name/.zattrs
to getdtype
and shapes of image pyramid, to generate a dask pyramid for the PlateNeeds a fix in
napari-ome-zarr
to work: ome/napari-ome-zarr#54