Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement labels loading for OME-Zarr HCS plates #1066

Open
tischi opened this issue Nov 24, 2023 · 22 comments
Open

Implement labels loading for OME-Zarr HCS plates #1066

tischi opened this issue Nov 24, 2023 · 22 comments
Assignees
Labels
bug Something isn't working

Comments

@tischi
Copy link
Contributor

tischi commented Nov 24, 2023

I think labels are not yet supported.

These datasest have labels and may be used for development:

@tischi tischi added the bug Something isn't working label Nov 24, 2023
@tischi tischi self-assigned this Nov 24, 2023
@tischi
Copy link
Contributor Author

tischi commented Nov 28, 2023

works now.

@tischi tischi closed this as completed Nov 28, 2023
@jeskowagner
Copy link

Does this still work?

For the Zenodo-hosted MIP I get:

[ERROR] Command errored: Open HCS Dataset...
java.lang.RuntimeException: Could not determine HCSPattern for /<...>/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip
	at org.embl.mobie.lib.hcs.Plate.determineHCSPattern(Plate.java:419)
	at org.embl.mobie.lib.hcs.Plate.<init>(Plate.java:139)
	at org.embl.mobie.MoBIE.openHCSDataset(MoBIE.java:263)
	at org.embl.mobie.MoBIE.<init>(MoBIE.java:153)
	at org.embl.mobie.command.open.OpenHCSDatasetCommand.run(OpenHCSDatasetCommand.java:87)
	at org.scijava.command.CommandModule.run(CommandModule.java:196)
	at org.scijava.module.ModuleRunner.run(ModuleRunner.java:165)
	at org.scijava.module.ModuleRunner.call(ModuleRunner.java:125)
	at org.scijava.module.ModuleRunner.call(ModuleRunner.java:64)
	at org.scijava.thread.DefaultThreadService.lambda$wrap$2(DefaultThreadService.java:247)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

And for the S3 hosted data the zarr loads fine, but I cannot see the labels:
image

Fiji version: 2.14.0/1.54f
Build: c89e8500e4
OS: macOS Sonoma 14

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

Dear @jeskowagner,

Could you please share the S3 address with me?

I don't think streaming directly from Zenodo is something that we are supporting. Is that what you tried? I think you may first have to download the data locally. If you feel directly streaming from Zenodo should work, please let me know and I can look into it....

Just saw that the Zenodo file is a ZIP (20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip), I think this for sure will currently now work. If you send me the full address we can however investigate whether this may be possible in the future.

@jeskowagner
Copy link

Hi @tischi,
Thanks for your fast response! You were spot on, I was being slow this morning and didn't spot the zarr comes zipped (should've figured). It loads perfectly fine after unzipping; reading zipped zarr or streaming from Zenodo is not an important use case for me.
Regarding the S3 link: sorry for being unclear, I meant the links posted in the first comment on this issue, i.e. https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/2551.zarr.

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

@jeskowagner I guess the labels are not shown in either case: S3 and local, right?
I will try to have a look, but I will be on holidays next week.
Unless I find a quick fix today, it will have to wait one or two weeks 😞

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

Hi @tibuch,

  1. Does working labels from OME-Zarr plates work for you?
  2. Would you be able to provide a minimal (again not chunked) example data set including labels for testing?
  3. Thanks!

@jeskowagner
Copy link

jeskowagner commented May 17, 2024

For the zenodo-hosted data (after downloading) displaying the labels works:
image
Although I think it technically doesn't show all labels, because under B/03/0/labels/.zattrs we have

{
    "labels": [
        "nuclei",
        "wf_2_labels",
        "wf_3_labels",
        "wf_4_labels"
    ]
}

And all of these labels do indeed have corresponding folders in the labels folder. In the picture above it is not clear which of those labels we are seeing (presumably nuclei?).

For the S3 data I have only tried streaming, because I get permission errors when trying to download through aws (though I'm sure there is something I am doing wrong: aws s3 cp s3://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/5966.zarr . --recursive --no-sign-request).

And just to add: this is not a high priority issue of course, please enjoy your Friday and vacation!

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

@jeskowagner

  1. Ok...I don't think we ever had a use-case with multiple labels fields...might be that this is not supported yet; how big is this Zendo plate? Is that something you could share with me such that I could test locally?
  2. The data on S3, is that the same data or some other data?

@tibuch even though it seems to work in principle a minimal dataset for testing including labels would be awesome.

@jeskowagner
Copy link

jeskowagner commented May 17, 2024

Note that I am not the author of either data. That said, you can download the Zenodo data I visualised with the labels here. The download is quite modest - around 100 Mb and contains only a single well and a single site with 3 imaging channels and seemingly 4 label masks. As to motivation - having the option to toggle nuclei/cell/cytoplasm masks can be handy when troubleshooting segmentation issues.
Pinging @jluethi who is the author of the Zenodo-hosted data.

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

...and another post (sorry, please read everything above).

I opened https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/2551.zarr in Fiji using: Plugins › BigDataViewer › HDF5/N5/Zarr/OME-NGFF Viewer

And this dataset still has the issue (see the very first post in this issue) that the labels are wrongly down-sampled: the datatype is changing to float during the downsampling. This is not a valid OME-Zarr. I think I already mentioned this to @joshmoore but apparently this had not yet been fixed (maybe there also never was a plan of fixing this, I cannot remember now 😄 ).

image

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

Thanks for the Zenodo link; I will download this and see whether we can manage to open all labels.

@jeskowagner
Copy link

jeskowagner commented May 17, 2024

In the Zenodo description I just saw that my interpretation of having multiple labels in this file may be incorrect. The description states (emphasis mine):

20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr contains the same 3 channels, but as maximum intensity projections. It contains nuclear segmentation through cellpose. It also contains 6 tables: The region of interests like in the 3D data, as well as measurements performed with napari-skimage-regionprops.

I would take that to mean it only contains a single label mask. In that case I am unsure what wf_2_labels to wf_4_labels, which are in B/03/0/labels/, represent.

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

Maybe @jluethi knows?

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

It turns out that loading the labels does not work anymore in mobie-5.0.0, working on this here: https://github.com/mobie/mobie-viewer-fiji/tree/issue-1066

@jluethi
Copy link

jluethi commented May 17, 2024

Note that I am not the author of either data. That said, you can download the Zenodo data I visualised with the labels here. The download is quite modest - around 100 Mb and contains only a single well and a single site with 3 imaging channels and seemingly 4 label masks. As to motivation - having the option to toggle nuclei/cell/cytoplasm masks can be handy when troubleshooting segmentation issues.
Pinging @jluethi who is the author of the Zenodo-hosted data.

Re the example OME-Zarr we have on Zenodo: Feel free to use that for tests if it's useful. We placed them there to use them that way as well :)

There is a slightly newer version available here: https://zenodo.org/records/10257532
(mostly with fixes to table metadata I think)

In the Zenodo description I just saw that my interpretation of having multiple labels in this file may be incorrect. The description states (emphasis mine):

That's just a non-updated description then. The OME-Zarr really does contain 4 labels:
Screenshot 2024-05-17 at 16 16 21

(screenshot in napari viewer, though typical issues there are deprecation of label color coming with 0.5.0 of napari => work on the plugin needed. And that it doesn't load any labels when loading a whole plate by default, only when loading individual images).
I now updated the description to be accurate again :)

@tischi
Copy link
Contributor Author

tischi commented May 17, 2024

OK, I have to fix my code then.

Notes to self:

  • Probably a fix is needed in mobie-io where for each OME-Zarr dataset I look whether there are sub-folder datasets where the path contains labels.
  • These should be then added to the List< SourceAndConverter > of the corresponding N5ImageData
  • I will probably also need to introduce a getDatasetName( index ) into the ImageData interface to be able to check whether the data set name contains labels and then treat it as a segmentation in MoBIE; having dataset names could be useful anyway, because for some image file formats I may be able to fetch channel names from the metadata.
  • https://imagesc.zulipchat.com/#narrow/stream/328251-NGFF/topic/Making.20it.20easy.20to.20read.20.2B.20write.20ome-zarr.20in.20Fiji

Fetching all the multiscale datasets:

final N5Reader n5 = ...
final N5TreeNode root = N5DatasetDiscoverer.discover(n5); // parse all metadata
N5TreeNode.flattenN5Tree(root)
        .filter(x -> {   // get all the "multiscales" metadata
            final N5Metadata meta = x.getMetadata();
            return meta != null && meta instanceof OmeNgffMetadata;
        }).flatMap( x -> {
            return x.getDescendants(n -> true); // return every child of a node containing multiscales
        }).forEach(x -> {
            System.out.println(x.getPath());
        });

@jeskowagner
Copy link

@jluethi which napari plugin are you using for that picture? With the current ome-zarr-py(https://github.com/ome/napari-ome-zarr) v0.5.3 and napari 0.4.19 I cannot see the labels:
image
I also tried napari v0.5.0a1, unfortunately also without success.

@jluethi
Copy link

jluethi commented May 21, 2024

Hey @jeskowagner
yeah, the napari ome zarr plugin unfortunately doesn't load labels by default, see discussion here:
Library: ome/ome-zarr-py#207
Plugin: ome/napari-ome-zarr#54

e.g. if you have a /path/to/my/plate.zarr and drag & drop this into napari, it will only load the images. With the 2 branches above, it can also load the labels.

Default napari-ome-zarr opens labels only if you load a single image, e.g. /path/to/my/plate.zarr/A/01/0.

Also, please be aware that napari 0.5.0 is deprecating some label color functions that the plugin currently uses. Until that is updated, the napari-ome-zarr plugin is not fully compatible with the current main build of napari (like 0.5.0a1)

@jeskowagner
Copy link

Fantastic, thanks @jluethi! I can use napari to test my implementation of writing labels into an existing zarr using faim-ipa with those PRs of napari/napari-ome-zarr/ome-zarr-py then!

@tischi that sounds like a great plan.

Probably a fix is needed in mobie-io where for each OME-Zarr dataset I look whether there are sub-folder datasets where the path contains labels.

I saw that the spec mentions that all label images should be listed in labels/.zattrs (though I am surprised by this being optional, because the preceding sentence seems to suggest this is mandatory. In any case, it might be faster/less IO heavy to just read labels/.zattrs rather than checking subfolders of labels.

Unfortunately my Java is incredibly rusty, so I won't be able to contribute a PR. Sorry about that.

@tischi
Copy link
Contributor Author

tischi commented May 21, 2024

In any case, it might be faster/less IO heavy to just read labels/.zattrs rather than checking subfolders of labels.

Yes, this is a consideration. In general I wonder what one would expect if one "opens" something like /path/to/my/plate.zarr/A/01/0...my current thinking is that it could make sense to open all datasets that are at that level and deeper than that, which in our case would include all the labels subfolders. Since, based on recent discussions around the ngff spec, I feel that labels/.zattrs might become deprecated I don't want to parse it right now, but simply open everything that's "below" /path/to/my/plate.zarr/A/01/0. However, indeed, if it turns out that discovering all datasets in all subfolders is slow than that is a concern...I will have to benchmark this.

In any case. I am on holidays this week. I will get back to it next week.

@tibuch
Copy link
Collaborator

tibuch commented May 21, 2024

What would happen if one of the sub-directories in /path/to/my/plate.zarr/A/01/0 is not a zarr? But some custom object or even directory structure that should not be interpreted by the viewer?

@tischi tischi reopened this May 24, 2024
@tischi
Copy link
Contributor Author

tischi commented May 24, 2024

@tibuch that is in fact the case with the data that @jluethi creates, as there are also tables stored. Such data can be filtered by the fact that they are not OmeNgffMetadata. If you scroll a bit up, you will find a code snippet kindly provided by John Bogovic that shows how to do this. However, I decided for now to explicitly look for the labels sub-folder, because this should save some time in comparison to parsing the whole zarr directory structure (once this issue is fixed). I'd be grateful if you could have a look at my current implementation whether that looks sensible to you! If you even want to play with this and check whether it works for your data you could modify this function. Note that this currently only works for locally stored OME-Zarr since there seems to be some issue parsing S3 hosted OME-Zarr using the latest n5 libraries (at least I am getting null there).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants