Releases: AllenCellModeling/aicsimageio

Bugfixes and Stability

10 Aug 00:56

AICSImageIO 4.1.0

This minor release includes:

  • A specification for readers to convert the file format's metadata into the OME metadata model.
  • A utility function to process metadata translation with XSLT.
  • A bugfix to support tifffile>=2021.7.30. This is a breaking change only if you are using img.xarray_data.attrs["unprocessed"] (or any xarray_* variant). It changes the "unprocessed" metadata into a Dict[int, str] rather than a tifffile.TiffTags object.
  • A minor breaking change / bugfix for LIF scale metadata correction. LIF spatial scales should now be accurate.
  • An admin change of being less strict on dependencies so that aicsimageio can be installed in more environments.
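
For code that reads the "unprocessed" metadata, the move to Dict[int, str] means tags are looked up by integer tag code. A minimal sketch follows; the dictionary contents and the get_tag helper are hypothetical stand-ins, not part of aicsimageio, and 270 is the TIFF tag code for ImageDescription.

```python
from typing import Dict, Optional

# TIFF tag code for ImageDescription, as defined by the TIFF specification.
IMAGE_DESCRIPTION = 270


def get_tag(unprocessed: Dict[int, str], code: int) -> Optional[str]:
    """Return the string stored under a TIFF tag code, or None if absent."""
    return unprocessed.get(code)


# Hypothetical stand-in for img.xarray_data.attrs["unprocessed"]
example_unprocessed = {256: "512", 257: "512", 270: "<OME>...</OME>"}
description = get_tag(example_unprocessed, IMAGE_DESCRIPTION)
```

Code written against tifffile.TiffTags would need to be updated to plain dictionary lookups of this kind.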

For links to the PRs that produced these changes, see our full CHANGELOG

Contributors and Reviewers this Release (alphabetical)

Matte Bailey (@MatteBailey)
Jackson Maxfield Brown (@JacksonMaxfield)
Madison Swain-Bowden (@AetherUnbound)
Dan Toloudis (@toloudis)

Mosaics, Xarray, and Writers, Oh My!

07 Jun 16:03

AICSImageIO 4.0.0

We are happy to announce the release of AICSImageIO 4.0.0!

AICSImageIO is a library for image reading, metadata conversion, and image writing for microscopy formats in pure Python. It aims to be able to read microscopy images into a single unified API regardless of size, format, or location, while additionally writing images and converting metadata to a standard common format.

A lot has changed and this post will only have the highlights. If you are new to the library, please see our full documentation for always up-to-date usage and a quickstart README.

Highlights

Mosaic Tile Stitching

For certain imaging formats we will stitch together the mosaic tiles stored in the image prior to returning the data. Currently the two formats supported by this functionality are LIF and CZI.

from aicsimageio import AICSImage

stitched_image = AICSImage("tiled.lif")
stitched_image.dims  # very large Y and X

If you don't want the tiles to be stitched back together you can turn this functionality off with reconstruct_mosaic=False in the AICSImage object init.

Xarray

The entire library now rests on xarray. We store all metadata, coordinate planes, and imaging data in a single xarray.DataArray object. This allows selecting not only by index with the existing aicsimageio methods get_image_data and get_image_dask_data, but also by coordinate planes.

Mosaic Images and Xarray Together

A prime example of where xarray can be incredibly powerful for microscopy images is in large mosaic images. For example, instead of selecting by index, with xarray you can select by spatial coordinates:

from aicsimageio import AICSImage

# Read and stitch together a tiled image
large_stitched_image = AICSImage("tiled.lif")

# Only load in-memory the first 300x300 micrometers (or whatever unit) in Y and X dimensions
large_stitched_image.xarray_dask_data.loc[:, :, :, :300, :300].data.compute()
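
As a rough illustration of what coordinate-based selection means, the sketch below converts a physical extent into a pixel count using only the standard library. The pixel size, axis length, and the pixels_within helper are hypothetical. xarray performs the equivalent lookup against the coordinate arrays attached to the DataArray, and label-based slices with .loc include both endpoints.

```python
from bisect import bisect_right
from typing import List


def pixels_within(coords: List[float], upper: float) -> int:
    """Count leading coordinates that are <= upper (coords must be sorted)."""
    return bisect_right(coords, upper)


# Hypothetical example: 1000 pixels along X, each 0.5 units wide
pixel_size = 0.5
x_coords = [i * pixel_size for i in range(1000)]

# A label-based slice of [:300] keeps every pixel whose coordinate is <= 300
n_selected = pixels_within(x_coords, 300.0)
```

With a pixel size of 0.5, selecting up to coordinate 300 covers roughly twice as many pixels as selecting up to index 300 would.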

Writers

While we are actively working on metadata translation functions, in the meantime we are setting ourselves up for the easiest possible method of format conversion. For now, we have added a simple save function to the AICSImage object that will convert the pixel data and key pieces of metadata to OME-TIFF for all of (or a select set of) scenes in the file.

from aicsimageio import AICSImage

AICSImage("my_file.czi").save("my_file.ome.tiff")

For users that want greater flexibility and specificity in image writing, we have entirely reworked our writers module and this release contains three writers: TwoDWriter, TimeseriesWriter, and OmeTiffWriter. The objective of this rework was to make n-dimensional image writing much easier, and most importantly, to make metadata attachment as simple as possible. And, we now support multi-scene / multi-image OME-TIFF writing when providing a List[ArrayLike]. See full writer documentation here.

OME Metadata Validation

Our OmeTiffReader and OmeTiffWriter now validate the read or produced OME metadata. Thanks to the ome-types library for the heavy lifting, we can now ensure that all metadata produced by this library is valid to the OME specification.

Better Scene Management

Over time while working with 3.x, we found that many formats contain a Scene or Image dimension that can be entirely different from the other instances of that dimension in pixel type, channel naming, image shape, etc. To solve this we have changed the AICSImage and Reader objects to statefully manage Scene while all other dimensions are still available.

In practice, this means on the AICSImage and Reader objects the user no longer receives the Scene dimension back in the data or dimensions properties (or any other related function or property).

To change scene while operating on a file you can call AICSImage.set_scene(scene_id) while retrieving which scenes are valid by using AICSImage.scenes.

from aicsimageio import AICSImage

many_scene_img = AICSImage("my_file.ome.tiff")
many_scene_img.current_scene  # the current operating scene
many_scene_img.scenes  # returns tuple of available scenes
many_scene_img.set_scene("Image:2")  # sets the current operating scene to "Image:2"

RGB / BGR Support

Due to the scene management changes, we no longer use the "S" dimension to represent "Scene". We use it to represent the "Samples" dimension (RGB / BGR) which means we now have an isolated dimension for color data. This is great because it allows us to directly support multi-channel RGB data, where previously we would expand RGB data into channels, even when the file had a channel dimension.

So if you encounter a file with "S" in the dimensions, you can know that you are working with an RGB file.
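
A minimal sketch of that check, assuming you already have the dimension order as a string (for example from AICSImage.dims.order). The has_samples helper is hypothetical and not part of the library.

```python
def has_samples(dims_order: str) -> bool:
    """Return True if the dimension order string includes the Samples (S) dimension."""
    return "S" in dims_order.upper()


rgb_like = has_samples("TCZYXS")      # Samples dimension present
greyscale_like = has_samples("TCZYX")  # no Samples dimension
```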

FSSpec Adoption

Across the board we have adopted fsspec for file handling. With 4.x you can now provide any URI supported by the fsspec library or any implementations of fsspec (s3fs (AWS S3), gcsfs (Google Cloud Storage), adlfs (Azure Data Lake), etc.).

In many cases, this means we now support direct reading from local or remote data as a base part of our API. It also prepares us for supporting OME-Zarr!

from aicsimageio import AICSImage

wb_img = AICSImage("https://www.your-site.com/your-file.ome.tiff")
s3_img = AICSImage("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_img = AICSImage("gs://your-bucket/your-dir/your-file.ome.tiff")

To read from remote storage, you must install the related fsspec implementation library. For s3 for example, you must install s3fs.
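
One way to check ahead of time which extra package a URI needs is to map its scheme to the fsspec implementation. The sketch below covers only the s3 and gs schemes used in the example above; the lookup table and the required_fsspec_package helper are hypothetical, and other schemes would need to be added from the fsspec documentation.

```python
from typing import Optional
from urllib.parse import urlparse

# Schemes and packages named in these notes; extend as needed for other backends.
FSSPEC_PACKAGES = {"s3": "s3fs", "gs": "gcsfs"}


def required_fsspec_package(uri: str) -> Optional[str]:
    """Return the fsspec implementation package for a URI, or None if not in the table."""
    scheme = urlparse(uri).scheme
    return FSSPEC_PACKAGES.get(scheme)


s3_pkg = required_fsspec_package("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_pkg = required_fsspec_package("gs://your-bucket/your-dir/your-file.ome.tiff")
local_pkg = required_fsspec_package("/local/path/your-file.ome.tiff")
```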

Splitting up Dependencies

To reduce the size of AICSImageIO on fresh installs and to make it easier to manage environments, we have split up dependencies into specific format installations.

By default, aicsimageio supports TIFF and OME-TIFF reading and writing. If you would like to install support for reading CZI files, you would simply append [czi] to the aicsimageio pip install: pip install aicsimageio[czi]. For multiple formats, you would add them as a comma separated string: pip install aicsimageio[czi,lif]. And to simply install support for reading all format implementations: pip install aicsimageio[all].
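
If you generate install commands programmatically (for an environment file or a setup script, say), the extras syntax can be assembled as in this small sketch. The pip_requirement helper is hypothetical; only the extras names czi, lif, and all come from these notes.

```python
from typing import Iterable


def pip_requirement(extras: Iterable[str], package: str = "aicsimageio") -> str:
    """Build a pip requirement string with optional extras, e.g. 'aicsimageio[czi,lif]'."""
    unique_extras = sorted(set(extras))
    if not unique_extras:
        return package
    return f"{package}[{','.join(unique_extras)}]"


base = pip_requirement([])
czi_and_lif = pip_requirement(["lif", "czi"])
```

In shells that treat square brackets as glob patterns (zsh, for example), quote the requirement when passing it to pip: pip install "aicsimageio[czi,lif]".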

A full list of supported formats can be found here.

Roadmap and Motivation

After many discussions with the community we have finally written a roadmap and accompanying documentation for the library. If you are interested, please feel free to read them. Roadmap and Accompanying Documentation

Full List of Changes

  • Added support for reading all image data (including metadata and coordinate information) into xarray.DataArray objects. This can be accessed with AICSImage.xarray_data, AICSImage.xarray_dask_data, or Reader equivalents. Where possible, we create and attach coordinate planes to the xarray.DataArray objects to support more options for indexed data selection such as timepoint selection by unit of time, or pixel selection by micrometers.
  • Added support for reading multi-channel RGB imaging data utilizing a new Samples (S) dimension. This dimension will only appear in the AICSImage, Reader, and the produced np.ndarray, da.Array, xr.DataArray if present in the file and will be the last dimension, i.e. if provided an RGB image, AICSImage and related object dimensions will be "...YXS", if provided a single sample / greyscale image, AICSImage and related object dimensions will be "...YX". (This change also applies to DefaultReader and PNG, JPG, and similar formats)
  • OmeTiffReader now validates the found XML metadata against the referenced specification. If your file has invalid OME XML, this reader will fail and fall back to the base TiffReader. (In the process of updating this we found many bugs in our 3.x series OmeTiffWriter. Our 4.x OmeTiffReader fixes these bugs at read time, but that doesn't mean the file contains valid OME XML. It is recommended to upgrade and start using the new, improved, and validated OmeTiffWriter.)
  • DefaultReader now fully supports reading "many-image" formats such as GIF, MP4, etc.
  • OmeTiffReader.metadata is now returned as the OME object from ome-types. This change additionally removes the vendor.OMEXML object.
  • Dimensions received an overhaul -- when you use AICSImage.dims or Reader.dims you will be returned a Dimensions object. Using this object you can get the native order of the dimensions and each dimension's size through attributes, i.e. AICSImage.dims.X returns the size of the X dimension and AICSImage.dims.order returns the native order of the dimensions as a string such as "TCZYX". Due to these changes we have removed all size functions and size_{dim} properties from various objects.
  • Replaced function get_physical_pixel_size with attribute physical_pixel_sizes that returns a new PhysicalPixelSizes NamedTuple object. This now allows attribute axis for each physical dimension, i.e. PhysicalPixelSizes.X returns the size of each X dimension pixel. This object can be cast to a base tuple but be aware that we have reversed the order from XYZ to ZYX to match the rest of the library's standard dimension order. Additionally, `PhysicalPixelSizes...
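
To make the ZYX ordering of the pixel size tuple concrete, here is a minimal sketch using a stand-in NamedTuple. PixelSizesSketch and its values are hypothetical and only mirror the field order described above; it is not the library's PhysicalPixelSizes class.

```python
from typing import NamedTuple


class PixelSizesSketch(NamedTuple):
    """Stand-in mirroring the ZYX field order described for PhysicalPixelSizes."""

    Z: float
    Y: float
    X: float


sizes = PixelSizesSketch(Z=2.0, Y=0.5, X=0.5)
x_size = sizes.X         # attribute access by axis name
as_tuple = tuple(sizes)  # plain tuple, in ZYX order rather than XYZ
```

Code that previously unpacked the result as X, Y, Z would silently swap axes after upgrading, so accessing sizes by attribute name is the safer habit.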

Dask Must Be Delayed

09 Sep 17:09

AICSImageIO 3.3.0

We are happy to announce the release of AICSImageIO 3.3.0!

AICSImageIO is a library for delayed parallel image reading, metadata parsing, and image writing for microscopy formats in pure Python. It is built on top of Dask to allow for any size image to act as a normal array as well as allow for distributed reading in parallel on your local machine or an HPC cluster.

Highlights

Non-Dask Functions and Properties are Fully In-Memory

The only change made in this release is to the internal behavior of our API. We found that users were confused about why certain operations were incredibly slow while others were incredibly fast when considering the behaviors together.

Specifically, why did the following:

from aicsimageio import AICSImage

img = AICSImage("my_file.ome.tiff")
img.data  # preloads the entire image into memory
my_chunk = img.get_image_data("ZYX", C=1)  # the actual data we want to retrieve

Complete faster than this:

from aicsimageio import AICSImage

img = AICSImage("my_file.ome.tiff")
my_chunk = img.get_image_data("ZYX", C=1)  # the actual data we want to retrieve

(the difference being: preloading the entire image into memory rather than the get_image_data function simply using the delayed array)

To resolve this we have made an internal change to the behavior of the library that we will hold consistent moving forward.
If the word dask is not found in the function or property name when dealing with image data, the entire image will be read into memory in full prior to the function or property completing its operation.

* In essence this is simply moving that preload into any of the related functions and properties.

The end result is that the user should see much faster read times when using get_image_data.
If the user was using this function on a too-large-for-memory image, this will result in them having to change over to using get_image_dask_data and call .compute on the returned value.
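
A rough way to decide which function to reach for is to estimate the full image size against a memory budget. The sketch below is an assumption-laden illustration: image_nbytes and suggested_method are hypothetical helpers, the shapes and the 8 GiB budget are made up, and only the function names get_image_data and get_image_dask_data come from the library.

```python
from math import prod
from typing import Sequence


def image_nbytes(shape: Sequence[int], itemsize: int) -> int:
    """Size of the full image in bytes: number of elements times bytes per element."""
    return prod(shape) * itemsize


def suggested_method(shape: Sequence[int], itemsize: int, memory_budget: int) -> str:
    """Pick the in-memory or dask-backed function name based on a memory budget."""
    if image_nbytes(shape, itemsize) <= memory_budget:
        return "get_image_data"
    return "get_image_dask_data"


budget = 8 * 1024**3  # hypothetical 8 GiB budget

# Hypothetical image: T=50, C=4, Z=60, Y=2048, X=2048 of 16-bit pixels
big = suggested_method((50, 4, 60, 2048, 2048), itemsize=2, memory_budget=budget)
small = suggested_method((1, 1, 1, 512, 512), itemsize=2, memory_budget=budget)
```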

Contributors and Reviewers this Release (alphabetical)

Madison Bowden (@AetherUnbound)
Jackson Maxfield Brown (@JacksonMaxfield)
Jamie Sherman (@heeler)
Dan Toloudis (@toloudis)

v3.2.0 LIF Support and Reader Optimizations

13 May 17:57

AICSImageIO 3.2.0

We are happy to announce the release of AICSImageIO 3.2.0!

AICSImageIO is a library for delayed parallel image reading, metadata parsing, and image writing for microscopy formats in pure Python. It is built on top of Dask to allow for any size image to act as a normal array as well as allow for distributed reading in parallel on your local machine or an HPC cluster.

Highlights

LifReader

We now support reading 6D STCZYX Leica Image Files (LIF) and their metadata. Like all readers, this is implemented in a way that can be used to read and interact with any size file.

from aicsimageio import AICSImage, readers, imread
img = AICSImage("my_file.lif")
img = readers.LifReader("my_file.lif")
data = imread("my_file.lif")

Optimized Readers

After releasing v3.1.0, we noticed that our single threaded full image read performance wasn't as fast as the base file readers or similar microscopy file readers (czifile, tifffile, imageio, etc.) and set about resolving and optimizing our readers.

We are happy to report that in release v3.2.0 we now have comparable single threaded performance to similar libraries. And, like before, single threaded reading, regardless of library, is beaten out when using a distributed cluster for parallel reading.

aicsimageio read time benchmarks

See our documentation on benchmarks for more information.

napari-aicsimageio

We were so excited for napari's 0.3.0 release that we may have made a plugin early. If you haven't seen it, napari-aicsimageio has also been released!

pip install napari-aicsimageio

By simply installing the plugin you get both a delayed and in-memory version of aicsimageio to use in napari so you get all the benefits of aicsimageio in the wonderful application that is napari.

Other Additions and Changes

  • Allow sequences and iterables to be passed to AICSImage.get_image_data and related functions
from aicsimageio import AICSImage

img = AICSImage("my_file.czi")
data = img.get_image_data("CZYX", S=0, T=0, Z=slice(0, -1, 5))  # get every fifth Z slice
data = img.get_image_data("CZYX", S=0, T=0, Z=[0, -1])  # get first and last Z slices
  • Fix written out OME metadata to support loading in ZEN
  • Various package maintenance tasks
    • Convert package to use Black formatting
    • Update PR template for easier introduction to contributing
    • Move test resources to S3
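
For the slice and list selectors shown in the get_image_data example above, the following sketch shows which Z indices each would pick, assuming the selectors follow standard Python indexing semantics. The 20-plane Z axis is hypothetical.

```python
# Hypothetical Z axis with 20 planes, represented by their indices
z_planes = list(range(20))

# slice(0, -1, 5): every fifth plane, stopping before the last index
every_fifth = z_planes[slice(0, -1, 5)]

# [0, -1]: explicit indices for the first and last planes
first_and_last = [z_planes[i] for i in [0, -1]]
```

Note that a stop of -1 excludes the final plane; use slice(None, None, 5) if the stride should run over the whole axis.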

Contributors and Reviewers this Release (alphabetical)

Madison Bowden (@AetherUnbound)
Jackson Maxfield Brown (@JacksonMaxfield)
Jamie Sherman (@heeler)
Dan Toloudis (@toloudis)

v3.1.0 Delayed Dask Readers

03 Feb 18:33

To support large file reading and parallel image processing, we have converted all image readers over to using dask for data management.

What this means for the user:

  • No breaking changes when using the 3.0.* API: AICSImage.data, AICSImage.get_image_data, and imread all still return full image file reads as numpy.ndarray.
  • New properties and functions for dask specific handling (AICSImage.dask_data, AICSImage.get_image_dask_data, imread_dask) return delayed dask arrays (dask.array.core.Array)
  • When using the dask properties and functions, data will not be read until requested. If you want just the first channel of an image, AICSImage.get_image_dask_data("STZYX", C=0) will only read and return a five dimensional dask array instead of reading the entire image and then selecting the data down.
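
To see why fixing C=0 leaves a five dimensional array, the sketch below works through the dimension bookkeeping with plain Python. The remaining_dims helper and the example shape are hypothetical; the sketch does no reading and only illustrates which dimensions remain.

```python
from typing import Dict, Tuple


def remaining_dims(order: str, shape: Tuple[int, ...], **fixed: int) -> Dict[str, int]:
    """Drop every dimension pinned to a single index and return the rest with sizes."""
    sizes = dict(zip(order, shape))
    return {dim: size for dim, size in sizes.items() if dim not in fixed}


# Hypothetical six-dimensional image in "STCZYX" order
result = remaining_dims("STCZYX", (3, 10, 4, 60, 512, 512), C=0)
result_order = "".join(result)  # dimension letters that remain, in native order
```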

A single breaking change:

  • We no longer support handing in file pointers or buffers.

If you want multiple workers to read or process the image, the context manager for AICSImage and all Reader classes now spawns or connects to a Dask cluster and client for the duration of the context manager. If you want to keep it open for longer than a single image, use the context manager exposed from dask_utils.cluster_and_client.

Extras:
napari has been directly added as an "interactive" dependency and if installed, the AICSImage.view_napari function is available for use. This function will launch a napari viewer with some default settings that we find to be good for viewing the data that aicsimageio generally interacts with.