Releases: AllenCellModeling/aicsimageio
Bugfixes and Stability
AICSImageIO 4.1.0
This minor release includes:
- A specification for readers to convert the file format's metadata into the OME metadata model.
- A utility function to process metadata translation with XSLT.
- A bugfix to support tifffile>=2021.7.30. This is a breaking change only if you are using img.xarray_data.attrs["unprocessed"] (or any xarray_* variant). It changes the "unprocessed" metadata into a Dict[int, str] rather than a tifffile.TiffTags object.
- A minor breaking change / bugfix for LIF scale metadata correction. LIF spatial scales should now be accurate.
- An admin change of being less strict on dependencies so that aicsimageio can be installed in more environments.
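For code that previously read values off the tifffile.TiffTags object, the move to a plain dictionary might be handled along the lines of the sketch below. It assumes the integer keys are TIFF tag codes (270 is ImageDescription in the TIFF specification). The sample dictionary and the helper name get_tag are made up for illustration and are not part of aicsimageio.

```python
from typing import Dict, Optional

# TIFF tag code for ImageDescription, as defined by the TIFF specification.
IMAGE_DESCRIPTION = 270


def get_tag(unprocessed: Dict[int, str], code: int) -> Optional[str]:
    """Return the value stored for a tag code, or None if the tag is absent."""
    return unprocessed.get(code)


# Hypothetical stand-in for img.xarray_data.attrs["unprocessed"] in 4.1.0.
unprocessed = {256: "512", 257: "512", 270: "<OME>...</OME>"}

description = get_tag(unprocessed, IMAGE_DESCRIPTION)
missing = get_tag(unprocessed, 65000)  # a code that is not in the dictionary
```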
For links to the PRs that produced these changes, see our full CHANGELOG
Contributors and Reviewers this Release (alphabetical)
Matte Bailey (@MatteBailey)
Jackson Maxfield Brown (@JacksonMaxfield)
Madison Swain-Bowden (@AetherUnbound)
Dan Toloudis (@toloudis)
Mosaics, Xarray, and Writers, Oh My!
AICSImageIO 4.0.0
We are happy to announce the release of AICSImageIO 4.0.0!
AICSImageIO is a library for image reading, metadata conversion, and image writing for microscopy formats in pure Python. It aims to be able to read microscopy images into a single unified API regardless of size, format, or location, while additionally writing images and converting metadata to a standard common format.
A lot has changed and this post will only have the highlights. If you are new to the library, please see our full documentation for always up-to-date usage and a quickstart README.
Highlights
Mosaic Tile Stitching
For certain imaging formats we will stitch together the mosaic tiles stored in the image prior to returning the data. Currently the two formats supported by this functionality are LIF and CZI.
from aicsimageio import AICSImage
stitched_image = AICSImage("tiled.lif")
stitched_image.dims # very large Y and X
If you don't want the tiles to be stitched back together you can turn this functionality off with reconstruct_mosaic=False in the AICSImage object init.
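As a rough picture of what stitching means, the sketch below places equally sized tiles onto one canvas. It assumes a regular grid with no overlap and tiles keyed by (row, column). The function name stitch_grid and the sample tiles are hypothetical, and this is not how the library itself positions tiles.

```python
from typing import Dict, List, Tuple

Tile = List[List[int]]


def stitch_grid(tiles: Dict[Tuple[int, int], Tile]) -> Tile:
    """Place equally sized tiles, keyed by (row, col), onto one 2D canvas."""
    first = next(iter(tiles.values()))
    tile_h, tile_w = len(first), len(first[0])
    n_rows = max(r for r, _ in tiles) + 1
    n_cols = max(c for _, c in tiles) + 1
    # Start from an empty canvas large enough for the whole grid.
    canvas = [[0] * (n_cols * tile_w) for _ in range(n_rows * tile_h)]
    for (r, c), tile in tiles.items():
        for y, row in enumerate(tile):
            start = c * tile_w
            canvas[r * tile_h + y][start:start + tile_w] = row
    return canvas


# Four hypothetical 2x2 tiles arranged in a 2x2 grid.
tiles = {
    (0, 0): [[1, 1], [1, 1]],
    (0, 1): [[2, 2], [2, 2]],
    (1, 0): [[3, 3], [3, 3]],
    (1, 1): [[4, 4], [4, 4]],
}
stitched = stitch_grid(tiles)
```

The stitched result is a single 4x4 image, which is why the Y and X sizes of a reconstructed mosaic are much larger than those of any one tile.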
Xarray
The entire library now rests on xarray. We store all metadata, coordinate planes, and imaging data in a single xarray.DataArray object. This allows not only selecting by indices with the existing aicsimageio methods, get_image_data and get_image_dask_data, but also selecting by coordinate planes.
Mosaic Images and Xarray Together
A prime example of where xarray can be incredibly powerful for microscopy images is in large mosaic images. For example, instead of selecting by index, with xarray you can select by spatial coordinates:
from aicsimageio import AICSImage
# Read and stitch together a tiled image
large_stitched_image = AICSImage("tiled.lif")
# Only load in-memory the first 300x300 micrometers (or whatever unit) in Y and X dimensions
large_stitched_image.xarray_dask_data.loc[:, :, :, :300, :300].data.compute()
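To make the coordinate selection concrete, here is a plain-Python sketch of what a label-based slice such as :300 amounts to along one axis. It assumes coordinates start at 0 and increase by a uniform pixel size; the helper names and the 0.5 micrometer pixel size are illustrative assumptions, not values taken from the library.

```python
from bisect import bisect_right
from typing import List


def coords_for_axis(n_pixels: int, pixel_size: float) -> List[float]:
    """Physical coordinate of each pixel along one axis, starting at 0."""
    return [i * pixel_size for i in range(n_pixels)]


def stop_index_for_extent(coords: List[float], extent: float) -> int:
    """Number of leading pixels whose coordinate is <= extent.

    Label-based slicing includes the end label, so bisect_right is used
    to keep a pixel that sits exactly at the requested extent.
    """
    return bisect_right(coords, extent)


# Assumed example: 2000 pixels in X at 0.5 micrometers per pixel.
x_coords = coords_for_axis(2000, 0.5)
stop = stop_index_for_extent(x_coords, 300)
first_300_um = x_coords[:stop]
```

With these assumed numbers, the first 300 micrometers cover 601 pixels, because the pixel located exactly at 300.0 is included.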
Writers
While we are actively working on metadata translation functions, in the meantime we are setting ourselves up for the easiest possible method of format conversion. For now, we have added a simple save function to the AICSImage object that will convert the pixel data and key pieces of metadata to OME-TIFF for all of (or a select set of) scenes in the file.
from aicsimageio import AICSImage
AICSImage("my_file.czi").save("my_file.ome.tiff")
For users that want greater flexibility and specificity in image writing, we have entirely reworked our writers module and this release contains three writers: TwoDWriter, TimeseriesWriter, and OmeTiffWriter. The objective of this rework was to make n-dimensional image writing much easier, and most importantly, to make metadata attachment as simple as possible. We also now support multi-scene / multi-image OME-TIFF writing when providing a List[ArrayLike]. See full writer documentation here.
OME Metadata Validation
Our OmeTiffReader and OmeTiffWriter now validate the read or produced OME metadata. Thanks to the ome-types library for the heavy lifting, we can now ensure that all metadata produced by this library is valid to the OME specification.
Better Scene Management
Over time while working with 3.x, we found that many formats contain a Scene or Image dimension whose instances can be entirely different from one another in pixel type, channel naming, image shape, etc. To solve this we have changed the AICSImage and Reader objects to statefully manage Scene while all other dimensions are still available.
In practice, this means on the AICSImage and Reader objects the user no longer receives the Scene dimension back in the data or dimensions properties (or any other related function or property).
To change scene while operating on a file you can call AICSImage.set_scene(scene_id) while retrieving which scenes are valid by using AICSImage.scenes.
from aicsimageio import AICSImage
many_scene_img = AICSImage("my_file.ome.tiff")
many_scene_img.current_scene # the current operating scene
many_scene_img.scenes # returns tuple of available scenes
many_scene_img.set_scene("Image:2") # sets the current operating scene to "Image:2"
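Because the scene is now held as state, code that needs every scene has to visit them one at a time. The sketch below shows one way that loop might look, using only the scenes, current_scene, set_scene, and dims.order names mentioned in these notes. FakeSceneImage is a hypothetical stand-in so the example runs without a file; a real AICSImage would be passed in its place.

```python
from types import SimpleNamespace
from typing import Dict


def dims_order_by_scene(img) -> Dict[str, str]:
    """Record the dimension order of every scene, then restore the original scene."""
    original = img.current_scene
    orders = {}
    for scene_id in img.scenes:
        img.set_scene(scene_id)
        orders[scene_id] = img.dims.order
    img.set_scene(original)
    return orders


class FakeSceneImage:
    """Hypothetical stand-in exposing only the attributes used above."""

    def __init__(self, orders_by_scene):
        self._orders = orders_by_scene
        self.scenes = tuple(orders_by_scene)
        self.current_scene = self.scenes[0]

    def set_scene(self, scene_id):
        if scene_id not in self.scenes:
            raise IndexError(scene_id)
        self.current_scene = scene_id

    @property
    def dims(self):
        return SimpleNamespace(order=self._orders[self.current_scene])


fake = FakeSceneImage({"Image:0": "TCZYX", "Image:1": "TCZYXS"})
orders = dims_order_by_scene(fake)
```

Restoring the original scene at the end keeps the helper from surprising callers that rely on the current scene staying put.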
RGB / BGR Support
Due to the scene management changes, we no longer use the "S" dimension to represent "Scene". We use it to represent the "Samples" dimension (RGB / BGR) which means we now have an isolated dimension for color data. This is great because it allows us to directly support multi-channel RGB data, where previously we would expand RGB data into channels, even when the file had a channel dimension.
So if you encounter a file with "S" in the dimensions, you can know that you are working with an RGB file.
FSSpec Adoption
Across the board we have adopted fsspec for file handling. With 4.x you can now provide any URI supported by the fsspec library or any implementations of fsspec (s3fs (AWS S3), gcsfs (Google Cloud Storage), adlfs (Azure Data Lake), etc.).
In many cases, this means we now support direct reading from local or remote data as a base part of our API. It also prepares us for supporting OME-Zarr!
from aicsimageio import AICSImage
wb_img = AICSImage("https://www.your-site.com/your-file.ome.tiff")
s3_img = AICSImage("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_img = AICSImage("gs://your-bucket/your-dir/your-file.ome.tiff")
To read from remote storage, you must install the related fsspec implementation library. For s3, for example, you must install s3fs.
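A small helper can make that requirement explicit before a read is attempted. The sketch below maps a URI scheme to the package named in these notes; the table covers only the s3 and gs examples shown above, and the helper name extra_package_for is hypothetical rather than part of aicsimageio or fsspec.

```python
from typing import Optional
from urllib.parse import urlparse

# Scheme-to-package pairs taken from the examples above; extend as needed.
FSSPEC_PACKAGES = {"s3": "s3fs", "gs": "gcsfs"}


def extra_package_for(uri: str) -> Optional[str]:
    """Return the fsspec implementation package a URI needs, if this table knows one."""
    scheme = urlparse(uri).scheme.lower()
    return FSSPEC_PACKAGES.get(scheme)


needs_s3 = extra_package_for("s3://your-bucket/your-dir/your-file.ome.tiff")
needs_gs = extra_package_for("gs://your-bucket/your-dir/your-file.ome.tiff")
needs_local = extra_package_for("my_file.ome.tiff")  # plain path, no scheme
```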
Splitting up Dependencies
To reduce the size of AICSImageIO on fresh installs and to make it easier to manage environments, we have split up dependencies into specific format installations.
By default, aicsimageio supports TIFF and OME-TIFF reading and writing. If you would like to install support for reading CZI files, you would simply append [czi] to the aicsimageio pip install: pip install aicsimageio[czi]. For multiple formats, you would add them as a comma separated string: pip install aicsimageio[czi,lif]. And to simply install support for reading all format implementations: pip install aicsimageio[all].
A full list of supported formats can be found here.
Roadmap and Motivation
After many discussions with the community we have finally written a roadmap and accompanying documentation for the library. If you are interested, please feel free to read them. Roadmap and Accompanying Documentation
Full List of Changes
- Added support for reading all image data (including metadata and coordinate information) into xarray.DataArray objects. This can be accessed with AICSImage.xarray_data, AICSImage.xarray_dask_data, or Reader equivalents. Where possible, we create and attach coordinate planes to the xarray.DataArray objects to support more options for indexed data selection such as timepoint selection by unit of time, or pixel selection by micrometers.
- Added support for reading multi-channel RGB imaging data utilizing a new Samples (S) dimension. This dimension will only appear in the AICSImage, Reader, and the produced np.ndarray, da.Array, xr.DataArray if present in the file and will be the last dimension, i.e. if provided an RGB image, AICSImage and related object dimensions will be "...YXS", if provided a single sample / greyscale image, AICSImage and related object dimensions will be "...YX". (This change also applies to DefaultReader and PNG, JPG, and similar formats)
- OmeTiffReader now validates the found XML metadata against the referenced specification. If your file has invalid OME XML, this reader will fail and roll back to the base TiffReader. (In the process of updating this we found many bugs in our 3.x series OmeTiffWriter, our 4.x OmeTiffReader fixes these bugs at read time but that doesn't mean the file contains valid OME XML. It is recommended to upgrade and start using the new and improved, and validated, OmeTiffWriter.)
- DefaultReader now fully supports reading "many-image" formats such as GIF, MP4, etc.
- OmeTiffReader.metadata is now returned as the OME object from ome-types. This change additionally removes the vendor.OMEXML object.
- Dimensions received an overhaul -- when you use AICSImage.dims or Reader.dims you will be returned a Dimensions object. Using this object you can get the native order of the dimensions and each dimension's size through attributes, i.e. AICSImage.dims.X returns the size of the X dimension, AICSImage.dims.order returns the string native order of the dimensions such as "TCZYX". Due to these changes we have removed all size functions and size_{dim} properties from various objects.
- Replaced function get_physical_pixel_size with attribute physical_pixel_sizes that returns a new PhysicalPixelSizes NamedTuple object. This now allows attribute access for each physical dimension, i.e. PhysicalPixelSizes.X returns the size of each X dimension pixel. This object can be cast to a base tuple but be aware that we have reversed the order from XYZ to ZYX to match the rest of the library's standard dimension order. Additionally, `PhysicalPixelSizes...
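The reversal from XYZ to ZYX is easy to trip over when porting 3.x code that unpacked pixel sizes positionally. The sketch below illustrates the difference with a stand-in NamedTuple; PixelSizesSketch and from_legacy_xyz are hypothetical names for illustration and the sample values are made up, so treat this as a picture of the ordering rather than the library's own class.

```python
from typing import NamedTuple, Tuple


class PixelSizesSketch(NamedTuple):
    """Illustrative stand-in with ZYX field order; not the library's class."""

    Z: float
    Y: float
    X: float


def from_legacy_xyz(xyz: Tuple[float, float, float]) -> PixelSizesSketch:
    """Convert a 3.x-style (X, Y, Z) tuple to the ZYX order."""
    x, y, z = xyz
    return PixelSizesSketch(Z=z, Y=y, X=x)


sizes = from_legacy_xyz((0.108, 0.108, 0.29))
as_tuple = tuple(sizes)  # positional order is now Z, Y, X
```

Reading sizes.X, sizes.Y, and sizes.Z by name avoids depending on the positional order at all.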
Dask Must Be Delayed
AICSImageIO 3.3.0
We are happy to announce the release of AICSImageIO 3.3.0!
AICSImageIO is a library for delayed parallel image reading, metadata parsing, and image writing for microscopy formats in pure Python. It is built on top of Dask to allow for any size image to act as a normal array as well as allow for distributed reading in parallel on your local machine or an HPC cluster.
Highlights
Non-Dask Functions and Properties are Fully In-Memory
The only change made in this release is to the internal behavior of our API. We found that users were very confused and questioned why certain operations were incredibly slow while others were incredibly fast when considering the behaviors together.
Specifically, why did the following:
img = AICSImage("my_file.ome.tiff")
img.data
my_chunk = img.get_image_data("ZYX", C=1) # the actual data we want to retrieve
Complete faster than this:
img = AICSImage("my_file.ome.tiff")
my_chunk = img.get_image_data("ZYX", C=1) # the actual data we want to retrieve
(the difference being: preloading the entire image into memory rather than the get_image_data function simply using the delayed array)
To resolve this we have made an internal change to the behavior of the library that we will hold consistent moving forward.
If the word dask is not found in the function or property name when dealing with image data, the entire image will be read into memory in full prior to the function or property completing its operation.
* In essence this is simply moving that preload into any of the related functions and properties.
The end result is that the user should see much faster read times when using get_image_data.
If the user was using this function on a too-large-for-memory image, they will have to change over to using get_image_dask_data and call .compute on the returned value.
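One way to keep both paths available is a small wrapper that picks between them. The sketch below uses only the get_image_data, get_image_dask_data, and .compute names from these notes. The read_channel helper, its fits_in_memory flag, and the fake classes are hypothetical, added so the example runs without a file and shows which kind of read each path would trigger.

```python
def read_channel(img, channel: int, fits_in_memory: bool):
    """Pick the eager or the delayed read path for one channel as a ZYX block."""
    if fits_in_memory:
        # Non-dask name: the whole image is loaded first, then sliced.
        return img.get_image_data("ZYX", C=channel)
    # Dask name: only the requested chunk is read when .compute() runs.
    return img.get_image_dask_data("ZYX", C=channel).compute()


class FakeDelayed:
    """Hypothetical stand-in for a delayed array."""

    def __init__(self, log):
        self._log = log

    def compute(self):
        self._log.append("read requested chunk")
        return "chunk"


class FakeLazyImage:
    """Hypothetical stand-in that records what would be read from disk."""

    def __init__(self):
        self.log = []

    def get_image_data(self, dims, **kwargs):
        self.log.append("read full image")
        return "chunk"

    def get_image_dask_data(self, dims, **kwargs):
        return FakeDelayed(self.log)


small, large = FakeLazyImage(), FakeLazyImage()
read_channel(small, channel=1, fits_in_memory=True)
read_channel(large, channel=1, fits_in_memory=False)
```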
Contributors and Reviewers this Release (alphabetical)
Madison Bowden (@AetherUnbound)
Jackson Maxfield Brown (@JacksonMaxfield)
Jamie Sherman (@heeler)
Dan Toloudis (@toloudis)
v3.2.0 LIF Support and Reader Optimizations
AICSImageIO 3.2.0
We are happy to announce the release of AICSImageIO 3.2.0!
AICSImageIO is a library for delayed parallel image reading, metadata parsing, and image writing for microscopy formats in pure Python. It is built on top of Dask to allow for any size image to act as a normal array as well as allow for distributed reading in parallel on your local machine or an HPC cluster.
Highlights
LifReader
We now support reading 6D STCZYX Leica Image Files (LIF) and their metadata. Like all readers, this is implemented in a way that can be used to read and interact with any size file.
from aicsimageio import AICSImage, readers, imread
img = AICSImage("my_file.lif")
img = readers.LifReader("my_file.lif")
data = imread("my_file.lif")
Optimized Readers
After releasing v3.1.0, we noticed that our single threaded full image read performance wasn't as fast as the base file readers or similar microscopy file readers (czifile, tifffile, imageio, etc.) and set about resolving and optimizing our readers.
We are happy to report that in release v3.2.0 we now have comparable single threaded performance to similar libraries. And, like before, single threaded reading, regardless of library, is beaten out when using a distributed cluster for parallel reading.
See our documentation on benchmarks for more information.
napari-aicsimageio
We were so excited for napari's 0.3.0 release that we may have made a plugin early. If you haven't seen it, napari-aicsimageio has also been released!
pip install napari-aicsimageio
By simply installing the plugin you get both a delayed and in-memory version of aicsimageio to use in napari so you get all the benefits of aicsimageio in the wonderful application that is napari.
Other Additions and Changes
- Allow sequences and iterables to be passed to AICSImage.get_image_data and related functions
from aicsimageio import AICSImage
img = AICSImage("my_file.czi")
data = img.get_image_data("CZYX", S=0, T=0, Z=slice(0, -1, 5)) # get every fifth Z slice
data = img.get_image_data("CZYX", S=0, T=0, Z=[0, -1]) # get first and last Z slices
- Fix written out OME metadata to support loading in ZEN
- Various package maintenance tasks
- Convert package to use Black formatting
- Update PR template for easier introduction to contributing
- Move test resources to S3
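The slice and list selectors in the first item above follow familiar Python indexing rules, which the plain-Python sketch below spells out. It assumes the library applies standard slice semantics; select_planes is a hypothetical helper working on a list of labels, not an aicsimageio function.

```python
from typing import List, Sequence, Union

Selector = Union[int, slice, Sequence[int]]


def select_planes(planes: List[str], key: Selector) -> List[str]:
    """Select planes by a single index, a slice, or a sequence of indices."""
    if isinstance(key, int):
        return [planes[key]]
    if isinstance(key, slice):
        return planes[key]
    return [planes[i] for i in key]


# Eleven hypothetical Z planes labelled Z0 through Z10.
planes = [f"Z{i}" for i in range(11)]

every_fifth = select_planes(planes, slice(0, -1, 5))
first_and_last = select_planes(planes, [0, -1])
```

Note that under standard slice semantics a stop of -1 excludes the final plane, so every_fifth holds Z0 and Z5 but not Z10; slice(None, None, 5) would include it.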
Contributors and Reviewers this Release (alphabetical)
Madison Bowden (@AetherUnbound)
Jackson Maxfield Brown (@JacksonMaxfield)
Jamie Sherman (@heeler)
Dan Toloudis (@toloudis)
v3.1.0 Delayed Dask Readers
To support large file reading and parallel image processing, we have converted all image readers over to using dask for data management.
What this means for the user:
- No breaking changes when using the 3.0.* API: AICSImage.data, AICSImage.get_image_data, and imread all still return full image file reads back as numpy.ndarray.
- New properties and functions for dask specific handling (AICSImage.dask_data, AICSImage.get_image_dask_data, imread_dask) return delayed dask arrays (dask.array.core.Array).
- When using the dask properties and functions, data will not be read until requested. If you want just the first channel of an image, AICSImage.get_image_dask_data("STZYX", C=0) will only read and return a five dimensional dask array instead of reading the entire image and then selecting the data down.
A single breaking change:
- We no longer support handing in file pointers or buffers.
If you want multiple workers to read or process the image, the context manager for AICSImage and all Reader classes now spawns or connects to a Dask cluster and client for the duration of the context manager. If you want to keep it open for longer than a single image, use the context manager exposed from dask_utils.cluster_and_client.
Extras:
napari has been directly added as an "interactive" dependency and if installed, the AICSImage.view_napari function is available for use. This function will launch a napari viewer with some default settings that we find to be good for viewing the data that aicsimageio generally interacts with.