A CWL pipeline for processing CODEX image data, using Cytokit.
- Collect required parameters from metadata files.
- Perform illumination correction with Fiji plugin BaSiC
- Find sharpest z-plane for each channel, using variation of Laplacian
- Perform stitching of tiles using Fiji plugin BigStitcher
- Create Cytokit YAML config file containing parameters from input metadata
- Run Cytokit's `processor` command to perform tile pre-processing, and nucleus and cell segmentation.
- Run Cytokit's `operator` command to extract all antigen fluorescence images (discarding blanks and empty channels).
- Generate OME-TIFF versions of TIFFs created by Cytokit.
- Stitch tiles with segmentation masks
- Perform downstream analysis using SPRM.
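
The z-plane selection step above can be illustrated with a small sketch. It assumes that "variation of Laplacian" refers to the common variance-of-Laplacian focus measure; the pipeline's actual implementation may differ. The function names `laplacian_variance` and `sharpest_plane_index` are hypothetical, and plain nested lists are used only to keep the example dependency-free (real images would normally be handled with an array library).

```python
from statistics import pvariance


def laplacian_variance(plane):
    """Variance of the 4-neighbour discrete Laplacian over interior pixels."""
    height, width = len(plane), len(plane[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                plane[y - 1][x] + plane[y + 1][x]
                + plane[y][x - 1] + plane[y][x + 1]
                - 4 * plane[y][x]
            )
    return pvariance(responses)


def sharpest_plane_index(z_stack):
    """Return the 1-based index of the plane with the highest focus score."""
    scores = [laplacian_variance(plane) for plane in z_stack]
    return scores.index(max(scores)) + 1


# Toy z-stack: a flat (featureless) plane and a checkerboard (high-contrast) plane.
flat = [[5] * 4 for _ in range(4)]
checker = [[(x + y) % 2 * 10 for x in range(4)] for y in range(4)]
best = sharpest_plane_index([flat, checker])
```

A higher score means stronger local intensity changes, which is why an in-focus plane tends to score above a blurred one.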
Please use the HuBMAP Consortium fork of cwltool to be able to run the pipeline with GPU in Docker and Singularity containers.
For the list of Python packages, check `environment.yml`.
cwltool pipeline.cwl subm.yaml
If you use Singularity containers, add `--singularity`. An example submission file, `subm.yaml`, is provided in the repo.
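
For scripted runs, the command line can be assembled programmatically. The sketch below only builds the argument list and does not execute anything; `build_cwltool_command` is a hypothetical helper, and the option is placed before the workflow file because cwltool takes its options ahead of the workflow and job file.

```python
import shlex


def build_cwltool_command(workflow="pipeline.cwl", job="subm.yaml", use_singularity=False):
    """Build the cwltool argument list, optionally enabling Singularity."""
    command = ["cwltool"]
    if use_singularity:
        command.append("--singularity")
    command += [workflow, job]
    return command


# Print the command as it would be typed in a shell.
print(shlex.join(build_cwltool_command(use_singularity=True)))
```

The returned list can be passed to `subprocess.run` once the HuBMAP fork of cwltool is installed.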
codex_dataset/
src_data OR raw
├── channelnames.txt
├── channelnames_report.csv
├── experiment.json
├── exposure_times.txt
├── segmentation.json
├── Cyc1_reg1 OR Cyc001_reg001
│   ├── 1_00001_Z001_CH1.tif
│   ├── 1_00001_Z001_CH2.tif
│   │   ...
│   └── 1_0000N_Z00N_CHN.tif
└── Cyc1_reg2 OR Cyc001_reg002
    ├── 2_00001_Z001_CH1.tif
    ├── 2_00001_Z001_CH2.tif
    │   ...
    └── 2_0000N_Z00N_CHN.tif
Images should be separated into directories by cycle and region, using the pattern `Cyc{cycle:d}_reg{region:d}`.
The file names must contain region, tile, z-plane and channel IDs starting from 1, and follow the pattern `{region:d}_{tile:05d}_Z{zplane:03d}_CH{channel:d}.tif`.
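
To check names against these two patterns before a run, something like the following sketch could be used. The regular expressions are a translation of the format strings above; `parse_ids` is a hypothetical helper, not part of the pipeline.

```python
import re

# Cyc{cycle:d}_reg{region:d}
DIR_PATTERN = re.compile(r"^Cyc(?P<cycle>\d+)_reg(?P<region>\d+)$")
# {region:d}_{tile:05d}_Z{zplane:03d}_CH{channel:d}.tif
# (05d / 03d are minimum widths, so longer numbers are also accepted)
FILE_PATTERN = re.compile(
    r"^(?P<region>\d+)_(?P<tile>\d{5,})_Z(?P<zplane>\d{3,})_CH(?P<channel>\d+)\.tif$"
)


def parse_ids(name, pattern):
    """Extract integer IDs from a name, enforcing 1-based numbering."""
    match = pattern.match(name)
    if match is None:
        raise ValueError(f"name does not follow the expected pattern: {name!r}")
    ids = {key: int(value) for key, value in match.groupdict().items()}
    if min(ids.values()) < 1:
        raise ValueError(f"IDs must start from 1: {name!r}")
    return ids


dir_ids = parse_ids("Cyc001_reg002", DIR_PATTERN)
file_ids = parse_ids("1_00001_Z001_CH2.tif", FILE_PATTERN)
```

Both the short (`Cyc1_reg1`) and zero-padded (`Cyc001_reg001`) directory names match, since `\d+` accepts any number of digits.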
Necessary metadata files that must be present in the input directory:
- `experiment.json` - acquisition parameters and data structure;
- `segmentation.json` - which channel from which cycle to use for segmentation;
- `channelnames.txt` - list of channel names, one per row;
- `channelnames_report.csv` - which channels to use, and which to exclude;
- `exposure_times.txt` - not used at the moment, but will be useful for background subtraction.
Examples of these files are present in the directory `metadata_examples`.
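
A quick pre-flight check for the metadata files listed above might look like this sketch. `missing_metadata_files` is a hypothetical helper; it only tests for file presence and does not validate file contents.

```python
from pathlib import Path

# File names taken from the list of required metadata files.
REQUIRED_METADATA = (
    "experiment.json",
    "segmentation.json",
    "channelnames.txt",
    "channelnames_report.csv",
    "exposure_times.txt",
)


def missing_metadata_files(input_dir):
    """Return the required metadata file names that are absent from input_dir."""
    input_dir = Path(input_dir)
    return [name for name in REQUIRED_METADATA if not (input_dir / name).is_file()]
```

An empty list means every required file was found in the given directory.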
Note: all fields related to regions, cycles, channels, z-planes and tiles start from 1,
and `xyResolution` and `zPitch` are measured in nm.
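
Because many imaging tools report pixel sizes in micrometres, it is easy to mix up units. The sketch below converts the two nanometre fields; the numeric values are purely illustrative and not taken from any real dataset, and `nm_to_um` is a hypothetical helper.

```python
NM_PER_UM = 1000


def nm_to_um(value_nm):
    """Convert a length in nanometres to micrometres."""
    return value_nm / NM_PER_UM


# Illustrative values only; real ones come from experiment.json.
acquisition = {"xyResolution": 500.0, "zPitch": 1500.0}
sizes_um = {key: nm_to_um(value) for key, value in acquisition.items()}
```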
pipeline_output/
├── expr
│   ├── reg001_expr.ome.tiff
│   └── reg002_expr.ome.tiff
└── mask
    ├── reg001_mask.ome.tiff
    └── reg002_mask.ome.tiff
The `expr` directory contains processed images, and the `mask` directory contains segmentation masks.
The output of SPRM will be different, see https://github.com/hubmapconsortium/sprm .
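
To verify a finished run, the expected file list can be generated from the region IDs. This sketch assumes the naming seen in the tree above, that is, `reg{region:03d}_expr.ome.tiff` and `reg{region:03d}_mask.ome.tiff`, with the mask name inferred from the `reg001` entry; `expected_outputs` is a hypothetical helper and SPRM outputs are not covered.

```python
from pathlib import PurePosixPath


def expected_outputs(region_ids, root="pipeline_output"):
    """List the expr and mask OME-TIFF paths expected for the given regions."""
    root = PurePosixPath(root)
    paths = []
    for region in sorted(region_ids):
        paths.append(root / "expr" / f"reg{region:03d}_expr.ome.tiff")
        paths.append(root / "mask" / f"reg{region:03d}_mask.ome.tiff")
    return [str(path) for path in paths]


outputs = expected_outputs([1, 2])
```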
Code in this repository is formatted with black and isort, and this is checked via Travis CI.
A pre-commit hook configuration is provided, which runs `black` and `isort` before committing.
Run `pre-commit install` in each clone of this repository that you will use for development (after `pip install pre-commit`
into an appropriate Python environment, if necessary).
Two `Dockerfile`s are included in this repository. A `docker_images.txt` manifest is included, which is intended
for use with the `build_docker_containers` script provided by the `multi-docker-build` Python package.
This package can be installed with:
python -m pip install multi-docker-build
The `master` branch is intended to be production-ready at all times, and should always reference Docker containers
with the `latest` tag.
Publication of tagged "release" versions of the pipeline is handled with the
HuBMAP pipeline release management Python package. To release a new pipeline version,
ensure that the `master` branch contains all commits that you want to include in the release,
then run
tag_release_pipeline v0.whatever
See the pipeline release management script usage notes for additional options, such as GPG signing.