codex-pipeline

A CWL pipeline for processing CODEX image data, using Cytokit.

Pipeline steps

Collect required parameters from metadata files.
Perform illumination correction with the Fiji plugin BaSiC.
Find the sharpest z-plane for each channel, using variation of Laplacian.
Stitch tiles using the Fiji plugin BigStitcher.
Create a Cytokit YAML config file containing parameters from the input metadata.
Run Cytokit's processor command to perform tile pre-processing, and nucleus and cell segmentation.
Run Cytokit's operator command to extract all antigen fluorescence images (discarding blanks and empty channels).
Generate OME-TIFF versions of the TIFFs created by Cytokit.
Stitch tiles with segmentation masks.
Perform downstream analysis using SPRM.
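
The sharpest-plane step can be illustrated with a minimal sketch. It assumes the focus measure is the variance of a 4-neighbour discrete Laplacian and that a z-stack is a NumPy array of shape (z, y, x); the function names are hypothetical and the pipeline's own implementation may differ.

```python
import numpy as np


def laplacian_variance(plane: np.ndarray) -> float:
    """Focus score: variance of a 4-neighbour Laplacian over interior pixels."""
    img = plane.astype(np.float64)
    lap = (
        img[:-2, 1:-1] + img[2:, 1:-1]      # neighbours above and below
        + img[1:-1, :-2] + img[1:-1, 2:]    # neighbours left and right
        - 4.0 * img[1:-1, 1:-1]             # centre pixel
    )
    return float(lap.var())


def sharpest_plane(stack: np.ndarray) -> int:
    """Return the 1-based index of the z-plane with the highest focus score.

    `stack` is assumed to have shape (z, y, x).
    """
    scores = [laplacian_variance(plane) for plane in stack]
    return int(np.argmax(scores)) + 1
```

A blurred plane has a smooth intensity profile, so its Laplacian is close to zero everywhere and the score is low; a plane with sharp edges gives a high score.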

Requirements

Please use the HuBMAP Consortium fork of cwltool to run the pipeline with GPU support in Docker and Singularity containers.
For the list of required Python packages, see environment.yml.

How to run

cwltool pipeline.cwl subm.yaml

If you use Singularity containers, add --singularity. An example submission file, subm.yaml, is provided in the repo.

Expected input directory and file structure

codex_dataset/
src_data OR raw
    ├── channelnames.txt
    ├── channelnames_report.csv
    ├── experiment.json
    ├── exposure_times.txt
    ├── segmentation.json
    ├── Cyc1_reg1 OR Cyc001_reg001  
    │     ├── 1_00001_Z001_CH1.tif
    │     ├── 1_00001_Z001_CH2.tif
    │     │              ...
    │     └── 1_0000N_Z00N_CHN.tif
    └── Cyc1_reg2 OR Cyc001_reg002  
          ├── 2_00001_Z001_CH1.tif
          ├── 2_00001_Z001_CH2.tif
          │             ...
          └── 2_0000N_Z00N_CHN.tif

Images should be separated into directories by cycle and region, using the pattern Cyc{cycle:d}_reg{region:d}. File names must contain the region, tile, z-plane and channel ids, each starting from 1, and follow the pattern {region:d}_{tile:05d}_Z{zplane:03d}_CH{channel:d}.tif.
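
The naming patterns above can be checked with regular expressions. This is a minimal sketch, not the pipeline's own parser: the names TileId, parse_tile_name and parse_cycle_dir are hypothetical, and the zero-padded fields are treated as minimum widths.

```python
import re
from typing import NamedTuple, Optional, Tuple

# {tile:05d} and {zplane:03d} are zero-padded to at least 5 and 3 digits.
TILE_RE = re.compile(
    r"^(?P<region>\d+)_(?P<tile>\d{5,})_Z(?P<zplane>\d{3,})_CH(?P<channel>\d+)\.tif$"
)
DIR_RE = re.compile(r"^Cyc(?P<cycle>\d+)_reg(?P<region>\d+)$")


class TileId(NamedTuple):
    region: int
    tile: int
    zplane: int
    channel: int


def parse_tile_name(name: str) -> Optional[TileId]:
    """Parse an image file name; return None if it does not follow the pattern."""
    m = TILE_RE.match(name)
    if m is None:
        return None
    ids = TileId(*(int(m.group(field)) for field in TileId._fields))
    # All ids start from 1, so a zero anywhere is invalid.
    return ids if min(ids) >= 1 else None


def parse_cycle_dir(name: str) -> Optional[Tuple[int, int]]:
    """Parse a directory name into (cycle, region); return None on mismatch."""
    m = DIR_RE.match(name)
    if m is None:
        return None
    cycle, region = int(m.group("cycle")), int(m.group("region"))
    return (cycle, region) if min(cycle, region) >= 1 else None
```

For example, parse_tile_name("2_00012_Z003_CH4.tif") yields region 2, tile 12, z-plane 3 and channel 4, while both "Cyc1_reg2" and "Cyc001_reg002" parse to cycle 1, region 2.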

The following metadata files must be present in the input directory:

experiment.json - acquisition parameters and data structure;
segmentation.json - which channel from which cycle to use for segmentation;
channelnames.txt - list of channel names, one per row;
channelnames_report.csv - which channels to use, and which to exclude;
exposure_times.txt - not used at the moment, but will be useful for background subtraction.

Examples of these files are present in the directory metadata_examples. Note: all fields related to regions, cycles, channels, z-planes and tiles start from 1, and xyResolution and zPitch are measured in nm.
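
Before launching the pipeline, it can be useful to confirm that the metadata files listed above are in place. The following is a minimal sketch; missing_metadata is a hypothetical helper and is not part of the pipeline.

```python
from pathlib import Path
from typing import List

# Metadata files the README lists as required in the input directory.
REQUIRED_METADATA = (
    "experiment.json",
    "segmentation.json",
    "channelnames.txt",
    "channelnames_report.csv",
    "exposure_times.txt",
)


def missing_metadata(data_dir: Path) -> List[str]:
    """Return the names of required metadata files absent from data_dir."""
    return [name for name in REQUIRED_METADATA if not (data_dir / name).is_file()]
```

Calling missing_metadata(Path("codex_dataset/src_data")) returns an empty list when every file is present.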

Output file structure

pipeline_output/
├── expr
│   ├── reg001_expr.ome.tiff
│   └── reg002_expr.ome.tiff
└── mask
    ├── reg001_mask.ome.tiff
    └── reg002_mask.ome.tiff

The expr directory contains processed images, and the mask directory contains segmentation masks. The output of SPRM is structured differently; see https://github.com/hubmapconsortium/sprm.
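
Downstream code often needs each region's image together with its mask. The sketch below assumes the layout shown above and that masks are named reg{NNN}_mask.ome.tiff, as reg001_mask.ome.tiff suggests; pair_outputs is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, Tuple

EXPR_SUFFIX = "_expr.ome.tiff"
MASK_SUFFIX = "_mask.ome.tiff"


def pair_outputs(output_dir: Path) -> Dict[str, Tuple[Path, Path]]:
    """Map each region name (e.g. 'reg001') to its (expr, mask) file pair.

    Regions whose mask file is missing are skipped.
    """
    pairs: Dict[str, Tuple[Path, Path]] = {}
    for expr in sorted((output_dir / "expr").glob("reg*" + EXPR_SUFFIX)):
        region = expr.name[: -len(EXPR_SUFFIX)]
        mask = output_dir / "mask" / (region + MASK_SUFFIX)
        if mask.is_file():
            pairs[region] = (expr, mask)
    return pairs
```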

Development

Code in this repository is formatted with black and isort, and this is checked via Travis CI.

A pre-commit hook configuration is provided, which runs black and isort before committing. Run pre-commit install in each clone of this repository which you will use for development (after pip install pre-commit into an appropriate Python environment, if necessary).

Building containers

Two Dockerfiles are included in this repository. A docker_images.txt manifest is included, which is intended for use in the build_docker_containers script provided by the multi-docker-build Python package. This package can be installed with

python -m pip install multi-docker-build

Release process

The master branch is intended to be production-ready at all times, and should always reference Docker containers with the latest tag.

Publication of tagged "release" versions of the pipeline is handled with the HuBMAP pipeline release management Python package. To release a new pipeline version, ensure that the master branch contains all commits that you want to include in the release, then run

tag_release_pipeline v0.whatever

See the pipeline release management script usage notes for additional options, such as GPG signing.

