Commit e23b16f

Merge pull request #16 from UCL-ARC/km/docs-expansion

Expand user/developer documentation

K-Meech authored Sep 12, 2024
2 parents cad22bd + 7ff97da
Showing 5 changed files with 299 additions and 54 deletions.
199 changes: 145 additions & 54 deletions README.md

[…] the same scanning session. The analysis steps (including pre- and post-processing) are run using:
- [UNet-pgs](https://www.sciencedirect.com/science/article/pii/S1053811921004171?via%3Dihub) : A segmentation pipeline
for white matter hyperintensities (WMHs) using U-Net.

- [MRIcroGL](https://www.nitrc.org/projects/mricrogl) : A tool including a graphical interface for dcm2niix to convert
DICOM images to NIfTI format.

For details of the processing steps, see the [pipeline documentation](/docs/pipeline.md).

The pipeline is available as a [Docker](https://www.docker.com/) or [Apptainer](https://apptainer.org/) container,
allowing it to be run on many different systems.

## How to run the pipeline

Setting up and running the pipeline requires the following steps, which are explained in detail in the sections below:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart TD
installation("`Install prerequisites`")
convert("`Convert images to NIfTI`")
structure("`Create standard directory structure`")
build("`Build Docker / Apptainer image`")
run("`Run the container with Docker / Apptainer`")
installation --> convert
convert --> structure
structure --> build
build --> run
```

## 1. Install prerequisites

If your MRI data isn't in NIfTI format, download [MRIcroGL from their website](https://www.nitrc.org/projects/mricrogl).

If you want to run the container via Docker, install [Docker Desktop](https://docs.docker.com/get-started/get-docker/).
They have installation instructions for [Mac](https://docs.docker.com/desktop/install/mac-install/),
[Windows](https://docs.docker.com/desktop/install/windows-install/), and
[Linux](https://docs.docker.com/desktop/install/linux-install/).
If you want to use Apptainer instead, then follow the
[installation instructions on their website](https://apptainer.org/docs/user/main/quick_start.html).
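
Either way, you can check the installation from a terminal:

```bash
docker --version     # if you installed Docker
apptainer --version  # if you installed Apptainer
```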

## 2. Convert images to NIfTI format (if required)

If your images aren't in NIfTI format, you can use [`dcm2niix`](https://github.com/rordenlab/dcm2niix) to convert them.
This is available via the command line or via a graphical interface inside
[MRIcroGL](https://www.nitrc.org/projects/mricrogl). Steps for converting with the graphical interface are provided
below, followed by an example command-line invocation:

Open MRIcroGL and select `Import > Convert DICOM to NIfTI`

![MRIcroGL user interface with the DICOM to NIfTI option highlighted](/docs/images/MRIcroGL_window.png)

Set the output file format to `%i_%p` to include the patient ID and series description in the filename.

![dcm2niix user interface](/docs/images/MRIcroGL_convert_dicom.png)

You can then drag and drop DICOM files and folders onto the right side of the window to convert them. Alternatively,
you can click the `Select Folder To Convert` button (at the bottom of the window's left side) to select a folder to
convert directly.
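
If you prefer the command line, a typical `dcm2niix` invocation looks like the sketch below (paths are illustrative):

```bash
# -f %i_%p : name output files with patient ID and series information (as above)
# -z y     : compress output to .nii.gz
# -o       : output directory
dcm2niix -f %i_%p -z y -o /path/to/output /path/to/dicom_folder
```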

## 3. Create standard directory structure

Create a directory (anywhere on your computer) to hold your input image data and the generated results.

Inside this directory:

- Create `code` and `data` directories. The `code` folder should remain empty.

- Inside the `data` folder, create a `subjects.txt` file that contains subject identifiers (one per line); these
  could be numeric subject IDs, for example.

- For each subject ID:

  - Create a directory at `data/subject-id` (replacing 'subject-id' with the relevant ID from your `subjects.txt`
    file).

  - Create a sub-directory inside the 'subject-id' directory called `niftis`.

  - Inside `niftis`, place the subject's T1 MRI scan and FLAIR MRI scan. Both files should be in NIfTI format
    (ending `.nii.gz`) and contain `T1` or `FLAIR` in their names, respectively.

Your final file structure should look like the example below (for two example subject IDs):

```bash
Enigma-PD-WML
├───code
└───data
    ├───subject-1
    │   └───niftis
    │       ├───T1.nii.gz
    │       └───FLAIR.nii.gz
    ├───subject-2
    │   └───niftis
    │       ├───T1.nii.gz
    │       └───FLAIR.nii.gz
    └───subjects.txt
```
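
If you have many subjects, a short shell snippet can create this skeleton from your `subjects.txt` (a sketch, assuming
it is run from the top-level directory; you still need to copy each subject's scans into place):

```bash
mkdir -p code
# Create data/<subject-id>/niftis for every ID listed in subjects.txt
while read -r subject_id; do
  mkdir -p "data/${subject_id}/niftis"
done < data/subjects.txt
```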

## 4. Build the Docker / Apptainer image

To build the image (in Docker or Apptainer), you have the following options:

[…]

To convert a locally built Docker image for use with Apptainer, save it to a tar archive and build a `.sif` file
from it:

```bash
docker image save enigma-pd-wml -o enigma-pd-wml.tar
apptainer build enigma-pd-wml.sif docker-archive:enigma-pd-wml.tar
```

## 5. Run the container

Below are various ways to run the container. For each, make sure you run the command from the top level of the
directory you made in the ['Create standard directory structure' section](#3-create-standard-directory-structure).
Note [there are some options](#options) you can add to the end of the docker/apptainer command.

If you encounter issues when running the pipeline, check the [output logs](#output-logs) for any errors.

### Via Docker (using image from Docker Hub)

```bash
docker run -v "$(pwd)":/home -v "$(pwd)"/code:/code -v "$(pwd)"/data:/data enigma-pd-wml
```

### Via Apptainer

You'll need to put the `.sif` file in the top level of the directory you made in the
['Create standard directory structure' section](#3-create-standard-directory-structure), or provide the full path to
its location.

```bash
apptainer run --bind ${PWD}:/home --bind ${PWD}/code:/code --bind ${PWD}/data:/data enigma-pd-wml.sif
```

### Options

- `-n` : the number of jobs to run in parallel. For example, append `-n 5` to the end of the docker/apptainer
  command:

```bash
apptainer run --bind ${PWD}:/home --bind ${PWD}/code:/code --bind ${PWD}/data:/data enigma-pd-wml.sif \
  -n 5
```

A good default value is the number of cores on your system, but be
[wary of increased memory usage](#tensorflow-memory-usage).

- `-o` : whether to overwrite existing output files

By default (without `-o`), the pipeline will try to re-use any existing output files, skipping steps that are already
complete. This is useful if, for example, the pipeline fails at a late stage and you want to run it again, without
having to re-run time-consuming earlier steps. With `-o` the pipeline will run all steps again and ensure any previous
output is overwritten.
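
For example, both options can be appended to the end of the Docker command shown earlier (the `-n` value here is
illustrative):

```bash
docker run -v "$(pwd)":/home -v "$(pwd)"/code:/code -v "$(pwd)"/data:/data enigma-pd-wml -n 4 -o
```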

## Pipeline output

### Output images

The main pipeline output will be written to a zip file (per subject) at
`/data/UNet-pgs/subject-id/subject-id_results.zip`.

This should contain six files within an `output` directory:

- `results2mni_lin.nii.gz`: WML segmentations linearly transformed to MNI space.

- `results2mni_lin_deep.nii.gz`: WML segmentations (deep white matter) linearly transformed to MNI space.

- `results2min_lin_perivent.nii.gz`: WML segmentations (periventricular) linearly transformed to MNI space.

- `results2mni_nonlin.nii.gz`: WML segmentations non-linearly transformed to MNI space.

- `results2min_nonlin_deep.nii.gz`: WML segmentations (deep white matter) non-linearly transformed to MNI space.

- `results2mni_nonlin_perivent.nii.gz`: WML segmentations (periventricular) non-linearly transformed to MNI space.
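
Since the host's `data` directory is mounted at `/data` in the container, you can inspect a subject's results zip from
the top-level directory without extracting it (the subject ID below is illustrative):

```bash
unzip -l data/UNet-pgs/subject-1/subject-1_results.zip
```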

### Output logs

Pipeline logs can be found at:

- `/code/overall_log.txt`: contains minimal information about the initial pipeline setup.

- `/code/subject_logs`: one log per subject, containing information about the various processing steps.
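
As the host's `code` directory is mounted at `/code`, a quick way to scan all subject logs for problems after a failed
run is a recursive search (a simple sketch, assuming plain-text logs):

```bash
grep -riE "error|killed" code/subject_logs/
```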

## Common issues

### Tensorflow memory usage

A common issue is UNet-pgs failing due to high memory usage. You may see warnings / errors in your subject logs
similar to:

- `tensorflow/core/framework/allocator.cc:124] Allocation of 675840000 exceeds 10% of system memory.`

- `/WMHs_segmentation_PGS.sh: line 14: 69443 Killed`

You may want to try:

- Running the pipeline on a system with more memory

- Reducing the number of jobs passed to the `-n` option (if you're using it). This will slow down the pipeline, but
  also reduce overall memory usage.

## For developers

Some brief notes on the development setup for this repository are provided in a
[separate developer docs file](/docs/developer.md).
60 changes: 60 additions & 0 deletions docs/developer.md
# Developer docs

## Making new releases to Docker Hub

This repository has a GitHub Actions workflow to automate uploading to
[Docker Hub](https://hub.docker.com/r/hamiedaharoon24/enigma-pd-wml/tags) when a new release is made on GitHub.

- Go to [the releases tab](https://github.com/UCL-ARC/Enigma-PD-WML/releases) and click 'Draft a new release'.

- Click 'Choose a tag' and enter a new version number, e.g. `v1.0.0`.

- Click 'Generate release notes'. This will add a summary of any commits since the last release.

- Click the green 'Publish release' button at the bottom left.

- This will trigger the action to run and upload the code on the `main` branch to Docker Hub. Note: as the image is very
large, this will take a while! (around 15 minutes)
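
Once the workflow has finished, you can sanity-check the upload by pulling the image; the tag below is illustrative,
so check the Docker Hub tags page for the real ones:

```bash
docker pull hamiedaharoon24/enigma-pd-wml:v1.0.0
```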

## Linting setup (pre-commit)

This repository has another GitHub Actions workflow to run various linting checks on pull requests / commits to `main`.
This uses [`pre-commit`](https://pre-commit.com/), a Python-based tool. The enabled checks can be seen/updated in the
[pre-commit configuration file](https://github.com/UCL-ARC/Enigma-PD-WML/blob/main/.pre-commit-config.yaml).

Some of the main ones used are:

- [hadolint](https://github.com/hadolint/hadolint): for linting Dockerfiles
- [shellcheck](https://www.shellcheck.net/): for linting shell scripts

It can be useful to run `pre-commit` locally to catch issues early. To do so, you will need to have Python installed
locally (for example, by installing [Miniforge](https://github.com/conda-forge/miniforge) or similar).

Then run:

```bash
pip install pre-commit
```

Then (from inside a local clone of this github repository), run:

```bash
pre-commit install
```

`pre-commit` should now run automatically every time you `git commit`, flagging any issues.
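
You can also run all the checks against every file in the repository, which is useful for a first pass:

```bash
pre-commit run --all-files
```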

## Some notes on the Dockerfile

There are two main components to the Dockerfile:

- The requirements for UNet-pgs
- The requirements for FSL

All requirements for the UNet-pgs workflow come from the
[base pgs image](https://hub.docker.com/r/cvriend/pgs/tags), including the segmentation bash script and packages such
as TensorFlow.

FSL is installed as detailed in their [installation docs](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/container)
and [configuration docs](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/configuration). We're using the `-V` option at
the end of the `fslinstaller` command to
[pin it to a specific FSL version](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/index?id=installing-older-versions-of-fsl).
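
As an illustrative sketch only (the destination directory and version number below are placeholders, not necessarily
the values the image uses), pinning an FSL version with the installer looks like:

```bash
# Download the FSL installer and pin a specific version with -V
# (paths and version are illustrative placeholders).
wget https://fsl.fmrib.ox.ac.uk/fsldownloads/fslconda/releases/fslinstaller.py
python fslinstaller.py -d /usr/local/fsl -V 6.0.7.1
```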
Binary file added docs/images/MRIcroGL_convert_dicom.png
Binary file added docs/images/MRIcroGL_window.png