Commit e23b16f

Merge pull request #16 from UCL-ARC/km/docs-expansion

Expand user/developer documentation

K-Meech authored Sep 12, 2024
2 parents cad22bd + 7ff97da
Showing 5 changed files with 299 additions and 54 deletions.
199 changes: 145 additions & 54 deletions README.md

[…] the same scanning session. The analysis steps (including pre- and post-processing) are run using:
- [UNet-pgs](https://www.sciencedirect.com/science/article/pii/S1053811921004171?via%3Dihub) : A segmentation pipeline
for white matter hyperintensities (WMHs) using U-Net.

- [MRIcroGL](https://www.nitrc.org/projects/mricrogl) : A tool including a graphical interface for dcm2niix to convert
DICOM images to NIfTI format.

For details of the processing steps, see the [pipeline documentation](/docs/pipeline.md).

The pipeline is available as a [Docker](https://www.docker.com/) or [Apptainer](https://apptainer.org/) container,
allowing it to be run on many different systems.

## How to run the pipeline

Setting up and running the pipeline requires the following steps, which are explained in detail in the sections below:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart TD
installation("`Install prerequisites`")
convert("`Convert images to NIfTI`")
structure("`Create standard directory structure`")
build("`Build Docker / Apptainer image`")
run("`Run the container with Docker / Apptainer`")
installation --> convert
convert --> structure
structure --> build
build --> run
```

## 1. Install prerequisites

If your MRI data isn't in NIfTI format, download [MRIcroGL from their website](https://www.nitrc.org/projects/mricrogl).

If you want to run the container via Docker, install [Docker Desktop](https://docs.docker.com/get-started/get-docker/).
They have installation instructions for [Mac](https://docs.docker.com/desktop/install/mac-install/),
[Windows](https://docs.docker.com/desktop/install/windows-install/), and
[Linux](https://docs.docker.com/desktop/install/linux-install/).
If you want to use Apptainer instead, then follow the
[installation instructions on their website](https://apptainer.org/docs/user/main/quick_start.html).
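
Either way, you can check the installation from a terminal:

```bash
docker --version     # if you installed Docker
apptainer --version  # if you installed Apptainer
```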

## 2. Convert images to NIfTI format (if required)

If your images aren't in NIfTI format, you can use [`dcm2niix`](https://github.com/rordenlab/dcm2niix) to convert them.
This is available via the command line or via a graphical interface inside
[MRIcroGL](https://www.nitrc.org/projects/mricrogl). Steps for converting with the graphical interface are provided
below, followed by an example command-line invocation:

Open MRIcroGL and select `Import > Convert DICOM to NIfTI`

![MRIcroGL user interface with the DICOM to NIfTI option highlighted](/docs/images/MRIcroGL_window.png)

Set the output file format to `%i_%p` to include the patient ID and series description in the filename.

![dcm2niix user interface](/docs/images/MRIcroGL_convert_dicom.png)

You can then drag and drop DICOM files and folders onto the right side of the window to convert them. Alternatively,
you can click the `Select Folder To Convert` button (at the bottom of the window's left side) to select a folder to
convert directly.
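
If you prefer the command line, a typical `dcm2niix` invocation looks like the sketch below (paths are illustrative):

```bash
# -f %i_%p : name output files with patient ID and series information (as above)
# -z y     : compress output to .nii.gz
# -o       : output directory
dcm2niix -f %i_%p -z y -o /path/to/output /path/to/dicom_folder
```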

## 3. Create standard directory structure

Create a directory (anywhere on your computer) to hold your input image data and the generated results.

Inside this directory:

- Create `code` and `data` directories. The `code` folder should remain empty.

- Inside the `data` folder, create a `subjects.txt` file that contains subject identifiers (one per line); these
  could be numeric subject IDs, for example.

- For each subject ID:

  - Create a directory at `data/subject-id` (replacing 'subject-id' with the relevant ID from your `subjects.txt`
    file).

  - Create a sub-directory inside the 'subject-id' directory called `niftis`.

  - Inside `niftis`, place the subject's T1 MRI scan and FLAIR MRI scan. Both files should be in NIfTI format
    (ending `.nii.gz`) and contain `T1` or `FLAIR` in their names, respectively.

Your final file structure should look like the example below (for two example subject IDs):

```bash
Enigma-PD-WML
├───code
└───data
    ├───subject-1
    │   └───niftis
    │       ├───T1.nii.gz
    │       └───FLAIR.nii.gz
    ├───subject-2
    │   └───niftis
    │       ├───T1.nii.gz
    │       └───FLAIR.nii.gz
    └───subjects.txt
```
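
If you have many subjects, a short shell snippet can create this skeleton from your `subjects.txt` (a sketch, assuming
it is run from the top-level directory; you still need to copy each subject's scans into place):

```bash
mkdir -p code
# Create data/<subject-id>/niftis for every ID listed in subjects.txt
while read -r subject_id; do
  mkdir -p "data/${subject_id}/niftis"
done < data/subjects.txt
```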

## 4. Build the Docker / Apptainer image

To build the image (in Docker or Apptainer), you have the following options:

[…]

To convert a locally built Docker image for use with Apptainer, save it to a tar archive and build a `.sif` file
from it:

```bash
docker image save enigma-pd-wml -o enigma-pd-wml.tar
apptainer build enigma-pd-wml.sif docker-archive:enigma-pd-wml.tar
```

## 5. Run the container

Below are various ways to run the container. For each, make sure you run the command from the top level of the
directory you made in the ['Create standard directory structure' section](#3-create-standard-directory-structure).
Note [there are some options](#options) you can add to the end of the docker/apptainer command.

If you encounter issues when running the pipeline, check the [output logs](#output-logs) for any errors.

### Via Docker (using image from Docker Hub)

```bash
docker run -v "$(pwd)":/home -v "$(pwd)"/code:/code -v "$(pwd)"/data:/data enigma-pd-wml
```

### Via Apptainer

You'll need to put the `.sif` file in the top level of the directory you made in the
['Create standard directory structure' section](#3-create-standard-directory-structure), or provide the full path to
its location.

```bash
apptainer run --bind ${PWD}:/home --bind ${PWD}/code:/code --bind ${PWD}/data:/data enigma-pd-wml.sif
```

### Options

- `-n` : the number of jobs to run in parallel. For example, append `-n 5` to the end of the docker/apptainer
  command:

```bash
apptainer run --bind ${PWD}:/home --bind ${PWD}/code:/code --bind ${PWD}/data:/data enigma-pd-wml.sif \
  -n 5
```

A good default value is the number of cores on your system, but be
[wary of increased memory usage](#tensorflow-memory-usage).

- `-o` : whether to overwrite existing output files

By default (without `-o`), the pipeline will try to re-use any existing output files, skipping steps that are already
complete. This is useful if, for example, the pipeline fails at a late stage and you want to run it again, without
having to re-run time-consuming earlier steps. With `-o` the pipeline will run all steps again and ensure any previous
output is overwritten.
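
For example, both options can be appended to the end of the Docker command shown earlier (the `-n` value here is
illustrative):

```bash
docker run -v "$(pwd)":/home -v "$(pwd)"/code:/code -v "$(pwd)"/data:/data enigma-pd-wml -n 4 -o
```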

## Pipeline output

### Output images

The main pipeline output will be written to a zip file (per subject) at
`/data/UNet-pgs/subject-id/subject-id_results.zip`.

This should contain six files within an `output` directory:

- `results2mni_lin.nii.gz`: WML segmentations linearly transformed to MNI space.

- `results2mni_lin_deep.nii.gz`: WML segmentations (deep white matter) linearly transformed to MNI space.

- `results2min_lin_perivent.nii.gz`: WML segmentations (periventricular) linearly transformed to MNI space.

- `results2mni_nonlin.nii.gz`: WML segmentations non-linearly transformed to MNI space.

- `results2min_nonlin_deep.nii.gz`: WML segmentations (deep white matter) non-linearly transformed to MNI space.

- `results2mni_nonlin_perivent.nii.gz`: WML segmentations (periventricular) non-linearly transformed to MNI space.
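
Since the host's `data` directory is mounted at `/data` in the container, you can inspect a subject's results zip from
the top-level directory without extracting it (the subject ID below is illustrative):

```bash
unzip -l data/UNet-pgs/subject-1/subject-1_results.zip
```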

### Output logs

Pipeline logs can be found at:

- `/code/overall_log.txt`: contains minimal information about the initial pipeline setup.

- `/code/subject_logs`: one log per subject, containing information about the various processing steps.
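
As the host's `code` directory is mounted at `/code`, a quick way to scan all subject logs for problems after a failed
run is a recursive search (a simple sketch, assuming plain-text logs):

```bash
grep -riE "error|killed" code/subject_logs/
```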

## Common issues

### Tensorflow memory usage

A common issue is UNet-pgs failing due to high memory usage. You may see warnings / errors in your subject logs
similar to:

- `tensorflow/core/framework/allocator.cc:124] Allocation of 675840000 exceeds 10% of system memory.`

- `/WMHs_segmentation_PGS.sh: line 14: 69443 Killed`

You may want to try:

- Running the pipeline on a system with more memory

- Reducing the number of jobs passed to the `-n` option (if you're using it). This will slow down the pipeline, but
  also reduce overall memory usage.

## For developers

Some brief notes on the development setup for this repository are provided in a
[separate developer docs file](/docs/developer.md).
60 changes: 60 additions & 0 deletions docs/developer.md
# Developer docs

## Making new releases to Docker Hub

This repository has a GitHub Actions workflow to automate uploading to
[Docker Hub](https://hub.docker.com/r/hamiedaharoon24/enigma-pd-wml/tags) when a new release is made on GitHub.

- Go to [the releases tab](https://github.com/UCL-ARC/Enigma-PD-WML/releases) and click 'Draft a new release'.

- Click 'Choose a tag' and enter a new version number, e.g. `v1.0.0`.

- Click 'Generate release notes'. This will add a summary of any commits since the last release.

- Click the green 'Publish release' button at the bottom left.

- This will trigger the action to run and upload the code on the `main` branch to Docker Hub. Note: as the image is very
large, this will take a while! (around 15 minutes)
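
Once the workflow has finished, you can sanity-check the upload by pulling the image; the tag below is illustrative,
so check the Docker Hub tags page for the real ones:

```bash
docker pull hamiedaharoon24/enigma-pd-wml:v1.0.0
```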

## Linting setup (pre-commit)

This repository has another GitHub Actions workflow to run various linting checks on pull requests / commits to `main`.
This uses [`pre-commit`](https://pre-commit.com/), a Python-based tool. The enabled checks can be seen/updated in the
[pre-commit configuration file](https://github.com/UCL-ARC/Enigma-PD-WML/blob/main/.pre-commit-config.yaml).

Some of the main ones used are:

- [hadolint](https://github.com/hadolint/hadolint): for linting Dockerfiles
- [shellcheck](https://www.shellcheck.net/): for linting shell scripts

It can be useful to run `pre-commit` locally to catch issues early. To do so, you will need to have Python installed
locally (for example, by installing [Miniforge](https://github.com/conda-forge/miniforge) or similar).

Then run:

```bash
pip install pre-commit
```

Then (from inside a local clone of this github repository), run:

```bash
pre-commit install
```

`pre-commit` should now run automatically every time you `git commit`, flagging any issues.
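
You can also run all the checks against every file in the repository, which is useful for a first pass:

```bash
pre-commit run --all-files
```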

## Some notes on the Dockerfile

There are two main components to the Dockerfile:

- The requirements for UNet-pgs
- The requirements for FSL

All requirements for the UNet-pgs workflow come from the
[base pgs image](https://hub.docker.com/r/cvriend/pgs/tags), including the segmentation bash script and packages such
as TensorFlow.

FSL is installed as detailed in their [installation docs](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/container)
and [configuration docs](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/configuration). We're using the `-V` option at
the end of the `fslinstaller` command to
[pin it to a specific FSL version](https://fsl.fmrib.ox.ac.uk/fsl/docs/#/install/index?id=installing-older-versions-of-fsl).
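
As an illustrative sketch only (the destination directory and version number below are placeholders, not necessarily
the values the image uses), pinning an FSL version with the installer looks like:

```bash
# Download the FSL installer and pin a specific version with -V
# (paths and version are illustrative placeholders).
wget https://fsl.fmrib.ox.ac.uk/fsldownloads/fslconda/releases/fslinstaller.py
python fslinstaller.py -d /usr/local/fsl -V 6.0.7.1
```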
Binary file added docs/images/MRIcroGL_convert_dicom.png
Binary file added docs/images/MRIcroGL_window.png