UK - Biobank New BIDS dataset #29
Comments
Thank you for initiating this, @mpompolas. A few clarifications:
@mpompolas So I can add to this branch a modified version of my manual-correction script, manual_correction.py, so that the output name of a manual correction would be, for example
Thanks @jcohenadad , just edited my instructions.
@sandrinebedard Exactly. To create this dataset, we will use code from this branch only.
I had some thoughts about the datasets we want to create. We talked about the fact that the derivatives folder would only be in the ideas:
@jcohenadad do you have any thoughts on this?
@sandrinebedard good point.
I would lean towards this approach. You could e.g. break down your shell script and create a
I would advise against it. I'm afraid we will end up with out-of-sync derivatives (e.g. a segmentation manually corrected in dataset1 that we forgot to update in dataset2).
I agree with @jcohenadad on splitting the script into two parts.
The idea is to completely separate the original from the preprocessed dataset. If we put segmentations from multiple datasets within the same folder (I assume you would differentiate them with a suffix), it will become complicated later on to tell which ones to use for training, since we tend to use a standardized suffix in all datasets.
@jcohenadad @mpompolas I agree, splitting the script seems like the best idea. I will get on it!
ULTIMATE GOAL - Create a new REPO of UK-BioBank
For the purpose of this new BIDS dataset, we want to keep the final preprocessed files, and the derivatives that correspond to them (a gradient-corrected scan has a different segmentation than the original).
The new BIDS folder should appear as an identical copy of UK-Biobank (same number of files AND same LABELS) but within a different folder name, e.g. UK_BioBank_processed, and also have the derivatives that were manually checked.
BEFORE MANUAL CHECK
Sandrine's pipeline seems ready to go.
At this stage, I suggest we keep all the intermediate files for easy identification of potential problems. If space becomes an issue on Joplin, we can reevaluate, e.g. by processing in batches.
AFTER MANUAL CHECK
We should have files within the /UK_BioBank_processed/derivatives folder. Labels should be without the RPI, gradcorr etc. suffixes, so in your code, when you add the suffix _manual, make sure you strip those off.
Regarding the anatomy files (not the derivatives), we want to keep only the last file of the preprocessing, with the same name as the original:
e.g. instead of sub-1000252_T2w_RPI_r_gradcorr.nii.gz it should be sub-1000252_T2w.nii.
This will make things very easy for later processing through the Ivadomed pipeline.
So to sum it up:
*RPI, *RPI_r_gradcorr etc.
NOTES
A few more files are needed for a complete BIDS folder: dataset_description.json and participants.json (you only have participants.tsv), and maybe a README.TXT as well(?). Just copy these from the original UK-BioBank dataset.
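The copy step above could be sketched as follows. This is a minimal sketch, not the script used for this dataset: the function name and both root paths are hypothetical, and the README file name is taken from the note as written, so files that do not exist in the original dataset are reported rather than treated as errors.

```python
import shutil
from pathlib import Path

# File names taken from the note above; whether each one exists in the
# original dataset is not known here, so missing files are only reported.
TOP_LEVEL_FILES = ("dataset_description.json", "participants.json", "README.TXT")


def copy_top_level_files(original_root, processed_root, names=TOP_LEVEL_FILES):
    """Copy dataset-level files; return (copied, missing) lists of file names."""
    original_root = Path(original_root)
    processed_root = Path(processed_root)
    processed_root.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for name in names:
        src = original_root / name
        if src.is_file():
            # copy2 also preserves timestamps, which helps when auditing later.
            shutil.copy2(src, processed_root / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing
```

Returning the list of missing names makes it easy to spot which files still have to be written by hand.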
The preprocessing steps should be documented somewhere: the easiest place would be in the dataset_description.
Document the git version of SpinalCordToolbox and the function calls that were used, with their parameters.
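One possible way to record this in dataset_description.json is sketched below. The function name is hypothetical, and the version string and the list of commands are supplied by the caller. Storing the record under GeneratedBy (the field BIDS defines for derivative datasets) is an assumption about where it should live, not something decided in this thread.

```python
import json
from pathlib import Path


def record_preprocessing(description_path, sct_version, commands):
    """Append a preprocessing record to dataset_description.json.

    Returns the updated description as a dict.
    """
    path = Path(description_path)
    description = json.loads(path.read_text()) if path.exists() else {}
    entry = {
        "Name": "SpinalCordToolbox",
        "Version": sct_version,  # e.g. the git commit of the toolbox used
        # Assumption: the commands are kept as free text in Description.
        "Description": "Commands run, in order: " + " ; ".join(commands),
    }
    # Keep earlier records so that a re-run does not erase the history.
    description.setdefault("GeneratedBy", []).append(entry)
    path.write_text(json.dumps(description, indent=4) + "\n")
    return description
```

Appending rather than overwriting means each run of the pipeline leaves its own entry, at the cost of duplicate entries if the same run is recorded twice.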
Another place could be the .json that is associated with each .nii.gz, but that is a bit more work.
There is also the gradcorr file that needs to be documented somehow. I don't have any input on that. As a start, maybe document which facility it came from(?)
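The naming rules from the AFTER MANUAL CHECK section can be sketched as below. This is only an illustration, not the code in manual_correction.py: the function names are hypothetical, the token list covers only the suffixes named in this issue (RPI, r, gradcorr), and the _seg label in the usage example is an assumed derivative label. The file extension is left unchanged, which is also an assumption.

```python
# Processing tokens named in this issue; extend if the pipeline adds others.
PROCESSING_TOKENS = ("RPI", "r", "gradcorr")


def split_nifti_ext(filename):
    """Split 'name.nii.gz' or 'name.nii' into (stem, extension)."""
    for ext in (".nii.gz", ".nii"):
        if filename.endswith(ext):
            return filename[: -len(ext)], ext
    raise ValueError(f"Not a NIfTI file name: {filename}")


def strip_processing_suffixes(filename, tokens=PROCESSING_TOKENS):
    """Remove processing tokens so the name matches the original dataset."""
    stem, ext = split_nifti_ext(filename)
    parts = stem.split("_")
    # Always keep the first part (the subject ID); drop processing tokens after it.
    kept = [parts[0]] + [p for p in parts[1:] if p not in tokens]
    return "_".join(kept) + ext


def manual_derivative_name(filename, manual_suffix="_manual"):
    """Strip processing tokens, then append the manual-correction suffix."""
    stem, ext = split_nifti_ext(strip_processing_suffixes(filename))
    return stem + manual_suffix + ext
```

For example, strip_processing_suffixes("sub-1000252_T2w_RPI_r_gradcorr.nii.gz") gives "sub-1000252_T2w.nii.gz", and manual_derivative_name("sub-1000252_T2w_RPI_r_gradcorr_seg.nii.gz") gives "sub-1000252_T2w_seg_manual.nii.gz".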