Enhancing NSCLC Histological Subtype Classification: A Federated Learning Approach Using Triplet Loss
Preprocess the original images
python preprocess.py -mp /path/to/metadata -dp /path/to/original/data -op /path/to/output/folder -ts target_spacing
/path/to/metadata is the path to a CSV file with at least three columns: 'Subject ID', 'Modality', and 'File Location'.
'File Location' should be relative to /path/to/original/data. 'Modality' should be 'CT' for the images and 'RTSTRUCT' for the segmentations.
target_spacing should be a tuple of floats; the default is (0.977, 0.977).
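Before running the preprocessing step, it can help to sanity-check the metadata CSV. The following is a minimal sketch, not part of the repository: the function name check_metadata, the sample subject IDs, and the rule that every subject needs both a 'CT' and an 'RTSTRUCT' row are assumptions made for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = {"Subject ID", "Modality", "File Location"}
KNOWN_MODALITIES = {"CT", "RTSTRUCT"}


def check_metadata(csv_file):
    """Return (rows, problems) for a metadata CSV opened as a text file."""
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [], ["missing columns: " + ", ".join(sorted(missing))]
    rows = list(reader)
    problems = []
    # Assumption: each subject needs one image (CT) and one segmentation (RTSTRUCT).
    seen = {}
    for row in rows:
        if row["Modality"] in KNOWN_MODALITIES:
            seen.setdefault(row["Subject ID"], set()).add(row["Modality"])
    for subject, modalities in sorted(seen.items()):
        for modality in sorted(KNOWN_MODALITIES - modalities):
            problems.append(f"{subject}: no {modality} row")
    return rows, problems


# Illustrative data only; real files would be opened with open(path, newline="").
sample = (
    "Subject ID,Modality,File Location\n"
    "LUNG1-001,CT,LUNG1-001/ct\n"
    "LUNG1-001,RTSTRUCT,LUNG1-001/seg\n"
    "LUNG1-002,CT,LUNG1-002/ct\n"
)
rows, problems = check_metadata(io.StringIO(sample))
print(problems)  # ['LUNG1-002: no RTSTRUCT row']
```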
The preprocessed files will be saved to /path/to/output/folder with the name <Subject ID>_<slice_idx>.nrrd.
Additionally, a file named center_info.csv, containing the center points of the tumor bounding boxes, will be written to the same output folder.
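The naming scheme above makes it possible to recover the subject and slice index from each output file. The sketch below is illustrative: the helper names are hypothetical, and it assumes that <slice_idx> is an integer and that a subject ID may itself contain underscores.

```python
from collections import defaultdict
from pathlib import Path


def split_slice_name(filename):
    """Split '<Subject ID>_<slice_idx>.nrrd' into (subject_id, slice_idx)."""
    stem = Path(filename).stem
    # Split on the last underscore so subject IDs containing '_' stay intact.
    subject_id, slice_idx = stem.rsplit("_", 1)
    return subject_id, int(slice_idx)


def group_slices(filenames):
    """Map each subject ID to its sorted slice indices, skipping non-.nrrd files."""
    slices = defaultdict(list)
    for name in filenames:
        if name.endswith(".nrrd"):
            subject_id, idx = split_slice_name(name)
            slices[subject_id].append(idx)
    return {subject: sorted(idxs) for subject, idxs in slices.items()}


print(split_slice_name("LUNG1-001_42.nrrd"))  # ('LUNG1-001', 42)
```

In practice the file names could come from something like `[p.name for p in Path(output_folder).iterdir()]`; center_info.csv is skipped because it does not end in .nrrd.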
Create folds in a stratified fashion with the augmentations for train, val and test sets.
python prepare_folds.py -dn dataset_name -dp /path/to/preprocessed/data -lp /path/to/label/data -cp /path/to/clinical/data -op /path/to/output/folder -dl desired_len
dataset_name should be 'NSCLC-Radiomics' for the NSCLC-Radiomics dataset. It is not required for other datasets.
/path/to/preprocessed/data is the path to the preprocessed images.
/path/to/label/data is the path to a CSV file with at least two columns, 'X' and 'y', for subject IDs and labels.
/path/to/clinical/data is the path to the clinical data file (required for NSCLC-Radiomics).
/path/to/output/folder is the folder for the output Excel file, which contains several sheets named Fold<fold_no>_<phase>.
desired_len is the desired number of instances per class after the augmentations.
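To illustrate what a stratified split means here, the sketch below deals the subjects of each class round-robin across the folds so that every fold keeps roughly the overall class ratio. This is not the logic of prepare_folds.py, which may work differently. The function names, the class labels, the phase string 'train', and the fold numbering starting at 1 are all assumptions.

```python
from collections import defaultdict


def stratified_folds(subject_ids, labels, n_folds):
    """Assign subjects to folds so each fold keeps roughly the class ratio."""
    by_class = defaultdict(list)
    for subject, label in zip(subject_ids, labels):
        by_class[label].append(subject)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        # Deal the subjects of one class round-robin across the folds.
        for i, subject in enumerate(by_class[label]):
            folds[i % n_folds].append(subject)
    return folds


def sheet_name(fold_no, phase):
    """Build a sheet name following the Fold<fold_no>_<phase> pattern."""
    return f"Fold{fold_no}_{phase}"


# Illustrative data: 6 subjects of one class, 4 of another.
ids = [f"s{i}" for i in range(10)]
labels = ["adeno"] * 6 + ["squamous"] * 4
folds = stratified_folds(ids, labels, 2)
print(folds[0])  # ['s0', 's2', 's4', 's6', 's8']
print(sheet_name(1, "train"))  # Fold1_train
```

Each fold in this toy example ends up with three subjects of the first class and two of the second, matching the 6:4 ratio of the full set.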
Train the model
python main.py -rp /paths/to/preprocessed/data -fp /paths/to/fold/files -resp result_file_name -exp exp_id -mode triplet/softmax -freeze
-rp should list the folders containing the preprocessed data for all datasets. Paths should be separated by whitespace.
-fp should list the Excel files with the folds. Paths should be separated by whitespace.
Each file should have several sheets named Fold<fold_no>_<phase>.
The results of the experiment will be saved to ./results/<result_file_name>.csv
-mode should be either 'triplet' or 'softmax'. If it is 'triplet', the model will be trained with triplet loss.
If it is 'softmax', the model will be trained with softmax loss.
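For readers unfamiliar with triplet loss, the sketch below shows the standard margin-based form for a single (anchor, positive, negative) triplet of embedding vectors. It is a plain-Python illustration only: the Euclidean distance and the margin of 1.0 are assumptions, and the repository's own distance, margin, and triplet selection may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# The positive is close and the negative is far, so the loss is zero.
print(triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0)))  # 0.0
# The roles are swapped, so the loss is 5 - 1 + 1.
print(triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0)))  # 5.0
```

The loss is zero once the negative sample is at least one margin farther from the anchor than the positive sample, which pushes embeddings of the same subtype together and those of different subtypes apart.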
-freeze is an optional argument. If it is used, some layers of the model will be frozen.
All other parameters default to the values used in the paper.
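The command line described above could be parsed roughly as in the sketch below. It is not the parser from main.py: the flag names come from the command shown earlier, but which arguments are required, the help texts, and the function name build_parser are assumptions.

```python
import argparse


def build_parser():
    """Sketch of a parser for the flags shown in the main.py command."""
    parser = argparse.ArgumentParser(description="Sketch of the main.py command line")
    # nargs="+" collects whitespace-separated paths into a list.
    parser.add_argument("-rp", nargs="+", required=True,
                        help="folders with the preprocessed data")
    parser.add_argument("-fp", nargs="+", required=True,
                        help="Excel files with the folds")
    parser.add_argument("-resp", required=True,
                        help="result file name, without the .csv extension")
    parser.add_argument("-exp", help="experiment id")
    parser.add_argument("-mode", choices=["triplet", "softmax"], required=True)
    # A bare flag: present means True, absent means False.
    parser.add_argument("-freeze", action="store_true",
                        help="freeze some layers of the model")
    return parser


argv = ("-rp /data/a /data/b -fp a.xlsx b.xlsx "
        "-resp run1 -exp e1 -mode triplet -freeze").split()
args = build_parser().parse_args(argv)
print(args.rp)  # ['/data/a', '/data/b']
print(f"./results/{args.resp}.csv")  # ./results/run1.csv
```

The paths and names in the example invocation are placeholders chosen for illustration.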