Datasets to add for the next version of the model #116

Open
5 of 10 tasks
naga-karthik opened this issue Jul 25, 2024 · 2 comments

naga-karthik commented Jul 25, 2024

Datasets included in v2.4 model:

  • "basel-mp2rage"
  • "canproco"
  • "data-multi-subject"
  • "dcm-zurich"
  • "lumbar-epfl"
  • "lumbar-vanderbilt"
  • "sct-testing-large"

Datasets to be included next:

  • Beijing MS (has several different contrasts)
  • Karolinska (mainly T1w, T2w, T2star)
  • lumbar-marseille
  • dcm-brno
  • philadelphia-paediatric (at the moment, T1w and T2w)
  • sci-colorado
  • sci-paris
  • sci-zurich
  • nih-ms-mp2rage

Optional (NOTE - this could destroy the model):

  • stanford-epi (OpenNeuro EPI spinal cord dataset; good to use as testing data)

EDIT: updated with the correct list of datasets in v2.4 (cross-referenced against datasplits.md in the v2.4 release)

@naga-karthik

Update: Here is the current list of datasets used for training the model (potentially v2.5). This function gathers the statistics for all datasets and writes the following info to a txt file.

datasets stats
DATASETS USED FOR MODEL TRAINING (n=14):

	- datasplit_dcm-zurich-lesions-20231115_seed50.json
	- datasplit_sci-paris_seed50.json
	- datasplit_lumbar-epfl_seed50.json
	- datasplit_data-multi-subject_seed50.json
	- datasplit_dcm-brno_seed50.json
	- datasplit_basel-mp2rage_seed50.json
	- datasplit_sct-testing-large_seed50.json
	- datasplit_canproco_seed50.json
	- datasplit_sci-colorado_seed50.json
	- datasplit_dcm-zurich_seed50.json
	- datasplit_nih-ms-mp2rage_seed50.json
	- datasplit_lumbar-vanderbilt_seed50.json
	- datasplit_dcm-zurich-lesions_seed50.json
	- datasplit_sci-zurich_seed50.json


SPLITS ACROSS DIFFERENT CONTRASTS (n=11):

| Contrast   |   #train_images |   #validation_images |   #test_images |
|:-----------|----------------:|---------------------:|---------------:|
| dwi        |             176 |                   37 |             50 |
| mp2rage    |             142 |                   16 |             18 |
| mt-off     |             158 |                   36 |             49 |
| mt-on      |             167 |                   37 |             50 |
| psir       |             217 |                   49 |             67 |
| stir       |              56 |                   15 |             18 |
| t1map      |              71 |                    8 |              9 |
| t1w        |             248 |                   45 |             63 |
| t2star     |             653 |                  102 |            111 |
| t2w        |            1434 |                  226 |            254 |
| unit1      |             152 |                   19 |             20 |
| TOTAL      |            3474 |                  590 |            709 |


PATHOLOGY-WISE SPLIT:

| Pathology           |   Number of Subjects |
|:--------------------|---------------------:|
| MS                  |                  894 |
| HC                  |                  407 |
| MildCompression     |                   57 |
| DCM                 |                  239 |
| MildCompression/DCM |                   88 |
| SCI                 |                   19 |
| ALS                 |                   32 |
| NMO                 |                   19 |
| TOTAL               |                 1755 |
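
For reference, a minimal sketch of how per-contrast counts like the ones above could be aggregated across several datasplit files. This is not the function mentioned in the comment. The layout is an assumption made for illustration: each datasplit is taken to be a dict with `train`, `validation`, and `test` keys, each mapping an image path to a metadata dict that has a `contrast` field. The names `load_datasplits`, `count_images_per_contrast`, and `format_table` are hypothetical.

```python
import json
from collections import Counter, defaultdict

SPLITS = ("train", "validation", "test")


def load_datasplits(paths):
    """Load each datasplit JSON file into a dict (file layout is an assumption)."""
    datasplits = []
    for path in paths:
        with open(path) as f:
            datasplits.append(json.load(f))
    return datasplits


def count_images_per_contrast(datasplits):
    """Return {contrast: Counter({split: n_images})} summed over all datasplits."""
    counts = defaultdict(Counter)
    for datasplit in datasplits:
        for split in SPLITS:
            # Assumed layout: split -> {image_path: {"contrast": ...}}
            for image_meta in datasplit.get(split, {}).values():
                counts[image_meta["contrast"]][split] += 1
    return counts


def format_table(counts):
    """Render the counts as a Markdown table with a TOTAL row."""
    lines = [
        "| Contrast | #train_images | #validation_images | #test_images |",
        "|:---|---:|---:|---:|",
    ]
    totals = Counter()
    for contrast in sorted(counts):
        row = counts[contrast]
        totals.update(row)
        cells = [contrast] + [str(row[s]) for s in SPLITS]
        lines.append("| " + " | ".join(cells) + " |")
    total_cells = ["TOTAL"] + [str(totals[s]) for s in SPLITS]
    lines.append("| " + " | ".join(total_cells) + " |")
    return "\n".join(lines)
```

Usage would look something like `print(format_table(count_images_per_contrast(load_datasplits(paths))))`, where `paths` lists the `datasplit_*_seed50.json` files. If the real files store the contrast differently (for example, encoded only in the filename), the inner loop would need to change.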



naga-karthik commented Oct 22, 2024

Datasets that could be used for evaluation:

  • whole-spine
  • lumbar-vanderbilt sagittal T2w images
  • dcm-oklahoma (maybe)
  • EPI datasets
