FHIRmaker

Purpose: Crawl public, annotated, medical imaging archives, based on those annotations create Patient, Condition, diagnosticReport and imagingStudy FHIR resources which in turn are discoverable via a FHIR API

The Challenge

Let's say you are a Machine Learning researcher in medical imaging. You have a Keras model you would like to try out on many high resolution Lung CT images. Ideally the resulting datasets (training and inference) would have a mix of normals and positive findings that mimic the real world statistics. However, if one does not work at Ginormous Hospital, how can one locate all the publically available datasets that could provide statistical power - and how much time would it take?

Enter FHIRmaker

Theory

Many people have image sets: NIH, other govt. agencies, foundations, etc. The issue is not images per se, but curation, annotation and -machine- discoverability. Imagine having a web page or API you could go to and type in the patient condition you are interested in (in ICD10 or SNOMED codes). Then, for those patients, find the studies that display relevent findings for that condition acquired via modalities (i.e MR) and imaging paramters (i.e Pulse sequence) your ML Model can use. Press "Submit" and within seconds get a return list of URL's where the target studies on public archives exist.

How?

There are several approaches one can take; some are easy and automated, others require significant human effort. But at the end of the day, the goal is to index every patient/study on public sites and

a) create from that information a patient FHIR resource that matches the names used on the source site
b) determine (via numerous methods) the "AI interesting" patient conditions and list them in the Patient resource then
c) locate the studies for that patient, represent their findings in ACR-RADS (or similar) terms in the Diagnostic Report resource and acquisition parameters (e.g. coded in LOINC-RADLEX) in the imagingStudy resources and then
d) aggregate the above resources on a single FHIR server with the above described Web page and API

Status

Week 1: created the first 3 files indicated below, validating FHIR object output, TCIA API
Week 2: validated download_data.py "get" functions work as expected on Hackathon FHIR server
Week 3: build out the DICOMweb API class to enable retrieving source images from DICOMweb VNA's
Week 4: build the web page (pending, provides candidate patients lists for now matching body-part and condition)

To Dos

1) send FHIR objects to FHIR git repo (manually for now)
1) complete the study level search on the web page

download_data.py :: suite of classes to abstract RESTful calls to image archives
read_dump.py :: class to read JSON dump of a dbase after it annotates an Archive (should include findings)
FHIRmaker.py :: class to call the above 2, and based on info create Patient and Report FHIR objects
FHIRfetch.py :: class to take command line arguments (body part, Condition) and return pointers to Patient/Reports/StudyURLs that match

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
skel		skel
tcia		tcia
FHIRfetch.py		FHIRfetch.py
FHIRmaker.py		FHIRmaker.py
LICENSE		LICENSE
README.md		README.md
download_data.py		download_data.py
read_dump.py		read_dump.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FHIRmaker

The Challenge

Theory

Status

To Dos

Contents

About

Releases

Packages

Languages

License

sglanger/FHIRmaker

Folders and files

Latest commit

History

Repository files navigation

FHIRmaker

The Challenge

Theory

Status

To Dos

Contents

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages