Clinical data in CTSA hubs are not readily queryable in a federated fashion. Many efforts exist to address this, including TriNetX, ACT, PCORNet, and OHDSI among others. Unifying these with an HL7 FHIR framework is an aspiration.
Enabling the CTSA to function as a federated network of clinical data that supports multicenter research is among the core goals of the program. This project advances that agenda through common data model harmonization.
Data repositories across CTSA hubs need semantic and syntactic alignment to support federated query. This alignment must impose a minimal maintenance burden on CTSA hub sites. Leveraging the native FHIR APIs, now proposed as required for US EHRs by CMS, would mitigate ETL costs and maintenance issues.
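To illustrate why native FHIR APIs lower the ETL burden, here is a minimal sketch of how one FHIR search could be addressed to several hubs without any per-site transformation. The hub names and base URLs are hypothetical placeholders, the LOINC code is only an illustrative search value, and the helper functions are not part of any project codebase. The sketch only builds the request URLs and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical hub endpoints (assumption): real CTSA hub URLs are not given here.
HUB_BASE_URLS = {
    "hub_a": "https://fhir.hub-a.example.org/R4",
    "hub_b": "https://fhir.hub-b.example.org/R4",
}


def build_search_url(base_url, resource_type, params):
    """Build a FHIR RESTful search URL of the form [base]/[type]?name=value."""
    query = urlencode(params)  # percent-encodes ':', '/', and '|' in token values
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


def federated_search_urls(resource_type, params):
    """Return the same search expressed against every configured hub."""
    return {
        hub: build_search_url(base, resource_type, params)
        for hub, base in HUB_BASE_URLS.items()
    }


# Illustrative query: Observations carrying a given LOINC code, 50 per page.
urls = federated_search_urls(
    "Observation",
    {"code": "http://loinc.org|2339-0", "_count": "50"},
)
for hub, url in urls.items():
    print(hub, url)
```

Because every hub exposes the same resource types and search parameters, the query is written once; only the base URL changes per site. Authentication, paging, and result aggregation are deliberately left out of this sketch.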
Harmonize the data ecosystem. An improved data ecosystem will enhance and extend existing work being performed on the NCATS Data Translator system, which integrates clinical and translational data at scale for mechanistic discovery, as well as other emergent systems such as the NIH Commons. We will apply our strengths and existing activities to make data FAIR-TLC: Findable, Accessible, Interoperable, and Reusable, as well as Traceable, Licensable, and Connected. We will assist contributors and users to develop and apply data standards, Common Data Elements (CDEs), and other commonly utilized data models such as FHIR and OHDSI. We will extend and supplement infrastructure, training, and collaborative environments to enable data to be shared openly, so that groups can collaborate on its harmonization based on specific needs or standards. The data ecosystem will provision CTSA-wide quality assurance reports and data quality assessment, as well as gold-standard datasets and synthetic clinical data sets. Fundamentally, we aim to develop an open-science ethos and unite CTSA community data sharing with broader global efforts.
TODO see here
Point person (github handle) | Site | Program Director |
---|---|---|
Tricia Francis (@tricfran) | JHU | Chris Chute (@cgchute) |
Project scientific leadership:
Lead(s) (github handle) | Site |
---|---|
Chris Chute (@cgchute) | JHU |
Team members are listed here.
Many repositories could be listed here, including FHIR sites and CDM data models. However, for parsimony, we presently list the main FHIR project and the NCATS supported clinicalprofiles.org.
Key long-term deliverables
- A coherent common data model across CTSA hubs, arising naturally from their EHR sources.
- Shared terminology services across the CTSA community
Milestones are listed, though at present they are quite general.
Evaluation of data harmonization will ultimately rest with its impact on our community. The goal is to enable federated query and inferencing at scale across the CTSA community. There are likely to be many lesser advantages and consequences. Several evaluation issues are in place, though we expect they will evolve with time.
We anticipate a substantial need to educate the CTSA community about elements of the well-known FHIR specification relevant to translational research, in particular the notion of managing FHIR as a canonical model, with migration paths to traditional common data models (e.g., OMOP/OHDSI, PCORNet, ACT).
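As a rough illustration of what a migration path from FHIR to a traditional common data model can look like, the sketch below maps a FHIR `Patient` resource to a row shaped like the OMOP `person` table. It is a simplified example under stated assumptions, not the project's mapping: the function name and `person_id` assignment are hypothetical, only gender and birth date are handled, and race and ethnicity are omitted.

```python
# OMOP standard gender concepts: 8507 = MALE, 8532 = FEMALE.
# Any other FHIR gender code falls back to 0 ("no matching concept").
GENDER_TO_OMOP_CONCEPT = {"male": 8507, "female": 8532}


def fhir_patient_to_omop_person(patient, person_id):
    """Map a FHIR Patient (as a dict) to a partial OMOP person row (hypothetical helper)."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")

    gender = patient.get("gender")
    # FHIR birthDate may be YYYY, YYYY-MM, or YYYY-MM-DD.
    birth = patient.get("birthDate", "")
    parts = birth.split("-") if birth else []

    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_TO_OMOP_CONCEPT.get(gender, 0),
        "year_of_birth": int(parts[0]) if len(parts) >= 1 else None,
        "month_of_birth": int(parts[1]) if len(parts) >= 2 else None,
        "day_of_birth": int(parts[2]) if len(parts) >= 3 else None,
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,
    }


# Made-up example resource with a partial birth date.
example_patient = {
    "resourceType": "Patient",
    "id": "example-1",
    "gender": "female",
    "birthDate": "1974-12",
}
person_row = fhir_patient_to_omop_person(example_patient, person_id=1)
print(person_row)
```

Keeping the source values (`person_source_value`, `gender_source_value`) alongside the mapped concepts preserves traceability back to the FHIR record, which matters when the same canonical data feeds several target models.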
- Federated Data Query Workshop May 20 & 21st, 2019: Subject Matter Expert Slide & Video Presentations
- Clinical Data Harmonization and Federated Query for Translational Research: Reflections and Report on a CD2H Workshop
- This report, developed by a small team of representatives across the CTSA community, is ready for input from the wider CTSA community. Your review and input are requested in this collaborative document.
- CDM-FHIR Mappings
- The CDM-FHIR Gap Analysis Task Team has compiled the most comprehensive mappings assembled to date. If you know of any other mappings that you do not see here, please add them.
- Data Harmonization Maturity Model
- The Sustainability and Change Management Task Team has developed a data harmonization maturity model, and it is ready for wider CTSA community input. Are there other factors that should be considered?
We encourage the community to get involved.
We are looking for community participation in the following areas:
If you are interested in participating, please onboard here or contact Tricia Francis at pfranci4@jhu.edu with any questions.
Documentation for the various data harmonization task teams can be found at this Google drive folder, and project-specific work may be in this GitHub repository, using the wiki or .md files.