This tool enables data integration centers (DICs) and biobanks to move or sync FHIR data while mapping to MII KDS or bbmri.de profiles.

DavidCroftDKFZ/TransFAIR

TransFAIR

TransFAIR is a ready-to-use tool (turnkey solution) for data integration at medical institutions. Instead of requiring sites to build their own ETL processes by hand, it handles common data integration tasks such as:

  • Extraction from source systems
  • Transformation into target schemata
  • Loading into target systems
  • Linkage of IDs / Pseudonymization
  • Filtering of datasets

TransFAIR allows low-effort, fully automatic data transfer among software systems and data structures used in network medical research in Germany, in particular:

TransFAIR is designed to

  • minimize effort for personnel at the sites (since they no longer have to do the data integration themselves)
  • continuously update itself with new dataset/mapping definitions
  • thus accelerate and facilitate rollout of new features and dataset extensions
  • provide more consistent data quality (because as long as the source data is okay, errors within TransFAIR's mappings can be fixed centrally)

Quickstart (for Bridgehead sites)

If you are part of a German University Hospital with a Bridgehead (e.g. via BBMRI-ERIC, GBN, DKTK, CCP/C4 or nNGM), you already have TransFAIR as part of your Bridgehead, usually preconfigured with sane default values and mappings by the respective network. The most straightforward way to use it is to just activate it.

To do so, specify the required configuration (see Configuration) in a new environment file (e.g. my.transfair). Then, execute bridgehead transfair mytransfair and observe the output on the screen.

Configuration

TransFAIR is configured using environment variables:

Variable | Description | Default
TF_FHIR_SERVER_SOURCE_ADDRESS | HTTP address of the SOURCE datastore | (required)
TF_FHIR_SERVER_TARGET_ADDRESS | HTTP address of the TARGET datastore | (required)
TF_FHIR_SERVER_(SOURCE/TARGET)_USERNAME | Basic Auth user |
TF_FHIR_SERVER_(SOURCE/TARGET)_PASSWORD | Basic Auth password |
TF_PROFILE | Identifier of the TransFAIR profile to execute (see Profiles) | (required)
TF_RESOURCES_START | Level (Patient/Specimen) at which resource collection starts | Patient
TF_RESOURCES_FILTER | Set to export only the specified resources | none, will export all resources
TF_RESOURCES_WHITELIST | Transfers only resources that match the filters (see Filters) |
TF_RESOURCES_BLACKLIST | Ignores resources that match the filters (see Filters) |
TF_PSEUDONYMIZATION_ADDR | HTTP address of a service that maps SOURCE IDs to TARGET IDs (see Pseudonymization) | none, IDs will be unchanged
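
To illustrate how these variables fit together, here is a minimal Python sketch that reads them from the environment and checks the required ones. It is not TransFAIR's own code: the TransfairConfig class and the load_config function are hypothetical names used only for this example, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables marked "(required)" in the configuration table.
REQUIRED = (
    "TF_FHIR_SERVER_SOURCE_ADDRESS",
    "TF_FHIR_SERVER_TARGET_ADDRESS",
    "TF_PROFILE",
)


@dataclass(frozen=True)
class TransfairConfig:
    """Hypothetical container for illustration; not part of TransFAIR."""
    source_address: str
    target_address: str
    profile: str
    resources_start: str = "Patient"
    pseudonymization_addr: Optional[str] = None


def load_config(env: Mapping[str, str] = os.environ) -> TransfairConfig:
    """Build a config from environment variables, applying the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return TransfairConfig(
        source_address=env["TF_FHIR_SERVER_SOURCE_ADDRESS"],
        target_address=env["TF_FHIR_SERVER_TARGET_ADDRESS"],
        profile=env["TF_PROFILE"],
        # Documented default: collection starts at the Patient level.
        resources_start=env.get("TF_RESOURCES_START", "Patient"),
        # Documented default: no service configured, IDs stay unchanged.
        pseudonymization_addr=env.get("TF_PSEUDONYMIZATION_ADDR") or None,
    )
```

Passing a plain dictionary instead of os.environ makes it easy to try a configuration before putting it into an environment file.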

Profiles

As of now, TransFAIR supports the following transformation profiles:

  • FHIR2FHIR will transfer all resources from SOURCE to TARGET unchanged. This can be used to perform filtering and/or pseudonymization across FHIR servers.
  • MII2BBMRI will read the MII Core Dataset from SOURCE (usually a FHIR server/facade providing the MII Core Dataset) and transfer all data required by BBMRI-ERIC into TARGET (= BBMRI-ERIC Bridgehead).
  • BBMRI2MII will load biosample information from SOURCE (BBMRI-ERIC Bridgehead), transform it into the MII Core Dataset and write it to TARGET (e.g. a FHIR store with the MII Core Dataset).
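
As a rough illustration of what an unchanged transfer in the style of FHIR2FHIR can look like at the FHIR level, the following Python sketch wraps resources read from SOURCE into a FHIR transaction Bundle that a TARGET server could accept in a single request. This is an assumption about one possible approach, not a description of how TransFAIR transfers data; build_transaction_bundle is a hypothetical helper.

```python
from typing import Any, Dict, List


def build_transaction_bundle(resources: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap resources in a FHIR transaction Bundle (hypothetical helper).

    Each entry uses PUT with "<resourceType>/<id>", so the target server
    creates or updates the resource under the same logical ID.
    """
    entries = []
    for resource in resources:
        entries.append({
            "resource": resource,
            "request": {
                "method": "PUT",
                "url": f"{resource['resourceType']}/{resource['id']}",
            },
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# Example with two made-up resources; the content is passed through unchanged.
bundle = build_transaction_bundle([
    {"resourceType": "Patient", "id": "1"},
    {"resourceType": "Specimen", "id": "s-1", "subject": {"reference": "Patient/1"}},
])
```

Using PUT rather than POST keeps the IDs stable across repeated runs, which matters when data is synced rather than moved once.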

Filters

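TransFAIR supports filters to customize the ETL process. Filters are written in JSON. For example, the following filter lists patient IDs; depending on whether it is used as a blacklist or a whitelist, the listed IDs are either excluded from the transfer or are the only ones transferred.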

{
  "patient": {
    "ids": ["1"]
  }
}
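
The effect of such a filter can be sketched in Python as follows. The whitelist/blacklist semantics shown here are an assumption based on the descriptions of TF_RESOURCES_WHITELIST and TF_RESOURCES_BLACKLIST; apply_patient_filter is a hypothetical function, not part of TransFAIR.

```python
import json
from typing import Iterable, List

# The example filter from above, as a JSON string.
FILTER_JSON = '{"patient": {"ids": ["1"]}}'


def apply_patient_filter(patient_ids: Iterable[str], filter_json: str, mode: str) -> List[str]:
    """Return the patient IDs that would be transferred (hypothetical helper).

    mode "whitelist": keep only the IDs listed in the filter.
    mode "blacklist": keep everything except the IDs listed in the filter.
    """
    listed = set(json.loads(filter_json).get("patient", {}).get("ids", []))
    if mode == "whitelist":
        return [pid for pid in patient_ids if pid in listed]
    if mode == "blacklist":
        return [pid for pid in patient_ids if pid not in listed]
    raise ValueError(f"unknown mode: {mode}")


kept = apply_patient_filter(["1", "2", "3"], FILTER_JSON, "whitelist")
dropped_one = apply_patient_filter(["1", "2", "3"], FILTER_JSON, "blacklist")
```

With the example filter, the whitelist run keeps only patient "1", while the blacklist run keeps patients "2" and "3".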

Pseudonymization

TransFAIR supports various ways to map patient/sample IDs between source and target stores, e.g. pseudonymization solutions (Mainzelliste, GPAS) or a plain mapping file in CSV format. Mapping works as follows:

Whenever TransFAIR encounters an ID from the SOURCE system, it will ask the service defined in TF_PSEUDONYMIZATION_ADDR for the corresponding ID in the TARGET system (or vice versa). We are currently defining a simple, implementation-independent API format in cooperation with pilot biobanks and will update this section once finished.
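
Since the API format is not yet defined, the following Python sketch only illustrates the idea for the plain CSV mapping file case. The column names source_id and target_id, and the functions load_id_mapping and map_id, are assumptions made for this example and do not describe a format TransFAIR prescribes.

```python
import csv
import io
from typing import Dict, TextIO


def load_id_mapping(csv_file: TextIO) -> Dict[str, str]:
    """Read a mapping with the assumed columns "source_id" and "target_id"."""
    reader = csv.DictReader(csv_file)
    return {row["source_id"]: row["target_id"] for row in reader}


def map_id(source_id: str, mapping: Dict[str, str]) -> str:
    """Return the TARGET ID for a SOURCE ID.

    Raises KeyError for unknown IDs so that an unmapped (possibly
    identifying) SOURCE ID is never passed on by accident.
    """
    return mapping[source_id]


# Made-up example data standing in for a mapping file on disk.
example_csv = io.StringIO("source_id,target_id\nP-001,X9F2\nP-002,K3L8\n")
mapping = load_id_mapping(example_csv)

# The reverse direction (TARGET to SOURCE) is the inverted dictionary.
reverse_mapping = {target: source for source, target in mapping.items()}
```

Failing on unknown IDs instead of passing them through is a deliberate choice in this sketch: for pseudonymization, a missing mapping should stop the transfer rather than leak the original ID.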

Outlook

We created TransFAIR for the specific use case of bringing German biobanks and data integration centers closer together. In the long term, we intend TransFAIR to become a toolbox of easily reusable components for use with HL7 FHIR, OMOP and other well-known SQL, CSV and XML schemata.

License

Copyright 2021 - 2022 The Samply Community

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
