Try out the FHIR Pipelines Controller
The FHIR Pipelines Controller makes it easy to schedule and manage the transformation of data from a HAPI FHIR server into a collection of Apache Parquet files. It uses the FHIR Data Pipes JDBC pipeline to run either full or incremental transformations into a Parquet data warehouse.
The FHIR Pipelines Controller only works with HAPI FHIR servers backed by Postgres. The fhir-data-pipes repository includes an example of configuring a HAPI FHIR server to use Postgres.
This guide will show you how to set up the FHIR Pipelines Controller with a test HAPI FHIR server. It assumes you are using Linux, but should work with other environments with minor adjustments.
Clone the fhir-data-pipes GitHub repository using your preferred method. After cloning, open a terminal window and cd to the directory where you cloned it. Later terminal commands assume your working directory is the repository root.
The repository includes a Docker Compose configuration to bring up a HAPI FHIR server configured to use Postgres.
To set up the test server, follow these instructions. At step two, follow the instructions for "HAPI source server with Postgres".
First, open pipelines/controller/config/application.yml in a text editor.
Change fhirServerUrl to be:
fhirServerUrl: "http://localhost:8091/fhir"
Read through the rest of the file to see the other settings; the other lines may remain the same. Note the value of dwhRootPrefix, as it is where the Parquet files will be written. You can adjust this value if desired. Save and close the file.
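If you prefer to script the edit rather than use a text editor, a sed one-liner can rewrite the setting. The file created below is only a minimal stand-in for pipelines/controller/config/application.yml (and its placeholder URL is an assumption); in the real repository, point CONFIG at that path and skip the stand-in step:

```shell
# Sketch: apply the fhirServerUrl change non-interactively.
CONFIG=application.yml
# Stand-in file with a placeholder value, for illustration only.
printf 'fhirServerUrl: "http://example.invalid/fhir"\n' > "$CONFIG"
# Rewrite the whole fhirServerUrl line to point at the test HAPI server.
sed -i 's|^fhirServerUrl:.*|fhirServerUrl: "http://localhost:8091/fhir"|' "$CONFIG"
grep '^fhirServerUrl:' "$CONFIG"   # prints fhirServerUrl: "http://localhost:8091/fhir"
```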
Next, open pipelines/controller/config/hapi-postgres-config.json in a text editor. Change databaseHostName to be:
"databaseHostName" : "localhost"
Save and close the file.
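The same scripted approach works for the JSON file. Again, the file created below is only a stand-in for pipelines/controller/config/hapi-postgres-config.json, and the original hostname shown is an assumed placeholder:

```shell
CONFIG=hapi-postgres-config.json
# Stand-in file; the real one carries the full JDBC connection settings.
printf '{\n  "databaseHostName" : "some-docker-host"\n}\n' > "$CONFIG"
# Point the controller at the Postgres instance published on localhost.
sed -i 's|"databaseHostName" *: *"[^"]*"|"databaseHostName" : "localhost"|' "$CONFIG"
cat "$CONFIG"
```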
From the terminal, run:
cd pipelines/controller/
mvn spring-boot:run
Open a web browser and visit http://localhost:8080. You should see the FHIR Pipelines Control Panel.
Before automatic incremental runs can occur, you must manually trigger a full run. Under the Run Full Pipeline section, click Run Full and wait for the run to complete.
The Control Panel shows the options being used by the FHIR Pipelines Controller.
This section corresponds to the settings in the application.yml file.
This section calls out FHIR Data Pipes batch pipeline settings that differ from their default values. These are also mostly derived from application.yml. Use these settings if you want to run the batch pipeline manually.
On your machine, look for the Parquet files created in the directory specified by dwhRootPrefix in the application.yml file. FHIR Data Pipes includes query libraries to help explore the data.
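Since the warehouse is just a directory tree of Parquet files, ordinary shell tools can confirm the output before you reach for a query library. The layout below (one subdirectory per FHIR resource type under a timestamped snapshot) is illustrative only, and the dwh prefix stands in for your dwhRootPrefix value:

```shell
# Create an illustrative snapshot layout like the one the controller produces,
# then list every Parquet file under the warehouse root.
mkdir -p dwh/controller_DWH_TIMESTAMP/Patient
touch dwh/controller_DWH_TIMESTAMP/Patient/part-00000.parquet
find dwh -name '*.parquet'
```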