# Analytics on a single machine using Docker
The repository includes a "Single Machine" Docker Compose configuration which brings up the FHIR Pipelines Controller plus a Spark Thrift server on a single machine so you can more easily run Spark SQL queries on the Parquet files output by the Pipelines Controller. Before using this single machine configuration, see Try out the FHIR Pipelines Controller to learn how the Pipelines Controller works on its own.
This guide assumes you already have a HAPI FHIR server configured to use Postgres as its database. Alternatively, you can try it out with a local test server following the instructions for a HAPI source server with Postgres. You also need Docker Compose installed on the host machine. All file paths are relative to the root of the FHIR Data Pipes repository cloned on the host machine.
- Open `docker/config/application.yaml` and edit the value of `fhirServerUrl` to match the FHIR server you are connecting to.
- Open `docker/config/hapi-postgres-config_local.json` and edit the values to match the FHIR server you are connecting to (illustrative sketches of both files follow this list).
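For orientation, the relevant portion of `docker/config/application.yaml` looks roughly like the following sketch. Key names and default values can differ between repository versions, so treat this as illustrative rather than authoritative:

```yaml
# docker/config/application.yaml (excerpt; illustrative values)
fhirdata:
  # URL of the FHIR server the pipelines read from; edit to match your server.
  fhirServerUrl: "http://hapi-server:8080/fhir"
```

`docker/config/hapi-postgres-config_local.json` holds the JDBC connection details for the HAPI server's Postgres database. The field names and values below are an assumed sketch of that shape, not a verbatim copy of the shipped file:

```json
{
  "jdbcDriverClass": "org.postgresql.Driver",
  "databaseService": "postgresql",
  "databaseHostName": "hapi-fhir-db",
  "databasePort": "5432",
  "databaseUser": "admin",
  "databasePassword": "admin",
  "databaseName": "hapi"
}
```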
If you are trying the Single Machine configuration using the provided local test servers, things should work with the default values. Alternatively, use the IP address of the Docker default bridge network. To find it, run the following command and use the "Gateway" value:
```shell
docker network inspect bridge --format='{{json .IPAM.Config}}'
```
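On a default Docker installation this typically prints something like the output below, in which case `172.17.0.1` would be the address to use; the exact subnet and gateway can vary by host:

```shell
$ docker network inspect bridge --format='{{json .IPAM.Config}}'
[{"Subnet":"172.17.0.0/16","Gateway":"172.17.0.1"}]
```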
To bring up the configuration, run:
```shell
docker-compose -f docker/compose-controller-spark-sql.yaml up --force-recreate
```
If you have run this configuration in the past and want to include new changes pulled into the repository, add the `--build` flag to rebuild the binaries, as shown below.
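For example, the rebuild-and-run invocation would be:

```shell
docker-compose -f docker/compose-controller-spark-sql.yaml up --force-recreate --build
```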
Once fully up, the Pipelines Controller is available at http://localhost:8090 and the Spark Thrift server is at http://localhost:10001.
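As a quick sanity check, assuming `curl` is available on the host, you can confirm that the Pipelines Controller is responding before moving on:

```shell
# Prints the HTTP status code for the controller's web UI; expect 200 once it is up.
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8090
```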
The first time you run the Pipelines Controller, you must manually start a Full Pipeline run. In a browser, go to http://localhost:8090 and click the Run Full button.
Connect to the Spark Thrift server using a client that supports Apache Hive. For example, if using the JDBC driver, the URL should be `jdbc:hive2://localhost:10001`.
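One readily available client is `beeline`, which ships with Apache Hive and Apache Spark distributions. The sketch below assumes a pipeline run has completed and that the Parquet output has been registered as tables on the Thrift server (for example, one table per FHIR resource type such as `patient`); table names in your deployment may differ:

```shell
# List the tables registered on the Thrift server.
beeline -u jdbc:hive2://localhost:10001 -e 'SHOW TABLES;'

# Example query, assuming a table named `patient` exists for the Patient resource.
beeline -u jdbc:hive2://localhost:10001 -e 'SELECT COUNT(*) FROM patient;'
```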