
Data exchange between datacat instances


For local testing of datacat, for trying out new functionality, or for exchange within working groups, transferring datasets can be essential. This guide describes how to exchange datacat datasets between local installations and how to load datacat.org datasets into a local installation. The prerequisite is an installation following the Guide to the local installation of datacat, using the developer variant (with development environment).

Exchange between local instances

  1. Start back-end via console or docker interface
  2. Get CONTAINER_ID of 'neo4j:4.1' image in console (cmd)
docker ps
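
If several containers are running, the ID can also be narrowed down directly; an optional variant that filters for the 'neo4j:4.1' image and prints only its container ID:

docker ps --filter "ancestor=neo4j:4.1" --format "{{.ID}}"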
  3. Create a copy of the local dataset
docker cp CONTAINER_ID:/var/lib/neo4j/data/. PATH/TO/FOLDER/docker_backup
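
For illustration, with a hypothetical container ID 'a1b2c3d4e5f6' and a target folder 'C:\datacat\docker_backup', the command would read:

docker cp a1b2c3d4e5f6:/var/lib/neo4j/data/. C:\datacat\docker_backup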
  4. Verification

There should now be three folders in the specified directory:

  • databases
  • dbms
  • transactions
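
As a quick check, the contents of the copied folder can be listed (path as chosen in step 3); exactly the three folders above should appear:

ls PATH/TO/FOLDER/docker_backup

On Windows, 'dir' serves the same purpose.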
  5. Exchange

The directory with the folders listed in step 4 can be packed (.zip) on the start system and unpacked on the target system.
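
One possible way to do this on the command line, assuming the 'zip' and 'unzip' tools are available (any other archive tool works just as well):

zip -r docker_backup.zip docker_backup

on the start system, and

unzip docker_backup.zip

on the target system.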

  6. Target system: create structure

In order to be able to load the dataset, a new folder 'volumes' must first be created in the backend directory. This folder must be on the same level as the 'src' and 'backups' folders. The archive packed in step 5 can then be unpacked into this newly created folder.
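
The backend directory should then look roughly like this (the 'volumes' folder contains the unpacked backup):

backend/
├── backups/
├── src/
├── volumes/
│   ├── databases/
│   ├── dbms/
│   └── transactions/
└── docker-compose.yml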

  7. Target system: loading the data

Loading completely replaces the existing dataset! If desired, first create a backup of it as described in steps 1 to 3.

The 'docker-compose.yml' file is located in the backend directory, on the same level as the 'volumes' folder. In it, the 'db' section is relevant:

  db:
    container_name: "db"
    image: neo4j:4.1
    ports:
      - "7687:7687"
      - "7474:7474"
    environment:
      - NEO4J_AUTH=neo4j/s3cret
    volumes:
      - "dbdata:/data"
      - "dblogs:/logs"

The 'dbdata' volume mounted at '/data' is replaced here by the inserted dataset. For this,

- "dbdata:/data"

must be replaced by

- ./volumes:/data
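
After this change, the complete 'db' section should look like this:

  db:
    container_name: "db"
    image: neo4j:4.1
    ports:
      - "7687:7687"
      - "7474:7474"
    environment:
      - NEO4J_AUTH=neo4j/s3cret
    volumes:
      - ./volumes:/data
      - "dblogs:/logs"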
  8. Target system: restart

In order for the dataset to be taken over, the Docker container 'datacat' must be deleted via the Docker interface. From the console, in the backend directory, you can then use

docker-compose up -d

to restart the container. After a short loading time, the loaded dataset should now be visible via the front-end at 'localhost:3000'.
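
If you want to stay entirely on the console, one possible alternative to deleting the container via the Docker interface is to stop and remove the containers first and then start them again:

docker-compose down
docker-compose up -d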

Loading the server-side dataset (datacat.org)

  1. Download dataset

A cron job regularly creates a backup copy of the database on the server. It is stored on the server side in the 'backups' directory. There are several ways to download it:

  • Access server via WinSCP
  • Access server via SSH client (in Windows: PuTTY)
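
From a system with an SSH client, the backup directory can, for example, be copied with scp. User name, server address and the exact path of the 'backups' directory are placeholders here and depend on your access data:

scp -r USER@SERVER:/PATH/TO/backups ./datacat_backup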
  2. Target system: create structure

In order to be able to load the dataset, a new folder 'volumes' must first be created in the backend directory. This folder must be on the same level as the 'src' and 'backups' folders. The backup downloaded in step 1 is then placed in this newly created folder.

**Note:** The backup of datacat.org contains, besides the three required folders, a '.gitignore' file and a 'dumps' directory. These can be deleted or ignored.
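
If you prefer to delete them, this can be done, for example, from the backend directory (Linux/macOS syntax, assuming the backup was unpacked into the 'volumes' folder as described above):

rm -rf volumes/dumps volumes/.gitignore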

  3. Target system: loading the data

Loading completely replaces the existing dataset! If desired, create a backup analogous to steps 1 to 3 from the first part of these instructions. The 'docker-compose.yml' file is located in the backend directory, on the same level as the 'volumes' folder. In this file, the 'db' section is relevant:

  db:
    container_name: "db"
    image: neo4j:4.1
    ports:
      - "7687:7687"
      - "7474:7474"
    environment:
      - NEO4J_AUTH=neo4j/s3cret
    volumes:
      - "dbdata:/data"
      - "dblogs:/logs"

The 'dbdata' volume mounted at '/data' is replaced here by the inserted dataset. For this,

- "dbdata:/data"

must be replaced by

- ./volumes:/data

In order to load the dataset, you must know the credentials of the server-side neo4j database. The associated password must replace the default password 's3cret' in the 'docker-compose.yml' file (lines 20 and 37), i.e. in the following places:

- spring.data.neo4j.password=s3cret
- NEO4J_AUTH=neo4j/s3cret
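
Assuming the server-side password is represented here by the placeholder SERVER_PASSWORD, the adjusted 'db' section would look like this; the 'spring.data.neo4j.password' entry appears elsewhere in the same file and is changed in the same way:

  db:
    container_name: "db"
    image: neo4j:4.1
    ports:
      - "7687:7687"
      - "7474:7474"
    environment:
      - NEO4J_AUTH=neo4j/SERVER_PASSWORD
    volumes:
      - ./volumes:/data
      - "dblogs:/logs"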
  4. Target system: restart

In order for the dataset to be taken over, the Docker container 'datacat' must be deleted via the Docker interface. From the console, in the backend directory, you can then use

docker-compose up -d

to restart the container. After a short loading time, the loaded dataset should now be visible via the front end at 'localhost:3000'.