dsw-data-seeder

Data Stewardship Wizard: Data Seeder

Worker for seeding DSW data

Usage

  • You can use the same DSW configuration file (dsw.yml) as for the DSW server itself (see config.example.yml).
  • You need a directory containing one or more recipes described in JSON files (see example/seed.example.json); usually one seed recipe is enough.
  • A recipe file can link SQL scripts and an S3 app directory (paths are relative to the JSON file).
  • To verify recipes, use dsw-seeder -c config.example.yml -w example/ list.
  • To run the seeder directly, use dsw-seeder -c config.example.yml -w seed -r "example" (example is the recipe name).
  • To run the worker, use dsw-seeder -c config.example.yml -w run -r "example".
  • For more information, use dsw-seeder --help.
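
Before calling the seeder, it can help to confirm that the inputs listed above are actually in place. The following is a minimal sketch, not part of the project: the function name preflight is hypothetical, and it only checks that the configuration file exists and that the recipe directory holds at least one JSON file. It does not validate recipe contents; that is what the list command is for.

```shell
#!/bin/sh
# Pre-flight check before invoking dsw-seeder (sketch; `preflight` is a hypothetical helper).
# Usage: preflight CONFIG_FILE RECIPE_DIR
preflight() {
  config="$1"
  recipes="$2"
  if [ ! -f "$config" ]; then
    echo "missing config file: $config" >&2
    return 1
  fi
  if [ ! -d "$recipes" ]; then
    echo "missing recipe directory: $recipes" >&2
    return 1
  fi
  # Recipes are described in JSON files, so expect at least one *.json.
  found=0
  for f in "$recipes"/*.json; do
    if [ -f "$f" ]; then
      found=1
      break
    fi
  done
  if [ "$found" -ne 1 ]; then
    echo "no *.json recipe found in: $recipes" >&2
    return 1
  fi
  echo "ok: $config, $recipes"
}

# Run the documented verification command only when dsw-seeder is installed
# and the checks pass; otherwise do nothing.
if command -v dsw-seeder >/dev/null 2>&1 && preflight config.example.yml example/; then
  dsw-seeder -c config.example.yml -w example/ list
fi
```

The check is deliberately shallow: it catches a wrong path early and leaves everything else to the seeder itself.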

Docker

A Docker image with the basic dependencies and the worker installed is available through Docker Hub: datastewardshipwizard/data-seeder.

Build image

You can easily build the image yourself:

$ docker build . -t datastewardshipwizard/data-seeder:local

Environment variables

  • DSW_CONFIG (default: /app/config.yml)
  • SEEDER_DATA_DIR (default: /app/data)
  • SEEDER_RECIPE (default: example)

Mount points

  • /app/config.yml (DSW_CONFIG) = configuration file (see example)
  • /app/data (SEEDER_DATA_DIR) = directory with recipe(s)
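
Combining the environment variables and mount points, a container run could be assembled roughly as follows. This is a sketch under assumptions: the host paths (config.yml and seed-data in the current directory) are placeholders, the run_seeder function and the DRY_RUN switch are hypothetical helpers, and the image tag is the local one from the build step above. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch: assemble a `docker run` command for the data seeder image.
# Host paths are placeholders (assumptions), not project defaults.
HOST_CONFIG="${HOST_CONFIG:-$PWD/config.yml}"   # host file mounted at /app/config.yml
HOST_DATA="${HOST_DATA:-$PWD/seed-data}"        # host directory mounted at /app/data
RECIPE="${RECIPE:-example}"                     # recipe name (documented default: example)
IMAGE="${IMAGE:-datastewardshipwizard/data-seeder:local}"

run_seeder() {
  # Mount targets match the documented defaults of DSW_CONFIG and SEEDER_DATA_DIR,
  # so only SEEDER_RECIPE is passed explicitly.
  set -- docker run --rm \
    -v "$HOST_CONFIG:/app/config.yml" \
    -v "$HOST_DATA:/app/data" \
    -e "SEEDER_RECIPE=$RECIPE" \
    "$IMAGE"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: print the command instead of starting a container.
    echo "$*"
  else
    "$@"
  fi
}

run_seeder
```

Set DRY_RUN=0 to start the container for real. If the configuration or data is mounted somewhere other than the default paths, DSW_CONFIG and SEEDER_DATA_DIR would need to be set to match.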

License

This project is licensed under the Apache License v2.0 - see the LICENSE file for more details.