This repository has been archived by the owner on May 5, 2020. It is now read-only.

LongTailBio/pangea-server

Pangea Server

Pangea is a system to improve bioinformatics pipelines. Key features include:

  • Organize projects, samples, and the results of analyses
  • Automatically sync results with S3 cloud storage
  • Coordinate pipelines running across multiple sites

Pangea server is currently in alpha and being heavily developed.

Getting Started

This readme documents how to run and test the Pangea server as a standalone application. Pangea server is based on the earlier MetaGenScope server. metagenscope-server was a part of metagenscope-main and was usually run as part of the complete stack.

Prerequisites

You will need PostgreSQL running locally with the following databases:

CREATE DATABASE metagenscope_prod;
CREATE DATABASE metagenscope_dev;
CREATE DATABASE metagenscope_test;

Each database also needs the uuid-ossp extension:

\c metagenscope_prod;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
\c metagenscope_dev;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
\c metagenscope_test;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
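
The database setup above repeats the same two steps for each environment. As a minimal sketch (the helper name `setup_statements` is an illustration, not part of the repository), the statements can be generated from the list of database names:

```python
# Database names come from this readme; the helper itself is hypothetical.
DATABASES = ["metagenscope_prod", "metagenscope_dev", "metagenscope_test"]


def setup_statements(databases=DATABASES):
    """Return psql input that creates each database and enables uuid-ossp in it."""
    lines = [f"CREATE DATABASE {name};" for name in databases]
    for name in databases:
        # Connect to the database, then enable the extension inside it.
        lines.append(f"\\c {name};")
        lines.append('CREATE EXTENSION IF NOT EXISTS "uuid-ossp";')
    return lines


if __name__ == "__main__":
    print("\n".join(setup_statements()))
```

The printed statements match the ones listed above and could be pasted into a psql session, assuming your role is allowed to create databases and extensions.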

All local interactions with the server (developing, running, testing) should be done in a virtual environment:

$ python3.6 -m venv env  # Create the environment (only needs to be done once)
$ source env/bin/activate  # Activate the environment

Set application configuration:

# Environment: development | testing | staging | production
$ export APP_SETTINGS=development
$ export SECRET_KEY=my_precious
$ export DATABASE_URL=postgres://username:password@localhost:5432/metagenscope_dev
$ export DATABASE_TEST_URL=postgres://username:password@localhost:5432/metagenscope_test
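
Before starting the server it can help to confirm that these variables are actually set. The following is a rough pre-flight check, not the server's own configuration loader. The function name `load_settings`, the returned dictionary keys, and the choice to treat `DATABASE_TEST_URL` as optional are assumptions made for illustration.

```python
import os

# Environment names listed in the comment above the export commands.
VALID_ENVIRONMENTS = ("development", "testing", "staging", "production")


def load_settings(environ=None):
    """Collect the settings named in this readme from an environment mapping."""
    environ = os.environ if environ is None else environ
    app_settings = environ.get("APP_SETTINGS")
    if app_settings not in VALID_ENVIRONMENTS:
        raise ValueError(f"Unknown APP_SETTINGS value: {app_settings!r}")
    missing = [key for key in ("SECRET_KEY", "DATABASE_URL") if key not in environ]
    if missing:
        raise KeyError("Missing required settings: " + ", ".join(missing))
    return {
        "environment": app_settings,
        "secret_key": environ["SECRET_KEY"],
        "database_url": environ["DATABASE_URL"],
        # Assumption: the test database URL is only needed when running tests.
        "database_test_url": environ.get("DATABASE_TEST_URL"),
    }
```

Calling `load_settings()` with no arguments reads from the current process environment. Passing a plain dictionary makes the check easy to exercise in isolation.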

Running Locally

Spin up server (runs on http://127.0.0.1:5000/):

$ python manage.py runserver

A startup script is provided to ensure that the application does not attempt to start before all service dependencies are accepting connections. It can be used like so:

$ ./startup.sh [host:port [host:port ...]] -- [command]

An example of waiting for Postgres and Mongo DBs running on localhost before starting the application would look like this:

$ ./startup.sh localhost:5435 localhost:27020 -- python manage.py runserver
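
To illustrate the idea behind waiting for dependencies, here is a minimal Python sketch that polls each `host:port` until it accepts a TCP connection. It is not the contents of `startup.sh`. The function names and the `timeout` and `interval` defaults are assumptions.

```python
import socket
import time


def parse_target(target):
    """Split a 'host:port' string into a (host, port) tuple."""
    host, _, port = target.rpartition(":")
    return host, int(port)


def wait_for(targets, timeout=30.0, interval=0.5):
    """Return True once every target accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    pending = [parse_target(t) for t in targets]
    while pending:
        still_pending = []
        for host, port in pending:
            try:
                # A successful connect means the service is accepting connections.
                with socket.create_connection((host, port), timeout=interval):
                    pass
            except OSError:
                still_pending.append((host, port))
        pending = still_pending
        if pending:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
    return True
```

For the example above, `wait_for(["localhost:5435", "localhost:27020"])` would return once both ports accept connections, after which the server command could be launched.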

Testing

The entry point to test suite tools is the Makefile.

Linting

Code quality is enforced using pylint, pycodestyle, and pydocstyle. The rules are defined in .pylintrc.

These tools may be run together using:

$ make lint

Running Test Suite

To run the test suite (linting runs first):

$ make test

You may also run the tests with a coverage report:

$ make cov

Development

MetaGenScope uses the GitFlow branching strategy along with Pull Requests for code reviews. Check out this post by the Dwarves Foundation for more information.

Analysis Modules

AnalysisModules are the core of MetaGenScope extensibility. They are in charge of:

  • Providing the data model for the visualization backing data
  • Enumerating other AnalysisModule types that are valid data sources (WIP)
  • Defining the Middleware task that transforms a set of Samples into the module's data model (WIP)

The modules live in the pangea_modules namespace and are self-contained: all models, processing tasks, and tests live within each module. The core set is defined in LongtailBio/pangea_modules.

To add a new AnalysisModule, write your new namespace package pangea_modules.my_new_module following existing conventions. Make sure the main module class inherits from pangea_modules.base.AnalysisModule.
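
The shape of such a module might look like the sketch below. Because the real base class interface is not described here, the example defines a stand-in `AnalysisModule` so that it runs on its own. The method names `name` and `process_samples` are hypothetical and should be replaced with whatever pangea_modules.base.AnalysisModule actually requires.

```python
class AnalysisModule:
    """Stand-in for pangea_modules.base.AnalysisModule (the real interface may differ)."""

    @classmethod
    def name(cls):
        raise NotImplementedError


class MyNewModule(AnalysisModule):
    """Hypothetical module that would live in pangea_modules.my_new_module."""

    @classmethod
    def name(cls):
        # Hypothetical identifier for this module.
        return "my_new_module"

    @classmethod
    def process_samples(cls, samples):
        """Hypothetical transform from a set of samples into this module's data model."""
        return {"sample_count": len(samples)}
```

In a real package, the stand-in class would be dropped in favor of importing the base class from pangea_modules.base, and the module's models and tests would sit alongside it in the same namespace package.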

API Documentation

The API for metagenscope-server is documented in swagger.yml using the OpenAPI v3.0 specification.

Viewing

Swagger UI can be used to view an API spec URL. You can use the public demo, or run it locally:

$ docker run -p 80:8080 -e API_URL=https://raw.githubusercontent.com/longtailbio/metagenscope-server/master/swagger.yml swaggerapi/swagger-ui:v3.19.5

Editing

Copying and pasting between your local editor and the Swagger Editor seems to be the easiest way to edit.

Continuous Integration

The test suite is run automatically on CircleCI for each push to GitHub. You can skip this behavior for a commit by appending [skip ci] to the commit message.

Custom Docker Database Images

CircleCI does not allow running commands on secondary containers (e.g. the database). To get around this, we use custom database images. Changes to either image need to be built, tagged, and pushed to Docker Hub before CI can succeed.

  • Postgres - Stock Postgres image with the uuid-ossp extension enabled. Located at ./database_docker/postgres_db.
  • Mongo - Stock Mongo image with a healthcheck script added. Located at ./database_docker/mongo_db.

Steps

From the appropriate database docker subdirectory, build and tag the image:

$ export COMMIT_SHA=`git rev-parse HEAD`
$ docker build -t imagebuildinprocess .
$ docker tag imagebuildinprocess "metagenscope/postgres:${COMMIT_SHA::8}"

Push the image:

$ docker login
$ docker push "metagenscope/postgres:${COMMIT_SHA::8}"

Clean up:

$ docker rmi imagebuildinprocess "metagenscope/postgres:${COMMIT_SHA::8}"
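
The `${COMMIT_SHA::8}` expansion in the steps above takes the first eight characters of the commit SHA, so each image is tagged with a short commit identifier. A small sketch of the same naming rule follows. The function name `image_tag` is an illustration, and the repository is left as a parameter because only the Postgres image name appears in these steps.

```python
def image_tag(commit_sha, repository="metagenscope/postgres", length=8):
    """Build the image reference: repository plus the first `length` SHA characters."""
    if len(commit_sha) < length:
        raise ValueError("commit SHA is shorter than the requested tag length")
    return f"{repository}:{commit_sha[:length]}"


# Example with a made-up 40-character SHA.
print(image_tag("0123456789abcdef0123456789abcdef01234567"))
```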

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Release History

See CHANGELOG.md.

Authors

  • Benjamin Chrobot - Initial work - bchrobot

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.