Adds seqcolapi endpoint service to refget package #6

Merged
merged 13 commits into from
Feb 22, 2024
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -79,4 +79,4 @@ refget.egg-info/
*ipynb_checkpoints*
*.egg-info*


coverage.xml
20 changes: 19 additions & 1 deletion README.md
@@ -1,9 +1,27 @@
# Refget

[![Build Status](https://travis-ci.com/refgenie/refget.svg?branch=master)](https://travis-ci.com/refgenie/refget)
![Run pytests](https://github.com/pepkit/looper/workflows/Run%20pytests/badge.svg)

The refget package provides a Python interface to both remote and local use of the refget protocol.

This package provides clients and functions for both refget sequences and refget sequence collections (seqcol).

Documentation is hosted at [refgenie.org/refget](https://refgenie.org/refget/).

## Testing

### Local unit tests of refget package

- `pytest` to run the local unit tests for the `refget` package

### Compliance testing

Under `/test_api` are compliance tests for a service implementing the sequence collections API. This will test your collection and comparison endpoints to make sure the comparison function is working.

- `pytest test_api` to test API compliance
- `pytest test_api --api_root http://127.0.0.1:8100` to customize the API root URL to test

1. Load the fasta files from the `test_fasta` folder into your API database.
2. Run `pytest test_api --api_root <API_URL>`, replacing `<API_URL>` with the root URL of the API you want to test.
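
As a rough illustration of what a compliance check against the collection and comparison endpoints involves, the sketch below builds the two URLs and fetches them with the standard library. The path layouts `/collection/{digest}` and `/comparison/{digest1}/{digest2}`, the helper names, and the digests are assumptions for illustration; they are not taken from the `test_api` suite.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def collection_url(api_root: str, digest: str) -> str:
    """Build the URL for one collection (path layout is an assumption)."""
    return f"{api_root.rstrip('/')}/collection/{quote(digest, safe='')}"


def comparison_url(api_root: str, digest_a: str, digest_b: str) -> str:
    """Build the URL comparing two collections (path layout is an assumption)."""
    root = api_root.rstrip("/")
    return f"{root}/comparison/{quote(digest_a, safe='')}/{quote(digest_b, safe='')}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode its JSON body; raises on HTTP or network errors."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical digests; substitute ones for collections loaded from `test_fasta`.
print(comparison_url("http://127.0.0.1:8100", "digestA", "digestB"))
```

Calling `fetch_json(comparison_url(...))` against a running server returns the decoded comparison response, which a test can then check against the expected result.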


105 changes: 105 additions & 0 deletions README_seqcolapi.md
@@ -0,0 +1,105 @@
# seqcolapi

This repository contains:

1. Sequence collections API software (the `seqcolapi` package). This package is based on the `refget` package. It simply provides a wrapper to implement the Sequence Collections API.
2. Configuration and GitHub Actions for demo server instance ([servers subfolder](/servers)).

## Instructions

### Run locally for development

First, configure the environment variables:
- To run a local server with a **local database**: `source servers/localhost/dev_local.env`
- To run a local server with **the production database**: `source servers/seqcolapi.databio.org/production.env`

Then, run the service:

```
uvicorn seqcolapi.main:app --reload --port 8100
```
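
Because the service depends on variables exported by the `.env` file, it can help to confirm they are set before launching `uvicorn`. The following is a minimal sketch, assuming the variable names used in the `.env` files of this PR; which of them the service strictly requires is an assumption, and `missing_env_vars` is a hypothetical helper.

```python
import os

# Names taken from the .env files in this PR; the exact required set is an assumption.
REQUIRED_VARS = (
    "POSTGRES_HOST",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "SEQCOLAPI_CONFIG",
)


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Environment looks complete.")
```

An empty result suggests the `source` step worked; a non-empty one usually means the `.env` file was not sourced in the current shell.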

### Running with docker

To build the Docker images, run the following commands from the root of this repository.

First, build the general-purpose image:

```
docker build -f deployment/dockerhub/Dockerfile -t databio/seqcolapi seqcolapi
```

Next, build the wrapped image (this just bundles the config into the app):

```
docker build -f deployment/seqcolapi.databio.org/Dockerfile -t seqcolapi.databio.org deployment/seqcolapi.databio.org
```

To run in a container:

```
source deployment/seqcolapi.databio.org/production.env
docker run --rm -p 8000:80 --name seqcolapi seqcolapi.databio.org
```

```
docker run --rm -p 8000:8000 --name sccon \
  --env "POSTGRES_PASSWORD" \
  --volume $CODE/seqcolapi.databio.org/config/seqcolapi.yaml:/config.yaml \
  scim
```

To deploy the container to Docker Hub, use the GitHub Action in this repo, which deploys on release or through manual dispatch.


Left to do:
- [x] it already retrieves from a refget server.
- [x] let me insert stuff using only checksums.
- [ ] make it take 2 refget servers correctly.


## To load new data into seqcolapi.databio.org

```
cd analysis
source ../servers/localhost/dev_local.env
ipython3
```

Now run `load_fasta.py`

## Deploy to AWS ECS

### Testing locally first

Build the seqcolapi image

```
cd
docker build -t docker.io/databio/seqcolapi:latest .
```

```
docker pull docker.io/databio/seqcolapi:latest
cd servers/seqcolapi.databio.org
docker build -t scim .
docker run \
-e POSTGRES_HOST=$POSTGRES_HOST \
-e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
--network=host \
scim
```

### Deploying

To upgrade the software:

Use the config file located in `/servers/seqcolapi.databio.org`. This uses the image at docker.io://databio/seqcolapi (GitHub repo: [refgenie/seqcolapi](https://github.com/refgenie/seqcolapi)) as a base, bundles it with the above config, and deploys it to the shefflab ECS.

1. Ensure the [refget](https://github.com/refgenie/refget/) package master branch is as you want it.
2. Deploy the updated [seqcolapi](https://github.com/refgenie/seqcolapi/) app to Docker Hub (using manual dispatch, or deploy on GitHub release).
3. Finally, deploy the instance with manual dispatch using the included GitHub action.


11 changes: 11 additions & 0 deletions deployment/dockerhub/Dockerfile
@@ -0,0 +1,11 @@
FROM tiangolo/uvicorn-gunicorn:python3.11-slim
LABEL authors="Nathan Sheffield"
RUN pip install https://github.com/databio/yacman/archive/dev.zip
RUN pip install https://github.com/refgenie/refget/archive/seqcolapi.zip

COPY requirements/requirements-seqcolapi.txt requirements/requirements-seqcolapi.txt
RUN pip install -r requirements/requirements-seqcolapi.txt --no-cache-dir

COPY . /app/seqcolapi

CMD ["uvicorn", "seqcolapi.main:app", "--host", "0.0.0.0", "--port", "80"]
9 changes: 9 additions & 0 deletions deployment/localhost/dev_local.env
@@ -0,0 +1,9 @@
export POSTGRES_HOST=`pass databio/seqcol/postgres_host`
export POSTGRES_DB=`pass databio/seqcol/postgres_db`
export POSTGRES_USER=`pass databio/seqcol/postgres_user`
export POSTGRES_PASSWORD=`pass databio/seqcol/postgres_password`
export POSTGRES_TABLE="seqcol"

export SEQCOLAPI_PORT="8100"
export SERVER_ENV="dev"
export SEQCOLAPI_CONFIG="servers/localhost/seqcolapi.yaml"
13 changes: 13 additions & 0 deletions deployment/localhost/seqcolapi.yaml
@@ -0,0 +1,13 @@
refget_provider_apis: https://www.ebi.ac.uk/ena/cram/sequence/
schemas:
  - https://schema.databio.org/refget/SeqColArraySetInherent.yaml
database:
  host: $POSTGRES_HOST
  user: $POSTGRES_USER
  password: $POSTGRES_PASSWORD
  port: 5432
  name: $POSTGRES_DB
  table: seqcol
server:
  host: 0.0.0.0
  port: 80
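
The config above refers to environment variables such as `$POSTGRES_HOST` rather than hard-coding credentials. How the service resolves them is not shown in this diff; the sketch below illustrates one common approach, expanding `$VAR` references with `os.path.expandvars` after the YAML has been parsed into a dict. `expand_env` is a hypothetical helper, and the sample value is illustrative only.

```python
import os


def expand_env(value):
    """Recursively expand $VAR references in strings inside dicts and lists."""
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    return value  # non-strings such as the port number pass through unchanged


# Mirrors part of the `database` section of the config as a plain dict.
database = {"host": "$POSTGRES_HOST", "port": 5432, "table": "seqcol"}
os.environ["POSTGRES_HOST"] = "localhost"  # illustrative value only
print(expand_env(database))
```

Note that `os.path.expandvars` leaves references to unset variables untouched, so a literal `$POSTGRES_HOST` in the resolved config is a sign that the `.env` file was not sourced.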
3 changes: 3 additions & 0 deletions deployment/seqcolapi.databio.org/Dockerfile
@@ -0,0 +1,3 @@
FROM databio/seqcolapi
COPY seqcolapi.yaml /seqcolapi_config.yaml
ENV SEQCOLAPI_CONFIG=/seqcolapi_config.yaml
105 changes: 105 additions & 0 deletions deployment/seqcolapi.databio.org/primary_task_def.json
@@ -0,0 +1,105 @@
{
"ipcMode": null,
"executionRoleArn": "arn:aws:iam::235728444054:role/ecsTaskExecutionRole",
"containerDefinitions": [
{
"dnsSearchDomains": null,
"environmentFiles": null,
"logConfiguration": null,
"entryPoint": null,
"portMappings": [
{
"hostPort": 8105,
"protocol": "tcp",
"containerPort": 80
}
],
"command": null,
"linuxParameters": null,
"cpu": 0,
"environment": [],
"resourceRequirements": null,
"ulimits": null,
"dnsServers": null,
"mountPoints": [],
"workingDirectory": null,
"secrets": [
{
"valueFrom": "SEQCOLAPI_POSTGRES_PASSWORD",
"name": "POSTGRES_PASSWORD"
},
{
"valueFrom": "BEDBASE_POSTGRES_HOST",
"name": "POSTGRES_HOST"
}
],
"dockerSecurityOptions": null,
"memory": 2048,
"memoryReservation": 512,
"volumesFrom": [],
"stopTimeout": null,
"image": "235728444054.dkr.ecr.us-east-1.amazonaws.com/my-ecr-repo:170afd5cf39d9799e926e1d0ebf40b9051fb731f",
"startTimeout": null,
"firelensConfiguration": null,
"dependsOn": null,
"disableNetworking": null,
"interactive": null,
"healthCheck": null,
"essential": true,
"links": null,
"hostname": null,
"extraHosts": null,
"pseudoTerminal": null,
"user": null,
"readonlyRootFilesystem": null,
"dockerLabels": null,
"systemControls": null,
"privileged": null,
"name": "seqcolapi"
}
],
"placementConstraints": [],
"memory": null,
"taskRoleArn": "ecsTaskExecutionRole",
"compatibilities": [
"EC2"
],
"family": "seqcolapi-task",
"requiresAttributes": [
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.task-iam-role"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "ecs.capability.execution-role-ecr-pull"
}
],
"pidMode": null,
"requiresCompatibilities": [
"EC2"
],
"networkMode": "bridge",
"cpu": "128",
"revision": 1,
"status": "ACTIVE",
"inferenceAccelerators": null,
"proxyConfiguration": null,
"volumes": []
}
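
The task definition maps host port 8105 to container port 80, which matches the port `uvicorn` listens on in the Dockerfile `CMD`. A small sketch for pulling those mappings out of a task definition follows; `port_mappings` is a hypothetical helper, and the JSON is a trimmed copy of the file above, inlined so the example is self-contained.

```python
import json


def port_mappings(task_def: dict):
    """Yield (container name, host port, container port) for each mapping."""
    for container in task_def.get("containerDefinitions", []):
        # `portMappings` may be absent or null, so fall back to an empty list.
        for mapping in container.get("portMappings") or []:
            yield container["name"], mapping["hostPort"], mapping["containerPort"]


# Trimmed copy of the task definition, keeping only the fields used here.
TASK_DEF = json.loads("""
{
  "containerDefinitions": [
    {"name": "seqcolapi",
     "portMappings": [{"hostPort": 8105, "protocol": "tcp", "containerPort": 80}]}
  ],
  "networkMode": "bridge"
}
""")

for name, host_port, container_port in port_mappings(TASK_DEF):
    print(f"{name}: host {host_port} -> container {container_port}")
```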
9 changes: 9 additions & 0 deletions deployment/seqcolapi.databio.org/production.env
@@ -0,0 +1,9 @@
export POSTGRES_HOST=`pass databio/seqcol/postgres_host`
export POSTGRES_DB=`pass databio/seqcol/postgres_db`
export POSTGRES_USER=`pass databio/seqcol/postgres_user`
export POSTGRES_PASSWORD=`pass databio/seqcol/postgres_password`
export POSTGRES_TABLE="seqcol"

export SEQCOLAPI_PORT="5432"
export SERVER_ENV="production"
export SEQCOLAPI_CONFIG="deployment/seqcolapi.databio.org/seqcolapi.yaml"
13 changes: 13 additions & 0 deletions deployment/seqcolapi.databio.org/seqcolapi.yaml
@@ -0,0 +1,13 @@
refget_provider_apis: https://www.ebi.ac.uk/ena/cram/sequence/
schemas:
  - https://schema.databio.org/refget/SeqColArraySetInherent.yaml
database:
  host: $POSTGRES_HOST
  user: seqcol_admin
  password: $POSTGRES_PASSWORD
  port: 5432
  name: seqcol
  table: seqcol
server:
  host: 0.0.0.0
  port: 80
7 changes: 7 additions & 0 deletions refget/__init__.py
@@ -10,3 +10,10 @@
from .seqcol import *
from .utilities import *
from .seqcol_client import *

try:
    # Requires optional dependencies, so we catch the ImportError
    from .seqcol_router import seqcol_router
except ImportError:
    seqcol_router = None
    pass
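
With this pattern, `seqcol_router` is `None` whenever the optional dependencies are not installed, so callers need to check for `None` before using it. The sketch below shows the same optional-import idea in a generic, self-contained form. `optional_attr` is a hypothetical helper, and using `fastapi` as the stand-in optional dependency is an assumption rather than something this diff states.

```python
import importlib


def optional_attr(module_name: str, attr: str):
    """Return `module.attr`, or None when the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# `fastapi` stands in for the optional dependency (an assumption).
api_router_cls = optional_attr("fastapi", "APIRouter")
if api_router_cls is None:
    print("Optional web dependencies missing; router features disabled.")
else:
    print("Optional web dependencies found; router can be mounted.")
```

Falling back to `None` keeps `import refget` working in minimal installs, at the cost of pushing the availability check to the code that mounts the router.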