Merge pull request #295 from pepkit/dev
release 0.11.6
khoroshevskyi authored Feb 8, 2024
2 parents 9aa337a + 4a21aa8 commit dacb3d1
Showing 12 changed files with 262 additions and 199 deletions.
185 changes: 12 additions & 173 deletions README.md
@@ -1,185 +1,24 @@
<img src="https://img.shields.io/badge/fastapi-109989?style=for-the-badge&logo=FASTAPI&logoColor=white" /> <img src="https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue" /> <img src="https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white" />
<p align="center"><img src="https://img.shields.io/badge/fastapi-109989?style=for-the-badge&logo=FASTAPI&logoColor=white" /> <img src="https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue" /> <img src="https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white" />
</p>

# pephub
<p align="center">
<a href="https://pephub.databio.org"><img src="./docs/imgs/pephub_logo_big.svg" alt="PEPhub"></a>

**pephub** is a biological metadata server that lets you view, store, and share your sample metadata in the form of [PEPs](https://pep.databio.org/en/latest/). It has 3 components: 1) a _database_ where PEPs are stored; 2) an _API_ to programmatically read and write PEPs in the database; and 3) a web-based _user interface_ to view and manage these PEPs via a front-end.

## Organization
</p>

## Setting up a development environment
**PEPhub** is a biological metadata server that lets you view, store, and share your sample metadata in the form of [PEPs](https://pep.databio.org/en/latest/). It has 3 components: 1) a _database_ where PEPs are stored; 2) an _API_ to programmatically read and write PEPs in the database; and 3) a web-based _user interface_ to view and manage these PEPs via a front-end.

PEPhub consists of 3 components: 1) A postgres database; 2) the PEPhub API; 3) the PEPhub UI.
---

### 1. Database setup
**Deployed public instance**: <a href="https://pephub.databio.org/" target="_blank">https://pephub.databio.org/</a>

_pephub_ stores PEPs in a [POSTGRES](https://www.postgresql.org/) database. Create a new pephub-compatible postgres instance locally:
**API**: <a href="https://pephub-api.databio.org/api/v1/docs" target="_blank">https://pephub-api.databio.org/api/v1/docs</a>

```
docker pull postgres
docker run \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=docker \
-e POSTGRES_DB=pep-db \
-p 5432:5432 \
postgres
```
**Documentation**: <a href="https://pep.databio.org/pephub" target="_blank">https://pep.databio.org/pephub</a>

You should now have a pephub-compatible postgres instance listening on `localhost:5432`.
You can use [load_db.py](scripts/load_db.py) to load a directory of PEPs into the database.
**Source Code**: <a href="https://github.com/pepkit/pephub" target="_blank">https://github.com/pepkit/pephub</a>

### 2. `pephub` API setup
---

#### Install

Install dependencies using `pip` (_We suggest using virtual environments_):

```
python -m venv venv && source venv/bin/activate
pip install -r requirements/requirements-all.txt
```

#### Running

_pephub_ may be run in several ways. In every case, pephub requires configuration. Configuration settings are supplied to pephub through environment variables. The following settings are **required**. While pephub has built-in defaults for these settings, you should provide them explicitly to ensure compatibility:

- `POSTGRES_HOST`: The hostname of the PEPhub database server
- `POSTGRES_DB`: The name of the database inside the postgres server
- `POSTGRES_USER`: Username for the database
- `POSTGRES_PASSWORD`: Password for the user
- `POSTGRES_PORT`: Port for postgres database
- `GH_CLIENT_ID`: Client ID for the GitHub application that authenticates users
- `GH_CLIENT_SECRET`: Client secret for the GitHub application that authenticates users
- `BASE_URI`: The base URI of the PEPhub server (e.g. localhost:8000)

You must set these environment variables prior to running PEPhub. We've provided `env` files inside [`environment`](./environment) which you may `source` to load your environment. Alternatively, you may store them locally in a `.env` file. This file will get loaded and exported to your environment when the server starts up. We've included an [example](environment/template.env) `.env` file with this repository. You can read more about server settings and configuration [here](docs/server-settings.md).
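
As a rough illustration of how these settings fit together, the sketch below reads the database variables listed above from the environment and assembles a PostgreSQL connection URL. The names `load_settings`, `postgres_url`, and `ASSUMED_DEFAULTS` are hypothetical, and the fallback values simply mirror the local Docker example in this README; none of this is taken from PEPhub's actual configuration code.

```python
import os
from urllib.parse import quote

# Fallback values are illustrative assumptions that mirror the local
# Docker example above; they are not PEPhub's real built-in defaults.
ASSUMED_DEFAULTS = {
    "POSTGRES_HOST": "localhost",
    "POSTGRES_DB": "pep-db",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "docker",
    "POSTGRES_PORT": "5432",
}


def load_settings(environ=None):
    """Collect the database settings, preferring values from the environment."""
    environ = os.environ if environ is None else environ
    return {key: environ.get(key, default) for key, default in ASSUMED_DEFAULTS.items()}


def postgres_url(settings):
    """Build a postgresql:// URL, percent-encoding the credentials."""
    user = quote(settings["POSTGRES_USER"], safe="")
    password = quote(settings["POSTGRES_PASSWORD"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{settings['POSTGRES_HOST']}:{settings['POSTGRES_PORT']}"
        f"/{settings['POSTGRES_DB']}"
    )


if __name__ == "__main__":
    print(postgres_url(load_settings()))
```

Printing the resulting URL before starting the server is a quick way to confirm that the variables you exported are the ones being picked up.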

Once the configuration variables are set, run pephub natively with:

```
uvicorn pephub.main:app --reload
```

The _pephub_ API should now be running at http://localhost:8000.

### 3. React PEPhub UI setup

_Important:_ To make the development server work, you must include a `.env.local` file inside `web/` with the following contents:

```
VITE_API_HOST=http://localhost:8000
```

This ensures that the frontend development server will proxy requests to the backend server. You can now run the frontend development server:

```bash
cd web
npm install # yarn install
npm start # yarn dev
```

The pephub frontend development server should now be running at http://localhost:5173/.

### 4. (_Optional_) GitHub Authentication Client Setup

_pephub_ uses GitHub for namespacing and authentication. As such, a GitHub application capable of logging in users is required. We've included [instructions for setting up GitHub authentication locally](https://github.com/pepkit/pephub/blob/master/docs/authentication.md#setting-up-github-oauth-for-your-own-server) using your own GitHub account.

### 5. (_Optional_) Vector Database Setup

We've added [semantic-search](https://huggingface.co/course/chapter5/6?fw=tf#using-embeddings-for-semantic-search) capabilities to pephub. Optionally, you may host an instance of the [qdrant](https://qdrant.tech/) **vector database** to store embeddings computed using a sentence transformer that has mined and processed any relevant metadata from PEPs. If no qdrant connection settings are supplied, pephub will default to SQL search. Read more [here](docs/semantic-search.md). To run qdrant locally, simply run the following:

```
docker pull qdrant/qdrant
docker run -p 6333:6333 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
qdrant/qdrant
```
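
The fallback described above (SQL search when no qdrant connection settings are supplied) can be sketched as follows. This is a minimal illustration only: the variable name `QDRANT_HOST` and the function `choose_search_backend` are assumptions, so check [the semantic search docs](docs/semantic-search.md) for the settings pephub actually reads.

```python
import os


def choose_search_backend(environ=None):
    """Return "semantic" when a qdrant host is configured, otherwise "sql"."""
    environ = os.environ if environ is None else environ
    # QDRANT_HOST is a hypothetical setting name used for illustration.
    host = environ.get("QDRANT_HOST", "").strip()
    return "semantic" if host else "sql"


if __name__ == "__main__":
    print(choose_search_backend())
```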

## Running with docker:

### Option 1. Standalone `docker`:

If you already have a public database instance running, you can choose to build and run the server container only. **A note to Apple Silicon (M1/M2) users**: If you have issues running, try setting your default docker platform with `export DOCKER_DEFAULT_PLATFORM=linux/amd64` to get the container to build and run properly. See [this issue](https://github.com/pepkit/pephub/issues/87) for more information.

**1. Environment:**
Ensure that you have your [environment](docs/server-settings.md) properly configured. To manage secrets in your environment, we leverage `pass` and curated [`.env` files](environment/production.env). You can use our `launch_docker.sh` script to start your container with these `.env` files.

**2. Build and start container:**

```
docker build -t pephub .
./launch_docker.sh
```

Alternatively, you can inject your environment variables one-by-one:

```
docker run -p 8000:8000 \
-e POSTGRES_HOST=localhost \
-e POSTGRES_DB=pep-db \
...
pephub
```

Or, provide your own `.env` file:

```
docker run -p 8000:8000 \
--env-file path/to/.env \
pephub
```

### Option 2. `docker compose`:

The server has been Dockerized and packaged with a [postgres](https://hub.docker.com/_/postgres) image to be run with [`docker compose`](https://docs.docker.com/compose/). This lets you run everything at once and develop without having to manage database instances.

You can start a development environment in two steps:

**1. Curate your environment:**
Since we are running in `docker`, we need to supply environment variables to the container. The `docker-compose.yaml` file is written such that you can supply a `.env` file at the root with your configurations. See the [example env file](environment/template.env) for reference. See [here](docs/server-settings.md) for a detailed explanation of all configurable server settings. For now, you can simply copy the `env` file:

```
cp environment/template.env .env
```

**2. Build and start the containers:**

```console
docker compose up --build
```

`pephub` now listens on http://localhost:8000, and `postgres` listens on `localhost:5432`.
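
To confirm that both containers are reachable, a small TCP check along these lines may help. It is a sketch that assumes the default ports mentioned above (8000 and 5432); the helper name `is_listening` is hypothetical and not part of pephub.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    # Ports assumed from the docker compose description above.
    for name, port in (("pephub", 8000), ("postgres", 5432)):
        state = "up" if is_listening("localhost", port) else "down"
        print(f"{name} ({port}): {state}")
```

The check only proves that something accepts connections on the port; it does not verify that the service behind it is healthy.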

**3. (_Optional_) Utilize the [`load_db`](scripts/load_db.py) script to populate the database with `examples/`:**

```console
cd scripts
python load_db.py \
--username docker \
--password password \
--database pephub \
../examples
```

**4. (_Optional_) GitHub Authentication Client Setup**

_pephub_ uses GitHub for namespacing and authentication. As such, a GitHub application capable of logging in users is required. We've [included instructions](https://github.com/pepkit/pephub/blob/master/docs/authentication.md#setting-up-github-oauth-for-your-own-server) for setting this up locally using your own GitHub account.

**5. (_Optional_) Vector Database Setup**

We've added [semantic-search](https://huggingface.co/course/chapter5/6?fw=tf#using-embeddings-for-semantic-search) capabilities to pephub. Optionally, you may host an instance of the [qdrant](https://qdrant.tech/) **vector database** to store embeddings computed using a sentence transformer that has mined and processed any relevant metadata from PEPs. If no qdrant connection settings are supplied, pephub will default to SQL search. Read more [here](docs/semantic-search.md). To run qdrant locally, simply run the following:

```
docker pull qdrant/qdrant
docker run -p 6333:6333 \
-v $(pwd)/qdrant_storage:/qdrant/storage \
qdrant/qdrant
```

_Note: If you wish to run the development environment with a public database, adjust your `.env` file accordingly._

12 changes: 12 additions & 0 deletions docs/changelog.md
@@ -2,6 +2,18 @@

This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) and [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.

## [0.11.6] - 02-08-2024

### Fixed

- Docs and docs links
- Bug in handsontable
- Response errors in samples and views

### Added

- Namespace endpoint

## [0.11.5] - 02-02-2024

### Fixed