Commit

Merge pull request #8 from PennChopMicrobiomeProgram/dev
Add CI and conda env
kylebittinger authored Apr 18, 2023
2 parents be037e7 + fc2bb4a commit dd85c65
Showing 16 changed files with 354 additions and 112 deletions.
35 changes: 35 additions & 0 deletions .github/workflows/codecov.yml
@@ -0,0 +1,35 @@
name: CodeCov

on:
  pull_request:
    branches: [master, main]
  push:
    branches: [master, main]
  schedule:
    - cron: "0 13 * * 1"

jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Set up Python 3.10
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: setup-conda
        uses: s-weigand/setup-conda@v1.1.0
        with:
          conda-channels: ''
      - name: Install dependencies
        run: |
          conda env update --file primertrim_env.yml
          python -m pip install --upgrade pip
          python -m pip install pytest
          python -m pip install pytest-cov
          python -m pip install -e .
      - name: Run tests and collect coverage
        run: pytest --cov tests/
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
26 changes: 26 additions & 0 deletions .github/workflows/linter.yml
@@ -0,0 +1,26 @@
name: Super-Linter

on:
  pull_request:
    branches: [master, main, dev]
  push:
    branches: [master, main]

jobs:
  super-linter:
    name: Lint Codebase
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Code
        uses: actions/checkout@v3

      - name: Run Super-Linter
        uses: github/super-linter@v4
        env:
          VALIDATE_ALL_CODEBASE: true
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

          VALIDATE_PYTHON_BLACK: true

          FILTER_REGEX_INCLUDE: primertrim/.*|tests/.*|setup.py
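
As a rough illustration of which paths the `FILTER_REGEX_INCLUDE` pattern in `linter.yml` would select, the sketch below applies the same expression with Python's `re.search`. Super-Linter evaluates the pattern itself rather than through Python, so this only approximates its behavior; the helper name `is_linted` and the sample paths are hypothetical.

```python
import re

# Pattern copied verbatim from FILTER_REGEX_INCLUDE in linter.yml.
FILTER_REGEX_INCLUDE = r"primertrim/.*|tests/.*|setup.py"


def is_linted(path):
    """Return True if the pattern matches anywhere in the path (unanchored search)."""
    return re.search(FILTER_REGEX_INCLUDE, path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each alternative of the pattern.
    for path in ["primertrim/align.py", "tests/test_align.py", "setup.py", "README.md"]:
        print(path, is_linted(path))
```

The `.` in `setup.py` is unescaped, so it matches any single character; writing `setup\.py` would make that alternative stricter.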
39 changes: 39 additions & 0 deletions .github/workflows/python-publish.yml
@@ -0,0 +1,39 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Upload Python Package

on:
  release:
    types: [published]

permissions:
  contents: read

jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build
      - name: Build package
        run: python -m build
      - name: Publish package
        uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
45 changes: 45 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,45 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: Tests

on:
  pull_request:
    branches: [master, main, dev]
  push:
    branches: [master, main]
  schedule:
    - cron: "0 13 * * 1"

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.8", "3.9", "3.10"]

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: setup-conda
        uses: s-weigand/setup-conda@v1.1.0
        with:
          conda-channels: ''
      - name: Install dependencies
        run: |
          conda env update --file primertrim_env.yml
          python -m pip install --upgrade pip
          python -m pip install pytest
          python -m pip install -e .
      - name: Unit tests
        run: |
          pytest -vvl tests/
      - name: Other tests
        if: matrix.python-version == '3.10'
        run: |
          echo "Other tests"
12 changes: 11 additions & 1 deletion README.md
@@ -1,13 +1,23 @@
 # Primer trim

+<!-- Begin badges -->
+[![Tests](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/tests.yml/badge.svg)](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/tests.yml)
+[![CodeCov](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/codecov.yml/badge.svg)](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/codecov.yml)
+[![Super-Linter](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/linter.yml/badge.svg)](https://github.com/PennChopMicrobiomeProgram/primertrim/actions/workflows/linter.yml)
+<!-- End badges -->
+
 Detect short primer sequences in FASTQ reads and trim the reads accordingly.

 ## Installation

+Primertrim requires vsearch. Our recommended installation method is through conda, as shown here:
+
 ```bash
 git clone https://github.com/PennChopMicrobiomeProgram/primertrim.git
 cd primertrim
-pip install -e .
+conda env create -f primertrim_env.yml -n primertrim
+conda activate primertrim
+pip install .
 ```

 ## Algorithm
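
The installation steps in the README hunk rely on vsearch being available once the conda environment is active. A minimal way to confirm that from Python, assuming `primertrim_env.yml` provides the `vsearch` executable, is sketched here; `find_executable` is a hypothetical helper and not part of primertrim.

```python
import shutil


def find_executable(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("vsearch")
    if path is None:
        print("vsearch not found; activate the primertrim conda environment first")
    else:
        print("vsearch found at", path)
```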
58 changes: 42 additions & 16 deletions primertrim/align.py
@@ -3,8 +3,21 @@
 import tempfile

 DEFAULT_BLAST_FIELDS = [
-    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
-    "qstart", "qend", "sstart", "send", "qlen", "slen", "qseq", "sseq", "qstrand",
+    "qseqid",
+    "sseqid",
+    "pident",
+    "length",
+    "mismatch",
+    "gapopen",
+    "qstart",
+    "qend",
+    "sstart",
+    "send",
+    "qlen",
+    "slen",
+    "qseq",
+    "sseq",
+    "qstrand",
 ]

 BLAST_FIELD_TYPES = {
@@ -43,10 +56,12 @@
     "qstrand": "qstrand",
 }

+
 def write_fasta(f, seqs):
     for seq_id, seq in seqs:
         f.write(">{}\n{}\n".format(seq_id, seq))

+
 class VsearchAligner:
     def __init__(self, ref_seqs_fp):
         self.ref_seqs_fp = ref_seqs_fp
@@ -58,7 +73,8 @@ def search(self, seqs, input_fp=None, output_fp=None, **kwargs):
         """Search seqs and return hits"""
         if input_fp is None:
             infile = tempfile.NamedTemporaryFile(
-                suffix=".fasta", mode="w+t", encoding="utf-8")
+                suffix=".fasta", mode="w+t", encoding="utf-8"
+            )
             write_fasta(infile, seqs)
             infile.seek(0)
             input_fp = infile.name
@@ -67,8 +83,7 @@ def search(self, seqs, input_fp=None, output_fp=None, **kwargs):
                 write_fasta(f, seqs)

         if output_fp is None:
-            outfile = tempfile.NamedTemporaryFile(
-                suffix=".txt", mode="wt")
+            outfile = tempfile.NamedTemporaryFile(suffix=".txt", mode="wt")
             output_fp = outfile.name

         self._call(input_fp, output_fp, **kwargs)
@@ -82,18 +97,29 @@ def _call(self, query_fp, output_fp, min_id=0.85, threads=None):
         userfields_arg = "+".join(BLAST_TO_VSEARCH[f] for f in self.fields)
         args = [
             "vsearch",
-            "--usearch_global", query_fp,
-            "--minseqlength", "10",
-            "--mincols", "10",
-            "--id", id_arg,
-            "--wordlength", "4",
-            "--strand", "both",
-            "--maxaccepts", "4",
-            "--minwordmatches", "3",
+            "--usearch_global",
+            query_fp,
+            "--minseqlength",
+            "10",
+            "--mincols",
+            "10",
+            "--id",
+            id_arg,
+            "--wordlength",
+            "4",
+            "--strand",
+            "both",
+            "--maxaccepts",
+            "4",
+            "--minwordmatches",
+            "3",
             "--top_hits_only",
-            "--userfields", userfields_arg,
-            "--db", self.ref_seqs_fp,
-            "--userout", output_fp,
+            "--userfields",
+            userfields_arg,
+            "--db",
+            self.ref_seqs_fp,
+            "--userout",
+            output_fp,
         ]
         if threads is not None:
             threads_arg = "{:d}".format(threads)
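
The pattern visible in these `align.py` hunks (write the query sequences as FASTA, then assemble the vsearch command as a list of strings) can be sketched in isolation as follows. This is a simplified illustration, not the module's actual code: `build_vsearch_args` is a hypothetical helper, only a subset of the flags is included, and both the formatting of the `--id` value and the use of `--threads` are assumptions, since those lines are collapsed in this diff. The userfield names in the usage lines are illustrative.

```python
import io


def write_fasta(f, seqs):
    """Write (seq_id, seq) pairs to an open text handle in FASTA format."""
    for seq_id, seq in seqs:
        f.write(">{}\n{}\n".format(seq_id, seq))


def build_vsearch_args(query_fp, db_fp, output_fp, userfields, min_id=0.85, threads=None):
    """Assemble a vsearch command line as a list of strings (hypothetical helper)."""
    args = [
        "vsearch",
        "--usearch_global", query_fp,
        "--id", "{:.2f}".format(min_id),  # assumed formatting of the identity threshold
        "--strand", "both",
        "--top_hits_only",
        "--userfields", "+".join(userfields),
        "--db", db_fp,
        "--userout", output_fp,
    ]
    if threads is not None:
        # Assumption: the thread count is passed through vsearch's --threads option.
        args.extend(["--threads", "{:d}".format(threads)])
    return args


if __name__ == "__main__":
    buf = io.StringIO()
    write_fasta(buf, [("read1", "ACGTACGTAC")])
    print(buf.getvalue(), end="")
    print(build_vsearch_args("query.fasta", "primers.fasta", "hits.txt",
                             ["query", "target", "id"], threads=2))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.check_call` without shell quoting concerns.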