Commit

Merge branch 'main' into annotation-speedups
“Marcel-Mueck” committed Feb 2, 2024
2 parents 6225278 + 21a8e46 commit 8ba9c47
Showing 415 changed files with 5,062 additions and 1,269 deletions.
13 changes: 13 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,13 @@
# What

*What does this PR do, and preferably how*

# Testing
*Testing this PR involves X,Y,Z*

## Test scenarios
*Test these things*

1. Instructions
2. ...
3. ...
35 changes: 35 additions & 0 deletions .github/workflows/autoblack_pull_request.yml
@@ -0,0 +1,35 @@
# GitHub Action that uses Black to reformat the Python code in an incoming pull request.
# If all Python code in the pull request is compliant with Black then this Action does nothing.
# Otherwise, Black is run and its changes are committed back to the incoming pull request.
# https://github.com/cclauss/autoblack

name: autoblack_pull_request
on: [ pull_request ]
jobs:
  black-code:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.head_ref }}
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install black
      - run: black --check .
      - name: If needed, commit black changes to the pull request
        if: failure()
        run: |
          printenv | grep GITHUB
          git config --global user.name 'PMBio'
          git config --global user.email 'PMBio@users.noreply.github.com'
          git remote set-url origin https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/$GITHUB_REPOSITORY
          git remote -v
          git branch
          git status
          black .
          git status
          echo ready to commit
          git commit -am "fixup! Format Python code with psf/black pull_request"
          echo ready to push
          git push
23 changes: 23 additions & 0 deletions .github/workflows/docs-tests.yml
@@ -0,0 +1,23 @@
name: "Pull Request Docs Check"
run-name: "Docs Check 📑📝"

on:
  - pull_request

jobs:
  docs-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: ammaraskar/sphinx-action@0.4
        with:
          docs-folder: "docs/"

  docs-link-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: ammaraskar/sphinx-action@0.4
        with:
          docs-folder: "docs/"
          build-command: "make linkcheck"
136 changes: 111 additions & 25 deletions .github/workflows/github-actions.yml
@@ -8,26 +8,33 @@ jobs:
steps:
- name: Check out repository code
uses: actions/checkout@v3
- name: Training Association Testing smoke test
uses: snakemake/snakemake-github-action@v1.24.0
- uses: mamba-org/setup-micromamba@v1.4.3
with:
directory: 'example'
snakefile: 'pipelines/training_association_testing.snakefile'
args: '-j 2 -n'
environment-name: deeprvat-gh-action
environment-file: ${{ github.workspace }}/deeprvat_env_no_gpu.yml
cache-environment: true
cache-downloads: true
- name: Smoketest training_association_testing pipeline
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example \
--snakefile ${{ github.workspace }}/pipelines/training_association_testing.snakefile --show-failed-logs
shell: micromamba-shell {0}
- name: Link pretrained models
run: cd ${{ github.workspace }}/example && ln -s ../pretrained_models
- name: Association Testing Pretrained Smoke Test
uses: snakemake/snakemake-github-action@v1.24.0
with:
directory: 'example'
snakefile: 'pipelines/association_testing_pretrained.snakefile'
args: '-j 2 -n'
- name: Seed Gene Discovery Smoke Test
uses: snakemake/snakemake-github-action@v1.24.0
with:
directory: 'example'
snakefile: 'pipelines/seed_gene_discovery.snakefile'
args: '-j 2 -n'
shell: bash -el {0}
- name: Smoketest association_testing_pretrained pipeline
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example \
--snakefile ${{ github.workspace }}/pipelines/association_testing_pretrained.snakefile --show-failed-logs
shell: micromamba-shell {0}
- name: Copy seed gene discovery snakemake config
run: cd ${{ github.workspace }}/example && cp ../deeprvat/seed_gene_discovery/config.yaml .
shell: bash -el {0}
- name: Smoketest seed_gene_discovery pipeline
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example \
--snakefile ${{ github.workspace }}/pipelines/seed_gene_discovery.snakefile --show-failed-logs
shell: micromamba-shell {0}

DeepRVAT-Pipeline-Tests:
runs-on: ubuntu-latest
@@ -76,15 +83,94 @@ jobs:
steps:
- name: Check out repository code
uses: actions/checkout@v3
- name: Preprocessing Smoke Test
uses: snakemake/snakemake-github-action@v1.24.0
- uses: mamba-org/setup-micromamba@v1.4.3
with:
environment-name: deeprvat-preprocess-gh-action
environment-file: ${{ github.workspace }}/deeprvat_preprocessing_env.yml
cache-environment: true
cache-downloads: true

- name: Fake fasta data
if: steps.cache-fasta.outputs.cache-hit != 'true'
run: |
cd ${{ github.workspace }}/example/preprocess && touch workdir/reference/GRCh38.primary_assembly.genome.fa
- name: Run preprocessing pipeline no qc Smoke Test
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example/preprocess \
--snakefile ${{ github.workspace }}/pipelines/preprocess_no_qc.snakefile \
--configfile ${{ github.workspace }}/pipelines/config/deeprvat_preprocess_config.yaml --show-failed-logs
shell: micromamba-shell {0}


- name: Preprocessing pipeline with qc Smoke Test
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example/preprocess \
--snakefile ${{ github.workspace }}/pipelines/preprocess_with_qc.snakefile \
--configfile ${{ github.workspace }}/pipelines/config/deeprvat_preprocess_config.yaml --show-failed-logs
shell: micromamba-shell {0}


DeepRVAT-Annotation-Pipeline-Smoke-Tests:
runs-on: ubuntu-latest
steps:
- name: Check out repository code
uses: actions/checkout@v3
- uses: mamba-org/setup-micromamba@v1.4.3
with:
environment-name: deeprvat-preprocess-gh-action
environment-file: ${{ github.workspace }}/deeprvat_preprocessing_env.yml
cache-environment: true
cache-downloads: true
- name: Annotations Smoke Test
run: |
python -m snakemake -n -j 2 --directory ${{ github.workspace }}/example/annotations \
--snakefile ${{ github.workspace }}/pipelines/annotations.snakefile \
--configfile ${{ github.workspace }}/pipelines/config/deeprvat_annotation_config.yaml --show-failed-logs
shell: micromamba-shell {0}


DeepRVAT-Preprocessing-Pipeline-Tests-No-QC:
runs-on: ubuntu-latest
needs: DeepRVAT-Preprocessing-Pipeline-Smoke-Tests
steps:
- name: Check out repository code
uses: actions/checkout@v3
- uses: mamba-org/setup-micromamba@v1.4.3
with:
directory: 'example/preprocess'
snakefile: 'pipelines/preprocess.snakefile'
args: '-j 2 -n --configfile pipelines/config/deeprvat_preprocess_config.yaml'
stagein: 'touch example/preprocess/workdir/reference/GRCh38.primary_assembly.genome.fa'
environment-name: deeprvat-preprocess-gh-action
environment-file: ${{ github.workspace }}/deeprvat_preprocessing_env.yml
cache-environment: true
cache-downloads: true

- name: Install DeepRVAT
run: pip install -e ${{ github.workspace }}
shell: micromamba-shell {0}

- name: Cache Fasta file
id: cache-fasta
uses: actions/cache@v3
with:
path: example/preprocess/workdir/reference
key: ${{ runner.os }}-reference-fasta

- name: Download and unpack fasta data
if: steps.cache-fasta.outputs.cache-hit != 'true'
run: |
cd ${{ github.workspace }}/example/preprocess && \
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_44/GRCh38.primary_assembly.genome.fa.gz \
-O workdir/reference/GRCh38.primary_assembly.genome.fa.gz \
&& gzip -d workdir/reference/GRCh38.primary_assembly.genome.fa.gz
- name: Run preprocessing pipeline
run: |
python -m snakemake -j 2 --directory ${{ github.workspace }}/example/preprocess \
--snakefile ${{ github.workspace }}/pipelines/preprocess_no_qc.snakefile \
--configfile ${{ github.workspace }}/pipelines/config/deeprvat_preprocess_config.yaml --show-failed-logs
shell: micromamba-shell {0}


DeepRVAT-Preprocessing-Pipeline-Tests:
DeepRVAT-Preprocessing-Pipeline-Tests-With-QC:
runs-on: ubuntu-latest
needs: DeepRVAT-Preprocessing-Pipeline-Smoke-Tests
steps:
@@ -120,6 +206,6 @@ jobs:
- name: Run preprocessing pipeline
run: |
python -m snakemake -j 2 --directory ${{ github.workspace }}/example/preprocess \
--snakefile ${{ github.workspace }}/pipelines/preprocess.snakefile \
--snakefile ${{ github.workspace }}/pipelines/preprocess_with_qc.snakefile \
--configfile ${{ github.workspace }}/pipelines/config/deeprvat_preprocess_config.yaml --show-failed-logs
shell: micromamba-shell {0}
23 changes: 21 additions & 2 deletions .github/workflows/test-runner.yml
@@ -4,6 +4,25 @@ on: [ push ]

jobs:
DeepRVAT-Tests-Runner:
runs-on: ubuntu-latest
steps:
- name: Check out repository code
uses: actions/checkout@v3
- uses: mamba-org/setup-micromamba@v1.4.3
with:
environment-name: deeprvat-preprocess-gh-action
environment-file: ${{ github.workspace }}/deeprvat_env_no_gpu.yml
cache-environment: true
cache-downloads: true

- name: Install DeepRVAT
run: pip install -e ${{ github.workspace }}
shell: micromamba-shell {0}
- name: Run pytest deeprvat
run: pytest -v ${{ github.workspace }}/tests/deeprvat
shell: micromamba-shell {0}

DeepRVAT-Tests-Runner-Preprocessing:
runs-on: ubuntu-latest
steps:

@@ -20,6 +39,6 @@ jobs:
run: pip install -e ${{ github.workspace }}
shell: micromamba-shell {0}

- name: Run pytest
run: pytest -v ${{ github.workspace }}/tests
- name: Run pytest preprocessing
run: pytest -v ${{ github.workspace }}/tests/preprocessing
shell: micromamba-shell {0}
1 change: 1 addition & 0 deletions .gitignore
@@ -164,3 +164,4 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
/docs/apidocs/
20 changes: 20 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,20 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"

sphinx:
  configuration: docs/conf.py
  fail_on_warning: true

python:
  install:
    - requirements: docs/requirements.txt
68 changes: 68 additions & 0 deletions CITATION.cff
@@ -0,0 +1,68 @@
cff-version: 1.2.0
title: DeepRVAT
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Brian
    family-names: Clarke
    orcid: 'https://orcid.org/0000-0002-6695-286X'
  - given-names: Eva
    family-names: Holtkamp
    orcid: 'https://orcid.org/0000-0002-2129-9908'
  - given-names: Hakime
    family-names: Öztürk
  - given-names: Marcel
    family-names: Mück
    orcid: 'https://orcid.org/0009-0000-3129-2630'
  - given-names: Magnus
    family-names: Wahlberg
    orcid: 'https://orcid.org/0009-0001-9140-2392'
  - given-names: Kayla
    family-names: Meyer
    orcid: 'https://orcid.org/0009-0003-5063-5266'
  - given-names: Felix
    family-names: Munzlinger
    orcid: 'https://orcid.org/0009-0005-1407-8145'
  - given-names: Felix
    family-names: Brechtmann
    orcid: 'https://orcid.org/0000-0002-0110-152X'
  - given-names: Florian Rupert
    family-names: Hölzlwimmer
    orcid: 'https://orcid.org/0000-0002-5522-2562'
  - given-names: Julien
    family-names: Gagneur
    orcid: 'https://orcid.org/0000-0002-8924-8365'
  - given-names: Oliver
    family-names: Stegle
    orcid: 'https://orcid.org/0000-0002-8818-7193'
identifiers:
  - type: doi
    value: 10.1101/2023.07.12.548506
repository-code: 'https://github.com/PMBio/deeprvat'
abstract: >-
  Integration of variant annotations using deep set networks
  boosts rare variant association genetics.
  Rare genetic variants can strongly predispose to disease,
  yet accounting for rare variants in genetic analyses is
  statistically challenging. While rich variant annotations
  hold the promise to enable well-powered rare variant
  association tests, methods integrating variant annotations
  in a data-driven manner are lacking. Here, we propose
  DeepRVAT, a set neural network-based approach to learn
  burden scores from rare variants, annotations and
  phenotypes. In contrast to existing methods, DeepRVAT
  yields a single, trait-agnostic, nonlinear gene impairment
  score, enabling both risk prediction and gene discovery in
  a unified framework. On 21 quantitative traits and
  whole-exome-sequencing data from UK Biobank, DeepRVAT
  offers substantial increases in gene discoveries and
  improved replication rates in held-out data. Moreover, we
  demonstrate that the integrative DeepRVAT gene impairment
  score greatly improves detection of individuals at high
  genetic risk. We show that pre-trained DeepRVAT scores
  generalize across traits, opening up the possibility to
  conduct highly computationally efficient rare variant
  tests.
license: MIT