Update README.md
HolEv authored May 16, 2024
1 parent da04904 commit 7e3c63b
Showing 1 changed file with 12 additions and 20 deletions.
32 changes: 12 additions & 20 deletions association_testing/README.md
Original file line number Diff line number Diff line change
@@ -1,46 +1,38 @@
# Instructions for reproducing results related to Fig. 3 and associated supplementary figures
# Instructions for reproducing results related to association testing figures from the DeepRVAT paper (Fig. 2 and Fig. 4) and associated supplementary figures

All steps should be carried out using the deeprvat conda environment available in the [main DeepRVAT repository](https://github.com/PMBio/deeprvat/).

Preprocess and annotate the UKBB WES 200k release following the instructions in the [main DeepRVAT repository](https://github.com/PMBio/deeprvat/).
Preprocess and annotate the UKBB WES data following the instructions in the [main DeepRVAT repository](https://github.com/PMBio/deeprvat/).

## Run Burden/SKAT baselines (seed gene discovery)
First, run the seed gene discovery (burden and SKAT tests) with the `config.yaml` files in `burden_skat` and `burden_skat_binary` using the [seed gene discovery pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/seed_gene_discovery.snakefile).
First, run the seed gene discovery (burden and SKAT tests) with the `config.yaml` file in `burden_skat` using the [seed gene discovery pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/seed_gene_discovery.snakefile).

## Run DeepRVAT experiments for main figures

Once this is finished, run the [DeepRVAT training/association testing pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/training_association_testing.snakefile) in the `paper_experiment` folder, following the instructions in the [main DeepRVAT repository](https://github.com/PMBio/deeprvat/) (this uses `paper_experiment/config.yaml` as the config file).
Once this is finished, run the [DeepRVAT CV training/association testing pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/cv_training/cv_training_association_testing.snakefile) in the `deeprvat_main_exp` folder, following the instructions in the [main DeepRVAT repository](https://github.com/PMBio/deeprvat/) (this uses `deeprvat_main_exp/config.yaml` as the config file).

To use the pre-trained DeepRVAT model from `paper_experiment` on quantitative and binary phenotypes that the model was not trained on, create a symlink `pretrained_models` pointing to `paper_experiment/models` in `paper_experiment` & `paper_experiment_binary`.
Then run the [DeepRVAT association testing pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/association_testing_pretrained.snakefile) in `deeprvat_pretrained_quantitative` and `deeprvat_pretrained_binary`.
See the instructions for [CV training](https://deeprvat.readthedocs.io/en/latest/deeprvat.html#training-and-association-testing-using-cross-validation) to run the experiment.
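
As a rough illustration of the CV pipeline step, the sketch below wraps the Snakemake call in a small helper. It is not taken from the DeepRVAT documentation: `DEEPRVAT_DIR` (the location of a local clone of the main repository) and the helper name `run_cv_pipeline` are hypothetical, and the sketch assumes the snakefile picks up `config.yaml` from the directory it is run in. Check the linked instructions for the authoritative invocation.

```shell
#!/usr/bin/env bash
# Hypothetical location of a local clone of the main DeepRVAT repository;
# adjust it to your setup.
DEEPRVAT_DIR="${DEEPRVAT_DIR:-$HOME/deeprvat}"

# Run the CV training/association testing pipeline inside one experiment
# directory. Extra arguments (e.g. -n for a dry run) are passed to snakemake.
# Assumption: the snakefile reads config.yaml from the working directory.
run_cv_pipeline() {
    local exp_dir="$1"
    shift
    local snakefile="$DEEPRVAT_DIR/pipelines/cv_training/cv_training_association_testing.snakefile"
    # Use a subshell so the caller's working directory is left unchanged.
    (cd "$exp_dir" && snakemake -j 1 --snakefile "$snakefile" "$@")
}

# Example: dry run first, then the real run (uncomment to execute).
# run_cv_pipeline deeprvat_main_exp -n
# run_cv_pipeline deeprvat_main_exp
```

A dry run (`-n`) lists the jobs Snakemake would execute without running them, which is a cheap way to catch a wrong path or config before starting training. Increase `-j` if more cores are available.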

Then run
```
for x in $(\ls deeprvat_pretrained_quantitative | grep "^[A-Z]")
do
ln -rs deeprvat_pretrained_quantitative/$x paper_experiment/$x
done
```
to link all the results for all quantitative phenotypes into one folder (required for the plotting markdown).

## Run DeepRVAT experiments for supplementary figures

For the subdirectories `plof_missense_anno`, `linear_model`, and `repeat_analysis`, with their associated `config.yaml` files, run the [DeepRVAT training/association testing pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/training_association_testing.snakefile).
For the subdirectories `plof_missense_anno` and `linear_model`, with their associated `config.yaml` files, run the [DeepRVAT CV training/association testing pipeline](https://github.com/PMBio/deeprvat/blob/main/pipelines/cv_training/cv_training_association_testing.snakefile).

Before running the `permutation_analysis` experiment, run the script `permute_phenotypes.sh` to get a phenotypes dataframe with permuted phenotypes.

## Prepare additional data required for plotting
To compute the replication results, run:
```
for EXP in paper_experiment linear_model plof_missense_anno
for EXP in deeprvat_main_exp linear_model plof_missense_anno
do
python compute_replication.py --out-dir $EXP $EXP
done
EXP=repeat_analysis && python compute_replication.py --analyze-all-repeats --out-dir $EXP $EXP
```
## Experiments for Figure 4
The DeepRVAT model used to compute DeepRVAT gene impairment scores for all analyses in Figure 4 is the one trained in `deeprvat_main_exp`.
Follow the main DeepRVAT instructions for using [DeepRVAT with REGENIE](https://deeprvat.readthedocs.io/en/latest/deeprvat.html#running-the-association-testing-pipeline-with-regenie) with precomputed gene impairment scores, for quantitative and binary traits and on diverse subsets of samples.
To run the default REGENIE RVAT tests (burden/SKAT), see the [REGENIE documentation](https://rgcgithub.github.io/regenie/).

## Get the paper figures

Use the notebooks `figure_3_main.Rmd` and `figure_3_supp.Rmd` to analyze the results. For this, the experiments in `../comparison_methods/{monti/staar}/experiments` also have to be run.


Use the markdown files in this directory to generate all figures. For this, the experiments in `../comparison_methods/{monti/staar}/experiments` also have to be run.
