custom_gwas

Python Code using CyVCF2 to run custom GWAS models on imputed VCF files

AUTHOR

Post Doc at Institute of Biological Psychiatry

PURPOSE

This is code written in python, that uses the CyVCF2 library to parse imputed, bgzip'ed and tabix indexed VCF files and accepts custom models written in R as a command line option to run a genome wide association study, using rpy2, an interface, that facilitates r-functions to be seamlessly run in python code.

USAGE

Usage can be obtained by running the following command:

python custom_gwas.py -h

which prints:

usage: custom_gwas.py [-h] --vcf VCF --pheno PHENO --covar COVAR --rscript RSCRIPT --model MODEL --out OUT

where

VCF => The bgzip, tab indexed VCF file containing dosage values in the format column 'DS'

PHENO => File of phenotypes with samples listed under the column name IID and a phenotype name of your choice

COVAR => File of covariates with samples listed under the column name IID and your choice of covariates, weights etc.

RSCRIPT => some file.R that contains one or more functions pertaining to the association models you wish to fit.

MODEL => Name of the particular function in the file.R that you wish to fit

OUT => Output file name

An example for the VCF, PHENO, COVAR and OUT files are provided in the test/ folder.

An example of a custom R-script and function is provided at /bin/custom_function.R

Ultimately the freedom is given to the user to specify columns of their choice in the phenotype and covariate files and thereby using them in the Rscript they provide in specifying the model. The only major requirements are that:

VCF contains DS column in the FORMAT fields
The pheno and covariate files can be interesected using an IID column
The custom gwas Rscript returns a dataframe with the column names: estimate se p

RUNNING A TEST

The script can be tested on the test data using the following command:

python /bin/custom_gwas.py \
--vcf /test/test.vcf.gz \
--pheno /test/test_pheno.txt \
--covar /test/test_covar.txt \
--rscript /bin/custom_function.R \ 
--model custom_fit \
--out test_out.txt

DEPENDENCIES

The code has the following dependencies:

python
cyvcf2
numpy
pandas
rpy2
R
dplyr
broom

All dependencies can be installed using conda or the requisite environment can be created using the environment.yml file with the following command:

conda create --file environment.yml

For any questions or bug reports, please create an issue on github.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

custom_gwas

AUTHOR

PURPOSE

USAGE

RUNNING A TEST

DEPENDENCIES

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
bin		bin
test		test
README.md		README.md
environment.yml		environment.yml

vaqm2/custom_gwas

Folders and files

Latest commit

History

Repository files navigation

custom_gwas

AUTHOR

PURPOSE

USAGE

RUNNING A TEST

DEPENDENCIES

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages