Simple-genetic-correlation is a small tool for calculating and visualizing genetic correlations between traits from GWAS (genome-wide association study) summary statistics.
- R (with GenomicSEM, utils, and Matrix packages)
- Python 3 (with pandas and matplotlib packages)
- Clone the repository and enter the directory that git creates:
    git clone git@github.com:gpmerola/Simple-genetic-correlation.git
    cd Simple-genetic-correlation
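- Install R packages (GenomicSEM is distributed through GitHub rather than CRAN, so it is installed with devtools; utils ships with R and needs no installation):
    install.packages(c("devtools", "Matrix"))
    devtools::install_github("GenomicSEM/GenomicSEM")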
- Install Python packages:
    pip install pandas matplotlib
Run the main.R script to compute the trait correlations and generate a results table.
Rscript main.R
Run the corrplot.py script to create a visual representation of the trait correlations.
python corrplot.py
- Correlation_input.csv: A CSV file with one row per trait, giving the summary statistics file name (trait), the trait name (code), the sample prevalence (sampleprev), and the population prevalence (popprev).
trait,code,sampleprev,popprev
Trait1.gz,trait1,0.1,0.01
Trait2.gz,trait2,0.2,0.02
Trait3.gz,trait3,0.15,0.015
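A malformed input file is easier to catch before the R step than after it. The following is a minimal sketch of such a check, not part of the repository: the function name `validate_input` is hypothetical, and the rules it applies (prevalences must be numbers strictly between 0 and 1, with blank or `NA` tolerated) are assumptions about what the scripts expect.

```python
import csv
import io

# Column names taken from the example header shown above.
REQUIRED_COLUMNS = ["trait", "code", "sampleprev", "popprev"]


def validate_input(handle):
    """Return a list of problems found in a Correlation_input.csv-style file.

    Hypothetical helper: the accepted value ranges are assumptions,
    not rules documented by the project.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in ("sampleprev", "popprev"):
            raw = (row[col] or "").strip()
            if raw in ("", "NA"):
                continue  # assumption: blank/NA is tolerated
            try:
                value = float(raw)
            except ValueError:
                problems.append(f"line {line_no}: {col} is not a number")
                continue
            if not 0 < value < 1:
                problems.append(f"line {line_no}: {col}={value} is outside (0, 1)")
    return problems


# The example rows from this README, read from memory instead of disk.
example = io.StringIO(
    "trait,code,sampleprev,popprev\n"
    "Trait1.gz,trait1,0.1,0.01\n"
    "Trait2.gz,trait2,0.2,0.02\n"
    "Trait3.gz,trait3,0.15,0.015\n"
)
print(validate_input(example))
```

To check a real file, pass an open file handle instead of the in-memory example, for instance `validate_input(open("Correlation_input.csv", newline=""))`.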
- Update these paths in the script according to your data storage locations:
  - Summary Statistics Path (paths_corr): Set to the directory holding the GWAS summary statistics.
  - LD Score Path (ld): Set to the directory with the LD score files used for analysis.
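Wrong paths are a common reason for a run to fail partway through, so it can help to confirm them first. The sketch below is an illustration only: `missing_files` is a hypothetical helper, and it assumes that every file name listed in Correlation_input.csv sits directly inside the summary statistics directory.

```python
import tempfile
from pathlib import Path


def missing_files(trait_files, sumstats_dir):
    """Return the listed file names that are absent from sumstats_dir.

    Hypothetical helper; assumes the files are stored flat in one directory.
    """
    base = Path(sumstats_dir)
    return [name for name in trait_files if not (base / name).is_file()]


# Demonstration with a throwaway directory that holds only one of two files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Trait1.gz").touch()
    print(missing_files(["Trait1.gz", "Trait2.gz"], tmp))
```

In practice, the second argument would be whatever directory paths_corr is set to; the same `Path(...).is_dir()` check can be applied to the LD score directory.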
- results_table.csv: A CSV file containing the calculated correlations and standard errors.
- correlation_plot.png: A PNG file visualizing the correlations with error bars.
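When reading the correlations alongside their standard errors, one common convention is to turn each pair into an approximate 95% interval (estimate ± 1.96 × standard error). The sketch below shows that arithmetic only; `confidence_interval` is a hypothetical helper, the numbers are illustrative rather than real results, and whether the plot's error bars use standard errors or intervals like this is not stated here.

```python
def confidence_interval(estimate, se, z=1.96):
    """Return (low, high) for a normal-approximation interval.

    z=1.96 corresponds to roughly 95% coverage under a normal approximation.
    """
    half_width = z * se
    return estimate - half_width, estimate + half_width


# Illustrative values only, not output from the tool.
low, high = confidence_interval(0.5, 0.1)
print(f"{low:.3f} to {high:.3f}")
```

An interval that excludes zero suggests the correlation is distinguishable from zero at roughly the 5% level under the same approximation.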
This project is licensed under the MIT License - see the LICENSE file for details.