Step 00 ‐ check data

gibran hemani edited this page Oct 20, 2024 · 7 revisions

This step checks the input genotype data, and performs some basic formatting. Notes:

  • Make sure that your genotype data are in PLINK bed/bim/fam format, with a separate fileset for each chromosome.
  • On 450k UK Biobank samples this step ran in about 25 minutes with env_threads=100 and used approximately 10 GB of RAM.
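Before running the step, it can help to confirm that every chromosome has a complete bed/bim/fam fileset. The following is a minimal sketch, not part of the pipeline: the function name `check_plink_filesets`, the `<prefix><chr>.<ext>` naming scheme, and the restriction to autosomes 1 to 22 are all assumptions, so adjust them to match your own data layout.

```shell
#!/usr/bin/env bash
# Sketch only: check_plink_filesets is a hypothetical helper, not a pipeline script.
# Assumes files are named <prefix><chr>.bed/.bim/.fam for chromosomes 1..22.

check_plink_filesets() {
    local prefix="$1"   # hypothetical path prefix, e.g. /path/to/genotypes/chr
    local missing=0
    local chr ext
    for chr in $(seq 1 22); do
        for ext in bed bim fam; do
            # -s is true only if the file exists and is not empty
            if [ ! -s "${prefix}${chr}.${ext}" ]; then
                echo "missing or empty: ${prefix}${chr}.${ext}"
                missing=$((missing + 1))
            fi
        done
    done
    # Exit status 0 if nothing is missing, 1 otherwise
    return $(( missing > 0 ))
}

# Example (hypothetical path):
#   check_plink_filesets /path/to/genotypes/chr && ./00-check-data.sh
```

The function only tests that the files exist and are non-empty; the content checks are still done by 00-check-data.sh itself.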

To run

./00-check-data.sh

Check and archive the results

./utils/archive.sh 00

Upload the results to the SFTP

./utils/upload.sh 00

(Note that this will prompt you for your SFTP password. Contact us if you haven't received it.)
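The three commands above can be chained so that archiving and uploading only happen when the previous command succeeds. This is a minimal sketch: `run_step_00` is a hypothetical wrapper, and it assumes each script exits with a non-zero status on failure, which is not confirmed here. Run it from the repository root.

```shell
#!/usr/bin/env bash
# Sketch only: run_step_00 is a hypothetical wrapper, not part of the pipeline.
# Assumes each script returns a non-zero exit status when it fails.

run_step_00() {
    ./00-check-data.sh    || { echo "check failed; not archiving" >&2; return 1; }
    ./utils/archive.sh 00 || { echo "archive failed; not uploading" >&2; return 1; }
    ./utils/upload.sh 00  # prompts for the SFTP password
}

# Example:
#   run_step_00
```

Running the commands one at a time, as described above, is equally valid and makes it easier to inspect the archive output before uploading.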

(Optional) Running in containers

Docker

Make sure you've pulled the latest image:

docker pull mrcieu/lifecourse-gwas:latest

and then run:

./utils/run_docker.sh ./00-check-data.sh

Apptainer

Make sure you've pulled the latest image:

apptainer pull docker://mrcieu/lifecourse-gwas:latest

and then run:

./utils/run_apptainer.sh ./00-check-data.sh