Configuration
Anton Ligterink edited this page Jun 25, 2021 · 5 revisions
Below is the full list of customizable parameters in the configuration file. Before running the toolkit, you also need to change the SLURM settings at the top of the `prstoolkit.sh` file. Also make sure that no directory variable ends with a '/' (forward slash).
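The trailing-slash rule can be checked before a job is submitted. The following is a minimal sketch, not part of the toolkit: `strip_trailing_slash` is a hypothetical helper and the path is a made-up example.

```shell
# Hypothetical helper (not part of the toolkit): drop one trailing '/' from a path.
strip_trailing_slash() {
    local path="$1"
    # Leave the root directory "/" untouched; otherwise remove a single trailing slash.
    if [ "$path" != "/" ]; then
        path="${path%/}"
    fi
    printf '%s\n' "$path"
}

PROJECT_DIR="/hpc/projects/my_prs/"   # made-up example path
PROJECT_DIR="$(strip_trailing_slash "$PROJECT_DIR")"
echo "$PROJECT_DIR"
```

The `${path%/}` expansion removes at most one trailing slash and leaves paths without one unchanged, so the helper is safe to apply to every directory variable.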
Parameter | Description |
---|---|
PRSMETHOD | Method to use [PLINK/RAPIDOPGS/PRSCS/PRSICE/NONE]. Pick NONE if you only wish to perform quality control. |
PROJECTNAME | Name of the project. |
PROJECT_DIR | Path to where the main analysis directory resides. |
OUTPUT_DIRNAME | Name of the output directory within the PROJECT_DIR directory. |
SUBPROJECT_DIR_NAME | Name of the (sub)project; this is used to create subfolders within the `OUTPUTDIR`. |
MAIN_WORKDIR_NAME | Name of the working directory within the main analysis directory, used for temporary files. |
LOG_DIRNAME | Name of the subdirectory of the PROJECT_DIR directory used for storing log files. |
QC | Whether quality control should be applied according to the MAF and INFO parameters. [YES/NO] |
MAF | Minimum minor allele frequency to keep variants, e.g. "0.005". |
INFO | Minimum imputation quality score to keep variants, e.g. "0.3". |
KEEP_TEMP_FILES | Keep the temporary files generated by the toolkit after the job ends. [TRUE/FALSE] |
SAVE_CONFIG | Save a copy of this configuration file along with the results. [TRUE/FALSE] |
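As an illustration of the bracketed options above, the sketch below checks a few settings against their allowed values. It assumes the configuration file uses shell-style `KEY="value"` assignments, which this page does not state. `check_option` and the example values are hypothetical and not part of the toolkit.

```shell
# Example values written as shell-style assignments (an assumption about the config format).
PRSMETHOD="PRSCS"
QC="YES"
KEEP_TEMP_FILES="FALSE"
SAVE_CONFIG="TRUE"

# Hypothetical helper: succeed if value ($2) equals one of the allowed options ($3...).
# Otherwise print an error to stderr and return 1. Matching is exact and case-sensitive.
check_option() {
    local name="$1" value="$2" allowed
    shift 2
    for allowed in "$@"; do
        [ "$value" = "$allowed" ] && return 0
    done
    echo "Invalid $name: '$value' (allowed: $*)" >&2
    return 1
}

check_option PRSMETHOD "$PRSMETHOD" PLINK RAPIDOPGS PRSCS PRSICE NONE
check_option QC "$QC" YES NO
check_option KEEP_TEMP_FILES "$KEEP_TEMP_FILES" TRUE FALSE
check_option SAVE_CONFIG "$SAVE_CONFIG" TRUE FALSE
```

Whether the toolkit itself accepts lower-case values is not documented here, so the exact match is a conservative choice for this sketch.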
Parameter | Description | RapidoPGS | PRS-CS | PRSice | PLINK |
---|---|---|---|---|---|
BASEDATA | Path to the file containing the base data. | R | R | R | R |
BF_BUILD | Build of the base file, e.g. "hg19" or "hg38". | R | | | |
BF_ID_COL | Name of the SNP ID column in the base file. | R | R | R | R |
BF_CHR_COL | Name of the chromosome column in the base file. | R | R | | |
BF_POS_COL | Name of the position column in the base file. | R | R | | |
BF_EFFECT_COL | Name of the effect allele column in the base file. | R | R | R | R |
BF_NON_EFFECT_COL | Name of the non-effect allele column in the base file. | R | R | R | |
BF_STAT | Type of measure in the BF_STAT_COL, either "beta" or "or". | * | R | R | |
BF_STAT_COL | Name of the beta/OR/effect size column in the base file. | R | R | R | R |
BF_FRQ_COL | Name of the effect allele frequency column in the base file. | R/O** | | | |
BF_SE_COL | Name of the column containing the standard error of the beta/OR value. | R | | | |
BF_PVALUE_COL | Name of the column containing the P-values of the association test. | R | R | R | |
BF_SBJ_COL | Name of the column containing the sample size for each variant. | R/O*** | | | |
BF_SAMPLE_SIZE | Sample size of the GWAS. | R/O*** | R | | |
BF_TARGET_TYPE | "cc" for a case-control trait, "quant" for a quantitative trait. | R | | | |
LDDATA | Path to the linkage disequilibrium reference data. PRS-CS and PRSice require a different format. | | R**** | O***** | |
VALIDATIONDATA | Path to the directory containing the validation data, e.g. `/hpc/data/_ae_originals`. | R | R | R | R |
VALIDATIONPREFIX | Prefix of the validation files in BGEN format v1.2, excluding the chr-number and extension, e.g. `aegs_combo_1kGp3GoNL5_RAW_chr`. | R | R | R | R |
VAL_REF_POS | Position of the reference allele in the BGEN files relative to the alternative allele: ref-first, ref-last, or ref-unknown. | R | R | R | |
SAMPLE_FILE | Path to the sample file. A description of the sample file format can be found here. | R | R | R | R |
PRSICE_PHENOTYPE | Phenotype used by PRSice to find the best-fitting set of polygenic scores. This phenotype must be present in the sample file. | | | R | |
PRSICE_PHENOTYPE_BINARY | [TRUE/FALSE] indicating whether PRSICE_PHENOTYPE contains a binary phenotype. | | | R | |
STATS_FILE | Path to the stats file. | O | O | O | O |
STATS_ID_COL | Name of the stats file column containing the SNP IDs. These IDs must match the IDs that occur in the base file. | O | O | O | O |
STATS_MAF_COL | Name of the stats file column containing the minor allele frequency. | O | O | O | O |
STATS_INFO_COL | Name of the stats file column containing the imputation score. | O | O | O | O |
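To show how the stats file columns relate to the MAF and INFO thresholds, here is a minimal sketch of such a filter. It is not the toolkit's own implementation. It assumes a whitespace-delimited stats file with a header row, treats both thresholds as inclusive minimums, and uses made-up column names and data. `filter_variants` is a hypothetical helper.

```shell
# Made-up settings for this sketch; the column names are assumptions.
STATS_ID_COL="SNP"
STATS_MAF_COL="MAF"
STATS_INFO_COL="INFO"
MAF="0.005"
INFO="0.3"

# Hypothetical helper: read a stats file from stdin and print the IDs of
# variants whose MAF and INFO values both reach the configured minimums.
filter_variants() {
    awk -v id="$STATS_ID_COL" -v maf="$STATS_MAF_COL" -v info="$STATS_INFO_COL" \
        -v min_maf="$MAF" -v min_info="$INFO" '
        NR == 1 {
            # Map each header name to its column number.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!(id in col) || !(maf in col) || !(info in col)) {
                print "Missing column in stats file header" > "/dev/stderr"
                exit 1
            }
            next
        }
        # Force numeric comparison by adding 0 to each operand.
        ($(col[maf]) + 0) >= (min_maf + 0) && ($(col[info]) + 0) >= (min_info + 0) {
            print $(col[id])
        }'
}

printf 'SNP MAF INFO\nrs1 0.20 0.95\nrs2 0.001 0.99\nrs3 0.10 0.25\nrs4 0.005 0.3\n' | filter_variants
```

With the example data, `rs2` is dropped for its low MAF and `rs3` for its low INFO score, so only `rs1` and `rs4` are printed. Columns are looked up by header name, so the column order in the stats file does not matter.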