Hello, I am having trouble converting a GWAS file from genome build 38 (hg38) to genome build 37 (hg19). To do this, I set "build" to hg38 in the GWAS .json file, then provided the two prepared dbSNP files for hg19 (--dbsnp-1 and --dbsnp-2) and a --chain-file that converts from hg38 to hg19.
The program then finishes at Step 3 without performing LiftOver, as shown below:
SumStatsRehab v1.2.1 - fix command
input build: hg38
=== Step 1: Format the GWAS SS file ===
the SumStats file is a gzip. Unpacking
Step 1 finished in 67.09121966362 seconds
=== Step 2: Validate entries in the formatted GWAS SS file and save the report ===
number of lines in the file: 20155434
validating entries : 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [02:08<00:00, 157126.27it/s]
calculating reports: 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [01:15<00:00, 268250.15it/s]
generating reports
found issues:
rsID: 790496/20155433 (3.92%)
Step 2 finished in 205.60818552970886 seconds
=== Step 3: Analyze the report and prepare for REHAB ===
Step 3 finished in 0.0001633167266845703 seconds
The input file has nothing to resolve
To see whether it would make a difference, I instead set the "build" of my hg38 GWAS to hg19 in the .json file and provided the same --dbsnp and --chain-file files as above. This time LiftOver was performed successfully:
SumStatsRehab v1.2.1 - fix command
input build: hg19
=== Step 1: Format the GWAS SS file ===
the SumStats file is a gzip. Unpacking
Step 1 finished in 68.19040560722351 seconds
=== Step 2: Validate entries in the formatted GWAS SS file and save the report ===
number of lines in the file: 20155434
validating entries : 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [02:07<00:00, 157908.08it/s]
calculating reports: 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [01:15<00:00, 265598.63it/s]
generating reports
found issues:
rsID: 790496/20155433 (3.92%)
Step 2 finished in 205.75064539909363 seconds
=== Step 3: Analyze the report and prepare for REHAB ===
lifting over : 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [01:29<00:00, 226464.01it/s]
finished liftover to hg38 (saved report)
number of lines in the file: 20155434
validating entries : 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [02:10<00:00, 154159.25it/s]
calculating reports: 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [01:16<00:00, 262338.98it/s]
generating reports
found issues:
rsID: 790496/20155433 (3.92%)
Chr: 834211/20155433 (4.14%)
BP: 826569/20155433 (4.1%)
790496/20155433 entries are missing rsID
Going to sort the GWAS SS file by Chr and BP
Sorted by Chr and BP
Step 3 finished in 341.96107244491577 seconds
=== Step 4: REHAB: loopping through the GWAS SS file and fixing entries ===
loop-fix : 100%|████████████████████████████████████████████████████████████████████████| 20155433/20155433 [26:07<00:00, 12858.91it/s]
Step 4 finished in 1567.431545972824 seconds
=== Step 5: Validate entries in the fixed GWAS SS file and save the report ===
number of lines in the file: 20155434
validating entries : 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [02:11<00:00, 153484.10it/s]
calculating reports: 100%|███████████████████████████████████████████████████████████████████████| 20155433/20155433 [01:17<00:00, 259726.21it/s]
generating reports
found issues:
rsID: 784818/20155433 (3.89%)
Chr: 834211/20155433 (4.14%)
BP: 826569/20155433 (4.1%)
Step 5 finished in 211.63463640213013 seconds
=== Step 6: Analyze the report after REHAB ===
lost 834211 (4.14%) "Chr" fields after liftover
lost 826569 (4.1%) "BP" fields after liftover
restored 5678 (0.03%) "rsID" fields
Step 6 finished in 0.00021147727966308594 seconds
Those issues which were possible to resolve have been resolved
This produces a file that appears to have been converted from hg38 to hg19 successfully. Is the LiftOver function of SumStatsRehab only intended to convert from lower genome builds to hg38?
It seems that declaring an hg38 GWAS as hg19, then supplying dbSNP files for that lower build together with a chain file that goes from hg38 to hg19, works. However, declaring the hg38 GWAS as its actual build does nothing when the goal is to convert it to hg19.
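For anyone comparing the two runs, the "found issues:" summaries can be pulled out of the pasted logs and diffed. The following is only a rough sketch: `parse_issue_blocks` is a hypothetical helper, not part of SumStatsRehab, and it assumes the summary lines keep the `field: affected/total (percent%)` shape seen in the logs above.

```python
import re

# Matches summary lines such as "rsID: 790496/20155433 (3.92%)".
# Assumption: the log keeps this exact shape; this is not a SumStatsRehab API.
ISSUE_LINE = re.compile(r"^\s*(\w+):\s*(\d+)/(\d+)\s*\(")


def parse_issue_blocks(log_text):
    """Return one {field: (affected, total)} dict per "found issues:" block."""
    blocks = []
    current = None  # dict being filled while inside a block, else None
    for line in log_text.splitlines():
        if line.strip() == "found issues:":
            current = {}
            blocks.append(current)
            continue
        match = ISSUE_LINE.match(line) if current is not None else None
        if match:
            field, affected, total = match.groups()
            current[field] = (int(affected), int(total))
        else:
            current = None  # any other line ends the current block
    return blocks


# Excerpt of the second run: the report before the fix and the one in Step 5.
log = """found issues:
rsID: 790496/20155433 (3.92%)
Step 2 finished in 205.75064539909363 seconds
found issues:
rsID: 784818/20155433 (3.89%)
Chr: 834211/20155433 (4.14%)
BP: 826569/20155433 (4.1%)
"""

before, after = parse_issue_blocks(log)
restored = before["rsID"][0] - after["rsID"][0]
print(restored)  # 5678, matching the "restored 5678" line in Step 6
```

The Chr and BP entries appear only in the reports generated after the liftover, which is consistent with the "lost ... fields after liftover" lines in Step 6.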