Hello,
I used cactus-progressive to create a pangenome from multiple species. Now that I want to analyze the variation between samples, I find myself a bit stuck.
I ran cactus-hal2maf with an ancestor as the reference. Next, I tried to get a VCF file using maf2vcf, but I'm getting the following error:
ERROR: Couldn't find a header line (must start with Hugo_Symbol, Chromosome, or Tumor_Sample_Barcode): cactus.maf
This is the MAF file:
##maf version=1 scoring=N/A
a
s Anc3.Anc3refChr0 0 81 + 2129 G----------ctaaccctaaccct--aaccctaaccctaaccc-taaccccaaaccctaac-cctaccccaaaccctaacctaaaccctaaccc
s Thunnus_orientalis.scaffold1 0 82 + 34771766 G----------ctaaccctaaccct--aaccctaaccctaaccc-taacacctaaccctaacgcctacccaaaaccctaacctaaaccctaaccc
s Thunnus_thynnus.scaffold1 0 94 + 34640894 TACCCTAAACCCaaaccctaaccctcaaaccctaaccctaacccctaaccccaaaccctaac-cctacccctacccctaacccaaaccctaaccc
a
s Anc3.Anc3refChr0 81 408 + 2129 taaccctaa-ccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaccccaaacccta-ccctaaccctaaccctaacccgaacccaac---cctaaccc-taaccctaacccgaaccctaacccta-ccctaacccaaaccctacccctaaccctaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccc-taaccctaa-ccctaaccctaaccctacccctaaccctaaccc-taaccctaaccctaaccctaaccc------------tatccctaaccctaaccctaaccc--aaccctaacccaaccctaaccctaaccctaaccctaaccctaaccctaaccctca---------ccctaaccctaaccctaaccctaaccctaaccctaaccc
s Thunnus_orientalis.scaffold1 82 409 + 34771766 taaccctaa-cccgaaccctaacccgaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaacccgaacccta-cccgaaccctaaccctaacccgaacccaac---cctaaccc-taaccctaacccgaaccctaaccctaaccctaaccctaaccctaaccctaacccgaacctaaccctaaccctaaccctaaccctaaccctaaccctaaccc-taaccctaa-ccctaaccctaaccctaaccctaaccctaaccc-taaccctaaccctaaccctaaccc------------tatccctaaccctaaccctaccct--aaccctaacctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaa---------ccctaaccctaaccctaaccctaaccctaaccctaaccc
I would much appreciate any help. Could you perhaps recommend a different method for getting the VCF file?
Thank you
The MAF format output by cactus-hal2maf is described here, where there is no mention of Hugo_Symbol etc. I guess you will need to consult with the authors or documentation of maf2vcf and add in the appropriate header yourself.
If you want a VCF directly from Cactus, you need to use the pangenome pipeline, but that only works for samples from the same or very closely related species (I'd also argue the VCF format itself is mostly suited for this type of data).
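For context, the column names in the error (Hugo_Symbol, Tumor_Sample_Barcode) belong to the Mutation Annotation Format, which is a different format from the Multiple Alignment Format that cactus-hal2maf writes, so the tool is most likely expecting a different kind of "MAF" altogether. If the goal is only a first look at where samples differ from the reference row, a minimal sketch like the one below can list substitutions straight from the alignment blocks. It is not a VCF converter and not part of Cactus: the function names `read_blocks` and `substitutions` are made up for illustration, it assumes the reference is the first `s` row of each block and is on the `+` strand, and it ignores indels.

```python
from typing import Iterable, Iterator, List, Tuple

# One aligned row of a block: (source name, zero-based start, strand, aligned text)
Row = Tuple[str, int, str, str]


def read_blocks(lines: Iterable[str]) -> Iterator[List[Row]]:
    """Yield the 's' rows of each alignment block in a Multiple Alignment Format file."""
    block: List[Row] = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new 'a' line closes the previous block.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            # Layout: s src start size strand srcSize text
            block.append((fields[1], int(fields[2]), fields[4], fields[6]))
    if block:
        yield block


def substitutions(block: List[Row]) -> Iterator[Tuple[str, int, str, str, str]]:
    """Yield (ref name, 1-based ref position, ref base, sample name, sample base).

    Assumption: the first row is the reference and lies on the '+' strand.
    Columns with a gap in either the reference or the sample are skipped.
    """
    ref_name, ref_start, ref_strand, ref_text = block[0]
    if ref_strand != "+":
        return
    pos = ref_start  # zero-based position of the next reference base
    for col, ref_base in enumerate(ref_text):
        if ref_base == "-":
            continue  # insertion relative to the reference; no reference coordinate
        for name, _, _, text in block[1:]:
            if col >= len(text):
                continue  # tolerate truncated rows
            base = text[col]
            if base != "-" and base.upper() != ref_base.upper():
                yield (ref_name, pos + 1, ref_base.upper(), name, base.upper())
        pos += 1


# Usage sketch (file name taken from the error message above):
# with open("cactus.maf") as handle:
#     for block in read_blocks(handle):
#         for record in substitutions(block):
#             print(*record, sep="\t")
```

Comparing bases case-insensitively matters here because the lowercase letters in the file mark soft-masked repeats rather than different bases.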
Thank you for your response.
When you say "closely related species", what might be a distance cutoff? For example, I'm working on species from the same genus that diverged around 3 MYA.