halPhyloP no output or error message? #296

Open
astarr97 opened this issue Feb 15, 2024 · 2 comments
astarr97 commented Feb 15, 2024

Hello,

I've been playing around with using halPhyloP to compute PhyloP scores on the 447-way placental mammal alignment. Rather than computing PhyloP scores for entire genomes, we would like to compute them for an arbitrary subset of sites in any genome of interest. halPhyloP with the --refBed argument looks like it should be able to do this, and in some cases it does. In other cases, however, it writes a blank wig file with no error message or other explanation.

For example, using this command:
./cactus-bin-v2.6.13/bin/halPhyloP --refBed test_works.bed hg38.447way.hal Orcinus_orca fullTreeAnc239.100kb.mod test_worked.wig
successfully outputs the PhyloP scores for the 5 sites into test_worked.wig (see attached; I added ".txt" so I could upload the files to GitHub).
test_works.bed.txt
test_worked.wig.txt

However, when I do the same thing on a different set of sites with:
./cactus-bin-v2.6.13/bin/halPhyloP --refBed test_fails.bed hg38.447way.hal Orcinus_orca fullTreeAnc239.100kb.mod test_failed.wig
there is no output. It results in a blank .wig file and does not print any error message.
test_fails.bed.txt
test_failed.wig.txt

Any ideas on what might be going on here or how to resolve this? I have other examples showing it is not specific to this species or contig. It doesn't seem like the input bed files are different in any discernible way either. Any help would be much appreciated!

Edit: I should also add that no matter how large the "test_fails.bed" file is, everything finishes running in a few seconds.
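
Since the two BED files look the same by eye, a mechanical comparison may surface differences that are easy to miss. The following is a minimal sketch, not a diagnosis: it assumes (without confirmation from the halPhyloP source) that things like space-separated columns, Windows line endings, zero-length intervals, or unsorted intervals are worth ruling out. The function name `check_bed` is hypothetical.

```python
from typing import Iterable, List, Tuple


def check_bed(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for common BED formatting issues.

    Assumption: these checks are generic BED hygiene, not a list of
    conditions known to trigger the empty-output behaviour in halPhyloP.
    """
    problems: List[Tuple[int, str]] = []
    prev = None  # (chrom, end) of the previous valid interval
    for n, raw in enumerate(lines, 1):
        if raw.endswith("\r\n") or raw.endswith("\r"):
            problems.append((n, "Windows/Mac line ending"))
        line = raw.rstrip("\r\n")
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((n, "fewer than 3 tab-separated columns"))
            continue
        chrom, s, e = fields[:3]
        if not (s.isdigit() and e.isdigit()):
            problems.append((n, "non-integer start or end"))
            continue
        start, end = int(s), int(e)
        if start >= end:
            problems.append((n, "start >= end"))
        if prev is not None and prev[0] == chrom and start < prev[1]:
            problems.append((n, "unsorted or overlapping with previous interval"))
        prev = (chrom, end)
    return problems
```

To run it on the two files from this report, open them with `newline=""` so that line endings are preserved, for example `check_bed(open("test_fails.bed", newline=""))`, and compare the result with the same call on `test_works.bed`.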

@glennhickey
Collaborator

That is strange. halPhyloP hasn't been maintained in a while, unfortunately. I recommend exporting to MAF and running regular phyloP directly on that.

Note that you can extract subregions of the MAF using taffy which is included in cactus.
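
As a rough illustration of the MAF route, the sketch below builds (but does not run) one `hal2maf` and one `phyloP` command per BED interval. It is written under several assumptions: it uses `hal2maf`'s own `--refSequence`, `--start`, and `--length` options for subregion extraction rather than taffy; the `phyloP` method and mode (`LRT`, `CONACC`) are placeholders to be replaced with whatever settings the analysis calls for; and the species names in the `.mod` tree are assumed to match the genome names in the HAL file. The helper names and output file naming are hypothetical, and the flags should be checked against `--help` for the installed versions.

```python
from typing import Iterable, Iterator, List, Tuple


def read_bed(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) from BED lines, skipping headers and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def commands_for_interval(
    hal: str, genome: str, model: str, chrom: str, start: int, end: int
) -> Tuple[List[str], List[str], str]:
    """Build the hal2maf and phyloP argument lists for one BED interval.

    Returns (hal2maf_cmd, phylop_cmd, wig_path); phyloP writes to stdout,
    so the caller is expected to redirect its output into wig_path.
    """
    stem = f"{genome}.{chrom}.{start}.{end}"  # hypothetical naming scheme
    maf, wig = f"{stem}.maf", f"{stem}.wig"
    hal2maf_cmd = [
        "hal2maf", hal, maf,
        "--refGenome", genome,
        "--refSequence", chrom,
        "--start", str(start),         # BED starts are 0-based
        "--length", str(end - start),  # BED ends are exclusive
    ]
    phylop_cmd = [
        "phyloP",
        "--method", "LRT",     # placeholder choice
        "--mode", "CONACC",    # placeholder choice
        "--wig-scores",
        "--msa-format", "MAF",
        model, maf,
    ]
    return hal2maf_cmd, phylop_cmd, wig
```

Each pair could then be executed with `subprocess.run(hal2maf_cmd, check=True)` followed by `subprocess.run(phylop_cmd, check=True, stdout=open(wig, "w"))`. Exporting one small MAF per interval keeps disk usage low, at the cost of one HAL read per interval.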

@astarr97
Author

Thanks for the quick reply! Since halPhyloP is pretty fast, computing all the PhyloP scores for a genome isn't too bad, so I will likely stick with that; halPhyloP seems to work well when run on the whole alignment.

Sorry to hear that halPhyloP isn't being maintained. I've been computing PhyloP scores on a lot of different versions of the 447-way alignment (i.e. masking different subsets of species) using MAFs, and the rate-limiting step has actually been storage: I can only store 100 terabytes of data, and each masked MAF file is massive. A maintained halPhyloP would definitely be very helpful for me, but my use case is probably pretty rare. Thanks again!
