Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mafPairCoverage output bug #10

Open
ens-bwalts opened this issue Jun 11, 2015 · 3 comments
Open

mafPairCoverage output bug #10

ens-bwalts opened this issue Jun 11, 2015 · 3 comments

Comments

@ens-bwalts
Copy link

I'm seeing some strange output in mafPairCoverage when looking at coverage in regions specified from a .bed file. It looks like In Regions and Out of Regions are reversed (which is causing coverage over 100%):

# BED Regions
#           Sequence      Region Length      In Regions  Out of Regions        Coverage
              A_tha*           59894120        77296918        20069410    1.290559e+00
             A_tha.1           15946509        22110347         5122972    1.386532e+00
@dentearl
Copy link
Owner

Thanks for the report, this sounds interesting. The code has unit tests —
feel free to add your examples of failures and issue a pull request and
I'll evaluate. Or send me links to your failure examples so that I can
reproduce them.

On Thursday, June 11, 2015, ens-bwalts notifications@github.com wrote:

I'm seeing some strange output in mafPairCoverage when looking at coverage
in regions specified from a .bed file. It looks like In Regions and Out of
Regions are reversed (which is causing coverage over 100%):

BED Regions

Sequence Region Length In Regions Out of Regions Coverage

          A_tha*           59894120        77296918        20069410    1.290559e+00
         A_tha.1           15946509        22110347         5122972    1.386532e+00


Reply to this email directly or view it on GitHub
#10.

@ens-bwalts
Copy link
Author

It looks like it's summing all aligned positions within the bed region.

If I provide this .maf:

##maf version=1 scoring=N/A

a
s Anc0.Anc0refChr4307          30  33 +     10240 CTCACCTCCTAAAGAAACCAAGTACAAATTAAA
s Anc2.Anc2refChr1726         125  33 -     10260 CTCACCTCCTAAAGGAGCCAAGTACAAATTAAA
s Anc2.Anc2refChr4731       20097  33 -     64532 ATCACCTTCTAAAGAAACCAAGTACAAATTGAA
s A_lyr.2                17885422  33 -  19320864 CTCACCTCCTAAAGGAGCCAAGTACAAATTAAA
s A_lyr.1                 5031915  33 +  33132539 CTCACCTTCTAAAGAAACCAAGTACAAATTGAA
s A_tha.1                 4214339  33 +  30427671 ATCACCTTCTAAAGAAACCAAGTACAAATTGAA
s A_tha.1                23284080  33 +  30427671 CTCACCTCCTAAAGGAGCCAAGTACAAATTGAA

And look for the regions in this .bed:

A_tha.1 4214340 4214343
A_tha.1 5000000 5001000
mafPairCoverage -m test.maf --seq1 A_tha* --seq2 A_lyr* --bed test.bed -v 1

gives this output

# Overall
#           Sequence      Source Length      Obs Length    Aligned Pos.        Coverage    Obs Coverage
              A_tha*           30427671              66             132    4.338157e-06    2.000000e+00
              A_lyr*           52453403              66             132    2.516519e-06    2.000000e+00

# Individual Sequences
#           Sequence      Source Length      Obs Length    Aligned Pos.        Coverage    Obs Coverage
             A_tha.1           30427671              66             132    4.338157e-06    2.000000e+00
             A_lyr.1           33132539              33              66    1.991999e-06    2.000000e+00
             A_lyr.2           19320864              33              66    3.415996e-06    2.000000e+00

# BED Regions
#           Sequence      Region Length      In Regions  Out of Regions        Coverage
              A_tha*               1003               6             126    5.982054e-03
             A_tha.1               1003               6             126    5.982054e-03

Counting 3 bases for the alignment to A_lyr.1 and 3 bases for the alignment to A_lyr.2 as being in regions.

@dentearl
Copy link
Owner

Thanks for posting an example! Okay, I'll try to eek out some time to investigate this in the next few weeks (my full time job doesn't allow for much hacking around currently).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants