Fix bugs in mpileup. #27

Merged
merged 2 commits into from
Feb 10, 2017

@alumi alumi commented Feb 9, 2017

Summary

This PR fixes several problems in the mpileup command.

Fixes

  • Fix: piling up a long sequence could cause a StackOverflow.
  • Fix: the quality score could be incorrect for bases in reads with a complex CIGAR.
  • Fix: deletions are now output.
  • Memoize the parsing of a CIGAR string into a sequence of indices. The parsing itself also gets slightly faster.
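
To illustrate the last item, here is a minimal Python sketch of memoized CIGAR parsing. It is not cljam's implementation: the function name `cigar_to_indices`, the choice of `None` for deleted positions, and the use of `functools.lru_cache` are assumptions made for illustration. Each reference position covered by the read maps to the index of the aligned read base, and the loop is iterative, so its depth does not grow with read length.

```python
import re
from functools import lru_cache

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


@lru_cache(maxsize=None)  # identical CIGAR strings are parsed only once
def cigar_to_indices(cigar):
    """Map each reference position covered by the read to a read index.

    Hypothetical helper: returns a tuple with one entry per reference
    position, holding the index of the aligned read base, or None where
    the read has a deletion or a skipped region.
    """
    indices = []
    read_pos = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":    # consumes both read and reference
            indices.extend(range(read_pos, read_pos + n))
            read_pos += n
        elif op in "DN":   # consumes reference only
            indices.extend([None] * n)
        elif op in "IS":   # consumes read only
            read_pos += n
        # H and P consume neither read nor reference
    return tuple(indices)


print(cigar_to_indices("2M1I2M"))
print(cigar_to_indices("1M5D5I"))
```

Returning an immutable tuple keeps the cached value safe to share between callers. Because many reads in a typical alignment file share the same CIGAR string, repeated lookups hit the cache instead of re-parsing.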

Known issues

cljam and samtools produce different output for adjacent indels.

input SAM

@SQ SN:ref  LN:20
r001 0 ref 2 60  1M5D5I  * 0 0 ATGCAT  IIIIII

samtools 1.3.1 output

ref	2	N	1	^]A-5NNNNN	I
ref	3	N	1	*	I
ref	4	N	1	*	I
ref	5	N	1	*	I
ref	6	N	1	*	I
ref	7	N	1	*+5GCATC$	I

cljam output

ref	2	N	1	A-5NNNNN	I
ref	3	N	1	*	~
ref	4	N	1	*	~
ref	5	N	1	*	~
ref	6	N	1	*	~
ref	7	N	1	*+5TGCAT	~

I'm not certain about the spec, but I believe the last pileup line should be *+5TGCAT$.
The trailing C in the samtools output at position 7 is an artifact of a buffer overrun: it comes from the first nibble of the quality string IIIIII.
Note that cljam omits ^, the mapping quality, and $, which mark the start and end of reads.
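
The overrun can be reproduced with a small Python sketch. It assumes the BAM record layout from the SAM/BAM specification (bases packed two per byte with the code table `=ACMGRSVTWYHKDBN`, followed directly by raw Phred qualities). The helpers `pack_seq` and `base_at` are hypothetical names, and the off-by-one start index is a guess at what happens inside samtools, not something confirmed from its source.

```python
# 4-bit base codes used by BAM's packed sequence field.
NIBBLE_TO_BASE = "=ACMGRSVTWYHKDBN"


def pack_seq(seq):
    """Pack bases two per byte, high nibble first (hypothetical helper)."""
    codes = [NIBBLE_TO_BASE.index(b) for b in seq]
    if len(codes) % 2:
        codes.append(0)  # pad odd-length sequences with a zero nibble
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def base_at(buf, i):
    """Decode the i-th nibble of buf without any bounds check on the sequence."""
    byte = buf[i >> 1]
    nibble = byte >> 4 if i % 2 == 0 else byte & 0xF
    return NIBBLE_TO_BASE[nibble]


seq, qual = "ATGCAT", "IIIIII"
# Qualities are stored as raw Phred values (ASCII - 33) right after the bases.
buf = pack_seq(seq) + bytes(ord(q) - 33 for q in qual)

# 1M consumes read index 0, so the 5 inserted bases are read indices 1..5.
expected = "".join(base_at(buf, i) for i in range(1, 6))
# Assumed bug: starting one base too late runs past the end of the sequence.
overrun = "".join(base_at(buf, i) for i in range(2, 7))

print(expected)  # TGCAT
print(overrun)   # GCATC
```

The quality character I is Phred 40, or 0x28, so its high nibble is 2, which decodes to C in the base table. That matches the extra C in the samtools line for position 7.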

@alumi alumi requested a review from totakke February 9, 2017 09:49
@alumi alumi changed the title from "Fix bugs in mpileup and optimization." to "Fix bugs in mpileup." on Feb 10, 2017

@totakke totakke left a comment

LGTM within the scope of this fix.

totakke commented Feb 10, 2017

The INDEL problem you pointed out existed before this PR. It should be recorded as a cljam issue.

@totakke totakke merged commit eef7841 into master Feb 10, 2017
totakke added a commit that referenced this pull request Feb 10, 2017
@totakke totakke deleted the fix/mpileup-so branch February 10, 2017 07:22
alumi commented Feb 10, 2017

Thank you. I opened issue #28.
