Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up kputll. #1805

Merged
merged 1 commit into from
Jul 16, 2024
Merged

Speed up kputll. #1805

merged 1 commit into from
Jul 16, 2024

Commits on Jul 11, 2024

  1. Speed up kputll.

    The kputuw function is considerably faster as it encodes 2 digits at a
    time and also utilises __builtin_clz.  This changes kputll to use the
    same 2 digits at a time trick.  I have a __builtin_clzll variant too,
    but with longer numbers it's not the main bottleneck and we fall back
    to kputuw for small numbers.  This avoids complicating the code with
    builtin checks and alternate versions.
    
    An alternative, purely for sam_format1_append would be something like:
    
        static inline int kputll_fast(long long c, kstring_t *s) {
            return c <= INT_MAX && c >= INT_MIN ? kputw(c, s) : kputll(c, s);
        }
        #define kputll kputll_fast
    
    This works as BAM/CRAM only support 32-bit numbers for POS, PNEXT and
    TLEN anyway, so ll vs w is an irrelevant distinction.  However I chose
    to modify the header file so it fixes other callers.
    
    Overall compressed BAM to uncompressed SAM conversion is about 5%
    quicker (tested on 10 million short-read seqs; it'll be minimal on
    long seqs).  This includes decode time and other functions too.  The
    sam_format1_append only component of that is about 15-25% quicker
    depending on compiler and version.
    jkbonfield committed Jul 11, 2024
    Configuration menu
    Copy the full SHA
    6b9d7f1 View commit details
    Browse the repository at this point in the history