Speed up faidx.
- bgzf_getc is slow as it's a heavy function and not inlined.  Most of
  the time though it's just an array fetch, so inline the basic form and
  revert to the function call for the complex form.

- isgraph and all other ctype functions are slow.  We assume ASCII and
  just do a naive implementation.

The speed benefits are (seconds):

                      Old     New
    Index GRCh38      13.4    8.4
    Query chr1         1.7    0.9

Given a significant speed-up from a small, localised modification, it
seems worthwhile.
jkbonfield authored and whitwham committed Jul 8, 2024
1 parent b8145e6 commit f3d401c
Showing 1 changed file with 25 additions and 1 deletion.
26 changes: 25 additions & 1 deletion faidx.c
@@ -43,6 +43,29 @@ DEALINGS IN THE SOFTWARE. */
 #include "htslib/kstring.h"
 #include "hts_internal.h"
 
+// Faster isgraph; assumes ASCII
+static inline int isgraph_(unsigned char c) {
+    return c > ' ' && c <= '~';
+}
+
+#ifdef isgraph
+#  undef isgraph
+#endif
+#define isgraph isgraph_
+
+// An optimised bgzf_getc.
+// We could consider moving this to bgzf.h, but our own code uses it here only.
+static inline int bgzf_getc_(BGZF *fp) {
+    if (fp->block_offset+1 < fp->block_length) {
+        int c = ((unsigned char*)fp->uncompressed_block)[fp->block_offset++];
+        fp->uncompressed_address++;
+        return c;
+    }
+
+    return bgzf_getc(fp);
+}
+#define bgzf_getc bgzf_getc_
+
 typedef struct {
     int id; // faidx_t->name[id] is for this struct.
     uint32_t line_len, line_blen;
@@ -727,7 +750,8 @@ static char *fai_retrieve(const faidx_t *fai, const faidx1_t *val,
         return NULL;
     }
 
-    while ( l < end - beg && (c=bgzf_getc(fai->bgzf))>=0 )
+    BGZF *fp = fai->bgzf;
+    while ( l < end - beg && (c=bgzf_getc(fp))>=0 )
         if (isgraph(c)) s[l++] = c;
     if (c < 0) {
         hts_log_error("Failed to retrieve block: %s",