GenBank file parsing is a major bottleneck for `domain_search.py` on large databases. The current GenBank parser is a fork of the BioPython GenBank parser, which is pure Python, relies on regexes, and is slow. It would be great to integrate something like the Rust-based parser gb-io.py: https://github.com/althonos/gb-io.py
A complication is that Domainator internals rely heavily on BioPython SeqRecord objects, which might be hard to interface with or replicate when using a faster GenBank parser.
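One possible way to limit that coupling is a thin adapter layer that converts whatever a faster parser returns into an object exposing only the SeqRecord-style attributes the rest of the code reads. The sketch below is only an illustration of that idea using the standard library: `RecordView`, `adapt_records`, `convert_dict_record`, and the dict key names are all hypothetical, and it does not use the actual gb-io.py or BioPython APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Iterator, List


@dataclass
class RecordView:
    """Hypothetical stand-in exposing a small subset of SeqRecord-style attributes."""
    id: str
    name: str
    description: str
    seq: str
    features: List[Dict[str, Any]] = field(default_factory=list)
    annotations: Dict[str, Any] = field(default_factory=dict)


def adapt_records(
    raw_records: Iterable[Any],
    convert: Callable[[Any], RecordView],
) -> Iterator[RecordView]:
    """Lazily convert parser-specific records, one at a time, into RecordView objects."""
    for raw in raw_records:
        yield convert(raw)


def convert_dict_record(raw: Dict[str, Any]) -> RecordView:
    """Converter for an assumed fast parser that yields plain dicts.

    The key names ("accession", "name", "definition", "sequence", "features")
    are made up for this example; a real parser would need its own converter.
    """
    return RecordView(
        id=raw["accession"],
        name=raw["name"],
        description=raw.get("definition", ""),
        seq=raw["sequence"],
        features=list(raw.get("features", [])),
    )
```

With this shape, swapping parsers would mean writing one converter per parser while downstream code keeps reading `id`, `seq`, and `features`. Whether a reduced view like this covers everything Domainator needs from SeqRecord is an open question.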