Support circular GFF3 features in hotspot detection? #31

Open
3 tasks
fedarko opened this issue Aug 11, 2022 · 0 comments
Labels
backburner Low-priority things that are still good to keep track of

Comments

@fedarko
Owner

fedarko commented Aug 11, 2022

Version 1.26 of the GFF3 spec mentions that circular features can be encoded in a GFF3 file by setting the end coordinate of such a feature to a position greater than the rightmost position in a contig.

We currently don't support this sort of feature in our code, and will raise an error if we see one. FWIW, Prodigal's gene predictions on the SheepGut dataset don't include any features like this (although this might be a result of us using the -c option).

Anyway, handling this sort of case is definitely feasible, but will require a bit of extra work. So I'm putting this issue on the backburner for now, in favor of more important issues; I can address this if there is desire for it.

Things to do to implement support for circular features

  • Replace set(feature_range) with just a set of all positions in these features (probably makes sense to create two ranges -- positions from feature start to contig end, and from contig start to the wrapped-around feature end, i.e. the end coordinate minus the contig length -- and merge these into a single set)

  • Check for the weird case where the end loops around the contig more than once, and raise an error in this case. Given a 1-indexed start coordinate s in the range [1, n] (for a contig of length n), the only valid "circular" end coordinates should be in the range [n + 1, n + s - 1]. (Anything past that, and positions would start being represented more than once in this feature.)

  • Test
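
A rough sketch of how the first two items above could look in Python. The function name `circular_feature_positions` and its parameters are hypothetical (nothing like this exists in the code yet), and it assumes 1-indexed, inclusive start and end coordinates as in GFF3:

```python
def circular_feature_positions(start, end, contig_length):
    """Return the set of 1-indexed positions covered by a feature.

    Hypothetical helper, not part of the existing codebase. Assumes
    GFF3-style coordinates: 1-indexed, inclusive, with start <= end.
    """
    n = contig_length
    if not 1 <= start <= n:
        raise ValueError(f"start {start} is outside of [1, {n}]")
    if end < start:
        raise ValueError(f"end {end} is less than start {start}")

    # Ordinary (non-circular) feature: a single range suffices.
    if end <= n:
        return set(range(start, end + 1))

    # Circular feature: the only valid ends are in [n + 1, n + start - 1].
    # Anything past that would cover some positions more than once.
    if end > n + start - 1:
        raise ValueError(
            f"end {end} wraps past start {start} on a contig of length {n}"
        )

    # Merge two ranges: start -> contig end, and contig start -> wrapped end.
    wrapped_end = end - n
    return set(range(start, n + 1)) | set(range(1, wrapped_end + 1))
```

For example, on a contig of length 10, a feature with start 8 and end 12 would cover positions 8, 9, 10, 1, and 2; an end of 18 would be rejected, since position 8 would then be covered twice.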

@fedarko fedarko added the backburner Low-priority things that are still good to keep track of label Aug 11, 2022