Disha/primary transc fix #358

Merged
Dishalodha merged 10 commits into main from disha/primary_transc_fix
May 14, 2024

Dishalodha
Contributor

I have changed the biotype of the primary transcript to miRNA_primary_transcript and the biotype of the associated gene to ncRNA_gene so that they are loaded correctly.

@Dishalodha Dishalodha requested a review from MatBarba May 9, 2024 13:18
Contributor

@MatBarba MatBarba left a comment

Works as intended, minor nitpicking as usual

@@ -457,7 +457,7 @@ def normalize_mirna(self, gene: SeqFeature) -> List[SeqFeature]:
         """Returns gene representations from a miRNA gene that can be loaded in an Ensembl database.

         Change the representation from the form `gene[ primary_transcript[ exon, miRNA[ exon ] ] ]`
-        to `gene[ primary_transcript[ exon ] ]` and `gene[ miRNA[ exon ] ]`
+        to `gene[ miRNA_primary_transcript[ exon ] ]` and `gene[ miRNA[ exon ] ]`

Contributor

Suggested change
-        to `gene[ miRNA_primary_transcript[ exon ] ]` and `gene[ miRNA[ exon ] ]`
+        to `ncRNA_gene[ miRNA_primary_transcript[ exon ] ]` and `gene[ miRNA[ exon ] ]`
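
For readers unfamiliar with the restructuring the docstring describes, here is a minimal sketch of the split, not the project's actual implementation. `Feature` is a hypothetical stand-in for a `SeqFeature` carrying `sub_features`, `split_mirna_gene` is an invented name, and the `{gene.id}_{num}` naming of the new miRNA genes is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    """Hypothetical stand-in for a SeqFeature with nested sub_features."""

    id: str
    type: str
    sub_features: List["Feature"] = field(default_factory=list)


def split_mirna_gene(gene: Feature) -> List[Feature]:
    """Split gene[primary[exon, miRNA[exon]]] into separate genes."""
    primary = gene.sub_features[0]
    exons = [sub for sub in primary.sub_features if sub.type == "exon"]
    mirnas = [sub for sub in primary.sub_features if sub.type == "miRNA"]

    # The primary transcript keeps only its exons and gets the new biotypes.
    primary_tr = Feature(primary.id, "miRNA_primary_transcript", exons)
    new_genes = [Feature(gene.id, "ncRNA_gene", [primary_tr])]

    # Each mature miRNA moves under a gene of its own.
    # The "<gene id>_<num>" naming is an assumption, not the project's scheme.
    for num, mirna in enumerate(mirnas, start=1):
        new_genes.append(Feature(f"{gene.id}_{num}", "gene", [mirna]))
    return new_genes
```

Called on a gene with one primary transcript holding one exon and one miRNA, this returns two top-level features: an `ncRNA_gene` wrapping the `miRNA_primary_transcript`, and a `gene` wrapping the `miRNA`.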

        logging.debug(f"Formatting miRNA gene {gene.id}")

        new_genes = []
        new_primary_subfeatures = []
        num = 1
        for sub in primary.sub_features:
            if sub.type == "exon":
                gene.type = "ncRNA_gene"
                primary.type = "miRNA_primary_transcript"
                new_primary_subfeatures.append(sub)

Contributor

I would put the biotype change at the earlier stage where we create the primary and the gene (472-473)

Contributor Author

Oh yes, that makes sense! I forgot to make that change.
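
One way to read the suggestion, sketched under assumptions: the two biotype assignments do not depend on `sub`, so they can be made once before the loop instead of being repeated for every exon. `collect_primary_exons` is a hypothetical helper, and `SimpleNamespace` objects stand in for the real features; the merged code may be organised differently.

```python
from types import SimpleNamespace


def collect_primary_exons(gene, primary):
    """Relabel the gene and its primary transcript, then keep only the exons."""
    # Biotypes are set once, up front, rather than inside the exon loop.
    gene.type = "ncRNA_gene"
    primary.type = "miRNA_primary_transcript"

    new_primary_subfeatures = []
    for sub in primary.sub_features:
        if sub.type == "exon":
            new_primary_subfeatures.append(sub)
    return new_primary_subfeatures


# Illustrative input: one exon and one miRNA under the primary transcript.
gene = SimpleNamespace(type="gene")
primary = SimpleNamespace(
    type="primary_transcript",
    sub_features=[SimpleNamespace(type="exon"), SimpleNamespace(type="miRNA")],
)
exons = collect_primary_exons(gene, primary)
```

A side effect of this placement is that the relabelling no longer depends on the primary transcript containing at least one exon, which the in-loop version would require.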

Comment on lines 458 to 490
             does_not_raise(),
-            id="gene + primary_transcript + miRNA",
+            id="ncRNA_gene + miRNA_primary_transcript + miRNA",
         ),
         param(
             "mirna/pseudogene.gff",
             "mirna/pseudogene_simped.gff",
             does_not_raise(),
-            id="gene + primary_transcript - miRNA",
+            id="ncRNA_gene + miRNA_primary_transcript - miRNA",
         ),
         param(
             "mirna/nogene.gff",
             "mirna/nogene_simped.gff",
             does_not_raise(),
-            id="primary_transcript + miRNA",
+            id="miRNA_primary_transcript + miRNA",
         ),
         param(
             "mirna/pseudo_nogene.gff",
             "mirna/pseudo_nogene_simped.gff",
             does_not_raise(),
-            id="primary_transcript - miRNA",
+            id="miRNA_primary_transcript - miRNA",
         ),
         param(
             "mirna/unsupported_tr.gff",
             "",
             raises(GFFParserError, match="Unknown subtype"),
-            id="gene + primary_transcript + mRNA, not supported",
+            id="ncRNA_gene + miRNA_primary_transcript + mRNA, not supported",
         ),
         param(
             "mirna/two_primary.gff",
             "",
             raises(GFFParserError, match="too many sub_features"),
-            id="gene + 2x primary_transcript, not supported",
+            id="gene + 2x miRNA_primary_transcript, not supported",
         ),

Contributor

Here I would keep the old names, because they correspond to the input biotypes, not the results (the input files have not changed)

@Dishalodha Dishalodha merged commit 3cf5104 into main May 14, 2024
1 check passed
@JAlvarezJarreta JAlvarezJarreta deleted the disha/primary_transc_fix branch June 4, 2024 14:25