Skip to content

Commit

Permalink
Merge pull request #302 from monarch-initiative/fix-gt-category-refactor
Browse files Browse the repository at this point in the history
Update tests to reflect the genotype category label changes
  • Loading branch information
ielis authored Oct 7, 2024
2 parents 5bb0e0b + b0a3aee commit d5dcbbf
Show file tree
Hide file tree
Showing 8 changed files with 64 additions and 55 deletions.
6 changes: 3 additions & 3 deletions docs/tutorial.rst
Original file line number Diff line number Diff line change
Expand Up @@ -315,10 +315,10 @@ and these are the tested HPO terms ordered by the p value corrected with the Ben
:file: report/tbx5_frameshift_vs_missense.csv
:header-rows: 2

.. doctest:: tutorial
:hide:
.. doctest:: tutorial
:hide:

>>> summary_df.to_csv('docs/report/tbx5_frameshift_vs_missense.csv') # doctest: +SKIP
>>> summary_df.to_csv('docs/report/tbx5_frameshift_vs_missense.csv') # doctest: +SKIP

We see that several HPO terms are significantly associated
with presence of a frameshift variant in *TBX5*.
Expand Down
22 changes: 15 additions & 7 deletions docs/user-guide/analyses/phenotype-groups.rst
Original file line number Diff line number Diff line change
Expand Up @@ -126,10 +126,10 @@ we expect the autosomal dominant mode of inheritance:
>>> from gpsea.analysis.predicate.genotype import autosomal_dominant
>>> gt_predicate = autosomal_dominant(is_frameshift)
>>> gt_predicate.display_question()
'What is the genotype group: HOM_REF, HET'
'What is the genotype group: No allele, Monoallelic'

`gt_predicate` will assign the patients with no frameshift variant allele into `HOM_REF` group
and the patients with one frameshift allele will be assigned into `HET` group.
`gt_predicate` will assign the patients with no frameshift variant allele into `No allele` group
and the patients with one frameshift allele will be assigned into `Monoallelic` group.
Note, any patient with 2 or more alleles will be *omitted* from the analysis.

.. note::
Expand Down Expand Up @@ -239,13 +239,16 @@ We can learn more by showing the MTC filter report:
>>> from gpsea.view import MtcStatsViewer
>>> mtc_viewer = MtcStatsViewer()
>>> mtc_report = mtc_viewer.process(result)
>>> with open('docs/user-guide/report/tbx5_frameshift.mtc_report.html', 'w') as fh: # doctest: +SKIP
... _ = fh.write(mtc_report)

>>> mtc_report # doctest: +SKIP

.. raw:: html
:file: report/tbx5_frameshift.mtc_report.html

.. doctest:: phenotype-groups
:hide:

>>> mtc_report.write('docs/user-guide/analyses/report/tbx5_frameshift.mtc_report.html') # doctest: +SKIP


Genotype phenotype associations
===============================
Expand All @@ -255,12 +258,17 @@ ordered by the corrected p value (Benjamini-Hochberg FDR):

>>> from gpsea.view import summarize_hpo_analysis
>>> summary_df = summarize_hpo_analysis(hpo, result)
>>> summary_df.to_csv('docs/user-guide/report/tbx5_frameshift.csv') # doctest: +SKIP
>>> summary_df # doctest: +SKIP

.. csv-table:: *TBX5* frameshift vs rest
:file: report/tbx5_frameshift.csv
:header-rows: 2

.. doctest:: phenotype-groups
:hide:

>>> summary_df.to_csv('docs/user-guide/analyses/report/tbx5_frameshift.csv') # doctest: +SKIP


The table shows that several HPO terms are significantly associated
with presence of a heterozygous (`HET`) frameshift variant in *TBX5*.
Expand Down
2 changes: 1 addition & 1 deletion docs/user-guide/analyses/report/tbx5_frameshift.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
What is the genotype group,HOM_REF,HOM_REF,HET,HET,,
What is the genotype group,No allele,No allele,Monoallelic,Monoallelic,,
,Count,Percent,Count,Percent,Corrected p values,p values
Ventricular septal defect [HP:0001629],42/71,59%,19/19,100%,0.003870827221893892,0.00024192670136836825
Abnormal atrioventricular conduction [HP:0005150],1/23,4%,3/3,100%,0.01230769230769231,0.0015384615384615387
Expand Down
42 changes: 21 additions & 21 deletions docs/user-guide/predicates/mode_of_inheritance_predicate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -29,15 +29,15 @@ to assign an individual into one of the following categories:

.. table:: Autosomal dominant predicate categories

+------------------+-------------+
| Allele count | Category |
+==================+=============+
| 0 | `HOM_REF` |
+------------------+-------------+
| 1 | `HET` |
+------------------+-------------+
| :math:`\ge 2` | ``None`` |
+------------------+-------------+
+------------------+----------------+
| Allele count | Category |
+==================+================+
| 0 | `No allele` |
+------------------+----------------+
| 1 | `Monoallelic` |
+------------------+----------------+
| :math:`\ge 2` | ``None`` |
+------------------+----------------+

Examples
========
Expand All @@ -50,7 +50,7 @@ We can create the predicate with the :func:`~gpsea.analysis.predicate.genotype.a
>>> from gpsea.analysis.predicate.genotype import autosomal_dominant
>>> gt_predicate = autosomal_dominant()
>>> gt_predicate.display_question()
'What is the genotype group: HOM_REF, HET'
'What is the genotype group: No allele, Monoallelic'


Use a variant subset
Expand Down Expand Up @@ -80,18 +80,18 @@ an individual into one of the genotype categories:
+------------------+-------------------+----------------+
| Allele count | Category | Category index |
+==================+===================+================+
| 0 | `HOM_REF` | 0 |
| 0 | `No allele` | 0 |
+------------------+-------------------+----------------+
| 1 | `HET` | 1 |
| 1 | `Monoallelic` | 1 |
+------------------+-------------------+----------------+
| 2 | `BIALLELIC_ALT` | 2 |
| 2 | `Biallelic` | 2 |
+------------------+-------------------+----------------+
| :math:`\ge 3` | ``None`` | |
+------------------+-------------------+----------------+

.. note::

`BIALLELIC_ALT` includes both homozygous and compound heterozygous genotypes.
`Biallelic` includes both homozygous and compound heterozygous genotypes.


Partitions
Expand All @@ -100,7 +100,7 @@ Partitions
Sometimes we are interested in lumping several genotype categories into a group or and then comparing the groups.
For instance, to compare phenotype of the individuals with *at least one* frameshift allele
with those with *no* frameshift allele. Alternatively, we may only want to analyze a subset of the genotype categories,
such as `HET` vs. `BIALLELIC_ALT`.
such as `Monoallelic` vs. `Biallelic`.

The `partitions` option of the :func:`~gpsea.analysis.predicate.genotype.autosomal_recessive` function
lets us do this.
Expand All @@ -127,7 +127,7 @@ with the :func:`~gpsea.analysis.predicate.genotype.autosomal_recessive` function
>>> from gpsea.analysis.predicate.genotype import autosomal_recessive
>>> gt_predicate = autosomal_recessive()
>>> gt_predicate.display_question()
'What is the genotype group: HOM_REF, HET, BIALLELIC_ALT'
'What is the genotype group: No allele, Monoallelic, Biallelic'


Use a variant subset
Expand All @@ -148,22 +148,22 @@ and then use it to create the autosomal recessive predicate:

>>> gt_predicate = autosomal_recessive(is_missense)
>>> gt_predicate.display_question()
'What is the genotype group: HOM_REF, HET, BIALLELIC_ALT'
'What is the genotype group: No allele, Monoallelic, Biallelic'

This predicate will assign the individuals into one of the listed genotype categories
based on the allele counts of the missense variants.


Compare `HET` vs. `BIALLELIC_ALT`
---------------------------------
Compare `Monoallelic` vs. `Biallelic`
-------------------------------------

We can provide ``partitions`` to only compare the heterozygotes with those carrying
biallelic alt mutations (homozygous alternate or compound heterozygous):

We consult the *Autosomal recessive predicate categories* table for the category indices
and we create the genotype group partitions:

>>> # `1` for `HET` and `2` for `BIALLELIC_ALT`
>>> # `1` for `Monoallelic` and `2` for `Biallelic`
>>> partitions = ({1,}, {2,})

which we use to create the autosomal recessive predicate:
Expand All @@ -172,4 +172,4 @@ which we use to create the autosomal recessive predicate:
... partitions=partitions,
... )
>>> gt_predicate.display_question()
'What is the genotype group: HET, BIALLELIC_ALT'
'What is the genotype group: Monoallelic, Biallelic'
14 changes: 7 additions & 7 deletions src/gpsea/analysis/predicate/genotype/_test__gt_predicates.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,25 +66,25 @@ def test_build_count_to_cat(
[(0,), (1,)],
ModeOfInheritanceInfo.autosomal_dominant(),
{
0: "HOM_REF",
1: "HET",
0: "No allele",
1: "Monoallelic",
},
),
(
[(0,), (1,), (2,)],
ModeOfInheritanceInfo.autosomal_recessive(),
{
0: "HOM_REF",
1: "HET",
2: "BIALLELIC_ALT",
0: "No allele",
1: "Monoallelic",
2: "Biallelic",
},
),
(
[(1,), (2,)],
ModeOfInheritanceInfo.autosomal_recessive(),
{
1: "HET",
2: "BIALLELIC_ALT",
1: "Monoallelic",
2: "Biallelic",
},
),
],
Expand Down
2 changes: 1 addition & 1 deletion src/gpsea/preprocessing/_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -430,7 +430,7 @@ def load_phenopacket_files(


def load_phenopackets(
phenopackets: typing.Iterator[Phenopacket],
phenopackets: typing.Iterable[Phenopacket],
cohort_creator: CohortCreator[Phenopacket],
validation_policy: typing.Literal["permissive", "lenient", "strict"] = "permissive",
) -> typing.Tuple[Cohort, PreprocessingValidationResult]:
Expand Down
3 changes: 2 additions & 1 deletion src/gpsea/view/_report.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,11 +43,12 @@ def html(self) -> str:

def write(self, fh: typing.Union[io.IOBase, str]):
should_close = isinstance(fh, str)
fout = None
try:
fout = open_text_io_handle_for_writing(fh)
fout.write(self._html)
except Exception:
if should_close:
if should_close and fout is not None:
fout.close()

def _repr_html_(self) -> str:
Expand Down
28 changes: 14 additions & 14 deletions tests/analysis/predicate/genotype/test_gt_predicates.py
Original file line number Diff line number Diff line change
Expand Up @@ -91,9 +91,9 @@ def variant_predicate(self) -> VariantPredicate:
@pytest.mark.parametrize(
"patient_name,name",
[
("adam", "HOM_REF"),
("eve", "HET"),
("cain", "HET"),
("adam", "No allele"),
("eve", "Monoallelic"),
("cain", "Monoallelic"),
],
)
def test_autosomal_dominant(
Expand All @@ -115,9 +115,9 @@ def test_autosomal_dominant(
@pytest.mark.parametrize(
"patient_name,name",
[
("adam", "HET"), # 0/0 & 0/1
("eve", "HET"), # 0/1 & 0/0
("cain", "HET"), # 0/1 & 0/0
("adam", "Monoallelic"), # 0/0 & 0/1
("eve", "Monoallelic"), # 0/1 & 0/0
("cain", "Monoallelic"), # 0/1 & 0/0
],
)
def test_autosomal_dominant__with_default_predicate(
Expand All @@ -138,10 +138,10 @@ def test_autosomal_dominant__with_default_predicate(
@pytest.mark.parametrize(
"patient_name,name",
[
("walt", "HET"),
("skyler", "HET"),
("flynn", "BIALLELIC_ALT"),
("holly", "HOM_REF"),
("walt", "Monoallelic"),
("skyler", "Monoallelic"),
("flynn", "Biallelic"),
("holly", "No allele"),
],
)
def test_autosomal_recessive(
Expand All @@ -164,10 +164,10 @@ def test_autosomal_recessive(
"patient_name,name",
[
# The White family has two variants:
("walt", "BIALLELIC_ALT"), # 0/1 & 0/1
("skyler", "BIALLELIC_ALT"), # 0/1 & 0/1
("flynn", "BIALLELIC_ALT"), # 1/1 & 0/0
("holly", "BIALLELIC_ALT"), # 0/0 & 1/1
("walt", "Biallelic"), # 0/1 & 0/1
("skyler", "Biallelic"), # 0/1 & 0/1
("flynn", "Biallelic"), # 1/1 & 0/0
("holly", "Biallelic"), # 0/0 & 1/1
],
)
def test_autosomal_recessive__with_default_predicate(
Expand Down

0 comments on commit d5dcbbf

Please sign in to comment.