Streamline predicate/analysis workflow #88

ielis · 2023-10-23T18:10:55Z

Fixes #87, #92

Depends on #94


ielis · 2023-10-26T16:09:09Z

@lnrekerle

I'm proposing a revamp of the CohortAnalyzer.

The CohortAnalyzer is an abstraction: a promise of what the analysis can do for the user. To get a CohortAnalyzer, we use a pattern similar to the one for configuring PhenopacketPatientCreator. A configuration function gives you the CohortAnalyzer:

from genophenocorr.analysis import configure_cohort_analysis

analysis = configure_cohort_analysis(cohort, hpo)

You'll get an analysis with default options. If you want to tweak the options, build the CohortAnalysisConfiguration:

from genophenocorr.analysis import CohortAnalysisConfiguration

# The parentheses let the chained builder calls span several lines.
configuration = (
    CohortAnalysisConfiguration.builder()
    .include_sv(True)
    .pval_correction('fdr_bh')
    .build()
)

analysis = configure_cohort_analysis(cohort, hpo, configuration)
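For readers unfamiliar with the builder pattern used above, here is a minimal, self-contained sketch of how such a configuration builder could work. `AnalysisConfig` and `AnalysisConfigBuilder` are hypothetical stand-ins, not the classes added by this PR, and the default values are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnalysisConfig:
    """Hypothetical stand-in for CohortAnalysisConfiguration (immutable)."""
    include_sv: bool = False                 # assumed default
    pval_correction: Optional[str] = None    # assumed default

    @staticmethod
    def builder() -> "AnalysisConfigBuilder":
        return AnalysisConfigBuilder()


class AnalysisConfigBuilder:
    """Collects options; each setter returns `self` so calls can be chained."""

    def __init__(self) -> None:
        self._include_sv = False
        self._pval_correction: Optional[str] = None

    def include_sv(self, include_sv: bool) -> "AnalysisConfigBuilder":
        self._include_sv = include_sv
        return self

    def pval_correction(self, method: Optional[str]) -> "AnalysisConfigBuilder":
        self._pval_correction = method
        return self

    def build(self) -> AnalysisConfig:
        # Freeze the collected options into an immutable configuration.
        return AnalysisConfig(self._include_sv, self._pval_correction)


config = (
    AnalysisConfig.builder()
    .include_sv(True)
    .pval_correction('fdr_bh')
    .build()
)
print(config)
```

Keeping the built configuration immutable means the analysis cannot be reconfigured by accident after it has been set up; whether the real class does the same is not stated here.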

Then we run the analysis, e.g. to compare missense variants vs. all others:

from genophenocorr.model import VariantEffect
from genophenocorr.analysis.predicate import BooleanPredicate

results = analysis.compare_by_variant_effect(VariantEffect.MISSENSE_VARIANT, tx_id='NM_1234.5')
result_df = results.summarize(hpo, BooleanPredicate.YES)
result_df.head()

We get results, a container with a lot of data. We call summarize to prepare a data frame of phenotypes vs. genotypes, sorted by corrected p-values.

Note that we pass BooleanPredicate.YES to show the genotype-phenotype correlation for HPO terms that are present, not for those that are not present (we would pass BooleanPredicate.NO to show those).
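As a rough illustration of "ordered by corrected p values": the `'fdr_bh'` string matches the name statsmodels uses for the Benjamini-Hochberg procedure, so the sketch below assumes that is the correction meant. It is a plain-Python version written for this comment, not the code in the PR, and the term labels and p-values are made up.

```python
from typing import List, Sequence


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values, from the smallest to the largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up nominal p-values, for illustration only.
nominal = {'term_a': 0.01, 'term_b': 0.04, 'term_c': 0.03, 'term_d': 0.50}
corrected = benjamini_hochberg(list(nominal.values()))

# Sort the terms by corrected p-value, as the summary data frame does.
ranked = sorted(zip(nominal, corrected), key=lambda item: item[1])
for term, p in ranked:
    print(term, round(p, 4))
```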

This is what the PR adds. Thanks to the changes, we have a general framework for applying genotype and phenotype predicates and showing the results.
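To make the idea of "applying genotype and phenotype predicates" concrete, here is a toy sketch. Every name in it (`YesNo`, `has_effect`, `has_term`, `cross_tabulate`, the dictionary-based patients) is hypothetical and much simpler than the real predicate classes; it only shows how two yes/no predicates yield the counts that a genotype-phenotype test would start from.

```python
import enum
from collections import Counter
from typing import Callable, Dict, FrozenSet, Iterable

# A toy patient: the variant effects and phenotype terms recorded for them.
Patient = Dict[str, FrozenSet[str]]


class YesNo(enum.Enum):
    """Hypothetical stand-in for the YES / NO predicate categories."""
    YES = 'yes'
    NO = 'no'


Predicate = Callable[[Patient], YesNo]


def has_effect(effect: str) -> Predicate:
    """Genotype predicate: does the patient carry a variant with this effect?"""
    return lambda p: YesNo.YES if effect in p['effects'] else YesNo.NO


def has_term(term: str) -> Predicate:
    """Phenotype predicate: is this term present in the patient?"""
    return lambda p: YesNo.YES if term in p['terms'] else YesNo.NO


def cross_tabulate(patients: Iterable[Patient],
                   genotype: Predicate,
                   phenotype: Predicate) -> Counter:
    """Count patients per (phenotype category, genotype category) pair."""
    return Counter((phenotype(p), genotype(p)) for p in patients)


# Made-up cohort, for illustration only.
cohort = [
    {'effects': frozenset({'missense'}), 'terms': frozenset({'term_a'})},
    {'effects': frozenset({'missense'}), 'terms': frozenset()},
    {'effects': frozenset({'frameshift'}), 'terms': frozenset({'term_a'})},
    {'effects': frozenset({'frameshift'}), 'terms': frozenset({'term_a'})},
]
counts = cross_tabulate(cohort, has_effect('missense'), has_term('term_a'))
for (pheno, geno), n in sorted(counts.items(), key=lambda kv: (kv[0][0].value, kv[0][1].value)):
    print(f'phenotype={pheno.name} genotype={geno.name}: {n}')
```

Because both predicates share one small interface, any genotype predicate can be paired with any phenotype predicate without changing the counting code.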

Please check out the code and try it out; we can discuss it in greater detail next time.


ielis · 2023-11-02T02:33:58Z

Now that `develop` is merged into the PR branch, we should be OK to move forward with this PR if the code looks good.

ielis added 17 commits October 23, 2023 12:20

Rename SimplePredicate to BooleanPredicate and make it extend `PolyPredicate`, since it is its specialization. Improve documentation.

a718152

Streamline CohortAnalysis prep

e59bd70

Ensure we have the status attribute of Phenotype. We should always have that!

82b416c

Update tutorial.rst

645efdd

Setup CohortAnalysis testing stub.

30c9bcc

Most of our predicates should be instances of BooleanPredicate.

8ad9b17

Most of our predicates should be instances of BooleanPredicate.

bc6ac44

Rename HPOPresentPredicate to PropagatingPhenotypePredicate.

fecb7ae

Turn PolyPredicate.categories property into get_categories() static method.

1306b22

Split predicates into genotype and phenotype packages, implement `CommunistCohortAnalysis`.

1552e47

Show GPC analysis summary.

c529c84

Disable the local test.

4c24f08

Add documentation, expose phenotype_categories from `GenotypePhenotypeAnalysisResult`.

2aaef03

Add stub for comparing two variants.

f3f4c75

Merge branch 'develop' into work-on-predicates

cca7489

Add CohortAnalysis configuration.

bc91a9a

Merge branch 'fix-change-length-bug' into work-on-predicates

6ff4fc8

Merge branch 'fix-change-length-bug' into work-on-predicates

5580bf5

ielis marked this pull request as ready for review October 26, 2023 16:50

ielis linked an issue Oct 26, 2023 that may be closed by this pull request

Review CohortAnalysis._remove_low_hpo_terms #92

Closed

ielis added 4 commits November 1, 2023 22:05

Merge branch 'develop' into work-on-predicates

7ebeb2c

# Conflicts:
#   src/genophenocorr/analysis/predicate/_all_predicates.py
#   src/genophenocorr/model/_cohort.py
#   src/genophenocorr/model/_variant.py

Straighten out Cohort.get_protein_features_affected - the only usage of `ProteinMetadata.get_features_variant_overlaps()`.

e377472

Update protein feature predicates to use protein regions.

33428db

Mark the online test.

d89d226

ielis merged commit f588591 into develop Nov 13, 2023
4 checks passed

ielis deleted the work-on-predicates branch November 13, 2023 16:28
