-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added test case for #73 and a fix #110
Conversation
WalkthroughThe changes involve enhancements to the docstring of the Changes
Assessment against linked issues
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/aces/query.py (2 hunks)
Additional comments not posted (2)
src/aces/query.py (2)
34-34
: Improved docstring clarity.The modification to the docstring enhances clarity by indicating that the examples are limited.
45-71
: Comprehensive examples added to docstring.The addition of examples in the docstring significantly improves the usability and comprehensiveness of the documentation. These examples provide a clear demonstration of how to use the
query
function with valid input.
Codecov ReportAll modified and coverable lines are covered by tests ✅
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think the config does capture @Oufattole's example, but the difference is in the predicates_df
- his example has multiple rows with the same subject_id
but different timestamp
.
Changing it to:
predicates_df = pl.DataFrame({
"subject_id": [1, 1, 3], # instead of [1, 2, 3]
"timestamp": [datetime(1980, 12, 28), datetime(2010, 6, 20), datetime(2010, 5, 11)],
"A": [False, False, False],
"_ANY_EVENT": [True, True, True],
})
leads to the following with some duplicate rows:
shape: (5, 2)
┌────────────┬─────────────────────┐
│ subject_id ┆ trigger │
│ --- ┆ --- │
│ i64 ┆ datetime[μs] │
╞════════════╪═════════════════════╡
│ 1 ┆ 1980-12-28 00:00:00 │
│ 1 ┆ 2010-06-20 00:00:00 │
│ 1 ┆ 1980-12-28 00:00:00 │
│ 1 ┆ 2010-06-20 00:00:00 │
│ 3 ┆ 2010-05-11 00:00:00 │
└────────────┴─────────────────────┘
I think that is where I got confused
Ahh, I see, makes sense. I'll update the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/aces/query.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/aces/query.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- src/aces/constraints.py (1 hunks)
- src/aces/extract_subtree.py (2 hunks)
Files skipped from review due to trivial changes (1)
- src/aces/extract_subtree.py
Additional comments not posted (3)
src/aces/constraints.py (3)
74-90
: Example addition looks good!The added example for
check_constraints
is clear and aligns well with the existing examples. It provides a useful demonstration of the function's usage.
Line range hint
10-72
: Function implementation is solid.The
check_constraints
function is well-implemented with clear logic and appropriate error handling. The logging provides useful insights into the filtering process.
Line range hint
93-149
: Function implementation is solid.The
check_static_variables
function is well-structured with efficient filtering logic using Polars expressions. The error handling and logging are appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/aces/extract_subtree.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/aces/extract_subtree.py
So I just pushed what I think is a fix in 46690b4 Basically, the fix takes the line: child_anchor_realizations = window_summary_df.select(
"subject_id",
pl.col("child_anchor_timestamp").alias("subtree_anchor_timestamp"),
) and adds child_anchor_realizations = window_summary_df.select(
"subject_id",
pl.col("child_anchor_timestamp").alias("subtree_anchor_timestamp"),
).unique() Why is this appropriate? Because, there is literally no point to recursing on a dataframe where multiple rows are identical. If everything else is working appropriately, such an action should just likewise return a set of duplicate outputs. If two cases of possible child anchors are the same event in a patient's record, then we can stremaline subsequent window criteria checks by just analyzing those windows only once, instead of multiple times. In any event bound window case, it is totally possible that multiple input windows with different subtree anchors will end at the same event in a patient's record. This means they'd have the same |
Closes #73 as not reproducible.
Summary by CodeRabbit
query
function documentation.query
function, including configuration setups and sample outputs.check_constraints
function with a sample DataFrame and usage example.extract_subtree
function documentation with updated examples.