-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(rust, python): std when ddof>=n_values returns None even in rolling context #11750
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
5c4dbbd
to
39c6c7d
Compare
# "std", blocked by https://github.com/pola-rs/polars/issues/11140 | ||
# "var", blocked by https://github.com/pola-rs/polars/issues/11140 | ||
"std", | ||
"var", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woohoo finally
let sum = self.sum.update(start, end); | ||
sum / NumCast::from(end - start).unwrap() | ||
unsafe fn update(&mut self, start: usize, end: usize) -> Option<T> { | ||
let sum = self.sum.update(start, end).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can unwrap_unchecked
this one as we know we always get value from sum
.
@@ -47,21 +47,36 @@ where | |||
let len = values.len(); | |||
let (start, end) = det_offsets_fn(0, window_size, len); | |||
let mut agg_window = Agg::new(values, start, end, params); | |||
let mut validity = match create_validity(min_periods, len, window_size, &det_offsets_fn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use unwrap_or_else
here? That elides the Some
branch implicitly.
|
||
let out = (0..len) | ||
.map(|idx| { | ||
let (start, end) = det_offsets_fn(idx, window_size, len); | ||
// safety: | ||
// we are in bounds | ||
unsafe { agg_window.update(start, end) } | ||
let agg = unsafe { agg_window.update(start, end) }; | ||
match agg { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unwrap_or_else
Ok(Box::new(PrimitiveArray::new( | ||
T::PRIMITIVE.into(), | ||
out.into(), | ||
validity.map(|b| b.into()), | ||
Some(validity.into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like we always create a validity even if we don't have any nulls. I think we should branch on the case where create_validity
-
returns a validity (e.g all null). In that case all values are null and we can directly return and create a
Vec<T::zero>
for the masked out values. -
doesn't return a validity. Then we can collect a
PrimitiveArray
from an iterator overOption<T>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the validity only not necessary if there are no nulls?
Because here, even if there were no nulls to begin with and even if min_periods
was 0
, agg_window.update(start, end)
might introduce nulls (for example, if taking the std
of a window whose length is smaller than ddof
)
} else { | ||
let proportion = T::from(float_idx - idx as f64).unwrap(); | ||
proportion * (vals[top_idx] - vals[idx]) + vals[idx] | ||
Some(proportion * (vals[top_idx] - vals[idx]) + vals[idx]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make these get_unchecked_release
?
@MarcoGorelli this PR is pretty old and still not marked as reviewable - do you need help finishing this or can it perhaps be closed? |
Thanks for the ping - will pick this up again, hopefully now I'll be able to address it (if not, will ask for help) |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #11750 +/- ##
==========================================
- Coverage 81.02% 81.01% -0.01%
==========================================
Files 1332 1332
Lines 172865 172879 +14
Branches 2458 2458
==========================================
+ Hits 140059 140064 +5
- Misses 32339 32347 +8
- Partials 467 468 +1 ☔ View full report in Codecov by Sentry. |
if let Some(validity) = create_validity(min_periods, len, window_size, &det_offsets_fn) { | ||
if validity.iter().all(|x| !x) { | ||
// all null! | ||
return Ok(Box::new(PrimitiveArray::new( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use new_null
constructor here. The bitmap now first allocates a vec full of false
and then iterates that to construct a bitmap. That can be done in a single allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Marco. Great one.
closes #11140
need to make the same change in the group_by_rolling(...).agg case as that's being tested againstdone