refactor: Fix parquet file metadata is dropped after first DSL->IR conversion #18789

Merged
1 commit merged into pola-rs:main from nameexhaustion:dsl-ir-metadata
Sep 17, 2024

nameexhaustion
Collaborator

No description provided.

@github-actions github-actions bot added internal An internal refactor or improvement python Related to Python Polars rust Related to Rust Polars labels Sep 17, 2024
/// Local use cases often repeatedly collect the same `LazyFrame` (e.g. in interactive notebook use-cases),
/// so we cache the IR conversion here, as the path expansion can be quite slow (especially for cloud paths).
#[cfg_attr(feature = "serde", serde(skip))]
cached_ir: Arc<Mutex<Option<IR>>>,
Collaborator Author

We already had mutexes on multiple fields of `DSL::Scan`, so I just cache the IR here instead of adding another mutex.
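For readers unfamiliar with the pattern, here is a minimal sketch of caching a conversion result behind an `Arc<Mutex<Option<_>>>`. It is not the code from this PR: `Ir`, `Scan`, `expensive_convert`, and the `conversions` counter are hypothetical stand-ins, and the fake "path expansion" only imitates the slow step mentioned in the doc comment above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the real `IR` type.
#[derive(Clone, Debug, PartialEq)]
struct Ir {
    expanded_paths: Vec<String>,
}

/// Hypothetical stand-in for a scan node in the DSL.
#[derive(Clone)]
struct Scan {
    pattern: String,
    /// Shared between clones, so collecting the same plan again reuses the result.
    cached_ir: Arc<Mutex<Option<Ir>>>,
    /// Counts how often the expensive conversion ran (for illustration only).
    conversions: Arc<AtomicUsize>,
}

impl Scan {
    fn new(pattern: &str) -> Self {
        Scan {
            pattern: pattern.to_string(),
            cached_ir: Arc::new(Mutex::new(None)),
            conversions: Arc::new(AtomicUsize::new(0)),
        }
    }

    /// Pretend this is the slow part (e.g. expanding paths).
    fn expensive_convert(&self) -> Ir {
        self.conversions.fetch_add(1, Ordering::SeqCst);
        Ir {
            expanded_paths: vec![
                self.pattern.replace('*', "0"),
                self.pattern.replace('*', "1"),
            ],
        }
    }

    /// Returns the cached IR, running the conversion only on the first call.
    fn to_ir(&self) -> Ir {
        let mut guard = self.cached_ir.lock().unwrap();
        let ir = guard
            .get_or_insert_with(|| self.expensive_convert())
            .clone();
        ir
    }
}
```

Because the cache lives behind an `Arc`, a cloned `Scan` sees the same cached value, so everything attached to the first conversion result is still there on later calls.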


codecov bot commented Sep 17, 2024

Codecov Report

Attention: Patch coverage is 89.47368% with 12 lines in your changes missing coverage. Please review.

Project coverage is 79.84%. Comparing base (5d81d4a) to head (f996d44).
Report is 31 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...-stream/src/nodes/parquet_source/metadata_fetch.rs | 0.00% | 6 Missing ⚠️ |
| ...ates/polars-plan/src/plans/conversion/dsl_to_ir.rs | 63.63% | 4 Missing ⚠️ |
| ...es/polars-mem-engine/src/executors/scan/parquet.rs | 94.44% | 1 Missing ⚠️ |
| ...ates/polars-stream/src/nodes/parquet_source/mod.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18789      +/-   ##
==========================================
- Coverage   79.84%   79.84%   -0.01%     
==========================================
  Files        1518     1518              
  Lines      205563   205577      +14     
  Branches     2892     2892              
==========================================
+ Hits       164137   164143       +6     
- Misses      40878    40886       +8     
  Partials      548      548              


@nameexhaustion nameexhaustion marked this pull request as ready for review September 17, 2024 12:44
@ritchie46
Member

Can we add a test somehow?

@nameexhaustion
Collaborator Author

> Can we add a test somehow?

1 special order coming up 😁


codspeed-hq bot commented Sep 17, 2024

CodSpeed Performance Report

Merging #18789 will not alter performance

Comparing nameexhaustion:dsl-ir-metadata (f996d44) with main (5d81d4a)

Summary

✅ 40 untouched benchmarks

@ritchie46 ritchie46 merged commit 25f84e4 into pola-rs:main Sep 17, 2024
26 checks passed
@c-peters c-peters added the accepted Ready for implementation label Sep 23, 2024
@TNieuwdorp
Contributor

This might have fixed the post-1.1.0 regression that was somehow introduced in `dsl_to_ir.rs`! 🚀
