feat(python): support batched frame iteration over read_database queries #11664

Merged: 1 commit, Oct 11, 2023

@alexander-beedie (Collaborator) commented Oct 11, 2023

This PR adds another component to the database connectivity feature set: read_database can now return a generator of DataFrame batches via a new iter_batches option. It works generically across SQLAlchemy, ODBC, and other connection types, provided that the underlying backend/driver actually supports batched fetching.

Note that you can set batch_size without also setting iter_batches (in which case the batches are assembled into a single DataFrame for you), but you cannot set iter_batches without also providing a batch_size.

Example

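import polars as pl

# `user_connection` is an already-open database connection;
# `do_something` stands in for any per-batch processing.
for frame in pl.read_database(
    query="SELECT * FROM some_table",
    connection=user_connection,
    iter_batches=True,  # << new mode
    batch_size=1000,
):
    do_something(frame)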
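
The option rules described above can be illustrated without Polars. The following is a minimal sketch, not the implementation in this PR: fetch_rows is a hypothetical helper that works on plain row tuples from the standard library's sqlite3 module, and it only mirrors the stated behaviour (batch_size alone assembles one result, iter_batches yields batches lazily, iter_batches without batch_size is rejected).

```python
import sqlite3
from typing import Any, Iterator, Optional, Union

Row = tuple[Any, ...]


def fetch_rows(
    cursor: sqlite3.Cursor,
    query: str,
    *,
    iter_batches: bool = False,
    batch_size: Optional[int] = None,
) -> Union[list[Row], Iterator[list[Row]]]:
    """Hypothetical helper mirroring the option rules described in this PR."""
    # Assumed rule: batched iteration needs to know how large each batch is.
    if iter_batches and not batch_size:
        raise ValueError("iter_batches requires a batch_size")

    cursor.execute(query)

    def batches() -> Iterator[list[Row]]:
        # DB-API fetchmany returns an empty list once the result is exhausted.
        while chunk := cursor.fetchmany(batch_size):
            yield chunk

    if iter_batches:
        return batches()  # lazy: one batch is fetched per iteration step
    if batch_size:
        # Fetch in batches, but hand back a single assembled result.
        return [row for chunk in batches() for row in chunk]
    return cursor.fetchall()


# Usage with an in-memory database holding ten rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER)")
conn.executemany("INSERT INTO some_table VALUES (?)", [(i,) for i in range(10)])
cur = conn.cursor()

batch_lengths = [
    len(batch)
    for batch in fetch_rows(
        cur, "SELECT * FROM some_table", iter_batches=True, batch_size=4
    )
]
all_rows = fetch_rows(cur, "SELECT * FROM some_table", batch_size=4)
```

In the batched mode only one chunk of rows is held at a time, which is the main reason to prefer it for result sets that do not fit comfortably in memory.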

Also: closes #10697 as a drive-by (adds basic mitigation to prevent credential leakage from connectorx errors).
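
The description does not say how the mitigation is implemented. As an illustration of the general technique only, and not the code in this PR, the sketch below masks the user:password part of any connection URI that appears in an error message before the message is surfaced. The helper name redact_credentials and the example URI are assumptions.

```python
import re

# Matches "user:password" between "://" and "@" in a URI such as
# postgresql://user:secret@host/db. Assumes any "@" inside the password
# is percent-encoded, as URI syntax requires.
_CREDENTIALS = re.compile(r"(?<=://)[^\s/:@]+:[^\s@]*(?=@)")


def redact_credentials(message: str) -> str:
    """Hypothetical helper: mask URI credentials embedded in an error message."""
    return _CREDENTIALS.sub("***:***", message)


raw = "failed to connect to postgresql://alice:s3cret@db.example.com:5432/sales"
redacted = redact_credentials(raw)

# A URI without credentials is left untouched.
plain = redact_credentials("failed to connect to postgresql://db.example.com:5432/sales")
```

Masking the whole user:password pair, not just the password, avoids leaking account names as well.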

@github-actions bot added the enhancement and python labels Oct 11, 2023
@alexander-beedie changed the title from "feat(python): support batched frame iterator from read_database queries" to "feat(python): support batched frame iterator over read_database queries" Oct 11, 2023
@alexander-beedie changed the title from "feat(python): support batched frame iterator over read_database queries" to "feat(python): support batched frame iteration over read_database queries" Oct 11, 2023
@ritchie46 (Member) commented

Cool stuff!

@ritchie46 merged commit 56a7817 into pola-rs:main Oct 11, 2023
25 checks passed
@alexander-beedie deleted the feat-read-database-batches branch October 11, 2023 13:20