Expose the internal default connection and sql function? #32

Open
eitsupi opened this issue Oct 15, 2023 · 5 comments

@eitsupi
Contributor

eitsupi commented Oct 15, 2023

The sql function used internally would be useful for running processing through DuckDB (e.g., reading Parquet files).

duckdb-r/R/sql.R

Lines 1 to 15 in d243b53

#' Run a SQL query
#'
#' `sql()` runs an arbitrary SQL query and returns a data.frame the query results
#'
#' @param sql A SQL string
#' @param conn An optional connection, defaults to built-in default
#' @return A data frame with the query result
#' @noRd
#' @examples
#' print(duckdb::sql("SELECT 42"))
sql <- function(sql, conn = default_connection()) {
  stopifnot(dbIsValid(conn))
  dbGetQuery(conn, sql)
}

Would you consider exporting this with a name like duckdb_query?

@krlmlr krlmlr changed the title from "Exporse the internal sql function?" to "Expose the internal sql function?" Dec 2, 2023
@krlmlr
Collaborator

krlmlr commented Dec 2, 2023

This function and the associated default_connection() are not really used, even internally. We're free to design them however we like, but I'd like to understand the motivation and use case. Can't most of that be achieved today with DBI?
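For context, here is a minimal sketch of the DBI route mentioned above, assuming the DBI and duckdb packages are installed. The variable names `con` and `res` are illustrative only; the point is that the connection has to be created and passed explicitly.

```r
library(DBI)

# Open an in-memory DuckDB database explicitly
con <- dbConnect(duckdb::duckdb())

# Run a query through the explicit connection; the result is a data frame
res <- dbGetQuery(con, "SELECT 42 AS answer")
print(res)

# Close the connection and shut down the in-memory database
dbDisconnect(con, shutdown = TRUE)
```

Compared with the internal sql() quoted in the first comment, the only extra work is managing `con` by hand.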

@eitsupi
Contributor Author

eitsupi commented Dec 3, 2023

Thank you for your reply.
I think it would be great if we could eventually do in R something like the following Python example.

https://duckdb.org/docs/archive/0.9.2/guides/python/import_pandas

import duckdb
import pandas

# Create a Pandas dataframe
my_df = pandas.DataFrame.from_dict({'a': [42]})

# create the table "my_table" from the DataFrame "my_df"
# Note: duckdb.sql connects to the default in-memory database connection
duckdb.sql("CREATE TABLE my_table AS SELECT * FROM my_df")

# insert into the table "my_table" from the DataFrame "my_df"
duckdb.sql("INSERT INTO my_table SELECT * FROM my_df")

Here, both the connection setup and the registration of the pandas.DataFrame with the database happen automatically.
The R client does not currently register data frames automatically (duckdb/duckdb#6771), but for now I think it would already be useful to be able to omit the connection, as the sql function allows.
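For comparison, a rough R counterpart of the Python snippet with today's API might look like the sketch below, assuming DBI and duckdb are installed. Both the connection and the data frame registration are explicit here, using duckdb_register(); the names `con`, `my_df`, and `my_table` are taken from or modeled on the Python example.

```r
library(DBI)

# Explicit in-memory connection (Python's duckdb.sql() uses a default one)
con <- dbConnect(duckdb::duckdb())

# A data frame equivalent to the pandas example
my_df <- data.frame(a = 42)

# Explicit registration (Python picks up `my_df` automatically)
duckdb::duckdb_register(con, "my_df", my_df)

# Create the table "my_table" from the data frame, then insert into it again
dbExecute(con, "CREATE TABLE my_table AS SELECT * FROM my_df")
dbExecute(con, "INSERT INTO my_table SELECT * FROM my_df")

# Count the rows to confirm both statements ran
n_rows <- dbGetQuery(con, "SELECT count(*) AS n FROM my_table")
print(n_rows)

dbDisconnect(con, shutdown = TRUE)
```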

@krlmlr krlmlr changed the title from "Expose the internal sql function?" to "Expose the internal default connection and sql function?" Mar 21, 2024
@krlmlr
Collaborator

krlmlr commented Mar 21, 2024

Can we do a draft/pilot in a separate package, on top of DBI, before committing here?
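One possible shape for such a pilot, as a sketch only: a lazily created default connection plus a thin wrapper around DBI. Everything named here (`.pilot_env`, `default_conn()`, and `duckdb_query()`, the last borrowed from the name suggested in the first comment) is hypothetical and not part of any existing package.

```r
library(DBI)

# Hypothetical package-level state holding the default connection
.pilot_env <- new.env(parent = emptyenv())

# Hypothetical helper: create the in-memory connection on first use,
# and recreate it if it has been closed in the meantime
default_conn <- function() {
  if (is.null(.pilot_env$con) || !dbIsValid(.pilot_env$con)) {
    .pilot_env$con <- dbConnect(duckdb::duckdb())
  }
  .pilot_env$con
}

# Hypothetical wrapper with the same shape as the internal sql()
duckdb_query <- function(sql, conn = default_conn()) {
  stopifnot(dbIsValid(conn))
  dbGetQuery(conn, sql)
}

# No connection argument needed for the common case
duckdb_query("SELECT 42 AS answer")
```

Data frames would still need explicit registration in this sketch; automatic registration is the part tracked in duckdb/duckdb#6771.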

@eitsupi
Contributor Author

eitsupi commented Mar 24, 2024

I don't have time to work on this right away, but I have in mind something like the interface that polarssql offers, shown below.

https://rpolars.github.io/r-polarssql/reference/polarssql_query.html

library(polarssql)

# Register the data frame under the name "mtcars"
polarssql_register(mtcars = mtcars)

query <- "SELECT * FROM mtcars LIMIT 5"

# Returns a polars LazyFrame
polarssql_query(query)

# Clean up
polarssql_unregister("mtcars")

@stephhazlitt

I noticed the internal duckdb:::sql() function being used in today's DuckDB blog post, which also made me wonder whether it might be worth exporting.
