-
Hello all,

I am interested in writing (and contributing) a dialect for Apache Drill. Drill uses ANSI SQL (2003) with some additions, so this looks fairly straightforward. However, there is one piece which I'm not quite sure how to handle. Drill allows the user to change certain options at query time. For example, if you are querying an Excel file and want to query something other than the default sheet, you could write a query like this:

    SELECT <fields>
    FROM table( dfs.`test_data.xlsx` (type => 'excel', sheetName => 'secondSheet'))

My goal here is to get the Drill dialect to recognize this as valid SQL and basically leave it alone. Are there any examples of something similar, or do you have any suggestions?

@tobymao posted the following response on an issue I created:

Are there any examples of this?
-
Yes, look at the Snowflake dialect; it maps -> to a lambda.
-
This commit makes it so all dialects now support => as a kwarg operator.
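To illustrate what treating => as a single kwarg operator means at the token level, here is a minimal standalone sketch. It is not sqlglot's tokenizer; the pattern, the token names, and the `tokenize` helper are all hypothetical and only cover the small slice of syntax used in this thread.

```python
import re

# Hypothetical token pattern for illustration only (not sqlglot's tokenizer).
# Two-character operators are listed before single-character punctuation so
# that "=>" is never split into "=" followed by ">".
TOKEN_RE = re.compile(
    r"""
    (?P<string>'(?:[^']|'')*')           # single-quoted string literal
  | (?P<quoted>`[^`]*`)                  # backtick-quoted identifier
  | (?P<kwarg>=>)                        # keyword-argument operator
  | (?P<arrow>->)                        # lambda arrow
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)     # bare identifier or keyword
  | (?P<number>\d+(?:\.\d+)?)            # integer or decimal literal
  | (?P<punct>[().,=<>*])                # single-character punctuation
  | (?P<space>\s+)                       # whitespace, dropped below
    """,
    re.VERBOSE,
)


def tokenize(sql):
    """Return a list of (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character {sql[pos]!r} at position {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("table(dfs.`test_data.xlsx`(type => 'excel'))"):
        print(token)
```

With => and -> as distinct token kinds, a parser can map one to a keyword argument and the other to a lambda without the two interfering.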
-
@tobymao Thanks for all your help so far. Firstly, here's a link to my branch with the Drill dialect [1]. Here is an answer to your question. Forgive me if this is too long.

TL;DR
dfs.test_data.xlsx is the data source.

Longer Explanation
Let me explain a bit about what the [...]

So let's say that you have an Excel file that you want to query using Drill. Out of the box, you could write a query like this:

    SELECT * FROM dfs.`/path/to/file/data.xlsx`

In this case, the [...]

    FROM dfs.`/a/really/long/path/to/a/file.xlsx`

Drill lets you define a workspace so you don't have to type all that. Thus you could have a query like this:

    SELECT *
    FROM dfs.my_workspace.`file.xlsx`

The Table Function
With all that said, there are situations where a user will want to supply additional information to Drill at query time. The example I gave originally was that a user might want to access other sheets in an Excel file, but users can also use the [...]

    SELECT *
    FROM table(dfs.tmp.`text_table`(
    schema => 'inline=(col1 date properties {`drill.format` = `yyyy-MM-dd`}) properties {`drill.strict` = `false`}'))

I hope this makes sense. I'm going to try to finish up the date/time conversions in the Drill dialect today or by Monday at the latest. The module's structure is very well engineered, so I have to say writing the dialect is pretty straightforward, even for a n00b. :-)
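For anyone who wants to see the shape of these table-function queries spelled out, here is a small sketch that assembles the FROM-clause fragment from a table reference and a set of options. The helper name `drill_table_function` is hypothetical (it is not part of Drill or of the dialect branch), it handles only string-valued options, and doubling single quotes inside values is an assumption based on standard SQL escaping.

```python
def drill_table_function(table, **options):
    """Build a fragment of the form table(<table>(key => 'value', ...)).

    Hypothetical helper for illustration. `table` is inserted verbatim, so the
    caller supplies any backtick quoting. With no options, the plain table
    reference is returned unchanged.
    """
    if not options:
        return table
    args = ", ".join(
        # Assumption: single quotes inside a value are escaped by doubling.
        "{} => '{}'".format(key, str(value).replace("'", "''"))
        for key, value in options.items()
    )
    return f"table({table}({args}))"


if __name__ == "__main__":
    fragment = drill_table_function(
        "dfs.`test_data.xlsx`", type="excel", sheetName="secondSheet"
    )
    print("SELECT * FROM " + fragment)
```

The options keep the order in which they are passed, so the generated text lines up with the hand-written queries above.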
fda1168