Issue #419: Code Reorganization #425
Conversation
CONTRIBUTING.md (Outdated)
@@ -193,7 +193,7 @@ sbt assembly
 $SPARK_HOME/bin/spark-shell \
 --jars ./target/scala-2.12/qbeast-spark-assembly-0.6.0.jar \
 --packages io.delta:delta-spark_2.12:3.1.0 \
---conf spark.sql.extensions=io.qbeast.spark.internal.QbeastSparkSessionExtension \
+--conf spark.sql.extensions=io.qbeast.spark.delta.QbeastSparkSessionExtension \
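For reference, the same configuration can also be set programmatically when building a session. A minimal Scala sketch (the app name is illustrative; the package path assumes this PR's change):

    import org.apache.spark.sql.SparkSession

    // Enable the Qbeast extension from its new location in the delta package.
    val spark = SparkSession
      .builder()
      .appName("qbeast-example") // illustrative name
      .config(
        "spark.sql.extensions",
        "io.qbeast.spark.delta.QbeastSparkSessionExtension")
      .getOrCreate()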
Why is the SessionExtension in the delta package?
I've noticed that the QbeastSparkSessionExtension strongly depends on io.delta.sql.DeltaSparkSessionExtension, which is why I moved it to the delta package. However, we can discuss the best location for it, or whether we should create our own format-agnostic SparkSessionExtension.
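To make the coupling concrete, here is a hedged sketch (not the PR's actual code) of how a session extension built on top of Delta's typically looks; QbeastRule is a hypothetical placeholder for Qbeast's own rules:

    import io.delta.sql.DeltaSparkSessionExtension
    import org.apache.spark.sql.SparkSessionExtensions
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical no-op rule standing in for Qbeast-specific rules.
    case class QbeastRule() extends Rule[LogicalPlan] {
      override def apply(plan: LogicalPlan): LogicalPlan = plan
    }

    class QbeastSparkSessionExtension extends (SparkSessionExtensions => Unit) {
      override def apply(extensions: SparkSessionExtensions): Unit = {
        // Register Delta's rules first: this is the hard dependency on
        // io.delta.sql.DeltaSparkSessionExtension discussed above.
        new DeltaSparkSessionExtension().apply(extensions)
        // Layer the Qbeast rules on top.
        extensions.injectResolutionRule(_ => QbeastRule())
      }
    }

A format-agnostic alternative would register only the Qbeast rules and let each format module contribute its own extension.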
I would suggest one of the following:
- Keep the location at io.qbeast.spark.internal.QbeastSparkSessionExtension and delete all references to Delta (which I guess is tedious)
- Rename it to QbeastDeltaSparkSessionExtension as well
The second option should be faster for this iteration.
@@ -74,17 +73,17 @@ case class QbeastOptions(
   def toMap: CaseInsensitiveMap[String] = {
     val options = Map.newBuilder[String, String]
     for (txnAppId <- txnAppId; txnVersion <- txnVersion) {
-      options += DeltaOptions.TXN_APP_ID -> txnAppId
-      options += DeltaOptions.TXN_VERSION -> txnVersion
+      options += "txnAppId" -> txnAppId
+      options += "txnVersion" -> txnVersion
Why are we not using the DeltaOptions?
Oh, I see that QbeastOptions is an internal object. Then I am not sure how to control the mapping between QbeastOptions and DeltaOptions. Should we add methods in the specific modules to verify and merge both subsets of options?
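One hedged sketch of that idea, assuming a hypothetical helper in the delta module (QbeastDeltaOptions is not in this PR): the core keeps plain string keys, and the Delta-specific module maps them onto the DeltaOptions constants at write time.

    import org.apache.spark.sql.delta.DeltaOptions

    // Hypothetical helper living in the delta module, so core code never
    // imports DeltaOptions directly.
    object QbeastDeltaOptions {
      def mergeTransactionOptions(
          qbeastOptions: Map[String, String]): Map[String, String] = {
        val merged = Map.newBuilder[String, String]
        // Map the format-agnostic keys onto Delta's option constants.
        qbeastOptions.get("txnAppId").foreach(v => merged += DeltaOptions.TXN_APP_ID -> v)
        qbeastOptions.get("txnVersion").foreach(v => merged += DeltaOptions.TXN_VERSION -> v)
        merged.result()
      }
    }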
* Code refactoring: Use Qbeast abstractions instead of Delta imports
* Issue #417: Abstract the Qbeast Snapshot Module (#411)
* Issue #418: Abstract PreCommitHook and StagingDataManager (#421)
* Issue #418: Abstract RollupDataWriter and QbeastStats (#423)
* Issue #419: Code Reorganization (#425)
* Issue #420: Create Separate Modules (#427)
* Issue #398: Fix small overhead added during the refactoring (#436)
* Issue #441: Fix data change on optimize (#442)
Description
This PR is for #419 (related to #398): it reorganizes the code in the current project structure and removes all the Delta dependencies from the core/spark code.
Type of change
Code refactoring
Checklist:
Here is the list of things you should do before submitting this pull request: