Rebase to main package #3

Merged
merged 41 commits into from
Jan 26, 2024

matt-fleming

No description provided.

Jesse and others added 30 commits June 7, 2023 14:02
…#131)

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
## Summary

Support the OAuth flow for Databricks on Azure

## Background

Some OAuth endpoints (e.g. the OpenID configuration endpoint) and scopes differ between Databricks on Azure and Databricks on AWS. The current code only supports the OAuth flow for Databricks on AWS.

## What changes are proposed in this pull request?

- Change `OAuthManager` to decouple the Databricks AWS-specific configuration from the OAuth flow
- Add `sql/auth/endpoint.py`, which implements cloud-specific OAuth endpoint configuration
- Change `DatabricksOAuthProvider` to work with the OAuth configurations of the different Databricks clouds (AWS, Azure)
- Add the corresponding unit tests
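
The decoupling described above could look roughly like the following. This is only a sketch under stated assumptions, not the code added in `sql/auth/endpoint.py`: the class names, the `endpoints_for_host` helper, the scope lists and the URLs are all hypothetical placeholders chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class OAuthEndpoints(ABC):
    """Hypothetical interface: one implementation per cloud."""

    @abstractmethod
    def get_scopes(self) -> List[str]:
        """Scopes to request during the OAuth flow."""

    @abstractmethod
    def get_openid_config_url(self, hostname: str) -> str:
        """URL of the OpenID configuration document for this cloud."""


class AwsOAuthEndpoints(OAuthEndpoints):
    def get_scopes(self) -> List[str]:
        # Placeholder scopes, not taken from the pull request.
        return ["example-aws-scope"]

    def get_openid_config_url(self, hostname: str) -> str:
        # Placeholder path, not taken from the pull request.
        return f"https://{hostname}/example/aws/openid-configuration"


class AzureOAuthEndpoints(OAuthEndpoints):
    def get_scopes(self) -> List[str]:
        # Placeholder scopes, not taken from the pull request.
        return ["example-azure-scope"]

    def get_openid_config_url(self, hostname: str) -> str:
        # Placeholder path, not taken from the pull request.
        return f"https://{hostname}/example/azure/openid-configuration"


def endpoints_for_host(hostname: str) -> OAuthEndpoints:
    """Pick the cloud-specific configuration from the workspace hostname.

    Assumption: Azure workspaces are recognised by their domain suffix and
    everything else is treated as AWS.
    """
    if hostname.lower().endswith(".azuredatabricks.net"):
        return AzureOAuthEndpoints()
    return AwsOAuthEndpoints()
```

With this shape, the OAuth flow itself only talks to the `OAuthEndpoints` interface, so supporting another cloud means adding one class rather than touching the flow.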
---------

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
…#159)

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
* Cloud Fetch download handler

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Issue fix: final result link compressed data has multiple LZ4 end-of-frame markers

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Addressing PR comments
 - Linting
 - Type annotations
 - Use response.ok
 - Log exception
 - Remove semaphore and only use threading.event
 - reset() flags method
 - Fix tests after removing semaphore
 - Link expiry logic should be in secs
 - Decompress data static function
 - link_expiry_buffer and static public methods
 - Docstrings and comments

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Changing logger.debug to remove url

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* _reset() comment to docstring

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* link_expiry_buffer -> link_expiry_buffer_secs

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

---------

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
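
Two of the review points in the commit above (link expiry measured in seconds with a buffer, and signalling completion with a `threading.Event` instead of a semaphore) can be illustrated with a minimal sketch. It is not the handler from this pull request: `SketchDownloadHandler`, `is_link_expired` and their parameters are hypothetical names, and the download itself is replaced by an arbitrary callable.

```python
import threading
import time
from typing import Callable, Optional


def is_link_expired(
    expiry_time_secs: float,
    link_expiry_buffer_secs: float = 0.0,
    now_secs: Optional[float] = None,
) -> bool:
    """Treat a link as expired once it is within the buffer of its expiry time.

    All values are in seconds since the epoch.
    """
    current = time.time() if now_secs is None else now_secs
    return expiry_time_secs - link_expiry_buffer_secs <= current


class SketchDownloadHandler:
    """Hypothetical handler that signals completion through a threading.Event."""

    def __init__(self, fetch: Callable[[], bytes]):
        self._fetch = fetch
        self.is_download_finished = threading.Event()
        self.result: Optional[bytes] = None
        self.error: Optional[Exception] = None

    def _reset(self) -> None:
        """Clear flags and results so the handler can be run again."""
        self.is_download_finished.clear()
        self.result = None
        self.error = None

    def run(self) -> None:
        self._reset()
        try:
            self.result = self._fetch()
        except Exception as exc:  # keep the error for the caller to inspect
            self.error = exc
        finally:
            # Always signal, so that waiters never block forever.
            self.is_download_finished.set()

    def wait(self, timeout_secs: float) -> bool:
        """Return True if the download finished within the timeout."""
        return self.is_download_finished.wait(timeout_secs)
```

A caller would typically submit `handler.run` to a thread pool and then call `handler.wait(timeout_secs)` before reading `handler.result`.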
* Cloud Fetch download manager

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Bug fix: submit handler.run

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Type annotations

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Namedtuple -> dataclass

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Shutdown thread pool and clear handlers

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Docstrings and comments

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* handler.run is the correct call

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Link expiry buffer in secs

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Adding type annotations for download_handlers and downloadable_result_settings

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Move DownloadableResultSettings to downloader.py to avoid circular import

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Black linting

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Timeout is never None

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

---------

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
* Cloud fetch queue and integration

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Enable cloudfetch with direct results

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Typing and style changes

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Client-settable max_download_threads

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Docstrings and comments

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Increase default buffer size bytes to 104857600

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Move max_download_threads to kwargs of ThriftBackend, fix unit tests

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Fix tests: staticmethod make_arrow_table mock not callable

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* cancel_futures in shutdown() only available in python >=3.9.0

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Black linting

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Fix typing errors

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

---------

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
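
The `cancel_futures` commit above refers to the keyword argument that `concurrent.futures.Executor.shutdown()` gained in Python 3.9. One way to stay compatible with older interpreters is sketched below; the helper name `shutdown_pool` is hypothetical and this is not necessarily how the pull request handles it.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def shutdown_pool(pool: ThreadPoolExecutor, wait: bool = False) -> None:
    """Shut down a pool, cancelling pending futures where the runtime allows it."""
    if sys.version_info >= (3, 9):
        # cancel_futures was added to Executor.shutdown() in Python 3.9.
        pool.shutdown(wait=wait, cancel_futures=True)
    else:
        # Older interpreters: pending futures cannot be cancelled this way.
        pool.shutdown(wait=wait)
```

On Python 3.8 and earlier, work that is already queued still runs to completion, so callers that need prompt cancellation there would have to cancel the individual futures themselves.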
* Cloud Fetch e2e tests

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Test case works for e2-dogfood shared unity catalog

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Moving test to LargeQueriesSuite and setting catalog to hive_metastore

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Align default value of buffer_size_bytes in driver tests

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

* Adding comment to specify what's needed to run successfully

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>

---------

Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
…bricks#122)

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Co-authored-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com>
Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Daniel Segesdi <daniel.segesdi@turbine.ai>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
)

---------
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------
Signed-off-by: Bogdan Kyryliuk <b.kyryliuk@gmail.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: William Gentry <william.barr.gentry@gmail.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------

Co-authored-by: Jesse <jesse.whitehouse@databricks.com>
Resolves databricks#187

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Behaviour is gated behind `enable_v3_retries` config. This will be removed and become the default behaviour in a subsequent release.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Jesse and others added 11 commits August 10, 2023 11:03
* Add note to changelog about using cloud_fetch
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jacobus Herman <jacobus.herman@otrium.com>

Co-authored-by: Jesse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
…icks#208)

snok/install-poetry@v1 installs the latest version of Poetry

The latest version of Poetry was released on 20 August 2023 (four days ago
as of this commit) and drops support for Python 3.7, causing our
GitHub Action to fail.

Until we complete databricks#207 we need to conditionally install the last version
of poetry that supports Python 3.7 (poetry==1.5.1)

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
databricks#206)

* Make retry policy backwards compatible with urllib3~=1.0.0

We already implement the equivalent of backoff_max so the behaviour will
be the same for urllib3==1.x and urllib3==2.x

We do not implement backoff jitter so the behaviour for urllib3==1.x will
NOT include backoff jitter whereas urllib3==2.x WILL include jitter.
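
To make the distinction above concrete, here is a generic sketch of exponential backoff with a cap (the role `backoff_max` plays) and optional jitter. It is not the connector's retry policy and is not meant to reproduce urllib3's internals; the function name, defaults and the injectable `rng` parameter are assumptions made for illustration.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    backoff_factor: float = 1.0,
    backoff_max: float = 60.0,
    jitter: float = 0.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to sleep before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Exponential growth: factor, 2*factor, 4*factor, ...
    delay = backoff_factor * (2 ** (attempt - 1))
    if jitter:
        # Add a random amount in [0, jitter) to spread retries apart.
        delay += rng() * jitter
    # The cap is applied last, so jitter never pushes the delay past the maximum.
    return max(0.0, min(backoff_max, delay))
```

With `jitter=0.0` the delays are fully deterministic, which corresponds to the urllib3 1.x behaviour described above; a non-zero `jitter` corresponds to the 2.x behaviour.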

---------

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---------

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@matt-fleming matt-fleming merged commit 56687e5 into main Jan 26, 2024
0 of 2 checks passed

evb123 commented Jan 26, 2024

Rebase socket_timeout hack with current forked repo
