- API overview
- Connection API (All Partners)
- Datasource API (Ingestion Partners)
- Regression Testing APIs
This page details the technical specifications of the APIs that partners implement. Also see the Technical FAQ for common questions.
In the API requests, Databricks will include the following request headers:
"User-Agent": "databricks"
"Authorization": "Basic <base64 user:pwd>" [user:pwd provided by partner to Databricks for authentication]
"Content-Type": "application/json"
"Accept-Language": "user language" [en-US]
Partners must include the following header in all API responses that return a JSON payload.
"Content-Type": "application/json"
Databricks authenticates with partners using Basic authentication.
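The headers above can be assembled as follows; this is a minimal Python sketch, where `build_request_headers` is an illustrative name and the credential values are placeholders for the user:pwd pair the partner provides to Databricks:

```python
import base64

def build_request_headers(username: str, password: str,
                          user_language: str = "en-US") -> dict:
    """Build the headers Databricks sends on each API request.

    username/password is the credential the partner provided to
    Databricks for Basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        "User-Agent": "databricks",
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
        "Accept-Language": user_language,
    }
```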
The Connect API is used to sign in or sign up a user with a partner, with Databricks resources pre-configured.
The order of events when connecting Databricks to a partner is as follows:
- The user clicks the Partner tile.
- The user confirms the Databricks resources that will be provisioned for the connection (e.g. the Service Principal, the PAT or the service principal OAuth secret, the SQL Warehouse).
- The user clicks Connect.
- Databricks calls the partner's Connect API with all of the Databricks data that the partner needs.
- The partner provisions any accounts and resources needed. (e.g. persisting the Databricks workspace_id, provisioning a Databricks output node).
- The partner responds with
- HTTP status code - for determining success, Databricks failure, and partner failure.
- Redirect URL - for Databricks to launch in a new browser tab.
- Additional data - see below.
- Databricks launches the Redirect URL in a new browser tab.
The Redirect URL can be customized by the partner to handle different cases. Partners can embed arbitrary data (e.g. user info) into the URL. As a typical example, a partner may choose to implement the following URLs. Superscripts denote options for the same scenario depending on the partner's capabilities.
- www.partner.com/create-trial
- Used when the partner has never seen the user or the account before.
- ¹Used when the account has an expired trial (and the partner allows multiple trials per account)
- www.partner.com/sign-in
- ²Used when the partner has never seen the user, but has seen the account, and can automatically provision the user.
- Used when the partner has seen the user and the account.
- www.partner.com/contact-your-admin
- ²Used when the partner has never seen the user, but has seen the account, and cannot automatically provision the user.
- www.partner.com/purchase-product
- ¹Used when the account has an expired trial (and the partner does not allow multiple trials per account)
Scenario | Partner operations during Connect API | Response
---|---|---
New user, new account | | Status_code = 200; Connection_id = abcd; Configured_resources = true; User_status = new; Account_status = new; Redirect Value = create_trial; Redirect URL = www.partner.com/create-trial
Expired account | | Status_code = 200; Connection_id = omitted OR abcd; Configured_resources = false OR true; User_status = new OR existing; Account_status = expired; Redirect Value = purchase_product OR create_trial; Redirect URL = www.partner.com/purchase-product OR www.partner.com/create-trial
New user, existing account | | Status_code = 200; Connection_id = abcd OR omitted; Configured_resources = true OR false; User_status = new; Account_status = active; Redirect Value = sign_in OR contact_admin OR create_trial; Redirect URL = www.partner.com/sign-in OR www.partner.com/contact-your-admin OR www.partner.com/create-trial
Existing user, existing account | | Status_code = 200; Connection_id = abcd OR omitted; Configured_resources = true OR false; User_status = existing; Account_status = active; Redirect Value = sign_in; Redirect URL = www.partner.com/sign-in
Databricks and the partner may disagree on whether the connection is established; a user may delete the connection on either the Databricks side or the partner side, causing this mismatch. Here's what should happen with the Connect API in each of the four cases.
- If Databricks has no connection configured, it will send a payload with is_connection_established set to false.
- If the partner has no connection configured, they will configure the connection.
- If the partner has a connection configured, they will configure a new, separate connection. This can happen if a connection was previously deleted on the Databricks-side or if a connection was manually configured from the partner to Databricks outside of Partner Connect.
- If Databricks has a connection configured, it will send a payload with is_connection_established set to true.
- If the partner has no connection configured, the partner responds with 404 connection_not_found so that Databricks can tell the user to take action.
- If the partner has a connection configured, no new connection needs to be configured.
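The four cases above can be sketched as a small decision function; the function name and the returned labels are illustrative, not part of the API contract:

```python
def reconcile_connection(is_connection_established: bool,
                         partner_has_connection: bool) -> str:
    """Decide the partner-side action for each of the four state combinations."""
    if not is_connection_established:
        # Databricks has no connection configured; the partner always provisions
        # one. If the partner already has a connection (deleted on the Databricks
        # side, or configured outside Partner Connect), it creates a new,
        # separate connection rather than reusing the old one.
        return "configure_new_connection"
    if not partner_has_connection:
        # Databricks believes it is connected but the partner does not:
        # respond 404 connection_not_found so the user can take action.
        return "respond_404_connection_not_found"
    # Both sides agree the connection exists; nothing new to configure.
    return "reuse_existing_connection"
```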
- A company or an organization has a Databricks account. We do not provide an identifier that represents the account to partners.
- An account can have multiple Databricks workspaces. A workspace is the logical container and isolation boundary for Databricks resources (e.g. SQL Warehouses). Databricks_organization_id and workspace_id are interchangeable as the appropriate identifier. Partner Connect is accessed within a workspace.
- A user can belong to multiple workspaces. A user has a unique email address. A user's identifier is databricks_user_id which is unique within a cloud (e.g. Azure).
- Partner Connect stores 0 or 1 connections per partner per workspace.
Databricks passes the standard fields below to your API. To participate in Partner Connect, your API must accept all of the mandatory fields; even if you receive information you don't need, you should not return an error. If there are additional fields you would like us to support, let us know through your partner representative.
POST <partners/databricks/v1/connect>: [example, can be customized]
{
"user_info": {
"email": "john.doe@databricks.com", [valid email address]
"first_name": "John", [Non-null String, may be empty string]
"last_name": "Doe", [Non-null String, may be empty string]
"databricks_user_id": 1234567890, [data-type is long]
"databricks_organization_id": 1234567890, [data-type is long]
"is_connection_established" : true|false
"auth": { [Only present if is_connection_established is false]
"personal_access_token": "dapi..."
}
},
"hostname": "organization.cloud.databricks.com",
"port": 443,
"workspace_url": "https://[organization/prefix-workspaceid/string].cloud.databricks.com/?o=12345677890",
"http_path": "sql/protocolv1/o/0/0222-185802-deny427", [optional, set if is_sql_warehouse is true]
"jdbc_url": "jdbc:spark://organization.cloud.databricks.com:443/...", [optional, set if is_sql_warehouse is true, used for legacy JDBC spark driver]
"databricks_jdbc_url": "jdbc:databricks://organization.cloud.databricks.com:443/...", [optional, set if is_sql_warehouse is true, used for new JDBC databricks driver]
"connection_id": "7f2e4c43-9714-47cf-9011-d8148eaa27a2", [example, optional, only present when is_connection_established is true]
"workspace_id": 1234567890, [same as user_info.databricks_organization_id]
"demo": true|false, [see Demos section below]
"cloud_provider": "azure", [or aws or gcp]
"cloud_provider_region" "us-west-2", [optional]
"is_free_trial": true|false, [is Databricks free trial]
"destination_location": "<cloud>://<location_2>", [optional]
"catalog_name": "my_catalog",[optional, it could be a custom name if using Unity Catalog, or "hive_metastore" if not.]
"database_name": "default database to use", [optional, unused and reserved for future use]
"cluster_id": "0222-185802-deny427", [optional: set only if jdbc/interactive cluster is required.]
"is_sql_endpoint" : true|false, [optional: same value as is_sql_warehouse]
"is_sql_warehouse": true|false, [optional: set if cluster_id is set. Determines whether cluster_id refers to Interactive Cluster or SQL Warehouse]
"data_source_connector": "Oracle", [optional, unused and reserved for future use: for data connector tools, the name of the data source that the user should be referred to in their tool]
"service_principal_id": "a2a25a05-3d59-4515-a73b-b8bc5ab79e31", [optional, the UUID (username) of the service principal identity]
"service_principal_oauth_secret": "dose...", [optional, the OAuth secret of the service principal identity, it will be passed only when the partner config includes OAuth M2M auth option]
"oauth_u2m_app_id": "782b7906-20c4-4c12-8850-b26b77d125f5" [optional, the client ID of Databricks OAuth U2M app connection created by Partner Connect. It will be passed only when the partner config includes OAuth U2M auth option]
}
Successful Responses:
Status Code: 200
{
"redirect_uri": "https://...",
"redirect_value": "create_trial", [example]
"connection_id": "7f2e4c43-9714-47cf-9011-d8148eaa27a2", [example, optional, see below]
"user_status": "new", [example]
"account_status": "existing", [example]
"configured_resources": true|false,
"oauth_redirect_uri": "http://www.partner.com/oauth/callback [example, optional, see below]
}
Return values:
- redirect_uri - the URL to launch in a new browser tab. Please note that we validate the hostname in the redirect_uri to prevent a redirect attack. If your redirect_uri will have a different hostname than your Connect API, we'll need to safelist that.
- redirect_value - a String that identifies the redirect_uri scenario. This will be used to verify correct behavior in automated testing. Valid values are "create_trial", "purchase_product", "sign_in", "contact_admin", and "not_applicable".
- connection_id - a String that identifies the connection in the Partner's system. This may be used as part of future improvements (See Optional section below).
- If is_connection_established is true in the request, connection_id should be omitted in the response.
- If is_connection_established is false and configured_resources is true, connection_id must be present in the response.
- If is_connection_established is false and configured_resources is false, connection_id should be omitted in the response.
- user_status - a String that represents whether the partner has seen the user before. Valid values are "new", "existing", and "not_applicable".
- account_status - a String that represents whether the partner has seen the account (i.e. the company or email domain) before. Valid values are "new", "active", "expired", and "not_applicable".
- configured_resources - a boolean that represents whether the partner configured/persisted the Databricks resources on this Connect API request.
- If is_connection_established is true, configured_resources must be set, but will be ignored.
- If is_connection_established is false and configured_resources is false, Databricks will delete the resources it provisioned.
- oauth_redirect_uri - the partner application's URL that handles the Databricks OAuth redirect request in the OAuth U2M flow (authorization code flow). It should be set only when the partner is configured with OAuth U2M as an auth option (PartnerConfig auth_options contains AUTH_OAUTH_U2M) and does not have a pre-registered Databricks published OAuth app connection (PartnerConfig is_published_app is false or null).
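The connection_id presence rules above can be sanity-checked with a small helper; the function name and shape are illustrative, not part of the API:

```python
from typing import Optional

def validate_connection_id(is_connection_established: bool,
                           configured_resources: bool,
                           connection_id: Optional[str]) -> bool:
    """Return True if a Connect API response follows the connection_id rules."""
    if is_connection_established:
        # Connection already exists on the Databricks side: omit connection_id.
        return connection_id is None
    if configured_resources:
        # New connection was configured: connection_id must be present.
        return connection_id is not None
    # Nothing was configured: omit connection_id.
    return connection_id is None
```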
Failure Responses:
All failure responses contain the same 3 fields:
- error_reason - Required; must be set to the appropriate enum value.
- display_message - Not required; will be displayed to the user if set in the case of 404 or 500
- debugging_message - Not required; will be logged if set.
Bad request
Thrown when Databricks provides a malformed request to the partner. This should never happen.
Status Code: 400
{
"error_reason": "bad_request",
"display_message": "foobar", [optional, not displayed for 400]
"debugging_message": "foobar" [optional]
}
Bad credentials
Thrown when Databricks provides the wrong credentials to the partner. This should never happen, but may happen in cases where credentials need to be rotated. Databricks will present the link to the sign in page for the partner.
Status Code: 401
{
"error_reason": "unauthorized",
"display_message": "foobar", [optional, not displayed for 401]
"debugging_message": "foobar" [optional]
}
Account or Connection not found
This should only be possible when Databricks sends a request with is_connection_established set to true.
- If the connection does not exist on the partner-side, return a 404 with error_reason set to connection_not_found. This occurs when a connection has been deleted on the partner side. The user will be directed to delete the connection in Databricks and re-create.
- If the account does not exist on the partner-side, return a 404 with error_reason set to account_not_found. This occurs when a partner wishes to create separate connections from a single Databricks workspace to multiple partner workspaces (e.g. one for each user). The user will be directed to contact their admin for an invite to an existing workspace or to delete the existing connection and recreate, which is a destructive operation. Improvements are planned for this workflow.
Status Code: 404
{
"error_reason": "account_not_found", [or “connection_not_found”]
"display_message": "foobar", [optional, will be displayed if present]
"debugging_message": "foobar" [optional]
}
Unexpected failure
For any other unexpected failures, the partner will return a 500. Databricks will retry the request 3 times with exponential backoff: first request after 1 second, second request after 2 seconds, and third request after 4 seconds. If Databricks continues to receive a 500, Databricks will present the resources that are created in the UI, but provide a link to the regular sign in page for the partner.
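The retry schedule described above can be modeled as follows; `call_with_retries` is an illustrative sketch, assuming the request function returns an HTTP status code and with `sleep` injectable for testing:

```python
import time

def call_with_retries(request_fn, max_retries: int = 3,
                      base_delay: float = 1.0, sleep=time.sleep):
    """Retry a request the way Databricks does on HTTP 500.

    Delays follow the documented schedule: 1s, 2s, 4s (exponential backoff).
    Returns the last status code observed.
    """
    status = request_fn()
    for attempt in range(max_retries):
        if status != 500:
            return status
        sleep(base_delay * (2 ** attempt))  # 1s, then 2s, then 4s
        status = request_fn()
    return status
```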
Status Code: 500
{
"error_reason": "general_error",
"display_message": "foobar", [optional, will be displayed if present]
"debugging_message": "foobar" [optional]
}
The partner should provide a REST GET API that returns the list of connectors they support. Partners are responsible for mapping their data source connectors to the Databricks-provided list of identifiers.
GET /partners/databricks/v1/connectors?pagination_token=101
Response:
{
"connectors": [{
"identifier": "DATABRICKS_CONNECTOR1_ID",
"name": "connector name1",
"type": "source", [or “target”]
"description": "Connects to ..." [optional]
},
{
"identifier": "DATABRICKS_CONNECTOR2_ID",
"name": "connector name1",
"type": "source", [or “target”]
"description": "Connects to ..." [optional]
}, ...
],
"Pagination_token": 101 // A token to continue listing
}
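Assuming that a missing pagination_token in a response means the listing is complete (the spec above does not state the termination condition explicitly), a client-side pagination loop might look like this; `fetch_page` stands in for the HTTP call:

```python
def list_all_connectors(fetch_page):
    """Collect every connector across pages of the connectors listing.

    fetch_page(token) performs GET /partners/databricks/v1/connectors
    with the given pagination_token (None for the first page) and
    returns the JSON body as a dict.
    """
    connectors, token = [], None
    while True:
        page = fetch_page(token)
        connectors.extend(page.get("connectors", []))
        token = page.get("pagination_token")
        if token is None:  # assumed end-of-listing signal
            break
    return connectors
```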
The following APIs are required for automated certification tests. Once partners implement these APIs, they can use the partner certification tests to validate the user scenarios.
This API is currently only used in automated tests. In the future it may be included in the partner connect experience.
POST <partners/databricks/test-connection>:
{
"connection_id": "7f2e4c43-9714-47cf-9011-d8148eaa27a2",
"cloud_provider": "azure", [or aws or gcp]
"databricks_organization_id": 123456789012345678,
}
Successful Responses:
Status Code: 200
{
"test_results": [
{
"test_name": "Connectivity",
"status": "SUCCESS",
"message": "Successfully connected to host"
},
{
"test_name": "Permissions",
"status": "FAILED",
"message": "WRITE permission check failed to database"
}
]
}
Failure Responses:
Connection not found.
Status Code: 404
{
"error_reason": "connection_not_found"
"display_message": "foobar", [optional, will be displayed if present]
"debugging_message": "foobar" [optional]
}
Unexpected failure
Status Code: 500
{
"error_reason": "general_error",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
This API cleans up all the resources provisioned for a given account, identified by domain name, cloud provider, and org id (workspace_id). It must be limited to Databricks test domain names (databricks-test.com and databricks-demo.com); the Delete API should return 400 Bad Request if it is called for any other domain name. This API is currently only used in automated tests.
DELETE <partners/databricks/account>:
{
"domain": "abc123.databricks-test.com",
"cloud_provider": "azure", [or aws or gcp]
"databricks_organization_id": 123456789012345678,
}
Successful Responses:
Status Code: 200
Failure Responses:
Bad Request
Status Code: 400
{
"error_reason": "bad_request",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
Unexpected failure
Status Code: 500
{
"error_reason": "general_error",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
This API should clean up a specific connection id in a given org id (workspace_id) and cloud provider. It is used in automated testing and is required for the Partner Connect experience.
In the Partner Connect experience, Databricks uses the Delete Connection API to notify partners when a connection is deleted on the Databricks side. Databricks calls the partner's delete-connection endpoint after deleting the associated resources on the Databricks side. Partners should return deletion_acknowledged if no action is taken upon receiving the notification. If the partner confirms resource deletion through the resources_deleted resource status, automated testing will verify that new connections can be made.
DELETE <partners/databricks/connection>:
{
"connection_id": "7f2e4c43-9714-47cf-9011-d8148eaa27a2",
"cloud_provider": "azure", [or aws or gcp]
"databricks_organization_id": 123456789012345678,
}
Successful Responses:
The field resource_status is a String that identifies the action the partner side took, if any. Valid values are "resources_deleted", "resources_pending_deletion", "deletion_acknowledged", and "user_unauthorized". "user_unauthorized" indicates that resources were not deleted because the user (specified in the user_info request field) lacks the required permissions.
{
"resource_status": "resources_deleted" [or resources_pending_deletion or deletion_acknowledged or user_unauthorized]
}
Failure Responses:
Bad credentials
Thrown when Databricks provides the wrong credentials to the partner. This should never happen, but may happen in cases where credentials need to be rotated.
Status Code: 401
{
"error_reason": "unauthorized",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
Connection not found
If the connection does not exist on the partner-side, return a 404 with error_reason set to connection_not_found. This occurs when a connection has already been deleted on the partner side.
Status Code: 404
{
"error_reason": “connection_not_found”
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
Unexpected failure
For any other unexpected failures, the partner will return a 500. Databricks will retry the request 3 times with exponential backoff: first request after 1 second, second request after 2 seconds, and third request after 4 seconds.
Status Code: 500
{
"error_reason": "general_error",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
The Expire Account API is used to expire a user trial. After this API is called, the partner is expected to return the redirect URI for handling expired users, with account_status set to expired. This API is only used for automated verification of the Partner Connect flow for expired accounts. It must be limited to Databricks test domain names (databricks-test.com and databricks-demo.com); partners should return 400 Bad Request if it is called for any other domain name. This API is optional if the partner doesn't support time-based account expiry.
PUT <partners/databricks/useraccount>:
{
"email": "test@test.com",
"databricks_user_id": 123456789012345678,
"cloud_provider": "azure", [or aws or gcp]
"databricks_organization_id": 123456789012345678,
}
Successful Responses:
Status Code: 200
Failure Responses:
Unexpected failure
Status Code: 500
{
"error_reason": "general_error",
"display_message": "foobar", [optional]
"debugging_message": "foobar" [optional]
}
We would like to demo select partner products to our field team and in Partner Connect product demonstrations. In order to do this, we need to ensure that partners return the Create Trial experience even when the same email, account, and workspace are repeatedly used.
The Connect API request includes a "demo" boolean flag that we will set to true only when demoing your product. When true, ensure we get the Create Trial flow, even if you've seen the user, account, and/or Databricks workspace before. Handling concurrent requests is not required.
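The demo override can be expressed as a one-line guard over the partner's normal routing logic; `effective_redirect_value` is an illustrative name, and `computed_value` stands for whatever the partner's regular logic would return:

```python
def effective_redirect_value(demo: bool, computed_value: str) -> str:
    """Force the Create Trial flow whenever Databricks sets the demo flag."""
    return "create_trial" if demo else computed_value
```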