[DOCS] key-pair auth for Snowflake (#10751)
klavavej authored Dec 11, 2024
1 parent ca1eb57 commit 6cfaabf
Showing 7 changed files with 93 additions and 7 deletions.
12 changes: 12 additions & 0 deletions docs/docusaurus/docs/cloud/connect/connect_snowflake.md
@@ -15,6 +15,7 @@ import Tabs from '@theme/Tabs';

- You have a [Snowflake account](https://docs.snowflake.com/en/user-guide-admin) with USAGE privileges on the table, database, and schema you are validating, and you have SELECT privileges on the table you are validating.
- Optional. To improve data security, GX recommends using a separate Snowflake user service account to connect to GX Cloud.
- Optional. To streamline automations and improve security, you can connect to Snowflake with key-pair authentication instead of a password. Note that this requires using GX Core in combination with GX Cloud.

- Optional. You can use an existing Snowflake warehouse, but GX recommends creating a separate warehouse for GX Cloud to simplify cost management and optimize performance.

@@ -58,8 +59,18 @@ Depending on your Snowflake permissions, you may need to ask an admin on your te

![Snowflake Run All](/img/run_all.png)


## Connect to a Snowflake Data Source and add a Data Asset

:::tip To connect with key-pair authentication, use GX Core
To connect to a Snowflake Data Source using key-pair authentication instead of a password, do the following using GX Core:

1. [Create a Cloud Data Context](/core/set_up_a_gx_environment/create_a_data_context.md?context_type=gx_cloud).
2. Pass your private key when you [create a Data Source](/core/connect_to_data/sql_data/sql_data.md) in the Cloud Data Context.

Then, you can use GX Cloud to [add a Data Asset](/cloud/data_assets/manage_data_assets.md#add-a-data-asset-from-an-existing-data-source) from that Data Source.
:::

1. In GX Cloud, click **Data Assets** > **New Data Asset** > **New Data Source** > **Snowflake**.

2. Enter a meaningful name for the Data Source in the **Data Source name** field.
@@ -100,3 +111,4 @@ Depending on your Snowflake permissions, you may need to ask an admin on your te

8. Add an Expectation. See [Add an Expectation](/cloud/expectations/manage_expectations.md#add-an-expectation).


@@ -0,0 +1,24 @@

To use key-pair authentication for Snowflake, pass the private key as a connection argument through `kwargs`, in addition to passing connection details with the `connection_string` parameter. The following example shows how to load your private key in Python.

```python title="Python"
import pathlib

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization

PRIVATE_KEY_FILE = pathlib.Path("path/to/my/rsa_key.p8").resolve(strict=True)

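# Load the PEM-encoded private key. The passphrase must be bytes
# (pass None instead if the key file is not encrypted).
p_key = serialization.load_pem_private_key(
    PRIVATE_KEY_FILE.read_bytes(),
    password=b"my_password",
    backend=default_backend(),
)

# Re-serialize the key as unencrypted DER (PKCS8) bytes for the connection arguments.
pkb = p_key.private_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
)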

connect_args = {"private_key": pkb}
```
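
The example above hardcodes the key passphrase for brevity. As a minimal sketch of one alternative, the passphrase could be read from an environment variable. The variable name `SNOWFLAKE_PRIVATE_KEY_PASSPHRASE` and the helper `passphrase_from_env` are hypothetical and are not part of GX or Snowflake.

```python title="Python"
import os
from typing import Optional

# Hypothetical variable name; use whatever your deployment defines.
PASSPHRASE_ENV_VAR = "SNOWFLAKE_PRIVATE_KEY_PASSPHRASE"


def passphrase_from_env(var_name: str = PASSPHRASE_ENV_VAR) -> Optional[bytes]:
    """Return the key passphrase as bytes, or None if the variable is unset or empty."""
    value = os.environ.get(var_name)
    if not value:
        # None is the value to pass for a private key that is not encrypted.
        return None
    return value.encode("utf-8")
```

The result can be passed as `password=passphrase_from_env()` in place of `b"my_password"` when calling `load_pem_private_key`.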
@@ -5,7 +5,9 @@ import TabItem from '@theme/TabItem';
import ConnectionString from './_connection_string.md';
import EnvironmentVariables from './_environment_variables.md';
import ConfigYml from './_config_yml.md';
import KeyPair from './_key_pair.md';
import AccessCredentials from './_access_credentials.md'
import AccessKeyPair from './_access_key_pair.md'



@@ -25,11 +27,11 @@ GX Core also supports referencing credentials that have been stored in the AWS S

<ConnectionString/>

2. Store the credentials required for your connection string.
2. Store the credentials required for your connection.

GX supports the following methods of securely storing credentials. Chose one to implement for your connection string:
GX supports the following methods of securely storing credentials. Choose one to implement for your connection:

<Tabs queryString="storage_type" groupId="storage_type" defaultValue='environment_variables' values={[{label: 'Environment Variables', value:'environment_variables'}, {label: 'config.yml', value:'config_yml'}]}>
<Tabs queryString="storage_type" groupId="storage_type" defaultValue='environment_variables' values={[{label: 'Environment Variables', value:'environment_variables'}, {label: 'config.yml', value:'config_yml'}, {label: 'Key pair (Snowflake only)', value:'key_pair'}]}>

<TabItem value="environment_variables">
<EnvironmentVariables/>
@@ -39,11 +41,29 @@ GX Core also supports referencing credentials that have been stored in the AWS S
<ConfigYml/>
</TabItem>

<TabItem value="key_pair">
<KeyPair/>
</TabItem>

</Tabs>

3. Access your credentials in Python strings.

<AccessCredentials/>
<Tabs className="hidden" queryString="storage_type" groupId="storage_type" defaultValue='environment_variables'>

<TabItem value="environment_variables">
<AccessCredentials/>
</TabItem>

<TabItem value="config_yml">
<AccessCredentials/>
</TabItem>

<TabItem value="key_pair">
<AccessKeyPair/>
</TabItem>

</Tabs>

4. Optional. Access credentials stored in a secret manager.

@@ -1,4 +1,4 @@
import ConnectionStringTable from './_connection_string_reference_table.md';
import ConnectionStringTable from './_connection_string_reference_table.mdx';

Different types of SQL database have different formats for their connection string. In the following table, the text in `<>` corresponds to the values specific to your credentials and connection string.

@@ -3,7 +3,7 @@
|-----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PostgreSQL | `postgresql+psycopg2://<USERNAME>:<PASSWORD>@<HOST>:<PORT>/<DATABASE>` |
| SQLite | `sqlite:///<PATH_TO_DB_FILE>` |
| Snowflake | `snowflake://<USER_NAME>:<PASSWORD>@<ACCOUNT_NAME>/<DATABASE_NAME>/<SCHEMA_NAME>?warehouse=<WAREHOUSE_NAME>&role=<ROLE_NAME>&application=great_expectations_oss` |
| Snowflake | `snowflake://<USER_NAME>:<PASSWORD>@<ACCOUNT_NAME>/<DATABASE_NAME>/<SCHEMA_NAME>?warehouse=<WAREHOUSE_NAME>&role=<ROLE_NAME>&application=great_expectations_oss`<br />You have the option to connect to Snowflake with key-pair authentication instead of a password.|
| Databricks SQL | `databricks://token:<TOKEN>@<HOST>:<PORT>?http_path=<HTTP_PATH>&catalog=<CATALOG>&schema=<SCHEMA>` |
| BigQuery SQL | `bigquery://<GCP_PROJECT>/<BIGQUERY_DATASET>?credentials_path=/path/to/your/credentials.json` |
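
Values substituted into these templates may need URL-encoding when they contain characters such as `@` or `/`. The following is a minimal sketch for the Snowflake format in the table above, using only the Python standard library. The helper name `snowflake_connection_string` is hypothetical and is not a GX API.

```python title="Python"
from urllib.parse import quote_plus


def snowflake_connection_string(
    user: str,
    password: str,
    account: str,
    database: str,
    schema: str,
    warehouse: str,
    role: str,
) -> str:
    """Fill in the Snowflake template from the table, URL-encoding the substituted values."""
    return (
        f"snowflake://{quote_plus(user)}:{quote_plus(password)}@{account}"
        f"/{quote_plus(database)}/{quote_plus(schema)}"
        f"?warehouse={quote_plus(warehouse)}&role={quote_plus(role)}"
        "&application=great_expectations_oss"
    )
```

For example, a password of `p@ss/w0rd` is encoded as `p%40ss%2Fw0rd`, so the `@` that separates the credentials from the account name stays unambiguous.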

@@ -0,0 +1,3 @@
If you're connecting to Snowflake, you can use key-pair authentication instead of a password. This improves security and can be helpful for automations.

Follow Snowflake's docs to [configure and store the private and public keys](https://docs.snowflake.com/en/user-guide/key-pair-auth).
@@ -54,8 +54,35 @@ import DatasourceMethodReferenceTable from './_datasource_method_reference_table

```python title="Python" name="docs/docusaurus/docs/core/connect_to_data/sql_data/_create_a_data_source/postgres.py create data source"
```
4. Optional. If you're connecting to Snowflake and want to use key-pair authentication instead of a password, pass the private key with `kwargs`. Note that a placeholder password is still required to pass the configuration validation, but the password will not be used if a `private_key` is provided.

```python title="Python"
# For details on how to access your private key, refer to "Configure credentials" above
connect_args = {"private_key": pkb}

connection_details = {
    "account": "accountname.region",
    "user": "my_user",
    "role": "my_role",
    "password": "placeholder_value",  # must be provided to pass validation but will be ignored
    "warehouse": "my_wh",
    "database": "my_db",
    "schema": "my_schema",
}

data_source = context.sources.add_snowflake(
    name=datasource_name,
    connection_string=connection_details,
    kwargs={"connect_args": connect_args},
)
```

:::warning Private key serialized in File Data Context
If you're using a [File Data Context](/core/set_up_a_gx_environment/create_a_data_context.md), `kwargs` will be serialized to `great_expectations.yml`, including the private key.
:::


4. Optional. Verify the Data Source is connected:
5. Optional. Verify the Data Source is connected:

```python title="Python" name="docs/docusaurus/docs/core/connect_to_data/sql_data/_create_a_data_source/postgres.py verify data source"
```