dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.
ODPS is the former name of Alibaba Cloud MaxCompute. This adapter bridges dbt and MaxCompute by wrapping PyODPS.
MaxCompute Features:
Feature | Status |
---|---|
Partition Table | ❎ |
Cluster Table | ❎ |
External Table | ❎ |
Table Properties | ❎ |
DBT features:
Name | Status |
---|---|
Materialization: Table | ✅ |
Materialization: View | ✅ |
Materialization: Incremental - Append | ✅ |
Materialization: Incremental - Insert+Overwrite | ✅ |
Materialization: Incremental - Merge | ✅ |
Materialization: Ephemeral | ✅ |
Seeds | ✅ |
Tests | ✅ |
Snapshots | ✅1 |
Documentation | ✅ |
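To illustrate the incremental materializations listed above, here is a minimal model sketch. The model name, source, and columns are hypothetical; the strategy can be switched between `append`, `insert_overwrite`, and `merge` as supported by the table above.

```sql
-- models/my_incremental_model.sql -- minimal sketch; names and columns are hypothetical
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',   -- or 'append' / 'insert_overwrite'
        unique_key='id'
    )
}}

select
    id,
    event_type,
    updated_at
from {{ source('raw', 'events') }}

{% if is_incremental() %}
  -- only pick up rows newer than what is already in the target table
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```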
Install the adapter from source with `python setup.py install --force`, or from PyPI with `pip install dbt-odps-winwin`.
After installing dbt-odps, run the following command to scaffold a project:
dbt init
Read more about connection profiles here: https://docs.getdbt.com/docs/core/connection-profiles
Configuration options:
Property | Description | Example |
---|---|---|
endpoint | The MaxCompute endpoint; see https://help.aliyun.com/document_detail/34951.html | http://service.cn-shanghai.maxcompute.aliyun.com/api |
database | The MaxCompute project name, which you can find at https://maxcompute.console.aliyun.com/{your area}/project-list | odps-test-project |
schema | The schema to use; use `default` if you are unsure | default |
access_id | Your Alibaba Cloud AccessKey ID | LTAXXXXXXXXX |
secret_access_key | Your Alibaba Cloud AccessKey secret | bZXXXXXXXXXX |
type | Must be set to `odps` | odps |
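Putting these options together, a minimal `profiles.yml` sketch might look like the following. The profile name `odps_project` is hypothetical, and the credential values are placeholders taken from the example column above.

```yaml
# ~/.dbt/profiles.yml -- minimal sketch; all values are placeholders
odps_project:
  target: dev
  outputs:
    dev:
      type: odps
      endpoint: http://service.cn-shanghai.maxcompute.aliyun.com/api
      database: odps-test-project        # your MaxCompute project name
      schema: default
      access_id: LTAXXXXXXXXX            # Alibaba Cloud AccessKey ID
      secret_access_key: bZXXXXXXXXXX    # Alibaba Cloud AccessKey secret
```

The profile name must match the `profile` setting in your `dbt_project.yml`.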
- Note on snapshots: when using the merge statement, ODPS requires the target table to be a transactional table, so the snapshot table must be created before the select runs. Under the hood, the structure of the first referenced table is used to create the snapshot table, so that data source must be a table; views are not supported (see the sketch below).
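For illustration, a minimal snapshot sketch. The snapshot name, source, and columns are hypothetical; as noted above, the selected source must be a table rather than a view.

```sql
-- snapshots/orders_snapshot.sql -- minimal sketch; names are hypothetical
{% snapshot orders_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

-- the first referenced relation must be a table (not a view),
-- since its structure is used to create the transactional snapshot table
select * from {{ source('raw', 'orders') }}

{% endsnapshot %}
```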
Read more in the dbt docs: "What are adapters?"