Introduction to the VDK SDK

One framework to 🧑‍💻 Develop, ▶️ Deploy, and 📊 Operate data workflows with Python and SQL.

🎯 Write shorter, more readable code.

🔄 Ready-to-use data ETL/ELT patterns.

🧩 Lego-like extensibility.

🚀 Single-click deployment.

🛠 Operate and monitor.

Framework to simplify data ingestion and data processing.
Write any code using Python or SQL.
A toolset enabling you to run data jobs.

Get started with VDK SDK:

➡ Install Quickstart VDK. The only requirement is Python 3.7+.

pip install quickstart-vdk
vdk --help

➡ Develop your First Data Job if you want to get started right away.

Data Ingestion

Extract data from various sources (HTTP APIs, Databases, CSV, etc.).
Ensure data fidelity with minimal transformations.
Load data to your preferred destination (database, cloud storage).
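As a rough sketch of the extract-and-load flow above, a Python step in a data job might read a CSV source and send each row for ingestion unchanged. VDK calls a step's `run(job_input)` function, and `send_object_for_ingestion` is part of the job input interface. The sample CSV content, the `extract_rows` helper, and the `sales` table name are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for a real source (file, HTTP API, database).
SAMPLE_CSV = "id,product,amount\n1,apple,3\n2,pear,5\n"


def extract_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row, without transforming values."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def run(job_input):
    """Step entry point: VDK passes the job input object when it runs the step."""
    for row in extract_rows(SAMPLE_CSV):
        # The destination table name "sales" is an assumption for this sketch.
        job_input.send_object_for_ingestion(payload=row, destination_table="sales")
```

Keeping the extraction in its own function makes it easy to check the parsed rows without a configured ingestion target.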

Ingestion examples:

➡ Ingesting data from REST API into Database
➡ Ingesting data from DB into Database
➡ Ingesting local CSV file into Database
➡ Incremental ingestion using Job Properties
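Incremental ingestion with Job Properties can be sketched roughly as follows: the step reads a stored watermark, ingests only newer records, and then saves the new watermark. `get_property`, `get_all_properties`, and `set_all_properties` belong to the job input's properties interface. The in-memory `records` list, the `last_ingested_id` property name, and the `events` table are hypothetical.

```python
def run(job_input):
    """Ingest only records newer than the stored watermark, then update it."""
    # Hypothetical source records; a real job would query an API or database.
    records = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]

    # The property name "last_ingested_id" is an assumption for this sketch.
    last_id = int(job_input.get_property("last_ingested_id", 0))
    new_records = [r for r in records if r["id"] > last_id]

    for record in new_records:
        job_input.send_object_for_ingestion(payload=record, destination_table="events")

    if new_records:
        # Read-modify-write so other stored properties are preserved.
        props = job_input.get_all_properties()
        props["last_ingested_id"] = max(r["id"] for r in new_records)
        job_input.set_all_properties(props)
```

Running the step a second time against the same source would then send nothing, because every record id is at or below the saved watermark.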

Data Transformation

SQL and Python parameterized transformations.
Extensible templates for data modeling.
Create a dataset or table as a product.
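A parameterized transformation in a Python step might look like the minimal sketch below. It reads a job argument and runs a SQL statement that materializes a table. `get_arguments` and `execute_query` are part of the job input interface. The `orders` and `large_orders` tables, the `min_amount` argument, and the `build_query` helper are hypothetical, and the SQL assumes a database that supports `CREATE TABLE ... AS SELECT`.

```python
def build_query(min_amount):
    """Return SQL that materializes large orders into a product table."""
    # Casting to int ensures only a number is interpolated into the SQL text.
    threshold = int(min_amount)
    return (
        "CREATE TABLE IF NOT EXISTS large_orders AS "
        f"SELECT id, amount FROM orders WHERE amount >= {threshold}"
    )


def run(job_input):
    """Step entry point: build the query from job arguments and execute it."""
    args = job_input.get_arguments()
    # The argument name "min_amount" and its default are assumptions.
    job_input.execute_query(build_query(args.get("min_amount", 100)))
```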

Get started with transforming data:

➡ Data Modeling: Treating Data as a Product
➡ Processing data using SQL and local database
➡ Processing data using Kimball warehousing templates

Data Job Deployment (build, deploy, release)

VDK Control Service provides a REST API for users to create, deploy, manage, and execute data jobs in a Kubernetes runtime environment.

Scheduling, packaging, dependency management, deployment.
Execution management and monitoring.
Source code versioning and tracking. Fast rollback.
Manage state and credentials using Properties and Secrets.
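To illustrate the scheduling item above: a data job's schedule is typically declared in its `config.ini` as a cron expression. The sketch below only parses such a file with Python's standard `configparser` and checks that the expression has five fields; it is not how the Control Service itself reads the file. The `[owner]`/`[job]` layout reflects the usual data job config, while the team name, the cron value, and the `read_schedule` helper are placeholders.

```python
import configparser

# Sample config.ini content; the team name and cron expression are placeholders.
SAMPLE_CONFIG = """
[owner]
team = my-team

[job]
schedule_cron = 0 6 * * *
"""


def read_schedule(config_text):
    """Return (team, cron) from config.ini text, checking the cron field count."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    cron = parser.get("job", "schedule_cron")
    if len(cron.split()) != 5:
        raise ValueError(f"expected 5 cron fields, got: {cron!r}")
    return parser.get("owner", "team"), cron
```

A quick check like this before deploying can catch a malformed schedule early.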

Get started with deploying jobs in the Control Service:

➡ Install Local Control Service with vdk server --install
➡ Scheduling a Data Job for automatic execution
➡ Using VDK DAGs to orchestrate Data Jobs

Operations and Monitoring

Use the Operations UI to monitor and troubleshoot data workloads in production.
Get notifications for errors during Data Job deployment or execution.
Route errors to the right people by classifying them as User or Platform errors.
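The idea of routing by error class can be illustrated with a small, purely hypothetical sketch. This is not VDK's actual classification logic: the rule used here (infrastructure-style exceptions count as Platform errors, everything else as User errors), the `classify` and `route` helpers, and the contact addresses are all assumptions.

```python
# Exception types treated as infrastructure problems in this sketch (an assumption).
PLATFORM_EXCEPTIONS = (ConnectionError, TimeoutError)


def classify(exc):
    """Label an exception as a 'Platform' or 'User' error."""
    return "Platform" if isinstance(exc, PLATFORM_EXCEPTIONS) else "User"


def route(exc, contacts):
    """Pick the contact that should be notified for this exception."""
    return contacts[classify(exc)]


# Hypothetical contacts: job owners handle User errors, operators handle Platform errors.
CONTACTS = {"User": "job-owner@example.com", "Platform": "platform-team@example.com"}
```

With this rule, a `ValueError` raised by job code would go to the job owner, while a `TimeoutError` would go to the platform team.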

Get started with operating and monitoring data jobs:

➡ Versatile Data Kit UI - Installation and Getting Started
➡ VDK Operations User Interface - Versatile Data Kit

Lego-like extensibility

Modular: use only what you need. Extensible: build what you miss.
Install any plugin as a Python package using pip.
Plugins can enhance data processing, ingestion, job execution, and the command-line lifecycle.

Get started with using some VDK plugins:

➡ Browse available plugins
➡ Interesting plugins to check out:
Track Lineage of your jobs using vdk-lineage
Import/Ingest or Export CSV files using vdk-csv
➡ Write your own plugin

Support and Contributing

For support, join our Slack channel, or create an issue or pull request on GitHub to submit suggestions or changes.
If you are interested in contributing as a developer, visit the contributing page.

Contacts

Message us on Slack:
☝️ Join the CNCF Slack workspace.
✌️ Join the #versatile-data-kit channel.
Join the next Community Meeting.
Follow us on Twitter.
Subscribe to the Versatile Data Kit YouTube Channel.
Join our development mailing list, used by developers and maintainers of VDK.

Code of Conduct

Everyone working on the project's source code, or engaging in any issue tracker, Slack channel, or mailing list, is expected to be familiar with and follow the Code of Conduct.
