This scenario demonstrates how to use Database Migration Service (DMS) to create change data capture (CDC) and full load tasks using the Cloud Development Kit in Python. It is a self-contained setup that will create a VPC to host the DMS replication instance, an S3 bucket as the data source, a Kinesis stream as the target, and the replication tasks.
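At a high level, the stack wires these pieces together in CDK Python. Below is a minimal, illustrative sketch of the core resources; the construct IDs and instance sizing are assumptions rather than the exact code in this repo, and the S3/Kinesis endpoints and the two replication tasks are omitted for brevity.

```python
from aws_cdk import Stack, aws_dms as dms, aws_ec2 as ec2, aws_kinesis as kinesis, aws_s3 as s3
from constructs import Construct


class DmsS3ToKinesisSketch(Stack):
    """Illustrative sketch only; not the exact stack shipped in this repo."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # VPC that hosts the DMS replication instance
        vpc = ec2.Vpc(self, "Vpc", max_azs=1)

        # S3 bucket acting as the DMS source
        s3.Bucket(self, "SourceBucket")

        # Kinesis stream acting as the DMS target
        kinesis.Stream(self, "TargetStream")

        # DMS needs a subnet group inside the VPC plus a replication instance
        subnet_group = dms.CfnReplicationSubnetGroup(
            self, "SubnetGroup",
            replication_subnet_group_description="Subnets for the DMS replication instance",
            subnet_ids=[subnet.subnet_id for subnet in vpc.private_subnets],
        )
        dms.CfnReplicationInstance(
            self, "ReplicationInstance",
            replication_instance_class="dms.t2.micro",
            replication_subnet_group_identifier=subnet_group.ref,
        )
```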
- LocalStack Auth Token
- Python 3.10 & `pip`
- Docker Compose
- CDK with the `cdklocal` wrapper
Start LocalStack Pro with the `LOCALSTACK_AUTH_TOKEN` pre-configured:
export LOCALSTACK_AUTH_TOKEN=<your-auth-token>
docker-compose up
The Docker Compose file starts the LocalStack Pro container and a MariaDB container. The MariaDB container is used to showcase how to reach a database external to LocalStack.
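For reference, DMS would reach that external MariaDB through a source endpoint. The following is a hedged, illustrative CDK sketch; the host name, credentials, and database name are placeholders, not the values used by this scenario (whose tasks use S3 as the source):

```python
from aws_cdk import Stack, aws_dms as dms
from constructs import Construct


class ExternalMariaDbEndpointSketch(Stack):
    """Illustrative only: a DMS source endpoint pointing at the MariaDB container."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        dms.CfnEndpoint(
            self, "MariaDbSource",
            endpoint_type="source",
            engine_name="mariadb",
            server_name="mariadb",   # hostname of the MariaDB container on the Docker network
            port=3306,
            username="admin",        # placeholder credentials
            password="password",
            database_name="sample",
        )
```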
Install all the dependencies by running the following command:
make install
To deploy the infrastructure, you can run the following command:
make deploy
After successful deployment, you will see the following output:
Outputs:
DmsS3ToKinesisStack.cdcTask = arn:aws:dms:us-east-1:000000000000:task:WLVQO3DQTCC31K5OGLN9ZQI1J6X88GF9T4PKLY8
DmsS3ToKinesisStack.fullLoadTask = arn:aws:dms:us-east-1:000000000000:task:AJGNKCENBYN0UZFYCLIU62WEH9NF62713YTKJZU
DmsS3ToKinesisStack.kinesisStream = arn:aws:kinesis:us-east-1:000000000000:stream/DmsS3ToKinesisStack-TargetStream3B4B2880-6e4a4e31
DmsS3ToKinesisStack.s3Secret = arn:aws:secretsmanager:us-east-1:000000000000:secret:DmsS3ToKinesisStack-s3accesssecret988D0EB7-79f34adc-scxavJ
DmsS3ToKinesisStack.sourceBucket = source-bucket-s3-kinesis-dms
Stack ARN:
arn:aws:cloudformation:us-east-1:000000000000:stack/DmsS3ToKinesisStack/edecb02a
✨ Total time: 36.43s
You can run the tasks by executing the following command:
make run
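Under the hood, the run step drives the tasks with the AWS SDK and inspects the target stream. A rough boto3 sketch is shown below; the LocalStack endpoint, test credentials, and the ARN/stream-name placeholders are assumptions you would fill in from the stack outputs:

```python
import json

import boto3

LOCALSTACK_ENDPOINT = "http://localhost:4566"
session = boto3.Session(
    aws_access_key_id="test", aws_secret_access_key="test", region_name="us-east-1"
)
dms = session.client("dms", endpoint_url=LOCALSTACK_ENDPOINT)
kinesis = session.client("kinesis", endpoint_url=LOCALSTACK_ENDPOINT)

task_arn = "<fullLoadTask ARN from the stack outputs>"
stream_name = "<TargetStream name from the stack outputs>"

# Kick off the full-load replication task
dms.start_replication_task(
    ReplicationTaskArn=task_arn,
    StartReplicationTaskType="start-replication",
)

# Read whatever has been replicated into the target Kinesis stream so far
shard_id = kinesis.describe_stream(StreamName=stream_name)["StreamDescription"]["Shards"][0]["ShardId"]
shard_iterator = kinesis.get_shard_iterator(
    StreamName=stream_name, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]
for record in kinesis.get_records(ShardIterator=shard_iterator)["Records"]:
    print(json.loads(record["Data"]))
```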
Two tasks are deployed with the stack:
First, a full load replication task runs against the S3 source:
- Creates three tables: `employee`, `department`, `project`
- Uploads sample CSV data for each table to the S3 bucket (see the upload sketch after this list)
- Starts the full load task
- Captures and logs Kinesis events for the uploaded data: 3 create table, 3 drop table, 10 inserts
- Logs `table_statistics` for the task
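The sample CSV data mentioned above has to land in the bucket in the folder layout DMS expects for an S3 source (one prefix per schema and table). A hedged sketch of such an upload; the schema name, file name, and row contents are made up for illustration:

```python
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",
    aws_access_key_id="test",
    aws_secret_access_key="test",
    region_name="us-east-1",
)

# DMS reads full-load data from <schema>/<table>/ prefixes in the source bucket
s3.put_object(
    Bucket="source-bucket-s3-kinesis-dms",
    Key="public/employee/LOAD001.csv",
    Body="101,Jane,Engineering\n102,John,Marketing\n",
)
```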
Next, a CDC replication task runs against the S3 source:
- Creates three tables: `employee`, `department`, `project`
- Uploads sample CSV data for each table to the S3 bucket
- Captures and logs Kinesis events for the uploaded data: 3 create table, 4 inserts, 3 updates, 4 deletes, 1 table for `awsdms_apply_exceptions`
- Logs `table_statistics` for the task
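The `table_statistics` logged for each task can be fetched with a single DMS API call. A minimal sketch, assuming the LocalStack endpoint and a task ARN taken from the stack outputs:

```python
import boto3

dms = boto3.client(
    "dms",
    endpoint_url="http://localhost:4566",
    aws_access_key_id="test",
    aws_secret_access_key="test",
    region_name="us-east-1",
)

task_arn = "<cdcTask ARN from the stack outputs>"
stats = dms.describe_table_statistics(ReplicationTaskArn=task_arn)
for table in stats["TableStatistics"]:
    print(table["TableName"], table["Inserts"], table["Updates"], table["Deletes"])
```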
Both tasks target the same Kinesis stream.
You can deploy and run the stack on AWS by running the following commands:
make deploy-aws
make run-aws
This project is licensed under the Apache 2.0 License.