Utility to unload old data from Redshift table-by-table

ello/redshift-unloader

Redshift Pruner

This is a simple schedulable script that lets us remove outdated Segment Warehouse data from Redshift.

It's useful for automatically archiving data that falls outside a given retention window (e.g. data older than 6 months).

Configuring

The script relies on a handful of environment variables:

  • REDSHIFT_URL - URL to use to connect to the target Redshift instance/database
  • ONLY_TABLES - (optional) comma-separated list of fully qualified table names (i.e. schema.tablename); when set, pruning is limited to these tables
  • RETENTION_INTERVAL - (optional, defaults to 365 days) prune data older than this interval, given as a Postgres interval value
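
As an illustration of how these three variables fit together, here is a minimal sketch of reading them into a configuration hash. The helper name load_pruner_config and the example URL are hypothetical and are not taken from this repository; only the variable names and the 365-day default come from the list above.

```ruby
# Hypothetical helper (not part of the repository): reads the environment
# variables described above into a plain Hash.
def load_pruner_config(env = ENV)
  url = env['REDSHIFT_URL'].to_s.strip
  raise ArgumentError, 'REDSHIFT_URL must be set' if url.empty?

  # ONLY_TABLES is optional; an unset or blank value yields an empty list,
  # which this sketch treats as "no restriction".
  only_tables = env['ONLY_TABLES'].to_s.split(',').map(&:strip).reject(&:empty?)

  # RETENTION_INTERVAL is optional and falls back to the documented default.
  interval = env['RETENTION_INTERVAL'].to_s.strip
  interval = '365 days' if interval.empty?

  { url: url, only_tables: only_tables, retention_interval: interval }
end

# Example with a stand-in Hash instead of the real ENV (values are made up).
config = load_pruner_config(
  'REDSHIFT_URL' => 'postgres://user:secret@example.com:5439/warehouse',
  'ONLY_TABLES'  => 'production.pages, production.tracks'
)
```

Passing a Hash in place of ENV keeps the sketch easy to try without touching the real environment.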

With those set, you can run the following Rake tasks:

  • rake redshift:list_tables - list all tables in the database along with the count of prunable rows in each
  • rake redshift:prune_all - prune data older than the specified threshold from all tables in the database
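
To make the two tasks more concrete, the sketch below builds the kind of SQL they might issue for a single table: a count of prunable rows and a matching delete. This is an assumption-laden illustration, not the repository's implementation. In particular, the received_at column (a timestamp column commonly present in Segment warehouse tables), the constant TIMESTAMP_COLUMN, and all helper names are hypothetical.

```ruby
# Hypothetical sketch (not the repository's code). The timestamp column is an
# assumption; the README does not say which column decides a row's age.
TIMESTAMP_COLUMN = 'received_at'

# Quote an identifier, doubling any embedded double quotes.
def quote_ident(name)
  '"' + name.gsub('"', '""') + '"'
end

# Quote a string literal, doubling any embedded single quotes.
def quote_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

# Turn "schema.tablename" into a quoted, fully qualified name.
def qualified_name(table)
  schema, name = table.split('.', 2)
  if schema.nil? || schema.empty? || name.nil? || name.empty?
    raise ArgumentError, "expected schema.tablename, got #{table.inspect}"
  end
  "#{quote_ident(schema)}.#{quote_ident(name)}"
end

# Shared WHERE clause: rows older than the retention interval.
def prunable_condition(interval)
  "#{quote_ident(TIMESTAMP_COLUMN)} < GETDATE() - INTERVAL #{quote_literal(interval)}"
end

# Roughly what a list_tables-style task could run per table.
def count_prunable_sql(table, interval)
  "SELECT COUNT(*) FROM #{qualified_name(table)} WHERE #{prunable_condition(interval)}"
end

# Roughly what a prune_all-style task could run per table.
def prune_sql(table, interval)
  "DELETE FROM #{qualified_name(table)} WHERE #{prunable_condition(interval)}"
end

puts count_prunable_sql('production.pages', '365 days')
puts prune_sql('production.pages', '365 days')
```

Sharing one condition between the count and the delete keeps the number reported by a listing step consistent with what a prune step would remove.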

Setup

You'll need a working Ruby 2.3 setup with Bundler.

Once those are set up, you can install dependencies with bundle install.

License

Redshift Unloader is released under the MIT License.

Code of Conduct

Ello was created by idealists who believe that the essential nature of all human beings is to be kind, considerate, helpful, intelligent, responsible, and respectful of others. To that end, we will be enforcing the Ello rules within all of our open source projects. If you don’t follow the rules, you risk being ignored, banned, or reported for abuse.
