
Database Restore

MCatherine edited this page Aug 19, 2024 · 9 revisions

Instructions

The database restore cron job is configured in the db/openshift.deploy.yml file. Its schedule is set to midnight on February 31st, a date that does not exist, so the job never runs automatically. We trigger it manually whenever a restore is needed.
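A midnight-on-February-31st schedule is usually written as the cron expression 0 0 31 2 *. The exact value in db/openshift.deploy.yml is not quoted on this page, so treat that expression as an assumption. A minimal sketch for checking what is deployed, where show_restore_schedule is a hypothetical helper name and fom-24-db-restore is the example CronJob name used later on this page:

```shell
# Print the schedule of a CronJob in the current OpenShift project.
# The default name is the example from this page; yours may differ.
show_restore_schedule() {
  cronjob="${1:-fom-24-db-restore}"
  oc get cronjob "$cronjob" -o jsonpath='{.spec.schedule}'
}

# Usage (requires an active oc login):
#   show_restore_schedule fom-24-db-restore
```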

The cron job uses the same image as our database, so it includes the Postgres commands needed for the restore. It also mounts the same persistent volume that our database backup cron job uses, so it has access to all the backup files. Note: do not run the restore at the same time as the backup, because the persistent volume can only be used by one container at a time.

FOM uses the OpenShift backup-container for database backups, so the backup files are stored under the backups directory in three folders: daily, weekly, and monthly. An example can be seen below (taken from an older project that also uses backup-container):

  • Daily backups are stored under daily
  • Backups taken on a Sunday are stored under weekly
  • Backup files are zipped SQL files
[Screenshot: example backup directory layout]
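When choosing a value for RESTORE_DIR, it can help to find the newest backup file in one of these folders. The sketch below is only an illustration: latest_backup is a hypothetical helper, the .sql.gz extension is an assumption about how the zipped SQL files are named, and it must run somewhere the backup volume is mounted.

```shell
# Print the most recently modified backup file under a backup folder.
# Assumes backup files end in .sql.gz; adjust the pattern if yours differ.
latest_backup() {
  dir="$1"
  # find collects matching files (including dated subfolders);
  # ls -t sorts them by modification time, newest first.
  find "$dir" -type f -name '*.sql.gz' -exec ls -1t {} + | head -n 1
}

# Usage (the path is illustrative):
#   latest_backup /backups/daily
```

The directory that contains the printed file is a candidate for RESTORE_DIR.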

Restore process

  • Find the CronJob config in OpenShift and update the environment variables in the YAML file
  • RESTORE_DIR is the directory that contains the backup file you want to restore; TARGET_HOST is the IP address of the database pod you want to restore into
  • Log in to OpenShift from the command line
  • Create a job from the CronJob: oc create job --from=cronjob/[db_restore_cron_job_name] [job_name_can_be_any]. For example: oc create job --from=cronjob/fom-24-db-restore db-backup-restore
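The steps above can be sketched as a few shell helpers. This is one possible approach, not the team's documented procedure: the function names are hypothetical, and oc set env is shown as an alternative to editing the YAML by hand. The pod name, directory, and IP in the usage comments are placeholders.

```shell
# Print the IP address of a database pod (a candidate value for TARGET_HOST).
pod_ip() {
  oc get pod "$1" -o jsonpath='{.status.podIP}'
}

# Set the restore variables on the CronJob from the command line.
set_restore_env() {
  cronjob="$1"; restore_dir="$2"; target_host="$3"
  oc set env "cronjob/$cronjob" \
    "RESTORE_DIR=$restore_dir" "TARGET_HOST=$target_host"
}

# Create a one-off Job from the CronJob, as in the last step above.
create_restore_job() {
  cronjob="$1"; job="$2"
  oc create job --from="cronjob/$cronjob" "$job"
}

# Usage (requires an active oc login; values are placeholders):
#   ip="$(pod_ip <database_pod_name>)"
#   set_restore_env fom-24-db-restore <restore_dir> "$ip"
#   create_restore_job fom-24-db-restore db-backup-restore
```

Make sure the backup cron job is not running before creating the restore job, since both use the same persistent volume.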

The restore job finishes very quickly. Watch it closely while it runs so that you can catch the error logs if anything goes wrong.
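One way to keep an eye on the job from the command line is sketched below. The helper names are hypothetical, and db-backup-restore is the example job name from this page. Because the job's pod may not have started yet when the logs are requested, rerunning the command after a few seconds may be needed.

```shell
# List the pods of a restore Job, then stream its logs.
watch_restore_job() {
  job="$1"
  # Kubernetes labels the pods of a Job with job-name=<job>.
  oc get pods -l "job-name=$job"
  oc logs -f "job/$job"
}

# Print the number of pods that completed successfully (empty if none yet).
restore_job_succeeded() {
  oc get job "$1" -o jsonpath='{.status.succeeded}'
}

# Usage (requires an active oc login):
#   watch_restore_job db-backup-restore
#   restore_job_succeeded db-backup-restore
```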

During testing, we dropped the existing fom database, manually created a new empty one, and then ran the restore job, to avoid conflict errors (such as "table already exists"). We should not do this in TEST or PROD because it is unsafe. A possible approach instead:

  • Create a new database that connects to a new volume (without running any data migration scripts)
  • Run the database restore on top of the new database
  • Point the API deployment at the new database and make sure everything works
  • Handle the old database (either remove it or rename it)
  • Rename the new database and new volume to follow our deployment template, so they can be picked up during our regular deployment process