
Disaster Recovery

Daniel Glen edited this page Jun 21, 2024 · 18 revisions

Runbooks

Cloud Platform Support regarding a DR

See the following link for the CP on-call process.


See the following link for user guide information. It describes how we access DR services from the Cloud Platform team.


I want to restore a file to a previous version

You need to roll back to the version you would like to restore.

  1. Verify on the AWS console that a backup version exists.
  2. Connect to the service pod: $ kubectl exec -n $namespace --stdin --tty $pod_name -- /bin/sh
  3. Once connected, run aws s3api list-object-versions --bucket bucketname --prefix path/to/file. This lists all versions of the file with their LastModified dates, newest first.
  4. Delete every version newer than the one you would like to roll back to, so that it becomes the latest version.
  5. To delete a version, run aws s3api delete-object --bucket bucketname --key path/to/file --version-id versionId
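
Steps 3 to 5 can be scripted. The following is a minimal sketch rather than a tested procedure: versions_newer_than and rollback_to_version are hypothetical helper names, and it assumes that list-object-versions returns the versions of a key newest first. It ignores delete markers, so review the output of step 3 before relying on it.

```shell
#!/bin/sh
# Read version IDs (one per line, newest first) on stdin and print every ID
# that appears before the target. Prints nothing if the target is not found,
# so a mistyped version ID never causes a deletion.
versions_newer_than() {
  awk -v target="$1" '
    $0 == target { found = 1; exit }
    { newer[n++] = $0 }
    END { if (found) for (i = 0; i < n; i++) print newer[i] }
  '
}

# Hypothetical wrapper: delete every version of KEY that is newer than TARGET.
# Usage: rollback_to_version <bucket> <key> <target-version-id>
rollback_to_version() {
  bucket="$1"; key="$2"; target="$3"
  aws s3api list-object-versions --bucket "$bucket" --prefix "$key" \
    --query "Versions[?Key=='$key'].VersionId" --output text |
    tr '\t' '\n' |
    versions_newer_than "$target" |
    while read -r version_id; do
      echo "Deleting version $version_id"
      aws s3api delete-object --bucket "$bucket" --key "$key" \
        --version-id "$version_id"
    done
}
```

For example, rollback_to_version bucketname path/to/file versionId deletes every version listed ahead of versionId and leaves versionId as the latest. If the target ID does not appear in the listing, nothing is deleted.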

I want to restore my database to a previous snapshot

April 2024 - 34 mins to recreate DB from snapshot

Snapshots are currently taken daily and can also be manually taken whenever necessary

  1. Get the db-instance-identifier of the DB instance you want to restore using the Cloud Platform CLI: cloud-platform decode-secret -n hale-platform-env -s rds-instance-output

  2. Look for the line containing ‘rds_instance_address’, e.g. cloud-platform-gffgfg.hthf.eu-west-2.rds.amazonaws.com

  3. Search for snapshots of the db using ...

  4. Locate the DBSnapshotIdentifier of the snapshot you would like to restore to, e.g. rds:cloud-platform-gffgfg.hthf-2024-03-20-06-18

  5. Update rds.tf in cloud-platform-environments for the hale environment you would like to restore.

  6. Add the line snapshot_identifier = "cloud-platform-gffgfg.hthf-2024-03-20-06-18"

  7. Your RDS db_allocated_storage is autoscaled up to a maximum of 10000 by default unless you have set a specific number. Hence, the actual db_allocated_storage may be higher than the default (10).

  8. Check the actual “AllocatedStorage” value of your RDS instance using the command aws rds describe-db-instances --db-instance-identifier <your-rds-db-identifier>. The identifier was located in step 1.

  9. If that differs from what is set for your RDS, update your RDS manifest to include the actual value by adding db_allocated_storage = "<the actual value you got from the above cli command>"

  10. In the PR add a file called APPLY_PIPELINE_SKIP_THIS_NAMESPACE in the root of the namespace

  11. Reach out to the cloud platform team using #ask-cloud-platform stating that you would like to invoke DR and you are restoring a DB to a snapshot. They need to rename the old one to allow the new one to be created with the same name and permissions.

  12. You will then be notified to merge your PR which will then recreate the DB with the same name and state from the snapshot.

  13. Confirm with CP that the restore has been successful so they can clear up the old DBs.
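
The storage check in steps 8 and 9 can be sketched in shell. This is an illustration only: actual_allocated_storage and storage_override_line are hypothetical helper names, and the configured value passed as the second argument (10 here, the default mentioned in step 7) is an assumption you should replace with whatever your rds.tf sets.

```shell
#!/bin/sh
# Print the AllocatedStorage value reported by AWS for an RDS instance.
# Usage: actual_allocated_storage <your-rds-db-identifier>
actual_allocated_storage() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].AllocatedStorage' \
    --output text
}

# Print the line to add to the RDS manifest when the actual value differs
# from the configured one; print nothing when they already match.
# Usage: storage_override_line <actual> <configured>
storage_override_line() {
  actual="$1"; configured="$2"
  if [ "$actual" != "$configured" ]; then
    printf 'db_allocated_storage = "%s"\n' "$actual"
  fi
}
```

A possible invocation is storage_override_line "$(actual_allocated_storage <your-rds-db-identifier>)" 10, which prints the db_allocated_storage line only when an override is needed.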


I want to take an RDS snapshot

Automated snapshots are taken as per the environment's schedule. Use the steps below if you would like to take a manual snapshot.

  1. Connect to the service pod in the environment you would like to snapshot: $ kubectl exec -n $namespace --stdin --tty $pod_name -- /bin/sh

  2. Run the following to create a snapshot: aws rds create-db-snapshot --db-instance-identifier <your-rds-db-identifier> --db-snapshot-identifier <snapshot name>
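
Creating a snapshot is asynchronous, so it can help to wait until the snapshot is available before relying on it. The sketch below shows one way to do that from the service pod. The helper names snapshot_name and take_snapshot, the "manual-" prefix and the timestamp layout are illustrative assumptions, not an established naming convention for this platform.

```shell
#!/bin/sh
# Build a snapshot name such as manual-<instance>-2024-03-20-06-18.
# The optional second argument overrides the UTC timestamp.
snapshot_name() {
  printf 'manual-%s-%s\n' "$1" "${2:-$(date -u +%Y-%m-%d-%H-%M)}"
}

# Create a manual snapshot and block until AWS reports it as available.
# Usage: take_snapshot <your-rds-db-identifier>
take_snapshot() {
  name="$(snapshot_name "$1")"
  aws rds create-db-snapshot \
    --db-instance-identifier "$1" \
    --db-snapshot-identifier "$name" &&
    aws rds wait db-snapshot-available \
      --db-snapshot-identifier "$name" &&
    echo "Snapshot $name is available"
}
```

Running take_snapshot <your-rds-db-identifier> prints the final message only once both AWS CLI calls have succeeded.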

I want to restore individual tables in the database

Using the Cloud Platform, it is possible to create a new database with the data from a snapshot.

  1. Take a current snapshot of the database
  2. Identify the snapshot you would like the recovery RDS instance to be created from
  3. Create a new RDS-temp.tf file in the cloud platform namespace you would like to run this in
  4. Copy the contents of the existing RDS.tf file to this new file
  5. Change the module names so that they are unique, for example change module "rds" { to module "rds-temp" {
  6. Ensure you change the references in the rest of this file to use the new module name
  7. Add a snapshot_identifier field containing the snapshot you would like to use
  8. Add is_migration = true to this file
  9. Make a PR and go through the approvals in Cloud Platform.
  10. Retrieve the credentials for the newly created RDS instance: cloud-platform decode-secret -n hale-platform-dev -s rds-temp-instance-output
  11. You will now need to connect to this database to export the tables relating to the site you want to restore.
  12. Create a new port forward connection container which will allow you to use your local machine to interface with the cloud platform RDS instance.
kubectl \
  -n <namespace> \
  run port-forward-pod \
  --image=ministryofjustice/port-forward \
  --port=5432 \
  --env="REMOTE_HOST=rds_instance_address" \
  --env="LOCAL_PORT=5432" \
  --env="REMOTE_PORT=3306"
  13. Open the port forwarding container using the following
kubectl \
  -n <namespace> \
  port-forward \
  port-forward-pod 5432:5432
  14. Launch Sequel Pro and log in with the below credentials
    Host: 127.0.0.1
    Username: Found in the secret in step 10
    Password: Found in the secret in step 10
    Port: 5432
  15. Click File > Export. Ensure that SQL is selected at the top of the screen
  16. Select the S (structure), C (content) and D (drop table) columns for every table belonging to the site you want to restore and click Export.
  17. Repeat steps 10-14, changing to the database you would like to restore to
  18. Click File > Import. Choose the .sql file you saved earlier and import it.
  19. Ensure the site is working and that everything that needed to be replaced has been replaced.
  20. You can now clean up your environment.
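
If you prefer the command line to Sequel Pro, the export and clean-up can be sketched with the mysql and mysqldump clients instead. This is a swapped-in alternative, not the procedure described above. It assumes the port forward is still running on 127.0.0.1:5432, that DB_USER holds the username from the secret in step 10, and that the site's tables share a common name prefix. The prefix, the database name and the function names are all hypothetical.

```shell
#!/bin/sh
# Assumption: the port-forward from the steps above is listening locally.
DB_HOST=127.0.0.1
DB_PORT=5432

# List the tables in a database whose names start with a given prefix.
# Usage: tables_with_prefix <database> <table-prefix>
tables_with_prefix() {
  mysql --host="$DB_HOST" --port="$DB_PORT" --user="$DB_USER" --password \
    --batch --skip-column-names --execute="SHOW TABLES" "$1" | grep "^$2"
}

# Dump structure, content and DROP TABLE statements for the matching tables.
# Usage: export_tables <database> <table-prefix> <output-file>
export_tables() {
  tables="$(tables_with_prefix "$1" "$2")" || return 1
  # Word splitting on $tables is intentional: one argument per table name.
  mysqldump --host="$DB_HOST" --port="$DB_PORT" --user="$DB_USER" --password \
    --add-drop-table "$1" $tables > "$3"
}

# Remove the port-forward pod created earlier once the restore is verified.
# Usage: cleanup_port_forward <namespace>
cleanup_port_forward() {
  kubectl -n "$1" delete pod port-forward-pod
}
```

Both clients prompt for the password because --password is given without a value. The resulting file can be imported into the target database in the same way as the Sequel Pro export.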