Setup of GoCD deployment for NHS Patient Record Management Team.
The server is accessible at https://prod.gocd.patient-deductions.nhs.uk via VPN. You will need to authenticate using GitHub.
GoCD consists of a server and agents. The server data is stored in an RDS database and build artifacts are stored on an EC2 volume.
The server is behind a proxy facing the internet. There are several agents running in a dedicated virtual private
cloud (VPC). These are general-purpose agents to be used for building code or
Docker images. Other agents can be deployed in specific networks. The remote-agents-module
Terraform module can be used to provision agents in other subnets.
Each agent has Docker and Dojo available. This makes it possible to build any project as long as you produce a Docker image with the required tools first. For more details, see the Dojo README.
The complete specification of the agent tools is defined by Kudulab's GoCD Agent Docker image, which contains the actual Dockerfile.
This section details how to obtain sufficient access to work with Terraform or AWS CLI.
Note: This is based on the README of the assume-role tool.
Before you start, unset any AWS credentials already exported in your shell:
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
unset AWS_SESSION_TOKEN
Set up a profile for each role you would like to assume in ~/.aws/config, for example:
[profile default]
region = eu-west-2
output = json
[profile admin]
region = eu-west-2
role_arn = <role-arn>
mfa_serial = <mfa-arn>
source_profile = default
The source_profile needs to match your profile in ~/.aws/credentials, for example:
[default]
aws_access_key_id = <your-aws-access-key-id>
aws_secret_access_key = <your-aws-secret-access-key>
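A mismatch between source_profile and the section names in the credentials file is easy to miss. The following is a minimal sketch of a check for it, assuming the two files use the layout shown above. check_source_profiles is a hypothetical helper, not part of assume-role or the AWS CLI.

```shell
# Hypothetical helper: confirm that every source_profile named in the AWS
# config file has a matching section in the credentials file.
check_source_profiles() {
  local config_file="${1:-$HOME/.aws/config}"
  local credentials_file="${2:-$HOME/.aws/credentials}"
  local status=0 profile

  # Collect the value of each "source_profile = <name>" line (deduplicated).
  for profile in $(sed -n 's/^[[:space:]]*source_profile[[:space:]]*=[[:space:]]*//p' "$config_file" | sort -u); do
    # The credentials file uses bare section names, e.g. [default].
    if grep -qxF "[${profile}]" "$credentials_file"; then
      echo "ok: ${profile}"
    else
      echo "missing: ${profile}"
      status=1
    fi
  done
  return "$status"
}
```

Called without arguments it reads ~/.aws/config and ~/.aws/credentials; a non-zero exit status means at least one source_profile has no matching credentials section.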
Install assume-role via Homebrew: brew install remind101/formulae/assume-role.
Run the following commands with the profile configured in your ~/.aws/config:
- Locally:
assume-role admin
- Dojo:
eval $(dojo "echo <mfa-code> | assume-role admin")
Run the following command to confirm the role was assumed correctly: aws sts get-caller-identity.
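When a role has been assumed, the Arn reported by aws sts get-caller-identity has the form arn:aws:sts::<account-id>:assumed-role/<role-name>/<session-name>. The sketch below checks for that form; is_assumed_role_arn is a hypothetical helper name, and the account ID in the usage comment is a placeholder.

```shell
# Hypothetical helper: succeed only if the given ARN is an assumed-role ARN.
is_assumed_role_arn() {
  case "$1" in
    arn:aws:sts::*:assumed-role/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (needs the AWS CLI and valid credentials):
#   arn=$(aws sts get-caller-identity --query Arn --output text)
#   if is_assumed_role_arn "$arn"; then
#     echo "role assumed: $arn"
#   else
#     echo "still using base credentials: $arn"
#   fi
```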
NHS_ENVIRONMENT refers to the AWS environments: dev, test, pre-prod, etc. GOCD_ENVIRONMENT refers to the GoCD environment specifically.
- GoCD is always in prod; however, it is within the ci AWS account.
VPN client keys can be generated via the following steps:
- Gain access to AWS as described above
- Generate GoCD VPN client configuration via the following command:
GOCD_ENVIRONMENT=prod ./tasks generate_vpn_client_crt <your-first-name.your-last-name>
AWS SSM Parameters Design Principles
When creating new SSM keys, please follow the agreed convention specified below:
- All parts of the keys are lowercase
- The words are separated by dashes (kebab-case)
- env is optional
Please follow this design to ensure the SSM keys are easy to maintain and navigate through:
Type | Design | Example |
---|---|---|
User-specified | /repo/<env>?/user-input/ | /repo/${var.environment}/user-input/db-username |
Auto-generated | /repo/<env>?/output/<name-of-git-repo>/ | /repo/output/prm-deductions-base-infra/root-zone-id |
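The convention above can be checked mechanically before a key is created. This is a minimal sketch, assuming that each key ends in exactly one name segment after the prefix and that digits are allowed alongside lowercase letters; neither assumption is stated in the table. is_valid_ssm_key is a hypothetical helper, and it expects the rendered key rather than a Terraform expression such as ${var.environment}.

```shell
# Hypothetical helper: succeed if the key follows
#   /repo/<env>?/user-input/<name>   or   /repo/<env>?/output/<repo>/<name>
# with lowercase kebab-case segments.
is_valid_ssm_key() {
  # One kebab-case segment: lowercase letters/digits joined by single dashes.
  local seg='[a-z0-9]+(-[a-z0-9]+)*'
  printf '%s\n' "$1" | grep -Eq "^/repo(/${seg})?/(user-input|output/${seg})/${seg}\$"
}

# Usage:
#   is_valid_ssm_key "/repo/dev/user-input/db-username" && echo "valid"
```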
You can pick which deployment to manage by setting the environment variable GOCD_ENVIRONMENT. To make changes in prod, set the following:
export GOCD_ENVIRONMENT=prod
./tasks create_secrets
./tasks tf_plan create
./tasks tf_apply
At this point the EC2 instance should exist.
You need to use SSH port forwarding (AKA SSH tunneling) to connect your local machine to the remote server/EC2 via the VPN. To achieve this, execute: ./tasks create_ssh_tunnel. This starts a tunnel to the GoCD EC2 instance.
Now you should be able to provision the GoCD server using: ./tasks provision. This opens another console/terminal session.
The GoCD server is provisioned using an Ansible playbook, which sets up one Docker container for the GoCD server and another for an NGINX proxy. The NGINX proxy needs SSL certificates synced to it in order to launch correctly. To do this, first generate the certificates with ./tasks generate_ssl_certs, then run ./tasks sync_certs.
Updating only the agents can be done with:
GOCD_ENVIRONMENT=prod ./tasks tf_plan_agents create
GOCD_ENVIRONMENT=prod ./tasks tf_apply
To deploy/re-deploy a specific agent, e.g. for agent 3:
GOCD_ENVIRONMENT=prod ./tasks tf_plan_agent 3 create
GOCD_ENVIRONMENT=prod ./tasks tf_apply
Agent images are built and pushed manually. The Dockerfiles are versioned at nhsconnect/prm-docker-gocd-agent.
On Linux (tested on Ubuntu 18.04 LTS) the network setup is slightly different: SSH forwarding does not work out of the box, and neither does DNS resolution over the VPN. The simplest setup is to use direct networking without the tunnel, which requires the following changes (these currently have to be made manually; the scripts are automated for use on macOS):
- Use Dojofile-ansible-linux to fix the DNS
- Instead of docker.host.internal, use <gocd env>.gocd.patient-deductions.nhs.uk
- Don't use the -p / -P 2222 switches, go directly to SSH port 22
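With direct networking, the connection goes straight to the environment's hostname on port 22. As a rough illustration, the helper below prints the resulting ssh command for a given environment. gocd_ssh_command is a hypothetical name, and the key path terraform/ssh/gocd-<env> is an assumption extrapolated from the gocd-prod key used elsewhere in this README.

```shell
# Hypothetical helper: print the direct (no tunnel) ssh command for a GoCD
# environment. Assumes the key is stored as terraform/ssh/gocd-<env>.
gocd_ssh_command() {
  printf 'ssh -i terraform/ssh/gocd-%s -p 22 ec2-user@%s.gocd.patient-deductions.nhs.uk\n' "$1" "$1"
}

# Usage:
#   gocd_ssh_command prod
```

Printing the command rather than running it makes it easy to inspect before connecting.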
git clone https://github.com/susmithasrimani/gocd-google-chat-build-notifier.git
cd gocd-google-chat-build-notifier
./gradlew uberJar
- Make sure you can ssh into the GoCD server
- scp build/libs/gocd-google-chat-build-notifier-uber.jar <user>@<gocd-server-ip>:/var/gocd-data/data/plugins/external/
- Go back to prm-gocd-infra
- Make sure you've assumed the AWS role with elevated permissions
- ./tasks provision
- Reboot the GoCD server EC2 instance, preferably by restarting the Docker container on the server you SSHed into
- Go to the GoCD plugins page
- Click on the cogwheel next to the Google Chat Build Notifier plugin
- Paste the Google Chat webhook token into the Webhook URL field. You can find the Google Chat webhook token in the NHS - PRM Build Failures room under the Manage webhooks option.
This uses certbot and Let's Encrypt. The certificates appear to be generated only on your local machine (using AWS DNS automatically to prove ownership of the domain), after which the upload to /etc/letsencrypt can occur.
- SSO into the following AWS account: NHS Digital DomainC GPIT PatientRecordMigrator Dev.
- Set the following environment variable: export GOCD_ENVIRONMENT=prod.
These steps were done on a machine that recently deployed GoCD agents, so it had the gocd-prod and gocd-prod.pub keys in terraform/ssh/. If you don't have these, you need to run ./tasks ssh_key.
Then generate and sync the certificates:
./tasks generate_ssl_certs
sudo GOCD_ENVIRONMENT=prod ./tasks sync_certs
Important: If you see a permission error while executing SCP, please refer to the Troubleshooting & Common Issues -> SCP Permission Denied when Syncing Certs section within this README.
Ensure you are connected to the GoCD VPN and then SSH into the EC2 instance:
ssh -i terraform/ssh/gocd-prod ec2-user@prod.gocd.patient-deductions.nhs.uk
Over SSH on the GoCD server, you can then issue:
docker restart nginx
- SSL certs are currently issued manually from a workstation and sent over to the GoCD server. This could be automated on the GoCD machine
- Connecting GoCD analytics to RDS requires configuring the plugin settings via the UI
- The agents' auto-registration key was placed in the SSM store manually. This is a one-time operation
- Agents could be placed behind a NAT
- SSH onto the EC2 instance via ssh -i terraform/ssh/gocd-prod ec2-user@prod.gocd.patient-deductions.nhs.uk.
- Run sudo chmod 644 /home/ec2-user/letsencrypt/accounts/acme-v02.api.letsencrypt.org/directory/be16a80d6cbf772ba726755a81afe0fb/private_key.json.
- Exit out of the EC2 instance.
- Run sudo GOCD_ENVIRONMENT=prod ./tasks sync_certs again.
When a personal access token (PAT) is due to expire:
- Log in to GoCD first.
- Renew/create your PAT in GitHub.
- Go to GoCD -> Admin -> Config XML. Paste your new token as a value into the correct property under the github authconfig.
You can paste it as a value; once the config has been loaded, the server will automatically encrypt it and change the element to encryptedValue.
E.g. for a new Personal Access Token, copy the format below with your new token:
<property>
<key>PersonalAccessToken</key>
<value>ghp_5Kxn3VHuJngV2pus5LWIvYzXxt98DS1cs7</value>
</property>
If you generated a new token before logging into GoCD and are now locked out of the dashboard, you will need to SSH into the GoCD server and update the config XML file manually. The file is cruise-config.xml, located at /var/gocd-data/data/config. Follow the same steps as above and replace the sections related to the newly generated tokens, or use this sample and paste your own values in.
<authConfigs>
<authConfig id="prm-gh-auth" pluginId="cd.go.authorization.github">
<property>
<key>ClientId</key>
<value>a70893bb0bdb83mar314</value>
</property>
<property>
<key>ClientSecret</key>
<value>abb6474851507550fee082cc4ef282f5a1b36fe4</value>
</property>
<property>
<key>AuthenticateWith</key>
<value>GitHub</value>
</property>
<property>
<key>GitHubEnterpriseUrl</key>
<value />
</property>
<property>
<key>AllowedOrganizations</key>
<value>nhsconnect</value>
</property>
<property>
<key>PersonalAccessToken</key>
<value>ghp_5Kxn3VHuJngV2pus5LWIvYzXxt98DS1cs7</value>
</property>
</authConfig>
</authConfigs>
The values will be encrypted once saved and read by the server. You may need to restart the GoCD server container using docker restart server.
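After the server has reloaded the config, it can be worth confirming that no token is left in plain text. A minimal sketch, assuming the tokens are GitHub tokens with the standard ghp_ (classic) or github_pat_ (fine-grained) prefix; has_plaintext_token is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the given config file still contains a
# GitHub token inside a plain <value> element.
has_plaintext_token() {
  grep -Eq '<value>(ghp_|github_pat_)' "$1"
}

# Usage on the GoCD server:
#   if has_plaintext_token /var/gocd-data/data/config/cruise-config.xml; then
#     echo "token not yet encrypted"
#   fi
```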
You can release some disk space by doing the following whilst logged onto the server:
- Stop the server container: docker stop server
- Remove the stopped server container to release some disk space: docker system prune
- Start a new server container: docker run --detach -p "8153:8153" -p "8154:8154" --env GCHAT_NOTIFIER_CONF_PATH=/home/go/gchat_notif.conf --env GOCD_SERVER_JVM_OPTS="-Dlog4j2.formatMsgNoLookups=true" --volume "/var/gocd-data/data:/godata" --volume "/var/gocd-data/go-working-dir:/go-working-dir" --volume "/var/gocd-data/home:/home/go" --name server gocd-server:latest
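To see how much space the clean-up released, you can compare disk usage before and after the prune. The sketch below reads the usage percentage from the POSIX output of df; disk_usage_percent is a hypothetical helper, and checking /var/gocd-data in the usage comment assumes the GoCD data lives on that path, as the volume mounts above suggest.

```shell
# Hypothetical helper: print the used-space percentage (without the % sign)
# of the filesystem holding the given path. Defaults to /.
disk_usage_percent() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage on the GoCD server:
#   before=$(disk_usage_percent /var/gocd-data)
#   docker system prune
#   after=$(disk_usage_percent /var/gocd-data)
#   echo "disk usage: ${before}% -> ${after}%"
```

docker system df gives a Docker-specific breakdown (images, containers, volumes) if you want to see what is taking up the space.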