A production monitor for digital.nhs.uk that watches for real-world misconfigurations.
This repository aims to safeguard the continued reliability, performance, and security of digital.nhs.uk as it interacts with real-world users.
To ensure the highest level of service, digital.nhs.uk's codebase is thoroughly tested at every stage, aiming to catch issues early. For example, in controlled testing environments, we use unit tests, automatic continuous integration tests, and functional acceptance tests, to name just a few of our methods. However, over the years, odd issues have surfaced in production that could only have arisen under real-world conditions, such as DNS network misconfigurations.
The critical aspects of this repository are that it's lightweight (it gives production a minimal extra workload), the tests cover real-world environmental situations only (the tests couldn't be carried out earlier in the pipeline), and it's system agnostic (it doesn't know about production's software and hardware).
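As an illustration of the kind of real-world check described above, here is a minimal sketch of a DNS resolution check. The class name DnsCheck and its method are hypothetical and are not taken from this repository; the sketch uses only the JDK and knows nothing about production's software or hardware.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper, not part of Watchdog: checks that a host name resolves. */
final class DnsCheck {

    /** Returns true if the host resolves to at least one IP address. */
    static boolean resolves(String host) {
        try {
            return InetAddress.getAllByName(host).length > 0;
        } catch (UnknownHostException e) {
            // No usable DNS record, or the resolver could not be reached.
            return false;
        }
    }

    public static void main(String[] args) {
        // The host is taken from the first argument; the default is only an example.
        String host = args.length > 0 ? args[0] : "digital.nhs.uk";
        System.out.println(host + " resolves: " + resolves(host));
    }
}
```

A check like this adds very little load to the target, because it only asks the resolver for an address and never contacts the site itself.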
You will need Java 8 or above and Maven installed on your machine.
Once you have both installed, run this Maven command in the directory where the project's POM file is located.
mvn dependency:resolve
This will resolve the project's dependencies.
To run Watchdog on a local machine, the properties for the respective subject must be set.
The respective properties for running Watchdog against production are groups=production,none() and domain=digital.nhs.uk.
This will result in Watchdog running tests against the production environment.
Note: none() is a special tag that runs untagged tests (i.e. @Test without @Production or @Uat).
# minimum required for production
mvn test "-Dgroups=production,none()" "-Ddomain=digital.nhs.uk"
# with a WAF accepted bot name
mvn test "-Dgroups=production,none()" "-Ddomain=digital.nhs.uk" "-DuserAgent=*************"
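To make the effect of the groups value more concrete, here is a simplified model of the selection rule described in the note above: a test runs if it matches at least one comma-separated entry, and none() matches tests with no tags. The class GroupFilter is hypothetical, and the lowercase tag names "production" and "uat" are an assumption based on the groups values in the commands. Real tag expressions support more operators than this sketch handles.

```java
import java.util.Collections;
import java.util.Set;

/** Simplified, hypothetical model of how the groups property selects tests. */
final class GroupFilter {

    /**
     * Returns true if a test carrying testTags is selected by a comma-separated
     * groups value such as "production,none()". A test is selected when it
     * matches at least one entry; "none()" matches tests with no tags at all.
     */
    static boolean isSelected(Set<String> testTags, String groups) {
        for (String entry : groups.split(",")) {
            String group = entry.trim();
            if (group.equals("none()")) {
                if (testTags.isEmpty()) {
                    return true;
                }
            } else if (testTags.contains(group)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String groups = "production,none()";
        Set<String> production = Collections.singleton("production");
        Set<String> uat = Collections.singleton("uat");
        Set<String> untagged = Collections.<String>emptySet();
        System.out.println(isSelected(production, groups)); // true
        System.out.println(isSelected(uat, groups));        // false
        System.out.println(isSelected(untagged, groups));   // true
    }
}
```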
The respective properties for running Watchdog against UAT are groups=uat,none() and domain=uat2.nhsd.io.
This will result in Watchdog running tests against the UAT environment.
Note: none() is a special tag that runs untagged tests (i.e. @Test without @Production or @Uat).
If you need to use basic authentication to connect to UAT, you will also need the following properties: authType=basic, username, and password.
# minimum required for UAT
mvn test "-Dgroups=uat,none()" "-Ddomain=uat2.nhsd.io"
# with basic authentication
mvn test "-Dgroups=uat,none()" "-Ddomain=uat2.nhsd.io" "-DauthType=basic" "-Dusername=foo" "-Dpassword=bar"
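For readers unfamiliar with basic authentication, the sketch below shows how a username and password are typically turned into an HTTP Authorization header. The class BasicAuth is hypothetical and only illustrates the standard encoding; how Watchdog itself consumes the username and password properties is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper, not part of Watchdog: builds a Basic Authorization header value. */
final class BasicAuth {

    /** Encodes "username:password" as Base64 and prefixes it with "Basic ". */
    static String header(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Reads the same property names used on the command line; defaults are examples only.
        String username = System.getProperty("username", "foo");
        String password = System.getProperty("password", "bar");
        System.out.println("Authorization: " + header(username, password));
    }
}
```

Base64 is an encoding, not encryption, so credentials sent this way are only protected when the connection itself uses HTTPS.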
The respective properties for running Watchdog against itself (i.e. this project's code) are groups=watchdog.
This will result in Watchdog running tests against its own source code.
To run Watchdog against itself, you simply need to run the following command:
mvn test "-Dgroups=watchdog"
The respective properties for running Watchdog against a mock environment, such as Nginx, can be the same as UAT or Production; just update the domain to suit.
Note: none() is a special tag that runs untagged tests (i.e. @Test without @Production or @Uat).
mvn test "-Dgroups=uat,none()" "-Ddomain=localhost:8080" "-Dprotocol=http"
This will result in Watchdog running tests against a local, unsecured (HTTP) mock server.
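The domain and protocol properties used in the commands above could be combined into a target address along the following lines. This is a sketch only: the class TargetUrl is hypothetical, and the "https" default for protocol is an assumption, not confirmed behaviour of Watchdog.

```java
import java.net.URI;

/** Hypothetical helper, not part of Watchdog: builds the address a check would target. */
final class TargetUrl {

    /** Joins protocol, domain (optionally host:port), and path into a URI. */
    static URI of(String protocol, String domain, String path) {
        return URI.create(protocol + "://" + domain + path);
    }

    /** Reads -Dprotocol and -Ddomain; the "https" default is an assumption. */
    static URI fromSystemProperties(String path) {
        String protocol = System.getProperty("protocol", "https");
        String domain = System.getProperty("domain");
        if (domain == null || domain.isEmpty()) {
            throw new IllegalStateException(
                    "The domain property must be set, e.g. -Ddomain=localhost:8080");
        }
        return of(protocol, domain, path);
    }

    public static void main(String[] args) {
        // Mirrors the mock-server command: -Ddomain=localhost:8080 -Dprotocol=http
        System.out.println(of("http", "localhost:8080", "/"));
    }
}
```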
To add a new test, create a method and annotate it with the @Test annotation (see JUnit for details). If the test is only for a specific environment, tag it with @Production or @Uat respectively. For example:
package uk.nhs.england;

import org.junit.jupiter.api.Test;
import uk.nhs.england.tags.Production;
import uk.nhs.england.tags.Uat;

import static org.junit.jupiter.api.Assertions.assertTrue;

public class CheckSubjectTest {

    @Test @Uat // runs only against UAT
    public void uatShouldAnswerWithTrue() {
        assertTrue(true);
    }

    @Test @Production // runs only against Production
    public void productionShouldAnswerWithTrue() {
        assertTrue(true);
    }

    @Test // untagged: runs against either UAT or Production
    public void alwaysShouldAnswerWithTrue() {
        assertTrue(true);
    }
}
This project essentially runs a series of JUnit tests. The output should look something like this:
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running uk.nhs.england.CheckSubjectTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.063 s -- in uk.nhs.england.CheckSubjectTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.792 s
[INFO] Finished at: 2023-09-13T10:27:13+01:00
[INFO] ------------------------------------------------------------------------
It will also generate the following reports:
- Placeholder
- Placeholder
The GitHub Actions Workflow service runs Watchdog.
Watchdog is made up of two workflows, one for UAT and one for Production.
For details on using GitHub Actions secrets, see encrypted secrets.
For details on using GitHub Actions vars, see variables.
UAT is triggered by our log scanner. When the log scanner detects a new UAT deployment, it dispatches a GitHub webhook.
See:
- alerts > UAT Started
- actions > UAT Run Watchdog
The workflow requires the following vars to be set:
- SLACK_CHANNEL_UAT: the ID of our dedicated UAT Slack channel.
And the following secrets to be set:
- SLACK_API_TOKEN: the Slack API token of our Slack bot.
Production is triggered by a workflow cron job.
The workflow requires the following vars to be set:
- SLACK_CHANNEL_PRODUCTION: the ID of our dedicated Production Slack channel.
And the following secrets to be set:
- SLACK_API_TOKEN: the Slack API token of our Slack bot.