Merge pull request #123 from DFE-Digital/2041-configure-bigquery-in-the-terraform-module
[2041] dfe_analytics module
Showing 10 changed files with 698 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
# DfE Analytics | ||
Creates resources in Google Cloud BigQuery and provides the variables that applications need to send events. | ||
|
||
## Examples | ||
### Reuse existing dataset and events table | ||
|
||
```hcl | ||
module "dfe_analytics" { | ||
source = "./vendor/modules/dfe-terraform-modules//aks/dfe_analytics" | ||
azure_resource_prefix = var.azure_resource_prefix | ||
cluster = var.cluster | ||
namespace = var.namespace | ||
service_short = var.service_short | ||
environment = var.environment | ||
gcp_dataset = "events_${var.config}" | ||
gcp_project_id = "apply-for-qts-in-england" | ||
gcp_project_number = 385922361840 | ||
} | ||
``` | ||
|
||
### Create new dataset and events table | ||
Use for a new environment. To get the values for `gcp_taxonomy_id` and `gcp_policy_tag_id` see [Taxonomy and policy tag](#taxonomy-and-policy-tag). | ||
```hcl | ||
module "dfe_analytics" { | ||
source = "./vendor/modules/dfe-terraform-modules//aks/dfe_analytics" | ||
azure_resource_prefix = var.azure_resource_prefix | ||
cluster = var.cluster | ||
namespace = var.namespace | ||
service_short = var.service_short | ||
environment = var.environment | ||
gcp_keyring = "afqts-key-ring" | ||
gcp_key = "afqts-key" | ||
gcp_project_id = "apply-for-qts-in-england" | ||
gcp_project_number = 385922361840 | ||
gcp_taxonomy_id = 5456044749211275650 | ||
gcp_policy_tag_id = 2399328962407973209 | ||
} | ||
``` | ||
|
||
### Configure application | ||
#### Enable in Ruby | ||
```ruby | ||
DfE::Analytics.configure do |config| | ||
... | ||
config.azure_federated_auth = ENV.include? "GOOGLE_CLOUD_CREDENTIALS" | ||
end | ||
``` | ||
|
||
#### Enable in .NET | ||
```cs | ||
builder.Services.AddDfeAnalytics() | ||
.UseFederatedAksBigQueryClientProvider(); | ||
``` | ||
Ensure the `ProjectNumber`, `WorkloadIdentityPoolName`, `WorkloadIdentityPoolProviderName` and `ServiceAccountEmail` configuration keys are populated within the `DfeAnalytics` configuration section. | ||
|
||
#### Variables | ||
Each variable is available as a separate output. For convenience, the `variables_map` output provides them all: | ||
- BIGQUERY_PROJECT_ID | ||
- BIGQUERY_TABLE_NAME | ||
- BIGQUERY_DATASET | ||
- GOOGLE_CLOUD_CREDENTIALS | ||
|
||
```hcl | ||
module "application_configuration" { | ||
source = "./vendor/modules/dfe-terraform-modules//aks/application_configuration" | ||
... | ||
secret_variables = merge( | ||
module.dfe_analytics.variables_map, | ||
{ | ||
... | ||
} | ||
) | ||
} | ||
``` | ||
|
||
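An application entrypoint could confirm that the four variables from `variables_map` are present before analytics is enabled. The following is a minimal sketch of such a check, not something the module provides: the `missing_analytics_vars` helper is hypothetical, and only the variable names come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the module: print the name of each
# variable from variables_map that is missing from the environment.
missing_analytics_vars() {
  local name
  for name in BIGQUERY_PROJECT_ID BIGQUERY_TABLE_NAME BIGQUERY_DATASET GOOGLE_CLOUD_CREDENTIALS; do
    # printenv exits non-zero when the variable is not in the environment.
    # A variable that is exported but empty still counts as present.
    if ! printenv "$name" >/dev/null; then
      printf '%s\n' "$name"
    fi
  done
}

missing="$(missing_analytics_vars)"
if [ -n "$missing" ]; then
  echo "Missing analytics variables:"
  printf '%s\n' "$missing"
else
  echo "All analytics variables are set"
fi
```

Like the Ruby `ENV.include?` check above, this treats a variable as present even when its value is empty.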
#### Enable on each app that requires it | ||
```hcl | ||
module "worker_application" { | ||
source = "./vendor/modules/dfe-terraform-modules//aks/application" | ||
... | ||
enable_gcp_wif = true | ||
} | ||
``` | ||
|
||
## Authentication - Command line | ||
The user should have the Owner role on the Google project. | ||
|
||
- Run `gcloud auth application-default login` | ||
- Run Terraform | ||
|
||
## Authentication - GitHub Actions | ||
We set up workload identity federation on the Google side and configure the workflow to use it. The user should have the Owner role on the Google project. This is done once per repository. | ||
|
||
- Run the `authorise_workflow.sh` script located in *aks/dfe_analytics*: | ||
``` | ||
./authorise_workflow.sh PROJECT_ID REPO | ||
``` | ||
Example: | ||
``` | ||
./authorise_workflow.sh apply-for-qts-in-england apply-for-qualified-teacher-status | ||
``` | ||
- The script shows the *permissions* and *google-github-actions/auth step* to add to the workflow job | ||
- Adding the `permissions` key removes the [default token permissions](https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#permissions-for-the-github_token), which may break actions that rely on them. For example, the [marocchino/sticky-pull-request-comment](https://github.com/marocchino/sticky-pull-request-comment) action requires `pull-requests: write`, which must then be added explicitly. | ||
- Run the workflow | ||
|
||
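Once a job declares a `permissions` key, every permission it does not list is set to none, so each permission the job relies on must sit alongside `id-token: write`. The sketch below prints such a block. The `permissions_block` helper is hypothetical and not part of `authorise_workflow.sh`, and the extra permission is the example mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a job-level permissions block that keeps
# id-token: write and adds every extra permission passed as an argument.
permissions_block() {
  local extra
  echo "permissions:"
  echo "  id-token: write"
  for extra in "$@"; do
    echo "  ${extra}"
  done
}

# Example: a job that also uses marocchino/sticky-pull-request-comment.
permissions_block "pull-requests: write"
```

The output still needs indenting to match the job it is pasted into.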
## Taxonomy and policy tag | ||
The user should have the Owner role on the Google project. | ||
|
||
- Authenticate: `gcloud auth application-default login` | ||
- Get projects list: `gcloud projects list` | ||
- Select project e.g.: `gcloud config set project apply-for-qts-in-england` | ||
- Get taxonomies list: | ||
``` | ||
gcloud data-catalog taxonomies list --location=europe-west2 --format="value(name)" | ||
``` | ||
The path contains the taxonomy id as a number e.g. 5456044749211275650 | ||
- Get policy tags e.g.: | ||
``` | ||
gcloud data-catalog taxonomies policy-tags list --taxonomy="projects/apply-for-qts-in-england/locations/europe-west2/taxonomies/5456044749211275650" --location="europe-west2" --filter="displayName:hidden" --format="value(name)" | ||
``` | ||
The path contains the policy tag id as a number e.g. 2399328962407973209 |
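Both commands return a full resource path, and the ID is its last segment. A minimal sketch of pulling that segment out in a shell follows; the `resource_id` helper is hypothetical and the path is the taxonomy example used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last path segment of a resource name.
resource_id() {
  local name="$1"
  # ${name##*/} removes everything up to and including the final slash.
  printf '%s\n' "${name##*/}"
}

# Taxonomy path from the example above; prints 5456044749211275650
resource_id "projects/apply-for-qts-in-england/locations/europe-west2/taxonomies/5456044749211275650"
```

When a `gcloud` command with `--format="value(name)"` returns a single path, its output can be passed to the helper unchanged.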
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
#!/usr/bin/env bash | ||
# Set up Direct Workload Identity Federation | ||
# See https://github.com/google-github-actions/auth?tab=readme-ov-file#preferred-direct-workload-identity-federation | ||
|
||
PROJECT_ID=$1 | ||
REPO=$2 | ||
|
||
if [[ -z "$PROJECT_ID" || -z "$REPO" ]]; then | ||
cat <<EOF | ||
Set up Direct Workload Identity Federation between GitHub Actions workflows from a repository and GCP for setting up BigQuery. The user must have the 'Owner' role on the project. | ||
Usage: ./authorise_workflow.sh PROJECT_ID REPO - Example: ./authorise_workflow.sh apply-for-qts-in-england apply-for-qualified-teacher-status | ||
EOF | ||
exit 1 | ||
fi | ||
|
||
set -eu | ||
|
||
GITHUB_ORG=DFE-Digital | ||
ORG_REPO=${GITHUB_ORG}/${REPO} | ||
# The pool name must be up to 32 characters | ||
WORKLOAD_ID="${REPO:0:32}" | ||
|
||
echo "Log in to Google Cloud. The user must have the Owner role on the project." | ||
gcloud auth application-default login | ||
|
||
echo "Create ${WORKLOAD_ID} workload identity pool" | ||
gcloud iam workload-identity-pools create "${WORKLOAD_ID}" \ | ||
--project="${PROJECT_ID}" \ | ||
--location="global" \ | ||
--display-name="${WORKLOAD_ID}" | ||
|
||
WORKLOAD_IDENTITY_POOL_ID=$(gcloud iam workload-identity-pools describe "${WORKLOAD_ID}" \ | ||
--project="${PROJECT_ID}" \ | ||
--location="global" \ | ||
--format="value(name)") | ||
|
||
echo "WORKLOAD_IDENTITY_POOL_ID=${WORKLOAD_IDENTITY_POOL_ID}" | ||
|
||
echo "Create ${WORKLOAD_ID} workload identity pool provider" | ||
gcloud iam workload-identity-pools providers create-oidc "${WORKLOAD_ID}" \ | ||
--project="${PROJECT_ID}" \ | ||
--location="global" \ | ||
--workload-identity-pool="${WORKLOAD_ID}" \ | ||
--display-name="${WORKLOAD_ID}" \ | ||
--attribute-mapping="google.subject=assertion.sub,attribute.actor=assertion.actor,attribute.repository=assertion.repository,attribute.repository_owner=assertion.repository_owner" \ | ||
--attribute-condition="assertion.repository_owner == '${GITHUB_ORG}' && attribute.repository == '${ORG_REPO}' " \ | ||
--issuer-uri="https://token.actions.githubusercontent.com" | ||
|
||
echo Get workload identity pool provider id | ||
WORKLOAD_IDENTITY_POOL_PROVIDER_ID=$(gcloud iam workload-identity-pools providers describe "${WORKLOAD_ID}" \ | ||
--project="${PROJECT_ID}" \ | ||
--location="global" \ | ||
--workload-identity-pool="${WORKLOAD_ID}" \ | ||
--format="value(name)") | ||
|
||
echo Bind role roles/iam.serviceAccountAdmin | ||
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \ | ||
--role="roles/iam.serviceAccountAdmin" \ | ||
--member="principalSet://iam.googleapis.com/${WORKLOAD_IDENTITY_POOL_ID}/attribute.repository/${ORG_REPO}" | ||
|
||
echo Bind role roles/bigquery.admin | ||
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \ | ||
--role="roles/bigquery.admin" \ | ||
--member="principalSet://iam.googleapis.com/${WORKLOAD_IDENTITY_POOL_ID}/attribute.repository/${ORG_REPO}" | ||
|
||
echo Bind role roles/dataplex.taxonomyViewer | ||
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \ | ||
--role="roles/dataplex.taxonomyViewer" \ | ||
--member="principalSet://iam.googleapis.com/${WORKLOAD_IDENTITY_POOL_ID}/attribute.repository/${ORG_REPO}" | ||
|
||
echo Bind role roles/cloudkms.viewer | ||
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \ | ||
--role="roles/cloudkms.viewer" \ | ||
--member="principalSet://iam.googleapis.com/${WORKLOAD_IDENTITY_POOL_ID}/attribute.repository/${ORG_REPO}" | ||
|
||
echo | ||
echo Now add this step to the workflow to authenticate to Google: | ||
cat <<EOF | ||
  deploy_job: | ||
    permissions: | ||
      id-token: write | ||
      ... | ||
    ... | ||
    steps: | ||
    - uses: google-github-actions/auth@v2 | ||
      with: | ||
        project_id: ${PROJECT_ID} | ||
        workload_identity_provider: ${WORKLOAD_IDENTITY_POOL_PROVIDER_ID} | ||
EOF |
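The script derives two strings from its arguments: the pool name, which is the repository name cut to 32 characters, and the `principalSet://` member used in each role binding. The sketch below reproduces only that string handling so it can be checked without calling `gcloud`. The helper names are hypothetical, the truncation uses `printf` precision instead of the script's `${REPO:0:32}`, and the pool resource name is an assumed example of what `describe` returns.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the string handling in authorise_workflow.sh.

# Cut the repository name to at most 32 characters for use as the pool name.
pool_name() {
  local repo="$1"
  printf '%.32s\n' "$repo"
}

# Build the IAM member that matches every workflow run from one repository.
principal_set_member() {
  local pool_id="$1"
  local org_repo="$2"
  printf 'principalSet://iam.googleapis.com/%s/attribute.repository/%s\n' "$pool_id" "$org_repo"
}

repo="apply-for-qualified-teacher-status"
pool="$(pool_name "$repo")"
# Assumed shape of the resource name returned by "workload-identity-pools describe".
pool_id="projects/385922361840/locations/global/workloadIdentityPools/${pool}"

echo "Pool name: ${pool}"
principal_set_member "$pool_id" "DFE-Digital/${repo}"
```

The example repository name is 34 characters long, so the pool name loses its last two characters while the member string keeps the full repository name.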
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
module "cluster_data" { | ||
source = "../cluster_data" | ||
name = var.cluster | ||
} | ||
|
||
data "azurerm_client_config" "current" {} | ||
|
||
data "azurerm_user_assigned_identity" "gcp_wif" { | ||
name = "${var.azure_resource_prefix}-gcp-wif-${var.cluster}-${var.namespace}-id" | ||
resource_group_name = module.cluster_data.configuration_map.resource_group_name | ||
} |