From a7035b8b80768d760eb1bd92634740c01d689747 Mon Sep 17 00:00:00 2001
From: alex <8968914+acpana@users.noreply.github.com>
Date: Mon, 15 Jan 2024 15:46:33 -0800
Subject: [PATCH] docs: syncset docs (#3202)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Alex Pana <8968914+acpana@users.noreply.github.com>
Signed-off-by: alex <8968914+acpana@users.noreply.github.com>
Co-authored-by: Rita Zhang
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
---
 website/docs/sync.md | 56 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 7 deletions(-)

diff --git a/website/docs/sync.md b/website/docs/sync.md
index a79bee11e29..e47665908f1 100644
--- a/website/docs/sync.md
+++ b/website/docs/sync.md
@@ -3,15 +3,46 @@ id: sync
 title: Replicating Data
 ---
 
-`Feature State`: The `Config` resource is currently alpha.
+## Replicating Data
 
-> The "Config" resource must be named `config` for it to be reconciled by Gatekeeper. Gatekeeper will ignore the resource if you do not name it `config`.
+Some constraints are impossible to write without access to more state than just the object under test. For example, it is impossible to know if a label is unique across all pods and namespaces unless a ConstraintTemplate has access to all other pods and namespaces. To enable this use case, we provide syncing of data into a data client.
+
+### Replicating Data with SyncSets (Recommended)
+
+`Feature State`: Gatekeeper version v3.15+ (alpha)
+
+Kubernetes data can be replicated into the data client using `SyncSet` resources. Below is an example of a `SyncSet`:
+
+```yaml
+apiVersion: syncset.gatekeeper.sh/v1alpha1
+kind: SyncSet
+metadata:
+  name: syncset-1
+spec:
+  gvks:
+    - group: ""
+      version: "v1"
+      kind: "Namespace"
+    - group: ""
+      version: "v1"
+      kind: "Pod"
+```
+
+The resources defined in the `gvks` field of a SyncSet will be eventually synced into the data client.
 
-Some constraints are impossible to write without access to more state than just the object under test. For example, it is impossible to know if an ingress's hostname is unique among all ingresses unless a rule has access to all other ingresses. To make such rules possible, we enable syncing of data into OPA.
+#### Working with SyncSet resources
 
-The [audit](audit.md) feature does not require replication by default. However, when the ``audit-from-cache`` flag is set to true, the audit informer cache will be used as the source-of-truth for audit queries; thus, an object must first be cached before it can be audited for constraint violations.
+* Updating a SyncSet's `gvks` field should dynamically update what objects are synced.
+* Multiple `SyncSet`s may be defined and those will be reconciled by the Gatekeeper syncset-controller. Notably, the [set union](https://en.wikipedia.org/wiki/Union_(set_theory)) of all SyncSet resources' `gvks` and the [Config](sync#replicating-data-with-config) resource's `syncOnly` will be synced into the data client.
+* A resource will continue to be present in the data client so long as a SyncSet or Config still specifies it under the `gvks` or `syncOnly` field.
 
-Kubernetes data can be replicated into the audit cache via the sync config resource. Currently resources defined in `syncOnly` will be synced into OPA. Updating `syncOnly` should dynamically update what objects are synced. Below is an example:
+### Replicating Data with Config
+
+`Feature State`: Gatekeeper version v3.6+ (alpha)
+
+> The "Config" resource must be named `config` for it to be reconciled by Gatekeeper. Gatekeeper will ignore the resource if you do not name it `config`.
+
+Kubernetes data can also be replicated into the data client via the Config resource. Resources defined in `syncOnly` will be synced into OPA. Below is an example:
 
 ```yaml
 apiVersion: config.gatekeeper.sh/v1alpha1
@@ -36,11 +67,22 @@ You can install this config with the following command:
 kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/demo/basic/sync.yaml
 ```
 
-Once data is synced into OPA, rules can access the cached data under the `data.inventory` document.
+#### Working with Config resources
 
-The `data.inventory` document has the following format:
+* Updating a Config's `syncOnly` field should dynamically update what objects are synced.
+* The `Config` resource is meant to be a singleton. The [set union](https://en.wikipedia.org/wiki/Union_(set_theory)) of all SyncSet resources' `gvks` and the [Config](sync#replicating-data-with-config) resource's `syncOnly` will be synced into the data client.
+* A resource will continue to be present in the data client so long as a SyncSet or Config still specifies it under the `gvks` or `syncOnly` field.
 
+### Accessing replicated data
+
+Once data is synced, ConstraintTemplates can access the cached data under the `data.inventory` document.
+
+The `data.inventory` document has the following format:
 * For cluster-scoped objects: `data.inventory.cluster[<groupVersion>][<kind>][<name>]`
   * Example referencing the Gatekeeper namespace: `data.inventory.cluster["v1"].Namespace["gatekeeper"]`
 * For namespace-scoped objects: `data.inventory.namespace[<namespace>][groupVersion][<kind>][<name>]`
   * Example referencing the Gatekeeper pod: `data.inventory.namespace["gatekeeper"]["v1"]["Pod"]["gatekeeper-controller-manager-d4c98b788-j7d92"]`
+
+### Auditing From Cache
+
+The [audit](audit.md) feature does not require replication by default. However, when the `audit-from-cache` flag is set to true, the audit informer cache will be used as the source-of-truth for audit queries; thus, an object must first be cached before it can be audited for constraint violations. Kubernetes data can be replicated into the audit cache via one of the resources above.
\ No newline at end of file