diff --git a/README.md b/README.md index 2664dba9..c7ccdc5b 100644 --- a/README.md +++ b/README.md @@ -22,8 +22,8 @@ that make up the Bento platform. ## Requirements -- Docker >= 24.0.4 -- Docker Compose >= 2.20.0 (plugin form: you should have the `docker compose` command available, without a dash) +- Docker >= 25.0 +- Docker Compose >= 2.25.0 (plugin form: you should have the `docker compose` command available, without a dash) - Python >= 3.9 (for `bentoctl`); the services require Python 3.10 but this is included in their Docker images. @@ -36,14 +36,18 @@ that make up the Bento platform. * [Development](./docs/development.md) * [Troubleshooting guide](./docs/troubleshooting.md) * [Deployment](./docs/deployment.md) +* [Monitoring](./docs/monitoring.md) +* [Public discovery configuration](./docs/public_discovery.md) ### Data ingestion and usage * [Guide to genomic reference material in Bento](./docs/reference_material.md) -* [Converting Phenopackets from V1 to V2 using `bentoctl`](./docs/phenopackets_v1_to_v2.md) +* [Converting Phenopackets from V1 to V2 using `bentoctl`](./docs/phenopackets_v1_to_v2.md) +* [JSON Schemas for data types and discovery configuration](./docs/json-schemas.md) ### Migration documents +* [v16 to v17](./docs/migrating_to_17.md) * [v15.2 to v16](./docs/migrating_to_16.md) * [v15.1 to v15.2](./docs/migrating_to_15_2.md) * [v15 to v15.1](./docs/migrating_to_15_1.md) diff --git a/docker-compose.dev.yaml b/docker-compose.dev.yaml index 96dddf5f..243e6bb1 100644 --- a/docker-compose.dev.yaml +++ b/docker-compose.dev.yaml @@ -65,6 +65,11 @@ services: - ${BENTOV2_DOMAIN} - ${BENTOV2_PORTAL_DOMAIN} - ${BENTOV2_AUTH_DOMAIN} + monitoring-net: + aliases: + - ${BENTOV2_DOMAIN} + - ${BENTOV2_PORTAL_DOMAIN} + - ${BENTOV2_AUTH_DOMAIN} public-net: aliases: - ${BENTOV2_DOMAIN} @@ -172,7 +177,6 @@ services: - FLASK_DEBUG=True - BENTO_DEBUG=True - CHORD_DEBUG=True - - BENTO_BEACON_DEBUG=true ports: - 
"${BENTO_BEACON_EXTERNAL_PORT}:${BENTO_BEACON_INTERNAL_PORT}" - "${BENTO_BEACON_DEBUGGER_EXTERNAL_PORT}:${BENTO_BEACON_DEBUGGER_INTERNAL_PORT}" @@ -232,3 +236,14 @@ services: cbioportal: ports: - "${BENTO_CBIOPORTAL_EXTERNAL_PORT}:${BENTO_CBIOPORTAL_INTERNAL_PORT}" + + grafana: + ports: + - "3000:3000" + environment: + # Workaround for self signed certificates in dev + - GF_AUTH_GENERIC_OAUTH_TLS_SKIP_VERIFY_INSECURE=true + + loki: + ports: + - "3100:3100" diff --git a/docker-compose.local.yaml b/docker-compose.local.yaml index 4962a249..5dbd201f 100644 --- a/docker-compose.local.yaml +++ b/docker-compose.local.yaml @@ -59,11 +59,6 @@ services: - BENTO_GIT_EMAIL - BENTO_GIT_REPOSITORY_DIR=/app - adminer: - # No Docker networks required, bound to host - ports: - - 8080:8080 - aggregation: image: ${BENTOV2_AGGREGATION_IMAGE}:${BENTOV2_AGGREGATION_VERSION_DEV} environment: diff --git a/docker-compose.yaml b/docker-compose.yaml index 05c9e908..f91aaa52 100644 --- a/docker-compose.yaml +++ b/docker-compose.yaml @@ -13,6 +13,7 @@ include: - lib/event-relay/docker-compose.event-relay.yaml - lib/gohan/docker-compose.gohan.yaml # Optional feature; controlled by a compose profile - lib/katsu/docker-compose.katsu.yaml + - lib/logs/docker-compose.logs.yaml - lib/notification/docker-compose.notification.yaml - lib/public/docker-compose.public.yaml # Optional feature; controlled by a compose profile - lib/redis/docker-compose.redis.yaml diff --git a/docs/deployment.md b/docs/deployment.md index 1f3e2d6a..53f682f9 100644 --- a/docs/deployment.md +++ b/docs/deployment.md @@ -99,3 +99,11 @@ ls certs/gateway/letsencrypt/live/ ``` If all went well, the `old-bento.example.com` domain should be redirected to `bento.example.com` in a browser. + +## Discovery configuration + +Bento can serve censored data publicly if configured to do so. This allows anonymous users to take a glimpse into the +data hosted by a Bento node. 
+ +When deploying a Bento instance, make sure that the discovery settings are configured properly at the necessary levels. +Consult the [public discovery](./public_discovery.md) documentation for more details. diff --git a/docs/img/discovery_proj_creation.png b/docs/img/discovery_proj_creation.png new file mode 100644 index 00000000..f2508f9f Binary files /dev/null and b/docs/img/discovery_proj_creation.png differ diff --git a/docs/img/discovery_proj_edit.png b/docs/img/discovery_proj_edit.png new file mode 100644 index 00000000..bebc9be5 Binary files /dev/null and b/docs/img/discovery_proj_edit.png differ diff --git a/docs/img/grafana_explore.png b/docs/img/grafana_explore.png new file mode 100644 index 00000000..bf0fd336 Binary files /dev/null and b/docs/img/grafana_explore.png differ diff --git a/docs/img/kc_grafana_join_group.png b/docs/img/kc_grafana_join_group.png new file mode 100644 index 00000000..133de0aa Binary files /dev/null and b/docs/img/kc_grafana_join_group.png differ diff --git a/docs/installation.md b/docs/installation.md index d70f84ce..6f9bc9e4 100644 --- a/docs/installation.md +++ b/docs/installation.md @@ -13,7 +13,7 @@ virtual environment. 
### Instance-specific environment variable file: `local.env` Depending on your use, development or deployment, you will need to copy the right template file -to `local.env` in the root of the `bentoV2` folder: +to `local.env` in the root of the `bento` folder, i.e., the repository folder: ```bash # Dev @@ -28,87 +28,16 @@ Then, modify the values as needed; depending on if you're using the instance for #### Development example -The below is an example of a completed development configuration: - -```bash -# in local.env: - -MODE=dev - -# Gateway/domains ----------------------------------------------------- -BENTOV2_DOMAIN=bentov2.local -BENTOV2_PORTAL_DOMAIN=portal.${BENTOV2_DOMAIN} -BENTOV2_AUTH_DOMAIN=bentov2auth.local -# Unused if cBioPortal is disabled: -BENTOV2_CBIOPORTAL_DOMAIN=cbioportal.${BENTOV2_DOMAIN} -# --------------------------------------------------------------------- - -# Feature switches ---------------------------------------------------- - -BENTOV2_USE_EXTERNAL_IDP=0 -BENTOV2_USE_BENTO_PUBLIC=1 - -# - Switch to enable TLS - defaults to true (i.e., use TLS): -BENTO_GATEWAY_USE_TLS='true' - -BENTO_BEACON_ENABLED='true' -BENTO_BEACON_UI_ENABLED='true' -BENTO_CBIOPORTAL_ENABLED='false' -BENTO_GOHAN_ENABLED='true' - -# - Switch to enable French translation in Bento Public -BENTO_PUBLIC_TRANSLATED='true' - -# --------------------------------------------------------------------- - -# Set this to a data storage location, optionally within the repo itself, like: /path-to-my-bentov2-repo/data -# Data directories are split to better use SSD and HDD resources in prod. -# In dev/local it is more convenient to use a single directory -BENTO_FAST_DATA_DIR=./data -BENTO_SLOW_DATA_DIR=./data - -# Auth ---------------------------------------------------------------- -# - Session secret should be set to a unique secure value. -# this adds security and allows sessions to exist across gateway restarts. 
-# - Empty by default, to be filled by local.env -# - IMPORTANT: set before starting gateway -BENTOV2_SESSION_SECRET=my-very-secret-session-secret # !!! ADD SOMETHING MORE SECURE !!! - -# - Set auth DB password if using a local IDP -BENTO_AUTH_DB_PASSWORD=some-secure-password -# - Always set authz DB password -BENTO_AUTHZ_DB_PASSWORD=some-other-secure-password - -BENTOV2_AUTH_ADMIN_USER=admin -BENTOV2_AUTH_ADMIN_PASSWORD=admin # !!! obviously for dev only !!! - -BENTOV2_AUTH_TEST_USER=user -BENTOV2_AUTH_TEST_PASSWORD=user # !!! obviously for dev only !!! - -# - WES Client ID/secret; client within BENTOV2_AUTH_REALM -BENTO_WES_CLIENT_ID=wes -BENTO_WES_CLIENT_SECRET= -# -------------------------------------------------------------------- - -# Gohan -BENTOV2_GOHAN_ES_PASSWORD=devpassword567 - -# Katsu -BENTOV2_KATSU_DB_PASSWORD=devpassword123 -BENTOV2_KATSU_APP_SECRET=some-random-phrase-here # !!! ADD SOMETHING MORE SECURE !!! - -# Development settings ------------------------------------------------ - -# - Git configuration -BENTO_GIT_NAME=David # Change this to your name -BENTO_GIT_EMAIL=do-not-reply@example.org # Change this to your GitHub account email -``` +For an example of a semi-completed development configuration, see [etc/bento_dev.env](../etc/bento_dev.env). 
+In the step above, this file was copied to `local.env`, and **you must now edit `local.env` to specify secrets and other +deployment-specific values.** You should at least fill to the following settings in dev mode (it may differ for a production setup), which are not set in the example file: * `BENTOV2_SESSION_SECRET` * `BENTO_AUTH_DB_PASSWORD` * `BENTO_AUTHZ_DB_PASSWORD` +* `BENTO_AGGREGATION_CLIENT_SECRET` * `BENTO_WES_CLIENT_SECRET` If the internal OIDC identity provider (IdP) is being used (by default, Keycloak), variables specifying default @@ -158,6 +87,14 @@ If using Beacon, first copy the configuration file: Then update any config values as needed at `lib/beacon/config/beacon_config.json` and `lib/beacon/config/beacon_cohort.json`. +If using the Beacon network, copy the configuration file: + +```bash +./bentoctl.bash init-config beacon-network +``` + +and update values at `lib/beacon/config/beacon_network_config.json`. + ### Gohan configuration @@ -276,8 +213,10 @@ specified in the step above. ./bentoctl.bash init-auth ``` -**If using an external identity provider**, only start the cluster's gateway -after setting `CLIENT_SECRET` in your local environment file: +After running `init-auth`, be sure to put all client secrets into your `local.env` file! + +**If using an external identity provider**, only start the cluster's gateway after setting various `*_CLIENT_SECRET` +variables in your local environment file: ```bash ./bentoctl.bash run gateway @@ -297,7 +236,7 @@ utilize new variables generated during the OIDC configuration. ## 6. Configure permissions -### a. Create superuser permissions in the new Bento authorization service +### a. Create superuser permissions in the Bento authorization service First, run the authorization service and then open a shell into the container: @@ -317,24 +256,54 @@ which in Keycloak should be a UUID. ### b. 
Create grants for the Workflow Execution Service (WES) OAuth2 client -Run the following commands to set up authorization for the WES client. Don't forget to replace `ISSUER_HERE` by the -issuer URL! +Run the following commands to set up authorization for the WES client. +**Don't forget to replace `<ISSUER_HERE>` with the issuer URL!** ```bash # This grant is a temporary hack to get permissions working for v12/v13. In the future, it should be removed. bento_authz create grant \ - '{"iss": "ISSUER_HERE", "client": "wes"}' \ + '{"iss": "<ISSUER_HERE>", "client": "wes"}' \ '{"everything": true}' \ 'view:private_portal' # This grant gives permission to access and ingest data into all projects and the reference genome service bento_authz create grant \ - '{"iss": "ISSUER_HERE", "client": "wes"}' \ + '{"iss": "<ISSUER_HERE>", "client": "wes"}' \ '{"everything": true}' \ 'query:data' 'ingest:data' 'ingest:reference_material' 'delete:reference_material' ``` -### c. *Optional step:* Assign portal access to all users in the instance realm +### c. Create a grant for the aggregation and Beacon services + +Run the following commands to set up authorization for the aggregation/Beacon client. +**Don't forget to replace `<ISSUER_HERE>` with the issuer URL!** + +```bash +# In the future, view:private_portal will need to be removed from this grant. +bento_authz create grant \ + '{"iss": "<ISSUER_HERE>", "client": "aggregation"}' \ + '{"everything": true}' \ + 'query:data' 'view:private_portal' +``` + + +### d. Configure public data access for all users, including anonymous visitors (if desired): + +To configure public data access, run the following command in the authorization service container. Note that with the +`full` value, **THIS GIVES FULL DATA ACCESS TO EVERYONE WHO VISITS YOUR INSTANCE!** + +```bash +# Configure public data access +# ---------------------------- +# The level below ("counts") preserves previous functionality. Other possible options are: +# - none - will do nothing. 
+# - bool - for censored true/false discovery, but in effect right now forbids access. +# - counts - for censored count discovery. +# - full - allows full data access (record-level, including sensitive data such as IDs), uncensored counts, etc. +bento_authz public-data-access counts +``` + +### e. Assign portal access to all users in the instance realm We added a special permission, `view:private_portal`, to Bento v12/v13 in order to carry forward the current 'legacy' authorization behaviour for one more major version. This permission currently behaves as a super-permission, diff --git a/docs/json-schemas.md b/docs/json-schemas.md new file mode 100644 index 00000000..9e2c2a0f --- /dev/null +++ b/docs/json-schemas.md @@ -0,0 +1,15 @@ +# JSON Schemas for data types and discovery configuration + +Bento's services perform some of the data validation using [JSON Schemas](https://json-schema.org/specification). + +For a given version, Bento's services expect data to respect the schemas in use for: +- Phenopackets +- Experiments +- Discovery configuration + +Starting with Bento v17 ([Katsu v9.0.0](https://github.com/bento-platform/katsu/releases/tag/v9.0.0)), +the compiled JSON Schemas are published as Katsu release artifacts. + +| Bento Release | JSON Schemas Download | +| --------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | +| [v17](https://github.com/bento-platform/bento/releases/tag/v17) | [Katsu v9.0.0 json-schemas.zip](https://github.com/bento-platform/katsu/releases/download/v9.0.0/json-schemas.zip) | diff --git a/docs/migrating_to_17.md b/docs/migrating_to_17.md new file mode 100644 index 00000000..817b3c44 --- /dev/null +++ b/docs/migrating_to_17.md @@ -0,0 +1,122 @@ +# Migrating to Bento v17 + +Key points: + +* Bento now has observability tools to help monitor the services (Grafana). Some setup is required for this feature to + work. 
+* Katsu discovery endpoints now have an authorization layer. + * Data that used to be completely public by default (i.e., + censored counts) now requires a permission (`query:project_level_counts` and/or `query:dataset_level_counts`), and + thus a grant in the authorization service. + * Beacon now requires a client ID/secret and an authorization service grant to access uncensored data. +* Katsu discovery is now more granular, and can be configured to the project or dataset level, in addition to the + instance level. See the [Public data discovery configuration](./public_discovery.md) document for more information. +* ... + + +## 1. Stop Bento + +```bash +./bentoctl.bash stop +``` + + +## 2. Checkout to v17 and pull new Docker images + +```bash +# Checkout on the v17 tag +git checkout v17 +# Pull new Docker images +./bentoctl.bash pull +``` + + +## 3. Set up credentials for aggregation/Beacon and, optionally, set up Grafana + +If you wish to enable Grafana, you first must enable the monitoring feature in your `local.env` file: + +```bash +BENTO_MONITORING_ENABLED='true' +``` + +After enabling the Grafana feature flag for the first time, +you must initialize the Docker networks and mounted directories. +```bash +# Init new Docker networks and directories if using Grafana +./bentoctl.bash init-docker +./bentoctl.bash init-dirs +``` + +To create the client secrets for aggregation/Beacon and Grafana (if the latter is enabled), run the following commands: + +```bash +./bentoctl.bash run auth && ./bentoctl.bash run gateway +./bentoctl.bash init-auth +``` + +**Reminder:** Make sure to put the client secret(s) generated by `init-auth` into your `local.env` file! + +Aggregation/Beacon data access authorization will not work until an authorization service grant is configured; +see step 4 below. + +Grafana will not be accessible to users until they are given a valid role; +see the [monitoring user management](./monitoring.md#user-management) section for details. + +## 4. 
Set up aggregation/Beacon permissions and public data access grants + +Now that Beacon uses a client ID/secret to get authorized, uncensored data access for discovery, a grant must be +configured to give the aggregation/Beacon client data access. + +Another change to permissions: starting from Bento v17, anonymous visitors do not have access to see censored counts +data by default, even if a discovery configuration has been set up. For anonymous visitors to access data, a level +(`bool`, `counts`, `full`) must be chosen and passed to the `bento_authz` CLI command below. + +```bash +./bentoctl.bash run authz +./bentoctl.bash shell authz + +# Configure aggregation/Beacon permissions +# ---------------------------------------- +# This assumes the aggregation/Beacon client ID is "aggregation". +# <ISSUER_HERE> MUST be replaced with your actual issuer value. +# - The query:data permission gives access to Katsu endpoints which are properly authz-enabled. +# - The view:private_portal permission gives access to Katsu and Gohan endpoints where the proxy still manages access. +# This permission will be removed in an upcoming version. +bento_authz create grant \ + '{"iss": "<ISSUER_HERE>", "client": "aggregation"}' \ + '{"everything": true}' \ + 'query:data' 'view:private_portal' + +# Configure public data access +# ---------------------------- +# The level below ("counts") preserves previous functionality. Other possible options are: +# - none - will do nothing. +# - bool - for censored true/false discovery, but in effect right now forbids access. +# - counts - for censored count discovery. +# - full - allows full data access (record-level, including sensitive data such as IDs), uncensored counts, etc. +bento_authz public-data-access counts +``` + + +## 5. 
Optionally, add Beacon network + +To host a network of beacons, with a corresponding UI in Bento Public, first copy the config file: + +```bash +./bentoctl.bash init-config beacon-network +``` + + + +then update values at `lib/beacon/config/beacon_network_config.json`. Activate the network by adding (or modifying) this value in `local.env`: + + +```bash +BENTO_BEACON_NETWORK_ENABLED='true' +``` + +## 6. Start Bento + +```bash +./bentoctl.bash start +``` diff --git a/docs/monitoring.md b/docs/monitoring.md new file mode 100644 index 00000000..627983a4 --- /dev/null +++ b/docs/monitoring.md @@ -0,0 +1,93 @@ +# Bento Monitoring + +Previously, the only way to get the logs of a given service was to connect to the server hosting Bento +and get the logs directly from Docker. +Since v17, Bento includes tools that allow authenticated and authorized users to explore the services' logs +in a convenient web application. + +The stack enabling this is composed of three open-source services: +- Promtail: forwards the logs from Bento's services to the log database +- Loki: stores the logs from Promtail and serves them to Grafana +- Grafana: auth-protected web application to query and analyse collected logs + +## Configuration + +Enable monitoring by setting the feature flag in your `local.env` file: + +```bash +BENTO_MONITORING_ENABLED='true' +``` + +Pull the images and prepare the network/directories for monitoring containers: + +```bash +./bentoctl.bash pull +./bentoctl.bash init-docker +./bentoctl.bash init-dirs +``` + +Grafana is configured to only let in authenticated users from Bento's Keycloak realm, +if they have the required client permissions for Grafana. + +Create the Grafana OIDC client, its permissions and group mappings with the following: + +```bash +./bentoctl.bash init-auth +``` + +Set the outputted value for `BENTO_GRAFANA_CLIENT_SECRET` in the `local.env` file and restart Grafana. 
+ +```bash +./bentoctl.bash restart grafana +``` + +## User management + +In order for a user to access Grafana, they must belong to a Grafana sub-group in Bento's Keycloak. + +Group role-mappings in Keycloak: +- Grafana (parent group, no permission) + - Admin + - Editor permissions + - Administration of Grafana + - Editor + - Viewer permissions + - Can explore logs + - Can create dashboards for viewers + - Viewer + - Can view created dashboards + +The `admin`, `editor` and `viewer` roles are Grafana concepts. During authentication, Grafana will synchronize the +user's roles from Keycloak, and only let the user in if a valid role can be retrieved from the ID token. + +The `init-auth` step in the [configuration](#configuration) creates everything needed for this in Keycloak. + +The only remaining step is to add users to Grafana groups: +- In a browser, navigate to the Keycloak admin portal (your `BENTOV2_AUTH_DOMAIN`) +- Authenticate using the admin credentials +- By default the realm will be `Keycloak`, change it to Bento's realm (value of `BENTOV2_AUTH_REALM`) +- Navigate to the `Users` tab +- Select a user +- Select the user's `Groups` tab +- Click on the `Join Group` button +- Select a Grafana sub-group and click on `Join` + +![Grafana sub-group attribution](./img/kc_grafana_join_group.png) + +## Using Grafana + +The user can now connect to `bento_web` and access Grafana from the header tabs! + +At the moment, no default dashboards are provided, so only users with `admin` or `editor` roles will have +access to data. + +To look at the raw logs for a service: +- Select the `Explore` tab in Grafana +- Select the `service_name` label filter +- Select a service from the value drop down +- In the top-right, click on `run query`, `live`, or `Last