From 3a4cd844ee57805a8118d46728916b1b2604fd09 Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Tue, 4 Jun 2024 16:58:21 +0200
Subject: [PATCH 1/7] Update documentation for the Federated API

---
 content/api/federated.en.md | 43 ++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/content/api/federated.en.md b/content/api/federated.en.md
index 8544550..1457ae5 100644
--- a/content/api/federated.en.md
+++ b/content/api/federated.en.md
@@ -1,23 +1,23 @@
 ---
-title: Federated
+title: Federation
 weight: 3
 ---
 
-# Federated Web API [Beta]
+# Federation Web API [Beta]
 
-Open Terms Archive is a decentralised system that tracks collections of services' terms across multiple servers. Each collection operates its own API, and the federated API unifies search and discovery across collections, fostering collaboration with external applications.
+Open Terms Archive is a decentralised system that tracks collections of services' terms across multiple servers. Each collection operates its own API, and the federation API unifies search and discovery across collections, fostering collaboration with external applications.
 
-The Federated Web API exposes JSON data over HTTP. Its [documentation](http://51.89.136.45/v1/docs/) is provided in a dedicated, interactive interface.
+The Federation Web API exposes JSON data over HTTP. Its [documentation](http://162.19.74.224/federation-api/v1/docs/) is provided in a dedicated, interactive interface.
 
 That endpoint exposes both the [OpenAPI](https://swagger.io/specification/) specification if the requested `Content-Type` is JSON, and a Swagger UI for visual and interactive documentation otherwise.
 
 ## Beta
 
-This API is offered as a preview, based on a first use case [defined](https://github.com/OpenTermsArchive/engine/issues/1016) with partner [ToS;DR](https://tosdr.org). Unexpected problems or missing functionality may arise. Please provide feedback through [issues](https://github.com/OpenTermsArchive/federated-api/issues) in the dedicated repository.
+This API is offered as a preview, based on a first use case [defined](https://github.com/OpenTermsArchive/engine/issues/1016) with partner [ToS;DR](https://tosdr.org). Unexpected problems or missing functionality may arise. Please provide feedback through [issues](https://github.com/OpenTermsArchive/federation-api/issues) in the dedicated repository.
 
 ## Source code
 
-The codebase for the Federated API is available on [`github.com/OpenTermsArchive/federated-api`](https://github.com/OpenTermsArchive/federated-api).
+The codebase for the Federation API is available on [`github.com/OpenTermsArchive/federation-api`](https://github.com/OpenTermsArchive/federation-api).
 
 ## Configuring
 
@@ -25,21 +25,24 @@ The default configuration can be found in `config/default.json`. The full refere
 
 ```js
 {
-  "logger": { // Logging mechanism to be notified upon error
-    "smtp": {
-      "host": "SMTP server hostname", // Hostname of the SMTP server for sending emails
-      "username": "User for server authentication" // Password for server authentication is defined in environment variables, see the “Environment variables” section below
-    },
-    "sendMailOnError": { // Can be set to `false` to disable sending email on error
-      "to": "The address to send the email to in case of an error",
-      "from": "The address from which to send the email",
-      "sendWarnings": "Boolean. Set to true to also send email in case of warning. Default: false",
+  "@opentermsarchive/federation-api": {
+    "logger": { // Logging mechanism to be notified upon error
+      "smtp": {
+        "host": "SMTP server hostname", // Hostname of the SMTP server for sending emails
+        "username": "User for server authentication" // Password for server authentication is defined in environment variables, see the “Environment variables” section below
+      },
+      "sendMailOnError": { // Can be set to `false` to disable sending email on error
+        "to": "The address to send the email to in case of an error",
+        "from": "The address from which to send the email",
+        "sendWarnings": "Boolean. Set to true to also send email in case of warning. Default: false",
+      }
     }
+    "port": "Port number on which the server will listen for incoming connections. Default: 3333",
+    "basePath": "The base path for the API endpoints",
+    "collections": [ // Overriding this value creates a risk of splintering the federation, make sure to fully understand what happens when changing this value
+      "List of collections to federate; see below for how to configure. Default: https://opentermsarchive.org/collections.json"
+    ]
   }
-  "port": "Port number on which the server will listen for incoming connections. Default: 3333",
-  "collections": [ // Overriding this value creates a risk of splintering the federation, make sure to fully understand what happens when changing this value
-    "List of collections to federate; see below for how to configure. Default: https://opentermsarchive.org/collections.json"
-  ]
 }
 ```
 
@@ -88,4 +91,4 @@ If multiple collections share the same `id`, the latest defined collection in th
 
 ## Deploying
 
-Deployment recipes are available in a [dedicated repository](https://github.com/OpenTermsArchive/deployment). Look at the [Federated API section](https://github.com/OpenTermsArchive/deployment#federated-api-application) on the README to know how to deploy the API.
+Deployment recipes are available in a [dedicated repository](https://github.com/OpenTermsArchive/deployment).

From d34815d274d777176eb73daad344a2b106f437bc Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Tue, 4 Jun 2024 16:59:00 +0200
Subject: [PATCH 2/7] Rename `federated` into `federation`

---
 content/api/{federated.en.md => federation.en.md} | 0
 content/collections/federation.en.md              | 2 +-
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename content/api/{federated.en.md => federation.en.md} (100%)

diff --git a/content/api/federated.en.md b/content/api/federation.en.md
similarity index 100%
rename from content/api/federated.en.md
rename to content/api/federation.en.md
diff --git a/content/collections/federation.en.md b/content/collections/federation.en.md
index 7d07905..2783492 100644
--- a/content/collections/federation.en.md
+++ b/content/collections/federation.en.md
@@ -16,7 +16,7 @@ A collection that **joins** the **federation** enjoys the following benefits:
 1. Visibility on the Open Terms Archive website lists of collections and datasets.
 2. Access to the Open Terms Archive GitHub organisation, administered by the Open Terms Archive core team.
 3. Collection logo provided by the Open Terms Archive core team.
-4. Referencing in the official [collections list](https://opentermsarchive.org/collections.json), enabling off-the-shelf discovery in the [Federated API]({{< relref "api/federated" >}}).
+4. Referencing in the official [collections list](https://opentermsarchive.org/collections.json), enabling off-the-shelf discovery in the [Federation API]({{< relref "api/federation" >}}).
 5. Referencing in the official [datasets list](https://opentermsarchive.org/datasets), providing visibility to analysts.
 6. Dedicated channel on the Open Terms Archive instant messaging system.
 7. API uptime tracking.
From 46deb1148238e47c009da795edf6b36c11f2bee0 Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Tue, 4 Jun 2024 16:59:17 +0200
Subject: [PATCH 3/7] Update `engine` documentation

---
 content/_index.en.md         | 98 +++++++++++++++++++-----------------
 content/api/collection.en.md |  2 +-
 2 files changed, 52 insertions(+), 48 deletions(-)

diff --git a/content/_index.en.md b/content/_index.en.md
index 2b22bfc..faa1e53 100644
--- a/content/_index.en.md
+++ b/content/_index.en.md
@@ -286,59 +286,63 @@ The default configuration can be found in `config/default.json`. The full refere
 
 ```js
 {
-  "services": {
-    "declarationsPath": "Directory containing services declarations and associated filters"
-  },
-  "recorder": {
-    "versions": {
-      "storage": {
-        "": "Storage repository configuration object; see below"
+  "@opentermsarchive/engine": {
+    "trackingSchedule": "Cron expression to define the tracking schedule",
+    "services": {
+      "declarationsPath": "Directory containing services declarations and associated filters"
+    },
+    "recorder": {
+      "versions": {
+        "storage": {
+          "": "Storage repository configuration object; see below"
+        }
+      },
+      "snapshots": {
+        "storage": {
+          "": "Storage repository configuration object; see below"
+        }
       }
     },
-    "snapshots": {
-      "storage": {
-        "": "Storage repository configuration object; see below"
+    "fetcher": {
+      "waitForElementsTimeout": "Maximum time (in milliseconds) to wait for elements to be present in the page when fetching document in a headless browser"
+      "navigationTimeout": "Maximum time (in milliseconds) to wait for page to load",
+      "language": "Language (in ISO 639-1 format) to pass in request headers"
+    },
+    "notifier": { // Notify specified mailing lists when new versions are recorded
+      "sendInBlue": { // SendInBlue API Key is defined in environment variables, see the “Environment variables” section below
+        "updatesListId": "SendInBlue contacts list ID of persons to notify on terms updates",
+        "updateTemplateId": "SendInBlue email template ID used for updates notifications"
       }
-    }
-  },
-  "fetcher": {
-    "waitForElementsTimeout": "Maximum time (in milliseconds) to wait for elements to be present in the page when fetching document in a headless browser"
-    "navigationTimeout": "Maximum time (in milliseconds) to wait for page to load",
-    "language": "Language (in ISO 639-1 format) to pass in request headers"
-  },
-  "notifier": { // Notify specified mailing lists when new versions are recorded
-    "sendInBlue": { // SendInBlue API Key is defined in environment variables, see the “Environment variables” section below
-      "updatesListId": "SendInBlue contacts list ID of persons to notify on terms updates",
-      "updateTemplateId": "SendInBlue email template ID used for updates notifications"
-    }
-  },
-  "logger": { // Logging mechanism to be notified upon error
-    "smtp": {
-      "host": "SMTP server hostname",
-      "username": "User for server authentication" // Password for server authentication is defined in environment variables, see the “Environment variables” section below
     },
-    "sendMailOnError": { // Can be set to `false` if sending email on error is not needed
-      "to": "The address to send the email to in case of an error",
-      "from": "The address from which to send the email",
-      "sendWarnings": "Boolean. Set to true to also send email in case of warning",
-    }
-  },
-  "reporter": { // Reporter mechanism to create GitHub issues when terms content is inaccessible
-    "githubIssues": {
-      "repositories": {
-        "declarations": "GitHub repository where to create issues; expected format: /",
-        "versions": "GitHub repository of versions associated with the declarations; expected format: /",
-        "snapshots": "GitHub repository of snapshots associated with the declarations; expected format: /"
+    "logger": { // Logging mechanism to be notified upon error
+      "smtp": {
+        "host": "SMTP server hostname",
+        "username": "User for server authentication" // Password for server authentication is defined in environment variables, see the “Environment variables” section below
+      },
+      "sendMailOnError": { // Can be set to `false` if sending email on error is not needed
+        "to": "The address to send the email to in case of an error",
+        "from": "The address from which to send the email",
+        "sendWarnings": "Boolean. Set to true to also send email in case of warning",
       }
+    },
+    "reporter": { // Reporter mechanism to create GitHub issues when terms content is inaccessible
+      "githubIssues": {
+        "repositories": {
+          "declarations": "GitHub repository where to create issues; expected format: /",
+          "versions": "GitHub repository of versions associated with the declarations; expected format: /",
+          "snapshots": "GitHub repository of snapshots associated with the declarations; expected format: /"
+        }
+      }
+    },
+    "dataset": { // Release mechanism to create dataset periodically
+      "title": "Title of the dataset; recommended to be the name of the instance that generated it",
+      "versionsRepositoryURL": "GitHub repository where the dataset will be published as a release; recommended to be the versions repository for discoverability and tagging purposes",
+      "publishingSchedule": "Cron expression to define the dataset publishing schedule"
+    },
+    "collection-api": { // Collection metadata API
+      "port": "The port number on which the API will listen for incoming requests",
+      "basePath": "The base path for the API endpoints"
     }
-  },
-  "dataset": { // Release mechanism to create dataset periodically
-    "title": "Title of the dataset; recommended to be the name of the instance that generated it",
-    "versionsRepositoryURL": "GitHub repository where the dataset will be published as a release; recommended to be the versions repository for discoverability and tagging purposes"
-  },
-  "api": { // Collection metadata API
-    "port": "The port number on which the API will listen for incoming requests",
-    "basePath": "The base path for the API endpoints"
   }
 }
 ```
diff --git a/content/api/collection.en.md b/content/api/collection.en.md
index 1cc3c5b..e5dfc8b 100644
--- a/content/api/collection.en.md
+++ b/content/api/collection.en.md
@@ -11,4 +11,4 @@ The Collection API exposes JSON data over HTTP. Its [OpenAPI](https://swagger.io
 
 That endpoint exposes both the OpenAPI specification if the requested `Content-Type` is JSON, and a Swagger UI for visual and interactive documentation otherwise.
 
-> For example, the [documentation](http://162.19.74.224/api/v1/docs) of the [Demo collection](https://github.com/OpenTermsArchive/demo-declarations) is publicly available for exploration.
+> For example, the [documentation](http://162.19.74.224/collection-api/v1/docs) of the [Demo collection](https://github.com/OpenTermsArchive/demo-declarations) is publicly available for exploration.

From 7383db2ff61ef255bc4076aefdf5dfee27d301ce Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Wed, 5 Jun 2024 14:16:32 +0200
Subject: [PATCH 4/7] Add a `Schedules` dedicated section

---
 content/_index.en.md | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/content/_index.en.md b/content/_index.en.md
index faa1e53..77935b5 100644
--- a/content/_index.en.md
+++ b/content/_index.en.md
@@ -287,7 +287,7 @@ The default configuration can be found in `config/default.json`. The full refere
 ```js
 {
   "@opentermsarchive/engine": {
-    "trackingSchedule": "Cron expression to define the tracking schedule",
+    "trackingSchedule": "Cron expression to define the tracking schedule; see below",
     "services": {
       "declarationsPath": "Directory containing services declarations and associated filters"
     },
@@ -337,7 +337,7 @@ The default configuration can be found in `config/default.json`. The full refere
     "dataset": { // Release mechanism to create dataset periodically
       "title": "Title of the dataset; recommended to be the name of the instance that generated it",
       "versionsRepositoryURL": "GitHub repository where the dataset will be published as a release; recommended to be the versions repository for discoverability and tagging purposes",
-      "publishingSchedule": "Cron expression to define the dataset publishing schedule"
+      "publishingSchedule": "Cron expression to define the dataset publishing schedule; see below"
     },
     "collection-api": { // Collection metadata API
       "port": "The port number on which the API will listen for incoming requests",
@@ -351,6 +351,20 @@ The default configuration is merged with (and overridden by) environment-specifi
 
 For development, in order to have a local configuration that overrides the existing config, it is recommended to create a `config/development.json` file.
 
+#### Schedules
+
+Schedules for tracking and dataset publication are defined using CRON expressions.
+
+A CRON expression is a string comprised of five or six fields separated by spaces, each representing a different unit of time: minute, hour, day of the month, month, and day of the week (and optionally, year). For example, the expression `30 */12 * * *` means "at minute 30 past every 12th hour of every day."
+
+Here are some examples of CRON expressions and what they represent:
+
+- `0 0 * * *`: Run at midnight every day.
+- `0 */6 * * *`: Run every 6 hours.
+- `30 2 * * MON`: Run at 2:30 AM every Monday.
+.
+Some online tools, such as [crontab.guru](https://crontab.guru), which provide a user-friendly interface can be used to create and validate CRON expressions.
+
 #### Storage repositories
 
 Two storage repositories are currently supported: Git and MongoDB. Each one can be used independently for versions and snapshots.

From 112c38a99b9933043e9ee7b6b2b0e071262deedb Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Wed, 5 Jun 2024 14:48:57 +0200
Subject: [PATCH 5/7] Add alias
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Clément Biron
---
 content/api/federation.en.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/content/api/federation.en.md b/content/api/federation.en.md
index 1457ae5..31f5d21 100644
--- a/content/api/federation.en.md
+++ b/content/api/federation.en.md
@@ -1,6 +1,7 @@
 ---
 title: Federation
 weight: 3
+aliases: /api/federated/
 ---
 
 # Federation Web API [Beta]

From 87319261c39adc2de56cbb22e31c20360b74c2c9 Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Wed, 5 Jun 2024 15:12:01 +0200
Subject: [PATCH 6/7] Improve wording
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Clément Biron
Co-authored-by: Matti Schneider
---
 content/_index.en.md         | 4 ++--
 content/api/federation.en.md | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/content/_index.en.md b/content/_index.en.md
index 77935b5..cfec883 100644
--- a/content/_index.en.md
+++ b/content/_index.en.md
@@ -357,13 +357,13 @@ Schedules for tracking and dataset publication are defined using CRON expression
 
 A CRON expression is a string comprised of five or six fields separated by spaces, each representing a different unit of time: minute, hour, day of the month, month, and day of the week (and optionally, year). For example, the expression `30 */12 * * *` means "at minute 30 past every 12th hour of every day."
 
-Here are some examples of CRON expressions and what they represent:
+Here are some valid examples of CRON expressions and what they represent:
 
 - `0 0 * * *`: Run at midnight every day.
 - `0 */6 * * *`: Run every 6 hours.
 - `30 2 * * MON`: Run at 2:30 AM every Monday.
 .
-Some online tools, such as [crontab.guru](https://crontab.guru), which provide a user-friendly interface can be used to create and validate CRON expressions.
+Some online tools, such as [crontab.guru](https://crontab.guru), provide a user-friendly interface to create and validate CRON expressions.
 
 #### Storage repositories
 
diff --git a/content/api/federation.en.md b/content/api/federation.en.md
index 31f5d21..ee404d7 100644
--- a/content/api/federation.en.md
+++ b/content/api/federation.en.md
@@ -4,11 +4,11 @@ weight: 3
 aliases: /api/federated/
 ---
 
-# Federation Web API [Beta]
+# Federation API
 
-Open Terms Archive is a decentralised system that tracks collections of services' terms across multiple servers. Each collection operates its own API, and the federation API unifies search and discovery across collections, fostering collaboration with external applications.
+Open Terms Archive is a decentralised system that tracks collections of services' terms across multiple servers. Each collection operates its own API, and the Federation API unifies search and discovery across collections, fostering collaboration with external applications.
 
-The Federation Web API exposes JSON data over HTTP. Its [documentation](http://162.19.74.224/federation-api/v1/docs/) is provided in a dedicated, interactive interface.
+The Federation API exposes JSON data over HTTP. Its [documentation](http://162.19.74.224/federation-api/v1/docs/) is provided in a dedicated, interactive interface.
 
 That endpoint exposes both the [OpenAPI](https://swagger.io/specification/) specification if the requested `Content-Type` is JSON, and a Swagger UI for visual and interactive documentation otherwise.
 
From afdc4f1a6a54bf8893fe4a80e0667a07dddc6c96 Mon Sep 17 00:00:00 2001
From: Nicolas Dupont
Date: Wed, 5 Jun 2024 15:13:54 +0200
Subject: [PATCH 7/7] Fix Cron spelling

Co-authored-by: Matti Schneider
---
 content/_index.en.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/_index.en.md b/content/_index.en.md
index cfec883..9f8f26e 100644
--- a/content/_index.en.md
+++ b/content/_index.en.md
@@ -353,17 +353,17 @@ For development, in order to have a local configuration that overrides the exist
 
 #### Schedules
 
-Schedules for tracking and dataset publication are defined using CRON expressions.
+Schedules for tracking and dataset publication are defined using Cron expressions.
 
-A CRON expression is a string comprised of five or six fields separated by spaces, each representing a different unit of time: minute, hour, day of the month, month, and day of the week (and optionally, year). For example, the expression `30 */12 * * *` means "at minute 30 past every 12th hour of every day."
+A Cron expression is a string comprised of five or six fields separated by spaces, each representing a different unit of time: minute, hour, day of the month, month, and day of the week (and optionally, year). For example, the expression `30 */12 * * *` means "at minute 30 past every 12th hour of every day."
 
-Here are some valid examples of CRON expressions and what they represent:
+Here are some valid examples of Cron expressions and what they represent:
 
 - `0 0 * * *`: Run at midnight every day.
 - `0 */6 * * *`: Run every 6 hours.
 - `30 2 * * MON`: Run at 2:30 AM every Monday.
 .
-Some online tools, such as [crontab.guru](https://crontab.guru), provide a user-friendly interface to create and validate Cron expressions.
+Some online tools, such as [crontab.guru](https://crontab.guru), provide a user-friendly interface to create and validate Cron expressions.
 
 #### Storage repositories
 
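
Note on the Schedules section added in this series: the five-field layout it documents (minute, hour, day of the month, month, day of the week) can be illustrated with a minimal sketch. The helper `describeCronFields` and the `FIELD_NAMES` constant are hypothetical names introduced only for illustration; they are not part of the engine, and the sketch deliberately rejects six-field expressions because the meaning of a sixth field depends on the Cron implementation in use.

```js
// Hypothetical helper, for illustration only: it is not part of the engine.
// It labels the five standard fields of a Cron expression without evaluating them.
const FIELD_NAMES = ['minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'];

function describeCronFields(expression) {
  // Fields are separated by one or more whitespace characters
  const fields = expression.trim().split(/\s+/);

  // Assumption: only the five-field form is handled in this sketch
  if (fields.length !== FIELD_NAMES.length) {
    throw new Error(`Expected ${FIELD_NAMES.length} fields, got ${fields.length}`);
  }

  // Pair each field name with the value found at the same position
  return Object.fromEntries(FIELD_NAMES.map((name, index) => [name, fields[index]]));
}

// Example taken from the documentation: "at minute 30 past every 12th hour of every day"
console.log(describeCronFields('30 */12 * * *'));
```

This only names the fields; validating ranges or computing the next run time is better left to the scheduler actually used or to a tool such as crontab.guru.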