Full docs review

ckan · Aug 29, 2024 · f91f92b
1 parent 7dd0ba7
commit f91f92b

Showing 12 changed files with 681 additions and 529 deletions.
diff --git a/README.md b/README.md
@@ -877,7 +877,7 @@ This plugin also contains a profile to serialize a CKAN dataset to a [schema.org
 
 To define which profiles to use you can:
 
-1. Set the `ckanext.dcat.rdf.profiles` configuration option on your CKAN configuration file:
+1. Set the [`ckanext.dcat.rdf.profiles`](configuration.md#ckanextdcatrdfprofiles) configuration option on your CKAN configuration file:
 
     ckanext.dcat.rdf.profiles = euro_dcat_ap sweden_dcat_ap
 
@@ -1166,6 +1166,15 @@ Default value: `True`
 Whether to expose the catalog and dataset endpoints with the RDF DCAT
 serializations.
 
+#### ckanext.dcat.base_uri
+
+Example:
+
+```
+https://my-site.org/uris/
+```
+
+Base URI to use when generating URIs for all entities. It needs to be a valid URI value.
 
 #### ckanext.dcat.catalog_endpoint
 
@@ -1181,7 +1190,7 @@ Custom route for the catalog endpoint. It should start with `/` and include the
 `{_format}` placeholder.
 
 
-#### ckanext.dcat.dataset_per_page
+#### ckanext.dcat.datasets_per_page
 
 Default value: `100`
 

diff --git a/ckanext/dcat/config_declaration.yml b/ckanext/dcat/config_declaration.yml
@@ -63,14 +63,19 @@ groups:
           serializations.
         type: bool
 
+      - key: ckanext.dcat.base_uri
+        description: |
+          Base URI to use when generating URIs for all entities. It needs to be a valid URI value.
+        example: 'https://my-site.org/uri/'
+
       - key: ckanext.dcat.catalog_endpoint
         default: '/catalog.{_format}'
         description: |
           Custom route for the catalog endpoint. It should start with `/` and include the
           `{_format}` placeholder.
         example: '/dcat/catalog/{_format}'
 
-      - key: ckanext.dcat.dataset_per_page
+      - key: ckanext.dcat.datasets_per_page
         default: 100
         type: int
         description: |

diff --git a/docs/cli.md b/docs/cli.md
@@ -1,5 +1,3 @@
-## CLI
-
 The `ckan dcat` command offers utilites to transform between DCAT RDF Serializations and CKAN datasets (`ckan dcat consume`) and
 viceversa (`ckan dcat produce`). In both cases the input can be provided as a path to a file:
 
@@ -16,4 +14,3 @@ The latter form allows chaininig commands for more complex metadata processing,
     curl https://demo.ckan.org/api/action/package_search | jq .result.results | ckan dcat produce -f jsonld -
 
 For the full list of options check `ckan dcat consume --help` and  `ckan dcat produce --help`.
-
diff --git a/docs/configuration.md b/docs/configuration.md
@@ -1,5 +1,3 @@
-## Configuration reference
-
 <!-- start-config -->
 
 ### General settings
@@ -73,6 +71,15 @@ Default value: `True`
 Whether to expose the catalog and dataset endpoints with the RDF DCAT
 serializations.
 
+#### ckanext.dcat.base_uri
+
+Example:
+
+```
+https://my-site.org/uris/
+```
+
+Base URI to use when generating URIs for all entities. It needs to be a valid URI value.
 
 #### ckanext.dcat.catalog_endpoint
 
@@ -88,7 +95,7 @@ Custom route for the catalog endpoint. It should start with `/` and include the
 `{_format}` placeholder.
 
 
-#### ckanext.dcat.dataset_per_page
+#### ckanext.dcat.datasets_per_page
 
 Default value: `100`
 

diff --git a/docs/endpoints.md b/docs/endpoints.md
@@ -1,15 +1,13 @@
 # RDF DCAT endpoints
 
-By default when the `dcat` plugin is enabled, the following RDF endpoints are available on your CKAN instance. The schema used on the serializations can be customized using [profiles](#profiles).
+By default, when the `dcat` plugin is enabled, the following RDF endpoints are available on your CKAN instance. The schema used on the serializations can be customized using [profiles](profiles.md#profiles).
 
-To disable the RDF endpoints, you can set the following config in your ini file:
-
-    ckanext.dcat.enable_rdf_endpoints = False
+To disable the RDF endpoints, you can set the [`ckanext.dcat.enable_rdf_endpoints`](configuration.md#ckanextdcatenable_rdf_endpoints) option in your ini file.
 
 
 ## Dataset endpoints
 
-RDF representations of a particular dataset can accessed using the following endpoint:
+RDF representations of a particular dataset can be accessed using the following endpoint:
 
     https://{ckan-instance-host}/dataset/{dataset-id}.{format}
 
@@ -26,32 +24,32 @@ The fallback `rdf` format defaults to RDF/XML.
 
 Here's an example of the different formats:
 
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.rdf
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.ttl
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.n3
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld
-
-RDF representations will be advertised using `<link rel="alternate">` tags on the `<head>` sectionon the dataset page source code, eg:
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.rdf](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.rdf)
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml)
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.ttl](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.ttl)
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.n3](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.n3)
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld)
 
-    <head>
+RDF representations will be advertised using `<link rel="alternate">` tags on the `<head>` section of the dataset page source code, e.g.:
 
-        <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/dataset/34315559-2b08-44eb-a2e6-ebe9ce1a266b.rdf"/>
-        <link rel="alternate" type="text/turtle" href="http://demo.ckan.org/dataset/34315559-2b08-44eb-a2e6-ebe9ce1a266b.ttl"/>
-        <!-- ... -->
+```html
+<head>
 
-    </head>
+    <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/dataset/34315559-2b08-44eb-a2e6-ebe9ce1a266b.rdf"/>
+    <link rel="alternate" type="text/turtle" href="http://demo.ckan.org/dataset/34315559-2b08-44eb-a2e6-ebe9ce1a266b.ttl"/>
+    <!-- ... -->
 
+</head>
+```
 
-Check the [RDF DCAT Serializer](#rdf-dcat-serializer) section for more details about how these are generated and how to customize the output using [profiles](#profiles).
+Check the [RDF DCAT Serializer](profiles.md#rdf-dcat-serializer) section for more details about how these are generated and how to customize the output using [profiles](profiles.md#profiles).
 
 
 You can specify the profile by using the `profiles=<profile1>,<profile2>` query parameter on the dataset endpoint (as a comma-separated list):
 
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml?profiles=euro_dcat_ap
-* https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld?profiles=schemaorg
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml?profiles=euro_dcat_ap](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml?profiles=euro_dcat_ap)
+* [https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld?profiles=schemaorg](https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.jsonld?profiles=schemaorg)
 
-*Note*: When using this plugin, the above endpoints will replace the old deprecated ones that were part of CKAN core.
 
 
 ## Catalog endpoint
@@ -60,7 +58,7 @@ Additionally to the individual dataset representations, the extension also offer
 
     https://{ckan-instance-host}/catalog.{format}?[page={page}]&[modified_since={date}]&[profiles={profile1},{profile2}]&[q={query}]&[fq={filter query}]
 
-This endpoint can be customized if necessary using the `ckanext.dcat.catalog_endpoint` configuration option, eg:
+The base path of this endpoint can be customized if necessary using the [`ckanext.dcat.catalog_endpoint`](configuration.md#ckanextdcatcatalog_endpoint) configuration option, e.g.:
 
     ckanext.dcat.catalog_endpoint = /dcat/catalog/{_format}
 
@@ -72,44 +70,47 @@ As described previously, the extension will determine the RDF serialization form
 * http://demo.ckan.org/catalog.xml
 * http://demo.ckan.org/catalog.ttl
 
-RDF representations will be advertised using `<link rel="alternate">` tags on the `<head>` sectionon the homepage and the dataset search page source code, eg:
+RDF representations will be advertised using `<link rel="alternate">` tags on the `<head>` section of the catalog homepage and the dataset search page source code, e.g.:
 
-    <head>
+```html
+<head>
 
+    <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/catalog.rdf"/>
+    <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/catalog.xml"/>
+    <link rel="alternate" type="text/turtle" href="http://demo.ckan.org/catalog.ttl"/>
+    <!-- ... -->
 
-        <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/catalog.rdf"/>
-        <link rel="alternate" type="application/rdf+xml" href="http://demo.ckan.org/catalog.xml"/>
-        <link rel="alternate" type="text/turtle" href="http://demo.ckan.org/catalog.ttl"/>
-        <!-- ... -->
+</head>
+```
 
-    </head>
+The number of datasets returned is limited. The response will include paging info, serialized using the [Hydra](http://www.w3.org/ns/hydra/spec/latest/core/) vocabulary. The different properties are self-explanatory, and can be used by clients to iterate over the catalog:
 
-The number of datasets returned is limited. The response will include paging info, serialized using the [Hydra](http://www.w3.org/ns/hydra/spec/latest/core/) vocabulary. The different terms are self-explanatory, and can be used by clients to iterate the catalog:
+```turtle
+@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
 
-    @prefix hydra: <http://www.w3.org/ns/hydra/core#> .
+<http://example.com/catalog.ttl?page=1> a hydra:PagedCollection ;
+    hydra:first "http://example.com/catalog.ttl?page=1" ;
+    hydra:last "http://example.com/catalog.ttl?page=3" ;
+    hydra:next "http://example.com/catalog.ttl?page=2" ;
+    hydra:totalItems 283 .
+```
 
-    <http://example.com/catalog.ttl?page=1> a hydra:PagedCollection ;
-        hydra:first "http://example.com/catalog.ttl?page=1" ;
-        hydra:last "http://example.com/catalog.ttl?page=3" ;
-        hydra:next "http://example.com/catalog.ttl?page=2" ;
-        hydra:totalItems 283 .
+The default number of datasets returned (100) can be modified by CKAN site maintainers using the [`ckanext.dcat.datasets_per_page`](configuration.md#ckanextdcatdatasets_per_page) configuration option.
 
-The default number of datasets returned (100) can be modified by CKAN site maintainers using the following configuration option on your ini file:
+The catalog endpoint also supports a `modified_since` parameter to restrict datasets to those modified from a certain date. The parameter value should be a valid ISO-8601 date:
 
-    ckanext.dcat.datasets_per_page = 20
+    http://demo.ckan.org/catalog.xml?modified_since=2015-07-24
 
-The catalog endpoint also supports a `modified_since` parameter to restrict datasets to those modified from a certain date. The parameter value should be a valid ISO-8601 date:
+It is possible to specify the profile(s) to use for the serialization using the `profiles` parameter:
 
-http://demo.ckan.org/catalog.xml?modified_since=2015-07-24
+    http://demo.ckan.org/catalog.xml?profiles=euro_dcat_ap,sweden_dcat_ap
 
-It's possible to specify the profile(s) to use for the serialization using the `profiles` parameter:
+To filter the output, the catalog endpoint supports the `q` and `fq` parameters to specify a [search query](https://solr.apache.org/guide/solr/latest/query-guide/dismax-query-parser.html#q-parameter) or [filter query](https://solr.apache.org/guide/solr/latest/query-guide/common-query-parameters.html#fq-filter-query-parameter):
 
-http://demo.ckan.org/catalog.xml?profiles=euro_dcat_ap,sweden_dcat_ap
 
-To filter the output, the catalog endpoint supports the `q` and `fq` parameters to specify a [search query](https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-TheqParameter) or [filter query](https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#CommonQueryParameters-Thefq_FilterQuery_Parameter):
 
-http://demo.ckan.org/catalog.xml?q=budget
-http://demo.ckan.org/catalog.xml?fq=tags:economy
+    http://demo.ckan.org/catalog.xml?q=budget
+    http://demo.ckan.org/catalog.xml?fq=tags:economy
 
 
 
@@ -118,8 +119,8 @@ http://demo.ckan.org/catalog.xml?fq=tags:economy
 Whenever possible, URIs are generated for the relevant entities. To try to generate them, the extension will use the first found of the following for each entity:
 
 * Catalog:
-    - `ckanext.dcat.base_uri` configuration option value. This is the recommended approach. Value should be a valid URI
-    - `ckan.site_url` configuration option value.
+    - [`ckanext.dcat.base_uri`](configuration.md#ckanextdcatbase_uri) configuration option value. This is the recommended approach. Value should be a valid URI.
+    - [`ckan.site_url`](https://docs.ckan.org/en/latest/maintaining/configuration.html#ckan-site-url) configuration option value.
     - 'http://' + `app_instance_uuid` configuration option value. This is not recommended, and a warning log message will be shown.
 
 * Dataset:
@@ -131,12 +132,18 @@ Whenever possible, URIs are generated for the relevant entities. To try to gener
     - The value of the `uri` field (note that this is not included in the default CKAN schema)
     - Catalog URI (see above) + '/dataset/' + `package_id` field + '/resource/ + `id` field
 
-Note that if you are using the [RDF DCAT harvester](#rdf-dcat-harvester) to import datasets from other catalogs and these define a proper URI for each dataset or resource, these will be stored as `uri` fields in your instance, and thus used when generating serializations for them.
+Note that if you are using the [RDF DCAT harvester](harvester.md) to import datasets from other catalogs and these define a proper URI for each dataset or resource, these will be stored as `uri` fields in your instance, and so used when generating serializations for them.
 
 
 ## Content negotiation
 
-The extension supports returning different representations of the datasets based on the value of the `Accept` header ([Content negotiation](https://en.wikipedia.org/wiki/Content_negotiation)).
+The extension supports returning different representations of the datasets based on the value of the `Accept` header ([Content negotiation](https://en.wikipedia.org/wiki/Content_negotiation)). This is turned off by default; to enable it, set [`ckanext.dcat.enable_content_negotiation`](configuration.md#ckanextdcatenable_content_negotiation) to `True`.
+
+!!! Note
+
+    This feature overrides the CKAN core home page and dataset page view routes,
+    so you probably don't want to enable it if your own extension also overrides them.
+
 
 When enabled, client applications can request a particular format via the `Accept` header on requests to the main dataset page, eg:
 
@@ -147,9 +154,3 @@ When enabled, client applications can request a particular format via the `Accep
 This is also supported on the [catalog endpoint](#catalog-endpoint), in this case when making a request to the CKAN root URL (home page). This won't support the pagination and filter parameters:
 
     curl https://{ckan-instance-host} -H Accept:text/turtle
-
-Note that this feature overrides the CKAN core home page and dataset page controllers, so you probably don't want to enable it if your own extension is also doing it.
-
-To enable content negotiation, set the following configuration option on your ini file:
-
-    ckanext.dcat.enable_content_negotiation = True
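
Review note: the catalog endpoint section in `docs/endpoints.md` describes several optional query parameters (`page`, `modified_since`, `profiles`, `q`, `fq`). A short client-side sketch might help readers see how they combine. The helper below is hypothetical and not part of ckanext-dcat; it assumes the default `/catalog.{_format}` route and an `https` host, and only uses the Python standard library.

```python
from urllib.parse import urlencode


def build_catalog_url(host, fmt="xml", page=None, modified_since=None,
                      profiles=None, q=None, fq=None):
    """Build a catalog endpoint URL (hypothetical helper, default route assumed)."""
    params = {}
    if page is not None:
        params["page"] = page
    if modified_since is not None:
        # Expected to be an ISO-8601 date string, e.g. "2015-07-24"
        params["modified_since"] = modified_since
    if profiles:
        # Profiles are passed as a comma-separated list
        params["profiles"] = ",".join(profiles)
    if q is not None:
        params["q"] = q
    if fq is not None:
        params["fq"] = fq

    url = f"https://{host}/catalog.{fmt}"
    if params:
        # Keep "," and ":" unescaped so the URL matches the documented examples
        url += "?" + urlencode(params, safe=",:")
    return url


if __name__ == "__main__":
    print(build_catalog_url("demo.ckan.org",
                            profiles=["euro_dcat_ap", "sweden_dcat_ap"]))
    print(build_catalog_url("demo.ckan.org", fmt="ttl", page=2,
                            fq="tags:economy"))
```

A client iterating the catalog could call this with increasing `page` values until the response no longer includes a `hydra:next` property; parsing the response itself would need an RDF library and is left out of this sketch.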