Merge pull request #293 from ckan/config-declaration
Add config declaration, document options
amercader authored Aug 22, 2024
2 parents a09a63d + a2e9600 commit d00cbd6
Showing 4 changed files with 277 additions and 1 deletion.
150 changes: 149 additions & 1 deletion README.md
@@ -50,6 +50,7 @@ Check the [overview](#overview) section for a summary of the available features.
- [Translation of fields](#translation-of-fields)
- [Structured data and Google Dataset Search indexing](#structured-data-and-google-dataset-search-indexing)
- [CLI](#cli)
- [Configuration reference](#configuration-reference)
- [Running the Tests](#running-the-tests)
- [Releases](#releases)
- [Acknowledgements](#acknowledgements)
@@ -95,7 +96,7 @@ These are implemented internally using:

3. Enable the required plugins in your ini file:

ckan.plugins = dcat dcat_rdf_harvester dcat_json_harvester dcat_json_interface structured_data
ckan.plugins = dcat dcat_rdf_harvester structured_data

4. To use the pre-built schemas, install [ckanext-scheming](https://github.com/ckan/ckanext-scheming):

@@ -105,6 +106,11 @@ Check the [Schemas](#schemas) section for extra configuration needed.

Optionally, if you want to use the RDF harvester, install ckanext-harvest as well ([https://github.com/ckan/ckanext-harvest#installation](https://github.com/ckan/ckanext-harvest#installation)).

For all available configuration options, see the [Configuration reference](#configuration-reference).




## Schemas

The extension includes ready to use [ckanext-scheming](https://github.com/ckan/ckanext-scheming) schemas that enable DCAT support. These include a schema definition file (located in `ckanext/dcat/schemas`) plus extra validators and other custom logic that integrates the metadata modifications with the RDF DCAT [Parsers](#rdf-dcat-parser) and [Serializers](#rdf-dcat-serializer) and other CKAN features and extensions.
@@ -1142,6 +1148,148 @@ The latter form allows chaining commands for more complex metadata processing,

For the full list of options check `ckan dcat consume --help` and `ckan dcat produce --help`.

## Configuration reference

<!-- start-config -->

### General settings

#### ckanext.dcat.rdf.profiles

Example:

```
ckanext.dcat.rdf.profiles = euro_dcat_ap_2 my_local_ap
```

Default value: `euro_dcat_ap_2`

RDF profiles to use when parsing and serializing. See https://github.com/ckan/ckanext-dcat#profiles
for more details.
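
As a rough illustration of how a space-separated value like the one above maps to a list of profile names, here is a minimal Python sketch. The `parse_profiles` function and the plain dict standing in for the CKAN config object are hypothetical and not part of ckanext-dcat:

```python
DEFAULT_RDF_PROFILES = ["euro_dcat_ap_2"]


def parse_profiles(config):
    """Return the profile names found in a config mapping (hypothetical helper)."""
    raw = config.get("ckanext.dcat.rdf.profiles")
    if not raw:
        # Option missing or empty: fall back to the documented default.
        return list(DEFAULT_RDF_PROFILES)
    # str.split() with no arguments splits on any run of whitespace.
    return raw.split()


print(parse_profiles({"ckanext.dcat.rdf.profiles": "euro_dcat_ap_2 my_local_ap"}))
print(parse_profiles({}))
```

The order of the names is preserved, which matters if profiles are applied one after another.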


#### ckanext.dcat.translate_keys

Default value: `True`

If set to True, the plugin will automatically translate the keys of the DCAT
fields used in the frontend (at least those present in the `ckanext/dcat/i18n`
po files).


### Parsers / Serializers settings

#### ckanext.dcat.output_spatial_format

Default value: `wkt`

Format to use for geometries when serializing RDF documents. The default is
recommended, as it is the format expected by GeoDCAT. Alternatively, you can
use `geojson` (or both, which will make SHACL validation fail).


#### ckanext.dcat.resource.inherit.license

Default value: `False`

If there is no license defined for a resource / distribution, inherit it from
the dataset.


#### ckanext.dcat.normalize_ckan_format

Default value: `True`

When true, the resource label is matched, where possible, against the standard
list of CKAN formats (https://github.com/ckan/ckan/blob/master/ckan/config/resource_formats.json).
This makes it possible, for instance, to populate the CKAN resource format field
with a value that view plugins, etc. will understand (`csv`, `xml`, etc.).
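
The idea behind this option can be sketched in a few lines of Python. This is only an illustration: `KNOWN_FORMATS` is a small hypothetical lookup table (the real list lives in CKAN's `resource_formats.json`), and `normalize_format` is not a ckanext-dcat function:

```python
# Hypothetical excerpt of a label -> canonical format lookup table.
KNOWN_FORMATS = {
    "csv": "CSV",
    "text/csv": "CSV",
    "comma separated values": "CSV",
    "xml": "XML",
    "application/xml": "XML",
}


def normalize_format(label, normalize=True):
    """Return a canonical format name, or the label unchanged if nothing matches."""
    if not normalize:
        return label
    # Compare case-insensitively and ignore surrounding whitespace.
    return KNOWN_FORMATS.get(label.strip().lower(), label)


print(normalize_format("text/csv"))
print(normalize_format("Some custom format"))
```

Labels that do not match any known format are passed through untouched, so no information is lost when normalization is enabled.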


#### ckanext.dcat.clean_tags

Default value: `False`

Remove special characters from keywords (use the old munge_tag() CKAN function).
This is generally not needed.


### Endpoints settings

#### ckanext.dcat.enable_rdf_endpoints

Default value: `True`

Whether to expose the catalog and dataset endpoints with the RDF DCAT
serializations.


#### ckanext.dcat.catalog_endpoint

Example:

```
ckanext.dcat.catalog_endpoint = /dcat/catalog/{_format}
```

Default value: `/catalog.{_format}`

Custom route for the catalog endpoint. It should start with `/` and include the
`{_format}` placeholder.
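
The two constraints on the route (leading `/` and the `{_format}` placeholder) can be checked with a short Python sketch. The `build_catalog_path` helper is hypothetical and only illustrates the documented rules, not how the extension registers its routes:

```python
def build_catalog_path(route, _format):
    """Validate a custom catalog route and fill in its format placeholder."""
    if not route.startswith("/"):
        raise ValueError("route must start with '/'")
    if "{_format}" not in route:
        raise ValueError("route must include the {_format} placeholder")
    return route.replace("{_format}", _format)


print(build_catalog_path("/catalog.{_format}", "ttl"))       # /catalog.ttl
print(build_catalog_path("/dcat/catalog/{_format}", "xml"))  # /dcat/catalog/xml
```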


#### ckanext.dcat.dataset_per_page

Default value: `100`

Default number of datasets returned by the catalog endpoint.


#### ckanext.dcat.enable_content_negotiation

Default value: `False`

Enable content negotiation in the main catalog and dataset endpoints. Note that
setting this to True overrides the core `home.index` and `dataset.read` endpoints.
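
For readers unfamiliar with content negotiation, the following Python sketch shows the general technique of choosing a serialization from an HTTP `Accept` header. The `MEDIA_TYPES` mapping and the `pick_format` function are assumptions made for this example and may differ from what the extension does internally:

```python
# Illustrative media type -> format mapping (an assumption for this sketch).
MEDIA_TYPES = {
    "application/rdf+xml": "xml",
    "text/turtle": "ttl",
    "application/ld+json": "jsonld",
}


def pick_format(accept_header):
    """Return the format of the best supported media type, or None."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # Default quality when no q parameter is given.
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type in MEDIA_TYPES and quality > 0:
            # Negated quality sorts the highest q first; position breaks
            # ties in favour of the entry listed earlier in the header.
            candidates.append((-quality, position, MEDIA_TYPES[media_type]))
    return min(candidates)[2] if candidates else None


print(pick_format("text/html;q=0.8, text/turtle"))
print(pick_format("text/html,application/xhtml+xml;q=0.9"))
```

When no supported RDF media type is requested, the function returns `None`, which corresponds to serving the regular HTML page.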


### Harvester settings

#### ckanext.dcat.max_file_size

Default value: `50`

Maximum file size (in MB) that will be downloaded for parsing by the harvesters.
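
A size limit like this is typically enforced by comparing the declared `Content-Length` with the configured maximum before downloading. The sketch below assumes the value is expressed in megabytes; `exceeds_limit` is a hypothetical helper, not the harvester's actual code:

```python
def exceeds_limit(content_length_header, max_file_size_mb=50):
    """Return True if a declared Content-Length is above the limit."""
    # Assumption: the configured limit is in megabytes.
    limit_bytes = max_file_size_mb * 1024 * 1024
    try:
        declared = int(content_length_header)
    except (TypeError, ValueError):
        # Header missing or malformed: the size is unknown, so the
        # file cannot be rejected up front.
        return False
    return declared > limit_bytes


print(exceeds_limit(str(60 * 1024 * 1024)))  # True
print(exceeds_limit("1024"))                 # False
```

Because servers may omit or misreport `Content-Length`, a robust implementation would also count bytes while streaming the response.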


#### ckanext.dcat.expose_subcatalogs

Default value: `False`

Store information about the origin catalog when harvesting datasets.
See https://github.com/ckan/ckanext-dcat#transitive-harvesting for more details.


### Deprecated options (will be removed in future versions)

#### ckanext.dcat.compatibility_mode

Default value: `False`

Whether to modify some fields to maintain compatibility with previous versions
of the ckanext-dcat parsers.


#### ckanext.dcat.json_endpoint

Default value: `/dcat.json`

Custom route to expose the legacy JSON endpoint.



<!-- end-config -->

## Running the Tests

To run the tests do:
115 changes: 115 additions & 0 deletions ckanext/dcat/config_declaration.yml
@@ -0,0 +1,115 @@
version: 1
groups:
  - annotation: General settings
    options:

      - key: ckanext.dcat.rdf.profiles
        default_callable: 'ckanext.dcat.processors:_get_default_rdf_profiles'
        description: |
          RDF profiles to use when parsing and serializing. See https://github.com/ckan/ckanext-dcat#profiles
          for more details.
        example: 'euro_dcat_ap_2 my_local_ap'

      - key: ckanext.dcat.translate_keys
        type: bool
        default: True
        description: |
          If set to True, the plugin will automatically translate the keys of the DCAT
          fields used in the frontend (at least those present in the `ckanext/dcat/i18n`
          po files).
  - annotation: Parsers / Serializers settings
    options:

      - key: ckanext.dcat.output_spatial_format
        type: list
        default:
          - 'wkt'
        description: |
          Format to use for geometries when serializing RDF documents. The default is
          recommended, as it is the format expected by GeoDCAT. Alternatively, you can
          use `geojson` (or both, which will make SHACL validation fail).
      - key: ckanext.dcat.resource.inherit.license
        type: bool
        default: False
        description: |
          If there is no license defined for a resource / distribution, inherit it from
          the dataset.
      - key: ckanext.dcat.normalize_ckan_format
        type: bool
        default: True
        description: |
          When true, the resource label is matched, where possible, against the standard
          list of CKAN formats (https://github.com/ckan/ckan/blob/master/ckan/config/resource_formats.json).
          This makes it possible, for instance, to populate the CKAN resource format field
          with a value that view plugins, etc. will understand (`csv`, `xml`, etc.).
      - key: ckanext.dcat.clean_tags
        type: bool
        default: False
        description: |
          Remove special characters from keywords (use the old munge_tag() CKAN function).
          This is generally not needed.
  - annotation: Endpoints settings
    options:

      - key: ckanext.dcat.enable_rdf_endpoints
        default: True
        description: |
          Whether to expose the catalog and dataset endpoints with the RDF DCAT
          serializations.
        type: bool

      - key: ckanext.dcat.catalog_endpoint
        default: '/catalog.{_format}'
        description: |
          Custom route for the catalog endpoint. It should start with `/` and include the
          `{_format}` placeholder.
        example: '/dcat/catalog/{_format}'

      - key: ckanext.dcat.dataset_per_page
        default: 100
        type: int
        description: |
          Default number of datasets returned by the catalog endpoint.
      - key: ckanext.dcat.enable_content_negotiation
        default: False
        type: bool
        description: |
          Enable content negotiation in the main catalog and dataset endpoints. Note that
          setting this to True overrides the core `home.index` and `dataset.read` endpoints.
  - annotation: Harvester settings
    options:

      - key: ckanext.dcat.max_file_size
        type: int
        default: 50
        description: |
          Maximum file size (in MB) that will be downloaded for parsing by the harvesters.
      - key: ckanext.dcat.expose_subcatalogs
        type: bool
        default: false
        description: |
          Store information about the origin catalog when harvesting datasets.
          See https://github.com/ckan/ckanext-dcat#transitive-harvesting for more details.
  - annotation: Deprecated options (will be removed in future versions)
    options:

      - key: ckanext.dcat.compatibility_mode
        type: bool
        default: False
        description: |
          Whether to modify some fields to maintain compatibility with previous versions
          of the ckanext-dcat parsers.
      - key: ckanext.dcat.json_endpoint
        default: '/dcat.json'
        description: |
          Custom route to expose the legacy JSON endpoint.
8 changes: 8 additions & 0 deletions ckanext/dcat/plugins/__init__.py
@@ -30,6 +30,13 @@
I18N_DIR = os.path.join(HERE, u"../i18n")


def config_declaration(func):
    if p.toolkit.check_ckan_version(min_version="2.10.0"):
        return p.toolkit.blanket.config_declarations(func)
    else:
        return func


def _get_dataset_schema(dataset_type="dataset"):
    schema = None
    try:
@@ -43,6 +50,7 @@ def _get_dataset_schema(dataset_type="dataset"):
    return schema


@config_declaration
class DCATPlugin(p.SingletonPlugin, DefaultTranslation):

    p.implements(p.IConfigurer, inherit=True)
5 changes: 5 additions & 0 deletions ckanext/dcat/processors.py
@@ -31,6 +31,11 @@
DEFAULT_RDF_PROFILES = ['euro_dcat_ap_2']


def _get_default_rdf_profiles():
    """Helper function used for documenting the rdf profiles config option"""
    return " ".join(DEFAULT_RDF_PROFILES)


class RDFProcessor(object):

    def __init__(self, profiles=None, dataset_type='dataset', compatibility_mode=False):
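
The `config_declaration` helper in `ckanext/dcat/plugins/__init__.py` applies a class decorator only when the running CKAN version supports it, and otherwise returns the class unchanged. The same pattern can be shown without CKAN in a self-contained Python sketch. All names here (`register_declarations`, `conditional_decorator`, the `enabled` flag) are hypothetical stand-ins, not CKAN APIs:

```python
def register_declarations(cls):
    """Stand-in for a decorator that only exists in newer versions."""
    cls.declarations_loaded = True
    return cls


def conditional_decorator(enabled):
    """Return the real decorator if enabled, else a no-op decorator."""
    def decorator(cls):
        return register_declarations(cls) if enabled else cls
    return decorator


@conditional_decorator(enabled=True)
class NewPlugin:
    pass


@conditional_decorator(enabled=False)
class OldPlugin:
    pass


print(hasattr(NewPlugin, "declarations_loaded"))  # True
print(hasattr(OldPlugin, "declarations_loaded"))  # False
```

Deciding at decoration time keeps the plugin class itself free of version checks, so the same code loads on both older and newer releases.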