Merge branch 'main' into maintenance-branch
stichbury authored Aug 8, 2023
2 parents ae84dbb + 0c9d4ec commit 15b8540
Showing 6 changed files with 75 additions and 15 deletions.
3 changes: 2 additions & 1 deletion RELEASE.md
@@ -14,9 +14,10 @@
* Allowed registering of custom resolvers to `OmegaConfigLoader` through `CONFIG_LOADER_ARGS`.

## Bug fixes and other changes
* Updated `kedro pipeline create` to use new `/conf` file structure.
* Updated `kedro pipeline create` and `kedro catalog create` to use new `/conf` file structure.

## Documentation changes
* Added migration guide from the `ConfigLoader` to the `OmegaConfigLoader`. The `ConfigLoader` is deprecated and will be removed in the `0.19.0` release.

## Breaking changes to the API

62 changes: 62 additions & 0 deletions docs/source/configuration/config_loader_migration.md
@@ -0,0 +1,62 @@
# Migration guide for config loaders
The `ConfigLoader` and `TemplatedConfigLoader` classes have been deprecated since Kedro `0.18.12` and will be removed in Kedro `0.19.0`. To ensure a smooth transition, we strongly recommend you adopt the [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader) as soon as possible.
This migration guide outlines the main differences between the old loaders and the `OmegaConfigLoader`, and gives step-by-step instructions for updating your code base to use the new class.

## [`ConfigLoader`](/kedro.config.ConfigLoader) to [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader)

### 1. Install the Required Library
The [`OmegaConfigLoader`](advanced_configuration.md#omegaconfigloader) was introduced in Kedro `0.18.5` and is based on [OmegaConf](https://omegaconf.readthedocs.io/). To use it, you need Kedro `0.18.5` or above and `omegaconf` installed.
You can install both with a single `pip` command:

```bash
pip install kedro==0.18.5
```
This is the minimum required Kedro version, and it includes `omegaconf` as a dependency.
Alternatively, you can run:
```bash
pip install -U kedro
```

This command installs the most recent version of Kedro which also includes `omegaconf` as a dependency.
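
If you are unsure whether an existing environment already meets this requirement, a minimal sketch like the one below may help. It is not part of Kedro: `parse_release` and `supports_omegaconf_loader` are hypothetical helper names, and the version parsing is deliberately simple (it drops pre-release suffixes such as `rc1`).

```python
from importlib import metadata
from typing import Tuple

# First Kedro release that ships OmegaConfigLoader, according to this guide.
MINIMUM_KEDRO = (0, 18, 5)


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '0.18.12' into (0, 18, 12), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def supports_omegaconf_loader(version: str) -> bool:
    """Hypothetical helper: True if the version is at least 0.18.5."""
    return parse_release(version) >= MINIMUM_KEDRO


if __name__ == "__main__":
    try:
        installed = metadata.version("kedro")
    except metadata.PackageNotFoundError:
        print("kedro is not installed in this environment")
    else:
        print(installed, supports_omegaconf_loader(installed))
```

Running the script prints the installed Kedro version and whether it is recent enough; if it is not, either of the `pip` commands above upgrades it.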

### 2. Use the `OmegaConfigLoader`
To use `OmegaConfigLoader` in your project, set the `CONFIG_LOADER_CLASS` constant in your [`src/<package_name>/settings.py`](../kedro_project_setup/settings.md):

```diff
+ from kedro.config import OmegaConfigLoader # new import

+ CONFIG_LOADER_CLASS = OmegaConfigLoader
```

### 3. Import Statements
Replace the import statement for `ConfigLoader` with the one for `OmegaConfigLoader`:

```diff
- from kedro.config import ConfigLoader

+ from kedro.config import OmegaConfigLoader
```

### 4. File Format Support
`OmegaConfigLoader` supports only `yaml` and `json` file formats. Make sure that all your configuration files are in one of these formats. If you previously used other formats with `ConfigLoader`, convert them to `yaml` or `json`.
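
As an illustration of such a conversion, the sketch below turns an INI file into JSON using only the Python standard library. It assumes the old file was in INI format (one possible example of an "other format"); the function names and the file names `parameters.ini` and `parameters.json` are hypothetical. Note that `configparser` reads every value as a string and lower-cases option names, so you may need to restore types by hand afterwards.

```python
import configparser
import json
from pathlib import Path
from typing import Dict


def ini_text_to_dict(text: str) -> Dict[str, Dict[str, str]]:
    """Parse INI text into a nested dict: {section: {option: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # All values come back as strings; configparser does not infer types.
    return {section: dict(parser[section]) for section in parser.sections()}


def convert_ini_file(ini_path: Path, json_path: Path) -> None:
    """Hypothetical helper: write the JSON equivalent of an INI file."""
    config = ini_text_to_dict(ini_path.read_text(encoding="utf-8"))
    json_path.write_text(json.dumps(config, indent=2), encoding="utf-8")


if __name__ == "__main__":
    source = Path("conf/base/parameters.ini")  # hypothetical file name
    if source.exists():
        convert_ini_file(source, source.with_suffix(".json"))
```

After converting, remove the original file so that only the `yaml` or `json` version remains in the configuration directory.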

### 5. Load Configuration
Loading configuration with `OmegaConfigLoader` differs slightly from `ConfigLoader`, which exposed configuration through the `.get()` method and required file patterns as arguments.
With `OmegaConfigLoader`, you fetch configuration through a configuration key that points to [configuration patterns specified in the loader class](configuration_basics.md#configuration-patterns) or [provided in the `CONFIG_LOADER_ARGS`](advanced_configuration.md#how-to-change-which-configuration-files-are-loaded) in `settings.py`.

```diff
- conf_path = str(project_path / settings.CONF_SOURCE)
- conf_loader = ConfigLoader(conf_source=conf_path, env="local")
- catalog = conf_loader.get("catalog*")

+ conf_path = str(project_path / settings.CONF_SOURCE)
+ config_loader = OmegaConfigLoader(conf_source=conf_path, env="local")
+ catalog = config_loader["catalog"]
```

In this example, `"catalog"` is the key to the default catalog patterns specified in the `OmegaConfigLoader` class.

### 6. Exception Handling
Most errors are the same in both loaders. The differences you need to be aware of between the original `ConfigLoader` and `OmegaConfigLoader` are as follows:
* `OmegaConfigLoader` raises a `MissingConfigException` when configuration paths don't exist, rather than the `ValueError` raised by `ConfigLoader`.
* If a configuration file contains bad syntax, `OmegaConfigLoader` raises a `ParserError` instead of the `BadConfigException` raised by `ConfigLoader`.
2 changes: 1 addition & 1 deletion docs/source/configuration/configuration_basics.md
@@ -45,7 +45,7 @@ Kedro merges configuration information and returns a configuration dictionary ac
* If any two configuration files located inside the **same** environment path (such as `conf/base/`) contain the same top-level key, the configuration loader raises a `ValueError` indicating that duplicates are not allowed.
* If two configuration files contain the same top-level key but are in **different** environment paths (for example, one in `conf/base/`, another in `conf/local/`) then the last loaded path (`conf/local/`) takes precedence as the key value. `ConfigLoader.get` does not raise any errors but a `DEBUG` level log message is emitted with information on the overridden keys.

When using any of the configuration loaders, any top-level keys that start with `_` are considered hidden (or reserved) and are ignored. Those keys will neither trigger a key duplication error nor appear in the resulting configuration dictionary. However, you can still use such keys, for example, as [YAML anchors and aliases](https://www.educative.io/blog/advanced-yaml-syntax-cheatsheet#anchors)
When using any of the configuration loaders, any top-level keys that start with `_` are considered hidden (or reserved) and are ignored. Those keys will neither trigger a key duplication error nor appear in the resulting configuration dictionary. However, you can still use such keys, for example, as [YAML anchors and aliases](https://www.educative.io/blog/advanced-yaml-syntax-cheatsheet)
or [to enable templating in the catalog when using the `OmegaConfigLoader`](advanced_configuration.md#how-to-do-templating-with-the-omegaconfigloader).

### Configuration file names
1 change: 1 addition & 0 deletions docs/source/configuration/index.md
@@ -6,5 +6,6 @@
configuration_basics
credentials
parameters
config_loader_migration
advanced_configuration
```
3 changes: 1 addition & 2 deletions kedro/framework/cli/catalog.py
@@ -170,8 +170,7 @@ def create_catalog(metadata: ProjectMetadata, pipeline_name, env):
context.project_path
/ settings.CONF_SOURCE
/ env
/ "catalog"
/ f"{pipeline_name}.yml"
/ f"catalog_{pipeline_name}.yml"
)
_add_missing_datasets_to_catalog(missing_ds, catalog_path)
click.echo(f"Data Catalog YAML configuration was created: {catalog_path}")
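
To make the effect of this hunk concrete, the following sketch compares the old and new output locations. It is illustrative only: `legacy_catalog_path` and `new_catalog_path` are hypothetical helpers that mirror the path expressions in the diff, and the project path, `conf` directory, and pipeline name are assumed values.

```python
from pathlib import Path


def legacy_catalog_path(project: Path, conf_source: str, env: str, pipeline: str) -> Path:
    """Old layout: conf/<env>/catalog/<pipeline>.yml (a nested directory)."""
    return project / conf_source / env / "catalog" / f"{pipeline}.yml"


def new_catalog_path(project: Path, conf_source: str, env: str, pipeline: str) -> Path:
    """New layout: conf/<env>/catalog_<pipeline>.yml (a flat, prefixed file)."""
    return project / conf_source / env / f"catalog_{pipeline}.yml"


if __name__ == "__main__":
    project = Path("my_project")  # assumed project root
    print(legacy_catalog_path(project, "conf", "base", "data_science"))
    print(new_catalog_path(project, "conf", "base", "data_science"))
```

The new file sits directly in the environment directory, so no `catalog` subdirectory has to be created or cleaned up, which is what the test changes below reflect.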
19 changes: 8 additions & 11 deletions tests/framework/cli/test_catalog.py
@@ -1,5 +1,3 @@
import shutil

import pytest
import yaml
from click.testing import CliRunner
@@ -242,11 +240,12 @@ class TestCatalogCreateCommand:
@staticmethod
@pytest.fixture(params=["base"])
def catalog_path(request, fake_repo_path):
catalog_path = fake_repo_path / "conf" / request.param / "catalog"
catalog_path = fake_repo_path / "conf" / request.param

yield catalog_path

shutil.rmtree(catalog_path, ignore_errors=True)
for file in catalog_path.glob("catalog_*"):
file.unlink()

def test_pipeline_argument_is_required(self, fake_project_cli):
result = CliRunner().invoke(fake_project_cli, ["catalog", "create"])
@@ -278,7 +277,7 @@ def test_catalog_is_created_in_base_by_default(
main_catalog_config = yaml.safe_load(main_catalog_path.read_text())
assert "example_iris_data" in main_catalog_config

data_catalog_file = catalog_path / f"{self.PIPELINE_NAME}.yml"
data_catalog_file = catalog_path / f"catalog_{self.PIPELINE_NAME}.yml"

result = CliRunner().invoke(
fake_project_cli,
@@ -302,9 +301,9 @@ def test_catalog_is_created_in_base_by_default(
def test_catalog_is_created_in_correct_env(
self, fake_project_cli, fake_metadata, catalog_path
):
data_catalog_file = catalog_path / f"{self.PIPELINE_NAME}.yml"
data_catalog_file = catalog_path / f"catalog_{self.PIPELINE_NAME}.yml"

env = catalog_path.parent.name
env = catalog_path.name
result = CliRunner().invoke(
fake_project_cli,
["catalog", "create", "--pipeline", self.PIPELINE_NAME, "--env", env],
@@ -335,7 +334,7 @@ def test_no_missing_datasets(
)

data_catalog_file = (
fake_repo_path / "conf" / "base" / "catalog" / f"{self.PIPELINE_NAME}.yml"
fake_repo_path / "conf" / "base" / f"catalog_{self.PIPELINE_NAME}.yml"
)

result = CliRunner().invoke(
@@ -351,9 +350,7 @@ def test_no_missing_datasets(
def test_missing_datasets_appended(
self, fake_project_cli, fake_metadata, catalog_path
):
data_catalog_file = catalog_path / f"{self.PIPELINE_NAME}.yml"
assert not catalog_path.exists()
catalog_path.mkdir()
data_catalog_file = catalog_path / f"catalog_{self.PIPELINE_NAME}.yml"

catalog_config = {
"example_test_x": {"type": "pandas.CSVDataSet", "filepath": "test.csv"}