diff --git a/RELEASE.md b/RELEASE.md
index da3423fab0..5447340938 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -5,7 +5,7 @@
 * Implemented `KedroDataCatalog` repeating `DataCatalog` functionality with a few API enhancements:
   * Removed `_FrozenDatasets` and access datasets as properties;
   * Added get dataset by name feature;
-  * `add_feed_dict()` was simplified and renamed to `add_data()`;
+  * `add_feed_dict()` was simplified to only add raw data;
   * Datasets' initialisation was moved out from `from_config()` method to the constructor.
 * Moved development requirements from `requirements.txt` to the dedicated section in `pyproject.toml` for project template.
 * Implemented `Protocol` abstraction for the current `DataCatalog` and adding new catalog implementations.
@@ -13,12 +13,14 @@
 * Moved pattern resolution logic from `DataCatalog` to a separate component - `CatalogConfigResolver`. Updated `DataCatalog` to use `CatalogConfigResolver` internally.
 * Made packaged Kedro projects return `session.run()` output to be used when running it in the interactive environment.
 * Enhanced `OmegaConfigLoader` configuration validation to detect duplicate keys at all parameter levels, ensuring comprehensive nested key checking.
+
+**Note:** ``KedroDataCatalog`` is an experimental feature and is under active development. Therefore, it is possible we'll introduce breaking changes to this class, so be mindful of that if you decide to use it already. Let us know if you have any feedback about the ``KedroDataCatalog`` or ideas for new features.
+
 ## Bug fixes and other changes
 * Fixed bug where using dataset factories breaks with `ThreadRunner`.
 * Fixed a bug where `SharedMemoryDataset.exists` would not call the underlying `MemoryDataset`.
 * Fixed template projects example tests.
-* Made credentials loading consistent between `KedroContext._get_catalog()` and `resolve_patterns` so that both us
-e `_get_config_credentials()`
+* Made credentials loading consistent between `KedroContext._get_catalog()` and `resolve_patterns` so that both use `_get_config_credentials()`
 
 ## Breaking changes to the API
 * Removed `ShelveStore` to address a security vulnerability.
diff --git a/kedro/io/kedro_data_catalog.py b/kedro/io/kedro_data_catalog.py
index ce06e34aac..d07de8151a 100644
--- a/kedro/io/kedro_data_catalog.py
+++ b/kedro/io/kedro_data_catalog.py
@@ -3,6 +3,9 @@
 use a ``KedroDataCatalog``, you need to instantiate it with a dictionary of datasets.
 Then it will act as a single point of reference for your calls, relaying load and
 save functions to the underlying datasets.
+
+``KedroDataCatalog`` is an experimental feature aimed to replace ``DataCatalog`` in the future.
+Expect possible breaking changes while using it.
 """
 
 from __future__ import annotations
@@ -44,6 +47,8 @@ def __init__(
         single point of reference for your calls, relaying load and save
         functions to the underlying datasets.
 
+        Note: ``KedroDataCatalog`` is an experimental feature and is under active development. Therefore, it is possible we'll introduce breaking changes to this class, so be mindful of that if you decide to use it already.
+
         Args:
             datasets: A dictionary of dataset names and dataset instances.
             raw_data: A dictionary with data to be added in memory as `MemoryDataset`` instances.
@@ -56,6 +61,13 @@ def __init__(
                 case-insensitive string that conforms with operating system
                 filename limitations, b) always return the latest version when
                 sorted in lexicographical order.
+
+        Example:
+        ::
+            >>> # settings.py
+            >>> from kedro.io import KedroDataCatalog
+            >>>
+            >>> DATA_CATALOG_CLASS = KedroDataCatalog
         """
         self._config_resolver = config_resolver or CatalogConfigResolver()
         self._datasets = datasets or {}
@@ -68,7 +80,7 @@ def __init__(
             self._add_from_config(ds_name, ds_config)
 
         if raw_data:
-            self.add_data(raw_data)
+            self.add_feed_dict(raw_data)
 
     @property
     def datasets(self) -> dict[str, Any]:
@@ -304,16 +316,13 @@ def confirm(self, name: str) -> None:
         else:
             raise DatasetError(f"Dataset '{name}' does not have 'confirm' method")
 
-    def add_data(self, data: dict[str, Any], replace: bool = False) -> None:
+    def add_feed_dict(self, feed_dict: dict[str, Any], replace: bool = False) -> None:
+        # TODO: remove when removing old catalog
         # This method was simplified to add memory datasets only, since
         # adding AbstractDataset can be done via add() method
-        for ds_name, ds_data in data.items():
+        for ds_name, ds_data in feed_dict.items():
             self.add(ds_name, MemoryDataset(data=ds_data), replace)  # type: ignore[abstract]
 
-    def add_feed_dict(self, feed_dict: dict[str, Any], replace: bool = False) -> None:
-        # TODO: remove when removing old catalog
-        return self.add_data(feed_dict, replace)
-
     def shallow_copy(
         self, extra_dataset_patterns: Patterns | None = None
     ) -> KedroDataCatalog:
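For reviewers, a minimal sketch of how the `add_feed_dict()` path restored by this diff is expected to behave. It assumes the standard `load()` call that the catalog relays to its underlying datasets; the dataset names and sample values are hypothetical and not part of the change itself:

```python
# Illustrative only: exercises the add_feed_dict() path touched by this diff.
from kedro.io import KedroDataCatalog

catalog = KedroDataCatalog()

# Raw Python objects passed here are wrapped in MemoryDataset instances
# internally; AbstractDataset instances are registered via add() instead.
catalog.add_feed_dict({"demo_table": [1, 2, 3]})
assert catalog.load("demo_table") == [1, 2, 3]

# Passing raw_data to the constructor now routes through add_feed_dict() as well.
catalog_with_raw = KedroDataCatalog(raw_data={"params:threshold": 0.5})
assert catalog_with_raw.load("params:threshold") == 0.5
```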