Replace any remaining instances of DataSet in docs
deepyaman committed Aug 18, 2023
1 parent dba7503 commit 76732ea
Showing 7 changed files with 16 additions and 16 deletions.
10 changes: 5 additions & 5 deletions docs/source/data/advanced_data_catalog_usage.md
@@ -55,7 +55,7 @@ gear = cars["gear"].values
The following steps happened behind the scenes when `load` was called:

- The value `cars` was located in the Data Catalog
-- The corresponding `AbstractDataSet` object was retrieved
+- The corresponding `AbstractDataset` object was retrieved
- The `load` method of this dataset was called
- This `load` method delegated the loading to the underlying pandas `read_csv` function
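
For illustration, a minimal sketch of that flow (assuming a catalog whose `cars` entry is backed by a pandas `CSVDataSet`; the file path is illustrative):

```python
from kedro.io import DataCatalog
from kedro_datasets.pandas import CSVDataSet

# Register a dataset under the name "cars" (the path is an assumption).
io = DataCatalog({"cars": CSVDataSet(filepath="data/01_raw/company/cars.csv")})

# `load` looks up "cars" in the catalog, retrieves the corresponding
# AbstractDataset object, and calls its `load` method, which delegates
# to pandas `read_csv`.
cars = io.load("cars")
gear = cars["gear"].values
```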

@@ -70,9 +70,9 @@ This pattern is not recommended unless you are using platform notebook environme
To save data using an API similar to that used to load data:

```python
-from kedro.io import MemoryDataSet
+from kedro.io import MemoryDataset

-memory = MemoryDataSet(data=None)
+memory = MemoryDataset(data=None)
io.add("cars_cache", memory)
io.save("cars_cache", "Memory can store anything.")
io.load("cars_cache")
@@ -190,7 +190,7 @@ io.save("test_data_set", data1)
reloaded = io.load("test_data_set")
assert data1.equals(reloaded)

-# raises DataSetError since the path
+# raises DatasetError since the path
# data/01_raw/test.csv/my_exact_version/test.csv already exists
io.save("test_data_set", data2)
```
@@ -219,7 +219,7 @@ io = DataCatalog({"test_data_set": test_data_set})

io.save("test_data_set", data1) # emits a UserWarning due to version inconsistency

-# raises DataSetError since the data/01_raw/test.csv/exact_load_version/test.csv
+# raises DatasetError since the data/01_raw/test.csv/exact_load_version/test.csv
# file does not exist
reloaded = io.load("test_data_set")
```
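
For context, a sketch of a setup that produces both messages above (assuming the pinned `exact_load_version` was never actually written to disk):

```python
from kedro.io import DataCatalog, Version
from kedro_datasets.pandas import CSVDataSet

# Pin an exact load version while letting save versions be auto-generated;
# saving then warns about the inconsistency, and loading fails because no
# data was ever saved under the pinned version.
version = Version(load="exact_load_version", save=None)
test_data_set = CSVDataSet(filepath="data/01_raw/test.csv", version=version)
io = DataCatalog({"test_data_set": test_data_set})
```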
4 changes: 2 additions & 2 deletions docs/source/data/data_catalog.md
@@ -145,9 +145,9 @@ kedro run --load-version=cars:YYYY-MM-DDThh.mm.ss.sssZ
```
where `--load-version` is the dataset name and version timestamp separated by `:`.

-A dataset offers versioning support if it extends the [`AbstractVersionedDataSet`](/kedro.io.AbstractVersionedDataset) class to accept a version keyword argument as part of the constructor and adapt the `_save` and `_load` methods to use the versioned data path obtained from `_get_save_path` and `_get_load_path` respectively.
+A dataset offers versioning support if it extends the [`AbstractVersionedDataset`](/kedro.io.AbstractVersionedDataset) class to accept a version keyword argument as part of the constructor and adapt the `_save` and `_load` methods to use the versioned data path obtained from `_get_save_path` and `_get_load_path` respectively.

-To verify whether a dataset can undergo versioning, you should examine the dataset class code to inspect its inheritance [(you can find contributed datasets within the `kedro-datasets` repository)](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/kedro_datasets). Check if the dataset class inherits from the `AbstractVersionedDataSet`. For instance, if you encounter a class like `CSVDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame])`, this indicates that the dataset is set up to support versioning.
+To verify whether a dataset can undergo versioning, you should examine the dataset class code to inspect its inheritance [(you can find contributed datasets within the `kedro-datasets` repository)](https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/kedro_datasets). Check if the dataset class inherits from the `AbstractVersionedDataset`. For instance, if you encounter a class like `CSVDataSet(AbstractVersionedDataset[pd.DataFrame, pd.DataFrame])`, this indicates that the dataset is set up to support versioning.
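
The same check can be done programmatically; a sketch (assuming the deprecated `*DataSet` names remain subclasses or aliases of the renamed classes):

```python
from kedro.io import AbstractVersionedDataset
from kedro_datasets.pandas import CSVDataSet

# True when the dataset class supports versioning.
print(issubclass(CSVDataSet, AbstractVersionedDataset))
```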

```{note}
Note that HTTP(S) is a supported file system in the dataset implementations, but if you use it, you can't also use versioning.
6 changes: 3 additions & 3 deletions docs/source/data/data_catalog_yaml_examples.md
@@ -397,12 +397,12 @@ for loading, so the first node outputs a `pyspark.sql.DataFrame`, while the seco

You can use the [`kedro catalog create` command to create a Data Catalog YAML configuration](../development/commands_reference.md#create-a-data-catalog-yaml-configuration-file).

-This creates a `<conf_root>/<env>/catalog/<pipeline_name>.yml` configuration file with `MemoryDataSet` datasets for each dataset in a registered pipeline if it is missing from the `DataCatalog`.
+This creates a `<conf_root>/<env>/catalog/<pipeline_name>.yml` configuration file with `MemoryDataset` datasets for each dataset in a registered pipeline if it is missing from the `DataCatalog`.

```yaml
# <conf_root>/<env>/catalog/<pipeline_name>.yml
rockets:
-  type: MemoryDataSet
+  type: MemoryDataset
scooters:
-  type: MemoryDataSet
+  type: MemoryDataset
```
6 changes: 3 additions & 3 deletions docs/source/data/how_to_create_a_custom_dataset.md
@@ -2,9 +2,9 @@

[Kedro supports many datasets](/kedro_datasets) out of the box, but you may find that you need to create a custom dataset. For example, you may need to handle a proprietary data format or filesystem in your pipeline, or perhaps you have found a particular use case for a dataset that Kedro does not support. This tutorial explains how to create a custom dataset to read and save image data.

-## AbstractDataSet
+## AbstractDataset

-For contributors, if you would like to submit a new dataset, you must extend the [`AbstractDataSet` interface](/kedro.io.AbstractDataset) or the [`AbstractVersionedDataSet` interface](/kedro.io.AbstractVersionedDataset) if you plan to support versioning. It requires subclasses to override the `_load` and `_save` methods and provides `load` and `save` methods that enrich the corresponding private methods with uniform error handling. It also requires subclasses to override `_describe`, which is used in logging the internal information about the instances of your custom `AbstractDataSet` implementation.
+For contributors, if you would like to submit a new dataset, you must extend the [`AbstractDataset` interface](/kedro.io.AbstractDataset) or the [`AbstractVersionedDataset` interface](/kedro.io.AbstractVersionedDataset) if you plan to support versioning. It requires subclasses to override the `_load` and `_save` methods and provides `load` and `save` methods that enrich the corresponding private methods with uniform error handling. It also requires subclasses to override `_describe`, which is used in logging the internal information about the instances of your custom `AbstractDataset` implementation.
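
A minimal sketch of that contract (illustrative only; the tutorial below builds a fuller implementation):

```python
from typing import Any, Dict

import numpy as np
from PIL import Image

from kedro.io import AbstractDataset


class ImageDataset(AbstractDataset[np.ndarray, np.ndarray]):
    """Loads and saves image files as NumPy arrays (a sketch)."""

    def __init__(self, filepath: str):
        self._filepath = filepath

    def _load(self) -> np.ndarray:
        # Delegate the actual reading to Pillow.
        return np.asarray(Image.open(self._filepath))

    def _save(self, data: np.ndarray) -> None:
        Image.fromarray(data).save(self._filepath)

    def _describe(self) -> Dict[str, Any]:
        # Used when logging information about instances of this dataset.
        return {"filepath": self._filepath}
```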


## Scenario
@@ -309,7 +309,7 @@ Versioning doesn't work with `PartitionedDataset`. You can't use both of them at
```

To add versioning support to the new dataset we need to extend the
-[AbstractVersionedDataSet](/kedro.io.AbstractVersionedDataset) to:
+[AbstractVersionedDataset](/kedro.io.AbstractVersionedDataset) to:

* Accept a `version` keyword argument as part of the constructor
* Adapt the `_save` and `_load` method to use the versioned data path obtained from `_get_save_path` and `_get_load_path` respectively
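
A sketch of both steps applied to the image dataset above (assumed names; `_get_load_path` and `_get_save_path` come from the base class):

```python
from pathlib import PurePosixPath
from typing import Any, Dict

import numpy as np
from PIL import Image

from kedro.io import AbstractVersionedDataset, Version


class ImageDataset(AbstractVersionedDataset[np.ndarray, np.ndarray]):
    def __init__(self, filepath: str, version: Version = None):
        # Accept a `version` keyword argument and hand it to the base class.
        super().__init__(filepath=PurePosixPath(filepath), version=version)

    def _load(self) -> np.ndarray:
        # Use the resolved versioned path rather than self._filepath directly.
        return np.asarray(Image.open(self._get_load_path()))

    def _save(self, data: np.ndarray) -> None:
        Image.fromarray(data).save(self._get_save_path())

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": self._filepath, "version": self._version}
```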
2 changes: 1 addition & 1 deletion docs/source/data/kedro_dataset_factories.md
@@ -215,7 +215,7 @@ The matches are ranked according to the following criteria:

## How to override the default dataset creation with dataset factories

-You can use dataset factories to define a catch-all pattern which will overwrite the default [`MemoryDataSet`](/kedro.io.MemoryDataset) creation.
+You can use dataset factories to define a catch-all pattern which will overwrite the default [`MemoryDataset`](/kedro.io.MemoryDataset) creation.

```yaml
"{default_dataset}":
2 changes: 1 addition & 1 deletion docs/source/data/partitioned_and_incremental_datasets.md
@@ -15,7 +15,7 @@ This is why Kedro provides a built-in [PartitionedDataset](/kedro.io.Partitioned
In this section, each individual file inside a given location is called a partition.
```

-### How to use `PartitionedDataSet`
+### How to use `PartitionedDataset`

You can use a `PartitionedDataset` in the `catalog.yml` file like any other regular dataset definition:
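
For illustration, a sketch of the same dataset constructed directly in Python (the folder path and underlying CSV dataset are assumptions):

```python
from kedro.io import PartitionedDataset
from kedro_datasets.pandas import CSVDataSet

# One file per partition under this folder (the path is an assumption).
shipments = PartitionedDataset(path="data/01_raw/shipments/", dataset=CSVDataSet)

# `load` returns a dict mapping each partition id to a lazy load function.
for partition_id, load_partition in shipments.load().items():
    df = load_partition()
```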

2 changes: 1 addition & 1 deletion docs/source/deployment/argo.md
@@ -24,7 +24,7 @@ To use Argo Workflows, ensure you have the following prerequisites in place:
- [Argo Workflows is installed](https://github.com/argoproj/argo/blob/master/README.md#quickstart) on your Kubernetes cluster
- [Argo CLI is installed](https://github.com/argoproj/argo/releases) on your machine
- A `name` attribute is set for each [Kedro node](/kedro.pipeline.node) since it is used to build a DAG
-- [All node input/output DataSets must be configured in `catalog.yml`](../data/data_catalog_yaml_examples.md) and refer to an external location (e.g. AWS S3); you cannot use the `MemoryDataset` in your workflow
+- [All node input/output datasets must be configured in `catalog.yml`](../data/data_catalog_yaml_examples.md) and refer to an external location (e.g. AWS S3); you cannot use the `MemoryDataset` in your workflow

```{note}
Each node will run in its own container.
