
Commit

Minor changes to create a PR and test Vale styles
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
stichbury committed Aug 24, 2023
1 parent 15bba0e commit 81a69a2
Showing 7 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/source/data/advanced_data_catalog_usage.md
@@ -6,7 +6,7 @@ You can define a Data Catalog in two ways. Most use cases can be through a YAML

To use the `DataCatalog` API, construct a `DataCatalog` object programmatically in a file like `catalog.py`.

-In the following, we are using several pre-built data loaders documented in the [API reference documentation](/kedro_datasets).
+In the following code, we use several pre-built data loaders documented in the [API reference documentation](/kedro_datasets).

```python
from kedro.io import DataCatalog
# ...
```
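
For illustration, a `catalog.py` built this way might look like the following sketch. It assumes the `CSVDataSet` and `ParquetDataSet` loaders from `kedro_datasets.pandas`; the dataset names and file paths are made up.

```python
from kedro.io import DataCatalog
from kedro_datasets.pandas import CSVDataSet, ParquetDataSet

# Register each dataset under a name; the values are pre-built data loader instances.
catalog = DataCatalog(
    {
        "companies": CSVDataSet(filepath="data/01_raw/companies.csv"),
        "model_input_table": ParquetDataSet(
            filepath="data/03_primary/model_input_table.pq"
        ),
    }
)

# Reading and writing then go through the catalog rather than direct file handling.
companies = catalog.load("companies")
catalog.save("model_input_table", companies)
```
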
2 changes: 1 addition & 1 deletion docs/source/data/data_catalog.md
@@ -3,7 +3,7 @@

In a Kedro project, the Data Catalog is a registry of all data sources available for use by the project. It is specified with a YAML catalog file that maps the names of node inputs and outputs as keys in the `DataCatalog` class.

-This page introduces the basic sections of `catalog.yml`, which is the file used to register data sources for a Kedro project.
+This page introduces the basic sections of `catalog.yml`, which is the file Kedro uses to register data sources for a project.

## The basics of `catalog.yml`
A separate page of [Data Catalog YAML examples](./data_catalog_yaml_examples.md) gives further examples of how to work with `catalog.yml`, but here we revisit the [basic `catalog.yml` introduced by the spaceflights tutorial](../tutorial/set_up_data.md).
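
For reference, the basic entries in that tutorial's `catalog.yml` take roughly this shape (the dataset names and file paths are illustrative):

```yaml
companies:
  type: pandas.CSVDataSet
  filepath: data/01_raw/companies.csv

preprocessed_companies:
  type: pandas.ParquetDataSet
  filepath: data/02_intermediate/preprocessed_companies.pq
```

Each top-level key is the dataset name used as a node input or output, and `type` selects one of the pre-built dataset classes.
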
2 changes: 1 addition & 1 deletion docs/source/data/data_catalog_yaml_examples.md
@@ -8,7 +8,7 @@ This page contains a set of examples to help you structure your YAML configurati

## Load data from a local binary file using `utf-8` encoding

-The `open_args_load` and `open_args_save` parameters are passed to the filesystem's `open` method to configure how a dataset file (on a specific filesystem) is opened during a load or save operation, respectively.
+The `open_args_load` and `open_args_save` parameters are passed to the filesystem's `open` method to configure how a dataset file (on a specific filesystem) is opened during a load or save operation respectively.

```yaml
test_dataset:
  # ...
```
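
A sketch of such an entry follows; the dataset name is a placeholder and `type: ...` stands in for any file-based dataset type:

```yaml
test_dataset:
  type: ...
  fs_args:
    open_args_load:
      mode: "rb"
      encoding: "utf-8"
```

Here `open_args_load` opens the underlying file in binary mode (`rb`) and passes the `utf-8` encoding argument through to `open` when the dataset is loaded; an `open_args_save` block configures saving in the same way.
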
2 changes: 1 addition & 1 deletion docs/source/data/how_to_create_a_custom_dataset.md
@@ -4,7 +4,7 @@

## AbstractDataset

-For contributors, if you would like to submit a new dataset, you must extend the [`AbstractDataset` interface](/kedro.io.AbstractDataset) or [`AbstractVersionedDataset` interface](/kedro.io.AbstractVersionedDataset) if you plan to support versioning. It requires subclasses to override the `_load` and `_save` and provides `load` and `save` methods that enrich the corresponding private methods with uniform error handling. It also requires subclasses to override `_describe`, which is used in logging the internal information about the instances of your custom `AbstractDataset` implementation.
+If you are a contributor and would like to submit a new dataset, you must extend the [`AbstractDataset` interface](/kedro.io.AbstractDataset) or [`AbstractVersionedDataset` interface](/kedro.io.AbstractVersionedDataset) if you plan to support versioning. It requires subclasses to override the `_load` and `_save` and provides `load` and `save` methods that enrich the corresponding private methods with uniform error handling. It also requires subclasses to override `_describe`, which is used in logging the internal information about the instances of your custom `AbstractDataset` implementation.

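As a rough sketch of the shape this takes, here is a minimal custom dataset that loads and saves images with Pillow; the `ImageDataset` name, typing and file handling are illustrative rather than part of Kedro:

```python
from pathlib import PurePosixPath
from typing import Any, Dict

import numpy as np
from PIL import Image

from kedro.io import AbstractDataset


class ImageDataset(AbstractDataset[np.ndarray, np.ndarray]):
    """Loads and saves an image file as a NumPy array."""

    def __init__(self, filepath: str):
        self._filepath = PurePosixPath(filepath)

    def _load(self) -> np.ndarray:
        # Wrapped by the public `load` method, which adds uniform error handling.
        return np.asarray(Image.open(str(self._filepath)))

    def _save(self, data: np.ndarray) -> None:
        # Wrapped by the public `save` method.
        Image.fromarray(data).save(str(self._filepath))

    def _describe(self) -> Dict[str, Any]:
        # Used when Kedro logs information about this dataset instance.
        return {"filepath": str(self._filepath)}
```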

## Scenario
2 changes: 1 addition & 1 deletion docs/source/data/index.md
@@ -3,7 +3,7 @@

In a Kedro project, the Data Catalog is a registry of all data sources available for use by the project. The catalog is stored in a YAML file (`catalog.yml`) that maps the names of node inputs and outputs as keys in the `DataCatalog` class.

-[Kedro provides different built-in datasets in the `kedro-datasets` package](/kedro_datasets) for numerous file types and file systems, so you don’t have to write any of the logic for reading/writing data.
+[Kedro provides different built-in datasets in the `kedro-datasets` package](/kedro_datasets) for numerous file types and file systems so you don’t have to write any of the logic for reading/writing data.


We first introduce the basic sections of `catalog.yml`, which is the file used to register data sources for a Kedro project.
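
For example, a single entry is enough to read a Parquet file from S3 with one of the built-in `pandas` datasets (the dataset name, bucket, path and credentials key below are made up):

```yaml
motorbikes:
  type: pandas.ParquetDataSet
  filepath: s3://my-bucket/data/02_intermediate/motorbikes.pq
  credentials: dev_s3
```
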
2 changes: 1 addition & 1 deletion docs/source/data/kedro_dataset_factories.md
@@ -1,7 +1,7 @@
# Kedro dataset factories
You can load multiple datasets with similar configuration using dataset factories, introduced in Kedro 0.18.12.

-The syntax allows you to generalise the configuration and reduce the number of similar catalog entries by matching datasets used in your project's pipelines to dataset factory patterns.
+The syntax allows you to generalise your configuration and reduce the number of similar catalog entries by matching datasets used in your project's pipelines to dataset factory patterns.

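For example, several per-entity entries that differ only in name can collapse into a single pattern; the `{name}` placeholder and the paths below are illustrative:

```yaml
# Matches dataset names such as "companies_data", "reviews_data" and "shuttles_data".
"{name}_data":
  type: pandas.CSVDataSet
  filepath: data/01_raw/{name}_data.csv
```

At runtime, any dataset a pipeline requests that matches the `{name}_data` pattern resolves to this entry, with `{name}` substituted into the file path.
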
## How to generalise datasets with similar names and types

2 changes: 1 addition & 1 deletion docs/source/data/partitioned_and_incremental_datasets.md
@@ -2,7 +2,7 @@

## Partitioned datasets

-Distributed systems play an increasingly important role in ETL data pipelines. They significantly increase the processing throughput, enabling us to work with much larger volumes of input data. However, these benefits sometimes come at a cost. When dealing with the input data generated by such distributed systems, you might encounter a situation where your Kedro node needs to read the data from a directory full of uniform files of the same type (e.g. JSON, CSV, Parquet, etc.) rather than from a single file. Tools like `PySpark` and the corresponding [SparkDataSet](/kedro_datasets.spark.SparkDataSet) cater for such use cases, but the use of Spark is not always feasible.
+Distributed systems play an increasingly important role in ETL data pipelines. They significantly increase the processing throughput, enabling us to work with much larger volumes of input data. However, these benefits sometimes come at a cost. When dealing with the input data generated by such distributed systems, you might encounter a situation where your Kedro node needs to read the data from a directory full of uniform files of the same type (e.g. JSON, CSV, Parquet, etc.) rather than from a single file. Tools like `PySpark` and the corresponding [SparkDataSet](/kedro_datasets.spark.SparkDataSet) cater for such use cases, but using Spark is not always feasible.

This is why Kedro provides a built-in [PartitionedDataset](/kedro.io.PartitionedDataset), with the following features:

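A minimal `catalog.yml` entry for such a dataset might look like this sketch, which assumes a local folder of CSV partitions (the dataset name, path and suffix are illustrative):

```yaml
weather:
  type: PartitionedDataset
  path: data/01_raw/weather/
  dataset: pandas.CSVDataSet
  filename_suffix: ".csv"
```

When a node receives `weather` as an input, it gets a dictionary that maps each partition id to a function that loads that partition on demand.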

