From ce24b3db4a70611feeca19f6cb07418d55c661ef Mon Sep 17 00:00:00 2001
From: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Date: Tue, 22 Aug 2023 14:46:10 +0100
Subject: [PATCH] Fix typos across the documentation (#2956)

* Fix typos across docs

Signed-off-by: Ankita Katiyar

* Capitalisation stuff

Signed-off-by: Ankita Katiyar

---------

Signed-off-by: Ankita Katiyar
---
 docs/source/data/index.md                                    | 2 +-
 docs/source/data/partitioned_and_incremental_datasets.md     | 4 ++--
 .../deployment/databricks/databricks_deployment_workflow.md  | 6 +++---
 .../databricks/databricks_notebooks_development_workflow.md  | 4 ++--
 docs/source/development/commands_reference.md                | 2 +-
 docs/source/experiment_tracking/index.md                     | 2 +-
 docs/source/extend_kedro/architecture_overview.md            | 2 +-
 7 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/source/data/index.md b/docs/source/data/index.md
index b90a3d9961..a6196bcc13 100644
--- a/docs/source/data/index.md
+++ b/docs/source/data/index.md
@@ -22,7 +22,7 @@ The following page offers a range of examples of YAML specification for various
 data_catalog_yaml_examples
 ```
 
-Once you are familiar with the format of `catalog.yml`, you may find your catalog gets repetitive if you need to load multiple datasets with similar configuration. From Kedro 0.18.12 you can use dataset factories to generalise the configuration and reduce the number of similar catalog entries. This works by by matching datasets used in your project’s pipelines to dataset factory patterns and is explained in a new page about Kedro dataset factories:
+Once you are familiar with the format of `catalog.yml`, you may find your catalog gets repetitive if you need to load multiple datasets with similar configuration. From Kedro 0.18.12 you can use dataset factories to generalise the configuration and reduce the number of similar catalog entries. This works by matching datasets used in your project’s pipelines to dataset factory patterns and is explained in a new page about Kedro dataset factories:
 
 
 ```{toctree}
diff --git a/docs/source/data/partitioned_and_incremental_datasets.md b/docs/source/data/partitioned_and_incremental_datasets.md
index fde9dfd90a..a57b56d2a4 100644
--- a/docs/source/data/partitioned_and_incremental_datasets.md
+++ b/docs/source/data/partitioned_and_incremental_datasets.md
@@ -75,12 +75,12 @@ my_partitioned_dataset:
 Here is an exhaustive list of the arguments supported by `PartitionedDataset`:
 
 | Argument | Required | Supported types | Description |
-| ----------------- | ------------------------------ | ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ----------------- | ------------------------------ | ------------------------------------------------ |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | `path` | Yes | `str` | Path to the folder containing partitioned data. If path starts with the protocol (e.g., `s3://`) then the corresponding `fsspec` concrete filesystem implementation will be used. If protocol is not specified, local filesystem will be used |
 | `dataset` | Yes | `str`, `Type[AbstractDataset]`, `Dict[str, Any]` | Underlying dataset definition, for more details see the section below |
 | `credentials` | No | `Dict[str, Any]` | Protocol-specific options that will be passed to `fsspec.filesystemcall`, for more details see the section below |
 | `load_args` | No | `Dict[str, Any]` | Keyword arguments to be passed into `find()` method of the corresponding filesystem implementation |
-| `filepath_arg` | No | `str` (defaults to `filepath`) | Argument name of the underlying dataset initializer that will contain a path to an individual partition |
+| `filepath_arg` | No | `str` (defaults to `filepath`) | Argument name of the underlying dataset initialiser that will contain a path to an individual partition |
 | `filename_suffix` | No | `str` (defaults to an empty string) | If specified, partitions that don't end with this string will be ignored |
 
 ### Dataset definition
diff --git a/docs/source/deployment/databricks/databricks_deployment_workflow.md b/docs/source/deployment/databricks/databricks_deployment_workflow.md
index 245708e6bf..26c7a1634b 100644
--- a/docs/source/deployment/databricks/databricks_deployment_workflow.md
+++ b/docs/source/deployment/databricks/databricks_deployment_workflow.md
@@ -33,7 +33,7 @@ For those reasons, the packaging approach is unsuitable for development projects
 The sequence of steps described in this section is as follows:
 
 1. [Note your Databricks username and host](#note-your-databricks-username-and-host)
-2. [Install Kedro and the databricks CLI in a new virtual environment](#install-kedro-and-the-databricks-cli-in-a-new-virtual-environment)
+2. [Install Kedro and the Databricks CLI in a new virtual environment](#install-kedro-and-the-databricks-cli-in-a-new-virtual-environment)
 3. [Authenticate the Databricks CLI](#authenticate-the-databricks-cli)
 4. [Create a new Kedro project](#create-a-new-kedro-project)
 5. [Create an entry point for Databricks](#create-an-entry-point-for-databricks)
@@ -49,10 +49,10 @@ Find your Databricks username in the top right of the workspace UI and the host
 ![Find Databricks host and username](../../meta/images/find_databricks_host_and_username.png)
 
 ```{note}
-Your databricks host must include the protocol (`https://`).
+Your Databricks host must include the protocol (`https://`).
 ```
 
-### Install Kedro and the databricks CLI in a new virtual environment
+### Install Kedro and the Databricks CLI in a new virtual environment
 
 The following commands will create a new `conda` environment, activate it, and then install Kedro and the Databricks CLI.
 
diff --git a/docs/source/deployment/databricks/databricks_notebooks_development_workflow.md b/docs/source/deployment/databricks/databricks_notebooks_development_workflow.md
index 5867163ab9..ef2081a28a 100644
--- a/docs/source/deployment/databricks/databricks_notebooks_development_workflow.md
+++ b/docs/source/deployment/databricks/databricks_notebooks_development_workflow.md
@@ -4,7 +4,7 @@ This guide demonstrates a workflow for developing Kedro projects on Databricks u
 
 This method of developing a Kedro project for use on Databricks is ideal for developers who prefer developing their projects in notebooks rather than an in an IDE. It also avoids the overhead of setting up and syncing a local environment with Databricks. If you want to take advantage of the powerful features of an IDE to develop your project, consider following the [guide for developing a Kedro project for Databricks using your local environment](./databricks_ide_development_workflow.md).
 
-In this guide, you will store your project's code in a repository on [GitHub](https://github.com/). Databricks integrates with many [Git providers](https://docs.databricks.com/repos/index.html#supported-git-providers), including GitLab and Azure Devops. The steps to create a Git repository and sync it with Databricks also generally apply to these Git providers, though the exact details may vary.
+In this guide, you will store your project's code in a repository on [GitHub](https://github.com/). Databricks integrates with many [Git providers](https://docs.databricks.com/repos/index.html#supported-git-providers), including GitLab and Azure DevOps. The steps to create a Git repository and sync it with Databricks also generally apply to these Git providers, though the exact details may vary.
 
 ## What this page covers
 
@@ -263,7 +263,7 @@ Now that your project has run successfully once, you can make changes using the
 
 The `databricks-iris` starter uses a default 80-20 ratio of training data to test data when training the classifier. You will edit this ratio to 70-30 and re-run your project to view the different result.
 
-In the Databricks workspace, click on the `Repos` tab in the side bar and navigate to `/iris-databricks/conf/base/`. Open the the file `parameters.yml` by double-clicking it. This will take you to a built-in file editor. Edit the line `train_fraction: 0.8` to `train_fraction: 0.7`, your changes will automatically be saved.
+In the Databricks workspace, click on the `Repos` tab in the side bar and navigate to `/iris-databricks/conf/base/`. Open the file `parameters.yml` by double-clicking it. This will take you to a built-in file editor. Edit the line `train_fraction: 0.8` to `train_fraction: 0.7`, your changes will automatically be saved.
 
 ![Databricks edit file](../../meta/images/databricks_edit_file.png)
 
diff --git a/docs/source/development/commands_reference.md b/docs/source/development/commands_reference.md
index adf3db84c3..ded8da9dcc 100644
--- a/docs/source/development/commands_reference.md
+++ b/docs/source/development/commands_reference.md
@@ -446,7 +446,7 @@ kedro micropkg package
 Further information is available in the [micro-packaging documentation](../nodes_and_pipelines/micro_packaging.md).
 
 ##### Pull a micro-package in your project
-The following command pulls all the files related to a micro-package, e.g. a modular pipeline, from either [Pypi](https://pypi.org/) or a storage location of a [Python source distribution file](https://packaging.python.org/overview/#python-source-distributions).
+The following command pulls all the files related to a micro-package, e.g. a modular pipeline, from either [PyPI](https://pypi.org/) or a storage location of a [Python source distribution file](https://packaging.python.org/overview/#python-source-distributions).
 
 ```bash
 kedro micropkg pull (or path to a sdist file)
diff --git a/docs/source/experiment_tracking/index.md b/docs/source/experiment_tracking/index.md
index 31bff89ee2..3004fe28e0 100644
--- a/docs/source/experiment_tracking/index.md
+++ b/docs/source/experiment_tracking/index.md
@@ -32,7 +32,7 @@ Kedro-Viz version 6.2 includes support for collaborative experiment tracking usi
 The choice of experiment tracking tool depends on your use case and choice of complementary tools, such as MLflow and Neptune:
 
 - **Kedro** - If you need experiment tracking, are looking for improved metrics visualisation and want a lightweight tool to work alongside existing functionality in Kedro. Kedro does not support a model registry.
-- **MLflow** - You can combine MLFlow with Kedro by using [`kedro-mlflow`](https://kedro-mlflow.readthedocs.io/en/stable/) if you require experiment tracking, model registry and/or model serving capabilities or have access to Managed MLflow within the Databricks ecosystem.
+- **MLflow** - You can combine MLflow with Kedro by using [`kedro-mlflow`](https://kedro-mlflow.readthedocs.io/en/stable/) if you require experiment tracking, model registry and/or model serving capabilities or have access to Managed MLflow within the Databricks ecosystem.
 - **Neptune** - If you require experiment tracking and model registry functionality, improved visualisation of metrics and support for collaborative data science, you may consider [`kedro-neptune`](https://docs.neptune.ai/integrations/kedro/) for your workflow.
 
 [We support a growing list of integrations](../extend_kedro/plugins.md).
diff --git a/docs/source/extend_kedro/architecture_overview.md b/docs/source/extend_kedro/architecture_overview.md
index 272fcef572..44d046fd02 100644
--- a/docs/source/extend_kedro/architecture_overview.md
+++ b/docs/source/extend_kedro/architecture_overview.md
@@ -3,7 +3,7 @@
 There are different ways to leverage Kedro in your work, you can:
 
  - Commit to using all of Kedro (framework, project, starters and library); which is preferable to take advantage of the full value proposition of Kedro
- - You can leverage parts of Kedro, like the DataCatalog (I/O), ConfigLoader, Pipelines and Runner, by using it as a Python libary; this best supports a workflow where you don't want to adopt the Kedro project template
+ - You can leverage parts of Kedro, like the DataCatalog (I/O), ConfigLoader, Pipelines and Runner, by using it as a Python library; this best supports a workflow where you don't want to adopt the Kedro project template
 - Or, you can develop extensions for Kedro e.g. custom starters, plugins, Hooks and more
 
 At a high level, Kedro consists of five main parts:
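
The final hunk above edits the bullet about leveraging parts of Kedro, such as the `DataCatalog`, as a plain Python library rather than through the project template. A minimal sketch of that library-style usage is shown below; it is illustrative only and not part of the patch, and it assumes a Kedro 0.18.x environment where `DataCatalog` and `MemoryDataSet` are importable from `kedro.io` (the dataset name `example_data` is made up for the example).

```python
# Illustrative sketch (not part of the patch): using Kedro's DataCatalog as a
# standalone library, without adopting the Kedro project template.
# Assumes Kedro 0.18.x, where DataCatalog and MemoryDataSet are exposed by kedro.io.
from kedro.io import DataCatalog, MemoryDataSet

# Register datasets by name; MemoryDataSet keeps the data in memory, but any
# other dataset implementation (CSV, Parquet, ...) could be registered instead.
catalog = DataCatalog({"example_data": MemoryDataSet()})

# Save and load through the catalog rather than handling I/O by hand.
catalog.save("example_data", [1, 2, 3])
assert catalog.load("example_data") == [1, 2, 3]
print(catalog.list())  # ['example_data']
```

In a full Kedro project the catalog would normally be built from `catalog.yml` (the configuration file discussed in the first hunk of this patch) rather than constructed by hand.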