diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md index 5d4c59d9..f93aee55 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -7,31 +7,30 @@ assignees: '' --- -**_If you like the repo, please give it a :star:_** + ## Description - -Short description of the problem here. + ## Context -How has this bug affected you? What were you trying to accomplish? + ## Steps to Reproduce -Please provide a detailed description. A Minimal Reproducible Example would really help to solve your issue faster (see this [Stack Overflow thread](https://stackoverflow.com/help/minimal-reproducible-example) to see how to create a good "reprex"). A link to a github repo is even better. + ## Expected Result -Tell us what should happen. + ## Actual Result -Tell us what happens instead. + ``` -- If you received an error, place it here. @@ -43,7 +42,7 @@ Tell us what happens instead. ## Your Environment -Include as many relevant details about the environment in which you experienced the bug: + * `kedro` and `kedro-mlflow` version used (`pip show kedro` and `pip show kedro-mlflow`): * Python version used (`python -V`): @@ -51,7 +50,7 @@ Include as many relevant details about the environment in which you experienced ## Does the bug also happen with the last version on master? -The plugin is still in early development and known bugs are fixed as soon as we can. If you are lucky, your bug is already fixed on the `master` branch which is the most up to date. This branch contains our more recent development unpublished on PyPI yet. 
+ diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md index 6b85060f..2b0c9a17 100644 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -6,16 +6,16 @@ labels: 'Issue: Feature Request' assignees: '' --- -*If you like the repo, please give it a :star:* + ## Description -A clear and concise description of what you want to achieve. An image or a code example is worth thousand words! + ## Context -Why is this change important to you? How would you use it? How can it benefit other users? + ## Possible Implementation -(Optional) Suggest an idea for implementing the addition or change. + ## Possible Alternatives -(Optional) Describe any alternative solutions or features you've considered. + diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 1fba8aff..1d49d5b3 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,7 +1,7 @@ exclude: ^kedro_mlflow/template/project/run.py$ repos: - repo: https://github.com/astral-sh/ruff-pre-commit - rev: v0.1.3 + rev: v0.1.8 hooks: - id: ruff args: [--fix, --exit-non-zero-on-fix] diff --git a/CHANGELOG.md b/CHANGELOG.md index 083ae15e..3ccee6b9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,10 +6,12 @@ - :sparkles: Add support for python 3.11 ([#450, rxm7706](https://github.com/Galileo-Galilei/kedro-mlflow/pull/450)) - :sparkles: :arrow_up: Add support for pydantic v2 ([#476](https://github.com/Galileo-Galilei/kedro-mlflow/pull/476)) +- :sparkles: :arrow_up: Add support for ``kedro==0.19.X`` ([#](https://github.com/Galileo-Galilei/kedro-mlflow/pull/)) ### Changed -- :boom: :sparkles: Change default ``copy_mode`` to ``"assign"`` in ``KedroPipelineModel`` because this is the most efficient setup (and usually the desired one) when serving a Kedro ``Pipeline`` as a Mlflow model. 
This is different from Kedro's default which is to deepcopy the dataset ([#463, ShubhamZoro](https://github.com/Galileo-Galilei/kedro-mlflow/pull/463)). +- :boom: :arrow_up: Drop support for ``kedro==0.18.X`` series. +- :boom: :sparkles: Change default ``copy_mode`` to ``"assign"`` in ``KedroPipelineModel`` because this is the most efficient setup (and usually the desired one) when serving a Kedro ``Pipeline`` as a Mlflow model. This is different from Kedro's default which is to deepcopy the dataset ([#463](https://github.com/Galileo-Galilei/kedro-mlflow/pull/463)). - :boom: :recycle: ``MlflowArtifactDataset.__init__`` method ``data_set`` argument is renamed ``dataset`` to match new Kedro conventions ([#391](https://github.com/Galileo-Galilei/kedro-mlflow/pull/391)). - :boom: :recycle: Rename the following ``DataSets`` with the ``Dataset`` suffix (without capitalized ``S``) to match new kedro conventions from ``kedro>=0.19`` and onwards ([#439, ShubhamZoro](https://github.com/Galileo-Galilei/kedro-mlflow/pull/439)): - ``MlflowArtifactDataSet``->``MlflowArtifactDataset`` @@ -26,7 +28,7 @@ ### Fixed -- :bug: Avoid error when using kedro==0.18.1 with `TemplatedConfigLoader` and no `mlflow.yml` configuration file ([#452, sami-sweng](https://github.com/Galileo-Galilei/kedro-mlflow/issues/452)) +- :bug: Avoid error when using ``kedro==0.18.1`` with `TemplatedConfigLoader` and no `mlflow.yml` configuration file ([#452, sami-sweng](https://github.com/Galileo-Galilei/kedro-mlflow/issues/452)) ## [0.11.9] - 2023-07-23 diff --git a/docs/source/03_getting_started/01_example_project.md b/docs/source/03_getting_started/01_example_project.md index 3dc208e5..f7a112a2 100644 --- a/docs/source/03_getting_started/01_example_project.md +++ b/docs/source/03_getting_started/01_example_project.md @@ -12,7 +12,7 @@ pip install kedro-mlflow==0.11.10 ## Install the toy project -For this end to end example, we will use the [kedro
starter](https://kedro.readthedocs.io/en/stable/get_started/starters.html) with the [iris dataset](https://github.com/quantumblacklabs/kedro-starter-pandas-iris). +For this end to end example, we will use the [kedro starter](https://docs.kedro.org/en/stable/starters/starters.html#official-kedro-starters) with the [iris dataset](https://github.com/quantumblacklabs/kedro-starter-pandas-iris). We use this project because: diff --git a/docs/source/04_experimentation_tracking/06_mlflow_ui.md b/docs/source/04_experimentation_tracking/06_mlflow_ui.md index 8c71297b..2dd6ec2b 100644 --- a/docs/source/04_experimentation_tracking/06_mlflow_ui.md +++ b/docs/source/04_experimentation_tracking/06_mlflow_ui.md @@ -6,7 +6,7 @@ Mlflow offers a user interface (UI) that enable to browse the run history. ## The ``kedro-mlflow`` helper -When you use a local storage for kedro mlflow, you can call a [mlflow cli command](https://www.mlflow.org/docs/latest/quickstart.html#viewing-the-tracking-ui) to launch the UI if you do not have a [mlflow tracking server configured](https://www.mlflow.org/docs/latest/tracking.html#tracking-ui). +When you use a local storage for kedro mlflow, you can call a [mlflow cli command](https://www.mlflow.org/docs/latest/tracking.html#tracking-ui) to launch the UI if you do not have a [mlflow tracking server configured](https://www.mlflow.org/docs/latest/tracking.html#mlflow-tracking-server-optional). 
To ensure this UI is linked to the tracking uri specified configuration, ``kedro-mlflow`` offers the following command: diff --git a/kedro_mlflow/framework/hooks/mlflow_hook.py b/kedro_mlflow/framework/hooks/mlflow_hook.py index de1d9a4b..547324ee 100644 --- a/kedro_mlflow/framework/hooks/mlflow_hook.py +++ b/kedro_mlflow/framework/hooks/mlflow_hook.py @@ -66,7 +66,7 @@ def after_context_created( {"mlflow": ["mlflow*", "mlflow*/**", "**/mlflow*"]} ) conf_mlflow_yml = context.config_loader["mlflow"] - except (MissingConfigException, AttributeError): + except MissingConfigException: LOGGER.warning( "No 'mlflow.yml' config file found in environment. Default configuration will be used. Use ``kedro mlflow init`` command in CLI to customize the configuration." ) diff --git a/kedro_mlflow/io/artifacts/mlflow_artifact_dataset.py b/kedro_mlflow/io/artifacts/mlflow_artifact_dataset.py index ad7f8842..56de1ca3 100644 --- a/kedro_mlflow/io/artifacts/mlflow_artifact_dataset.py +++ b/kedro_mlflow/io/artifacts/mlflow_artifact_dataset.py @@ -1,10 +1,9 @@ import shutil -from inspect import isclass from pathlib import Path from typing import Any, Dict, Union import mlflow -from kedro.io import AbstractDataset, AbstractVersionedDataset +from kedro.io import AbstractVersionedDataset from kedro.io.core import parse_dataset_definition from mlflow.tracking import MlflowClient @@ -21,9 +20,6 @@ def __new__( artifact_path: str = None, credentials: Dict[str, Any] = None, ): - if isclass(dataset["type"]) and issubclass(dataset["type"], AbstractDataset): - # parse_dataset_definition needs type to be a string, not the class itself - dataset["type"] = f"{dataset['type'].__module__}.{dataset['type'].__name__}" dataset_obj, dataset_args = parse_dataset_definition(config=dataset) # fake inheritance : this mlflow class should be a mother class which wraps diff --git a/kedro_mlflow/io/metrics/__init__.py b/kedro_mlflow/io/metrics/__init__.py index 2fdb707e..0c4a1e26 100644 --- 
a/kedro_mlflow/io/metrics/__init__.py +++ b/kedro_mlflow/io/metrics/__init__.py @@ -1,6 +1,6 @@ from .mlflow_metric_dataset import MlflowMetricDataset from .mlflow_metric_history_dataset import MlflowMetricHistoryDataset -from .mlflow_metrics_dataset import MlflowMetricsHistoryDataset +from .mlflow_metrics_history_dataset import MlflowMetricsHistoryDataset __all__ = [ "MlflowMetricDataset", diff --git a/kedro_mlflow/io/metrics/mlflow_metrics_dataset.py b/kedro_mlflow/io/metrics/mlflow_metrics_history_dataset.py similarity index 100% rename from kedro_mlflow/io/metrics/mlflow_metrics_dataset.py rename to kedro_mlflow/io/metrics/mlflow_metrics_history_dataset.py diff --git a/kedro_mlflow/io/models/mlflow_abstract_model_dataset.py b/kedro_mlflow/io/models/mlflow_abstract_model_dataset.py index 91c236e4..13326da7 100644 --- a/kedro_mlflow/io/models/mlflow_abstract_model_dataset.py +++ b/kedro_mlflow/io/models/mlflow_abstract_model_dataset.py @@ -7,9 +7,9 @@ from kedro.io.core import DatasetError -class MlflowModelRegistryDataset(AbstractVersionedDataset): +class MlflowAbstractModelDataSet(AbstractVersionedDataset): """ - Absract mother class for model datasets. + Abstract mother class for model datasets. """ def __init__( @@ -21,7 +21,7 @@ def __init__( save_args: Dict[str, Any] = None, version: Version = None, ) -> None: - """Initialize the Kedro MlflowModelDataSet. + """Initialize the Kedro MlflowAbstractModelDataSet. Parameters are passed from the Data Catalog. 
diff --git a/kedro_mlflow/io/models/mlflow_model_local_filesystem_dataset.py b/kedro_mlflow/io/models/mlflow_model_local_filesystem_dataset.py index 9c70e024..5efe2859 100644 --- a/kedro_mlflow/io/models/mlflow_model_local_filesystem_dataset.py +++ b/kedro_mlflow/io/models/mlflow_model_local_filesystem_dataset.py @@ -5,11 +5,11 @@ from kedro.io import Version from kedro_mlflow.io.models.mlflow_abstract_model_dataset import ( - MlflowModelRegistryDataset, + MlflowAbstractModelDataSet, ) -class MlflowModelLocalFileSystemDataset(MlflowModelRegistryDataset): +class MlflowModelLocalFileSystemDataset(MlflowAbstractModelDataSet): """Wrapper for saving, logging and loading for all MLflow model flavor.""" def __init__( diff --git a/kedro_mlflow/io/models/mlflow_model_registry_dataset.py b/kedro_mlflow/io/models/mlflow_model_registry_dataset.py index bc1dd2c5..8306dd33 100644 --- a/kedro_mlflow/io/models/mlflow_model_registry_dataset.py +++ b/kedro_mlflow/io/models/mlflow_model_registry_dataset.py @@ -1,11 +1,11 @@ from typing import Any, Dict, Optional, Union from kedro_mlflow.io.models.mlflow_abstract_model_dataset import ( - MlflowModelRegistryDataset, + MlflowAbstractModelDataSet, ) -class MlflowModelRegistryDataset(MlflowModelRegistryDataset): +class MlflowModelRegistryDataset(MlflowAbstractModelDataSet): """Wrapper for saving, logging and loading for all MLflow model flavor.""" def __init__( diff --git a/kedro_mlflow/io/models/mlflow_model_tracking_dataset.py b/kedro_mlflow/io/models/mlflow_model_tracking_dataset.py index 053a73bf..b91153a4 100644 --- a/kedro_mlflow/io/models/mlflow_model_tracking_dataset.py +++ b/kedro_mlflow/io/models/mlflow_model_tracking_dataset.py @@ -4,11 +4,11 @@ from kedro.io.core import DatasetError from kedro_mlflow.io.models.mlflow_abstract_model_dataset import ( - MlflowModelRegistryDataset, + MlflowAbstractModelDataSet, ) -class MlflowModelTrackingDataset(MlflowModelRegistryDataset): +class 
MlflowModelTrackingDataset(MlflowAbstractModelDataSet): """Wrapper for saving, logging and loading for all MLflow model flavor.""" def __init__( diff --git a/pyproject.toml b/pyproject.toml index ca45526a..f01fb44c 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,12 +1,3 @@ -[tool.black] -exclude = "/template/" - -[tool.isort] -profile = "black" -sections=["FUTURE","STDLIB","THIRDPARTY","FIRSTPARTY","LOCALFOLDER"] -known_first_party=["kedro_mlflow"] -known_third_party=["anyconfig","black","click","cookiecutter","flake8","isort","jinja2","kedro","mlflow","pandas","pytest","pytest_lazyfixture","setuptools","sklearn","yaml"] - [tool.ruff] select = [ "F", # Pyflakes diff --git a/requirements.txt b/requirements.txt index d86240da..8a8e9073 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,4 +1,4 @@ -kedro>=0.18.1, <0.19.0 +kedro>=0.19.0, <0.20.0 kedro_datasets mlflow>=1.0.0, <3.0.0 pydantic>=1.0.0, <3.0.0 diff --git a/tests/config/test_get_mlflow_config.py b/tests/config/test_get_mlflow_config.py index 841afdd9..5a46312b 100644 --- a/tests/config/test_get_mlflow_config.py +++ b/tests/config/test_get_mlflow_config.py @@ -63,7 +63,7 @@ def test_mlflow_config_default(kedro_project): def test_mlflow_config_in_uninitialized_project(kedro_project, package_name): # config_with_base_mlflow_conf is a pytest.fixture in conftest bootstrap_project(kedro_project) - session = KedroSession.create(project_path=kedro_project, package_name=package_name) + session = KedroSession.create(project_path=kedro_project) context = session.load_context() assert context.mlflow.dict() == dict( server=dict( @@ -90,9 +90,7 @@ def test_mlflow_config_with_no_experiment_name(kedro_project): open((kedro_project / "conf" / "base" / "mlflow.yml").as_posix(), mode="w").close() bootstrap_project(kedro_project) - session = KedroSession.create( - project_path=kedro_project, package_name="fake_project" - ) + session = KedroSession.create(project_path=kedro_project) context = session.load_context() 
assert context.mlflow.dict() == dict( server=dict( @@ -203,9 +201,7 @@ def fake_project(tmp_path, local_logging_config): ) def test_mlflow_config_correctly_set(kedro_project, project_settings): bootstrap_project(kedro_project) - session = KedroSession.create( - project_path=kedro_project, package_name="fake_project" - ) + session = KedroSession.create(project_path=kedro_project) context = session.load_context() assert context.mlflow.dict(exclude={"project_path"}) == dict( server=dict( @@ -229,7 +225,7 @@ def test_mlflow_config_correctly_set(kedro_project, project_settings): # TODO: when OmegaConfigLoader will support templating with globals, add a similar test @pytest.mark.usefixtures("mock_settings_omega_config_loader_class") -def test_mlflow_config_interpolated_with_globals_resolver(fake_project): +def test_mlflow_config_interpolated_with_globals_resolver(monkeypatch, fake_project): dict_config = dict( server=dict( mlflow_tracking_uri="${globals: mlflow_tracking_uri}", @@ -266,7 +262,7 @@ def test_mlflow_config_interpolated_with_globals_resolver(fake_project): ).as_uri() bootstrap_project(fake_project) - with KedroSession.create("fake_package", fake_project) as session: + with KedroSession.create(fake_project) as session: context = session.load_context() assert context.mlflow.dict(exclude={"project_path"}) == expected @@ -323,7 +319,7 @@ def request_headers(self): _write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config) bootstrap_project(fake_project) - with KedroSession.create("fake_package", fake_project) as session: + with KedroSession.create(project_path=fake_project) as session: session.load_context() # trigger setup and request_header_provider registration assert ( @@ -382,7 +378,7 @@ def request_headers(self): _write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config) bootstrap_project(fake_project) - with KedroSession.create("fake_package", fake_project) as session: + with KedroSession.create(project_path=fake_project) as 
session: session.load_context() # trigger setup and request_header_provider registration assert ( @@ -445,7 +441,7 @@ def request_headers(self): _write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config) bootstrap_project(fake_project) - with KedroSession.create("fake_package", fake_project) as session: + with KedroSession.create(project_path=fake_project) as session: session.load_context() # trigger setup and request_header_provider registration assert ( @@ -500,7 +496,7 @@ def request_headers(self): _write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config) bootstrap_project(fake_project) - with KedroSession.create("fake_package", fake_project) as session: + with KedroSession.create(project_path=fake_project) as session: with pytest.raises(ValueError, match=r"should be a sublass of"): session.load_context() # trigger setup and request_header_provider registration diff --git a/tests/framework/cli/test_cli.py b/tests/framework/cli/test_cli.py index da077d7a..0c10d1b9 100644 --- a/tests/framework/cli/test_cli.py +++ b/tests/framework/cli/test_cli.py @@ -104,9 +104,7 @@ def test_cli_init_existing_config( monkeypatch.chdir(kedro_project_with_mlflow_conf) bootstrap_project(kedro_project_with_mlflow_conf) - with KedroSession.create( - "fake_project", project_path=kedro_project_with_mlflow_conf - ) as session: + with KedroSession.create(project_path=kedro_project_with_mlflow_conf) as session: # emulate first call by writing a mlflow.yml file yaml_str = yaml.dump(dict(server=dict(mlflow_tracking_uri="toto"))) ( diff --git a/tests/framework/hooks/test_hook_deactivate_tracking.py b/tests/framework/hooks/test_hook_deactivate_tracking.py index 668e4410..9c4ebad6 100644 --- a/tests/framework/hooks/test_hook_deactivate_tracking.py +++ b/tests/framework/hooks/test_hook_deactivate_tracking.py @@ -115,9 +115,9 @@ def config_dir( payload = { "tool": { "kedro": { - "project_version": kedro_version, "project_name": MOCK_PACKAGE_NAME, "package_name": 
MOCK_PACKAGE_NAME, + "kedro_init_version": kedro_version, } } } @@ -221,7 +221,7 @@ def mock_session(mocker, mock_settings_with_mlflow_hooks, kedro_project_path): ) # prevent registering the one of the plugins which are already installed configure_project(MOCK_PACKAGE_NAME) - return KedroSession.create(MOCK_PACKAGE_NAME, kedro_project_path) + return KedroSession.create(kedro_project_path) def test_deactivated_tracking_but_not_for_given_pipeline(mock_session): diff --git a/tests/framework/hooks/test_hook_log_metrics.py b/tests/framework/hooks/test_hook_log_metrics.py index ea7f4962..b7197630 100644 --- a/tests/framework/hooks/test_hook_log_metrics.py +++ b/tests/framework/hooks/test_hook_log_metrics.py @@ -106,7 +106,7 @@ def dummy_catalog(tmp_path): "raw_data": MemoryDataset(pd.DataFrame(data=[1], columns=["a"])), "params:unused_param": MemoryDataset("blah"), "data": MemoryDataset(), - "model": PickleDataset((tmp_path / "model.csv").as_posix()), + "model": PickleDataset(filepath=(tmp_path / "model.csv").as_posix()), "my_metrics": MlflowMetricsHistoryDataset(), "another_metrics": MlflowMetricsHistoryDataset(prefix="foo"), "my_metric": MlflowMetricDataset(), @@ -181,7 +181,9 @@ def test_mlflow_hook_metrics_dataset_with_run_id( "params:unused_param": MemoryDataset("blah"), "data": MemoryDataset(), "model": PickleDataset( - (kedro_project_with_mlflow_conf / "data" / "model.csv").as_posix() + filepath=( + kedro_project_with_mlflow_conf / "data" / "model.csv" + ).as_posix() ), "my_metrics": MlflowMetricsHistoryDataset(run_id=existing_run_id), "another_metrics": MlflowMetricsHistoryDataset( diff --git a/tests/framework/hooks/test_hook_pipeline_ml.py b/tests/framework/hooks/test_hook_pipeline_ml.py index e898ef4a..d3147c16 100644 --- a/tests/framework/hooks/test_hook_pipeline_ml.py +++ b/tests/framework/hooks/test_hook_pipeline_ml.py @@ -91,7 +91,7 @@ def dummy_catalog(tmp_path): "raw_data": MemoryDataset(pd.DataFrame(data=[1], columns=["a"])), "params:unused_param": 
MemoryDataset("blah"), "data": MemoryDataset(), - "model": PickleDataset((tmp_path / "model.csv").as_posix()), + "model": PickleDataset(filepath=(tmp_path / "model.csv").as_posix()), } ) return dummy_catalog @@ -402,7 +402,9 @@ def test_mlflow_hook_save_pipeline_ml_with_parameters( "params:stopwords": MemoryDataset(["Hello", "Hi"]), "params:penalty": MemoryDataset(0.1), "model": PickleDataset( - (kedro_project_with_mlflow_conf / "data" / "model.csv").as_posix() + filepath=( + kedro_project_with_mlflow_conf / "data" / "model.csv" + ).as_posix() ), "params:threshold": MemoryDataset(0.5), } diff --git a/tests/mlflow/test_kedro_pipeline_model.py b/tests/mlflow/test_kedro_pipeline_model.py index a78dcf0a..696ee76a 100644 --- a/tests/mlflow/test_kedro_pipeline_model.py +++ b/tests/mlflow/test_kedro_pipeline_model.py @@ -162,8 +162,12 @@ def catalog_with_encoder(tmp_path): { "raw_data": MemoryDataset(), "data": MemoryDataset(), - "encoder": PickleDataset((tmp_path / "encoder.pkl").resolve().as_posix()), - "model": PickleDataset((tmp_path / "model.pkl").resolve().as_posix()), + "encoder": PickleDataset( + filepath=(tmp_path / "encoder.pkl").resolve().as_posix() + ), + "model": PickleDataset( + filepath=(tmp_path / "model.pkl").resolve().as_posix() + ), } ) return catalog_with_encoder @@ -176,9 +180,11 @@ def catalog_with_stopwords(tmp_path): "data": MemoryDataset(), "cleaned_data": MemoryDataset(), "stopwords_from_nltk": PickleDataset( - (tmp_path / "stopwords.pkl").resolve().as_posix() + filepath=(tmp_path / "stopwords.pkl").resolve().as_posix() + ), + "model": PickleDataset( + filepath=(tmp_path / "model.pkl").resolve().as_posix() ), - "model": PickleDataset((tmp_path / "model.pkl").resolve().as_posix()), } ) return catalog_with_stopwords @@ -192,7 +198,9 @@ def catalog_with_parameters(tmp_path): "cleaned_data": MemoryDataset(), "params:stopwords": MemoryDataset(["Hello", "Hi"]), "params:penalty": MemoryDataset(0), - "model": PickleDataset((tmp_path / 
"model.pkl").resolve().as_posix()), + "model": PickleDataset( + filepath=(tmp_path / "model.pkl").resolve().as_posix() + ), "params:threshold": MemoryDataset(0.5), } ) diff --git a/tests/pipeline/test_pipeline_ml.py b/tests/pipeline/test_pipeline_ml.py index e4a71841..3774058f 100644 --- a/tests/pipeline/test_pipeline_ml.py +++ b/tests/pipeline/test_pipeline_ml.py @@ -199,7 +199,7 @@ def dummy_catalog(): { "raw_data": MemoryDataset(), "data": MemoryDataset(), - "model": CSVDataset("fake/path/to/model.csv"), + "model": CSVDataset(filepath="fake/path/to/model.csv"), } ) return dummy_catalog @@ -211,8 +211,8 @@ def catalog_with_encoder(): { "raw_data": MemoryDataset(), "data": MemoryDataset(), - "encoder": CSVDataset("fake/path/to/encoder.csv"), - "model": CSVDataset("fake/path/to/model.csv"), + "encoder": CSVDataset(filepath="fake/path/to/encoder.csv"), + "model": CSVDataset(filepath="fake/path/to/model.csv"), } ) return catalog_with_encoder @@ -224,8 +224,8 @@ def catalog_with_stopwords(): { "data": MemoryDataset(), "cleaned_data": MemoryDataset(), - "stopwords_from_nltk": CSVDataset("fake/path/to/stopwords.csv"), - "model": CSVDataset("fake/path/to/model.csv"), + "stopwords_from_nltk": CSVDataset(filepath="fake/path/to/stopwords.csv"), + "model": CSVDataset(filepath="fake/path/to/model.csv"), } ) return catalog_with_stopwords @@ -239,7 +239,7 @@ def catalog_with_parameters(): "cleaned_data": MemoryDataset(), "params:stopwords": MemoryDataset(["Hello", "Hi"]), "params:penalty": MemoryDataset(0.1), - "model": CSVDataset("fake/path/to/model.csv"), + "model": CSVDataset(filepath="fake/path/to/model.csv"), "params:threshold": MemoryDataset(0.5), } )
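Most of the test churn above comes from one `kedro>=0.19` convention: dataset constructor arguments became keyword-only, so every positional call like `PickleDataset((tmp_path / "model.pkl").as_posix())` had to become `PickleDataset(filepath=...)`. A minimal sketch of the same pattern, using a hypothetical stand-in class rather than the real `kedro_datasets` implementation:

```python
# Illustrative stand-in for a kedro>=0.19-style dataset class (NOT the real
# kedro_datasets.pickle.PickleDataset): the bare `*` in the signature makes
# `filepath` keyword-only, so the old positional call style raises TypeError.
class FakePickleDataset:
    def __init__(self, *, filepath: str):
        self.filepath = filepath


# New-style call, as used throughout the updated tests in this diff.
ds = FakePickleDataset(filepath="data/model.pkl")
print(ds.filepath)  # data/model.pkl

# Old-style positional call, as it appeared before this diff.
try:
    FakePickleDataset("data/model.pkl")
except TypeError:
    print("positional call rejected")
```

The same release also changed `KedroSession.create`: the `package_name` positional argument was removed, which is why every `KedroSession.create("fake_package", fake_project)` call in the tests above is rewritten as `KedroSession.create(project_path=fake_project)`.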