Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to kedro 19 #502

Merged
merged 12 commits into from
Dec 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 10 additions & 11 deletions .github/ISSUE_TEMPLATE/bug_report.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,31 +7,30 @@ assignees: ''

---

**_If you like the repo, please give it a :star:_**
<!-- **_If you like the repo, please give it a :star:_** -->

## Description

Short description of the problem here.
<!-- Short description of the problem here. -->

## Context

How has this bug affected you? What were you trying to accomplish?
<!-- How has this bug affected you? What were you trying to accomplish? -->

## Steps to Reproduce

Please provide a detailed description. A Minimal Reproducible Example would really help to solve your issue faster (see this [Stack Overflow thread](https://stackoverflow.com/help/minimal-reproducible-example) to see how to create a good "reprex"). A link to a github repo is even better.
<!-- Please provide a detailed description. A Minimal Reproducible Example would really help to solve your issue faster (see this [Stack Overflow thread](https://stackoverflow.com/help/minimal-reproducible-example) to see how to create a good "reprex"). A link to a github repo is even better.
1. [First Step]
2. [Second Step]
3. [And so on...]
3. [And so on...] -->

## Expected Result

Tell us what should happen.
<!-- Tell us what should happen. -->

## Actual Result

Tell us what happens instead.
<!-- Tell us what happens instead. -->

```
-- If you received an error, place it here.
Expand All @@ -43,20 +42,20 @@ Tell us what happens instead.

## Your Environment

Include as many relevant details about the environment in which you experienced the bug:
<!-- Include as many relevant details about the environment in which you experienced the bug: -->

* `kedro` and `kedro-mlflow` version used (`pip show kedro` and `pip show kedro-mlflow`):
* Python version used (`python -V`):
* Operating system and version:

## Does the bug also happen with the last version on master?

The plugin is still in early development and known bugs are fixed as soon as we can. If you are lucky, your bug is already fixed on the `master` branch which is the most up to date. This branch contains our more recent development unpublished on PyPI yet.
<!-- The plugin is still in early development and known bugs are fixed as soon as we can. If you are lucky, your bug is already fixed on the `master` branch which is the most up to date. This branch contains our more recent development unpublished on PyPI yet.
In your environment, please try:
```bash
pip install --upgrade git+https://github.com/Galileo-Galilei/kedro-mlflow
```
And check if you can to reproduce the error. If you can't, just wait for the next release or use the master branch at your own risk!
And check if you can to reproduce the error. If you can't, just wait for the next release or use the master branch at your own risk! -->
10 changes: 5 additions & 5 deletions .github/ISSUE_TEMPLATE/feature_request.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,16 +6,16 @@ labels: 'Issue: Feature Request'
assignees: ''
---

*If you like the repo, please give it a :star:*
<!-- **_If you like the repo, please give it a :star:_** -->

## Description
A clear and concise description of what you want to achieve. An image or a code example is worth thousand words!
<!-- A clear and concise description of what you want to achieve. An image or a code example is worth thousand words! -->

## Context
Why is this change important to you? How would you use it? How can it benefit other users?
<!-- Why is this change important to you? How would you use it? How can it benefit other users? -->

## Possible Implementation
(Optional) Suggest an idea for implementing the addition or change.
<!-- (Optional) Suggest an idea for implementing the addition or change. -->

## Possible Alternatives
(Optional) Describe any alternative solutions or features you've considered.
<!-- (Optional) Describe any alternative solutions or features you've considered. -->
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
exclude: ^kedro_mlflow/template/project/run.py$
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.3
rev: v0.1.8
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
Expand Down
6 changes: 4 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,12 @@

- :sparkles: Add support for python 3.11 ([#450, rxm7706](https://github.com/Galileo-Galilei/kedro-mlflow/pull/450))
- :sparkles: :arrow_up: Add support for pydantic v2 ([#476](https://github.com/Galileo-Galilei/kedro-mlflow/pull/476))
- :sparkles: :arrow_up: Add support for ``kedro==0.19.X`` ([#](https://github.com/Galileo-Galilei/kedro-mlflow/pull/))

### Changed

- :boom: :sparkles: Change default ``copy_mode`` to ``"assign"`` in ``KedroPipelineModel`` because this is the most efficient setup (and usually the desired one) when serving a Kedro ``Pipeline`` as a Mlflow model. This is different from Kedro's default which is to deepcopy the dataset ([#463, ShubhamZoro](https://github.com/Galileo-Galilei/kedro-mlflow/pull/463)).
- :boom: ::arrow_up:: Drop support for ``kedro==0.18.X`` series.
- :boom: :sparkles: Change default ``copy_mode`` to ``"assign"`` in ``KedroPipelineModel`` because this is the most efficient setup (and usually the desired one) when serving a Kedro ``Pipeline`` as a Mlflow model. This is different from Kedro's default which is to deepcopy the dataset ([#463](https://github.com/Galileo-Galilei/kedro-mlflow/pull/463)).
- :boom: :recycle: ``MlflowArtifactDataset.__init__`` method ``data_set`` argument is renamed ``dataset`` to match new Kedro conventions ([#391](https://github.com/Galileo-Galilei/kedro-mlflow/pull/391)).
- :boom: :recycle: Rename the following ``DataSets`` with the ``Dataset`` suffix (without capitalized ``S``) to match new kedro conventions from ``kedro>=0.19`` and onwards ([#439, ShubhamZoro](https://github.com/Galileo-Galilei/kedro-mlflow/pull/439)):
- ``MlflowArtifactDataSet``->``MlflowArtifactDataset``
Expand All @@ -26,7 +28,7 @@

### Fixed

- :bug: Avoid error when using kedro==0.18.1 with `TemplatedConfigLoader` and no `mlflow.yml` configuration file ([#452, sami-sweng](https://github.com/Galileo-Galilei/kedro-mlflow/issues/452))
- :bug: Avoid error when using ``kedro==0.18.1`` with `TemplatedConfigLoader` and no `mlflow.yml` configuration file ([#452, sami-sweng](https://github.com/Galileo-Galilei/kedro-mlflow/issues/452))

## [0.11.9] - 2023-07-23

Expand Down
2 changes: 1 addition & 1 deletion docs/source/03_getting_started/01_example_project.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ pip install kedro-mlflow==0.11.10

## Install the toy project

For this end to end example, we will use the [kedro starter](https://kedro.readthedocs.io/en/stable/get_started/starters.html) with the [iris dataset](https://github.com/quantumblacklabs/kedro-starter-pandas-iris).
For this end to end example, we will use the [kedro starter](https://docs.kedro.org/en/stable/starters/starters.html#official-kedro-starters) with the [iris dataset](https://github.com/quantumblacklabs/kedro-starter-pandas-iris).

We use this project because:

Expand Down
2 changes: 1 addition & 1 deletion docs/source/04_experimentation_tracking/06_mlflow_ui.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Mlflow offers a user interface (UI) that enable to browse the run history.

## The ``kedro-mlflow`` helper

When you use a local storage for kedro mlflow, you can call a [mlflow cli command](https://www.mlflow.org/docs/latest/quickstart.html#viewing-the-tracking-ui) to launch the UI if you do not have a [mlflow tracking server configured](https://www.mlflow.org/docs/latest/tracking.html#tracking-ui).
When you use a local storage for kedro mlflow, you can call a [mlflow cli command](https://www.mlflow.org/docs/latest/tracking.html#tracking-ui) to launch the UI if you do not have a [mlflow tracking server configured](https://www.mlflow.org/docs/latest/tracking.html#mlflow-tracking-server-optional).

To ensure this UI is linked to the tracking uri specified configuration, ``kedro-mlflow`` offers the following command:

Expand Down
2 changes: 1 addition & 1 deletion kedro_mlflow/framework/hooks/mlflow_hook.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ def after_context_created(
{"mlflow": ["mlflow*", "mlflow*/**", "**/mlflow*"]}
)
conf_mlflow_yml = context.config_loader["mlflow"]
except (MissingConfigException, AttributeError):
except MissingConfigException:
LOGGER.warning(
"No 'mlflow.yml' config file found in environment. Default configuration will be used. Use ``kedro mlflow init`` command in CLI to customize the configuration."
)
Expand Down
6 changes: 1 addition & 5 deletions kedro_mlflow/io/artifacts/mlflow_artifact_dataset.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
import shutil
from inspect import isclass
from pathlib import Path
from typing import Any, Dict, Union

import mlflow
from kedro.io import AbstractDataset, AbstractVersionedDataset
from kedro.io import AbstractVersionedDataset
from kedro.io.core import parse_dataset_definition
from mlflow.tracking import MlflowClient

Expand All @@ -21,9 +20,6 @@ def __new__(
artifact_path: str = None,
credentials: Dict[str, Any] = None,
):
if isclass(dataset["type"]) and issubclass(dataset["type"], AbstractDataset):
# parse_dataset_definition needs type to be a string, not the class itself
dataset["type"] = f"{dataset['type'].__module__}.{dataset['type'].__name__}"
dataset_obj, dataset_args = parse_dataset_definition(config=dataset)

# fake inheritance : this mlflow class should be a mother class which wraps
Expand Down
2 changes: 1 addition & 1 deletion kedro_mlflow/io/metrics/__init__.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
from .mlflow_metric_dataset import MlflowMetricDataset
from .mlflow_metric_history_dataset import MlflowMetricHistoryDataset
from .mlflow_metrics_dataset import MlflowMetricsHistoryDataset
from .mlflow_metrics_history_dataset import MlflowMetricsHistoryDataset

__all__ = [
"MlflowMetricDataset",
Expand Down
6 changes: 3 additions & 3 deletions kedro_mlflow/io/models/mlflow_abstract_model_dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,9 @@
from kedro.io.core import DatasetError


class MlflowModelRegistryDataset(AbstractVersionedDataset):
class MlflowAbstractModelDataSet(AbstractVersionedDataset):
"""
Absract mother class for model datasets.
Abstract mother class for model datasets.
"""

def __init__(
Expand All @@ -21,7 +21,7 @@ def __init__(
save_args: Dict[str, Any] = None,
version: Version = None,
) -> None:
"""Initialize the Kedro MlflowModelDataSet.
"""Initialize the Kedro MlflowAbstractModelDataSet.

Parameters are passed from the Data Catalog.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,11 @@
from kedro.io import Version

from kedro_mlflow.io.models.mlflow_abstract_model_dataset import (
MlflowModelRegistryDataset,
MlflowAbstractModelDataSet,
)


class MlflowModelLocalFileSystemDataset(MlflowModelRegistryDataset):
class MlflowModelLocalFileSystemDataset(MlflowAbstractModelDataSet):
"""Wrapper for saving, logging and loading for all MLflow model flavor."""

def __init__(
Expand Down
4 changes: 2 additions & 2 deletions kedro_mlflow/io/models/mlflow_model_registry_dataset.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
from typing import Any, Dict, Optional, Union

from kedro_mlflow.io.models.mlflow_abstract_model_dataset import (
MlflowModelRegistryDataset,
MlflowAbstractModelDataSet,
)


class MlflowModelRegistryDataset(MlflowModelRegistryDataset):
class MlflowModelRegistryDataset(MlflowAbstractModelDataSet):
"""Wrapper for saving, logging and loading for all MLflow model flavor."""

def __init__(
Expand Down
4 changes: 2 additions & 2 deletions kedro_mlflow/io/models/mlflow_model_tracking_dataset.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@
from kedro.io.core import DatasetError

from kedro_mlflow.io.models.mlflow_abstract_model_dataset import (
MlflowModelRegistryDataset,
MlflowAbstractModelDataSet,
)


class MlflowModelTrackingDataset(MlflowModelRegistryDataset):
class MlflowModelTrackingDataset(MlflowAbstractModelDataSet):
"""Wrapper for saving, logging and loading for all MLflow model flavor."""

def __init__(
Expand Down
9 changes: 0 additions & 9 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -1,12 +1,3 @@
[tool.black]
exclude = "/template/"

[tool.isort]
profile = "black"
sections=["FUTURE","STDLIB","THIRDPARTY","FIRSTPARTY","LOCALFOLDER"]
known_first_party=["kedro_mlflow"]
known_third_party=["anyconfig","black","click","cookiecutter","flake8","isort","jinja2","kedro","mlflow","pandas","pytest","pytest_lazyfixture","setuptools","sklearn","yaml"]

[tool.ruff]
select = [
"F", # Pyflakes
Expand Down
2 changes: 1 addition & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
kedro>=0.18.1, <0.19.0
kedro>=0.19.0, <0.20.0
kedro_datasets
mlflow>=1.0.0, <3.0.0
pydantic>=1.0.0, <3.0.0
22 changes: 9 additions & 13 deletions tests/config/test_get_mlflow_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ def test_mlflow_config_default(kedro_project):
def test_mlflow_config_in_uninitialized_project(kedro_project, package_name):
# config_with_base_mlflow_conf is a pytest.fixture in conftest
bootstrap_project(kedro_project)
session = KedroSession.create(project_path=kedro_project, package_name=package_name)
session = KedroSession.create(project_path=kedro_project)
context = session.load_context()
assert context.mlflow.dict() == dict(
server=dict(
Expand All @@ -90,9 +90,7 @@ def test_mlflow_config_with_no_experiment_name(kedro_project):
open((kedro_project / "conf" / "base" / "mlflow.yml").as_posix(), mode="w").close()

bootstrap_project(kedro_project)
session = KedroSession.create(
project_path=kedro_project, package_name="fake_project"
)
session = KedroSession.create(project_path=kedro_project)
context = session.load_context()
assert context.mlflow.dict() == dict(
server=dict(
Expand Down Expand Up @@ -203,9 +201,7 @@ def fake_project(tmp_path, local_logging_config):
)
def test_mlflow_config_correctly_set(kedro_project, project_settings):
bootstrap_project(kedro_project)
session = KedroSession.create(
project_path=kedro_project, package_name="fake_project"
)
session = KedroSession.create(project_path=kedro_project)
context = session.load_context()
assert context.mlflow.dict(exclude={"project_path"}) == dict(
server=dict(
Expand All @@ -229,7 +225,7 @@ def test_mlflow_config_correctly_set(kedro_project, project_settings):

# TODO: when OmegaConfigLoader will support templating with globals, add a similar test
@pytest.mark.usefixtures("mock_settings_omega_config_loader_class")
def test_mlflow_config_interpolated_with_globals_resolver(fake_project):
def test_mlflow_config_interpolated_with_globals_resolver(monkeypatch, fake_project):
dict_config = dict(
server=dict(
mlflow_tracking_uri="${globals: mlflow_tracking_uri}",
Expand Down Expand Up @@ -266,7 +262,7 @@ def test_mlflow_config_interpolated_with_globals_resolver(fake_project):
).as_uri()

bootstrap_project(fake_project)
with KedroSession.create("fake_package", fake_project) as session:
with KedroSession.create(fake_project) as session:
context = session.load_context()
assert context.mlflow.dict(exclude={"project_path"}) == expected

Expand Down Expand Up @@ -323,7 +319,7 @@ def request_headers(self):
_write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config)

bootstrap_project(fake_project)
with KedroSession.create("fake_package", fake_project) as session:
with KedroSession.create(project_path=fake_project) as session:
session.load_context() # trigger setup and request_header_provider registration

assert (
Expand Down Expand Up @@ -382,7 +378,7 @@ def request_headers(self):
_write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config)

bootstrap_project(fake_project)
with KedroSession.create("fake_package", fake_project) as session:
with KedroSession.create(project_path=fake_project) as session:
session.load_context() # trigger setup and request_header_provider registration

assert (
Expand Down Expand Up @@ -445,7 +441,7 @@ def request_headers(self):
_write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config)

bootstrap_project(fake_project)
with KedroSession.create("fake_package", fake_project) as session:
with KedroSession.create(project_path=fake_project) as session:
session.load_context() # trigger setup and request_header_provider registration

assert (
Expand Down Expand Up @@ -500,7 +496,7 @@ def request_headers(self):
_write_yaml(fake_project / "conf" / "local" / "mlflow.yml", dict_config)

bootstrap_project(fake_project)
with KedroSession.create("fake_package", fake_project) as session:
with KedroSession.create(project_path=fake_project) as session:
with pytest.raises(ValueError, match=r"should be a sublass of"):
session.load_context() # trigger setup and request_header_provider registration

Expand Down
4 changes: 1 addition & 3 deletions tests/framework/cli/test_cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,9 +104,7 @@ def test_cli_init_existing_config(
monkeypatch.chdir(kedro_project_with_mlflow_conf)
bootstrap_project(kedro_project_with_mlflow_conf)

with KedroSession.create(
"fake_project", project_path=kedro_project_with_mlflow_conf
) as session:
with KedroSession.create(project_path=kedro_project_with_mlflow_conf) as session:
# emulate first call by writing a mlflow.yml file
yaml_str = yaml.dump(dict(server=dict(mlflow_tracking_uri="toto")))
(
Expand Down
4 changes: 2 additions & 2 deletions tests/framework/hooks/test_hook_deactivate_tracking.py
Original file line number Diff line number Diff line change
Expand Up @@ -115,9 +115,9 @@ def config_dir(
payload = {
"tool": {
"kedro": {
"project_version": kedro_version,
"project_name": MOCK_PACKAGE_NAME,
"package_name": MOCK_PACKAGE_NAME,
"kedro_init_version": kedro_version,
}
}
}
Expand Down Expand Up @@ -221,7 +221,7 @@ def mock_session(mocker, mock_settings_with_mlflow_hooks, kedro_project_path):
) # prevent registering the one of the plugins which are already installed

configure_project(MOCK_PACKAGE_NAME)
return KedroSession.create(MOCK_PACKAGE_NAME, kedro_project_path)
return KedroSession.create(kedro_project_path)


def test_deactivated_tracking_but_not_for_given_pipeline(mock_session):
Expand Down
6 changes: 4 additions & 2 deletions tests/framework/hooks/test_hook_log_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ def dummy_catalog(tmp_path):
"raw_data": MemoryDataset(pd.DataFrame(data=[1], columns=["a"])),
"params:unused_param": MemoryDataset("blah"),
"data": MemoryDataset(),
"model": PickleDataset((tmp_path / "model.csv").as_posix()),
"model": PickleDataset(filepath=(tmp_path / "model.csv").as_posix()),
"my_metrics": MlflowMetricsHistoryDataset(),
"another_metrics": MlflowMetricsHistoryDataset(prefix="foo"),
"my_metric": MlflowMetricDataset(),
Expand Down Expand Up @@ -181,7 +181,9 @@ def test_mlflow_hook_metrics_dataset_with_run_id(
"params:unused_param": MemoryDataset("blah"),
"data": MemoryDataset(),
"model": PickleDataset(
(kedro_project_with_mlflow_conf / "data" / "model.csv").as_posix()
filepath=(
kedro_project_with_mlflow_conf / "data" / "model.csv"
).as_posix()
),
"my_metrics": MlflowMetricsHistoryDataset(run_id=existing_run_id),
"another_metrics": MlflowMetricsHistoryDataset(
Expand Down
Loading
Loading