feat: Beginning to flesh out the interface #5

Merged
merged 17 commits on Nov 15, 2024
3 changes: 3 additions & 0 deletions changelog/5.feature.md
@@ -0,0 +1,3 @@
Adds the concept of MetricProviders and Metrics to the core.
These represent the functionality that metric providers must implement in order to be part of the REF.
The implementation is still a work in progress and will be expanded in follow-up PRs.
27 changes: 27 additions & 0 deletions docs/explanation.md
@@ -12,6 +12,33 @@ Points we will aim to cover:
We will aim to avoid writing instructions or technical descriptions here;
they belong elsewhere.

## Metric Providers

The REF aims to support a variety of metric providers.
These providers are responsible for performing the calculations and analyses.

Each metric provider generally offers a number of different metrics that can be calculated.
An example implementation of a metric provider is available in the `ref_metrics_example` package.

### Metrics

A metric represents a specific calculation or analysis that can be performed on a dataset
or set of datasets with the aim of benchmarking the performance of different models.
These metrics often capture specific aspects of the Earth system and are compared against
observations of the same quantities.

A metric depends upon a set of input model data and observation datasets.

The result of a metric calculation can take a range of forms:

* A single scalar value
* Timeseries
* Plots

The Earth System Metrics and Diagnostics Standards
([EMDS](https://github.com/Earth-System-Diagnostics-Standards/EMDS))
provide a community standard for reporting outputs.
Following these standards makes it possible to generate standardised outputs that can be distributed.
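
As an illustration, here is a minimal sketch of a metric under the in-progress `Metric` protocol from `ref_core.metrics`; the class name and bundle contents are hypothetical placeholders rather than part of the REF.

```python
from ref_core.metrics import Configuration, MetricResult, TriggerInfo


class AnnualMeanSketch:
    """Hypothetical metric used only to illustrate the interface."""

    name = "annual-mean-sketch"

    def run(self, configuration: Configuration, trigger: TriggerInfo | None) -> MetricResult:
        # A real metric would analyse the triggering dataset(s) here and
        # populate a CMEC-style output bundle with its results.
        return MetricResult.build(configuration, {"RESULTS": {}})
```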

## Execution Environments

2 changes: 2 additions & 0 deletions packages/ref-core/pyproject.toml
@@ -21,10 +21,12 @@ classifiers = [
"Topic :: Scientific/Engineering",
]
dependencies = [
"attrs"
]

[tool.uv]
dev-dependencies = [

]

[build-system]
50 changes: 36 additions & 14 deletions packages/ref-core/src/ref_core/executor/__init__.py
@@ -13,9 +13,11 @@
"""

import os
from typing import Protocol, runtime_checkable
from typing import Any, Protocol, runtime_checkable

from .local import LocalExecutor
from ref_core.executor.local import LocalExecutor
from ref_core.metrics import Configuration, Metric, MetricResult, TriggerInfo
from ref_core.providers import MetricsProvider


@runtime_checkable
@@ -33,11 +35,30 @@ class Executor(Protocol):

name: str

def run_metric(self, metric: object, *args, **kwargs) -> object: # type: ignore
def run_metric(
self, metric: Metric, configuration: Configuration, trigger: TriggerInfo | None, **kwargs: Any
) -> MetricResult:
"""
Execute a metric

Parameters
----------
metric
Metric to run
configuration
Configuration to run the metric with
trigger
Information about the dataset that triggered the metric run

TODO: The optionality of this parameter is a placeholder and will be expanded in the future.
kwargs
Additional keyword arguments for the executor

Returns
-------
:
Results from running the metric
"""
# TODO: Add type hints for metric and return value in follow-up PR
...


@@ -94,7 +115,7 @@ def get(self, name: str) -> Executor:
get_executor = _default_manager.get


def run_metric(metric_name: str, *args, **kwargs) -> object: # type: ignore
def run_metric(metric_name: str, /, metrics_provider: MetricsProvider, **kwargs: Any) -> MetricResult:
"""
Run a metric using the default executor

@@ -107,13 +128,10 @@ def run_metric(metric_name: str, *args, **kwargs) -> object: # type: ignore
----------
metric_name
Name of the metric to run.

Eventually the metric will be sourced via some kind of registry.
For now, it's just a placeholder.
args
Extra arguments passed to the metric of interest
metrics_provider
Provider from which to retrieve the metric
kwargs
Extra keyword arguments passed to the metric of interest
Additional options passed to the metric executor

Returns
-------
@@ -123,10 +141,14 @@ def run_metric(metric_name: str, *args, **kwargs) -> object: # type: ignore
executor_name = os.environ.get("CMIP_REF_EXECUTOR", "local")

executor = get_executor(executor_name)
# metric = get_metric(metric_name) # TODO: Implement this
metric = kwargs.pop("metric")
metric = metrics_provider.get(metric_name)

result = executor.run_metric(metric, trigger=None, **kwargs)

# TODO: Validate the result
# TODO: Log the result

return executor.run_metric(metric, *args, **kwargs)
return result


register_executor(LocalExecutor())
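
For context, a usage sketch of the new `run_metric` entry point with the default local executor; `NoopMetric`, the provider name, and the output path are illustrative only.

```python
import pathlib

from ref_core.executor import run_metric
from ref_core.metrics import Configuration, MetricResult, TriggerInfo
from ref_core.providers import MetricsProvider


class NoopMetric:
    """Illustrative metric that writes an empty output bundle."""

    name = "noop"

    def run(self, configuration: Configuration, trigger: TriggerInfo | None) -> MetricResult:
        return MetricResult.build(configuration, {})


provider = MetricsProvider("example-provider", "0.1.0")
provider.register(NoopMetric())

output_directory = pathlib.Path("out")
output_directory.mkdir(exist_ok=True)

# CMIP_REF_EXECUTOR is unset here, so the default "local" executor is used
result = run_metric(
    "noop",
    metrics_provider=provider,
    configuration=Configuration(output_directory=output_directory),
)
assert result.successful
```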
18 changes: 15 additions & 3 deletions packages/ref-core/src/ref_core/executor/local.py
@@ -1,3 +1,8 @@
from typing import Any

from ref_core.metrics import Configuration, Metric, MetricResult, TriggerInfo


class LocalExecutor:
"""
Run a metric locally, in-process.
@@ -9,19 +14,26 @@ class LocalExecutor:

name = "local"

def run_metric(self, metric, *args, **kwargs): # type: ignore
def run_metric(
self, metric: Metric, configuration: Configuration, trigger: TriggerInfo | None, **kwargs: Any
) -> MetricResult:
"""
Run a metric in process

Parameters
----------
metric
args
Metric to run
configuration
Configuration to run the metric with
trigger
Information about the dataset that triggered the metric run
kwargs
Additional keyword arguments for the executor

Returns
-------
:
Results from running the metric
"""
return metric.run(*args, **kwargs)
return metric.run(configuration=configuration, trigger=trigger)
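
And the local executor used directly, which simply calls the metric's `run` method in the current process; the names here are again illustrative.

```python
import pathlib

from ref_core.executor.local import LocalExecutor
from ref_core.metrics import Configuration, MetricResult, TriggerInfo


class EchoMetric:
    """Illustrative metric; LocalExecutor delegates straight to run()."""

    name = "echo"

    def run(self, configuration: Configuration, trigger: TriggerInfo | None) -> MetricResult:
        return MetricResult.build(configuration, {})


out = pathlib.Path("out")
out.mkdir(exist_ok=True)

result = LocalExecutor().run_metric(
    EchoMetric(),
    configuration=Configuration(output_directory=out),
    trigger=None,
)
assert result.output_bundle == out / "output.json"
```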
144 changes: 144 additions & 0 deletions packages/ref-core/src/ref_core/metrics.py
@@ -0,0 +1,144 @@
import json
import pathlib
from typing import Any, Protocol, runtime_checkable

from attrs import frozen


@frozen
class Configuration:
"""
Configuration that describes the input data sources
"""

output_directory: pathlib.Path
"""
Directory to write output files to
"""

# TODO: Add more configuration options here


@frozen
class MetricResult:
"""
The result of running a metric.

The content of the result follows the Earth System Metrics and Diagnostics Standards
([EMDS](https://github.com/Earth-System-Diagnostics-Standards/EMDS/blob/main/standards.md)).
"""

# Do we want to load a serialised version of the output bundle here or just a file path?

output_bundle: pathlib.Path | None
"""
Path to the output bundle file.

The contents of this file are defined by
[EMDS standard](https://github.com/Earth-System-Diagnostics-Standards/EMDS/blob/main/standards.md#common-output-bundle-format-)
"""
successful: bool
"""
Whether the metric ran successfully.
"""
# Log info is in the output bundle file already, but is definitely useful

@staticmethod
def build(configuration: Configuration, cmec_output_bundle: dict[str, Any]) -> "MetricResult":
"""
Build a MetricResult from a CMEC output bundle.

Parameters
----------
configuration
The configuration used to run the metric.
cmec_output_bundle
An output bundle in the CMEC format.

TODO: This needs a better type hint

Returns
-------
:
A prepared MetricResult object.
The output bundle will be written to the output directory.
"""
with open(configuration.output_directory / "output.json", "w") as file_handle:
json.dump(cmec_output_bundle, file_handle)
return MetricResult(
output_bundle=configuration.output_directory / "output.json",
successful=True,
)


@frozen
class TriggerInfo:
"""
The reason why the metric was run.
"""

dataset: pathlib.Path
"""
Path to the dataset that triggered the metric run.
"""

# TODO:
# Add/remove/modified?
# dataset metadata


@runtime_checkable
class Metric(Protocol):
"""
Interface for the calculation of a metric.

This is a very high-level interface to provide maximum scope for the metrics packages
to have differing assumptions.
The configuration and output of the metric should follow the
Earth System Metrics and Diagnostics Standards formats as much as possible.

See [ref_example.example.ExampleMetric][] for an example implementation.
"""

name: str
"""
Name of the metric being run

This should be unique for a given provider,
but multiple providers can implement the same metric.
"""

# input_variable: list[VariableDefinition]
"""
TODO: implement VariableDefinition
Should extend the configuration defined in EMDS

Variables that the metric requires to run
Any modifications to the input data will trigger a new metric calculation.
"""
# observation_dataset: list[ObservationDatasetDefinition]
"""
TODO: implement ObservationDatasetDefinition
Should extend the configuration defined in EMDS. To check with Bouwe.
"""

def run(self, configuration: Configuration, trigger: TriggerInfo | None) -> MetricResult:
"""
Run the metric on the given configuration.

The implementation of this method is left to the metrics providers.

A CMEC-compatible package can use: TODO: Add link to CMEC metric wrapper

Parameters
----------
configuration : Configuration
The configuration to run the metric on.
trigger : TriggerInfo | None
Optional information about the dataset that triggered the metric run.

Returns
-------
MetricResult
The result of running the metric.
"""
58 changes: 58 additions & 0 deletions packages/ref-core/src/ref_core/providers.py
@@ -0,0 +1,58 @@
"""
Interfaces for metrics providers.

This defines how metrics packages interoperate with the REF framework.
"""

from ref_core.metrics import Metric


class MetricsProvider:
"""
Interface that a metrics provider must implement.

This provides a consistent interface to multiple different metrics packages.
"""

def __init__(self, name: str, version: str) -> None:
self.name = name
self.version = version

self._metrics: dict[str, Metric] = {}

def __len__(self) -> int:
return len(self._metrics)

def register(self, metric: Metric) -> None:
"""
Register a metric with the provider.

Parameters
----------
metric : Metric
The metric to register.
"""
if not isinstance(metric, Metric):
raise ValueError("Metric must be an instance of Metric")
self._metrics[metric.name.lower()] = metric

def get(self, name: str) -> Metric:
"""
Get a metric by name.

Parameters
----------
name : str
Name of the metric (case-insensitive).

Raises
------
KeyError
If the metric with the given name is not found.

Returns
-------
Metric
The requested metric.
"""
return self._metrics[name.lower()]
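
A sketch of how a metrics package might expose its provider; `ExampleMetric` and the provider name are illustrative.

```python
from ref_core.metrics import Configuration, MetricResult, TriggerInfo
from ref_core.providers import MetricsProvider


class ExampleMetric:
    name = "Example"

    def run(self, configuration: Configuration, trigger: TriggerInfo | None) -> MetricResult:
        return MetricResult.build(configuration, {})


# A metrics package would typically build its provider at import time
provider = MetricsProvider("ref_metrics_example", "0.1.0")
provider.register(ExampleMetric())

# Names are stored lowercased, so lookups are case-insensitive
assert provider.get("example") is provider.get("EXAMPLE")
assert len(provider) == 1
```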