Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp show: show metrics that include separator #9819

Merged
merged 7 commits into from
Aug 24, 2023
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 11 additions & 8 deletions dvc/repo/experiments/show.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,26 +177,29 @@ def _build_rows(
)


def _sort_column(
def _sort_column( # noqa: C901
sort_by: str,
metric_names: Mapping[str, Iterable[str]],
param_names: Mapping[str, Iterable[str]],
) -> Tuple[str, str, str]:
path, _, sort_name = sort_by.rpartition(":")
sep = ":"
parts = sort_by.split(sep)
matches: Set[Tuple[str, str, str]] = set()

if path:
for split_num in range(1, len(parts)):
path = sep.join(parts[:split_num])
sort_name = sep.join(parts[split_num:])
if path in metric_names and sort_name in metric_names[path]:
matches.add((path, sort_name, "metrics"))
if path in param_names and sort_name in param_names[path]:
matches.add((path, sort_name, "params"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would have to go from range(0, len(parts)), it has to cover the case where there is no path, and only a metric containing a colon

IMO this is also more confusing than requiring escaping, and I think there are still edge cases where the user could have filenames and metric names that overlap and break this kind of behavior

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would have to go from range(0, len(parts)), it has to cover the case where there is no path, and only a metric containing a colon

Thanks, fixed.

IMO this is also more confusing than requiring escaping, and I think there are still edge cases where the user could have filenames and metric names that overlap and break this kind of behavior

My thoughts on this PR:

  1. It requires nothing of the user
  2. Edge cases will show the same error about being ambiguous that shows now unless I miss something
  3. Managing escape characters between how they get read in the user's shell and how we handle them seems more confusing to me (and more for the user to understand)
  4. Don't see much harm since the implementation already exists

Do you feel it's a blocker and escaping is the only way to go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the existing behavior is already broken, we can merge this.

But my objection is more that this same problem affects every other command/flag combination that uses path:metric_or_param addressing (like exp run --set-param and probably also stage add --params). I would prefer to only have to fix this once, in a consistent and re-usable way. This PR is just patching over one symptom of a broader underlying problem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in the same vein, I'm not sure that we actually document that . is an invalid character in metric or parameter names, but for all intents and purposes we don't support it because we treat . as the dictionary level separator in metric/param addressing (even though it is valid to have string keys that contain . in yaml/json). It would be easier for us to just treat : the same way and disallow it entirely in metric/param names

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid points. As mentioned in the issue description, I don't see much harm in doing this unless/until we forbid these characters or have a better solution like escaping, so I'm going to merge but open an issue to discuss whether we should allow it.

else:
if not matches:
for path in metric_names:
if sort_name in metric_names[path]:
matches.add((path, sort_name, "metrics"))
if sort_by in metric_names[path]:
matches.add((path, sort_by, "metrics"))
for path in param_names:
if sort_name in param_names[path]:
matches.add((path, sort_name, "params"))
if sort_by in param_names[path]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support it in --set-param or are we implying that this will only be supported for exp show / metrics?

(in the current fix we are also supporting params in the exp show)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth discussing, but I figured even this limited fix is better than the current situation.

matches.add((path, sort_by, "params"))

if len(matches) == 1:
return matches.pop()
Expand Down
13 changes: 13 additions & 0 deletions tests/func/experiments/test_show.py
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,19 @@ def test_show_sort(tmp_dir, scm, dvc, exp_stage, caplog):
assert main(["exp", "show", "--no-pager", "--sort-by=metrics.yaml:foo"]) == 0


def test_show_sort_metric_sep(tmp_dir, scm, dvc, caplog):
daavoo marked this conversation as resolved.
Show resolved Hide resolved
metrics_path = tmp_dir / "metrics:1.json"
metrics_path.write_text('{"my::metric": 1}')
metrics_path = tmp_dir / "metrics:2.json"
metrics_path.write_text('{"my::metric": 2}')
dvcyaml_path = tmp_dir / "dvc.yaml"
dvcyaml_path.write_text("metrics: ['metrics:1.json', 'metrics:2.json']")
dvc.experiments.save()
assert (
main(["exp", "show", "--no-pager", "--sort-by=metrics:1.json:my::metric"]) == 0
)


@pytest.mark.vscode
@pytest.mark.parametrize(
"status, pid_exists",
Expand Down
Loading