dvclive: update how to track results #4674

dberenbaum · 2023-07-05T18:11:19Z

Opening this in place of #4660 based on the comment to keep everything in one page.

Closes #4644.

This separates how dvclive tracks results and works with git and dvc into its own page. Before merging, we should decide where these explanations are sufficient, and where we need to make product updates to simplify.

shcheklein · 2023-07-05T22:57:04Z

content/docs/dvclive/how-it-works.md

+
+<admon type="tip">
+
+`save_dvc_exp=True` is ignored when [running with DVC](#run-with-dvc) since


Should it be `save_dvc_exp=False` that is ignored? Or just `save_dvc_exp`?

daavoo

The increased complexity worries me, although I don't have that many ideas on how to fight that. I would prefer to keep DVCLive docs about the happy path.

I see some potential changes like:

- making save_dvc_exp=True by default in DVCLive so we could drop all the paragraphs about it.
- Dropping Track large artifacts with DVC from here. We could say something like "use log_artifact to track with DVC" and redirect to a DVC page about data management.
- Dropping Run with DVC. We could say "If you have or want to use a DVC pipeline go here" and link to a DVC page about pipelines.
- Dropping Customize with DVC. It feels like it should be part of Running with DVC.

shcheklein · 2023-07-06T14:59:15Z

> The increased complexity worries me, although I don't have that many ideas on how to fight that.

Same. And I don't know a good solution for this yet either. It feels like we need to brainstorm the next iteration. What else can we do to make it simpler?

dberenbaum · 2023-07-10T19:10:53Z

> I would prefer to keep DVCLive docs about the happy path.

I'm not pretending to know the right balance of simplicity vs. complexity, which we are always struggling to get right, but my sense from recent feedback is that we have enough simple happy-path examples, and people struggle to understand how things work beyond that. To me, this page is the equivalent of the dvclive user guide, where I would expect an in-depth explanation of how things work. How does it hurt the happy path?

> making save_dvc_exp=True by default in DVCLive so we could drop all the paragraphs about it.

We can do this in the next release, but I think we should still mention here how it works; otherwise there's no way for people to understand what it does or the dangers of setting it to `False`.
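For readers unfamiliar with the flag under discussion, here is a minimal sketch of a training script that opts in to `save_dvc_exp=True`. It assumes DVCLive is installed; the `train` helper, the metric name, and the placeholder loss values are illustrative and not taken from the docs page being reviewed.

```python
def train(live, epochs=3):
    """Log one metric per epoch through a Live-like object."""
    for epoch in range(epochs):
        # Placeholder value standing in for a real training loss.
        live.log_metric("train/loss", 1.0 / (epoch + 1))
        live.next_step()


def main():
    # Imported here so train() can be exercised without DVCLive installed.
    from dvclive import Live

    # save_dvc_exp=True asks DVCLive to save the run as a DVC experiment
    # when the script is run directly with Python rather than through DVC.
    with Live(save_dvc_exp=True) as live:
        train(live)
```

Calling `main()` from a plain `python train.py` run is the case the flag targets; per the tip quoted earlier in this thread, the flag is ignored when running with DVC.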

> Dropping Track large artifacts with DVC from here. We could say something like "use log_artifact to track with DVC" and redirect to a DVC page about data management.

> Dropping Run with DVC. We could say "If you have or want to use a DVC pipeline go here" and link to a DVC page about pipelines.

This already links to those pages, but I think it's helpful to discuss how it specifically applies to the dvclive scenario.

> Dropping Customize with DVC. It feels like it should be part of Running with DVC.

What about customizing plots? It doesn't feel to me like it belongs in Running with DVC.

dberenbaum · 2023-07-13T16:53:56Z

Discussed a couple concerns with @daavoo:

- How much of this is about pipelines? Is it enough to better explain how to use dvclive with pipelines?
- Can we put this info anywhere else to avoid developing a dvclive-specific guide?

Let me know if I missed anything. I'll think on these and try to do another draft.

github-actions · 2023-07-21T18:49:28Z

Link Check Report

There were no links to check!

dberenbaum · 2023-07-21T19:00:00Z

I took another pass at this and here's what I have:

- Added a separate h2 for Run with DVC to discuss transitioning to pipelines in more depth. This section highlights the awkwardness of the current state, but I'd rather be explicit for now while we think of ways to make it smoother.
- Under the existing h2 for Track the results, I added a short h3 for Customize with DVC and made a few minor updates but tried not to expand it much.

I'm also open to moving all the info into /docs/user-guide/experiment-management somewhere, but I have no strong opinion except that it probably doesn't belong in this PR.

@shcheklein @daavoo PTAL when you have a chance 🙏

dberenbaum · 2023-07-21T19:05:40Z

Also note that this would help with iterative/dvclive#631. We could catch cases where users call Live.log_artifact() inside dvc exp run but don't track the output in their pipeline and refer them back to this page.

dberenbaum · 2023-07-21T19:58:17Z

Seeing how much space we spend warning about not writing to dvclive/dvc.yaml, I'm very open to writing to the root dvc.yaml instead.
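The guidance discussed here (keep customizations out of the DVCLive directory) could be checked mechanically. A minimal sketch, assuming the default `dvclive` output directory; `is_inside_dvclive_dir` is a hypothetical helper for illustration and is not part of DVCLive or DVC:

```python
from pathlib import Path


def is_inside_dvclive_dir(dvcyaml_path, live_dir="dvclive"):
    """Return True if dvcyaml_path is located under the DVCLive directory."""
    target = Path(dvcyaml_path).resolve()
    root = Path(live_dir).resolve()
    # Compare whole path components so that "dvclive2/" is not mistaken
    # for a subdirectory of "dvclive/".
    return root in target.parents
```

A check like this returns `True` for `dvclive/dvc.yaml` and `False` for a `dvc.yaml` at the base of the repository, which is the location the page recommends for customizations.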

dberenbaum · 2023-08-03T22:02:31Z

@shcheklein @daavoo Any thoughts here? Do you feel it's better to close it?

shcheklein · 2023-08-04T01:50:13Z

content/docs/dvclive/how-it-works.md

-same path and overwrite the results each time. Include
+### Git integration
+
+Unlike other experiment trackers, DVCLive relies on Git to track the [directory]


My 2c: I think track results can start with more basic stuff, something that more people can relate to and understand faster.

1. That we can track them in VS Code and Studio
2. Maybe ways to compare experiments, or just experiments, or tracking experiments. That's where we can go into Git concepts to a certain degree, large files, etc. (even though I still think we need

The biggest issue with the explanation is that people don't expect it and most likely can't even understand why we put it here until they hit some issues.

Maybe another idea: "DVCLive vs other trackers: important workflow details".

Renamed from "Track the results" to "Git and DVC integration" and introduced it by explaining that this differentiates it from other experiment trackers.

shcheklein · 2023-08-04T01:52:26Z

content/docs/dvclive/how-it-works.md


 Using `Live.log_image()` to log multiple images may also grow too large to track
 with Git, in which case you can use
 [`Live(cache_images=True)`](/doc/dvclive/live#parameters) to cache them.

-### Run with DVC
+### Customize with DVC


That's probably also a bit too much? Even if we keep it, should it be part of Run with DVC?

Moved to part of Run with DVC and consolidated slightly.

shcheklein · 2023-08-04T01:53:30Z

content/docs/dvclive/how-it-works.md

+experiment run. Instead, write customizations to a new `dvc.yaml` file at the
+base of your repository or elsewhere outside the DVCLive directory.
+
+## Run with DVC

 Experimenting in Python interactively (like in notebooks) is great for


are there any other benefits?

There are more benefits listed later in the paragraph.

Yep, that's fine. It's just a bit abstract to me (as an end user). I mean, the "more structured way to run reproducible experiments" part and parallelized hyperparameter search jump right into the advanced case. Again, I'm paying a lot of attention to this here since I expect the readers of this won't be DVC users, and not necessarily even advanced Git users. There should be a story using their language / terminology as much as possible. Sorry, Dave, for all these iterations. No intent to block it. I'm fine to merge it any time since it's an improvement already.

Changed the examples here from parallelized hyperparameter search to multi-step pipeline or queueing multiple experiments.

daavoo · 2023-08-04T07:39:05Z

> @shcheklein @daavoo Any thoughts here? Do you feel it's better to close it?

I think the added information is valuable, despite the concerns about formatting/location.
Better to have it (merge, iterate on follow-ups) than not.

dberenbaum · 2023-08-05T12:25:03Z

@shcheklein Did one more round of iterations. Let me know if you want to take a look.

shcheklein · 2023-08-05T16:23:23Z

content/docs/dvclive/how-it-works.md

-DVCLive expects each run to be tracked by Git, so it will save each run to the
-same path and overwrite the results each time. Include
+DVCLive differs from some other experiment trackers by relying on Git and DVC
+for tracking instead of a central database. This provides a closer connection to


Quick thought: I guess it's somewhat similar to TensorBoard, btw (no Git, but also no central database).

dvclive: update how to track results

b61eca3

dberenbaum mentioned this pull request Jul 5, 2023

dvc.yaml ignored if entire dvclive folder is tracked as a stage output iterative/dvclive#456

Closed

shcheklein deployed to dvc-org-dvclive-clarifi-pvevvl July 5, 2023 18:15

shcheklein reviewed Jul 5, 2023


daavoo reviewed Jul 6, 2023


dberenbaum added 2 commits July 21, 2023 14:32

move dvclive run with dvc to new section

7a24fb3

merge

1589d59

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl July 21, 2023 18:42 Failure

minor consolidation

5b048b8

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl July 21, 2023 18:54 Failure

fix typo

5d329ea

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl July 21, 2023 19:36 Failure

add warning

e431055

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl July 21, 2023 19:52 Failure

shcheklein reviewed Aug 4, 2023


daavoo approved these changes Aug 4, 2023


dvclive: frame git/dvc integration

977c490

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl August 5, 2023 12:19 Failure

fix stage examples

fc31ee9

shcheklein had a problem deploying to dvc-org-dvclive-clarifi-pvevvl August 5, 2023 12:23 Failure

dberenbaum requested a review from shcheklein August 5, 2023 12:24

shcheklein reviewed Aug 5, 2023


shcheklein approved these changes Aug 5, 2023


change example of dvc exp run benefits

effbfb8

shcheklein requested a deployment to dvc-org-dvclive-clarifi-pvevvl August 8, 2023 14:35 Abandoned

fix merge conflicts

f57b700

shcheklein temporarily deployed to dvc-org-dvclive-clarifi-pvevvl August 8, 2023 14:37 Inactive

dberenbaum merged commit 7752926 into main Aug 8, 2023
2 checks passed

dberenbaum deleted the dvclive-clarifications-2 branch August 8, 2023 14:37
