
Introduce prepare for eval, fix evaluation bug #789

Merged: 9 commits into mlcommons:dev on Dec 5, 2024

Conversation

@Niccolo-Ajroldi (Contributor) commented on Sep 15, 2024

Description

This pull request introduces a prepare_for_eval function and updates the code to support it.

The implementation follows the blueprint proposed by @fsschneider in #719 (comment) and fixes the bug where a submission that runs out of max_runtime still receives a free evaluation (again, see #719 (comment)).

Function signature

The arguments of prepare_for_eval are the same as those of update_params, except that batch is omitted. I believe prepare_for_eval should indeed be agnostic to the last batch used during training. The return type is the same as that of update_params.

from typing import List, Tuple

from algorithmic_efficiency import spec  # framework types (package name at the time of this PR)

def prepare_for_eval(workload: spec.Workload,
                     current_param_container: spec.ParameterContainer,
                     current_params_types: spec.ParameterTypeTree,
                     model_state: spec.ModelAuxiliaryState,
                     hyperparameters: spec.Hyperparameters,
                     loss_type: spec.LossType,
                     optimizer_state: spec.OptimizerState,
                     eval_results: List[Tuple[int, float]],
                     global_step: int,
                     rng: spec.RandomState) -> spec.UpdateReturn:
  # Default/no-op implementation: return the state unchanged.
  return (optimizer_state, current_param_container, model_state)
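
As a hypothetical illustration (not part of this PR), a submission that maintains its own parameter average could use prepare_for_eval to swap in the averaged weights before evaluation. In the sketch below, 'ema_params' is an assumed key that such a submission would itself store in a dict-based optimizer_state:

def prepare_for_eval(workload: spec.Workload,
                     current_param_container: spec.ParameterContainer,
                     current_params_types: spec.ParameterTypeTree,
                     model_state: spec.ModelAuxiliaryState,
                     hyperparameters: spec.Hyperparameters,
                     loss_type: spec.LossType,
                     optimizer_state: spec.OptimizerState,
                     eval_results: List[Tuple[int, float]],
                     global_step: int,
                     rng: spec.RandomState) -> spec.UpdateReturn:
  del workload, current_params_types, hyperparameters, loss_type
  del eval_results, global_step, rng
  # Hypothetical: swap in the submission's own moving average for eval;
  # fall back to the current parameters if no average is stored.
  eval_params = optimizer_state.get('ema_params', current_param_container)
  return (optimizer_state, eval_params, model_state)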

List of changes

In submission_runner.py:

  • add timed call to prepare_for_eval (the new ordering is sketched after this list)
  • add profiler
  • move del batch before prepare_for_eval (instead of before evaluation)
  • update accumulated_submission_time after prepare_for_eval
  • compute is_time_remaining after prepare_for_eval
  • proceed to eval iff is_time_remaining
  • add prep_eval_rng
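
For orientation, the new ordering roughly amounts to the following simplified sketch (identifiers are illustrative, not verbatim from submission_runner.py):

import time

def prepare_and_maybe_eval(prepare_for_eval_fn, eval_fn,
                           accumulated_submission_time, max_runtime_sec,
                           eval_results, prep_args):
  # Time the prepare_for_eval call and charge it to the submission budget.
  start = time.time()
  optimizer_state, params, model_state = prepare_for_eval_fn(**prep_args)
  accumulated_submission_time += time.time() - start
  # Evaluate only if the budget is not exhausted, so a submission that
  # exceeds max_runtime no longer gets a free evaluation.
  is_time_remaining = accumulated_submission_time < max_runtime_sec
  if is_time_remaining:
    eval_results.append(eval_fn(params, model_state))
  return optimizer_state, params, model_state, accumulated_submission_time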

Minor changes:

  • add PrepareForEvalFn to spec (see the sketch after this list)
  • add prepare_for_eval to submission template
  • add prepare_for_eval to all pytorch and jax submissions
  • update the docs
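
The new spec entry boils down to a callable type. A hypothetical sketch in the style of the existing *Fn aliases (the exact declaration in spec.py may differ):

from typing import Callable, List, Tuple

# Assuming the surrounding spec types (Workload, ParameterContainer, ...)
# are in scope, as they are inside spec.py.
PrepareForEvalFn = Callable[[
    Workload,
    ParameterContainer,
    ParameterTypeTree,
    ModelAuxiliaryState,
    Hyperparameters,
    LossType,
    OptimizerState,
    List[Tuple[int, float]],
    int,
    RandomState,
], UpdateReturn]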

Fixes #719 and #758.

@Niccolo-Ajroldi Niccolo-Ajroldi requested a review from a team as a code owner September 15, 2024 11:32

github-actions bot commented Sep 15, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@priyakasimbeg priyakasimbeg changed the base branch from main to dev October 17, 2024 01:22
(An inline review comment on submission_runner.py was marked outdated and resolved.)
@priyakasimbeg (Contributor)

@fsschneider can we get your LGTM on this?

@fsschneider (Contributor) left a comment


Sorry for the very late review.
It looks good to me! Thanks for all the work Niccolò!

Quick question: the changes to the reference_algorithms and prize_qualification algorithms aren't strictly necessary, right? Since we now default to a no-op if prepare_for_eval is not defined.
I fully agree that making it backward-compatible in that way is really helpful.

Again thanks for this crucial PR!

@Niccolo-Ajroldi (Contributor, Author)

Thank you for reviewing!

Correct, the updates to reference_algorithms and prize_qualification are not necessary thanks to backward-compatibility.

I figured it would be nice to have all the implemented submissions reflect the latest state of the code, but I can revert those changes if needed! (:
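
For reference, a minimal sketch of what the no-op fallback could look like on the runner side (hypothetical names; the actual lookup in the runner may differ):

import types

# Stand-in for the dynamically loaded submission module.
submission_module = types.ModuleType('submission')

def _default_prepare_for_eval(workload, current_param_container,
                              current_params_types, model_state,
                              hyperparameters, loss_type, optimizer_state,
                              eval_results, global_step, rng):
  # Backward-compatible no-op: return the state unchanged.
  return (optimizer_state, current_param_container, model_state)

# Fall back to the no-op if the submission does not define prepare_for_eval.
prepare_for_eval = getattr(submission_module, 'prepare_for_eval',
                           _default_prepare_for_eval)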

@fsschneider (Contributor)

I figured it would be nice to have all the implemented submissions reflect the latest state of the code, but I can revert those changes if needed! (:

I agree with you. No need to revert, just wanted to check if I understand it correctly :)

@priyakasimbeg priyakasimbeg merged commit ea66793 into mlcommons:dev Dec 5, 2024
16 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 5, 2024
Successfully merging this pull request may close these issues: Inform submission about evaluation step.