In scoring/score_submissions.py, the function get_summary_df is responsible for gathering evaluation statistics from submission logs into a DataFrame. When scoring a submission, this function is invoked on every workload, and the resulting concatenated DataFrames are saved as CSV files (<submission>_summary.csv).
The current implementation (scoring/score_submissions.py, lines 91 to 94 at commit a23b5ea) computes the time needed to reach the validation target as follows:
summary_df['time to target on val (s)'] = summary_df.apply(
    lambda x: x['time to best eval on val (s)']
    if x['val target reached'] else np.inf,
    axis=1)
This results in a time to target on val (s) equal to time to best eval on val (s) if a submission reaches the target. However, the time to reach the validation target is usually lower than the time to the best eval score, since a run typically keeps improving after it first crosses the target.
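To make the difference concrete, here is a minimal sketch of the two quantities on a made-up run. The helper time_to_target, its arguments, and the numbers are hypothetical and only illustrate the idea; this is not the code from #792 or from performance_profile.py.

```python
import numpy as np


def time_to_target(eval_times, eval_scores, target, higher_is_better=True):
    """Return the time of the first evaluation that reaches `target`, or np.inf.

    Hypothetical helper for illustration only; not the repository's API.
    """
    times = np.asarray(eval_times, dtype=float)
    scores = np.asarray(eval_scores, dtype=float)
    reached = scores >= target if higher_is_better else scores <= target
    if not reached.any():
        return np.inf
    # np.argmax on a boolean array returns the index of the first True.
    return times[np.argmax(reached)]


# Made-up run: validation accuracy measured at four evaluation points.
times = [100.0, 200.0, 300.0, 400.0]
scores = [0.60, 0.72, 0.75, 0.78]
target = 0.70

first_hit = time_to_target(times, scores, target)  # first eval at or above target
best_eval = times[int(np.argmax(scores))]          # eval with the best score

print(first_hit, best_eval)
```

In this made-up run the target is first reached at the second evaluation, while the best score only appears at the last one, so reporting the time to best eval overstates the time to target.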
Performance profiles are not affected
Fortunately, this bug affects neither the performance profiles nor the final scores. Although the concatenated DataFrames are used to compute the performance profiles, that computation ignores the existing time to target on val (s) column and instead correctly recomputes the time to the eval target.
Source or Possible Fix
I have implemented a fix in #792. The final scores and the performance profiles are unaffected after the fix. However, <submission>_summary.csv changes drastically. Here is an example on two workloads for the prize qualification baseline algorithm (first study):
Thank you for identifying this issue and submitting a fix with detailed analysis!
Confirming that this bug does not affect final scoring and performance profiles. The summary_df is only used for logging purposes and does not feed into the scoring pipeline.
Just verified that in the scoring pipeline the time to target is computed correctly here: https://github.com/mlcommons/algorithmic-efficiency/blob/main/scoring/performance_profile.py#L155
priyakasimbeg changed the title from "Scoring bug 🐛 - incorrect computation of time to target on val (s) in get_summary_df" to "Scoring logging bug 🐛 - incorrect computation of time to target on val (s) in get_summary_df" on Oct 11, 2024