
Enhance Benchmark Report Generation #2235

Open
pvk-developer opened this issue Sep 24, 2024 · 0 comments · May be fixed by #2248
Labels
feature request Request for a new feature

Problem Description

As an engineer, I need a clear and comprehensive overview of the newly supported data types and the methods involved, along with detailed feedback on any failures, when running the data types benchmark once #2206 is merged.

Expected behavior

The spreadsheet generated after running the benchmarks should meet the following criteria:

  1. Visual Marking of Changes:

    • Mark cells with new True values (indicating successful support for a data type) in green.
    • Mark cells with new False values (indicating failures or lack of support for a data type) in red (see the formatting sketch after this list).
  2. Summary Sheet with Key Metrics:

    • Include columns for:
      • Dtype: The data type.
      • Sdtype: The semantic data type.
      • 3.8: The % support for this Python version.
      • 3.9: The % support for this Python version.
      • 3.10: The % support for this Python version.
      • 3.11: The % support for this Python version.
      • 3.12: The % support for this Python version.
      • Total % Support: The overall percentage of supported methods for each dtype and sdtype combination, averaged across all Python versions.
  3. Percentage Calculation:

    • For each dtype and sdtype combination, compute the percentage of True values across all tested methods for each Python version, representing the support level.
    • Compute the total percentage of support as the average of the True values across all Python versions.
    • Conditional Summing for Edge Cases:
      • Implement logic to adjust the percentage calculation for cases where non-support is expected. For example, FixedCombinations is currently only supported when the sdtype is categorical or boolean, so expected non-support (numerical sdtypes in this case) should not negatively impact the percentage (see the summary sketch after this list).
  4. Order of the sheets:

    • Ideally, opening the spreadsheet should land on the Summary sheet, followed by the previously_unseen sheet and then the Python version sheets.
  5. Overall Goal:

    • Provide a clear, actionable view of data type support, summarized across all Python versions, while highlighting areas of improvement and exceptions.
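
A minimal sketch of the cell coloring from item 1, assuming the previous and current results for a given Python version are available as boolean DataFrames with one column per method, and that each results sheet has its header in row 1 and row labels in column 1; the function name, frame names, colors, and layout are illustrative assumptions rather than the final implementation:

```python
# Sketch: color cells whose boolean value changed since the previous benchmark run.
from openpyxl import load_workbook
from openpyxl.styles import PatternFill

GREEN = PatternFill(start_color='C6EFCE', end_color='C6EFCE', fill_type='solid')
RED = PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')


def mark_new_values(path, sheet_name, previous, current):
    """Color cells that flipped to True (green) or to False (red)."""
    workbook = load_workbook(path)
    worksheet = workbook[sheet_name]
    for row_idx, row_label in enumerate(current.index, start=2):     # row 1 holds the header
        for col_idx, method in enumerate(current.columns, start=2):  # column 1 holds the row labels
            new_value = bool(current.loc[row_label, method])
            old_value = (
                bool(previous.loc[row_label, method])
                if row_label in previous.index and method in previous.columns
                else None
            )
            if new_value != old_value:
                cell = worksheet.cell(row=row_idx, column=col_idx)
                cell.fill = GREEN if new_value else RED

    workbook.save(path)
```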
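
A sketch of the summary percentages from item 3, assuming each Python-version sheet is loaded into a boolean DataFrame indexed by (dtype, sdtype) with one column per method; EXPECTED_UNSUPPORTED is a hypothetical lookup used to skip combinations whose non-support is expected, such as FixedCombinations with non-categorical/boolean sdtypes:

```python
import pandas as pd

# Hypothetical registry: method name -> predicate saying whether non-support
# is expected for a given sdtype (and therefore should be ignored).
EXPECTED_UNSUPPORTED = {
    'FixedCombinations': lambda sdtype: sdtype not in ('categorical', 'boolean'),
}


def support_percentage(row, sdtype):
    """Percentage of True values, skipping methods whose non-support is expected."""
    applicable = [
        method for method in row.index
        if not EXPECTED_UNSUPPORTED.get(method, lambda _: False)(sdtype)
    ]
    if not applicable:
        return 100.0

    return 100.0 * row.loc[applicable].mean()


def build_summary(per_version_results):
    """One row per (dtype, sdtype), one % column per Python version, plus the total."""
    summary = pd.DataFrame({
        version: frame.apply(lambda row: support_percentage(row, row.name[1]), axis=1)
        for version, frame in per_version_results.items()
    })
    summary['Total % Support'] = summary.mean(axis=1)
    return summary.round(2)
```

Writing the Summary sheet first (for example with pandas.ExcelWriter) would also give the sheet order described in item 4.
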
@pvk-developer pvk-developer added the feature request Request for a new feature label Sep 24, 2024
@pvk-developer pvk-developer self-assigned this Sep 27, 2024
@pvk-developer pvk-developer linked a pull request Oct 2, 2024 that will close this issue