Return evaluation results to callers #71

Merged (2 commits) on Oct 26, 2023

@tleyden commented Oct 25, 2023

For callers to use the evaluation results, for example to display them in a UI, the evaluation functions need to return those results to the caller.

This PR:

  1. Defines a container class for the eval results
  2. Decouples the calculation and printing of eval results
  3. Returns eval results as a container class instance

The container class uses Pydantic in case we later need validation or any other Pydantic features. It requires the same Pydantic version as the one used in the Arcee platform.
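
To illustrate the three points above, here is a minimal sketch of the pattern, not the code from this PR. The class name `RetrieverEvalResult` and the functions `evaluate_retriever` and `log_retriever_results` are hypothetical; the field names are taken from the log line in the Testing output below. The metric definitions are simplifying assumptions (one expected passage per query) and may differ from what DALM actually computes.

```python
import logging
from typing import List

from pydantic import BaseModel

logger = logging.getLogger(__name__)


class RetrieverEvalResult(BaseModel):
    """Hypothetical container; field names mirror the logged result line."""

    total_examples: int
    recall: float
    precision: float
    hit_rate: float


def evaluate_retriever(
    retrieved: List[List[str]], expected: List[str]
) -> RetrieverEvalResult:
    """Compute metrics only; no printing happens here.

    Assumption: each query has exactly one expected passage, so a query's
    recall equals its hit (0 or 1) and its precision is hit / number retrieved.
    """
    total = len(expected)
    if total == 0:
        return RetrieverEvalResult(
            total_examples=0, recall=0.0, precision=0.0, hit_rate=0.0
        )

    hits = 0
    precision_sum = 0.0
    for docs, gold in zip(retrieved, expected):
        if gold in docs:
            hits += 1
            precision_sum += 1 / len(docs)

    return RetrieverEvalResult(
        total_examples=total,
        recall=hits / total,
        precision=precision_sum / total,
        hit_rate=hits / total,
    )


def log_retriever_results(result: RetrieverEvalResult) -> None:
    """Printing lives in its own function, separate from the calculation."""
    logger.info("Retriever results:")
    logger.info("Recall: %s", result.recall)
    logger.info("Precision: %s", result.precision)
    logger.info("Hit Rate: %s", result.hit_rate)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Toy data: the first query finds its passage, the second does not.
    result = evaluate_retriever([["a", "b"], ["c", "d"]], ["a", "x"])
    log_retriever_results(result)
    logger.info("Retriever evaluation results: %s", result)
```

Because the evaluation function returns the container instead of only logging, a caller such as a UI can read `result.recall` or `result.hit_rate` directly and decide for itself whether to log anything.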

Testing

Manually tested. Here is the output:

10/25/2023 10:53:29 - INFO - dalm.eval.eval_retriever_only - Construct passage index
10/25/2023 10:53:29 - INFO - dalm.eval.eval_retriever_only - Evaluation start
10/25/2023 10:53:30 - INFO - dalm.eval.utils - Retriever results:
10/25/2023 10:53:30 - INFO - dalm.eval.utils - Recall: 0.883495145631068
10/25/2023 10:53:30 - INFO - dalm.eval.utils - Precision: 0.08834951456310675
10/25/2023 10:53:30 - INFO - dalm.eval.utils - Hit Rate: 0.883495145631068
10/25/2023 10:53:30 - INFO - dalm.eval.utils - *************
10/25/2023 10:53:30 - INFO - root - Retriever evaluation results: total_examples=206 recall=0.883495145631068 precision=0.08834951456310675 hit_rate=0.883495145631068

@tleyden merged commit e6c3d29 into arcee-ai:main on Oct 26, 2023
1 check passed