Evaluation functions #94
Conversation
This would work better as a method in Experiment; the user would run it outside of experiment.process().
The restriction on the passed function is that it must take in a prompt dictionary and it must return a prompt dictionary.
It should support a list/tuple of functions. Don't support extra arguments; instead, encourage the user to use the prompt dictionary to parameterise.
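A minimal sketch of that contract (the function names, dictionary keys, and scoring logic below are hypothetical illustrations, not part of prompto's API):

```python
# Hypothetical evaluation functions obeying the contract above:
# prompt dictionary in, prompt dictionary out, no extra arguments.

def match_expected(prompt_dict: dict) -> dict:
    # Parameterise through keys already in the prompt dictionary
    # (the "expected_response" key here is hypothetical).
    prompt_dict["correct"] = (
        prompt_dict.get("response") == prompt_dict.get("expected_response")
    )
    return prompt_dict

def flag_empty(prompt_dict: dict) -> dict:
    prompt_dict["empty_response"] = not prompt_dict.get("response")
    return prompt_dict

# A list/tuple of such functions is then applied one after another:
prompt_dict = {"prompt": "What is 2+2?", "response": "4", "expected_response": "4"}
for func in (match_expected, flag_empty):
    prompt_dict = func(prompt_dict)
print(prompt_dict)  # now includes "correct": True and "empty_response": False
```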
This has ended up being a very bare-bones application of the idea, and I'm not sure what value it actually adds over just running an evaluation function on the completed responses dictionary saved to disk after the run.
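For comparison, the post-hoc alternative described here might look like the following sketch (the output path, the jsonl format, and the dictionary keys are assumptions for illustration, not prompto's actual layout):

```python
import json

# Hypothetical post-hoc evaluation: read completed responses back from disk
# and score each prompt dictionary with the same in/out contract as above.

def response_length(prompt_dict: dict) -> dict:
    # Prompt dictionary in, prompt dictionary out.
    prompt_dict["response_length"] = len(prompt_dict.get("response", ""))
    return prompt_dict

# Assumed here: one JSON object per line in an output file.
with open("output/completed-experiment.jsonl") as f:
    prompt_dicts = [json.loads(line) for line in f]

prompt_dicts = [response_length(pd) for pd in prompt_dicts]
```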
thanks @EdwinB12 - looks great! this will definitely be useful in practice, since having to manually post-run an evaluation on responses is not ideal. in the future, this will be a CLI command like the judge one
will merge this and I will add documentation pages for it
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #94       +/-   ##
===========================================
+ Coverage   35.67%   52.41%   +16.74%
===========================================
  Files          38       38
  Lines        1962     1984       +22
===========================================
+ Hits          700     1040      +340
+ Misses       1262      944      -318

☔ View full report in Codecov by Sentry.
A very simple application of using an evaluation function in prompto.