Add option to pass in multiple template text files for LLM-as-judge eval #99

rchan26 · 2024-09-10T16:06:59Z

Fix for #98.

With this PR, you can pass multiple .txt template files to a new --templates argument as a comma-separated list, e.g.

prompto_run_experiment ... --templates temp1.txt,temp2.txt

or

prompto_create_judge_file ... --templates 'temp1.txt, temp2.txt'

Tests have been updated for this.
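
For readers wondering how both forms above (with and without a space after the comma) can resolve to the same list of files, here is a minimal sketch of comma-separated argument parsing. It is not the code in this PR: only the `--templates` flag name comes from the description above, and `parse_template_list` and the bare `argparse` parser are hypothetical names used for illustration.

```python
import argparse


def parse_template_list(raw: str) -> list[str]:
    """Split a comma-separated string into file names, trimming whitespace.

    Hypothetical helper for illustration; not taken from the prompto codebase.
    """
    return [name.strip() for name in raw.split(",") if name.strip()]


# Hypothetical parser: only the --templates flag name comes from this PR.
parser = argparse.ArgumentParser()
parser.add_argument("--templates", type=parse_template_list, default=[])

# The quoted form with a space after the comma gives the same result
# as the unquoted form without one.
args = parser.parse_args(["--templates", "temp1.txt, temp2.txt"])
print(args.templates)  # ['temp1.txt', 'temp2.txt']
```

Stripping each entry and dropping empty ones means a trailing comma or stray space does not produce a blank file name.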

To do:

Write up docs for the evaluation framework (for #84, "Improve evaluation functionalities")
Add a notebook example (for #84, "Improve evaluation functionalities")

review-notebook-app · 2024-09-11T16:42:39Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

codecov-commenter · 2024-09-11T17:38:09Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 63.86%. Comparing base (48d0440) to head (c3c9ee2).
Report is 16 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #99      +/-   ##
==========================================
+ Coverage   63.62%   63.86%   +0.23%     
==========================================
  Files          40       40              
  Lines        2128     2142      +14     
==========================================
+ Hits         1354     1368      +14     
  Misses        774      774


rchan26 added 3 commits September 10, 2024 14:55

rename judge location to judge folder

babd868

add option to pass in multiple template prompts

0b80209

start updating docs

54cecbd

rchan26 marked this pull request as draft September 10, 2024 16:07

update docstring for load_judge_folder

a802fae

This was linked to issues Sep 10, 2024

Improve evaluation functionalities #84

Closed

Support multiple prompt templates for LLM-as-judge evaluation #98

Closed

rchan26 added 5 commits September 11, 2024 15:51

add docs for evaluation

a416a2b

clean up judge settings example

29a50c3

add spellchecker to pre-commit

247e6fe

add judge example

f70ecc3

update judge argument ordering

9c9f077

rchan26 added 3 commits September 11, 2024 18:14

add text to notebook example

7ddbb1a

more details in tqdm bar for creating judge inputs

57e22d5

update docs page for judge examples

53e2cce

rchan26 marked this pull request as ready for review September 11, 2024 17:21

rchan26 added 3 commits September 11, 2024 18:28

small fixes for eval docs

dc07252

remove duplicate line for notebook link

bf3197b

fix typo

c3c9ee2

rchan26 merged commit 7d4b251 into main Sep 11, 2024
6 checks passed

rchan26 deleted the eval-docs branch September 11, 2024 17:38
