Commands

HumanEval

HEval Generation

poetry run python zero_shot_replication/runner.py --model=... --pset=human-eval

HEval Evaluation

poetry run evalplus.evaluate --dataset humaneval --samples=... --parallel 4 --min-time-limit 0.5 --gt-time-limit-factor 5
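The two timing flags above are easier to reason about with numbers. The sketch below assumes that EvalPlus sets each test's time limit to the larger of `--min-time-limit` and `--gt-time-limit-factor` times the ground-truth solution's runtime. That formula is an assumption about EvalPlus internals, not something stated in this file, and `time_limit` is a hypothetical helper rather than part of any tool.

```shell
# Hypothetical helper (not part of EvalPlus): estimate the per-test time limit.
# ASSUMPTION: limit = max(min_time_limit, gt_time_limit_factor * ground_truth_runtime)
# Usage: time_limit <ground_truth_seconds> [min_limit] [factor]
# Defaults mirror the flags used above: --min-time-limit 0.5, --gt-time-limit-factor 5
time_limit() {
  awk -v t="$1" -v min="${2:-0.5}" -v f="${3:-5}" \
    'BEGIN { l = t * f; if (l < min) l = min; print l }'
}
```

Under that assumption, a test whose reference solution takes 0.02 s would get the 0.5 s floor, while one that takes 0.3 s would get 1.5 s.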

LeetCode

LC Generation

poetry run python zero_shot_replication/runner.py --model=... --pset=leetcode

LC Evaluation

poetry run python zero_shot_replication/evals/run_leetcode_eval.py --model=...

GSM8K

GSM8K Generation

poetry run python zero_shot_replication/runner.py --model=... --pset=gsm8k

GSM8K Evaluation

# run_math_eval can service both MATH and GSM8K
poetry run python zero_shot_replication/evals/run_gsm8k_eval.py --model=...

MATH

MATH Generation

poetry run python zero_shot_replication/runner.py --provider openai --pset math --model ...

MATH Evaluation

poetry run python zero_shot_replication/evals/run_math_eval.py --model=...
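
The generation commands in this file differ only in the `--pset` value, so they can be produced in a loop. The following is a minimal dry-run sketch: it prints the commands and does not execute them. `gen_cmd` and `print_all` are hypothetical helpers, not part of the repository. The sketch omits the `--provider` flag that appears in the MATH command, and the model name is whatever you pass in.

```shell
# Hypothetical helper: print the generation command for one model and problem set.
gen_cmd() {
  local model="$1" pset="$2"
  printf 'poetry run python zero_shot_replication/runner.py --model=%s --pset=%s\n' \
    "$model" "$pset"
}

# Problem-set names exactly as they appear in the commands above.
PSETS="human-eval leetcode gsm8k math"

# Hypothetical helper: print one generation command per problem set.
print_all() {
  local model="$1" pset
  for pset in $PSETS; do
    gen_cmd "$model" "$pset"
  done
}
```

Calling `print_all <model>` lists four commands, one per problem set. After checking them, the output could be piped to `sh` to run them in sequence.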