-
Notifications
You must be signed in to change notification settings - Fork 0
YizeWu00/APPS
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
APPS This is APPS, a code generation dataset for CodeLLM training and evaluation. 1.Run apps_create_split.py to get a data split. python apps_create_split.py \ -r /path/to/APPS You will have "train.jsonl" and "test.jsonl" in the APPS root directory. They are data splits. "test.jsonl" will be used in the evaluation process. In apps_create_split.py line 19, you can CUSTOMIZE data split as ("split_name",number_of_tasks_in_this_split), e.g. ("validate", 1000), and there will be a "{split_name}.jsonl" file in the APPS root. Of course, you can otherwise create new train.jsonl or test.jsonl file(s) directly. 2.Run example_generation.py to get an example of code results. python example_generation.py \ -t /path/to/test.jsonl \ -r /path/to/APPS \ --target /path/to/code/result.jsonl \ -k number of generations per task The target file (results of codes) can be anywhere, and can be any name. The default value of k is 1. If you wanna generate more than 1 codes per task, use -k. As in example, generation result SHOULD BE [{task_id:"test/0000", completion:"def code():..."}, ...] 3.Run test_one_solution.py to evaluate the code results. python test_one_solution.py \ -t /path/to/test.jsonl\ -r /path/to/APPS\ -p \ # if you want to print result --source /path/to/code/result.jsonl \ --target /path/to/evaluation/result.jsonl \ --tmp-dir /path/to/tmp/dir \ --delete \ # if you want to delete tmp dir after evaluation -l /path/to/log \ The target file (results of evaluation) can be anywhere. If -p, the result will be printed. In test_one_solution.py line 138, you can change k_list. The tmp dir can be anywhere. 4.Run print_results.py to print result of evaluations python print_results.py \ --source /path/to/evaluation/result.jsonl Results include: pass@k (k_list can be modified in file) unittest pass rate and error rate. NOTE: 1. train data is different from test data. Be careful when splitting train data into test set. ENV: ./conda_APPS.yml
About
APPS dataset for text-to-code generation
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published