-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.txt
executable file
·46 lines (41 loc) · 2.12 KB
/
README.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
APPS
This is APPS, a code generation dataset for CodeLLM training and evaluation.
1.Run apps_create_split.py to get a data split.
python apps_create_split.py \
-r /path/to/APPS
You will have "train.jsonl" and "test.jsonl" in the APPS root directory.
They are data splits. "test.jsonl" will be used in the evaluation process.
In apps_create_split.py line 19, you can CUSTOMIZE data split as ("split_name",number_of_tasks_in_this_split), e.g. ("validate", 1000),
and there will be a "{split_name}.jsonl" file in the APPS root.
Of course, you can otherwise create new train.jsonl or test.jsonl file(s) directly.
2.Run example_generation.py to get an example of code results.
python example_generation.py \
-t /path/to/test.jsonl \
-r /path/to/APPS \
--target /path/to/code/result.jsonl \
-k number of generations per task
The target file (results of codes) can be anywhere, and can be any name.
The default value of k is 1. If you wanna generate more than 1 codes per task, use -k.
As in example, generation result SHOULD BE [{task_id:"test/0000", completion:"def code():..."}, ...]
3.Run test_one_solution.py to evaluate the code results.
python test_one_solution.py \
-t /path/to/test.jsonl\
-r /path/to/APPS\
-p \ # if you want to print result
--source /path/to/code/result.jsonl \
--target /path/to/evaluation/result.jsonl \
--tmp-dir /path/to/tmp/dir \
--delete \ # if you want to delete tmp dir after evaluation
-l /path/to/log \
The target file (results of evaluation) can be anywhere.
If -p, the result will be printed. In test_one_solution.py line 138, you can change k_list.
The tmp dir can be anywhere.
4.Run print_results.py to print result of evaluations
python print_results.py \
--source /path/to/evaluation/result.jsonl
Results include: pass@k (k_list can be modified in file)
unittest pass rate and error rate.
NOTE:
1. train data is different from test data. Be careful when splitting train data into test set.
ENV:
./conda_APPS.yml