[wip] Update

for-ai · Jul 13, 2024 · 0d9f682
1 parent 92c644a
commit 0d9f682

Showing 2 changed files with 23 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -37,8 +37,28 @@ export GEMINI_API_KEY=<your gemini token>
 
 You can find all runnable experiments in the `scripts` directory.
 Their filename should explicitly tell you their purpose. 
-For example, `scripts/run_rm_evals.sh` runs the RewardBench inference pipeline on a select number of models given a dataset:
+
+### Getting rewards from a Reward Model (RM) on a HuggingFace dataset
+
+Here, we use the `rewardbench` command-line interface and pass a HuggingFace dataset.
+For example, if we want to get the reward score of the UltraRM-13b reward model on a preference dataset, we run:
 
 ```sh
-./scripts/run_rm_evals.sh
+rewardbench \
+    --model openbmb/UltraRM-13b \
+    --chat_template openbmb \
+    --dataset $DATASET \
+    --split $SPLIT \
+    --output_dir $OUTDIR \
+    --batch_size 8 \
+    --trust_remote_code \
+    --force_truncation \
+    --save_all 
 ```
+
+The evaluation parameters can be found in the [allenai/reward-bench](https://github.com/allenai/reward-bench/blob/main/scripts/configs/eval_configs.yaml) repository.
+This runs the reward model on the (prompt, chosen, rejected) triples and gives us the reward score for each instance.
+The results are saved into a JSON file inside the `$OUTDIR` directory.
+Finally, you can find some experiments in the `scripts/run_rm_evals.sh` script.
+
+### 
diff --git a/scripts/run_generative.py b/scripts/run_generative.py
@@ -21,6 +21,7 @@
 # Examples:
 # python scripts/run_generative.py --dataset_name <DATASET_NAME> --model gpt-3.5-turbo
 # python scripts/run_generative.py --dataset_name <DATASET_NAME> --model=claude-3-haiku-20240307
+# python scripts/run_generative.py --dataset_name <DATASET_NAME> --model=CohereForAI/c4ai-command-r-v01 --num_gpus 2 --force_local
 
 # note: for non-API models, this script uses vllm
 # pip install vllm
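
The README hunk above says the `rewardbench` command yields a reward score for each (prompt, chosen, rejected) instance. A minimal sketch of one way such scores might be summarized is shown here. It assumes the chosen and rejected scores have already been loaded into two parallel lists; the layout of the saved JSON file is not described in this commit, so the loading step is left out, and the function name `pairwise_accuracy` and the example scores are hypothetical.

```python
from typing import Sequence


def pairwise_accuracy(chosen: Sequence[float], rejected: Sequence[float]) -> float:
    """Fraction of pairs where the chosen response scores strictly higher
    than the rejected response. (Hypothetical helper, not part of rewardbench.)"""
    if len(chosen) != len(rejected):
        raise ValueError("chosen and rejected must have the same length")
    if not chosen:
        raise ValueError("need at least one pair")
    # Count the pairs in which the reward model prefers the chosen response.
    wins = sum(c > r for c, r in zip(chosen, rejected))
    return wins / len(chosen)


# Made-up scores for three (prompt, chosen, rejected) triples.
chosen_scores = [1.5, 0.25, -0.5]
rejected_scores = [0.5, 0.75, -1.0]
accuracy = pairwise_accuracy(chosen_scores, rejected_scores)
print(f"pairwise accuracy: {accuracy:.3f}")
```

Ties count as losses in this sketch, since a reward model that assigns equal scores has not expressed the preference recorded in the dataset; a different tie rule may suit other analyses.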