This folder contains the ALUE datasets and the programs that convert them into a format suitable for training and testing. See the folder for more details.
This program translates the GPTreview results into English or Chinese.
This program lets us compute the metrics ourselves. See the folder for more details.
This program lets native speakers compare the models' generations and judge which is better.
Translated Vicuna dataset
Vicuna answers from three models
Alternative GPT APIs by FreedomIntelligence
Evaluation pipeline by FreedomIntelligence
This folder contains the eval-harness library
Another results folder