
Datasets and description of Arabic model evaluation task


AI-Initiative-KAUST/Arabic-eval

 
 


Arabic-eval

ALUE_main

This folder contains the ALUE datasets and the programs that convert them into a format suitable for training and testing. See the folder for more details.

translate_to_review

This program translates the GPTreview results into English or Chinese.

bert-baselines

This program lets us compute the metrics ourselves. See the folder for more details.

eval_human

This program lets native speakers compare the models' generations and judge which one is better.
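As an illustration of how such pairwise judgments could be summarized, here is a minimal sketch that turns a list of comparisons into per-model win rates. It is not taken from the eval_human code: the function name `win_rates`, the `(model_a, model_b, winner)` record layout, the `"tie"` label, and the half-point scoring for ties are all assumptions made for this example.

```python
from collections import Counter


def win_rates(judgments):
    """Summarize pairwise human judgments as per-model win rates.

    `judgments` is a list of (model_a, model_b, winner) tuples, where
    `winner` is model_a, model_b, or the string "tie". This record layout
    is hypothetical and not the format used by eval_human.
    """
    wins = Counter()
    games = Counter()
    for model_a, model_b, winner in judgments:
        if winner not in (model_a, model_b, "tie"):
            raise ValueError(f"unexpected winner label: {winner!r}")
        games[model_a] += 1
        games[model_b] += 1
        if winner == "tie":
            # Assumption: a tie counts as half a win for each side.
            wins[model_a] += 0.5
            wins[model_b] += 0.5
        else:
            wins[winner] += 1
    return {model: wins[model] / games[model] for model in games}


if __name__ == "__main__":
    # Placeholder model names and judgments, for illustration only.
    example = [
        ("model_x", "model_y", "model_x"),
        ("model_x", "model_y", "tie"),
        ("model_x", "model_z", "model_z"),
    ]
    for model, rate in sorted(win_rates(example).items()):
        print(f"{model}: {rate:.2f}")
```

A win rate is only one possible summary. With few annotators or unevenly sampled pairs, raw counts per model pair may be more informative.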

data

Translated Vicuna dataset.

gen_data

Vicuna answers generated by three models.

gpt, GPT4-...

Alternative GPT APIs by Freedom Intelligence.

LLM

Evaluation pipeline by Freedom Intelligence.

mmlu-chatgpt

This folder contains the eval-harness library.

vicuna_eval_results

Another results folder.
