Algorithmic-Q&A-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability

UCSC-VLAA/AQA-Bench

AQA-Bench

Official implementation of AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability in Algorithmic Environments

Acknowledgements

This work is partially supported by a gift from Open Philanthropy. We thank the Center for AI Safety, the Microsoft Accelerate Foundation Models Research Program, the OpenAI Researcher Access Program, and the Google Cloud Research Credits Program for supporting our computing needs.
