Supplementary material for the EMNLP 2024 paper "Conditional and Modal Reasoning in Large Language Models" by Wesley H. Holliday, Matthew Mandelkern, and Cedegao E. Zhang

llm-logic

Supplementary Material for "Conditional and Modal Reasoning in Large Language Models"

Paper by Wesley H. Holliday (wesholliday@berkeley.edu), Matthew Mandelkern (mandelkern@nyu.edu), and Cedegao E. Zhang (cedzhang@mit.edu) available at https://arxiv.org/abs/2401.17169, forthcoming in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (https://2024.emnlp.org).

Contents

llm-logic.ipynb: The Jupyter notebook containing all the code used to run the experiments and generate the graphs presented in the paper.
prompts/: Includes all of the inference pattern questions used in the experiments.
data/: Stores the LLM responses to the inference pattern questions.
external_data/: Stores the GSM8K and MMLU scores for the tested models.
graphs/ and results/: These directories are generated upon running the "llm-logic" notebook and contain the graphs and results of experiments, respectively.
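
Before opening the notebook, it can help to confirm that a local clone has the layout described above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` and the split into `REQUIRED` and `GENERATED` are assumptions made for illustration, and only the file and directory names come from the list above.

```python
from pathlib import Path

# Entries named in the Contents list. graphs/ and results/ are described as
# generated by the notebook, so this sketch treats them as optional.
REQUIRED = ["llm-logic.ipynb", "prompts", "data", "external_data"]
GENERATED = ["graphs", "results"]


def missing_entries(repo_root, names=REQUIRED):
    """Return the names from `names` that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the root of a local clone.
    print("missing required entries:", missing_entries("."))
    print("not yet generated:", missing_entries(".", GENERATED))
```

An empty first list means the inputs the notebook is described as using are in place. Entries in the second list should appear once the notebook has been run.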

Usage

To replicate the experiments or explore the results, open and run the "llm-logic.ipynb" Jupyter notebook.
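
As an alternative to running the cells interactively, the notebook could be executed from a script with Jupyter's `nbconvert` command. The sketch below assumes Jupyter and nbconvert are installed and on the PATH. The function names and the output filename `llm-logic.executed.ipynb` are hypothetical. This README does not say whether running every cell re-queries the models, so check the notebook before executing it unattended.

```python
import subprocess


def nbconvert_command(notebook="llm-logic.ipynb",
                      output="llm-logic.executed.ipynb"):
    """Build the command that executes a notebook and saves an executed copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",   # keep the result in notebook format
        "--execute", notebook,
        "--output", output,   # hypothetical name; leaves the original untouched
    ]


def run_notebook(notebook="llm-logic.ipynb"):
    """Run the notebook top to bottom; raises CalledProcessError on failure."""
    subprocess.run(nbconvert_command(notebook), check=True)
```

Calling `run_notebook()` from the repository root writes the executed copy next to the original, which keeps the committed notebook unchanged.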

Contact

For any queries related to the paper or the supplementary material, please write to wesholliday@berkeley.edu.
