Evaluating ChatGPT’s Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness
-
Updated
Aug 17, 2024 - Python
Evaluating ChatGPT’s Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness
Official Implementation of ACL2024 paper "Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs"(https://arxiv.org/abs/2402.11199).
Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning"
The Dataset and Official Implementation for <Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations> @ ACL 2024
About The corresponding code from our paper " Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning" . Do not hesitate to open an issue if you run into any trouble!
[NeurIPS 2024] An advanced persona-driven role-playing system with global faithfulness quantification and optimization. In memory of the Koishi's Day of 2024.
On the evaluation of deep learning interpretability methods for medical images under the scope of faithfulness
[EMNLP 2023] A Causal View of Entity Bias in (Large) Language Models
IBM AI explainability
Add a description, image, and links to the faithfulness topic page so that developers can more easily learn about it.
To associate your repository with the faithfulness topic, visit your repo's landing page and select "manage topics."