Semantically Diverse Language Generation for Uncertainty Estimation in Language Models

Lukas Aichberger¹, Kajetan Schweighofer¹, Mykyta Ielanskyi¹, Sepp Hochreiter¹,²

¹ ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
² NXAI GmbH, Linz, Austria

Key Insights

Provides a method to generate semantically diverse yet likely output sequences 🧠
Establishes a theoretical foundation for uncertainty measures in NLG 🧮
Outperforms existing uncertainty estimation methods in free-form question-answering tasks 📊

Abstract

Large language models (LLMs) can suffer from hallucinations when generating text. These hallucinations impede various applications in society and industry by making LLMs untrustworthy. Current LLMs generate text in an autoregressive fashion by predicting and appending text tokens. When an LLM is uncertain about the semantic meaning of the next tokens to generate, it is likely to start hallucinating. Thus, it has been suggested that hallucinations stem from predictive uncertainty. We introduce Semantically Diverse Language Generation (SDLG) to quantify predictive uncertainty in LLMs. SDLG steers the LLM to generate semantically diverse yet likely alternatives for an initially generated text. This approach provides a precise measure of aleatoric semantic uncertainty, detecting whether the initial text is likely to be hallucinated. Experiments on question-answering tasks demonstrate that SDLG consistently outperforms existing methods while being the most computationally efficient, setting a new standard for uncertainty estimation in LLMs.
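To make the idea of semantic uncertainty more concrete, the following is a minimal sketch of how an entropy over semantic clusters could be estimated from a set of generated sequences. It is not the implementation in sdlg.py. The function name `semantic_entropy` and its inputs are hypothetical, the cluster assignments are assumed to be given (in practice they would come from a semantic-equivalence check such as an entailment model), and the simple normalization used here may differ from the estimator derived in the paper.

```python
import math
from collections import defaultdict


def semantic_entropy(log_likelihoods, cluster_ids):
    """Estimate entropy over semantic clusters (illustrative sketch only).

    log_likelihoods: sequence log-probabilities under the language model.
    cluster_ids: one semantic-cluster label per sequence (assumed given).
    """
    if len(log_likelihoods) != len(cluster_ids):
        raise ValueError("need one cluster id per sequence")
    if not log_likelihoods:
        raise ValueError("need at least one sequence")

    # Sum the probability mass of all sequences that share a meaning.
    cluster_mass = defaultdict(float)
    for log_p, cluster in zip(log_likelihoods, cluster_ids):
        cluster_mass[cluster] += math.exp(log_p)

    # Normalize over the sampled clusters and compute Shannon entropy (nats).
    total = sum(cluster_mass.values())
    entropy = 0.0
    for mass in cluster_mass.values():
        p = mass / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


# Hypothetical example: three sequences, two of which share a meaning.
example = semantic_entropy(
    [math.log(0.4), math.log(0.2), math.log(0.1)],
    ["paris", "paris", "lyon"],
)
```

Under this sketch, a value near zero means that the sampled alternatives agree in meaning with the initial text, while a larger value means that likely alternatives disagree, which is the situation the abstract associates with hallucination.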

Installation

Clone the repository:

git clone git@github.com:ml-jku/SDLG.git
cd SDLG

Install the required dependencies:

pip install -r requirements.txt

Running the Code

Set hyperparameters in args.py
Run experiments with run_experiments.py
Analyze results with analyze_results.ipynb

Contact

For support or queries, feel free to reach out at aichberger@ml.jku.at.

Citation

Please consider giving our work a star ⭐ and citing it:

@article{aichberger2024sdlg,
      title={Semantically Diverse Language Generation for Uncertainty Estimation in Language Models}, 
      author={Lukas Aichberger and Kajetan Schweighofer and Mykyta Ielanskyi and Sepp Hochreiter},
      journal={arXiv preprint arXiv:2406.04306},
      year={2024}
}
