CPSC 440 project on speaker identification in novels using large language models

rrhan0/CPSC-440-540-speaker-identification

CPSC-440-540-speaker-identification

CPSC 440 project on speaker identification in novels using large language models, by Richard Han, Charles Yan, and Mu-Chen Liu.

Abstract

Dialogue attribution is the task of associating each line of dialogue with its correct speaker in a conversation or narrative text. It is a crucial task in fields such as natural language processing (NLP), literary analysis, and dialogue systems. Automating it is challenging, especially in texts where multiple characters interact closely or where few clues identify the speaker. In recent years, large language models (LLMs) have gained significant attention for their ability to handle a wide range of natural language tasks with high proficiency. In this paper, we explore the capabilities of zero-shot LLMs in dialogue attribution through experimental analysis. The Mistral 7B Instruct model achieves the highest overall accuracy, while the Llama 2 Chat model struggles to follow the instructions in the prompt. We observe a positive correlation between dialogue attribution performance and the amount of provided context, particularly when the speaker is not explicitly stated in the dialogue. Additionally, the performance of the Mistral 7B Instruct model plateaus when identifying speakers whose identities are directly mentioned in the dialogue.
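
To make the zero-shot setup concrete, here is a minimal sketch of how a prompt with a variable amount of context might be built and how model answers might be scored. It is an illustration only, not the code used in this project: the function names (`build_prompt`, `parse_speaker`, `accuracy`), the prompt wording, and the use of a candidate-name list are all assumptions, and the call to the LLM itself is left out.

```python
from typing import List, Optional


def build_prompt(sentences: List[str], quote_index: int, context_size: int,
                 candidates: List[str]) -> str:
    """Build a zero-shot prompt (hypothetical wording) that includes
    `context_size` sentences on each side of the target quote."""
    start = max(0, quote_index - context_size)
    end = min(len(sentences), quote_index + context_size + 1)
    context = " ".join(sentences[start:end])
    return (
        "Read the passage and identify who speaks the quoted line.\n"
        f"Passage: {context}\n"
        f"Quote: {sentences[quote_index]}\n"
        f"Candidates: {', '.join(candidates)}\n"
        "Answer with one candidate name only."
    )


def parse_speaker(response: str, candidates: List[str]) -> Optional[str]:
    """Return the candidate name mentioned earliest in the model response,
    or None if the response names no candidate."""
    lowered = response.lower()
    hits = [(lowered.find(c.lower()), c) for c in candidates if c.lower() in lowered]
    return min(hits)[1] if hits else None


def accuracy(predictions: List[Optional[str]], gold: List[str]) -> float:
    """Fraction of quotes whose predicted speaker matches the gold label."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold) if gold else 0.0
```

Varying `context_size` while holding the quotes fixed is one way to measure how accuracy changes with the amount of provided context. Returning `None` for unparseable responses lets answers that ignore the prompt instructions count as errors rather than being silently dropped.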
