MIND-Lab/Soft-metrics-for-evaluation-with-disagreements

Soft metrics for evaluation with disagreements

The move towards preserving judgement disagreements in NLP requires the identification of adequate evaluation metrics. We identify a set of key properties that such metrics should have, and assess the extent to which natural candidates for soft evaluation, such as Cross Entropy, satisfy these properties. We employ a theoretical framework, supported by a visual approach, practical examples, and the analysis of a real case scenario. Our results indicate that Cross Entropy can yield fairly paradoxical results in some cases, whereas other measures, such as Manhattan distance and Euclidean distance, exhibit more intuitive behavior, at least for the case of binary classification.
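
As a rough illustration of the kind of behavior discussed above, the following minimal sketch compares the three measures on soft labels for binary classification. It uses the standard textbook definitions, which are an assumption here and may differ from the exact formulations in the paper. The function names and example distributions are hypothetical and are not taken from this repository.

```python
import math

def cross_entropy(target, pred, eps=1e-12):
    # Standard cross entropy between a soft target and a predicted distribution.
    # eps guards against log(0); it is an illustrative choice, not from the paper.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

def manhattan(target, pred):
    # Sum of absolute differences between the two distributions.
    return sum(abs(t - p) for t, p in zip(target, pred))

def euclidean(target, pred):
    # Square root of the sum of squared differences.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(target, pred)))

# Hypothetical items: annotators split evenly vs. annotators unanimous.
ambiguous_target = (0.5, 0.5)
perfect_pred = (0.5, 0.5)      # matches the ambiguous target exactly
unanimous_target = (1.0, 0.0)
imperfect_pred = (0.6, 0.4)    # visibly off from the unanimous target

ce_perfect = cross_entropy(ambiguous_target, perfect_pred)
ce_imperfect = cross_entropy(unanimous_target, imperfect_pred)

print("CE, perfect prediction on ambiguous item:   ", round(ce_perfect, 3))
print("CE, imperfect prediction on unanimous item: ", round(ce_imperfect, 3))
print("Manhattan, perfect vs. imperfect:", manhattan(ambiguous_target, perfect_pred),
      manhattan(unanimous_target, imperfect_pred))
print("Euclidean, perfect vs. imperfect:", euclidean(ambiguous_target, perfect_pred),
      euclidean(unanimous_target, imperfect_pred))
```

Under these definitions, Cross Entropy assigns a non-zero score to a prediction that matches the ambiguous target exactly, and that score is higher than the one given to the imperfect prediction on the unanimous item. Both distance measures score the exact match as zero and the imperfect prediction as strictly positive.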

Citation

If you find our work useful, please cite our paper:

Soft metrics for evaluation with disagreements: an assessment

@inproceedings{rizzi2024soft,
  title={Soft metrics for evaluation with disagreements: an assessment},
  author={Rizzi, Giulia and Leonardelli, Elisa and Poesio, Massimo and Uma, Alexandra and Pavlovic, Maja and Paun, Silviu and Rosso, Paolo and Fersini, Elisabetta},
  booktitle={Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024},
  pages={84--94},
  year={2024}
}
