MiDe22

Official data repository of English and Turkish misinformation detection datasets from the LREC-COLING 2024 paper "MiDe22: An Annotated Multi-Event Tweet Dataset for Misinformation Detection".

Dataset

The dataset comprises 10,348 tweets: 5,284 in English and 5,064 in Turkish. The tweets cover four topics: the Russia-Ukraine war, the COVID-19 pandemic, refugees, and miscellaneous events. Each tweet is assigned one of three misinformation labels: True, False, or Other. Since we follow Twitter's Terms and Conditions, we publish tweet IDs rather than the tweet content itself. The columns of the data file are as follows:

| Column Name | Description |
|---|---|
| Topic | Topic of the tweet: Ukraine, Covid, Refugees or Misc |
| Event | Event of the tweet: EN01-EN40 in English and TR01-TR40 in Turkish |
| Label | Label of the tweet: True, False, or Other |
| Tweet_id | Twitter ID of the tweet |
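
As a rough illustration of working with these columns, the sketch below reads rows in the documented layout and counts tweets per language, topic, and label. It assumes the data is a comma-separated file with a header row, which this README does not state; the helper names (`read_rows`, `language_of`, `count_labels`) and the sample rows are hypothetical. Deriving the language from the first two characters of `Event` relies on the EN01-EN40 / TR01-TR40 naming given above.

```python
import csv
import io
from collections import Counter

# Column names as documented in the table above.
EXPECTED_COLUMNS = ["Topic", "Event", "Label", "Tweet_id"]


def read_rows(handle):
    """Read rows from an open CSV file and check the documented columns exist."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # csv keeps every field as a string, so long tweet IDs are not rounded.
    return list(reader)


def language_of(row):
    """Derive the language from the Event code prefix (EN.. or TR..)."""
    return row["Event"][:2]


def count_labels(rows):
    """Count tweets per (language, topic, label)."""
    return Counter((language_of(r), r["Topic"], r["Label"]) for r in rows)


# Made-up sample in the assumed layout; the IDs are placeholders, not real tweets.
sample = io.StringIO(
    "Topic,Event,Label,Tweet_id\n"
    "Ukraine,EN01,True,1\n"
    "Ukraine,EN01,False,2\n"
    "Covid,TR05,Other,3\n"
)
rows = read_rows(sample)
counts = count_labels(rows)
print(counts)
```

To use it on the real data, pass an open file handle to `read_rows` instead of the in-memory sample. Keeping `Tweet_id` as a string avoids the precision loss that can occur when large IDs are parsed as floating-point numbers.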

The distribution of tweet counts in the dataset is as follows:

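| Lang | Topic | True | False | Other | Total |
|---|---|---|---|---|---|
| EN | Ukraine | 320 | 393 | 618 | 1,331 |
| EN | Covid | 167 | 514 | 663 | 1,344 |
| EN | Refugees | 94 | 328 | 796 | 1,218 |
| EN | Misc | 146 | 494 | 751 | 1,391 |
| EN | Total | 727 | 1,729 | 2,828 | 5,284 |
| TR | Ukraine | 129 | 338 | 477 | 944 |
| TR | Covid | 190 | 558 | 816 | 1,564 |
| TR | Refugees | 61 | 202 | 298 | 561 |
| TR | Misc | 289 | 634 | 1,072 | 1,995 |
| TR | Total | 669 | 1,732 | 2,663 | 5,064 |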
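
The label distribution is uneven, which may matter when choosing evaluation metrics. The sketch below copies the per-topic counts from the distribution table, recomputes the per-language totals, and derives each label's share. The names `COUNTS`, `label_totals`, and `label_shares` are illustrative and not part of the repository.

```python
# Counts copied from the distribution table: (True, False, Other) per topic.
COUNTS = {
    "EN": {
        "Ukraine": (320, 393, 618),
        "Covid": (167, 514, 663),
        "Refugees": (94, 328, 796),
        "Misc": (146, 494, 751),
    },
    "TR": {
        "Ukraine": (129, 338, 477),
        "Covid": (190, 558, 816),
        "Refugees": (61, 202, 298),
        "Misc": (289, 634, 1072),
    },
}
LABELS = ("True", "False", "Other")


def label_totals(lang):
    """Sum each label column over the four topics of one language."""
    return tuple(sum(column) for column in zip(*COUNTS[lang].values()))


def label_shares(lang):
    """Return each label's fraction of all tweets in one language."""
    totals = label_totals(lang)
    grand_total = sum(totals)
    return {label: n / grand_total for label, n in zip(LABELS, totals)}


for lang in COUNTS:
    totals = label_totals(lang)
    shares = {label: round(share, 3) for label, share in label_shares(lang).items()}
    print(lang, totals, sum(totals), shares)
```

In both languages "Other" is the largest class and "True" the smallest, so a per-class metric such as macro-averaged F1 may be more informative than plain accuracy.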

Citation

If you use the datasets or code, please cite the following paper:

@inproceedings{toraman-etal-2024-mide22,
    title = "{M}i{D}e22: An Annotated Multi-Event Tweet Dataset for Misinformation Detection",
    author = "Toraman, Cagri  and
      Ozcelik, Oguzhan  and
      Sahinuc, Furkan  and
      Can, Fazli",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.986",
    pages = "11283--11295"}