The number of social media users continues to grow worldwide, generating vast amounts of data. Popular social networking sites such as Facebook, Twitter, and Instagram dominate this sphere. By some estimates, 500 million tweets and 4.3 billion Facebook messages are posted every day. According to the latest Pew Research report, nearly half of adults worldwide and two-thirds of American adults (65%) use social networking sites.
In recent years we have noticed a considerable increase in reports and confession posts by abuse victims on Twitter. Most of the time, victims do not report the abuse to their guardians or the concerned authorities. Teenagers and minorities are the most affected groups. Some of these victims tweet about their experience to let go of pain and suffering, or as a cry for help. Identifying such reports is challenging because annotated training data is unavailable and the data is highly sparse. To address this, we are hosting TRACT (Tweets Reporting Abuse Classification Task) on Kaggle.
- We release the first small-scale, manually annotated corpus for the abuse classification problem.
- We propose TRACT, a shared task for this problem.
This new multi-class classification task involves distinguishing three classes of tweets that mention abuse: "report" (annotated as 1), "empathy" (annotated as 2), and "general" (annotated as 3).
Automatic classification of tweets reporting abuse
- F1 score for each class
- Micro-averaged F1 score
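
The two metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the official scoring script: it assumes every tweet has exactly one gold label and one predicted label from the 1/2/3 encoding, and the names `LABELS` and `f1_scores` are hypothetical.

```python
from collections import Counter

# Label encoding as stated in the task description.
LABELS = {1: "report", 2: "empathy", 3: "general"}


def f1_scores(y_true, y_pred, labels=(1, 2, 3)):
    """Return (per-class F1 dict, micro-averaged F1) for single-label predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # the predicted class gains a false positive
            fn[gold] += 1  # the gold class gains a false negative

    def f1(t, p, n):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Micro-averaging pools the counts over all classes before computing F1.
    micro = f1(
        sum(tp[c] for c in labels),
        sum(fp[c] for c in labels),
        sum(fn[c] for c in labels),
    )
    return per_class, micro


if __name__ == "__main__":
    gold = [1, 1, 2, 3, 3, 3]
    predicted = [1, 2, 2, 3, 3, 1]
    per_class, micro = f1_scores(gold, predicted)
    for label, score in per_class.items():
        print(f"{LABELS[label]}: {score:.3f}")
    print(f"micro: {micro:.3f}")
```

When each tweet carries exactly one label and all three classes are included, the micro-averaged F1 equals plain accuracy, so the per-class scores are what show where a system actually fails.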
Exploratory Analysis
- Saichethan Miriyala Reddy
- Kanishk Tyagi
- Abhay Anand Tripathi
for more information regarding dataset and scripts contact
We would like to thank Dr. Ambika Vishal Pawar (associate professor) and Dr. Ketan Kotecha (director) at Symbiosis Institute of Technology, Pune for their help and support.
@misc{https://doi.org/10.17632/my2vkfyffd.1,
doi = {10.17632/MY2VKFYFFD.1},
url = {https://data.mendeley.com/datasets/my2vkfyffd/1},
author = {Miriyala Reddy, Saichethan},
keywords = {Data Mining, Social Media, Domestic Abuse, Twitter},
title = {TRACT: Tweets Reporting Abuse Classification Task Corpus},
publisher = {Mendeley},
year = {2020}
}