
The first-ever small scale manually annotated corpus for abuse report identification


Saichethan/TRACT


Tweets Reporting Abuse Classification Task

The total number of users of social media continues to grow worldwide, resulting in the generation of vast amounts of data. Popular social networking sites such as Facebook, Twitter and Instagram dominate this sphere. According to estimates, 500 million tweets and 4.3 billion Facebook messages are posted every day. According to the latest Pew Research Report, nearly half of adults worldwide and two-thirds of all American adults (65%) use social networking.

In recent years we have noticed a considerable increase in reports and confession posts by abuse victims on Twitter. Most of the time, victims do not report the abuse to their guardians or the concerned authorities. Teenagers and minorities are the groups most affected by abuse. Some of these victims tweet about their experience to let go of pain and suffering, or as a cry for help. Identifying such reports is challenging because of the lack of annotated training data and a high degree of data sparsity. To address this, we are hosting TRACT on Kaggle.

  • We release the first small-scale, manually annotated corpus for the abuse classification problem
  • We propose a shared task for this problem, TRACT

Task

This new multi-class classification task involves distinguishing three classes of tweets that mention abuse: "report" (annotated as 1), "empathy" (annotated as 2), and "general" (annotated as 3).

  1. Automatic classification of tweets reporting abuse, evaluated with:

    • F1-score for each class
    • Micro-averaged F1 score
  2. Exploratory Analysis
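
The two metrics listed under the first subtask can be sketched in plain Python as follows. This is a minimal illustration, not the official scoring script: the function name `f1_scores` and the `LABELS` dictionary are assumptions introduced here, and only the label ids (1, 2, 3) and class names come from the task description above.

```python
from collections import Counter

# Label ids as given in the task description.
LABELS = {1: "report", 2: "empathy", 3: "general"}


def f1_scores(y_true, y_pred, labels=(1, 2, 3)):
    """Return (per-class F1 dict keyed by class name, micro-averaged F1).

    Hypothetical helper for illustration; not part of the TRACT repository.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    per_class = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        per_class[LABELS[c]] = 2 * tp[c] / denom if denom else 0.0

    # Micro-averaging pools the counts over all classes before computing F1.
    total_tp = sum(tp[c] for c in labels)
    total_denom = 2 * total_tp + sum(fp[c] + fn[c] for c in labels)
    micro = 2 * total_tp / total_denom if total_denom else 0.0
    return per_class, micro
```

Because every tweet receives exactly one label, the micro-averaged F1 computed this way equals plain accuracy, so the per-class scores are the more informative figures when the classes are imbalanced.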

Contributors

  1. Saichethan Miriyala Reddy
  2. Kanishk Tyagi
  3. Abhay Anand Tripathi

Contact

For more information regarding the dataset and scripts, contact

Acknowledgment

We would like to thank Dr. Ambika Vishal Pawar (associate professor) and Dr. Ketan Kotecha (director) at Symbiosis Institute of Technology, Pune for their help and support.

References

@misc{https://doi.org/10.17632/my2vkfyffd.1,
  doi = {10.17632/MY2VKFYFFD.1},
  url = {https://data.mendeley.com/datasets/my2vkfyffd/1},
  author = {Miriyala Reddy, Saichethan},
  keywords = {Data Mining, Social Media, Domestic Abuse, Twitter},
  title = {TRACT: Tweets Reporting Abuse Classification Task Corpus},
  publisher = {Mendeley},
  year = {2020}
}
