July 7, 2018
In this social media era, many businesses are moving to incorporate user interaction and discussion in their platforms. However, a minority of users abuse the technology to threaten, insult, or create a generally toxic atmosphere.
Our goal is to classify these toxic comments without censoring the user population as a whole.
We investigate roughly 160,000 comments: about 89.8% are normal user interaction and 10.2% are toxic. Toxic comments are labeled in the following 6 ways:
- Toxic
- Severely toxic
- Obscene
- Insult
- Threat
- Identity hate
Many toxic comments carry more than one label. The labels were assigned by human raters.
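Because a single comment can carry several labels at once, this is a multi-label problem rather than a six-way choice. As a rough sketch of how the label breakdown could be tallied, the snippet below counts clean, single-label, and multi-label comments. It assumes the six label columns are named as in the challenge's train.csv (`toxic`, `severe_toxic`, `obscene`, `threat`, `insult`, `identity_hate`) and hold 0/1 values; the helper names `summarize` and `summarize_csv` and the toy rows are made up for illustration and are not part of the dataset.

```python
import csv
from collections import Counter

# Assumed column names for the six labels; check them against your copy of the data.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def summarize(rows):
    """Return (total, clean_count, multi_label_count, per_label_counts)."""
    total = clean = multi = 0
    per_label = Counter()
    for row in rows:
        # A label is active when its column holds 1 (stored as text in a CSV).
        active = [name for name in LABELS if int(row[name]) == 1]
        total += 1
        if not active:
            clean += 1
        elif len(active) > 1:
            multi += 1
        per_label.update(active)
    return total, clean, multi, per_label


def summarize_csv(path):
    """Stream a CSV file that has the label columns above through summarize()."""
    with open(path, newline="", encoding="utf-8") as handle:
        return summarize(csv.DictReader(handle))


def make_row(*active):
    """Build a hypothetical row with the given labels set to "1"."""
    return {name: ("1" if name in active else "0") for name in LABELS}


# Toy rows invented purely for illustration, not taken from the dataset.
toy_rows = [
    make_row(),
    make_row("toxic", "obscene", "insult"),
    make_row("toxic"),
    make_row(),
]

total, clean, multi, per_label = summarize(toy_rows)
print(f"{clean / total:.1%} clean, {multi} multi-label out of {total}")
```

Pointing `summarize_csv` at a local copy of the training file would give the corresponding counts for the real data; the toy rows only demonstrate the bookkeeping.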
A link to the challenge can be found here: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data