Research Collection School Of Computing and Information Systems

Anatomy of online hate: Developing a taxonomy and machine learning models for identifying and classifying hate in online news media

Author

Joni SALMINEN
Hind ALMEREKHI
Milica MILENKOVIC
Soon-Gyu JUNG
Haewoon KWAK, Singapore Management University
Bernard J. JANSEN

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

1-2018

Abstract

Online social media platforms generally attempt to mitigate hateful expressions, as these comments can be detrimental to the health of the community. However, automatically identifying hateful comments can be challenging. We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media organization. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to automatically detect and classify the hateful comments in the full dataset. Our contribution is twofold: 1) creating a granular taxonomy for hateful online comments that includes both types and targets of hateful comments, and 2) experimenting with machine learning, including Logistic Regression, Decision Tree, Random Forest, AdaBoost, and Linear SVM, to generate a multiclass, multilabel classification model that automatically detects and categorizes hateful comments in the context of online news media. We find that the best performing model is Linear SVM, with an average F1 score of 0.79 using TF-IDF features. We validate the model by testing its predictive ability and, relatedly, provide insights on distinct types of hate speech taking place on social media.
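
The abstract reports an average F1 score for a multilabel model but does not state how the average is taken. As a minimal illustration, the sketch below assumes macro-averaging (the unweighted mean of per-label F1 scores); that choice, the function names, and the label names "a" and "b" are assumptions for illustration only and are not taken from the paper.

```python
from typing import Dict, List, Set


def f1_per_label(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Dict[str, float]:
    """Return one F1 score per label for multilabel predictions.

    Each element of y_true / y_pred is the set of labels for one comment.
    """
    labels = sorted(set().union(*y_true, *y_pred))
    scores: Dict[str, float] = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = (2 * tp / denom) if denom else 0.0
    return scores


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Unweighted mean of per-label F1 scores (assumed averaging scheme)."""
    scores = f1_per_label(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical toy data: four comments, two placeholder labels.
y_true = [{"a"}, {"a", "b"}, {"b"}, set()]
y_pred = [{"a"}, {"a"}, {"b"}, {"b"}]
print(f1_per_label(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Under micro-averaging or support-weighted averaging the same predictions could yield a different number, so a reported average F1 is only comparable across studies when the averaging scheme is known.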

Keywords

Online hate, toxic comments, social media, machine learning

Discipline

Databases and Information Systems | Social Media

Research Areas

Data Science and Engineering

Publication

Proceedings of the 12th International AAAI Conference on Web and Social Media, ICWSM 2018, Palo Alto, California USA, June 25-28

First Page

330

Last Page

339

ISBN

9781577357988

Publisher

AAAI Press

City or Country

Palo Alto

Citation

SALMINEN, Joni; ALMEREKHI, Hind; MILENKOVIC, Milica; JUNG, Soon-Gyu; KWAK, Haewoon; and JANSEN, Bernard J. Anatomy of online hate: Developing a taxonomy and machine learning models for identifying and classifying hate in online news media. (2018). Proceedings of the 12th International AAAI Conference on Web and Social Media, ICWSM 2018, Palo Alto, California USA, June 25-28. 330-339.
Available at: https://ink.library.smu.edu.sg/sis_research/5336

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17885

Included in

Databases and Information Systems Commons, Social Media Commons
