Multi-Lingual Hate Speech Detection using Laser, Muse with deep learning
Date
2022
Publisher
UMT, Lahore
Abstract
According to the National Institute of Standards and Technology, hate speech is "any communication that disparages a person or a group based on any attribute such as race, ethnicity, gender, sexual orientation, nationality, religion, or other trait." Our study differs from previous efforts in that we conduct the experiments on a considerably broader range of languages (11) and on more datasets (11). We conduct a total of four analyses using the following models: Logistic Regression (LR), mBERT, BERT, and CNN-GRU, with the help of LASER and MUSE embeddings. We conclude that in the low-resource setting a machine learning model such as LR on LASER embeddings gives the best results, while in the high-resource setting BERT-based models give the highest performance according to our observations. Among low-resource languages, Italian and Portuguese achieve the best results. Our research aims to use existing hate speech resources to build models that detect hate speech using LASER, MUSE, and deep learning models.
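The low-resource configuration named in the abstract, logistic regression on fixed sentence embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings are synthetic Gaussian clusters standing in for LASER vectors, the dimension of 8 is arbitrary, and the helper names (`toy_embeddings`, `train_logistic_regression`, `predict`) are hypothetical. Only the Python standard library is used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim = len(X[0])
    n = len(X)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: p(hate | x) - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Label 1 ("hate") when the predicted probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def toy_embeddings(n_per_class, dim, seed=0):
    """Hypothetical stand-in for sentence embeddings: two Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5) for _ in range(dim)])
            y.append(label)
    return X, y


# Assumption: in a real pipeline these vectors would come from a
# multilingual sentence encoder rather than a random generator.
X_train, y_train = toy_embeddings(50, dim=8, seed=0)
X_test, y_test = toy_embeddings(20, dim=8, seed=1)

w, b = train_logistic_regression(X_train, y_train)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the classifier only sees fixed-length vectors, the same training code applies to any language the encoder covers, which is one plausible reason a linear model on multilingual embeddings can be attractive when labelled data are scarce.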