Multi-Lingual Hate Speech Detection using Laser, Muse with deep learning

Khalil ul Rehman

Files

Khalil_Thesis_Final.pdf (868.31 KB)

Date

2022

Authors

Khalil ul Rehman

Publisher

UMT, Lahore

Abstract

According to the National Institute of Standards and Technology, hate speech is "any communication that disparages a person or a group based on any attribute such as race, ethnicity, gender, sexual orientation, nationality, religion, or other trait." Our study differs from previous efforts in that we conduct the experiments on a considerably broader variety of languages (11) and with more datasets (11). We conduct a total of four analyses covering the following models: Logistic Regression (LR), mBERT, BERT, and CNN-GRU, using LASER and MUSE embeddings. According to our observations, in the low-resource setting LASER embeddings with LR give the best results, while in the high-resource setting BERT-based models perform best. Among low-resource languages, Italian and Portuguese obtain the best results. Our research aims to use current hate speech resources to create models that detect hate speech using LASER, MUSE, and deep learning models.

URI

https://escholar.umt.edu.pk/handle/123456789/7716

Collections

2022
