TY - GEN
T1 - A Deep Analysis of Textual Features Based Cyberbullying Detection Using Machine Learning
AU - Mahmud, Md Ishtyaq
AU - Mamun, Muntasir
AU - Abdelgawad, Ahmed
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Advances in internet technology have increased our electronic connectivity to one another through social media platforms. Social media has benefited us in many ways, but it has also affected us negatively. One negative consequence of social media use is cyberbullying, which harasses us and harms our reputation, privacy, and feelings. Cyberbullying can be controlled through early detection and legal action. Using machine learning and natural language processing (NLP), it is possible to automatically identify tweets, images, and videos that contain offensive language associated with bullying. In this study, we analyzed five machine learning models (LightGBM, XGBoost, Logistic Regression, Random Forest, and AdaBoost) for detecting cyberbullying from textual features in a Twitter dataset. We used more than 47,000 tweets from the dataset, divided into six classes. We evaluated the models and observed that LightGBM performed significantly better than the others, reaching an accuracy of 85.5%, a precision of 84%, a recall of 85%, and an F1 score of 84.49%.
AB - Advances in internet technology have increased our electronic connectivity to one another through social media platforms. Social media has benefited us in many ways, but it has also affected us negatively. One negative consequence of social media use is cyberbullying, which harasses us and harms our reputation, privacy, and feelings. Cyberbullying can be controlled through early detection and legal action. Using machine learning and natural language processing (NLP), it is possible to automatically identify tweets, images, and videos that contain offensive language associated with bullying. In this study, we analyzed five machine learning models (LightGBM, XGBoost, Logistic Regression, Random Forest, and AdaBoost) for detecting cyberbullying from textual features in a Twitter dataset. We used more than 47,000 tweets from the dataset, divided into six classes. We evaluated the models and observed that LightGBM performed significantly better than the others, reaching an accuracy of 85.5%, a precision of 84%, a recall of 85%, and an F1 score of 84.49%.
KW - Cyberbullying
KW - Machine Learning
KW - Text Classification
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85147690356&partnerID=8YFLogxK
U2 - 10.1109/GCAIoT57150.2022.10019058
DO - 10.1109/GCAIoT57150.2022.10019058
M3 - Conference contribution
AN - SCOPUS:85147690356
T3 - 2022 IEEE Global Conference on Artificial Intelligence and Internet of Things, GCAIoT 2022
SP - 166
EP - 170
BT - 2022 IEEE Global Conference on Artificial Intelligence and Internet of Things, GCAIoT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE Global Conference on Artificial Intelligence and Internet of Things, GCAIoT 2022
Y2 - 18 December 2022 through 21 December 2022
ER -