An Ensemble Approach to Cyberbullying Detection and Prevention on Social Media
Abstract
Over the past decade, digital communication has reached a massive scale globally. Unfortunately, cyberbullying has increased commensurately with the growth of digital technology, with perpetrators hiding behind the cloak of relative internet anonymity. Studies have shown that cyberbullying leaves lasting psychological scars on its victims and often has devastating outcomes. This has necessitated the development of measures to curb cyberbullying. This study presents one such measure in the form of an ensemble model for cyberbullying detection. The proposed model uses a majority voting ensemble of three (3) supervised machine learning classifiers as base learners: support vector machine (SVM), naive Bayes (NB) and k-nearest neighbour (K-NN). The malignant comment dataset, sourced from Kaggle.com, was used for model building at a split ratio of 70:30, with 70% of the data used for training and 30% for evaluation. Evaluation was based on standard metrics. The proposed ensemble model performed best of all the models implemented, with an accuracy of 95%. It was also observed to be the most consistent classifier across all the metrics considered. This demonstrates the efficacy of the ensemble model in detecting cyberbullying comments.
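The majority voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `majority_vote` and `accuracy`, the label encoding (1 = cyberbullying, 0 = not cyberbullying), and the prediction lists are all hypothetical, chosen only to show how three base learners' outputs could be combined by hard voting.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting.

    Each inner list holds one model's predicted labels, in the same sample order.
    """
    combined = []
    # zip(*...) groups the i-th prediction of every model together.
    for votes in zip(*predictions_per_model):
        # With three voters and binary labels a tie cannot occur.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical outputs of the three base learners on five test comments
# (assumed encoding: 1 = cyberbullying, 0 = not cyberbullying).
y_true = [1, 0, 1, 1, 0]
svm_pred = [1, 0, 1, 0, 0]  # wrong on the fourth comment
nb_pred = [1, 1, 1, 1, 0]   # wrong on the second comment
knn_pred = [0, 0, 1, 1, 0]  # wrong on the first comment

ensemble_pred = majority_vote([svm_pred, nb_pred, knn_pred])
ensemble_accuracy = accuracy(y_true, ensemble_pred)
```

In this toy case each base learner errs on a different comment, so the other two outvote it every time. This is the intuition behind combining classifiers; the 95% accuracy reported in the abstract comes from the authors' experiments, not from this sketch.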
Article Details
Authors hold the copyright of all published articles unless otherwise stated.