Machine Learning for Health Insurance Prediction in Nigeria

Victor Enemona Ochigbo
Oluwasogo Adekunle Okunade
Emmanuel Gbenga Dada
Olayemi Mikail Olaniyi
Oluwatoyosi Victoria Oyewande

Abstract

Health insurance coverage remains critical to healthcare accessibility, particularly in developing nations like Nigeria. This paper focuses on predicting the likelihood of medical insurance coverage among individuals in Nigeria using four prominent machine learning techniques: Logistic Regression, Random Forest, Decision Tree, and Support Vector Machine classifiers. The dataset used for the analysis comprises demographic information, socioeconomic factors, and health-related variables collected from a diverse sample across Nigeria. Four models are trained and evaluated: Logistic Regression, widely accepted for its simplicity and interpretability; Random Forest, a robust ensemble learning algorithm capable of capturing complex relationships within the data; Decision Tree, which is simple to understand and visualize; and Support Vector Machine, which is known for strong classification performance. The performance metrics used to rate the predictive capabilities of the models are Accuracy, Precision, Sensitivity, F Score, and the area under the Receiver Operating Characteristic curve (AUC-ROC). Additionally, a feature importance analysis is conducted to identify the dominant factors contributing to the prediction of the spread of medical insurance in Nigeria. The outcome of this paper gives insight into the effectiveness of each machine learning model used to forecast medical insurance coverage, and the key determinants it identifies can assist policymakers and healthcare stakeholders in devising targeted strategies to improve healthcare access and affordability for the Nigerian people.
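
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight labels and scores are hypothetical values made up for illustration, with 1 standing for "insured" and 0 for "not insured". AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall) and F score from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_score": f_score,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true coverage status and a model's predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, sensitivity and F score depend on the chosen threshold, whereas AUC is computed from the raw scores and so summarizes ranking quality across all thresholds. This is one reason the two kinds of metric are usually reported together.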

How to Cite
[1]
V. E. Ochigbo, O. A. Okunade, E. G. Dada, O. M. Olaniyi, and O. V. Oyewande, “Machine Learning for Health Insurance Prediction in Nigeria”, AJERD, vol. 7, no. 2, pp. 541–554, Dec. 2024.
