Single classifier vs. ensemble machine learning approaches for mental health prediction

Brain Informatics - Volume 10 - Pages 1-10 - 2023
Jetli Chung1, Jason Teo2,3
1Faculty of Computing and Informatics, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Malaysia
2Advanced Machine Intelligence Research Group, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Malaysia
3Evolutionary Computing Laboratory, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Malaysia

Abstract

Early prediction of mental health issues is essential for timely diagnosis and treatment by mental health professionals. Machine learning is one of the promising routes to fully automated, computer-based prediction of mental health problems. This study empirically evaluates several popular machine learning algorithms for classifying and predicting mental health problems on a given data set, using both single classifiers and an ensemble machine learning approach. The data set contains responses to a survey questionnaire conducted by Open Sourcing Mental Illness (OSMI). The algorithms investigated are Logistic Regression, Gradient Boosting, Neural Networks, K-Nearest Neighbours, and Support Vector Machine, together with an ensemble built from these algorithms. Comparisons were also made against two more recent machine learning approaches, namely Extreme Gradient Boosting and Deep Neural Networks. Gradient Boosting achieved the highest overall accuracy at 88.80%, followed by Neural Networks at 88.00%, Extreme Gradient Boosting at 87.20%, and Deep Neural Networks at 86.40%. The ensemble classifier achieved 85.60%, while the remaining classifiers achieved between 82.40% and 84.00%. The findings indicate that Gradient Boosting provided the highest classification accuracy for this particular binary classification task in mental health prediction. All of the machine learning approaches studied here achieved more than 80% accuracy, indicating a highly promising approach for mental health professionals toward automated clinical diagnosis.
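The comparison between single classifiers and an ensemble can be illustrated with a minimal sketch. The abstract does not state how the ensemble combines its base classifiers, so hard majority voting is assumed here purely for illustration. The labels and per-classifier predictions below are hypothetical toy values, not OSMI data and not results from the study, and the helper names `accuracy` and `majority_vote` are likewise illustrative.

```python
from collections import Counter
from typing import Dict, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty sample")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Hard-voting ensemble: for each sample, return the most frequent label.

    Ties (possible with an even number of voters) go to the smallest label.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")
    ensemble = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        counts = Counter(votes)
        # max() keeps the first maximum, so sorting breaks ties by label
        ensemble.append(max(sorted(counts), key=counts.get))
    return ensemble


# Hypothetical toy labels (1 = positive class, 0 = negative class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]

# Hypothetical predictions from the five base classifiers named in the abstract.
base_predictions: Dict[str, List[int]] = {
    "logistic_regression": [1, 0, 1, 0, 0, 1, 0, 1],
    "gradient_boosting":   [1, 0, 1, 1, 0, 1, 1, 0],
    "neural_network":      [1, 0, 0, 1, 0, 1, 0, 0],
    "k_nearest_neighbours": [0, 0, 1, 1, 1, 1, 0, 1],
    "support_vector_machine": [1, 1, 1, 1, 0, 0, 0, 1],
}

single_scores = {name: accuracy(y_true, pred)
                 for name, pred in base_predictions.items()}
ensemble_pred = majority_vote(list(base_predictions.values()))
ensemble_score = accuracy(y_true, ensemble_pred)

for name, score in single_scores.items():
    print(f"{name}: {score:.2%}")
print(f"ensemble (majority vote): {ensemble_score:.2%}")
```

In this toy example the ensemble only matches the best single classifier, which is consistent with the general point that combining classifiers does not guarantee an improvement over the strongest individual model.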
