Single classifier vs. ensemble machine learning approaches for mental health prediction
Abstract
Early prediction of mental health problems is essential for timely diagnosis and treatment by mental health professionals. Machine learning is a promising route to fully automated, computer-based prediction of such problems. This study empirically evaluates several popular machine learning algorithms for classifying and predicting mental health problems on a given data set, using both single classifiers and an ensemble approach. The data set contains responses to a survey questionnaire conducted by Open Sourcing Mental Illness (OSMI). The algorithms investigated are Logistic Regression, Gradient Boosting, Neural Networks, K-Nearest Neighbours, and Support Vector Machine, together with an ensemble built from these algorithms. Comparisons were also made against two more recent approaches, Extreme Gradient Boosting and Deep Neural Networks. Gradient Boosting achieved the highest accuracy at 88.80%, followed by Neural Networks at 88.00%, Extreme Gradient Boosting at 87.20%, and Deep Neural Networks at 86.40%. The ensemble classifier achieved 85.60%, while the remaining classifiers achieved between 82.40% and 84.00%. The findings indicate that Gradient Boosting provided the highest classification accuracy for this particular binary mental health classification task. All of the machine learning approaches studied achieved more than 80% accuracy, which suggests that machine learning is a promising aid for mental health professionals working toward automated clinical diagnosis.
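The abstract does not say how the ensemble combines its member classifiers, so the following is only a minimal sketch of one common option, hard (majority) voting, and not the authors' method. The function names `majority_vote` and `accuracy`, the tie-breaking rule, and the toy labels are all assumptions made for illustration; the toy numbers have no relation to the accuracies reported in the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by hard (majority) voting.

    `predictions` is a list with one entry per classifier; each entry is that
    classifier's predicted labels for the same samples, in the same order.
    Ties go to the earliest classifier in the list (an arbitrary assumption).
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    for votes in zip(*predictions):  # the votes cast for one sample
        counts = Counter(votes)
        top = max(counts.values())
        # First vote (in classifier order) whose label has the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented binary labels (1 = condition predicted, 0 = not), illustration only.
y_true = [1, 0, 1, 1, 0, 0]
clf_a = [1, 0, 0, 1, 0, 1]  # hypothetical classifier outputs
clf_b = [1, 1, 1, 1, 0, 0]
clf_c = [0, 0, 1, 0, 0, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print("ensemble predictions:", ensemble)
print("ensemble accuracy:", accuracy(y_true, ensemble))
```

On this invented data the vote happens to correct every individual error, which is not guaranteed in general: in the study itself the ensemble (85.60%) scored below the best single classifier, Gradient Boosting (88.80%).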