A pragmatic ensemble learning approach for rainfall prediction
Abstract
Heavy rainfall and precipitation play a major role in shaping the socio-agricultural landscape of a country. Because rainfall is a key indicator of climate change, natural disasters, and the general topology of a region, the ability to predict it can serve many beneficial purposes. Machine learning has a strong track record in the prediction and estimation of rainfall. This paper examines the effect of ensemble learning, a subset of machine learning, on a rainfall prediction dataset, with the aim of improving the predictive performance of the models used. The classification models were tested first individually and then with ensemble techniques such as bagging and boosting applied, on a rainfall dataset from Australia. The objective is to demonstrate a reduction in bias and variance through ensemble learning techniques, and to analyze how much each of these two metrics increases or decreases. The study shows an overall reduction in bias of 6% on average using boosting, and an average reduction in variance of 13.6%. The models also generalized better, with the false negative rate falling by more than 20% on average. The techniques explored in this paper could be improved further through hyper-parameter tuning.
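The comparison described above (a single classifier versus the same classifier inside an ensemble, scored by false negative rate) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only bagging, a hand-written decision stump as the base model, and synthetic data. The feature names (humidity, pressure), the rain probabilities, and every function name below are assumptions made for illustration only.

```python
import random


def false_negative_rate(y_true, y_pred):
    """FN / (FN + TP): share of actual rain days that were predicted as dry."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp) if (fn + tp) else 0.0


def fit_stump(X, y):
    """Fit a one-feature threshold rule by minimising training errors."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - thr) > 0 else 0 for row in X]
                err = sum(p != t for p, t in zip(preds, y))
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best[1:]  # (feature index, threshold, direction)


def predict_stump(stump, X):
    j, thr, sign = stump
    return [1 if sign * (row[j] - thr) > 0 else 0 for row in X]


def fit_bagging(X, y, n_estimators=25, rng=None):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = rng or random.Random(0)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, X):
    """Combine the stumps by strict majority vote."""
    votes = [predict_stump(s, X) for s in stumps]
    return [1 if 2 * sum(col) > len(stumps) else 0 for col in zip(*votes)]


def make_data(n, rng):
    """Synthetic stand-in data (hypothetical): rain is likelier when humid."""
    X, y = [], []
    for _ in range(n):
        humidity = rng.uniform(0, 100)
        pressure = rng.uniform(990, 1030)
        p_rain = 0.8 if humidity > 60 else 0.15
        X.append([humidity, pressure])
        y.append(1 if rng.random() < p_rain else 0)
    return X, y


rng = random.Random(42)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(100, rng)

single_preds = predict_stump(fit_stump(X_train, y_train), X_test)
bagged_preds = predict_bagging(fit_bagging(X_train, y_train, rng=rng), X_test)

print("single stump FNR:", false_negative_rate(y_test, single_preds))
print("bagged stumps FNR:", false_negative_rate(y_test, bagged_preds))
```

On a toy problem like this the two scores may be close or even identical. The sketch only shows the shape of the comparison and is not meant to reproduce the reported figures.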