An application of machine learning regression to feature selection: a study of logistics performance and economic attributes
Abstract
This study shows how to exploit real-time, dynamic economic big data to select the economic attributes that reflect logistics performance as measured by the Logistics Performance Index (LPI). The analysis relies on the high efficiency of machine learning (ML) for prediction, or regression, using suitable economic features. The goal is to identify the ideal set of economic attributes that best describes the target variable when predicting a country's logistics performance. In addition, several candidate ML regression algorithms are considered in order to optimize prediction accuracy. Feature selection uses the filter techniques of correlation and principal component analysis (PCA), together with the embedded techniques of LASSO and Elastic-net regression. Then, based on the selected features, the ML regression methods artificial neural network (ANN), multilayer perceptron (MLP), support vector regression (SVR), random forest regression (RFR), and Ridge regression are used to train and validate the data set. The findings show that the PCA and Elastic-net feature sets come closest to satisfying the error-measurement criteria. A union and intersection procedure over the acceptable feature sets is then applied to reach a more precise decision. Ultimately, the union of the feature sets yields the best results. The results indicate that ML algorithms can help select an appropriate set of economic factors that reflect a country's logistics performance. Moreover, the ANN proved to be the most effective prediction model in this study.
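The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the toy values, the correlation threshold, and the penalty settings are all assumptions chosen so the result is easy to check by hand. The sketch applies a correlation filter and an embedded LASSO / Elastic-net selector (a small coordinate-descent routine written here in plain Python), then forms the union and intersection of the selected sets. PCA and the downstream regressors (ANN, MLP, SVR, RFR, Ridge) are left out.

```python
# Toy, hypothetical data: 8 "countries", 3 standardized economic features.
# Names and values are illustrative assumptions, not the paper's data set.
features = {
    "gdp_per_capita": [1, 1, 1, 1, -1, -1, -1, -1],
    "trade_openness": [1, 1, -1, -1, 1, 1, -1, -1],
    "inflation":      [1, -1, 1, -1, 1, -1, 1, -1],
}
# Centered toy LPI score, built as 3*gdp + 1*trade + a small disturbance.
lpi = [4.1, 3.9, 1.9, 2.1, -2.1, -1.9, -3.9, -4.1]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def correlation_filter(features, target, threshold):
    """Filter method: keep features whose |correlation| with the target is large."""
    return {name for name, col in features.items()
            if abs(pearson(col, target)) >= threshold}


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; exactly zero inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def elastic_net_cd(features, target, alpha, l1_ratio, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2, assuming centered data (no intercept).
    l1_ratio = 1.0 gives LASSO."""
    names = list(features)
    n = len(target)
    coef = {name: 0.0 for name in names}
    for _ in range(n_iter):
        for name in names:
            col = features[name]
            rho = 0.0
            for i in range(n):
                # Prediction from all other features (partial residual).
                others = sum(coef[m] * features[m][i] for m in names if m != name)
                rho += col[i] * (target[i] - others)
            rho /= n
            z = sum(v * v for v in col) / n
            coef[name] = (soft_threshold(rho, alpha * l1_ratio)
                          / (z + alpha * (1.0 - l1_ratio)))
    return coef


# Assumed settings: threshold 0.5, alpha 0.5 (not taken from the paper).
corr_set = correlation_filter(features, lpi, threshold=0.5)
lasso_coef = elastic_net_cd(features, lpi, alpha=0.5, l1_ratio=1.0)
enet_coef = elastic_net_cd(features, lpi, alpha=0.5, l1_ratio=0.5)

# Embedded selection: keep features whose coefficient was not shrunk to zero.
lasso_set = {name for name, c in lasso_coef.items() if c != 0.0}
enet_set = {name for name, c in enet_coef.items() if c != 0.0}

union_set = corr_set | lasso_set | enet_set
intersection_set = corr_set & lasso_set & enet_set

print("union:", sorted(union_set))
print("intersection:", sorted(intersection_set))
```

On this toy data the correlation filter keeps only `gdp_per_capita`, while LASSO and Elastic-net also keep the weaker `trade_openness`, so the union is the larger candidate set and the intersection the more conservative one. In a fuller workflow each candidate set would be passed to the regression models and compared on held-out error.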
Keywords
#machine learning #regression #logistics performance #economic attributes #Logistics Performance Index