A novel imputation based predictive algorithm for reducing common cause variation from small and mixed datasets with missing values

Computers & Industrial Engineering - Volume 179 - Page 109230 - 2023
Raed S. Batbooti¹, Rajesh S. Ransing¹
¹ Zienkiewicz Institute for Modelling, Data and AI, Department of Mechanical Engineering, Faculty of Science and Engineering, Swansea University, Swansea, SA1 8EN, UK
