Techniques for evaluating fault prediction models

Empirical Software Engineering - Volume 13, Issue 5 - Pages 561-595 - 2008
Yue Jiang1, Bojan Čukić1, Yan Ma2
1Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, USA 26506-6109
2Department of Statistics, West Virginia University, Morgantown, USA 26506-6109


References

Adams NM, Hand DJ (1999) Comparing classifiers when the misallocation costs are uncertain. Pattern Recognit 32:1139–1147. doi: 10.1016/S0031-3203(98)00154-X

Arisholm E, Briand LC (2006) Predicting fault-prone components in a Java legacy system. Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering (ISESE’06)

Azar D, Precup D, Bouktif S, Kegl B, Sahraoui H (2002) Combining and adapting software quality predictive models by genetic algorithms. 17th IEEE International Conference on Automated Software Engineering. IEEE Computer Society

Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761. doi: 10.1109/32.544352

Boetticher GD (2005) Nearest neighbor sampling for better defect prediction. ACM SIGSOFT Software Engineering Notes, 30(4). ACM, New York, NY, pp 1–6

Braga AC, Costa L, Oliveira P (2006) A nonparametric method for the comparison of areas under two ROC curves. International Conference on Robust Statistics (ICORS06). Technical University of Lisbon, 16–21 July 2006, Lisbon, Portugal

Breiman L (2001) Random forests. Mach Learn 45:5–32. doi: 10.1023/A:1010933404324

Challagulla VUB, Bastani FB, Yen I-L, Paul RA (2005) Empirical assessment of machine learning based software defect prediction techniques. Proceedings of the 10th IEEE International Workshop on Object-Oriented Real-Time Dependable Systems (WORDS’05), pp 263–270

Conover WJ (1999) Practical nonparametric statistics. Wiley, New York

Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, PA, pp 233–240

Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

Drummond C, Holte RC (2006) Cost curves: an improved method for visualizing classifier performance. Mach Learn 65(1):95–130. doi: 10.1007/s10994-006-8199-5

El-Emam K, Benlarbi S, Goel N, Rai SN (2001) Comparing case-based reasoning classifiers for predicting high-risk software components. J Syst Softw 55(3):301–320. doi: 10.1016/S0164-1212(00)00079-0

Fenton N, Neil M (1999) Software metrics and risk. The 2nd European Software Measurement Conference (FESMA 99), TI-KVIV, Amsterdam, pp 39–55

Gokhale SS, Lyu MR (1997) Regression tree modeling for the prediction of software quality. In: Pham H (ed) The third ISSAT International Conference on Reliability and Quality in Design. Anaheim, CA, pp 31–36

Guo L, Ma Y, Cukic B, Singh H (2004) Robust prediction of fault-proneness by random forests. Proceedings of the 15th IEEE International Symposium on Software Reliability Engineering (ISSRE 2004), IEEE Press

Khoshgoftaar TM, Lanning DL (1995) A neural network approach for early detection of program modules having high risk in the maintenance phase. J Syst Softw 29(1):85–91. doi: 10.1016/0164-1212(94)00130-F

Khoshgoftaar TM, Allen EB, Ross FD, Munikoti R, Goel N, Nandi A (1997) Predicting fault-prone modules with case-based reasoning. The Eighth International Symposium on Software Reliability Engineering (ISSRE '97). IEEE Computer Society, pp 27–35

Khoshgoftaar TM, Seliya N (2002) Tree-based software quality estimation models for fault prediction. The 8th IEEE Symposium on Software Metrics (METRICS’02), IEEE Computer Society, pp 203–214

Khoshgoftaar TM, Cukic B, Seliya N (2007) An empirical assessment on program module-order models. Qual Technol Quant Manag 4(2):171–190

Koru AG, Liu H (2005) Building effective defect-prediction models in practice. IEEE Softw 22(6):23–29. doi: 10.1109/MS.2005.149

Kubat M, Holte RC, Matwin S (1998) Machine learning for the detection of oil spills in satellite radar images. Mach Learn 30(2–3):195–215. doi: 10.1023/A:1007452223027

Lewis D, Gale W (1994) A sequential algorithm for training text classifiers. Annual ACM Conference on Research and Development in Information Retrieval, the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Springer-Verlag, New York, NY, pp 3–12

Ling CX, Li C (1998) Data mining for direct marketing: problems and solutions. Proc. of the 4th Intern. Conf. on Knowledge Discovery and Data Mining, New York, pp 73–79

Ma Y (2007) An empirical investigation of tree ensembles in biometrics and bioinformatics. West Virginia University, PhD thesis, January 2007

Macskassy S, Provost F, Rosset S (2005a) Pointwise ROC confidence bounds: an empirical evaluation. Proceedings of the Workshop on ROC Analysis in Machine Learning (ROCML-2005)

Macskassy S, Provost F, Rosset S (2005b) ROC confidence bands: an empirical evaluation. Proceedings of the 22nd International Conference on Machine Learning (ICML). Bonn, Germany

Menzies T, Stefano JD, Ammar K, Chapman RM, McGill K, Callis P et al (2003) When can we test less? Proceedings of the Ninth International Software Metrics Symposium (METRICS’03), IEEE Computer Society

Menzies T, Greenwald J, Frank A (2007) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33(1):2–13. doi: 10.1109/TSE.2007.256941

Ostrand TJ, Weyuker EJ, Bell RM (2005) Predicting the location and number of faults in large software systems. IEEE Trans Softw Eng 31(4):340–355. doi: 10.1109/TSE.2005.49

Ohlsson N, Alberg H (1996) Predicting fault-prone software modules in telephone switches. IEEE Trans Softw Eng 22(12):886–894. doi: 10.1109/32.553637

Ohlsson N, Eriksson AC, Helander ME (1997) Early risk-management by identification of fault-prone modules. Empir Softw Eng 2(2):166–173. doi: 10.1023/A:1009757419320

Selby RW, Porter AA (1988) Learning from examples: generation and evaluation of decision trees for software resource analysis. IEEE Trans Softw Eng 14(12):1743–1757. doi: 10.1109/32.9061

Siegel S (1956) Nonparametric statistics. McGraw-Hill, New York

Vuk M, Curk T (2006) ROC curve, lift chart and calibration plot. Metodoloski zvezki 3:89–108

Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann

Youden W (1950) Index for rating diagnostic tests. Cancer 3:32–35. doi: 10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3

Yousef WA, Wagner RF, Loew MH (2004) Comparison of non-parametric methods for assessing classifier performance in terms of ROC parameters. In Proceedings of Applied Imagery Pattern Recognition Workshop, vol. 33, issue 13–15, pp 190–195

Zhang H, Zhang X (2007) Comments on ‘data mining static code attributes to learn defect predictors’. IEEE Trans Softw Eng 33(9):635–637