Studying just-in-time defect prediction using cross-project models

Yasutaka Kamei¹, Takafumi Fukushima¹, Shane McIntosh², Kazuhiro Yamashita¹, Naoyasu Ubayashi¹, Ahmed E. Hassan³
¹ Principles of Software Languages Group (POSL), Kyushu University, Fukuoka-shi, Fukuoka, Japan
² Department of Electrical and Computer Engineering, McGill University, Montréal, Canada
³ Software Analysis and Intelligence Lab (SAIL), Queen’s University, Kingston, Canada


