Investigating factors that affect the human perception on god class detection: an analysis based on a family of four controlled experiments

Sociedade Brasileira de Computacao - SB - Tập 5 - Trang 1-39 - 2017
José Amancio M. Santos1, João B. Rocha-Junior2,3, Manoel Gomes de Mendonça4,3
1Technology Department, State University of Feira de Santana, Avenida Transnordestina, s/n - Novo Horizonte, Feira de Santan, Bahia, Brazil
2Exact Science Department, State University of Feira de Santana, Bahia, Brazil
3Fraunhofer Project Center for Software & Systems Engineering, Federal University of Bahia, Bahia, Brazil
4Department of Computer Science, Federal University of Bahia, Bahia, Brazil

Tóm tắt

Evaluation of design problems in object oriented systems, which we call code smells, is mostly a human-based task. Several studies have investigated the impact of code smells in practice. Studies focusing on human identification of code smells have shown low agreement among developers. Unfortunately, those studies do not attempt to investigate the reasons behind this phenomenon. This paper aims to investigate factors affecting human perception of code smells. Specifically, it focuses on factors affecting god class detection, one of the most known code smells. The investigation encompassed a family of four controlled experiments, covering potential factors affecting human detection of code smells. The method is incremental. In other words, each experiment produces insights to the next one. This allows the investigators to control specific factors affecting the agreement on god class detection. The factors addressed in this study are: i) developer experience, ii) developer knowledge, iii) developer training, iv) tool support for design comprehension, and v) software size. Our findings show that tool support for design comprehension is the only factor that does not affect the human perception of god class. The other factors impact this perception in some way. The area still needs more investigation and discussion on what we call the code smell conceptualization problem, to ensure similar criteria and thresholds on human-based code smell detection.

Tài liệu tham khảo

Abbes, M, Khomh F, Guéhéneuc YG, Antoniol G (2011) An empirical study of the impact of two antipatterns, blob and spaghetti code, on program comprehension In: Proc. of 15th European Conference on Software Maintenance and Reengineering (CSMR), 181–190. Ahmed, I, Mannan UA, Gopinath R, Jensen C (2015) An empirical study of design degradation: How software projects get worse over time In: 2015 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–10.. IEEE Computer Society, Washington, DC. Anda, B (2007) Assessing software system maintainability using structural measures and expert assessments In: Proc. of the 23rd IEEE International Conference on Software Maintenance (ICSM), 204–213.. IEEE Computer Society, Washington, DC. Basili, V, Shull F, Lanubile F (1999) Building knowledge through families of experiments. IEEE Trans Serv Comput 25(4):456–473. Carneiro, GF, Mendonça MG (2013) Sourceminer - a multi-perspective software visualization environment In: Proc. of the 15th International Conference on Enterprise Information Systems (ICEIS).. Springer, Berlin. Carneiro, G, Mendonça M (2014) Sourceminer: Towards an extensible multi-perspective software visualization environment. In: Hammoudi S, Cordeiro J, Maciaszek LA, Filipe J (eds)Enterprise Information Systems, Lecture Notes in Business Information Processing, vol 190, 242–263.. Springer International Publishing, Cham. Carneiro, G, Silva M, Maia L, Figueiredo E, Sant‘Anna C, Garcia A, Mendonça M (2010) Identifying code smells with multiple concern views In: Proc. of the 1th Brazilian Conference on Software: Theory and Practice (CBSOFT), 128–137.. IEEE Computer Society, Washington, DC. Carver, J, Jaccheri L, Morasca S, Shull F (2003) Issues in using students in empirical studies in software engineering education In: Proc. of the 9th International Software Metrics Symposium, 239–249.. IEEE Computer Society, Washington, DC. Cliff, N (1996) Ordinal Methods for Behavioral Data Analysis. Erlbaum, Mahwah. ISBN:9780805813333. Cohen, J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20:37–46. de Alwis, B, Murphy GC (2006) Using visual momentum to explain disorientation in the eclipse ide In: Proc. of the Visual Languages and Human-Centric Computing (VLHCC), 51–54.. IEEE Computer Society, Washington, DC. Dybå, T, Sjøberg DI, Cruzes DS (2012) What works for whom, where, when, and why?: On the role of context in empirical software engineering In: Proc. of the 6th International Symposium on Empirical Software Engineering and Measurement (ESEM), 19–28.. ACM, New York. Feinstein, AR, Cicchetti DV (1990) High agreement but low kappa: I. the problems of two paradoxes. J Clin Epidemiol 43(6):543–549. Fernandes, E, Oliveira J, Vale G, Paiva T, Figueiredo E (2016) A review-based comparative study of bad smell detection tools In: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, ACM, New York, NY, USA, EASE ’16, 18:1–18:12.. ACM, New York. Finn, RH (1970) A note on estimating the reliability of categorical data. Educ Psychol Meas 30:71–76. Fleiss, J, et al (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382. Fontana, FA, Braione P, Zanoni M (2012) Automatic detection of bad smells in code: An experimental assessment. J Object Technol 11(2):5:1–38. Fowler, M (1999) Refactoring: improving the design of existing code. Addison-Wesley Longman Publishing Co., Inc., Boston. Fu, S, Shen B (2015) Code bad smell detection through evolutionary data mining In: 2015 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–9.. IEEE Computer Society, Washington, DC. Gwet, K (2002) Kappa statistic is not satisfactory for assessing the extent of agreement between raters. Stat Methods Inter-rater Reliab Assess 1(1):1–5. Höst, M, Regnell B, Wohlin C (2000) Using students as subjects: A comparative study of students and professionals in lead-time impact assessment. Empir Softw Eng 5(3):201–214. Höst, M, Wohlin C, Thelin T (2005) Experimental context classification: incentives and experience of subjects In: Proc. of the 27th International Conference on Software Engineering (ICSE), 470–478.. IEEE Computer Society, Washington, DC. Jedlitschka, A, Ciolkowski M, Pfahl D (2008) Reporting experiments in software engineering. In: Shull F, Singer J, Sjberg DIK (eds)Guide to Advanced Empirical Software Engineering, 201–228.. Springer, London. Johnson, B, Shneiderman B (1991) Tree-maps: a space-filling approach to the visualization of hierarchical information structures In: Proc. of IEEE Conference on Visualization, 284–291.. IEEE Computer Society, Washington, DC. Jonathan, I, Maletic HK (2008) Expressiveness and effectiveness of program comprehension: Thoughts on future research directions In: Frontiers of Software Maintenance (FoSM), 31–37.. IEEE Computer Society, Washington, DC. Juristo, N, Vegas S (2009) Using differences among replications of software engineering experiments to gain knowledge In: Proc. of the 3rd International Symposium on Empirical Software Engineering and Measurement (ESEM), 356–366.. IEEE Computer Society, Washington, DC. Kersten, M, Murphy GC (2006) Using task context to improve programmer productivity In: Proc. of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT ’06/FSE-14), 1–11.. ACM, New York. Khomh, F, Vaucher S, Gueheneuc YG, Sahraoui H (2009) A bayesian approach for the detection of code and design smells In: Proc. of the 9th International Conference on Quality Software (QSIC), 305–314.. IEEE Computer Society, Washington, DC. Kitchenham, B, Madeyski L, Budgen D, Keung J, Brereton P, Charters S, Gibbs S, Pohthong A (2017) Robust statistical methods for empirical software engineering. Empir Softw Engg 22(2):579–630. Kreimer, J (2005) Adaptive detection of design flaws. Electron Notes Theor Comput Sci 141(4):117–136. Landis, JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174. Lanza, M, Ducasse S (2003) Polymetric views - a lightweight visual approach to reverse engineering. IEEE Trans Softw Eng 29(9):782–795. Lanza, M, Marinescu R (2005) Object-Oriented Metrics in Practice. Springer-Verlag New York, Inc., Secaucus. Li, W, Shatnawi R (2007) An empirical study of the bad smells and class error probability in the post-release object-oriented system evolution. J Syst Softw 80(7):1120–1128. Linares-Vásquez, M, Klock S, McMillan C, Sabané A, Poshyvanyk D, Guéhéneuc YG (2014) Domain matters: Bringing further evidence of the relationships among anti-patterns, application domains, and quality-related metrics in java mobile apps In: Proceedings of the 22Nd International Conference on Program Comprehension, ICPC 2014, 232–243.. ACM, New York. Mäntylä, M (2005) An experiment on subjective evolvability evaluation of object-oriented software: explaining factors and interrater agreement In: Proc. of the 4th International Symposium on Empirical Software Engineering (ISESE).. IEEE Computer Society, Washington, DC. Mäntylä, M, Lassenius C (2006a) Subjective evaluation of software evolvability using code smells: An empirical study. Empir Softw Eng 11(3):395–431. Mäntylä, MV, Lassenius C (2006b) Drivers for software refactoring decisions In: Proc. of 5th International Symposium on Empirical Software Engineering (ISESE), 297–306.. Kluwer Academic Publishers, Hingham. Mäntylä, MV, Vanhanen J, Lassenius C (2004) Bad smells humans as code critics In: Proc. of the 20th IEEE International Conference on Software Maintenance (ICSM), 399–408.. IEEE Computer Society, Washington, DC. Mendonça, M, Maldonado J, de Oliveira M, Carver J, Fabbri S, Shull F, Travassos GH, Hohn E, Basili V (2008) A framework for software engineering experimental replications In: Proc. of the 13th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS), 203–212.. IEEE Computer Society, Washington, DC. Meyer, B (1988) Object-Oriented Software Construction. 1st edn. Prentice-Hall, Inc., Upper Saddle River. Moha, N, Guéhéneuc Y, Duchien L, Le Meur A (2010) Decor: A method for the specification and detection of code and design smells. Softw Eng. IEEE Trans 36(1):20–36. Moonen, L, Yamashita A (2012) Do code smells reflect important maintainability aspects? In: Proc. of 28th the International Conference on Software Maintenance (ICSM), 306–315.. IEEE Computer Society, Washington, DC. Murphy-Hill, E, Black AP (2010) An interactive ambient visualization for code smells In: Proc. of the 5th International Symposium on Software visualization (SOFTVIS), 5–14.. ACM, New York. Novais, R, Nunes C, Lima C, Cirilo E, Dantas F, Garcia A, Mendonça M (2012) On the proactive and interactive visualization for feature evolution comprehension: An industrial investigation In: Proceedings of the 34th International Conference on Software Engineering (ICSE), 1044–1053.. IEEE Press, Piscataway. Olbrich, SM, Cruzes DS, Sjøberg DIK (2010) Are all code smells harmful? a study of god classes and brain classes in the evolution of three open source systems In: Proc. of the 26th International Conference on Software Maintenance (ICSM), 1–10.. IEEE Computer Society, Washington, DC. Padilha, J, Figueiredo E, Sant’Anna C, Garcia A (2013) Detecting god methods with concern metrics: An exploratory study In: Proc. of the 7th Latin-American Workshop on Aspect-Oriented Software Development (LA-WASP), co-allocated with CBSoft.. ACM, New York. Palomba, F, Bavota G, Penta MD, Oliveto R, Lucia AD (2014) Do they really smell bad? a study on developers’ perception of bad code smells In: Proc. of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME).. IEEE Computer Society, Washington, DC. Parnin, C, Görg C, Nnadi O (2008) A catalogue of lightweight visualizations to support code smell inspection In: Proc. of the 4th international Smposium on Software visualization (SOFTVIS), 77–86.. ACM, New York. Powers, DMW (2012) The problem with kappa In: Proc. of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 345–355.. Association for Computational Linguistics, Stroudsburg. Price, BA, Baecker RM, Small IS (1998) An introduction to software visualization. In: Stasko J, Dominique J, Brown M, Price B (eds)Software Visualization, 4–26.. MIT Press, London. Rapu, D, Ducasse S, Girba T, Marinescu R (2004) Using history information to improve design flaws detection In: Proc. of 8th European Conference on Software Maintenance and Reengineering (CSRM), 223–232.. IEEE Computer Society, Washington, DC. Riel, AJ (1996) Object-Oriented Design Heuristics. 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston. Romano, J, Kromrey J, Coraggio J, Skowronek J (2006) Appropriate statistics for ordinal level data: Should we really be using t-test and Cohen’sd for evaluating group differences on the NSSE and other surveys? In: In annual meeting of the Florida Association of Institutional Research, 1–3. Santos, JA, de Mendonça MG, dos Santos CP, Novais RL (2014) The problem of conceptualization in god class detection: agreement, strategies and decision drivers. J Softw Eng Res Dev (JSERD) 2(11):1–33. Santos, JA, Mendonça M (2014) Identifying strategies on god class detection in two controlled experiments In: Proc. of the 26th International Conference on Software Engineering and Knowledge Engineering (SEKE), 244–249.. Knowledge Systems Institute Graduate School, Skokie. Santos, JA, Mendonça M (2015) Exploring decision drivers on god class detection in three controlled experiments In: Proc. of the 30th ACM/SIGAPP Symposium On Applied Computing (SAC), 1–8.. ACM, New York. Santos, JA, Mendonça M, Silva C (2013) An exploratory study to investigate the impact of conceptualization in god class detection In: Proc. of the 17th International Conference on Evaluation and Assessment in Software Engineering (EASE), 48–59.. ACM, New York. Schumacher, J, Zazworka N, Shull F, Seaman C, Shaw M (2010) Building empirical support for automated code smell detection In: Proc. of the 4th International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–10.. ACM, New York. Simon, F, Steinbruckner F, Lewerentz C (2001) Metrics based refactoring In: Proc. of 5th European Conference on Software Maintenance and Reengineering (CSMR), 30–38.. IEEE Computer Society, Washington, DC. Sjøberg, D, Yamashita A, Anda B, Mockus A, Dyba T (2013) Quantifying the effect of code smells on maintenance effort. IEEE Trans Softw Eng 39(8):1144–1156. Van Emden, E, Moonen L (2002) Java quality assurance by detecting code smells In: Proc. of the 9th Working Conference on Reverse Engineering (WCRE), 97–106.. IEEE Computer Society, Washington, DC. Vinson, NG, Singer J (2008) A practical guide to ethical research involving humans. In: Shull F, Singer J, DIK Søberg (eds)Guide to Advanced Empirical Software Engineering, 229–256.. Springer, London. Whitehurst, GJ (1984) Interrater agreement for journal manuscript review. Am Psychol 39(1):22–28. Wohlin, C, Runeson P, Höst M, Ohlsson M, Regnell B, Wesslén A (2012) Experimentation in Software Engineering. Springer Berlin Heidelberg, Heidelberg. Yamashita, A, Counsell S (2013) Code smells as system-level indicators of maintainability: An empirical study. J Syst Softw 86(10):2639–2653. Yamashita, A, Moonen L (2013) Exploring the impact of inter-smell relations on software maintainability: An empirical study In: Proc. of the 35th International Conference on Software Engineering (ICSE), 682–691.. IEEE Press, Piscataway. Zhang, M, Hall T, Baddoo N (2011) Code bad smells: A review of current knowledge. Softw Maint Evol Res Pract 23(3):179–202.