GitHub Discussions: An exploratory study of early adoption

Empirical Software Engineering - Volume 27 - Pages 1-32 - 2021
Hideaki Hata¹, Nicole Novielli², Sebastian Baltes³, Raula Gaikovina Kula⁴, Christoph Treude⁵
¹ Shinshu University, Nagano, Japan
² University of Bari, Bari, Italy
³ QAware GmbH, University of Adelaide, Adelaide, Australia
⁴ Nara Institute of Science and Technology, Ikoma, Japan
⁵ University of Melbourne, Parkville, Australia

Abstract

Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before becoming available to all projects in December 2020, it was tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts development processes, we conducted a mixed-methods study based on early adopters of GitHub Discussions from January until July 2020. We found that: (1) errors, unexpected behavior, and code reviews are prevalent discussion categories; (2) there is a positive relationship between project member involvement and discussion frequency; (3) developers consider GitHub Discussions useful but face the problem of topic duplication between Discussions and Issues; (4) Discussions play a crucial role in advancing the development of projects; and (5) positive sentiment in Discussions is more frequent than in Stack Overflow posts. Our findings are a first step towards data-informed guidance for using GitHub Discussions, opening up avenues for future work on this novel communication channel.
