
Improving bug report triage performance using an artificial intelligence based document generation model

Human-centric Computing and Information Sciences, Volume 10, Issue 1, 2020

Dong-Gun Lee¹, Yeong-Seok Seo¹

¹Department of Computer Engineering, Yeungnam University, Gyeongsan, Republic of Korea

Abstract

Artificial intelligence is one of the key technologies driving the fourth industrial revolution. It also has a significant impact on software professionals, who continually strive to deliver high-quality software by fixing many kinds of software bugs. During software development and maintenance, bugs are a major factor affecting the cost and delivery time of software. To fix bugs efficiently, open bug repositories are used to identify bug reports, and to classify and prioritize them for assignment to the most suitable developers based on their interest and expertise. Because resources such as time and manpower are limited, this bug report triage process is extremely important in software development. To improve bug report triage performance, many studies have focused on latent Dirichlet allocation (LDA) combined with k-nearest neighbors or a support vector machine. Although these approaches have improved the accuracy of bug triage, they often cause conflicts between the combined techniques and produce incorrect triage results. In this study, we propose a method that improves bug report triage performance by using multiple LDA-based topic sets, obtained by improving LDA. The proposed method enhances the existing LDA topic sets by building two adjunct topic sets. In our experiment, we collected bug reports from a popular bug tracking system, Bugzilla, as well as Android bug reports, to evaluate the proposed method and to demonstrate that it achieves two goals: higher bug report triage accuracy and compatibility with other state-of-the-art approaches.
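As a rough illustration of the baseline combination the abstract mentions (topic distributions from LDA fed into a k-nearest-neighbors vote), the sketch below assigns a new report to the developer who fixed the most similar past reports. It is not the authors' proposed method and does not build the adjunct topic sets. The topic vectors are hand-written stand-ins for LDA output, and the names `cosine`, `knn_assign`, `history`, and the developer labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_assign(query, labeled, k=3):
    """Return the developer most common among the k past reports
    whose topic vectors are most similar to the query vector."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(dev for _, dev in ranked[:k])
    return votes.most_common(1)[0][0]


# Assumed input: per-report topic distributions (3 topics) and the developer
# who fixed each report. In practice these would come from an LDA model.
history = [
    ([0.80, 0.10, 0.10], "alice"),
    ([0.70, 0.20, 0.10], "alice"),
    ([0.10, 0.80, 0.10], "bob"),
    ([0.20, 0.70, 0.10], "bob"),
    ([0.10, 0.10, 0.80], "carol"),
]

new_report = [0.75, 0.15, 0.10]
assigned = knn_assign(new_report, history, k=3)
print(assigned)
```

Cosine similarity is used here only because topic distributions are non-negative vectors of equal length; other distance measures would fit the same structure.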
