
Mining Inter-Sequence Patterns with Itemset Constraints

Springer Science and Business Media LLC - Volume 53 - Pages 19827-19842 - 2023

Anh Nguyen¹, Ngoc-Thanh Nguyen¹,², Loan T.T. Nguyen³,⁴, Bay Vo⁵

¹Department of Applied Informatics, Wroclaw University of Science and Technology, Wrocław, Poland

²Faculty of Information Technology, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam

³School of Computer Science and Engineering, International University, Ho Chi Minh City, Viet Nam

⁴Vietnam National University Ho Chi Minh City, Viet Nam

⁵Faculty of Information Technology, HUTECH University, Ho Chi Minh City, Vietnam

Abstract

Nowadays, raw data is rarely used directly. In real-world applications, data is usually processed and the required knowledge is extracted according to the user's purpose. Applying constraints in pattern mining is an important way to reduce the number of resulting patterns, which helps decision support systems operate efficiently. In 2018, a constraint-based method was developed for mining inter-sequence patterns. However, that method handles only constraints on single items. This work addresses the task of mining inter-sequence patterns under itemset constraints. We propose the DBV-ISPMIC algorithm, built on the DBV-PatternList structure, to mine inter-sequence patterns with itemset constraints. The proposed algorithm uses an organized search tree whose nodes are stored as dynamic bit vectors, so that the support of a pattern can be computed quickly. In addition, we establish a property and, based on it, propose an improved algorithm that reduces the number of candidates to be checked. Finally, we develop pDBV-ISPMIC, a parallel version of DBV-ISPMIC. Experimental evaluations show that DBV-ISPMIC outperforms post-processing algorithms on the test databases, and that pDBV-ISPMIC has a shorter runtime than DBV-ISPMIC.
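The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind support counting with dynamic bit vectors. It is not the authors' DBV-ISPMIC or DBV-PatternList. The class name `DBV`, its fields, and the helper `satisfies_itemset_constraint` are hypothetical names introduced for illustration. The constraint semantics (a pattern qualifies if it contains at least one whole constraint itemset, with the pattern flattened to a set of items) are an assumption and may differ from the paper's definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DBV:
    """Bit vector with leading and trailing zero bytes trimmed (illustrative)."""
    start: int    # index of the first stored byte in the full vector
    data: bytes   # stored bytes; empty when no bit is set

    @staticmethod
    def from_ids(ids):
        """Build a DBV whose bit i is set for every transaction id i in ids."""
        ids = set(ids)
        if not ids:
            return DBV(0, b"")
        lo, hi = min(ids) // 8, max(ids) // 8
        buf = bytearray(hi - lo + 1)
        for i in ids:
            buf[i // 8 - lo] |= 1 << (i % 8)
        return DBV(lo, bytes(buf))

    def __and__(self, other):
        """Intersect two DBVs, touching only the bytes where they overlap."""
        lo = max(self.start, other.start)
        hi = min(self.start + len(self.data), other.start + len(other.data))
        if lo >= hi:
            return DBV(0, b"")
        a = self.data[lo - self.start:hi - self.start]
        b = other.data[lo - other.start:hi - other.start]
        out = bytes(x & y for x, y in zip(a, b))
        # Trim zero bytes again so the result stays compact.
        stripped = out.lstrip(b"\x00")
        lead = len(out) - len(stripped)
        stripped = stripped.rstrip(b"\x00")
        return DBV(lo + lead, stripped) if stripped else DBV(0, b"")

    def support(self):
        """Number of set bits, i.e. number of supporting transactions."""
        return sum(bin(byte).count("1") for byte in self.data)


def satisfies_itemset_constraint(pattern_items, constraint_itemsets):
    """Assumed semantics: the pattern contains at least one whole itemset."""
    items = set(pattern_items)
    return any(set(c) <= items for c in constraint_itemsets)


# Transactions containing pattern A and pattern B, respectively.
a = DBV.from_ids([1, 9, 40])
b = DBV.from_ids([9, 40, 70])
joined = a & b            # transactions that contain both patterns
print(joined.support())   # support of the joined candidate
```

In this sketch, trimming the zero bytes means an intersection only processes the region where both vectors can have set bits, which is the kind of saving a dynamic bit vector is meant to provide. Checking the constraint before computing the intersection would avoid counting support for candidates that cannot qualify.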

Keywords

#pattern mining #constraints #inter-sequence patterns #algorithms #decision support #data mining

