Local support-based partition algorithm for frequent pattern mining

Pattern Analysis and Applications - Tập 22 - Trang 1137-1147 - 2018

Vijayakumar Kadappa¹, Shivaraju Nagesh¹

¹Data Analytics Research Lab, Department of Computer Applications, B.M.S. College of Engineering, Bangalore, India

Tóm tắt

Frequent pattern (itemset) mining is one of the established approaches for knowledge discovery. Minimizing the number of database scans (I/O overhead) is a challenging task in Frequent itemset mining. Partition algorithm is one of the early novel approaches to reduce the database I/O overhead as compared to Apriori algorithm and other related methods. However, Partition algorithm suffers from a significant database I/O overhead (that is, it reads the database twice from the secondary storage) and higher time complexity for computation of frequent itemsets in large databases. In this work, an improved partition algorithm is proposed, which reads the database only once and makes use of local support information to avoid further scans of the database. The proposed algorithm outperforms Apriori and Partition algorithms and shows closer performance to FP-Growth algorithm, in terms of computational time. The proposed method outpaces FP-Growth algorithm in terms of memory usage and is competitive to other algorithms. In terms of database access time, the proposed method exhibits better performance over FP-Growth, Partition and Apriori methods.

Tài liệu tham khảo

Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: current status and future directions. Data Min Knowl Discov 15(1):55–86 Zaki MJ (1999) Parallel and distributed association mining: a survey. IEEE Concur 7(4):14–25 Boukerche A, Samarah S (2008) A novel algorithm for mining association rules in wireless ad hoc sensor networks. IEEE Trans Parallel Distrib Syst 19(7):865–877 Gu X, Zhu Y, Zhou S, Wang C, Qiu M, Wang G (2016) A real-time fpga-based accelerator for ecg analysis and diagnosis using association-rule mining. ACM Trans Embed Comput Syst 15(2):25:1–25:23 Vathsala H, Koolagudi SG (2017) Prediction model for peninsular indian summer monsoon rainfall using data mining and statistical approaches. Comput Geosci 98:55–63 Weng C-H (2016) Discovering highly expected utility itemsets for revenue prediction. Knowl Based Syst 104:39–51 Qu Z, Keeney J, Robitzsch S, Zaman F, Wang X (2016) Multilevel pattern mining architecture for automatic network monitoring in heterogeneous wireless communication networks. China Commun 13(7):108–116 Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases, VLDB, Santiago, Chile, pp 487–499 Pujari AK (2013) Data mining techniques, 2nd edn. Universities Press (India) Pvt. Ltd., Hyderabad Brin S, Motwani R, Ullman J, Tsur S (1997) Dynamic itemset counting and implication rules for market basket data. ISIGMOD Record 6(2):255–264 Lin DI, Kedem ZM (2002) Pincer-search: an efficient algorithm for discovering the maximum frequent set. IEEE Trans Knowl Data Eng 14(3):553–566 Savasere A, Omiecinski E, Navathe SB (1995) An efficient algorithm for mining association rules in large databases. In: Proceedings of the 21th international conference on very large data bases, ser. VLDB’95, pp 432–444 Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, ser. SIGMOD’00. ACM, New York, NY, pp 1–12 Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390 Belbachir K, Belbachir H (2012) The parallelization of algorithm based on partition principle for association rules discovery. In 2012 international conference on multimedia computing and systems, May 2012, pp 1–6 Amphawan K, Lenca P, Surarerks A (2012) Mining top-k regular-frequent itemsets using database partitioning and support estimation. Expert Syst Appl 39(2):1924–1936 Liu C, Liu Y (2008) Two phases association rule mining of remote sensing images based partition and bitset. In: 2008 IEEE international symposium on knowledge acquisition and modeling workshop, Dec 2008, pp 161–164 Subramanyam RBV, Suvvari SR (2012) Partition-based approach for fast mining of transitional patterns. In: 2012 UKSim 14th international conference on computer modelling and simulation, Mar 2012, pp 151–155 Lee C-H, Chen M-S, Lin C-R (2003) Progressive partition miner: an efficient algorithm for mining general temporal association rules. IEEE Trans Knowl Data Eng 15(4):1004–1017 Li Z, Liu J, Tang J, Lu H (2015) Robust structured subspace learning for data representation. IEEE Trans Pattern Anal Mach Intell 37(10):2085–2098 Li Z, Liu J, Yang Y, Zhou X, Lu H (2014) Clustering-guided sparse structural learning for unsupervised feature selection. IEEE Trans Knowl Data Eng 26(9):2138–2150 FIM-Repository. Frequent itemset mining dataset repository. http://fimi.ua.ac.be/data/, Visited in Dec 2017 pyfpgrowth 1.0 A python implementation of the frequent pattern growth algorithm. https://pypi.python.org/pypi/pyfpgrowth. Visited in Dec 2017 Koshy T (2004) Discrete mathematics with applications. Academic Press, Burlington

Scholar Hub - Công cụ hỗ trợ trích dẫn và phân tích khoa học Việt Nam

Scholar Hub là công cụ hỗ trợ trích dẫn và phân tích ảnh hưởng của các bài báo, công bố khoa học Việt Nam và Quốc tế.
ScholarHub KHÔNG đăng thông tin tổng hợp, KHÔNG đăng lại nội dung từ các trang báo chí Việt Nam hoặc trang thông tin điện tử khác tại Việt Nam.

Thông tin, cập nhật

Đăng ký Tạp chí tham gia Scholar Hub

Phản hồi ý kiến về Scholar Hub

Bài viết, nội dung cập nhật

Chủ đề khoa học

Website liên kết

Hệ thống CSDL Khoa học & Công nghệ SciBase

Phần mềm kiểm tra trùng lặp Kiểm Tra Tài Liệu

Phần mềm xuất bản tạp chí điện tử VOJS

Hệ thống hội thảo khoa học Việt Nam

Nền tảng trắc nghiệm và đề thi đa lĩnh vực LetQA

Thông tin liên hệ & hỗ trợ

Đơn vị chủ quản, phát triển và vận hành: Công ty Cổ phần Metis

Địa chỉ liên hệ: 26A Lê Đức Thọ, Phường Từ Liêm, Thành phố Hà Nội

Số giấy chứng nhận ĐKKD: 0109293202 cấp ngày 03/08/2020 tại Sở Kế hoạch và Đầu tư thành phố Hà Nội

Người quản lý và chịu trách nhiệm nội dung: Nguyễn Ngọc Sơn

Hotline: 0566.685.688

Email: [email protected]