A Multi-Core Approach to Efficiently Mining High-Utility Itemsets in Dynamic Profit Databases

IEEE Access - Volume 8 - Pages 85890-85899 - 2020
Bay Vo (1), Loan T.T. Nguyen (2,3), Trinh D.D. Nguyen (4), Philippe Fournier-Viger (5), Unil Yun (6)
(1) Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam
(2) School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam
(4) Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
(5) School of Humanities and Social Sciences, Harbin Institute of Technology (Shenzhen), China
(6) Department of Computer Engineering, Sejong University, Seoul, South Korea

Abstract

Analyzing customer transactions to discover high-utility itemsets (HUIs) is a popular task: it consists of finding sets of items that are purchased together and yield a high profit. However, many studies assume that transactional data is static, whereas in real life it changes over time. For example, the unit profits of items may vary from one week to the next because sale prices and production costs change. Many HUI mining algorithms ignore this property and are therefore inapplicable to real data or produce inaccurate results on it. To address this issue, this paper proposes a novel algorithm named Multi-Core HUI Miner (MCH-Miner). It adapts techniques introduced in the iMEFIM algorithm to a parallel multi-core architecture in order to mine HUIs efficiently in dynamic transaction databases. An empirical evaluation shows that, in most cases, MCH-Miner is significantly faster than iMEFIM and that the cost of database scans is reduced.
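
To make the task concrete, the following minimal sketch shows how the utility of an itemset can be computed when unit profits change from one period to another. It is a naive brute-force illustration only, not MCH-Miner or iMEFIM: the toy transactions, the per-period profit table, and the function names `itemset_utility` and `naive_high_utility_itemsets` are all assumptions introduced here, and the enumeration is exponential in the number of items.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is (period, {item: purchased quantity}).
transactions = [
    (1, {"a": 2, "b": 1}),
    (1, {"a": 1, "c": 3}),
    (2, {"a": 1, "b": 2, "c": 1}),
]

# Hypothetical unit profits that differ between period 1 and period 2.
unit_profit = {
    1: {"a": 5, "b": 2, "c": 1},
    2: {"a": 3, "b": 4, "c": 1},
}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset,
    using the unit profit that was valid in that transaction's period."""
    total = 0
    for period, quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[i] * unit_profit[period][i] for i in itemset)
    return total


def naive_high_utility_itemsets(transactions, unit_profit, min_util):
    """Enumerate every non-empty itemset and keep those with utility >= min_util.
    Brute force for illustration; real miners prune the search space instead."""
    items = sorted({item for _, quantities in transactions for item in quantities})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_util:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(naive_high_utility_itemsets(transactions, unit_profit, min_util=12))
```

In this toy data, item `a` earns 5 per unit in period 1 but only 3 in period 2, so a method that applied a single static profit table to all transactions would report different utilities. That is the kind of inaccuracy the abstract refers to.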

Keywords

#Data mining #high utility itemset #dynamic profit #parallel #multithread
