SRIHASS - a similarity measure for discovery of hidden time profiled temporal associations

Multimedia Tools and Applications - Volume 77 - Pages 17643-17692 - 2017

Vangipuram Radhakrishna¹, Puligadda Veereswara Kumar², Vinjamuri Janaki³

¹Department of Information Technology, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India

²Department of Computer Science and Engineering, Acharya Institute of Technology, Bangalore, India

³Department of Computer Science and Engineering, Vaagdevi College of Engineering, Warangal, India

Abstract

Mining and visualization of time profiled temporal associations is an important research problem that remains understudied and has not been addressed from a wider perspective. Visual analysis of time profiled temporal associations helps to better understand hidden seasonal, emerging, and diminishing temporal trends. The pioneering work by Yoo and Shekhar, termed SPAMINE, applied the Euclidean distance measure, and subsequent studies were restricted to the use of Euclidean distance. However, as the number of time slots increases, the dimensionality of the prevalence time sequence of a temporal association also increases, and this high dimensionality makes the Euclidean distance unsuitable. Some of our previous studies proposed Gaussian based dissimilarity measures and prevalence estimation approaches to discover time profiled temporal associations. To the best of our knowledge, no research has addressed a similarity measure that is based on the standard score and normal probability, finds the similarity between temporal patterns in z-space, and retains monotonicity. Our research is pioneering work in this direction. This research makes three contributions. First, we introduce a novel similarity (or dissimilarity) measure, SRIHASS, to find the similarity between temporal associations. The basic idea behind the design of the dissimilarity measure is to transform support values of temporal associations onto z-space and then obtain probability sequences of temporal associations using a normal distribution chart. The dissimilarity measure uses these probability sequences to estimate the similarity between patterns in z-space. The second contribution is the prevalence bound estimation approach. Finally, we give the algorithm for time profiled association mining called Z-SPAMINE, which is primarily inspired by SPAMINE. Experimental results show that our approach, Z-SPAMINE, is computationally more efficient and scalable than existing approaches such as Naïve, Sequential, and SPAMINE, which applies the Euclidean distance.
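The abstract describes the measure only in outline, so the following is a minimal Python sketch of the general idea and not the published SRIHASS formula. It assumes, purely for illustration, that each support value is standardized against the corresponding reference support with a user-chosen deviation `sigma`, that the normal distribution chart is replaced by the standard normal CDF, and that the resulting probability sequence is aggregated by a mean of `|2p - 1|`. The function names, the `sigma` parameter, and the aggregation step are all hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_sequence(supports, reference, sigma):
    """Map support values to z-space, then to normal probabilities.

    Assumption: z_i = (support_i - reference_i) / sigma, where sigma is a
    user-chosen deviation. The abstract does not state the exact transform.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if len(supports) != len(reference) or not supports:
        raise ValueError("sequences must be non-empty and of equal length")
    return [normal_cdf((s - r) / sigma) for s, r in zip(supports, reference)]


def dissimilarity(supports, reference, sigma=0.1):
    """Hypothetical aggregation of the probability sequence into [0, 1].

    A probability of 0.5 means the support equals the reference at that
    time slot, so |2p - 1| is 0 there and grows towards 1 as the two
    values move apart. This aggregation is an illustrative assumption.
    """
    probs = probability_sequence(supports, reference, sigma)
    return sum(abs(2.0 * p - 1.0) for p in probs) / len(probs)


if __name__ == "__main__":
    reference = [0.4, 0.5, 0.6]   # prevalence time sequence of a reference
    near = [0.45, 0.5, 0.55]      # candidate close to the reference
    far = [0.9, 0.1, 0.9]         # candidate far from the reference
    print(dissimilarity(near, reference))
    print(dissimilarity(far, reference))
```

Under these assumptions the value stays bounded in [0, 1] regardless of the number of time slots, which is the kind of property the abstract contrasts with Euclidean distance in high dimensions.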

References

Agrawal R, Shafer JC (1996) Parallel mining of association rules. IEEE Trans Knowl Data Eng 8(6):962–969 https://doi.org/10.1109/69.553164
Agrawal R, Srikant R (1994) Fast Algorithms for Mining Association Rules in Large Databases. In: Bocca JB, Jarke M, Zaniolo C (eds) Proceedings of the 20th International Conference on Very Large Data Bases (VLDB ‘94). Morgan Kaufmann Publishers Inc., San Francisco, pp 487–499
Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. SIGMOD Rec 22(2):207–216 https://doi.org/10.1145/170036.170072
Ale JM, Rossi GH (2000) An approach to discovering temporal association rules. In: Carroll J, Damiani E, Haddad H, Oppenheim D (eds) Proceedings of the 2000 ACM symposium on Applied computing - Volume 1 (SAC ‘00), vol 1. ACM, New York, pp 294–300 https://doi.org/10.1145/335603.335770
Aljawarneh S, Radhakrishna V, Kumar PV, Janaki V (2016) A similarity measure for temporal pattern discovery in time series data generated by IoT. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–4 https://doi.org/10.1109/ICEMIS.2016.7745355
Aljawarneh SA, Elkobaisi MR, Maatuk AM (2016) A new agent approach for recognizing research trends in wearable systems. Computers & Electrical Engineering, Available online 16 December 2016, https://doi.org/10.1016/j.compeleceng.2016.12.003
Aljawarneh SA, Moftah RA, Maatuk AM (2016) Investigations of automatic methods for detecting the polymorphic worms signatures. Futur Gener Comput Syst 60:67–77 ISSN 0167-739X, https://doi.org/10.1016/j.future.2016.01.020
Aljawarneh SA, Radhakrishna V, Kumar PV, Janaki V (2017) G-SPAMINE: An approach to discover temporal association patterns and trends in internet of things. Futur Gener Comput Syst 74:430–443 ISSN 0167-739X, https://doi.org/10.1016/j.future.2017.01.013
Aljawarneh S, Aldwairi M, Yassein MB (2017) Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J Comput Sci ISSN 1877-7503, https://doi.org/10.1016/j.jocs.2017.03.006
Bettini C, Wang XS, Jajodia S, Lin JL (1998) Discovering frequent event patterns with multiple granularities in time sequences. IEEE Trans Knowl Data Eng 10(2):222–237 https://doi.org/10.1109/69.683754
Borgelt C (2005) Keeping things simple: finding frequent item sets by recursive elimination. In: Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations (OSDM ‘05). ACM, New York, pp 66–70. https://doi.org/10.1145/1133905.1133914
Chen X, Petrounias I (2000) Discovering temporal association rules: algorithms, language and system. In: Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073), pp 306–306. https://doi.org/10.1109/ICDE.2000.839423
Chen X, Petrounias I (1999) Mining temporal features in association rules. In: Żytkow JM, Rauch J (eds) Principles of data mining and knowledge discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg
Chen YC, Peng WC, Lee SY (2015) Mining Temporal Patterns in Time Interval-Based Data. IEEE Trans Knowl Data Eng 27(12):3318–3331 https://doi.org/10.1109/TKDE.2015.2454515
Cheruvu A, Radhakrishna V (2016) Estimating temporal pattern bounds using negative support computations. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–4 https://doi.org/10.1109/ICEMIS.2016.7745352
Cheung D, Han J, Ng V, Wong CY (1996) Maintenance of discovered association rules in large databases: an incremental updating technique. Proc. 1996 Int’l Conf. Data Eng, pp 106–114. https://doi.org/10.1109/ICDE.1996.492094
Cohen E, Datar M, Fujiwara S, Gionis A, Indyk P, Motwani R, Ullman JD, Cheng Y (2001) Finding Interesting Associations without Support Pruning. IEEE Trans on Knowl and Data Eng 13(1):64–78 https://doi.org/10.1109/69.908981
Dong G, Li J (1999) Efficient mining of emerging patterns: discovering trends and differences. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ‘99). ACM, New York, pp 43–52. https://doi.org/10.1145/312129.312191
Han J, Fu Y (1995) Discovery of multiple-level association rules from large databases. In: Dayal U, PMD G, Nishio S (eds) Proceedings of the 21th International Conference on Very Large Data Bases (VLDB ‘95). Morgan Kaufmann Publishers Inc., San Francisco, pp 420–431
Han J, Dong G, Yin Y (1999) Efficient mining of partial periodic patterns in time series database. In: Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337), Sydney, pp 106–115. https://doi.org/10.1109/ICDE.1999.754913
Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Disc 8(1):53–87. Kluwer Academic Publishers. https://doi.org/10.1023/B:DAMI.0000005258.31418.83
Imran A, Aljawarneh SA, Sakib K Web Data Amalgamation for Security Engineering: Digital Forensic Investigation of Open Source Cloud. J Univers Comput Sci 22(4):494–520 https://doi.org/10.3217/jucs-022-04-0494
Jiang JY, Liou RJ, Lee SJ (2011) A Fuzzy Self-Constructing Feature Clustering Algorithm for Text Classification. IEEE Trans Knowl Data Eng 23(3):335–349 https://doi.org/10.1109/TKDE.2010.122
Kumar GR, Mangathayaru N, Narasimha G (2015) An improved k-means clustering algorithm for intrusion detection using gaussian function. In: Proceedings of the The International Conference on Engineering & MIS 2015 (ICEMIS ‘15). ACM, New York, pp 69:1–69:7. https://doi.org/10.1145/2832987.2833082
Kumar GR, Mangathayaru N, Narasimha G (2016a) An approach for intrusion detection using novel Gaussian based Kernel function. J Univers Comput Sci 22(4):589–604. https://doi.org/10.3217/jucs-022-04-0589
Kumar GR, Mangathayaru N, Narsimha G (2016b) Design of novel fuzzy distribution function for dimensionality reduction and intrusion detection. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–6 https://doi.org/10.1109/ICEMIS.2016.7745346
Kumar GR, Mangathayaru N, Gugulothu N, Suresh Reddy G (2016) CLAPP: A self constructing feature clustering approach for anomaly detection, Future Generation Computer Systems, Available online 4 January 2017, ISSN 0167-739X, https://doi.org/10.1016/j.future.2016.12.040
Last M, Klein Y, Kandel A (2001) Knowledge discovery in time series databases. IEEE Trans Syst Man Cybern Part B Cybern 31(1):160–169 https://doi.org/10.1109/3477.907576
Lee W-J, Lee S-J (2004) Discovery of fuzzy temporal association rules. IEEE Trans Syst Man Cybern Part B Cybern 34(6):2330–2342 https://doi.org/10.1109/TSMCB.2004.835352
Lee C-H, Lin C-R, Chen M-S (2001) Sliding-window filtering: an efficient algorithm for incremental mining. In: Paques H, Liu L, Grossman D (eds) Proceedings of the tenth international conference on Information and knowledge management (CIKM ‘01). ACM, New York, pp 263–270 https://doi.org/10.1145/502585.502630
Lee C-H, Chen M-S, Lin C-R (2003) Progressive partition miner: an efficient algorithm for mining general temporal association rules. IEEE Trans Knowl Data Eng 15(4):1004–1017 https://doi.org/10.1109/TKDE.2003.1209015
Li Y, Ning P, Wang XS, Jajodia S (2001) Discovering calendar-based temporal association rules. In: Proceedings Eighth International Symposium on Temporal Representation and Reasoning. TIME 2001, Cividale del Friuli, pp. 111–118. https://doi.org/10.1109/TIME.2001.930706
Li Y, Ning P, Wang XS, Jajodia S (2003) Discovering calendar-based temporal association rules, data & knowledge engineering, Volume 44, Issue 2, Pages 193-218, ISSN 0169-023X, https://doi.org/10.1016/S0169-023X(02)00135-0
Lin YS, Jiang JY, Lee SJ (2014) A Similarity Measure for Text Classification and Clustering. IEEE Trans Knowl Data Eng 26(7):1575–1590 https://doi.org/10.1109/TKDE.2013.19
Lind DA, Marchal WG, Wathen SA (2004) Statistical techniques in business and economics, 12e: Chapter 7: Continuous Probability Distributions. The McGraw-Hill Companies, New York
Liu B, Hsu W, Ma Y (1999) Mining association rules with multiple minimum supports. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ‘99). ACM, New York, pp 337–341. https://doi.org/10.1145/312129.312274
Ozden B, Ramaswamy S, Silberschatz A (1998) Cyclic association rules. In: Proceedings 14th International Conference on Data Engineering, pp 412–421. https://doi.org/10.1109/ICDE.1998.655804
Park JS, Yu PS, Chen M-S (1997) Mining association rules with adjustable accuracy. In: Proceedings of the sixth international conference on Information and knowledge management (CIKM ‘97). ACM, New York, 151–160. https://doi.org/10.1145/266714.266886
Radhakrishna V, Kumar PV, Janaki V (2016) A computationally optimal approach for extracting similar temporal patterns. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–6 https://doi.org/10.1109/ICEMIS.2016.7745344
Radhakrishna V, Kumar PV, Janaki V (2016) Mining of outlier temporal patterns. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–6 https://doi.org/10.1109/ICEMIS.2016.7745343
Radhakrishna V, Kumar PV, Janaki V, Aljawarneh S (2016) A similarity measure for outlier detection in timestamped temporal databases. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–5 https://doi.org/10.1109/ICEMIS.2016.7745347
Radhakrishna V, Kumar PV, Janaki V (2016) Looking into the possibility of novel dissimilarity measure to discover similarity profiled temporal association patterns in IoT. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–5 https://doi.org/10.1109/ICEMIS.2016.7745353
Radhakrishna V, Kumar PV, Janaki V, Aljawarneh S (2016) A computationally efficient approach for temporal pattern mining in IoT. 2016 International Conference on Engineering & MIS (ICEMIS), Agadir, pp 1–4 https://doi.org/10.1109/ICEMIS.2016.7745354
Radhakrishna V, Aljawarneh SA, Kumar PV, Janaki V (2017) A novel fuzzy similarity measure and prevalence estimation approach for similarity profiled temporal association pattern mining, future generation computer systems, Available online 14 March 2017, ISSN 0167-739X, https://doi.org/10.1016/j.future.2017.03.016
Radhakrishna V, Kumar PV, Janaki V (2017) Design and analysis of similarity measure for discovering similarity profiled temporal association patterns. IADIS International Journal on Computer Science and Information Systems 12(1):45–60 http://www.iadisportal.org/ijcsis/papers/2017200104.pdf
Radhakrishna V, Kumar PV, Janaki V, Cheruvu A (2017) A dissimilarity measure for mining similar temporal association patterns. IADIS International Journal on Computer Science and Information Systems 12(1):126–142 http://www.iadisportal.org/ijcsis/papers/2017200109.pdf
Radhakrishna V, Kumar PV, Janaki V (2017) Normal distribution based similarity profiled temporal association pattern mining (N-SPAMINE). Database Systems Journal 7(3):22–33
Radhakrishna V, Kumar PV, Janaki V (2017) A Novel Similar Temporal System Call Pattern Mining for Efficient Intrusion Detection. J Univers Comput Sci 22(4):475–493 https://doi.org/10.3217/jucs-022-04-0475
Radhakrishna V, Kumar PV, Janaki V (2017) A computationally efficient approach for mining similar temporal patterns. In: Matoušek R (ed) Recent advances in soft computing. ICSC-MENDEL 2016. Advances in intelligent systems and computing, vol 576. Springer, Cham
Radhakrishna V, Kumar PV, Janaki V, Rajasekhar N (2017) Estimating prevalence bounds of temporal association patterns to discover temporally similar patterns. In: Matoušek R (ed) Recent advances in soft computing. ICSC-MENDEL 2016. Advances in intelligent systems and computing, vol 576. Springer, Cham. https://doi.org/10.1007/978-3-319-58088-3_20
Ramaswamy S, Mahajan S, Silberschatz A (1998) On the discovery of interesting patterns in association rules. In: Gupta A, Shmueli O, Widom J (eds) Proceedings of the 24rd International Conference on Very Large Data Bases (VLDB ‘98). Morgan Kaufmann Publishers Inc., San Francisco, pp 368–379
Srikant R, Agrawal R (1995) Mining generalized association rules. In: Proceedings of the 21th international conference on very large data bases (VLDB ‘95). Morgan Kaufmann Publishers Inc., San Francisco, pp 407–419
Srikant R, Agrawal R (1996) Mining quantitative association rules in large relational tables. In: Proceedings of the 1996 ACM SIGMOD international conference on management of data (SIGMOD ‘96). ACM, New York, pp 1–12. https://doi.org/10.1145/233269.233311
Srikant R, Agrawal R (1997) Mining generalized association rules. Futur Gener Comput Syst 13(2):161–180. https://doi.org/10.1016/S0167-739X(97)00019-8
Tung AKH, Ng RT, Lakshmanan LVS, Han J (2001) Constraint-based clustering in large databases. In: Proceedings of the 8th international conference on database theory (ICDT ‘01). Springer, Verlag, 405–419
Radhakrishna V, Aljawarneh SA, Kumar PV, Choo KKR (2016) A novel fuzzy gaussian-based dissimilarity measure for discovering similarity temporal association patterns. Soft Comput: 1–17. https://doi.org/10.1007/s00500-016-2445-y
Villafane R, Hua KA, Tran D, Maulik B (1999) Mining interval time series. In: Mohania M, Tjoa AM (eds) DataWarehousing and Knowledge Discovery. DaWaK 1999. Lecture Notes in Computer Science, vol 1676. Springer, Berlin https://doi.org/10.1007/3-540-48298-9_34
Yang C, Fayyad U, Bradley PS (2001) Efficient discovery of error-tolerant frequent itemsets in high dimensions. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining (KDD ‘01). ACM, New York, 194–203. https://doi.org/10.1145/502512.502539
Yoo JS (2012) Temporal data mining: similarity-profiled association pattern. In: Holmes DE, Jain LC (eds) Data mining: foundations and intelligent paradigms. Intelligent systems reference library, vol 23. Springer, Berlin https://doi.org/10.1007/978-3-642-23166-7_3
Yoo JS, Shekhar S (2008) Mining Temporal Association Patterns under a Similarity Constraint. In: Ludäscher B, Mamoulis N (eds) Scientific and Statistical Database Management. SSDBM 2008. Lecture Notes in Computer Science, vol 5069. Springer, Berlin https://doi.org/10.1007/978-3-540-69497-7_26
Yoo JS, Shekhar S (2009) Similarity-Profiled Temporal Association Mining. IEEE Trans Knowl Data Eng 21(8):1147–1161 https://doi.org/10.1109/TKDE.2008.185
Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390 https://doi.org/10.1109/69.846291
Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ‘03). ACM, New York, p 326–335. https://doi.org/10.1145/956750.956788
Zhuang DEH, Li GCL, Wong AKC (2014) Discovery of temporal associations in multivariate time series. IEEE Trans Knowl Data Eng 26(12):2969–2982 https://doi.org/10.1109/TKDE.2014.2310219
