Expert-driven trace clustering with instance-level constraints
Abstract
Within the field of process mining, several trace clustering approaches exist for partitioning traces, or process instances, into groups of similar behavior. Typically, this partitioning is based on certain patterns or on similarity between the traces, or it is driven by the discovery of a process model for each cluster. The main drawback of these techniques, however, is that their solutions are usually hard for domain experts to evaluate or justify. In this paper, we present two constrained trace clustering techniques that are capable of leveraging expert knowledge in the form of instance-level constraints. In an extensive experimental evaluation using two real-life datasets, we show that our novel techniques are indeed capable of producing clustering solutions that are more justifiable, without a substantial negative impact on their quality.
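As an illustration only: the abstract does not spell out the form the instance-level constraints take. A common formulation in the constrained clustering literature is pairwise must-link and cannot-link constraints between instances, and the sketch below assumes that form. The trace identifiers, the cluster assignment, and the function name `violated_constraints` are hypothetical and are not taken from the paper. The code only checks which expert constraints a given clustering breaks; it is not either of the two clustering techniques themselves.

```python
def violated_constraints(assignment, must_link, cannot_link):
    """Return the expert constraints that a cluster assignment breaks.

    assignment  -- dict mapping a trace id to its cluster label
    must_link   -- pairs of trace ids the expert wants in the same cluster
    cannot_link -- pairs of trace ids the expert wants in different clusters
    """
    broken = []
    for a, b in must_link:
        if assignment[a] != assignment[b]:
            broken.append(("must-link", a, b))
    for a, b in cannot_link:
        if assignment[a] == assignment[b]:
            broken.append(("cannot-link", a, b))
    return broken


# Hypothetical example: four traces split over two clusters.
assignment = {"t1": 0, "t2": 0, "t3": 1, "t4": 1}
must_link = [("t1", "t2")]                  # satisfied: both in cluster 0
cannot_link = [("t1", "t3"), ("t3", "t4")]  # second pair shares cluster 1

print(violated_constraints(assignment, must_link, cannot_link))
```

Under this assumption, a clustering that breaks fewer of the expert's constraints would be easier for that expert to justify, which is the property the abstract emphasizes.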
Keywords
#process mining #trace clustering #instance-level constraints #expert knowledge
