Graph-based feature extraction on object-centric event logs
International Journal of Data Science and Analytics - Pages 1-17 - 2023
Abstract
Process mining techniques have proven crucial in identifying performance and compliance issues. Traditional process mining, however, is primarily case-centric and does not fully capture the complexity of real-life information systems, leading to a growing interest in object-centric process mining. This paper presents a novel graph-based approach for feature extraction from object-centric event logs. In contrast to established methods for feature extraction from traditional event logs, object-centric logs present a greater challenge due to the interconnected nature of events related to multiple objects. This paper addresses this gap by proposing techniques and tools for feature extraction specifically designed for object-centric event logs. In this work, we focus on features pertaining to the lifecycle of the objects and their interaction. These features enable a more comprehensive understanding of the process and its inherent complexities. We demonstrate the applicability of our approach through its implementation in two significant areas: anomaly detection and throughput time prediction for objects in the process. Our results, based on four problems in a Procure-to-Pay process, affirm the potential of our proposed features in enhancing the scope of process mining. By effectively transforming object-centric event logs into numeric vectors, we pave the way for the application of a broader range of machine learning techniques, such as classification, prediction, clustering, and anomaly detection, thereby extending the capabilities of process mining.
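As a rough illustration of what transforming an object-centric event log into numeric vectors can look like, the sketch below derives three simple per-object features from a toy log: the number of events in the object's lifecycle, the lifecycle duration in hours, and the number of distinct objects it shares events with. This is not the paper's implementation. The log layout, the `extract_features` function, and the choice of features are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy log (not from the paper): each event lists the objects it touches.
EVENTS = [
    {"activity": "Create Purchase Order", "time": "2023-01-01T09:00:00",
     "objects": ["po1", "item1", "item2"]},
    {"activity": "Receive Goods", "time": "2023-01-03T09:00:00",
     "objects": ["item1", "po1"]},
    {"activity": "Receive Invoice", "time": "2023-01-04T09:00:00",
     "objects": ["po1", "inv1"]},
    {"activity": "Pay Invoice", "time": "2023-01-06T09:00:00",
     "objects": ["inv1"]},
]


def extract_features(events):
    """Map each object id to [event count, lifecycle hours, related-object count]."""
    # Group events by the objects they reference; this gives each object's lifecycle.
    per_object = defaultdict(list)
    for ev in events:
        ts = datetime.fromisoformat(ev["time"])
        for obj in ev["objects"]:
            per_object[obj].append((ts, ev))

    features = {}
    for obj, entries in per_object.items():
        entries.sort(key=lambda entry: entry[0])  # order the lifecycle by time
        times = [ts for ts, _ in entries]
        # Objects that co-occur in at least one event are treated as interacting.
        related = {o for _, ev in entries for o in ev["objects"]} - {obj}
        features[obj] = [
            len(entries),
            (times[-1] - times[0]).total_seconds() / 3600.0,
            len(related),
        ]
    return features


if __name__ == "__main__":
    for obj, vector in sorted(extract_features(EVENTS).items()):
        print(obj, vector)
```

Vectors of this kind can then be passed to standard machine learning tooling, for example an anomaly detector or a regressor for throughput time, which is the kind of downstream use the abstract describes.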