Scalable graph-based OLAP analytics over process execution data

Amin Beheshti¹, Boualem Benatallah¹, Hamid Reza Motahari-Nezhad¹
¹ School of Computer Science and Engineering, University of New South Wales, Sydney, Australia

Abstract

Keywords

