XStar: a software system for handling taxi trajectory big data

Computational Urban Science - Volume 1 - Pages 1-13 - 2021
Xiang Li1, Joseph Mango1,2, Jiajia Song1, Di Zhang1
1Key Laboratory of Geographic Information Science (Ministry of Education) and School of Geographic Sciences, East China Normal University, Shanghai, China
2Department of Transportation and Geotechnical Engineering, University of Dar es Salaam, Dar es Salaam, Tanzania

Abstract

Advances in positioning and communication technologies make it possible to collect large volumes of taxi trajectory data. Such data quickly provide a complete picture of ground traffic systems and have therefore been applied in many fields. However, handling data at this scale remains challenging for data users. We have therefore developed a software system named XStar to deal with trajectory big data. Its core is a scalable index and storage structure, with which raw data can be stored more compactly and accessed more efficiently. A real taxi trajectory dataset is used to demonstrate its performance. In general, XStar makes processing and analyzing trajectory data affordable and straightforward. Released in January 2019, it had been downloaded more than 4000 times by May 2021. More analytical functions are under development.
