Context-Based Path Prediction for Targets with Switching Dynamics

Springer Science and Business Media LLC - Volume 127 - Pages 239-262 - 2018
Julian F. P. Kooij (1), Fabian Flohr (2), Ewoud A. I. Pool (3), Dariu M. Gavrila (1, 3)
(1) Delft University of Technology, Delft, The Netherlands
(2) Department of Environment Perception, Daimler AG, Ulm, Germany
(3) AMLab, University of Amsterdam, Amsterdam, The Netherlands

Abstract

Anticipating future situations from streaming sensor data is a key perception challenge for mobile robotics and automated vehicles. We address the problem of predicting the path of objects with multiple dynamic modes. The dynamics of such targets can be described by a Switching Linear Dynamical System (SLDS). However, predictions from this probabilistic model cannot anticipate when a change in dynamic mode will occur. We propose to extract various types of cues with computer vision to provide context on the target’s behavior, and incorporate these in a Dynamic Bayesian Network (DBN). The DBN extends the SLDS by conditioning the mode transition probabilities on additional context states. We describe efficient online inference in this DBN for probabilistic path prediction, accounting for uncertainty in both measurements and target behavior. Our approach is illustrated on two scenarios in the Intelligent Vehicles domain concerning pedestrians and cyclists, so-called Vulnerable Road Users (VRUs). Here, context cues include the static environment of the VRU, its dynamic environment, and its observed actions. Experiments using stereo vision data from a moving vehicle demonstrate that the proposed approach results in more accurate path prediction than SLDS at the relevant short time horizon (1 s). It slightly outperforms a computationally more demanding state-of-the-art method.
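The central idea, conditioning the mode transition probabilities of an SLDS on context, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' model: the two modes ("walking" and "standing"), the time step, the transition values, and the names `A`, `TRANS` and `predict` are hypothetical, the context is treated as a fixed observed label instead of a latent state inferred from vision cues, and only means are propagated (no covariances), using a simple moment-matching collapse per mode.

```python
import numpy as np

# Hypothetical two-mode target: mode 0 = "walking", mode 1 = "standing".
# State x = [position, velocity]; each mode has its own linear dynamics.
DT = 0.1  # time step in seconds (assumed)
A = np.array([
    [[1.0, DT], [0.0, 1.0]],   # walking: constant velocity
    [[1.0, 0.0], [0.0, 0.0]],  # standing: position kept, velocity set to zero
])

# Mode transition matrices P(m_t | m_{t-1}, context), one per context value.
# Illustrative numbers only, not learned from data.
TRANS = {
    "no_cue": np.array([[0.98, 0.02], [0.02, 0.98]]),
    "stop_cue": np.array([[0.70, 0.30], [0.02, 0.98]]),
}


def predict(x0, mode_probs, context, steps):
    """Return (mode probabilities, expected state) after `steps` time steps."""
    T = TRANS[context]
    w = np.asarray(mode_probs, dtype=float)
    # Keep one state mean per mode; all start from the same initial state.
    mu = np.tile(np.asarray(x0, dtype=float), (len(w), 1))
    for _ in range(steps):
        joint = w[:, None] * T                    # joint[i, j] = P(prev=i, next=j)
        w_next = joint.sum(axis=0)                # predicted mode probabilities
        mixed = (joint.T @ mu) / w_next[:, None]  # mean state given next mode j
        mu = np.einsum("jab,jb->ja", A, mixed)    # apply the dynamics of mode j
        w = w_next
    return w, w @ mu


# Predict 1 s ahead (10 steps) for a target walking at 1.5 m/s.
w_none, x_none = predict([0.0, 1.5], [1.0, 0.0], "no_cue", steps=10)
w_stop, x_stop = predict([0.0, 1.5], [1.0, 0.0], "stop_cue", steps=10)
```

With the same initial state, the "stop_cue" context shifts probability mass toward the standing mode earlier, so the expected position after 1 s is shorter than under "no_cue". A plain SLDS corresponds to using a single transition matrix regardless of context.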
