Keyframe-based monocular SLAM: design, survey, and future directions
References
Cadena, 2016, Past, present, and future of simultaneous localization and mapping: toward the robust-perception age, IEEE Trans. Robot., 32, 1309, 10.1109/TRO.2016.2624754
Scaramuzza, 2011, Visual odometry [tutorial], IEEE Robot. Autom. Mag., 18, 80, 10.1109/MRA.2011.943233
Fuentes-Pacheco, 2012, Visual simultaneous localization and mapping: A survey, Artif. Intell. Rev., 43, 55, 10.1007/s10462-012-9365-8
Yousif, 2015, An overview to visual odometry and visual SLAM: Applications to mobile robotics, Intell. Ind. Syst., 1, 289, 10.1007/s40903-015-0032-7
Saeedi, 2016, Multiple-Robot simultaneous localization and mapping: A Review, J. Field Robot., 33, 3, 10.1002/rob.21620
Bailey, 2006, Simultaneous localization and mapping (SLAM): Part II, IEEE Robot. Autom. Mag., 13, 108, 10.1109/MRA.2006.1678144
Lucas, 1981, An Iterative Image Registration Technique with an Application to Stereo Vision, 674
Baker, 2004, Lucas-Kanade 20 years on: A unifying framework, Int. J. Comput. Vis., 56, 221, 10.1023/B:VISI.0000011205.11775.fd
Krig, 2014, Interest point detector and feature descriptor survey, 217
Beaudet, 1978, Rotationally invariant image operators
C. Harris, M. Stephens, A combined corner and edge detector, In: Proc. of Fourth Alvey Vision Conference, pp. 147–151, 1988.
J. Shi, C. Tomasi, Good features to track, in: Computer Vision and Pattern Recognition, 1994. Proceedings CVPR ’94, 1994 IEEE Computer Society Conference on, 1994, pp. 593–600.
Lindeberg, 1998, Feature detection with automatic scale selection, Int. J. Comput. Vis., 30, 79, 10.1023/A:1008045108935
J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide baseline stereo from maximally stable extremal regions, in: Proc. BMVC, 36.1–36.10. http://dx.doi.org/10.5244/C.16.36.
Lowe, 2004, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis., 60, 91, 10.1023/B:VISI.0000029664.99615.94
Mair, 2010, Adaptive and generic corner detection based on the accelerated segment test
Calonder, 2012, Brief: computing a local binary descriptor very fast, IEEE Trans. Pattern Anal. Mach. Intell., 34, 1281, 10.1109/TPAMI.2011.222
S. Leutenegger, M. Chli, R.Y. Siegwart, BRISK: Binary robust invariant scalable keypoints, in: Computer Vision, ICCV, 2011 IEEE International Conference on, 2011, pp. 2548–2555. http://dx.doi.org/10.1109/ICCV.2011.6126542.
Bay, 2008, Speeded-up robust features (SURF), Comput. Vis. Image Underst., 110, 346, 10.1016/j.cviu.2007.09.014
Lowe, 1999, Object recognition from local scale-invariant features, 1150
N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, 2005, pp. 886–893. http://dx.doi.org/10.1109/CVPR.2005.177.
A. Alahi, R. Ortiz, P. Vandergheynst, Freak: Fast retina keypoint, in: Computer Vision and Pattern Recognition, CVPR, 2012 IEEE Conference on, 2012, pp. 510–517. http://dx.doi.org/10.1109/CVPR.2012.6247715.
E. Rublee, V. Rabaud, K. Konolige, G. Bradski, ORB: An efficient alternative to SIFT or SURF, in: International Conference on Computer Vision, ICCV, 2011, pp. 2564–2571.
Moreels, 2007, Evaluation of features detectors and descriptors based on 3D objects, Int. J. Comput. Vis., 73, 263, 10.1007/s11263-006-9967-1
J. Hartmann, J.H. Klussendorff, E. Maehle, A comparison of feature descriptors for visual SLAM, in: Mobile Robots, ECMR, 2013 European Conference on, 2013, pp. 56–61. http://dx.doi.org/10.1109/ECMR.2013.6698820.
Rey-Otero, 2014, Comparing feature detectors: A bias in the repeatability criteria, and how to correct it, CoRR
Hietanen, 2016, A comparison of feature detectors and descriptors for object class matching, Neurocomputing, 10.1016/j.neucom.2015.08.106
J.L.C. Jérôme Martin, Experimental comparison of correlation techniques, in: IAS-4, International Conference on Intelligent Autonomous Systems, 1995.
M. Muja, D.G. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, in: VISAPP International Conference on Computer Vision Theory and Applications, 2009, pp. 331–340.
Gálvez-López, 2012, Bags of binary words for fast place recognition in image sequences, IEEE Trans. Robot., 28, 1188, 10.1109/TRO.2012.2197158
Umeyama, 1991, Least-squares estimation of transformation parameters between two point patterns, IEEE Trans. Pattern Anal. Mach. Intell., 13, 376, 10.1109/34.88573
Davison, 2007, MonoSLAM: Real-time single camera SLAM, IEEE Trans. Pattern Anal. Mach. Intell., 29, 1052, 10.1109/TPAMI.2007.1049
Longuet-Higgins, 1981, A computer algorithm for reconstructing a scene from two projections, Lett. Nature, 293, 133, 10.1038/293133a0
Torr, 2000, MLESAC, Comput. Vis. Image Underst., 78, 138, 10.1006/cviu.1999.0832
Hartley, 2003, 655
Boal, 2014, Topological simultaneous localization and mapping: A survey, Robotica, 32, 803, 10.1017/S0263574713001070
Mur-Artal, 2015, ORB-SLAM: A Versatile and Accurate Monocular SLAM System, IEEE Trans. Robot., PP, 1
Engel, 2014, LSD-SLAM: Large-Scale Direct Monocular SLAM, 834
J. Lim, J.M. Frahm, M. Pollefeys, Online environment mapping, in: Computer Vision and Pattern Recognition, CVPR, 2011 IEEE Conference on, pp. 3489–3496, 2011.
H. Lim, J. Lim, H.J. Kim, Real-time 6-DOF monocular visual SLAM in a large-scale environment, in: Robotics and Automation, ICRA, IEEE International Conference on, 2014, pp. 1532–1539.
Fernández-Moral, 2015, 217
Konolige, 2010, Sparse sparse bundle adjustment, 102.1
Hartley, 1997, Triangulation, Comput. Vis. Image Underst., 68, 146, 10.1006/cviu.1997.0547
S. Hochdorfer, C. Schlegel, Towards a robust visual SLAM approach: Addressing the challenge of life-long operation, in: Advanced Robotics, 2009. ICAR 2009. International Conference on, 2009, pp. 1–6.
Civera, 2008, Inverse depth parametrization for monocular SLAM, IEEE Trans. Robot., 24, 932, 10.1109/TRO.2008.2003276
Kummerle, 2011, G2o: A general framework for graph optimization, 3607
Triggs, 2000, 298
Strasdat, 2011, Double Window Optimisation for Constant Time Visual SLAM, 2352
Garcia-Fidalgo, 2015, Vision-based topological mapping and localization methods: a survey, Robot. Auton. Syst., 64, 1, 10.1016/j.robot.2014.11.009
S. Agarwal, K. Mierle, et al., Ceres solver, 2013.
E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, P. Sayd, Real time localization and 3D reconstruction, in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, vol. 1, 2006, pp. 363–370. http://dx.doi.org/10.1109/CVPR.2006.236.
G. Klein, D. Murray, Parallel tracking and mapping for small AR workspaces, in: 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, 2007, pp. 1–10.
Silveira, 2008, An efficient direct approach to visual SLAM, IEEE Trans. Robot., 24, 969, 10.1109/TRO.2008.2004829
Strasdat, 2010, Scale Drift-Aware Large Scale Monocular SLAM, 10.15607/RSS.2010.VI.010
Newcombe, 2010, Live dense reconstruction with a single moving camera, 1498
Newcombe, 2011, DTAM: dense tracking and mapping in real-time, 2320
A. Pretto, E. Menegatti, E. Pagello, Omnidirectional dense large-scale mapping and navigation based on meaningful triangulation, in: Robotics and Automation, ICRA, 2011 IEEE International Conference on, 2011, pp. 3289–3296. http://dx.doi.org/10.1109/ICRA.2011.5980206.
Pirker, 2011, CD SLAM - Continuous localization and mapping in a dynamic world, 3990
C. Pirchheim, G. Reitmayr, Homography-based planar mapping and tracking for mobile phones, in: Mixed and Augmented Reality, ISMAR, 2011 10th IEEE International Symposium on, 2011, pp. 7–36. http://dx.doi.org/10.1109/ISMAR.2011.6092367.
W. Tan, H. Liu, Z. Dong, G. Zhang, H. Bao, Robust monocular SLAM in dynamic environments, in: 2013 IEEE International Symposium on Mixed and Augmented Reality, ISMAR, 2013, pp. 209–218.
C. Pirchheim, D. Schmalstieg, G. Reitmayr, Handling pure camera rotation in keyframe-based SLAM, in: Mixed and Augmented Reality, ISMAR, 2013 IEEE International Symposium on, 2013, pp. 229–238. http://dx.doi.org/10.1109/ISMAR.2013.6671783.
Dong, 2014, Efficient keyframe-based real-time camera tracking, Comput. Vis. Image Underst., 118, 97, 10.1016/j.cviu.2013.08.005
C. Forster, M. Pizzoli, D. Scaramuzza, SVO: Fast semi-direct monocular visual odometry, in: Robotics and Automation, ICRA, IEEE International Conference on, 2014.
Herrera, 2014, DT-SLAM: Deferred triangulation for robust SLAM, 609
G. Bourmaud, R. Megret, Robust large scale monocular visual SLAM, in: Computer Vision and Pattern Recognition, CVPR, 2015 IEEE Conference on, 2015, pp. 1638–1647. http://dx.doi.org/10.1109/CVPR.2015.7298772.
A. Concha, J. Civera, DPPTAM: Dense piecewise planar tracking and mapping from a monocular sequence, in: Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, 2015, pp. 5686–5693. http://dx.doi.org/10.1109/IROS.2015.7354184.
W.N. Greene, K. Ok, P. Lommel, N. Roy, Multi-level mapping: Real-time dense monocular SLAM, 2016, pp. 833–840. http://dx.doi.org/10.1109/ICRA.2016.7487213.
H. Liu, G. Zhang, H. Bao, Robust keyframe-based monocular SLAM for augmented reality, in: International Symposium on Mixed and Augmented Reality, ISMAR, 2016.
Engel, 2016, Direct sparse odometry, CoRR, abs/1607.02565
Rosten, 2006, Machine Learning for High-speed Corner Detection, 430
Nistér, 2004, An efficient solution to the five-point relative pose problem, IEEE Trans. Pattern Anal. Mach. Intell., 26, 756, 10.1109/TPAMI.2004.17
Faugeras, 1988, Motion and structure from motion in a piecewise planar environment, Int. J. Pattern Recognit. Artif. Intell., 02, 10.1142/S0218001488000285
Tomasi, 1991, Detection and Tracking of Point Features
Hall, 2015
Benhimane, 2007, Homography-based 2D Visual Tracking and Servoing, Int. J. Robot. Res., 26, 661, 10.1177/0278364907080252
Moranna, 2006
Kneip, 2012, 696
Engel, 2013, Semi-dense Visual Odometry for a Monocular Camera, 1449
Vogiatzis, 2011, Video-based, real-time multi-view stereo, Image Vis. Comput., 29, 434, 10.1016/j.imavis.2011.01.006
M. Pizzoli, C. Forster, D. Scaramuzza, REMODE: Probabilistic, monocular dense reconstruction in real time, in: IEEE International Conference on Robotics and Automation, ICRA, 2014.
Lepetit, 2009, EPnP: An Accurate O(n) Solution to the PnP Problem, Int. J. Comput. Vis., 81, 155, 10.1007/s11263-008-0152-6
Glover, 2012, OpenFABMAP: An open source toolbox for appearance-based loop closure detection, 4730
Z. Dong, G. Zhang, J. Jia, H. Bao, Keyframe-based real-time camera tracking, in: 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 1538–1545. http://dx.doi.org/10.1109/ICCV.2009.5459273.
Pirker, 2010, Histogram of oriented cameras - a new descriptor for visual SLAM in dynamic environments, 76.1
Bentley, 1975, Multidimensional Binary Search Trees Used for Associative Searching, Commun. ACM, 18, 509, 10.1145/361002.361007
Wagner, 2010, Real-time detection and tracking for augmented reality on mobile phones, IEEE Trans. Vis. Comput. Graphics, 16, 355, 10.1109/TVCG.2009.99
Pradeep, 2013, Monofusion: real-time 3D reconstruction of small scenes with a single web camera, 83
Curless, 1996, A volumetric method for building complex models from range images, 303
Engel, 2016, A photometrically calibrated benchmark for monocular visual odometry, CoRR
J. Engel, J. Stuckler, D. Cremers, Large-scale direct SLAM with stereo cameras, in: Intelligent Robots and Systems, IROS, 2015 IEEE/RSJ International Conference on, 2015, pp. 1935–1942. http://dx.doi.org/10.1109/IROS.2015.7353631.
Bista, 2016, Appearance-based indoor navigation by IBVS using line segments, IEEE Robot. Autom. Lett., 1, 423, 10.1109/LRA.2016.2521907
Zhang, 2011, Hand-Held monocular SLAM based on line segments, 7
Dubé, 2016, SegMatch: segment based loop-closure for 3D point clouds, CoRR
Klein, 2008, Improving the Agility of Keyframe-Based SLAM, 802
B. Micusik, H. Wildenauer, Descriptor free visual indoor localization with line segments, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2015, pp. 3165–3173. http://dx.doi.org/10.1109/CVPR.2015.7298936.
Vakhitov, 2016, 583
Verhagen, 2014, Scale-invariant line descriptors for wide baseline matching, 493
Yammine, 2014, Novel similarity-invariant line descriptor and matching algorithm for global motion estimation, IEEE Trans. Circuits Syst. Video Technol., 24, 1323, 10.1109/TCSVT.2014.2302874
A. Concha, J. Civera, Using superpixels in monocular SLAM, in: 2014 IEEE International Conference on Robotics and Automation, ICRA, 2014, pp. 365–372. http://dx.doi.org/10.1109/ICRA.2014.6906883.
Martinez-Carranza, 2010, Unifying planar and point mapping in monocular SLAM, 43.1
Gálvez-López, 2015, Real-time monocular object SLAM, CoRR
E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, P. Fua, F. Moreno-Noguer, Discriminative learning of deep convolutional feature point descriptors, in: Proceedings of the International Conference on Computer Vision, ICCV, 2015.
Y. Verdie, K.M. Yi, P. Fua, V. Lepetit, TILDE: A temporally invariant learned detector, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2015, pp. 5279–5288.
X. Han, T. Leung, Y. Jia, R. Sukthankar, A.C. Berg, MatchNet: Unifying feature and metric learning for patch-based matching, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2015, pp. 3279–3286. http://dx.doi.org/10.1109/CVPR.2015.7298948.
S. Zagoruyko, N. Komodakis, Learning to compare image patches via convolutional neural networks, in: 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2015, pp. 4353–4361. http://dx.doi.org/10.1109/CVPR.2015.7299064.
K.M. Yi, Y. Verdie, P. Fua, V. Lepetit, Learning to assign orientations to feature points, in: Proceedings of the Computer Vision and Pattern Recognition, 2016.
Pillai, 2015, Monocular SLAM supported object recognition, CoRR, abs/1506.01732
Kundu, 2014, Joint semantic segmentation and 3D reconstruction from monocular video, 8694, 703
N. Fioraio, L.D. Stefano, Joint detection, tracking and mapping by semantic bundle adjustment, in: Computer Vision and Pattern Recognition, CVPR, 2013 IEEE Conference on, 2013, pp. 1538–1545. http://dx.doi.org/10.1109/CVPR.2013.202.
S. Savarese, Y.-W. Chao, M. Bagra, S.Y. Bao, Semantic structure from motion with points, regions, and objects, in: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2012, pp. 2703–2710.
Yang, 2016, Pop-up SLAM: Semantic monocular plane SLAM for low-texture environments