On the role of geometry in geo-localization

Springer Science and Business Media LLC - Volume 7 - Pages 103-113 - 2021

Moti Kadosh¹, Yael Moses², Ariel Shamir²

¹Department of Electrical Engineering, Tel-Aviv University, Tel-Aviv, Israel

²The Efi Arazi School of Computer Science, The Interdisciplinary Center Herzliya, Herzliya, Israel

Abstract

Consider the geo-localization task of finding the pose of a camera in a large 3D scene from a single image. Most existing CNN-based methods take textured images as input. We aim to explore experimentally whether texture and correlation between nearby images are necessary in a CNN-based solution to the geo-localization task. To do so, we consider lean images: textureless projections of a simple 3D model of a city. They contain only information related to the geometry of the viewed scene (edges, faces, and relative depth). The main contributions of this paper are: (i) to demonstrate the ability of CNNs to recover camera pose using lean images; and (ii) to provide insight into the role of geometry in the CNN learning process.
