Complementary spatial transformer network for real-time 3D object recognition

Journal of Real-Time Image Processing - Volume 20 - Pages 1-12 - 2023

K. P. Krishna Kumar¹, Varghese Paul²

¹APJ Abdul Kalam Technological University, CET Campus, Thiruvananthapuram, India

²Department of Computer Science and Engineering, Rajagiri School of Engineering and Technology, Kochi, India

Abstract

Tiny Deep Learning Models offer many advantages in various applications. From the perspective of statistical machine learning theory, the contribution of this paper is to complement the research advances and results obtained so far in real-time 3D object recognition. We propose a Tiny Deep Learning Model named Complementary Spatial Transformer Network (CSTN) for real-time 3D object recognition. It turns out that CSTN's operation and analysis are much simplified in a target space setting. We make algorithmic enhancements to perform CSTN computations faster and to keep the learning part of CSTN minimal in size. Finally, we provide experimental verification of the results on the publicly available point cloud data sets ModelNet40 and ShapeNetCore, with our model performing 1.65–2 times better in DPS (detections/s) rate on GPU hardware for 3D object recognition when compared to state-of-the-art networks. The Complementary Spatial Transformer Network architecture requires only 10–35% of the trainable parameters of state-of-the-art networks, making the network easier to deploy on edge AI devices.

