O-CNN

ACM Transactions on Graphics - Volume 36, Issue 4 - Pages 1-11 - 2017
Peng‐Shuai Wang¹, Yang Liu², Yuxiao Guo³, Chunyu Sun¹, Xin Tong²
¹ Tsinghua University
² Microsoft Research, Asia
³ University of Electronic Science and Technology of China

Abstract

We present O-CNN, an Octree-based Convolutional Neural Network (CNN) for 3D shape analysis. Built upon the octree representation of 3D shapes, our method takes the average normal vectors of a 3D model sampled in the finest leaf octants as input and performs 3D CNN operations on the octants occupied by the 3D shape surface. We design a novel octree data structure to efficiently store the octant information and CNN features in graphics memory and execute the entire O-CNN training and evaluation on the GPU. O-CNN supports various CNN structures and works for 3D shapes in different representations. By restricting the computations to the octants occupied by 3D surfaces, the memory and computational costs of O-CNN grow quadratically as the depth of the octree increases, which makes the 3D CNN feasible for high-resolution 3D models. We compare the performance of O-CNN with other existing 3D CNN solutions and demonstrate the efficiency and efficacy of O-CNN in three shape analysis tasks: object classification, shape retrieval, and shape segmentation.
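
The input signal described in the abstract (average normals in the finest occupied leaf octants) can be illustrated with a minimal NumPy sketch. This is not the authors' GPU octree structure. The function names, the sphere test shape, the point count, and the flat integer octant key are all assumptions made for illustration. The sketch bins oriented surface samples into a uniform grid of resolution 2^depth, keeps only the occupied cells, and averages the normals in each one.

```python
import numpy as np


def sample_sphere(n, seed=0):
    """Hypothetical test shape: n oriented points on a sphere inside the unit cube."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n, 3))
    normals = v / np.linalg.norm(v, axis=1, keepdims=True)
    points = 0.5 + 0.45 * normals  # centre 0.5, radius 0.45, stays within [0, 1)
    return points, normals


def average_normals_per_octant(points, normals, depth):
    """Average the normals of all points that fall in each occupied leaf octant.

    Returns (keys, avg): one integer key per occupied octant (an assumed flat
    x/y/z index, not the paper's encoding) and the mean normal of that octant.
    """
    res = 2 ** depth
    idx = np.clip((points * res).astype(np.int64), 0, res - 1)
    keys = (idx[:, 0] * res + idx[:, 1]) * res + idx[:, 2]
    unique_keys, inverse = np.unique(keys, return_inverse=True)
    inverse = inverse.reshape(-1)
    sums = np.zeros((unique_keys.size, 3))
    np.add.at(sums, inverse, normals)  # accumulate normals per occupied octant
    counts = np.bincount(inverse, minlength=unique_keys.size)
    return unique_keys, sums / counts[:, None]


if __name__ == "__main__":
    pts, nrm = sample_sphere(200_000)
    for depth in range(3, 7):
        keys, avg = average_normals_per_octant(pts, nrm, depth)
        print(f"depth {depth}: {keys.size} occupied octants out of {8 ** depth}")
```

Running the script prints the number of occupied octants next to the size of the full grid at each depth. Because only cells that intersect the surface are stored, the occupied count is a small fraction of the 8^depth cells a dense voxel grid would need, which is the saving the abstract refers to.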
