MeshWalker

ACM Transactions on Graphics - Volume 39, Issue 6 - Pages 1-13 - 2020
Alon Lahav, Ayellet Tal
Technion - Israel Institute of Technology

Abstract

Most attempts to represent 3D shapes for deep learning have focused on volumetric grids, multi-view images, and point clouds. In this paper we look at the most popular representation of 3D shapes in computer graphics, the triangular mesh, and ask how it can be utilized within deep learning. The few attempts to answer this question propose adapting convolutions and pooling to suit Convolutional Neural Networks (CNNs). This paper proposes a very different approach, termed MeshWalker, to learn the shape directly from a given mesh. The key idea is to represent the mesh by random walks along the surface, which "explore" the mesh's geometry and topology. Each walk is organized as a list of vertices, which in some manner imposes regularity on the mesh. The walk is fed into a Recurrent Neural Network (RNN) that "remembers" the history of the walk. We show that our approach achieves state-of-the-art results for two fundamental shape analysis tasks: shape classification and semantic segmentation. Furthermore, even a very small number of examples suffices for learning. This is highly important, since large datasets of meshes are difficult to acquire.
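To make the idea of a walk "organized as a list of vertices" concrete, the following is a minimal sketch of one way to generate such a walk from a triangle list. It is not the authors' implementation: the helper names `build_adjacency` and `random_walk`, the preference for unvisited neighbours, and the fallback when every neighbour has been visited are all assumptions made for illustration. The RNN stage is omitted.

```python
import random
from collections import defaultdict


def build_adjacency(faces):
    """Map each vertex index to the set of vertices it shares an edge with."""
    adjacency = defaultdict(set)
    for a, b, c in faces:
        adjacency[a].update((b, c))
        adjacency[b].update((a, c))
        adjacency[c].update((a, b))
    return adjacency


def random_walk(adjacency, start, length, rng):
    """Return a list of `length` vertex indices, each adjacent to the previous one.

    Unvisited neighbours are preferred; if all have been visited, the walk
    steps to any neighbour (an assumed fallback, not taken from the paper).
    """
    walk = [start]
    visited = {start}
    while len(walk) < length:
        neighbours = sorted(adjacency[walk[-1]])
        fresh = [v for v in neighbours if v not in visited]
        nxt = rng.choice(fresh or neighbours)
        walk.append(nxt)
        visited.add(nxt)
    return walk


# A tetrahedron: 4 vertices, 4 triangular faces.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
adjacency = build_adjacency(faces)
walk = random_walk(adjacency, start=0, length=6, rng=random.Random(0))
```

The resulting `walk` is an ordered sequence, so per-vertex features along it could in principle be fed step by step to a sequence model, which is the regularity the abstract refers to.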
