Restricted Boltzmann machine as an aggregation technique for binary descriptors

The Visual Computer - Volume 37 - Pages 423-432 - 2019

Szymon Sobczak¹, Rafal Kapela¹, Kevin McGuinness², Aleksandra Swietlicka¹, Dariusz Pazderski¹, Noel E. O’Connor²

¹Poznan University of Technology, Poznan, Poland

²Insight Centre for Data Analytics, Dublin City University, Dublin, Ireland

Abstract

This article presents a novel approach to real-time image classification with deep neural networks. The proposed architecture exploits computationally efficient local binary descriptors and uses a restricted Boltzmann machine (RBM) as a feature space projection step, which allows the depth of the network to be reduced. A contrastive divergence procedure is used both for RBM training and for feature projection. The resulting networks perform close to the current state of the art while having a small memory footprint (i.e., a low number of parameters) and a low computational cost (i.e., a short response time). The low number of parameters makes these architectures suitable for embedded systems with limited memory or reduced computational capabilities.
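The following is a minimal sketch of the general idea, not the authors' implementation: a binary RBM trained with one step of contrastive divergence (CD-1) on binary descriptors, then used to project each descriptor into hidden-unit space. Everything specific here is an assumption made for illustration: the class name `BinaryRBM`, the 256-bit descriptor length, the 32 hidden units, the learning rate, the random synthetic descriptors, and the mean pooling used to aggregate the per-descriptor projections into one image-level vector.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class BinaryRBM:
    """Hypothetical minimal RBM with binary visible and hidden units."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # P(h = 1 | v) for every hidden unit
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        # P(v = 1 | h) for every visible unit
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.05):
        """One CD-1 update on a batch; returns mean squared reconstruction error."""
        ph0 = self.hidden_probs(v0)                             # positive phase
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)   # sample hidden states
        pv1 = self.visible_probs(h0)                            # reconstruction
        ph1 = self.hidden_probs(pv1)                            # negative phase
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - pv1) ** 2))


# Synthetic stand-in for the binary descriptors of one image (assumed 256 bits each).
data_rng = np.random.default_rng(1)
descriptors = (data_rng.random((200, 256)) < 0.5).astype(float)

rbm = BinaryRBM(n_visible=256, n_hidden=32)
errors = [rbm.cd1_step(descriptors) for _ in range(20)]

# Project every descriptor, then aggregate by mean pooling (an assumed choice).
image_vector = rbm.hidden_probs(descriptors).mean(axis=0)
```

In this sketch, `image_vector` is a fixed-length representation regardless of how many descriptors the image produced, which is what would let a shallow classifier sit on top of it. The abstract does not specify the pooling or the projection details, so those parts should be read as placeholders.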

