Pre-Rotation Only at Inference Stage: A Way to Rotation Invariance of Convolutional Neural Networks

International Journal of Computational Intelligence Systems, Volume 17, Issue 1

Yue Fan¹, Peng Zhang², Joungho Han², Dandan Liu³, Jinsong Tang⁴, Guoping Zhang¹

¹Central China Normal University, Wuhan, China

²National University of Defense Technology, Changsha, China

³Yancheng Institute of Technology, Yancheng, China

⁴Naval University of Engineering, Wuhan, China

Abstract

Popular convolutional neural networks (CNNs) need data augmentation to achieve rotation invariance. We propose an alternative mechanism, Pre-Rotation Only at Inference stage (PROAI), to make CNNs rotation invariant. The overall idea is to imitate how the human brain observes images. At the training stage, PROAI trains a CNN with a small number of parameters using images at only one orientation. At the inference stage, PROAI introduces a pre-rotation operation that rotates each test image into all possible orientations and computes classification scores with the trained small-parameter CNN. The maximum of these classification scores estimates both the category and the orientation of each test image at the same time. The specific benefits of PROAI were tested on rotated-image recognition tasks. The results show that PROAI improves both classification and orientation-estimation performance while greatly reducing the number of parameters and the training time. The code and datasets are publicly available at https://github.com/automlresearch/FRPRF.
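The inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `proai_predict` and `make_template_scorer` are hypothetical, the trained CNN is replaced by a toy template-matching scorer, and the set of "all possible orientations" is assumed to be the four multiples of 90° so that `np.rot90` suffices.

```python
import numpy as np


def proai_predict(image, score_fn):
    """Pre-rotate `image` and keep the single highest class score.

    Assumption: orientations are limited to k * 90 degrees (k = 0..3).
    Returns (class_index, k, score), where k is the number of
    counterclockwise 90-degree pre-rotations that gave the best score.
    """
    best_class, best_k, best_score = -1, -1, -np.inf
    for k in range(4):
        rotated = np.rot90(image, k)          # pre-rotation at inference
        scores = np.asarray(score_fn(rotated))
        c = int(np.argmax(scores))            # best class for this rotation
        if scores[c] > best_score:            # global maximum over rotations
            best_class, best_k, best_score = c, k, float(scores[c])
    return best_class, best_k, best_score


def make_template_scorer(templates):
    """Hypothetical stand-in for a CNN trained on upright images only.

    Scores each class by negative squared distance to its upright template.
    """
    def score_fn(img):
        return np.array([-np.sum((img - t) ** 2) for t in templates])
    return score_fn


# Two toy "classes", each known only in its upright orientation.
t0 = np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
t1 = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
score_fn = make_template_scorer([t0, t1])

# A test image of class 1, rotated 90 degrees clockwise.
test_image = np.rot90(t1, -1)
pred_class, pred_k, pred_score = proai_predict(test_image, score_fn)
print(pred_class, pred_k, pred_score)
```

In this toy setting the pre-rotation that undoes the clockwise turn (`k = 1`) yields the highest score, so the same maximum identifies both the class and the orientation. A real system would substitute the trained network for `score_fn` and, for finer angular resolution, an interpolating rotation routine for `np.rot90`.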

