Method for Convolutional Neural Network Hardware Implementation Based on a Residue Number System
Abstract
Convolutional neural networks (CNNs) achieve high accuracy in pattern recognition problems but have high computational complexity, which slows data processing. To increase the speed of CNNs, we propose a hardware implementation method in which calculations are performed in the residue number system with moduli of the special forms $2^{\alpha}$ and $2^{\alpha} - 1$. The proposed method was simulated in hardware on a field-programmable gate array (FPGA) for the LeNet-5 CNN trained on the MNIST, FMNIST, and CIFAR-10 image databases. The simulation showed that the proposed approach can increase the clock frequency and performance of the device by 11–12% compared with the traditional approach based on the positional number system.
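To illustrate why moduli of the forms $2^{\alpha}$ and $2^{\alpha} - 1$ are convenient, the following is a minimal software sketch of a multiply-accumulate operation in a residue number system. It is not the hardware design from the paper: the moduli set {32, 31, 15}, the function names, and the restriction to non-negative integers are all assumptions made for illustration. Reduction modulo $2^{\alpha}$ is a bit mask, and reduction modulo $2^{\alpha} - 1$ folds the high bits back onto the low bits, because $2^{\alpha} \equiv 1 \pmod{2^{\alpha} - 1}$.

```python
from math import prod

# Hypothetical moduli set, given as (alpha, is_minus_one):
# 2**5 = 32, 2**5 - 1 = 31, 2**4 - 1 = 15 (pairwise coprime).
MODULI_SPEC = [(5, False), (5, True), (4, True)]
MODULI = [(1 << a) - 1 if m1 else (1 << a) for a, m1 in MODULI_SPEC]


def reduce_mod(x, alpha, minus_one):
    """Reduce a non-negative integer modulo 2**alpha or 2**alpha - 1."""
    mask = (1 << alpha) - 1
    if not minus_one:
        # Modulo 2**alpha: keep the low alpha bits.
        return x & mask
    # Modulo 2**alpha - 1: fold high bits onto low bits (end-around carry).
    while x > mask:
        x = (x & mask) + (x >> alpha)
    # The value 2**alpha - 1 itself is congruent to 0.
    return 0 if x == mask else x


def to_rns(x):
    """Forward conversion: integer -> list of residues."""
    return [reduce_mod(x, a, m1) for a, m1 in MODULI_SPEC]


def rns_mac(xs, ws):
    """Dot product computed independently in each residue channel."""
    acc = [0] * len(MODULI_SPEC)
    for x, w in zip(xs, ws):
        xr, wr = to_rns(x), to_rns(w)
        acc = [
            reduce_mod(s + p * q, a, m1)
            for s, p, q, (a, m1) in zip(acc, xr, wr, MODULI_SPEC)
        ]
    return acc


def from_rns(residues):
    """Reverse conversion via the Chinese remainder theorem."""
    M = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M


if __name__ == "__main__":
    pixels = [3, 7, 2, 9]
    weights = [4, 1, 8, 5]
    print(from_rns(rns_mac(pixels, weights)))  # 80
```

Each residue channel works on short operands with no carries between channels, which is the property a hardware implementation can exploit. The result is only correct while the true value stays below the product of the moduli (14880 in this sketch), and signed weights would need an additional range mapping that is omitted here.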