Improving and optimizing the algorithm of the matrix convolution IP core for neural networks on FPGA
Abstract
Keywords
IP cores; Matrix multiplication; FPGA-CNN; MAC; Vivado-Vitis.
References
[1]. Nguyen, X.-Q. and Pham-Quoc, C., “An FPGA-based Convolution IP Core for Deep Neural Networks Acceleration,” REV Journal on Electronics and Communications, Vol. 12, No. 1–2, pp. 1–6 (2022). DOI: 10.21553/rev-jec.286.
[2]. Han, S., Pool, J., Tran, J., and Dally, W. J., “Learning Both Weights and Connections for Efficient Neural Networks,” Neural Information Processing Systems (NeurIPS), Vol. 28 (2015).
[3]. Wen, W., Wu, C., Wang, Y., Chen, Y., and Li, H., “Learning Structured Sparsity in Deep Neural Networks,” Advances in Neural Information Processing Systems (NeurIPS) (2016).
[4]. Gschwend, D., “ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network,” arXiv Preprint, arXiv:2005.06892 (2020).
[5]. Li, Y., et al., “Implementation of Energy‐Efficient Fast Convolution Algorithm for Deep Convolutional Neural Networks Based on FPGA,” Electronics Letters, Vol. 56, No. 5, pp. 234–236 (2020).
[6]. Liu, X., et al., “WinoCNN: Kernel Sharing Winograd Systolic Array for Efficient Convolutional Neural Network Acceleration on FPGAs,” Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP) (2021).
[7]. Zhang, Y., et al., “An Efficient Convolutional Neural Network Accelerator Design on FPGA Using the Layer-to-Layer Unified Input Winograd Architecture,” Electronics, Vol. 14, No. 6, Article 1182 (2025). DOI: 10.3390/electronics14061182.
[8]. Taka, E., Huang, N.-C., Chang, C.-C., Wu, K.-C., Arora, A., and Marculescu, D., “Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration,” arXiv Preprint, arXiv:2502.03763v1 [cs.AR] (2025).
[9]. https://www.fpga4student.com/2016/11/matrix-multiplier-core-design.html
[10]. https://people.ece.cornell.edu/land/courses/ece5760/FinalProjects/f2020/bjd86_lgp36/bjd86_lgp36/index.html
[11]. https://www.mathworks.com/help/hdlverifier/xilinxfpgaboards/ug/large-matrix-multiplication-using-ethernet-aximaster.html
