Utilizing cloud FPGAs towards the open neural network standard
References
K. Abdelouahab, M. Pelcat, J. Sérot, F. Berry, Accelerating CNN inference on FPGAs: a survey. CoRR arXiv:1806.01683, 2018.
Alemdar, 2017, 2547
Betkaoui, 2010, Comparing performance and energy efficiency of FPGAs and GPUs for high productivity computing, 2010 International Conference on Field-Programmable Technology, 94, 10.1109/FPT.2010.5681761
F. Chollet, et al., Keras, 2015. https://keras.io.
Cong, 2018, 93
Danopoulos, 2018, Acceleration of image classification with Caffe framework using FPGA, 2018 7th International Conference on Modern Circuits and Systems Technologies (MOCAST), 1
Danopoulos, 2019, 373
Danopoulos, 2020, Automatic generation of FPGA kernels from open format CNN models, 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 237, 10.1109/FCCM48280.2020.00070
Duarte, 2018, Fast inference of deep neural networks in FPGAs for particle physics, J. Instrum., 13, P07027, 10.1088/1748-0221/13/07/P07027
Y. Fu, E. Wu, A. Sirasao, S. Attia, K. Khan, R. Wittig, White Paper: Deep Learning with INT8 Optimization on Xilinx Devices. Technical Report WP486 (v1.0.1), 2017. https://www.xilinx.com/support/documentation/white_papers/wp486-deep-learning-int8.pdf.
Ghasemzadeh, 2018, ReBNet: residual binarized neural network, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 57, 10.1109/FCCM.2018.00018
P. Gysel, M. Motamedi, S. Ghiasi, Hardware-Oriented Approximation of Convolutional Neural Networks. CoRR arXiv:1604.03168, 2016.
Han, 2016
Hao, 2019, 1
Hettiarachchi, 2020, Integer vs. floating-point processing on modern FPGA technology, 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), 0606, 10.1109/CCWC47524.2020.9031118
Huang, 2019, Accelerating sparse deep neural networks on FPGAs, 2019 IEEE High Performance Extreme Computing Conference (HPEC), 1
Kachris, 2016, A survey on reconfigurable accelerators for cloud computing, 2016 26th International Conference on Field Programmable Logic and Applications (FPL), 1
Kreis, 2019
Krizhevsky, 2012, ImageNet classification with deep convolutional neural networks, Neural Inf. Process. Syst., 25
Makrani, 2017, MeNa: a memory navigator for modern hardware in a scale-out environment, 2017 IEEE International Symposium on Workload Characterization (IISWC), 2, 10.1109/IISWC.2017.8167751
Omondi, 2006
Putnam, 2015, A reconfigurable fabric for accelerating large-scale datacenter services, IEEE Micro, 35, 10, 10.1109/MM.2015.42
Si, 2018, Handwritten digit recognition system on an FPGA, 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), 402
Sze, 2017, Hardware for Machine Learning: Challenges and Opportunities, 1
You, 2019, A flexible DNN accelerator design with layer pipeline for FPGAs, 2019 6th International Conference on Information Science and Control Engineering (ICISCE), 959, 10.1109/ICISCE48695.2019.00192