Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices: Design Considerations

Tayfun Gokmen¹, Yurii A. Vlasov¹
¹IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

Abstract

Keywords


References

Alaghi, 2013, Survey of stochastic computing, ACM Trans. Embed. Comput. Syst., 12, 1, 10.1145/2465787.2465794

Arima, 1991, A 336-neuron, 28 K-synapse, self-learning neural network chip with branch-neuron-unit architecture, IEEE J. Solid State Circuits, 26, 1637, 10.1109/4.98984

Bi, 1998, Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, J. Neurosci., 18, 10464, 10.1523/JNEUROSCI.18-24-10464.1998

Burr, 2014, Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element, 2014 IEEE International Electron Devices Meeting (IEDM), 10.1109/IEDM.2014.7047135

Burr, 2015, Large-scale neural networks implemented with nonvolatile memory as the synaptic weight element: comparative performance analysis (accuracy, speed, and power), International Electron Devices Meeting (IEDM)

Chen, 2014, A 340 mV-to-0.9 V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16 × 16 network-on-chip in 22 nm tri-gate CMOS, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 276, 10.1109/ISSCC.2014.6757432

Chen, 2014, DaDianNao: a machine-learning supercomputer, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, 609, 10.1109/MICRO.2014.58

Chua, 1971, Memristor - the missing circuit element, IEEE Trans. Circuit Theory, 18, 507, 10.1109/TCT.1971.1083337

Coates, 2013, Deep Learning with COTS HPC Systems

Gaines, 1967, Stochastic computing, Proceedings of the AFIPS Spring Joint Computer Conference, 149

Gokhale, 2014, A 240 G-ops/s mobile coprocessor for deep neural networks, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 696

Gupta, 2015, Deep learning with limited numerical precision

Hinton, 2012, Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal Process. Mag., 29, 82, 10.1109/MSP.2012.2205597

Indiveri, 2013, Integration of nanoscale memristor synapses in neuromorphic computing architectures, Nanotechnology, 24, 384010, 10.1088/0957-4484/24/38/384010

Jackson, 2013, Nanoscale electronic synapses using phase change devices, ACM J. Emerg. Technol. Comput. Syst., 9, 1, 10.1145/2463585.2463588

Jensen, 2013, Noise analysis and measurement of integrator-based sensor interface circuits for fluorescence detection in lab-on-a-chip applications, International Conference on Noise and Fluctuation, 10.1109/ICNF.2013.6578905

Jo, 2010, Nanoscale memristor device as synapse in neuromorphic systems, Nano Lett., 10, 1297, 10.1021/nl904092h

Jonsson, 2011a, An empirical approach to finding energy efficient ADC architectures, 2011 International Workshop on ADC Modelling, Testing and Data Converter Analysis and Design and IEEE 2011 ADC Forum, 132

Jonsson, 2011b, Area Efficiency of ADC Architectures, 2011 20th European Conference on Circuit Theory and Design (ECCTD), 560, 10.1109/ECCTD.2011.6043595

Krizhevsky, 2012, ImageNet classification with deep convolutional neural networks, Neural Information Processing Systems, 1097

Kuzum, 2013, Synaptic electronics: materials, devices and applications, Nanotechnology, 24, 382001, 10.1088/0957-4484/24/38/382001

Le, 2012, Building high-level features using large scale unsupervised learning, International Conference on Machine Learning

LeCun, 2015, Deep learning, Nature, 521, 436, 10.1038/nature14539

LeCun, 1998, Gradient-based learning applied to document recognition, Proc. IEEE, 86, 2278, 10.1109/5.726791

Lehmann, 1993, A generic systolic array building block for neural networks with on-chip learning, IEEE Trans. Neural Netw., 4, 400, 10.1109/72.217181

Li, 2014, Training itself: mixed-signal training acceleration for memristor-based neural network, 19th Asia and South Pacific Design Automation Conference (ASP-DAC), 10.1109/ASPDAC.2014.6742916

NVIDIA, 2012, NVIDIA's next generation CUDA compute architecture: Kepler GK110, Whitepaper

O'Connor, 2016, Deep spiking networks

Poppelbaum, 1967, Stochastic computing elements and systems, Proceedings of the AFIPS Fall Joint Computer Conference, 635

Prezioso, 2015, Training and operation of an integrated neuromorphic network based on metal-oxide memristors, Nature, 521, 61, 10.1038/nature14441

Rumelhart, 1986, Learning representations by back-propagating errors, Nature, 323, 533, 10.1038/323533a0

Saighi, 2015, Plasticity in memristive devices for spiking neural networks, Front. Neurosci., 9, 10.3389/fnins.2015.00051

Seo, 2015, On-chip sparse learning acceleration with CMOS and resistive synaptic devices, IEEE Trans. Nanotechnol., 14, 969, 10.1109/TNANO.2015.2478861

Simonyan, 2015, Very Deep Convolutional Networks for Large-Scale Image Recognition

Soudry, 2015, Memristor-based multilayer neural networks with online gradient descent training, IEEE Trans. Neural Netw. Learn. Syst., 26, 2408, 10.1109/TNNLS.2014.2383395

Steinbuch, 1961, Die Lernmatrix, Kybernetik, 1, 36, 10.1007/BF00293853

Strukov, 2008, The missing memristor found, Nature, 453, 80, 10.1038/nature06932

Stuecheli, 2013, Next Generation POWER microprocessor, Hot Chips Conference

Szegedy, 2015, Going Deeper with Convolutions, 10.1109/CVPR.2015.7298594

Wu, 2015, Deep image: scaling up image recognition

Xu, 2014, Parallel programming of resistive cross-point array for synaptic plasticity, Procedia Comput. Sci., 41, 126, 10.1016/j.procs.2014.11.094

Yu, 2013, A low energy oxide-based electronic synaptic device for neuromorphic visual systems with tolerance to device variation, Adv. Mater., 25, 1774, 10.1002/adma.201203680

Yu, 2015, Scaling-up resistive synaptic arrays for neuro-inspired architecture: challenges and prospect, International Electron Devices Meeting (IEDM), 451