Towards adaptive learning with improved convergence of deep belief networks on graphics processing units

Pattern Recognition, Vol. 47, pp. 114–127, 2014
Noel Lopes1,2, Bernardete Ribeiro1,3
1CISUC – Center for Informatics and Systems of University of Coimbra, Portugal
2UDI/IPG – Research Unit, Polytechnic of Guarda, Portugal
3Department of Informatics Engineering, University of Coimbra, Portugal

References

Markoff, Giant steps in teaching computers to think like us: 'neural nets' mimic the ways human minds listen, see and execute, International Herald Tribune, 24–25, 1, 2012.

H. Larochelle, D. Erhan, A. Courville, J. Bergstra, Y. Bengio, An empirical evaluation of deep architectures on problems with many factors of variation, in: Proceedings of the 24th International Conference on Machine Learning, pp. 473–480, 2007.

Roux, Representational power of restricted Boltzmann machines and deep belief networks, Neural Computation, vol. 20, p. 1631, 2008, doi:10.1162/neco.2008.04-07-510.

Bengio, Learning deep architectures for AI, Foundations and Trends in Machine Learning, vol. 2, p. 1, 2009, doi:10.1561/2200000006.

H. Lee, R. Grosse, R. Ranganath, A.Y. Ng, Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, in: Proceedings of the 26th International Conference on Machine Learning, pp. 609–616, 2009.

Yu, Deep learning and its applications to signal and information processing, IEEE Signal Processing Magazine, vol. 28, p. 145, 2011, doi:10.1109/MSP.2010.939038.

Roux, Deep belief networks are compact universal approximators, Neural Computation, vol. 22, p. 2192, 2010, doi:10.1162/neco.2010.08-09-1081.

Hinton, A fast learning algorithm for deep belief nets, Neural Computation, vol. 18, p. 1527, 2006, doi:10.1162/neco.2006.18.7.1527.

K. Swersky, B. Chen, B. Marlin, N. de Freitas, A tutorial on stochastic approximation algorithms for training restricted Boltzmann machines and deep belief nets, in: Information Theory and Applications Workshop, pp. 1–10, 2010.

N. Lopes, B. Ribeiro, Improving convergence of restricted Boltzmann machines via a learning adaptive step size, in: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, Lecture Notes in Computer Science, vol. 7441, Springer, Berlin/Heidelberg, pp. 511–518, 2012.

N. Lopes, B. Ribeiro, J. Gonçalves, Restricted Boltzmann machines and deep belief networks on multi-core processors, in: The 2012 International Joint Conference on Neural Networks (IJCNN), 2012.

S.K. Kim, P.L. McMahon, K. Olukotun, A large-scale architecture for restricted Boltzmann machines, in: 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010.

Ly, A high-performance FPGA architecture for restricted Boltzmann machines, IEEE Transactions on Neural Networks, vol. 21, p. 1780, 2010, doi:10.1109/TNN.2010.2073481.

R. Raina, A. Madhavan, A.Y. Ng, Large-scale deep unsupervised learning using graphics processors, in: Proceedings of the 26th International Conference on Machine Learning, pp. 873–880, 2009.

D.L. Ly, V. Paprotski, D. Yen, Neural Networks on GPUs: Restricted Boltzmann Machines, Technical Report, University of Toronto, 2009.

D. Steinkraus, I. Buck, P.Y. Simard, Using GPUs for machine learning algorithms, in: Proceedings of the 8th International Conference on Document Analysis and Recognition, vol. 2, pp. 1115–1120, 2005.

Garland, Understanding throughput-oriented architectures, Communications of the ACM, vol. 53, p. 58, 2010, doi:10.1145/1839676.1839694.

B. Catanzaro, N. Sundaram, K. Keutzer, Fast support vector machine training and classification on graphics processors, in: Proceedings of the 25th International Conference on Machine Learning, pp. 104–111, 2008.

G.E. Hinton, A Practical Guide to Training Restricted Boltzmann Machines, Technical Report, Department of Computer Science, University of Toronto, 2010.

M. Ranzato, Y. Boureau, Y. LeCun, Sparse feature learning for deep belief networks, in: Advances in Neural Information Processing Systems (NIPS 2007), vol. 20, pp. 1185–1192, 2007.

M.A. Carreira-Perpiñán, G.E. Hinton, On contrastive divergence learning, in: Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), pp. 33–40, 2005.

Hinton, Training products of experts by minimizing contrastive divergence, Neural Computation, vol. 14, p. 1771, 2002, doi:10.1162/089976602760128018.

Srinivas, Speaker independent vowel recognition using backpropagation neural network on master–slave architecture, International Journal of Computer Applications, vol. 48, p. 45, 2012, doi:10.5120/7332-9924.

Lopes, An evaluation of multiple feed-forward networks on GPUs, International Journal of Neural Systems, vol. 21, p. 31, 2011, doi:10.1142/S0129065711002638.

Zainuddin, Improving the convergence of the backpropagation algorithm using local adaptive techniques, International Journal of Computational Intelligence, p. 172, 2005.

F.M. Silva, L.B. Almeida, Acceleration techniques for the backpropagation algorithm, in: Proceedings of the EURASIP Workshop on Neural Networks, Lecture Notes in Computer Science, vol. 412, Springer Verlag, 1990.

Almeida, 1997.

Lopes, GPUMLib, International Journal of Computer Information Systems and Industrial Management Applications, vol. 3, p. 355, 2011.

N. Lopes, B. Ribeiro, R. Quintas, GPUMLib: a new library to combine machine learning algorithms with graphics processing units, in: Proceedings of the 10th International Conference on Hybrid Intelligent Systems, pp. 229–232, 2010.

Owens, GPU computing, Proceedings of the IEEE, vol. 96, p. 879, 2008, doi:10.1109/JPROC.2008.917757.

Hey, 2009.

T.R. Halfhill, Looking Beyond Graphics, Technical Report, In-Stat, 2009.

S. Ryoo, C.I. Rodrigues, S.S. Baghsorkhi, S.S. Stone, D.B. Kirk, W.W. Hwu, Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, in: Proceedings of the 13th ACM Symposium on Principles and Practice of Parallel Programming, pp. 73–82, 2008.

Lopes, An efficient gradient-based learning algorithm applied to neural networks with selective actuation neurons, Neural, Parallel and Scientific Computations, vol. 11, p. 253, 2003.