Deep reinforcement learning: Algorithm, applications, and ultra-low-power implementation

Nano Communication Networks - Tập 16 - Trang 81-90 - 2018
Hongjia Li1, Ruizhe Cai1, Ning Liu1, Xue Lin2, Yanzhi Wang1
1Syracuse University, Syracuse, United States
2Northeastern University, Boston, United States

Tài liệu tham khảo

Sutton, 1998 V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, Playing atari with deep reinforcement learning, arXiv Preprint arXiv:1312.5602. Silver, 2016, Mastering the game of Go with deep neural networks and tree search, Nature, 529, 484, 10.1038/nature16961 C. Reiss, J. Wilkes, J.L. Hellerstein, Google cluster-usage traces: format + schema, [Online]. Available: http://code.google.com/p/googleclusterdata/wiki/TraceVersion2 , Nov. 2011. Mnih, 2015, Human-level control through deep reinforcement learning, Nature, 518, 529, 10.1038/nature14236 T.P. Lillicrap, J.J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, D. Wierstra, Continuous control with deep reinforcement learning, arXiv Preprint arXiv:1509.02971. V. Mnih, A.P. Badia, M. Mirza, A. Graves, T.P. Lillicrap, T. Harley, D. Silver, K. Kavukcuoglu, Asynchronous methods for deep reinforcement learning, in: International Conference on Machine Learning, 2016. Van Hasselt, 2016, Deep reinforcement learning with double q-learning, 2094 A. Ren, J. Li, Z. Li, C. Ding, X. Qian, Q. Qiu, B. Yuan, Y. Wang, Sc-dcnn: highly-scalable deep convolutional neural network using stochastic computing, arXiv Preprint arXiv:1611.05939. Rao, 2009, Vconf: a reinforcement learning approach to virtual machines auto-configuration, 137 D. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv Preprint arXiv:1412.6980. P. Mell, T. Grance, The nist definition of cloud computing. Fan, 2007, Power provisioning for a warehouse-sized computer, 13 Duff, 1995, Reinforcement learning methods for continuous-time Markov decision problems, Adv. Neural Inf. Process. Syst., 7, 393 Gross, 2008 M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, et al., Tensorflow: Large-scale machine learning on heterogeneous distributed systems, arXiv Preprint arXiv:1603.04467. Dynamic pricing Comverge[Online]. Available: http://www.comverge.com/Comverge/media/pdf/Whitepaper/Comverge-Dynamic-Pricing-White-Paper.pdf. Vardakas, 2014, Scheduling policies for two-state smart-home appliances in dynamic electricity pricing environments, Energy, 69, 455, 10.1016/j.energy.2014.03.037 Vardakas, 2015, A survey on demand response programs in smart grids: pricing methods and optimization algorithms, IEEE Communications Surveys & Tutorials, 17, 152, 10.1109/COMST.2014.2341586 Farhangi, 2010, The path of the smart grid, IEEE Power Energ. Mag., 8, 18, 10.1109/MPE.2009.934876 Renewables 2016 global status report, REN21[Online]. Available: http://www.ren21.net/wp-content/uploads/2016/10/REN21-GSR2016-FullReport-en-11.pdf. Baltimore gas and electric company, [Online]. Available: https://supplier.bge.com/electric/load/profiles.asp. Li, 2014, Negotiation-based task scheduling to minimize user??? s electricity bills under dynamic energy prices, 1 Yuan, 2016, Design space exploration for hardware-efficient stochastic computing: A case study on discrete cosine transformation, 6555 Gaines, 1967, Stochastic computing, 149 Brown, 2001, Stochastic neural computation. i. Computational elements, IEEE Trans. Comput., 50, 891, 10.1109/12.954505 Kim, 2015, Approximate de-randomizer for stochastic circuits, 123