Decentralized computation offloading for multi-user mobile edge computing: a deep reinforcement learning approach
Abstract
Mobile edge computing (MEC) has recently emerged as a promising solution for relieving resource-limited mobile devices of computation-intensive tasks: devices offload workloads to nearby MEC servers, which improves the quality of the computation experience. This paper considers an MEC-enabled multi-user multi-input multi-output (MIMO) system with stochastic wireless channels and task arrivals. To minimize the long-term average computation cost at each user, measured in terms of power consumption and buffering delay, a deep reinforcement learning (DRL)-based dynamic computation offloading strategy is investigated with the aim of building a scalable system with limited feedback. Specifically, deep deterministic policy gradient (DDPG), a DRL approach with a continuous action space, is adopted to learn a decentralized computation offloading policy at each user, where the powers for local execution and task offloading are allocated adaptively according to that user's local observation. Numerical results demonstrate that the proposed DDPG-based strategy helps each user learn an efficient dynamic offloading policy. They also verify that its continuous power allocation achieves lower computation cost than policies learned by conventional discrete action space-based reinforcement learning approaches such as deep Q-network (DQN), as well as some greedy strategies. In addition, the power-delay tradeoff for computation offloading is analyzed for both the DDPG-based and DQN-based strategies.
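As a rough illustration only, the per-user cost mentioned in the abstract can be pictured as a weighted sum of the power spent in a time slot and the task backlog left in the buffer, averaged over many slots. The sketch below assumes that form. The function names, the weights `w_power` and `w_delay`, and the simple buffer update are hypothetical choices made for illustration and are not taken from the paper.

```python
def slot_cost(p_local, p_offload, buffer_bits, w_power=1.0, w_delay=1.0):
    """Hypothetical per-slot cost: weighted power plus weighted backlog.

    p_local and p_offload are the powers chosen for local execution and
    task offloading; buffer_bits is the backlog, used here as a stand-in
    for buffering delay. The weights are illustrative assumptions.
    """
    return w_power * (p_local + p_offload) + w_delay * buffer_bits


def next_buffer(buffer_bits, arrived_bits, local_bits, offloaded_bits):
    """Assumed buffer update: serve what is possible, then add new arrivals."""
    remaining = max(buffer_bits - local_bits - offloaded_bits, 0.0)
    return remaining + arrived_bits


def average_cost(costs):
    """Empirical long-term average over the observed slots."""
    return sum(costs) / len(costs)


# Toy rollout with fixed (not learned) powers and service amounts.
buffer_bits = 8.0
costs = []
for arrived in [2.0, 0.0, 4.0]:
    costs.append(slot_cost(0.5, 1.5, buffer_bits, w_power=0.5, w_delay=0.25))
    buffer_bits = next_buffer(buffer_bits, arrived, local_bits=1.0, offloaded_bits=3.0)
print(average_cost(costs))
```

In a learned policy, the two powers would be produced per slot from the user's local observation rather than fixed as they are here. A continuous action space lets them take any value in the allowed range, whereas a discrete action space limits them to a preset set of levels.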