Cooperation of Mobile Devices for Fast Inference of Deep Learning Applications

Mobile Networks and Applications - Volume 26 - Pages 1243-1249 - 2019
Qinglin Yang1, Xiaofei Luo1, Peng Li1, Toshiaki Miyazaki1, Wenfeng Shen2, Weiqin Tong2
1School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu, Japan
2Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China

Abstract

Deep learning stimulates many novel mobile applications, but enabling efficient deep learning on mobile devices remains challenging. The traditional approach tackles this challenge by offloading computation tasks to the cloud, which suffers from high bandwidth requirements and long transmission latency. In this paper, we propose to enable collaborative inference among mobile devices. Instead of sending deep learning inference tasks to the cloud, we let mobile devices collaboratively share the computation workload. This is based on an important observation: batching inference tasks on GPUs accelerates inference. To achieve efficient collaboration, we design an algorithm based on particle swarm optimization (PSO), a versatile population-based stochastic optimization technique. We also design a distributed algorithm for settings where it is difficult to collect global network information and run the centralized algorithm. Extensive simulations are conducted to evaluate the performance of the designed algorithms. The simulation results show that the collaborative inference scheme can effectively reduce the inference time of mobile deep learning applications.
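To make the idea of PSO-based workload sharing concrete, the following is a minimal sketch of a basic global-best PSO that splits a batch of inference tasks across one local device and two helpers. It is not the algorithm from the paper: the cost model (a per-task transfer time, a fixed per-batch setup time that batching amortizes, and a per-task compute time), the device numbers, and all names such as `DEVICES`, `makespan`, and `pso_minimize` are assumptions chosen only for illustration.

```python
import random

# Hypothetical devices: (transfer s/task, fixed setup s/batch, compute s/task).
# Index 0 is the local device, so it pays no transfer cost.
DEVICES = [(0.00, 0.1, 0.05), (0.01, 0.1, 0.03), (0.01, 0.1, 0.03)]
N_TASKS = 100


def makespan(x):
    """Completion time when tasks are split in proportion to the weights x."""
    total = sum(x)
    if total == 0:
        return float("inf")  # no device was assigned any work
    times = []
    for weight, (transfer, setup, per_task) in zip(x, DEVICES):
        batch = N_TASKS * weight / total
        # The fixed setup cost is paid once per batch, not once per task.
        times.append(0.0 if batch == 0 else batch * (transfer + per_task) + setup)
    return max(times)


def pso_minimize(cost, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(x) over x in [0, 1]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each coordinate to the search box [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


all_local = makespan([1.0, 0.0, 0.0])  # baseline: no collaboration
best_x, best_cost = pso_minimize(makespan, dim=len(DEVICES))
print(f"all local: {all_local:.2f} s, PSO split: {best_cost:.2f} s")
```

Under these assumed numbers, any split that gives the helpers some work finishes sooner than running everything locally, so the sketch mainly shows how a candidate split is encoded as a particle and scored by its completion time. A realistic cost model would need measured transfer rates and batch-size-dependent GPU throughput.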
