Fast-Convergent Federated Learning

IEEE Journal on Selected Areas in Communications - Volume 39, Issue 1, Pages 201-218 - 2021
Hung T. Nguyen1, Vikash Sehwag1, Seyyedali Hosseinalipour2, Christopher G. Brinton2, Mung Chiang2, H. Vincent Poor1
1Department of Electrical Engineering, Princeton University, Princeton, NJ, USA
2School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA

Abstract

Federated learning has emerged recently as a promising solution for distributing machine learning tasks through modern networks of mobile devices. Recent studies have obtained lower bounds on the expected decrease in model loss that is achieved through each round of federated learning. However, convergence generally requires a large number of communication rounds, which induces delay in model training and is costly in terms of network resources. In this paper, we propose a fast-convergent federated learning algorithm, called $\mathsf{FOLB}$, which performs intelligent sampling of devices in each round of model training to optimize the expected convergence speed. We first theoretically characterize a lower bound on the improvement that can be obtained in each round if devices are selected according to the expected improvement their local models will provide to the current global model. Then, we show that $\mathsf{FOLB}$ obtains this bound through uniform sampling by weighting device updates according to their gradient information. $\mathsf{FOLB}$ is able to handle both communication and computation heterogeneity across devices by adapting its aggregations according to estimates of each device's capability to contribute to the updates. We evaluate $\mathsf{FOLB}$ against existing federated learning algorithms and experimentally show its improvement in trained model accuracy, convergence speed, and/or model stability across a variety of machine learning tasks and datasets.
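To make the aggregation idea in the abstract concrete, the following is a minimal, hypothetical sketch of gradient-weighted aggregation over a uniformly sampled set of devices. It is not the paper's exact update rule: the function name `folb_style_aggregate`, the use of the inner product between local and global gradients as the "gradient information", and the normalization are all illustrative assumptions made here for clarity.

```python
import numpy as np

def folb_style_aggregate(global_model, local_updates, local_grads, global_grad):
    """Illustrative gradient-weighted aggregation (assumed, not the paper's exact rule).

    Each uniformly sampled device supplies a local model update and its local
    gradient; updates whose gradients align better with the global gradient
    receive larger aggregation weights.
    """
    # Alignment score: inner product of each local gradient with the global
    # gradient (a hypothetical stand-in for the "gradient information"
    # mentioned in the abstract).
    scores = np.array([float(np.dot(g, global_grad)) for g in local_grads])
    scores = np.clip(scores, 0.0, None)  # discard updates that conflict with the global direction

    if scores.sum() == 0.0:
        # Fall back to plain averaging when no update aligns with the global gradient.
        weights = np.full(len(local_updates), 1.0 / len(local_updates))
    else:
        weights = scores / scores.sum()

    # Apply the weighted combination of sampled device updates to the global model.
    aggregated_update = sum(w * u for w, u in zip(weights, local_updates))
    return global_model + aggregated_update
```

Under these assumptions, the server would call this once per communication round with the current global model, the sampled devices' local updates and gradients, and an estimate of the global gradient; the weighting is what distinguishes the sketch from plain FedAvg-style uniform averaging.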

Keywords

Federated learning; distributed optimization; fast convergence rate
