On the Effectiveness of Partial Variance Reduction in Federated Learning with Heterogeneous Data

Bo Li, Mikkel N. Schmidt, Tommy S. Alstrom, Sebastian U. Stich

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, the performance in over-parameterized models such as deep neural networks is lacking. In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes the performance. Motivated by this, we propose to correct model drift by variance reduction only on the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We furthermore provide proof for the convergence rate of our algorithm.
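The idea of correcting drift only on the final layers can be illustrated with a small toy sketch. The abstract does not spell out the update rule, so the code below assumes a SCAFFOLD-style control variate applied to a "classifier" block while a "features" block is trained with plain FedAvg. The quadratic client losses, the names `local_update` and `run_rounds`, and all hyperparameters are hypothetical and are not taken from the paper.

```python
def local_update(global_w, target, c_global, c_local, lr=0.1, steps=5):
    """One client's local training on the toy loss 0.5 * ||w - target||^2.

    Assumption: only the "classifier" block receives the control-variate
    correction; the "features" block uses the raw gradient (plain FedAvg).
    """
    w = {name: list(vals) for name, vals in global_w.items()}
    for _ in range(steps):
        for name in w:
            for j in range(len(w[name])):
                grad = w[name][j] - target[name][j]  # gradient of the toy loss
                if name == "classifier":
                    # Drift correction on the final layer only.
                    grad += c_global[j] - c_local[j]
                w[name][j] -= lr * grad
    # Refresh the client's control variate from its net classifier movement.
    new_c_local = [
        c_local[j] - c_global[j]
        + (global_w["classifier"][j] - w["classifier"][j]) / (steps * lr)
        for j in range(len(c_local))
    ]
    return w, new_c_local


def run_rounds(targets, rounds=50, lr=0.1, steps=5):
    """Full-participation training loop; returns the final global model."""
    n = len(targets)
    global_w = {name: [0.0] * len(vals) for name, vals in targets[0].items()}
    dim = len(global_w["classifier"])
    # Control variates exist only for the classifier block, so the extra
    # state exchanged per round is classifier-sized, not model-sized.
    c_global = [0.0] * dim
    c_locals = [[0.0] * dim for _ in range(n)]
    for _ in range(rounds):
        results = [
            local_update(global_w, t, c_global, c, lr, steps)
            for t, c in zip(targets, c_locals)
        ]
        # Server step: average the client models block by block.
        for name in global_w:
            for j in range(len(global_w[name])):
                global_w[name][j] = sum(w[name][j] for w, _ in results) / n
        c_locals = [c for _, c in results]
        c_global = [sum(c[j] for c in c_locals) / n for j in range(dim)]
    return global_w


# Two hypothetical clients whose optima differ (heterogeneous data).
clients = [
    {"features": [0.0, 2.0], "classifier": [1.0, -1.0]},
    {"features": [2.0, 4.0], "classifier": [3.0, 1.0]},
]
final_model = run_rounds(clients)
```

On this toy problem both blocks settle at the average of the client optima, so the sketch shows only the mechanics of a partial correction and not the accuracy or communication gains reported in the abstract.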
Original language: English
Title of host publication: Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: IEEE
Publication date: 2023
Pages: 3964-3973
ISBN (Print): 979-8-3503-0130-4
ISBN (Electronic): 979-8-3503-0129-8
Publication status: Published - 2023
Event: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops - Vancouver, Canada
Duration: 17 Jun 2023 → 24 Jun 2023

Conference

Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops
Country/Territory: Canada
City: Vancouver
Period: 17/06/2023 → 24/06/2023
