Title | Two-layer Coded Gradient Aggregation with Straggling Communication Links |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Liang, Kai, Wu, Youlong |
Conference Name | 2020 IEEE Information Theory Workshop (ITW) |
Keywords | Coded computing, composability, compositionality, Computing Theory, Data models, Distributed databases, distributed learning, Downlink, encoding, Resiliency, Servers, straggling |
Abstract | In many distributed learning setups such as federated learning, client nodes at the edge use individually collected data to compute local gradients and send them to a central master server; the master aggregates the received gradients and broadcasts the aggregate to all clients, which then update the global model. Since straggling communication links can severely degrade the performance of a distributed learning system, Prakash et al. proposed utilizing helper nodes and a coding strategy to achieve resiliency against straggling client-to-helper links. In this paper, we propose two coding schemes, repetition coding (RC) and MDS coding, both of which enable the clients to update the global model with only the helpers, i.e., without the master. Moreover, we characterize the uplink and downlink communication loads and prove the tightness of the uplink communication load. A theoretical tradeoff between the uplink and downlink communication loads is established, indicating that a larger uplink communication load can reduce the downlink communication load. Compared to Prakash et al.'s schemes, which require a master connected to the helpers through noiseless links, our schemes can reduce the communication load even in the absence of a master when the numbers of clients and helpers are relatively large compared to the number of straggling links. |
DOI | 10.1109/ITW46852.2021.9457626 |
Citation Key | liang_two-layer_2021 |