Abstract
Distributed machine learning on edges is widely used in intelligent transportation, smart homes, industrial manufacturing, and underground pipe network monitoring to achieve low-latency, real-time data processing and prediction. However, the large number of sensing and edge devices with limited computing, storage, and communication capabilities prevents the deployment of large machine learning models, which in turn hinders the application of distributed machine learning on edges. Moreover, although distributed machine learning on edges is an emerging and rapidly growing research area, there has not been a systematic survey on this topic. The article begins by detailing the challenges of distributed machine learning in edge environments, such as limited node resources, data heterogeneity, privacy, and security issues, and summarizes common metrics for model optimization. We then present a detailed analysis of parallelism patterns, distributed architectures, and model communication and aggregation schemes in edge computing. We subsequently present a comprehensive classification and in-depth description of processing under node resource constraints, heterogeneous data processing, and privacy attacks and protections. The article ends by summarizing the applications of distributed machine learning in edge computing and presenting open problems and challenges for further research.
| Original language | English |
|---|---|
| Article number | 132 |
| Journal | ACM Computing Surveys |
| Volume | 57 |
| Issue number | 5 |
| DOIs | |
| Publication status | Published - 24 Jan 2025 |
Keywords
- communication constraints
- data heterogeneity
- distributed machine learning
- edge computing
- model optimization
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science