Communication-Computation Trade-off in Resource-Constrained Edge Inference

Jiawei Shao, Jun Zhang

Research output: Journal article publication › Journal article › Academic research › peer-review

88 Citations (Scopus)

Abstract

The recent breakthrough in artificial intelligence (AI), especially deep neural networks (DNNs), has affected every branch of science and technology. In particular, edge AI has been envisioned as a major application scenario for providing DNN-based services at edge devices. This article presents effective methods for edge inference at resource-constrained devices. It focuses on device-edge co-inference, assisted by an edge computing server, and investigates a critical trade-off between the computational cost of the on-device model and the communication overhead of forwarding the intermediate feature to the edge server. A general three-step framework is proposed for effective inference: model split point selection to determine the on-device model, communication-aware model compression to simultaneously reduce the on-device computation and the resulting communication overhead, and task-oriented encoding of the intermediate feature to further reduce the communication overhead. Experiments demonstrate that the proposed framework achieves a better trade-off and significantly reduces inference latency compared with baseline methods.
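The split-point selection step of the framework can be illustrated with a minimal sketch. The layer names, FLOP counts, feature sizes, and cost weights below are hypothetical placeholders, not the authors' model or code; the point is only to show how running more layers on-device raises computation cost while shrinking the intermediate feature that must be sent to the edge server.

```python
# Illustrative sketch (hypothetical numbers, not the authors' implementation):
# choosing a model split point by trading on-device computation against
# the communication overhead of sending the intermediate feature.

# Each layer: (name, on-device FLOPs in millions, output feature size in KB).
# Deeper layers cost more compute but emit smaller features.
layers = [
    ("conv1", 20, 400),
    ("conv2", 40, 200),
    ("conv3", 60, 100),
    ("conv4", 80, 50),
    ("fc",    90, 4),
]

RAW_INPUT_KB = 500  # size of the raw input if nothing runs on-device

def split_cost(split, compute_weight=1.0, comm_weight=1.0):
    """Weighted cost of running layers[:split] on-device and forwarding
    the intermediate feature of the last on-device layer to the server.
    split == 0 means the raw input is sent to the server directly."""
    device_flops = sum(flops for _, flops, _ in layers[:split])
    feature_kb = layers[split - 1][2] if split > 0 else RAW_INPUT_KB
    return compute_weight * device_flops + comm_weight * feature_kb

# Evaluate every candidate split point and pick the cheapest one.
best = min(range(len(layers) + 1), key=split_cost)
print("best split:", best, "cost:", split_cost(best))
```

With these toy numbers the minimum lands at an intermediate layer: splitting too early sends a large feature, while splitting too late wastes on-device compute, which is exactly the trade-off the article studies.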

Original language: English
Article number: 9311935
Pages (from-to): 20-26
Number of pages: 7
Journal: IEEE Communications Magazine
Volume: 58
Issue number: 12
DOIs
Publication status: Published - Dec 2020

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering

