A Fully Distributed Training for Class Incremental Learning in Multihead Networks

Mingjun Dai, Yonghao Kong, Junpei Zhong, Shengli Zhang, Hui Wang

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-reviewed

Abstract

Owing to its elastic scalability, the multi-head network is favored in incremental learning (IL). During the IL process, the model size of a multi-head network grows continually with the number of branches, making it difficult to store and train on a single node. To this end, a distributed training architecture, together with its prerequisite, is proposed within the model-parallelism framework. Assuming the prerequisite is satisfied, a distributed training algorithm is proposed. In addition, to resolve the dilemma that the prevalent cross-entropy (CE) loss function does not fit the distributed setting, a fully distributed cross-entropy (D-CE) loss function is proposed, which avoids information exchange among nodes. A corresponding training scheme based on D-CE is proposed (D-CE-Train). This method avoids the model-size expansion problem of centralized training, employs a distributed implementation to speed up training, and reduces the inter-node interaction that could otherwise significantly slow down training. A series of experiments verifies the effectiveness of the proposed method.
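The abstract does not spell out the D-CE formulation, but the communication problem it targets can be illustrated. With heads (class groups) sharded across nodes, a standard softmax cross-entropy needs the global normalizer, i.e., the exponentiated logits from every node, whereas a loss built from per-node terms only (here sketched as a per-class sigmoid cross-entropy, an assumption for illustration rather than the paper's exact D-CE) needs no cross-node exchange:

```python
import numpy as np

def global_softmax_ce(logits_per_node, target):
    """Standard CE over sharded heads: the softmax normalizer sums
    exp-logits across ALL nodes, so it requires communication."""
    logits = np.concatenate(logits_per_node)
    z = logits - logits.max()                      # stabilize
    log_probs = z - np.log(np.exp(z).sum())        # global normalizer
    return -log_probs[target]

def local_bce(node_logits, node_targets):
    """Communication-free alternative (illustrative only): each node
    scores its own classes with independent sigmoids, so no global
    normalizer -- and hence no inter-node exchange -- is needed."""
    p = 1.0 / (1.0 + np.exp(-node_logits))
    eps = 1e-12
    return -(node_targets * np.log(p + eps)
             + (1.0 - node_targets) * np.log(1.0 - p + eps)).sum()

# Two nodes, each hosting a 3-class head (global classes 0-2 and 3-5).
rng = np.random.default_rng(0)
logits_per_node = [rng.normal(size=3), rng.normal(size=3)]
target = 4  # global class index, owned by node 1

# Centralized loss: all logits must be gathered in one place.
ce = global_softmax_ce(logits_per_node, target)

# Distributed-style loss: each node evaluates its term independently.
one_hot = np.zeros(6)
one_hot[target] = 1.0
local_losses = [local_bce(logits_per_node[i], one_hot[3 * i:3 * (i + 1)])
                for i in range(2)]
```

The key property is that each entry of `local_losses` depends only on one node's logits and targets, so the heads can be trained in parallel without exchanging activations, which is the dilemma the proposed D-CE is said to resolve.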

Original language: English
Title of host publication: IEEE INFOCOM 2023 - Conference on Computer Communications Workshops, INFOCOM WKSHPS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665494274
DOIs
Publication status: Published - 29 Aug 2023
Event: 2023 IEEE INFOCOM Conference on Computer Communications Workshops, INFOCOM WKSHPS 2023 - Hoboken, United States
Duration: 20 May 2023 → …

Publication series

Name: IEEE INFOCOM 2023 - Conference on Computer Communications Workshops, INFOCOM WKSHPS 2023

Conference

Conference: 2023 IEEE INFOCOM Conference on Computer Communications Workshops, INFOCOM WKSHPS 2023
Country/Territory: United States
City: Hoboken
Period: 20/05/23 → …

Keywords

  • class incremental learning
  • cross-entropy
  • distributed implementation
  • multi-head network

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Signal Processing
  • Safety, Risk, Reliability and Quality
