Optimally removing synchronization overhead for CNNs in three-dimensional neuromorphic architecture

Yi Wang, Ruibiao Chen, Rui Mao, Zili Shao

Research output: Journal article publication › Journal article › Academic research › peer-review

4 Citations (Scopus)


The recent three-dimensional (3-D) neuromorphic processing-in-memory (PIM) architecture provides a promising hardware-based solution for speeding up the processing of convolutional neural networks (CNNs). However, the limited capacity of the global buffer in this architecture makes it unable to handle synchronization overhead efficiently. In this paper, we jointly optimize the allocation of computation and memory resources on the 3-D-stacked PIM architecture. The objective is to minimize the schedule length by removing synchronization overhead. To guarantee that a feasible task schedule is generated, we theoretically derive an upper bound for rescheduling each computation task. The target problem is then formulated as a dynamic programming model to obtain an optimal solution. We evaluate our technique with a variety of realistic neural network applications running on the deep learning frameworks Caffe and TensorFlow. The results show that the proposed technique significantly reduces processing time and improves the utilization of processing cores compared with previous studies.

Original language: English
Pages (from-to): 8973-8981
Number of pages: 9
Journal: IEEE Transactions on Industrial Electronics
Issue number: 11
Publication status: Published - 1 Nov 2018


Keywords

  • Convolutional neural networks (CNNs)
  • embedded systems
  • neuromorphic computing
  • processing-in-memory
  • scheduling

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
