Pyramidal Predictive Network: A Model for Visual-Frame Prediction Based on Predictive Coding Theory

Chaofan Ling, Junpei Zhong (Corresponding Author), Weihua Li (Corresponding Author)

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

Visual-frame prediction is a pixel-dense prediction task that infers future frames from past frames. Current models and methods still suffer from a lack of appearance detail, low prediction accuracy, and high computational overhead. In this paper, we propose a novel neural network model inspired by the well-known predictive coding theory to address these problems. Predictive coding provides an interesting and reliable computational framework. We combined it with other theories, such as the theory that the cerebral cortex oscillates at different frequencies at different levels, to design an efficient and reliable predictive network model for visual-frame prediction. Specifically, the model is composed of a series of recurrent and convolutional units forming the top-down and bottom-up streams, respectively. The update frequency of the neural units in each layer decreases as the network level increases, so higher-level neurons can capture information over longer time spans. According to the experimental results, the model is more compact than existing works while achieving comparable predictive performance, implying lower computational cost and higher prediction accuracy.
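The layer-wise update schedule described in the abstract, where higher levels update less often than lower ones, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `base_period` parameter, and the rule that level l updates once every `base_period ** l` time steps are all assumptions made here for illustration only; the abstract states only that update frequency decreases with network level.

```python
def update_schedule(num_levels, num_steps, base_period=2):
    """Return, for each time step, the list of levels that update.

    Assumption (not taken from the paper): level l updates once every
    base_period ** l steps, so level 0 updates at every step and each
    higher level updates less frequently.
    """
    schedule = []
    for t in range(num_steps):
        active = [
            level
            for level in range(num_levels)
            if t % (base_period ** level) == 0
        ]
        schedule.append(active)
    return schedule


def updates_per_level(schedule, num_levels):
    """Count how many times each level updates over the whole schedule."""
    counts = [0] * num_levels
    for active in schedule:
        for level in active:
            counts[level] += 1
    return counts


if __name__ == "__main__":
    sched = update_schedule(num_levels=3, num_steps=8)
    for t, active in enumerate(sched):
        print(f"t={t}: levels updating = {active}")
    print("updates per level:", updates_per_level(sched, 3))
```

Under this assumed schedule, the total number of unit updates shrinks geometrically with level, which is one way a pyramidal design could reduce computation while letting higher levels integrate over longer time spans.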

Original language: English
Article number: 2969
Journal: Electronics (Switzerland)
Volume: 11
Issue number: 18
DOIs
Publication status: Published - 19 Sep 2022

Keywords

  • neural network
  • predictive coding
  • video prediction

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Signal Processing
  • Hardware and Architecture
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
