TY - JOUR
T1 - Pyramidal Predictive Network
T2 - A Model for Visual-Frame Prediction Based on Predictive Coding Theory
AU - Ling, Chaofan
AU - Zhong, Junpei
AU - Li, Weihua
N1 - Funding Information:
This work was partially supported by the Key-Area Research and Development Program of Guangdong Province under Grant: 2019B090912001, and by the PolyU Grants: ZVUY-P0035417, CD5E-P0043422, WZ09-P0043123.
Publisher Copyright:
© 2022 by the authors.
PY - 2022/9/19
Y1 - 2022/9/19
AB - Visual-frame prediction is a pixel-dense prediction task that infers future frames from past frames. A lack of appearance details, low prediction accuracy and a high computational overhead are still major problems associated with current models or methods. In this paper, we propose a novel neural network model inspired by the well-known predictive coding theory to deal with these problems. Predictive coding provides an interesting and reliable computational framework. We combined this approach with other theories, such as the theory that the cerebral cortex oscillates at different frequencies at different levels, to design an efficient and reliable predictive network model for visual-frame prediction. Specifically, the model is composed of a series of recurrent and convolutional units forming the top-down and bottom-up streams, respectively. The update frequency of neural units on each of the layers decreases with the increase in the network level, which means that neurons of a higher level can capture information in longer time dimensions. According to the experimental results, this model showed better compactness and comparable predictive performance with those of existing works, implying lower computational cost and higher prediction accuracy.
KW - neural network
KW - predictive coding
KW - video prediction
UR - http://www.scopus.com/inward/record.url?scp=85138724694&partnerID=8YFLogxK
U2 - 10.3390/electronics11182969
DO - 10.3390/electronics11182969
M3 - Journal article
AN - SCOPUS:85138724694
SN - 2079-9292
VL - 11
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 18
M1 - 2969
ER -