TY - GEN
T1 - A Novel Structure of Convolutional Layers with a Higher Performance-Complexity Ratio for Semantic Segmentation
AU - Jiang, Yalong
AU - Chi, Zheru
PY - 2018/11/18
Y1 - 2018/11/18
AB - In this paper, we study an important factor that determines the capacity of a CNN model and propose a novel structure of convolutional layers with a higher performance-complexity ratio. Firstly, we explore how model capacity and the number of parameters relate to segmentation performance. Secondly, a mechanism is proposed to optimize the structure of a CNN model for a specific task. The mechanism also provides better convergence than current state-of-the-art methods for factorizing convolutional layers, such as MobileNet. Thirdly, we propose a measure based on the mutual information between hidden activations and inputs/outputs to compute the capacity of a CNN model. This measure is highly correlated with segmentation performance. Experimental results on the segmentation of the PASCAL Person Parts Dataset show that the linear dependency among convolutional kernels is an important factor determining the capacity of a CNN model. It is also demonstrated that our approach can successfully adjust the model capacity to best match the complexity of a dataset. The optimized CNN model achieves performance similar to Deeplab-V2 on the segmentation task with 100× fewer parameters, resulting in a significantly improved performance-complexity ratio.
UR - http://www.scopus.com/inward/record.url?scp=85060784036&partnerID=8YFLogxK
U2 - 10.1109/ICARCV.2018.8580632
DO - 10.1109/ICARCV.2018.8580632
M3 - Conference article published in proceeding or book
AN - SCOPUS:85060784036
T3 - 2018 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018
SP - 186
EP - 191
BT - 2018 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018
Y2 - 18 November 2018 through 21 November 2018
ER -