Abstract
The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution requires O(C²K²) parameters, where C is the channel size and K is the kernel size. This parameter count has become costly as model sizes have grown tremendously to meet the needs of demanding applications. Among various implementations of the convolution, separable convolution has been proven to be more efficient in reducing the model size. For example, depth separable convolution reduces the complexity to O(C⋅(C+K²)), while spatial separable convolution reduces the complexity to O(C²K). However, these are ad hoc designs that cannot guarantee an optimal separation in general. In this research, we propose a novel and principled operator called optimized separable convolution, which, by optimal design of the internal number of groups and kernel sizes for general separable convolutions, can achieve a complexity of O(C[Formula presented]K). When the restriction on the number of separated convolutions is lifted, an even lower complexity of O(C⋅log(CK²)) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-#Params trade-offs over conventional, depth-wise, and depth/spatial separable convolutions.
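The parameter counts quoted in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' code: it assumes equal input and output channel counts (C), square K×K kernels, and no bias terms, and the function names are hypothetical. It covers only the three baseline operators whose complexities the abstract states explicitly.

```python
# Illustrative parameter counts for the baseline operators named in the abstract.
# Assumptions (not from the source): C_in == C_out == C, square kernels, no bias.

def conv2d_params(c: int, k: int) -> int:
    """Conventional 2D convolution: C * C * K * K weights, i.e. O(C^2 K^2)."""
    return c * c * k * k


def depth_separable_params(c: int, k: int) -> int:
    """Depthwise KxK conv (C * K^2) plus pointwise 1x1 conv (C * C): O(C (C + K^2))."""
    return c * k * k + c * c


def spatial_separable_params(c: int, k: int) -> int:
    """A Kx1 conv followed by a 1xK conv, each with C * C * K weights: O(C^2 K)."""
    return 2 * c * c * k


if __name__ == "__main__":
    c, k = 64, 3
    print("conventional     :", conv2d_params(c, k))
    print("depth separable  :", depth_separable_params(c, k))
    print("spatial separable:", spatial_separable_params(c, k))
```

For C = 64 and K = 3, this gives 36864, 4672, and 24576 parameters respectively, which matches the ordering implied by the stated complexities.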
Original language | English |
---|---|
Pages (from-to) | 162-171 |
Number of pages | 10 |
Journal | AI Open |
Volume | 3 |
Publication status | Published - Jan 2022 |
Keywords
- Deep neural network
- Separable convolution
ASJC Scopus subject areas
- Software
- Information Systems
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Science Applications
- Artificial Intelligence