Bilinear deep learning for image classification

Sheng Hua Zhong, Yan Liu, Yang Liu

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

63 Citations (Scopus)

Abstract

Image classification is a well-known classical problem in multimedia content analysis. This paper proposes a novel deep learning model called bilinear deep belief network (BDBN) for image classification. Unlike previous image classification models, BDBN aims to provide human-like judgment by referencing the architecture of the human visual system and the procedure of intelligent perception. Therefore, the multi-layer structure of the cortex and the propagation of information in the visual areas of the brain are realized faithfully. Unlike most existing deep models, BDBN utilizes a bilinear discriminant strategy to simulate the "initial guess" in human object recognition, and at the same time to avoid falling into a bad local optimum. To preserve the natural tensor structure of the image data, a novel deep architecture with greedy layer-wise reconstruction and global fine-tuning is proposed. To adapt to real-world image classification tasks, we develop BDBN under a semi-supervised learning framework, which makes the deep model work well when labeled images are insufficient. Comparative experiments on three standard datasets show that the proposed algorithm outperforms both representative classification models and existing deep learning techniques. More interestingly, our demonstrations show that the proposed BDBN works consistently with the visual perception of humans.
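The abstract does not give the projection formula, so the following is only a minimal sketch of what a generic bilinear projection looks like, to illustrate the idea of keeping an image as a matrix instead of flattening it. The form Y = U^T X V, the function name `bilinear_project`, the shapes, and the random matrices `U` and `V` are all assumptions for illustration. In BDBN the projections would presumably be learned with a discriminant criterion that is not described here.

```python
import numpy as np


def bilinear_project(X, U, V):
    """Map an I x J image X to a P x Q latent matrix Y = U^T X V.

    U (I x P) acts on the rows and V (J x Q) acts on the columns,
    so the 2-D layout of the image is never flattened.
    """
    return U.T @ X @ V


rng = np.random.default_rng(0)
X = rng.random((32, 32))           # toy grayscale image (assumed size)
U = rng.standard_normal((32, 8))   # hypothetical row projection, I x P
V = rng.standard_normal((32, 6))   # hypothetical column projection, J x Q

Y = bilinear_project(X, U, V)      # latent representation, shape (8, 6)

# The same map written as one linear projection of the flattened image.
# With row-major flattening, vec(U^T X V) = kron(U, V)^T vec(X).
W = np.kron(U, V)                  # shape (1024, 48)
y_vec = W.T @ X.reshape(-1)

bilinear_params = U.size + V.size  # 32*8 + 32*6 = 448
flat_params = W.size               # 1024*48 = 49152
```

In this toy setting the bilinear form needs 448 parameters, while the equivalent projection on the flattened image needs 49152. That gap is one common motivation for working with the matrix (tensor) form of the data.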
Original language: English
Title of host publication: MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops
Pages: 343-352
Number of pages: 10
Publication status: Published - 29 Dec 2011
Event: 19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11 - Scottsdale, AZ, United States
Duration: 28 Nov 2011 to 1 Dec 2011

Conference

Conference: 19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11
Country/Territory: United States
City: Scottsdale, AZ
Period: 28/11/11 to 1/12/11

Keywords

  • Bilinear discriminant projection
  • Deep learning
  • Image classification

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction
