TY - GEN
T1 - Deep semantic space with intra-class low-rank constraint for cross-modal retrieval
AU - Kang, Peipei
AU - Lin, Zehang
AU - Yang, Zhenguo
AU - Fang, Xiaozhao
AU - Li, Qing
AU - Liu, Wenyin
PY - 2019/6/5
Y1 - 2019/6/5
N2 - In this paper, a novel Deep Semantic Space learning model with Intra-class Low-rank constraint (DSSIL) is proposed for cross-modal retrieval, which is composed of two subnetworks for modality-specific representation learning, followed by projection layers for common space mapping. In particular, DSSIL takes into account semantic consistency to fuse the cross-modal data in a high-level common space, and constrains the common representation matrix within the same class to be low-rank, in order to make the intra-class representations more relevant. More formally, two regularization terms are devised for the two aspects, which have been incorporated into the objective of DSSIL. To optimize the modality-specific subnetworks and the projection layers simultaneously by exploiting gradient descent directly, we approximate the nonconvex low-rank constraint by minimizing a few smallest singular values of the intra-class matrix with theoretical analysis. Extensive experiments conducted on three public datasets demonstrate the competitive superiority of DSSIL for cross-modal retrieval compared with the state-of-the-art methods.
AB - In this paper, a novel Deep Semantic Space learning model with Intra-class Low-rank constraint (DSSIL) is proposed for cross-modal retrieval, which is composed of two subnetworks for modality-specific representation learning, followed by projection layers for common space mapping. In particular, DSSIL takes into account semantic consistency to fuse the cross-modal data in a high-level common space, and constrains the common representation matrix within the same class to be low-rank, in order to make the intra-class representations more relevant. More formally, two regularization terms are devised for the two aspects, which have been incorporated into the objective of DSSIL. To optimize the modality-specific subnetworks and the projection layers simultaneously by exploiting gradient descent directly, we approximate the nonconvex low-rank constraint by minimizing a few smallest singular values of the intra-class matrix with theoretical analysis. Extensive experiments conducted on three public datasets demonstrate the competitive superiority of DSSIL for cross-modal retrieval compared with the state-of-the-art methods.
KW - Cross-modal retrieval
KW - Deep neural networks
KW - Intra-class low-rank
KW - Semantic space
UR - http://www.scopus.com/inward/record.url?scp=85068031452&partnerID=8YFLogxK
U2 - 10.1145/3323873.3325029
DO - 10.1145/3323873.3325029
M3 - Conference article published in proceeding or book
AN - SCOPUS:85068031452
T3 - ICMR 2019 - Proceedings of the 2019 ACM International Conference on Multimedia Retrieval
SP - 226
EP - 234
BT - ICMR 2019 - Proceedings of the 2019 ACM International Conference on Multimedia Retrieval
PB - Association for Computing Machinery, Inc
T2 - 2019 ACM International Conference on Multimedia Retrieval, ICMR 2019
Y2 - 10 June 2019 through 13 June 2019
ER -