Is noise always harmful? Visual learning from weakly-related data

Sheng Hua Zhong, Yan Liu, Kien A. Hua, Songtao Wu

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

Noise is ubiquitous in multimedia data, especially in the Internet era. For example, tags from web users are often incomplete, arbitrary, and only weakly relevant to the visual content. Intuitively, noise in a dataset is harmful to learning tasks, which implies that the huge volumes of image tags from social media cannot be used directly. To collect a reliable training dataset, labor-intensive manual labeling and various learning-based outlier detection techniques are widely used. This paper discusses whether such preprocessing is always needed. We focus on a very common case in image classification in which the available dataset includes a large number of images that are only weakly related to any of the target classes. We use deep models as the platform and design a series of experiments to compare semi-supervised learning performance with and without weakly related unlabeled data. We find that weakly related data is not always harmful, which is an encouraging finding for research on web image learning.

Original language: English
Title of host publication: Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 181-184
Number of pages: 4
ISBN (Electronic): 9781467382373
Publication status: Published - 22 Jun 2016
Event: 3rd International Conference on Orange Technologies, ICOT 2015 - Hong Kong, Hong Kong
Duration: 19 Dec 2015 to 22 Dec 2015

Publication series

Name: Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015

Conference

Conference: 3rd International Conference on Orange Technologies, ICOT 2015
Country/Territory: Hong Kong
City: Hong Kong
Period: 19/12/15 to 22/12/15

Keywords

  • deep learning
  • semi-supervised learning
  • weakly-related data

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Social Psychology
  • Artificial Intelligence
  • Computer Networks and Communications
