Abstract
Label noise is ubiquitous in real-world scenarios; it often misleads the training algorithm and degrades classification
performance. Many approaches have therefore been proposed to correct the loss function given corrupted labels. Among them,
one line of work achieves this goal by unbiasedly estimating the data centroid, which plays an important role in constructing
an unbiased risk estimator for minimization. However, these methods usually handle the noisy labels of different classes all
at once, so the local information inherent in each class is ignored, which often leads to unsatisfactory performance.
To address this defect, this paper presents a novel robust learning algorithm dubbed “Class-Wise Denoising” (CWD), which tackles the
noisy labels in a class-wise way to ease the overall noise correction task. Specifically, two virtual auxiliary sets are
constructed by presuming, respectively, that the positive and the negative labels in the training set are clean, so that the
original false-negative and false-positive labels are tackled separately. As a result, an improved centroid estimator can be
designed, which helps to yield a more accurate risk estimator. Theoretically, we prove that: 1) the variance in centroid
estimation can often be reduced by CWD compared with existing methods that use an unbiased centroid estimator; and 2) the
performance of CWD trained on the noisy set converges to that of the optimal classifier trained on the clean set with a
convergence rate O where n is the number of training examples. These theoretical properties enable CWD to achieve improved
classification performance under label noise, which is also demonstrated by comparisons with ten representative
state-of-the-art methods on a variety of benchmark datasets.
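
For orientation, the all-at-once unbiased centroid estimation that the abstract contrasts CWD with can be sketched as follows. This is not the CWD algorithm itself. It is a minimal illustration that assumes binary labels in {-1, +1}, class-conditional label noise, and known flip rates; the names `unbiased_centroid`, `simulate`, `rho_pos`, and `rho_neg` are hypothetical and do not come from the paper.

```python
import random


def unbiased_centroid(X, y_noisy, rho_pos, rho_neg):
    """Estimate the clean centroid E[y * x] from noisily labeled data.

    X       : list of feature vectors (lists of floats)
    y_noisy : list of observed labels in {-1, +1}
    rho_pos : assumed P(noisy = -1 | clean = +1)
    rho_neg : assumed P(noisy = +1 | clean = -1)
    """
    denom = 1.0 - rho_pos - rho_neg
    if denom <= 0:
        raise ValueError("requires rho_pos + rho_neg < 1")
    n, d = len(X), len(X[0])
    centroid = [0.0] * d
    for x, y in zip(X, y_noisy):
        # Corrected label: its expectation given the clean label
        # equals the clean label under the assumed noise model.
        w = (y + rho_pos - rho_neg) / denom
        for j in range(d):
            centroid[j] += w * x[j] / n
    return centroid


def simulate(n=100_000, rho_pos=0.2, rho_neg=0.3, seed=0):
    """Compare clean, naive, and corrected centroids on synthetic data."""
    rng = random.Random(seed)
    X, y_clean, y_noisy = [], [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else -1
        # Two Gaussian features whose means depend on the clean label.
        x = [rng.gauss(1.0 * y, 1.0), rng.gauss(-0.5 * y, 1.0)]
        flip = rng.random() < (rho_pos if y == 1 else rho_neg)
        X.append(x)
        y_clean.append(y)
        y_noisy.append(-y if flip else y)
    clean = unbiased_centroid(X, y_clean, 0.0, 0.0)      # uses true labels
    naive = unbiased_centroid(X, y_noisy, 0.0, 0.0)      # ignores the noise
    corrected = unbiased_centroid(X, y_noisy, rho_pos, rho_neg)
    return clean, naive, corrected
```

Calling `simulate()` returns the three centroids, so the bias of the naive estimate and the effect of the correction can be compared directly. Both classes are corrected with a single formula here, which is the all-at-once treatment; the abstract's claim is that handling false negatives and false positives separately can reduce the variance of such an estimator.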
| Original language | English |
|---|---|
| Pages (from-to) | 2835 – 2848 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 45 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Mar 2023 |
Keywords
- Label noise
- centroid estimation
- unbiasedness
- variance reduction