Abstract
This paper studies instance-dependent Positive and Unlabeled (PU) classification, where whether a positive example will be labeled (indicated by s) depends not only on the class label y but also on the observation x. Therefore, the labeling probability on positive examples is not uniform as previous works assumed, but is biased toward some simple or critical data points. To depict this dependency, a graphical model is built in this paper, which further leads to a maximization problem on the induced likelihood function regarding P(s, y|x). By utilizing the well-known EM and Adam optimization techniques, the labeling probability of any positive example P(s=1|y=1, x), as well as the classifier induced by P(y|x), can be acquired. Theoretically, we prove that the critical solution always exists and is locally unique for the linear model if some sufficient conditions are met. Moreover, we upper bound the generalization error for both the linear logistic and non-linear network instantiations of our algorithm, with the convergence rate of expected
risk to empirical risk as O(1/√k + 1/√(n−k) + 1/√n) (k and n are the sizes of the positive set and the entire training set, respectively).
Empirically, we compare our method with state-of-the-art instance-independent and instance-dependent PU algorithms on a wide
range of synthetic, benchmark and real-world datasets, and the experimental results firmly demonstrate the advantage of the proposed
method over the existing PU approaches.
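As a rough illustration of the training scheme the abstract describes (an EM outer loop with Adam-based maximization in the M-step), the following is a minimal sketch, not the authors' implementation. The names PUModel and train_em, the choice of two linear logistic heads for P(y=1|x) and P(s=1|y=1,x), and all hyperparameters are assumptions made for the example.

```python
# Hypothetical sketch of instance-dependent PU learning via EM + Adam.
# Assumes s=1 implies y=1 and P(s=1|y=0,x)=0, so P(s=1|x) = P(y=1|x) * P(s=1|y=1,x).
import torch
import torch.nn as nn

class PUModel(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.f = nn.Linear(d, 1)   # classifier head:  P(y=1|x)      = sigmoid(f(x))
        self.e = nn.Linear(d, 1)   # propensity head:  P(s=1|y=1,x)  = sigmoid(e(x))

    def forward(self, x):
        return torch.sigmoid(self.f(x)), torch.sigmoid(self.e(x))

def train_em(x, s, d, n_em=20, n_adam=100, lr=1e-2):
    """x: (n, d) float tensor of features; s: (n,) 0/1 tensor of labeling indicators."""
    model = PUModel(d)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_em):
        # E-step: posterior q_i = P(y=1 | s_i, x_i) under current parameters.
        # Labeled examples (s=1) have q=1; unlabeled use py(1-pe) / (1 - py*pe).
        with torch.no_grad():
            py, pe = model(x)
            py, pe = py.squeeze(1), pe.squeeze(1)
            q = torch.where(s == 1,
                            torch.ones_like(py),
                            py * (1 - pe) / (1 - py * pe + 1e-8))
        # M-step: maximize the expected complete log-likelihood of P(s, y | x) with Adam.
        for _ in range(n_adam):
            py, pe = model(x)
            py = py.squeeze(1).clamp(1e-6, 1 - 1e-6)
            pe = pe.squeeze(1).clamp(1e-6, 1 - 1e-6)
            ll = q * torch.log(py) + (1 - q) * torch.log(1 - py)                # class term
            ll = ll + torch.where(s == 1, torch.log(pe), q * torch.log(1 - pe)) # labeling term
            loss = -ll.mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

After training, sigmoid(f(x)) serves as the classifier P(y=1|x) and sigmoid(e(x)) as the estimated instance-dependent labeling probability P(s=1|y=1,x); the nonlinear instantiation mentioned in the abstract would replace the two linear heads with small networks.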
| Original language | English |
|---|---|
| Pages (from-to) | 4163-4176 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 44 |
| Issue number | 8 |
| DOIs | |
| Publication status | Published - 1 Aug 2022 |