TY - JOUR
T1 - Distributed minimum error entropy algorithms
AU - Guo, Xin
AU - Hu, Ting
AU - Wu, Qiang
Funding Information:
The work described in this paper was partially supported by the National Natural Science Foundation of China (Projects 11671307, 11571078, 11671171) and the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. PolyU 25301115). We thank the anonymous reviewers for their constructive comments. All three authors contributed equally to the paper.
Publisher Copyright:
© 2020 Xin Guo, Ting Hu and Qiang Wu.
PY - 2020/7
Y1 - 2020/7
N2 - The Minimum Error Entropy (MEE) principle is an important approach in Information Theoretic Learning (ITL). It is widely applied and studied in various fields for its robustness to noise. In this paper, we study a reproducing kernel-based distributed MEE algorithm, DMEE, which is designed to work with both fully supervised and semi-supervised data. The divide-and-conquer approach is employed, so there is no inter-node communication overhead. Like other distributed algorithms, DMEE significantly reduces the computational complexity and memory requirements on individual computing nodes. With fully supervised data, our proved learning rates equal the minimax optimal learning rates of classical pointwise kernel-based regression. In semi-supervised learning scenarios, we show that DMEE exploits unlabeled data effectively, in two senses. First, under weak regularity assumptions, additional unlabeled data significantly improves the learning rates of DMEE. Second, with sufficient unlabeled data, labeled data can be distributed to many more computing nodes, so that each node takes only O(1) labels, without spoiling the learning rates in terms of the number of labels. This conclusion overcomes the saturation phenomenon in the unlabeled data size. It parallels a recent result for regularized least squares (Lin and Zhou, 2018), and suggests that an inflation of unlabeled data is a solution to MEE learning problems with decentralized data sources arising from privacy-protection concerns. Our work involves pairwise learning and a non-convex loss. The theoretical analysis is achieved by distributed U-statistics and error decomposition techniques for integral operators.
AB - The Minimum Error Entropy (MEE) principle is an important approach in Information Theoretic Learning (ITL). It is widely applied and studied in various fields for its robustness to noise. In this paper, we study a reproducing kernel-based distributed MEE algorithm, DMEE, which is designed to work with both fully supervised and semi-supervised data. The divide-and-conquer approach is employed, so there is no inter-node communication overhead. Like other distributed algorithms, DMEE significantly reduces the computational complexity and memory requirements on individual computing nodes. With fully supervised data, our proved learning rates equal the minimax optimal learning rates of classical pointwise kernel-based regression. In semi-supervised learning scenarios, we show that DMEE exploits unlabeled data effectively, in two senses. First, under weak regularity assumptions, additional unlabeled data significantly improves the learning rates of DMEE. Second, with sufficient unlabeled data, labeled data can be distributed to many more computing nodes, so that each node takes only O(1) labels, without spoiling the learning rates in terms of the number of labels. This conclusion overcomes the saturation phenomenon in the unlabeled data size. It parallels a recent result for regularized least squares (Lin and Zhou, 2018), and suggests that an inflation of unlabeled data is a solution to MEE learning problems with decentralized data sources arising from privacy-protection concerns. Our work involves pairwise learning and a non-convex loss. The theoretical analysis is achieved by distributed U-statistics and error decomposition techniques for integral operators.
KW - Distributed method
KW - Information theoretic learning
KW - Minimum error entropy
KW - Reproducing kernel Hilbert space
KW - Semi-supervised data
UR - http://www.scopus.com/inward/record.url?scp=85094877140&partnerID=8YFLogxK
M3 - Journal article
AN - SCOPUS:85094877140
SN - 1532-4435
VL - 21
SP - 1
EP - 31
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
ER -