Abstract
Community detection is a well-established area of research in network analysis. However, there has been limited discussion of how to improve prediction accuracy when some community labels are already known. In this paper, we introduce a novel algorithm, the weighted inverse Laplacian (WIL), for predicting labels in partially labeled undirected networks. Our algorithm is founded on the concept of the first hitting time of a random walk and is supported by information propagation and regularization frameworks. By combining two different normalization techniques, WIL is highly adaptable and can handle community imbalance and degree heterogeneity. Additionally, we propose a partially labeled degree-corrected block model (pDCBM) to describe the generation of partially labeled networks. Under this model, we prove that the misclassification rate of WIL goes to zero as the number of nodes goes to infinity, and that WIL can handle greater imbalances than traditional Laplacian methods. Our simulations and empirical studies demonstrate that WIL outperforms other state-of-the-art methods, particularly in unbalanced and heterogeneous networks.
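The abstract does not spell out the WIL estimator itself, so the sketch below is not the paper's method. It illustrates the general family the abstract refers to: predicting labels on a partially labeled undirected network by applying the inverse of a regularized, normalized Laplacian to the known labels. It uses the classical symmetric normalization only. The function name `propagate_labels`, the weight `alpha`, and the toy graph are assumptions made for illustration.

```python
import numpy as np

def propagate_labels(A, labels, alpha=0.9):
    """Generic regularized-Laplacian label propagation (NOT the paper's WIL).

    A      : (n, n) symmetric adjacency matrix of an undirected graph
    labels : length-n integer array; community index if known, -1 if unknown
    alpha  : propagation weight in (0, 1); an illustrative choice
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    n = A.shape[0]
    k = labels.max() + 1

    # One-hot seed matrix Y; rows of unlabeled nodes stay zero.
    Y = np.zeros((n, k))
    known = labels >= 0
    Y[known, labels[known]] = 1.0

    # Symmetric normalization S = D^{-1/2} A D^{-1/2}; isolated nodes get zero rows.
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros(n)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    S = inv_sqrt[:, None] * A * inv_sqrt[None, :]

    # Solve (I - alpha * S) F = Y rather than forming the inverse explicitly.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)

    pred = F.argmax(axis=1)
    pred[known] = labels[known]  # keep the labels that were given
    return pred

# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

labels = np.array([0, -1, -1, -1, -1, 1])  # only nodes 0 and 5 are labeled
pred = propagate_labels(A, labels)
print(pred)
```

On this toy graph each unlabeled node is assigned to the community of the labeled node in its own triangle. A single symmetric normalization like this one is the kind of traditional Laplacian baseline that the abstract contrasts with WIL's combination of two normalizations.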
| Original language | English |
|---|---|
| Pages (from-to) | 501-516 |
| Number of pages | 16 |
| Journal | Statistics and its Interface |
| Volume | 17 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 19 Jul 2024 |
Keywords
- Heterogeneous node
- Network data
- Semi-supervised learning
- Unbalanced label
ASJC Scopus subject areas
- Statistics and Probability
- Applied Mathematics
Title
Semi-supervised learning in unbalanced networks with heterogeneous degree