Scalable and Effective Bipartite Network Embedding

Renchi Yang, Jieming Shi, Keke Huang, Xiaokui Xiao

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

Abstract

Given a bipartite graph G consisting of inter-set weighted edges connecting the nodes in two disjoint sets U and V, bipartite network embedding (BNE) maps each node ui in U and vj in V to compact embedding vectors that capture the hidden topological features surrounding the nodes, to facilitate downstream tasks. Effective BNE should preserve not only the direct connections between nodes but also the multi-hop relationships formed alternately by the two types of nodes in G, which can incur prohibitive overheads, especially on massive bipartite graphs with millions of nodes and billions of edges. Existing solutions are hardly scalable to massive bipartite graphs, and often produce low-quality results. This paper proposes GEBE, a generic BNE framework achieving state-of-the-art performance on massive bipartite graphs, via four main algorithmic designs. First, we present two generic measures to capture the multi-hop similarity/proximity between homogeneous/heterogeneous nodes respectively, and the measures can be instantiated with three popular probability distributions, including Poisson, Geometric, and Uniform distributions. Second, GEBE formulates a novel and unified BNE objective to preserve the two measures of all possible node pairs. Third, GEBE includes several efficiency designs to get high-quality embeddings on massive graphs. Finally, we observe that GEBE achieves the best performance when instantiating MHS and MHP using a Poisson distribution, and thus, we further develop GEBEp based on Poisson-instantiated MHS and MHP, with non-trivial efficiency optimizations. Extensive experiments, comparing 15 competitors on 10 real datasets, demonstrate that our solutions, especially GEBEp, obtain superior result utility than all competitors for top-N recommendation and link prediction, while being up to orders of magnitude faster.

Original languageEnglish
Title of host publicationSIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages1977-1991
Number of pages15
ISBN (Electronic)9781450392495
DOIs
Publication statusPublished - 10 Jun 2022
Event2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 - Virtual, Online, United States
Duration: 12 Jun 202217 Jun 2022

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print)0730-8078

Conference

Conference2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
Country/TerritoryUnited States
CityVirtual, Online
Period12/06/2217/06/22

Keywords

  • bipartite graphs
  • network embedding
  • Poisson distribution

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'Scalable and Effective Bipartite Network Embedding'. Together they form a unique fingerprint.

Cite this