TED: Towards Discovering Top-𝑘 Edge-Diversified Patterns in a Graph Database

Kai Huang, Haibo Hu, Qingqing Ye, Kai Tian, Bolong Zheng, Xiaofang Zhou

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

With an exponentially growing number of graphs from disparate repositories, there is a strong need to analyze graph databases that contain an extensive collection of small- or medium-sized data graphs (e.g., chemical compounds). Although subgraph enumeration and subgraph mining have been proposed to bring insights into a graph database through a set of subgraph structures, they often return similar or homogeneous topologies, which is undesirable in many graph applications. To address this limitation, we propose the Top-k Edge-Diversified Patterns Discovery problem, which retrieves a set of k subgraphs that cover the maximum number of edges in a database. To process such queries efficiently, we present a generic and extensible framework called TED, which achieves a guaranteed approximation ratio to the optimal result. Two optimization strategies are further developed to improve performance. Experimental studies on real-world datasets demonstrate the superiority of TED over traditional techniques.
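
The abstract states the objective (choose k patterns that together cover as many database edges as possible) but does not describe the algorithm. As an illustration of that objective only, the sketch below applies the standard greedy heuristic for the generic maximum-coverage problem, which is known to reach a 1 - 1/e approximation in that generic setting. It is not the TED framework: the function name, the pattern labels, and the integer edge identifiers are all hypothetical, and the candidate patterns with their covered edges are assumed to be given in advance.

```python
from typing import Dict, Hashable, List, Set, Tuple


def greedy_top_k_coverage(
    coverage: Dict[Hashable, Set[int]], k: int
) -> Tuple[List[Hashable], Set[int]]:
    """Pick up to k patterns greedily by marginal edge coverage.

    `coverage` maps a (hypothetical) pattern label to the set of database
    edge ids that the pattern covers. This is the textbook greedy heuristic
    for maximum coverage, used here only to illustrate the objective.
    """
    selected: List[Hashable] = []
    covered: Set[int] = set()
    remaining = dict(coverage)  # shallow copy so the input is not mutated

    for _ in range(k):
        best, best_gain = None, 0
        for name, edges in remaining.items():
            gain = len(edges - covered)  # edges not yet covered
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no pattern adds a new edge
            break
        selected.append(best)
        covered |= remaining.pop(best)

    return selected, covered


# Toy input (made up): four candidate patterns over six edges.
candidates = {
    "p1": {1, 2, 3, 4},
    "p2": {3, 4, 5},
    "p3": {5, 6},
    "p4": {1, 2},
}
selected, covered = greedy_top_k_coverage(candidates, k=2)
print(selected, len(covered))  # ['p1', 'p3'] 6
```

In the toy input, "p2" overlaps heavily with "p1", so the second pick goes to "p3", which contributes two new edges. Preferring new coverage over redundant structure is the behaviour the edge-diversified objective is meant to reward.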
Original language: English
Title of host publication: International Conference of Management of Data
Pages: 1-14
Publication status: Published - Jun 2023
