A graph mining algorithm for classifying chemical compounds

Winnie W.M. Lam, Chun Chung Chan

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

3 Citations (Scopus)

Abstract

Graph data mining algorithms are increasingly applied to biological graph datasets. However, while existing graph mining algorithms can identify frequently occurring sub-graphs, these do not necessarily represent useful patterns. In this paper, we propose a novel graph mining algorithm, MIGDAC (Mining Graph DAta for Classification), that applies graph theory and an interestingness measure to discover interesting sub-graphs that both characterize a class and readily distinguish it from other classes. Applying MIGDAC to the discovery of specific patterns of chemical compounds, we first represent each chemical compound as a graph and transform it into a set of hierarchical graphs. This not only represents more information than traditional formats but also simplifies the complex graph structures. We then apply MIGDAC to extract a set of class-specific patterns defined in terms of an interestingness threshold and measure with residue analysis. The next step is to use weight of evidence to estimate whether an identified class-specific pattern positively or negatively characterizes a class of drugs. Experiments on a drug dataset from the KEGG ligand database show that MIGDAC with the hierarchical graph representation greatly improves on the accuracy of traditional frequent graph mining algorithms.
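The abstract does not give the formulas behind the residue analysis or the weight of evidence, so the following is only a minimal sketch under stated assumptions. It assumes that "residue analysis" refers to the standard adjusted residual of a pattern/class contingency table, and that the weight of evidence is the usual log ratio of P(pattern | class) to P(pattern | not class). The function names, the add-one smoothing, and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def adjusted_residual(n_pc, n_p, n_c, n):
    """Adjusted residual for the cell (pattern present, class c).

    n_pc: compounds in class c that contain the pattern
    n_p:  compounds (any class) that contain the pattern
    n_c:  compounds in class c
    n:    total number of compounds
    Assumes 0 < n_p < n and 0 < n_c < n.
    """
    expected = n_p * n_c / n                      # count expected under independence
    standardized = (n_pc - expected) / math.sqrt(expected)
    variance = (1 - n_p / n) * (1 - n_c / n)      # variance estimate of the standardized residual
    return standardized / math.sqrt(variance)


def weight_of_evidence(n_pc, n_p, n_c, n):
    """log [ P(pattern | class c) / P(pattern | not class c) ].

    Add-one smoothing (an assumption of this sketch) avoids log(0).
    Positive values support class c; negative values count against it.
    """
    p_given_c = (n_pc + 1) / (n_c + 2)
    p_given_not_c = (n_p - n_pc + 1) / (n - n_c + 2)
    return math.log(p_given_c / p_given_not_c)


if __name__ == "__main__":
    # Hypothetical counts: 100 compounds, 40 in the class, 30 contain the pattern,
    # 25 of those 30 belong to the class.
    d = adjusted_residual(25, 30, 40, 100)
    w = weight_of_evidence(25, 30, 40, 100)
    # With a 5% significance level, |d| > 1.96 would mark the pattern as interesting.
    print(f"adjusted residual = {d:.2f}, weight of evidence = {w:.2f}")
```

In this sketch a pattern would be treated as class-specific when the absolute adjusted residual exceeds a chosen threshold (1.96 is a common choice), and the sign of the weight of evidence would indicate whether it characterizes the class positively or negatively.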
Original language: English
Title of host publication: Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
Pages: 321-324
Number of pages: 4
Publication status: Published - 1 Dec 2008
Event: 2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008 - Philadelphia, PA, United States
Duration: 3 Nov 2008 to 5 Nov 2008

Conference

Conference: 2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
Country/Territory: United States
City: Philadelphia, PA
Period: 3/11/08 to 5/11/08

ASJC Scopus subject areas

  • Molecular Biology
  • Information Systems
  • Biomedical Engineering
