Quick-motif: An efficient and scalable framework for exact motif discovery

Yuhong Li, U. Leong Hou, Man Lung Yiu, Zhiguo Gong

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

37 Citations (Scopus)

Abstract

Discovering motifs in sequence databases has been receiving abundant attentions from both database and data mining communities, where the motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications which may have very long sequences (e.g., million observations per sequence) or the queries arrive rapidly (e.g., per 10 seconds). Prior works cannot offer fast correlation computations and prune subsequence pairs at the same time, as these two techniques require different orderings on examining subsequence pairs. In this work, we propose a novel framework named Quick-Motif which adopts a two-level approach to enable batch pruning at the outer level and enable fast correlation calculation at the inner level. We further propose two optimization techniques for the outer and the inner level. In our experimental study, our method is up to 3 orders of magnitude faster than the state-of-the-art methods.
Original languageEnglish
Title of host publication2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
PublisherIEEE Computer Society
Pages579-590
Number of pages12
Volume2015-May
ISBN (Electronic)9781479979639
DOIs
Publication statusPublished - 1 Jan 2015
Event2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
Duration: 13 Apr 201517 Apr 2015

Conference

Conference2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Country/TerritoryKorea, Republic of
CitySeoul
Period13/04/1517/04/15

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this