Abstract
Discovering motifs in sequence databases has received considerable attention from both the database and data mining communities, where a motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications in which sequences are very long (e.g., a million observations per sequence) or queries arrive rapidly (e.g., every 10 seconds). Prior work cannot offer fast correlation computation and pruning of subsequence pairs at the same time, because the two techniques require examining subsequence pairs in different orders. In this work, we propose a novel framework named Quick-Motif, which adopts a two-level approach: batch pruning at the outer level and fast correlation calculation at the inner level. We further propose two optimization techniques, one for the outer level and one for the inner level. In our experimental study, our method is up to 3 orders of magnitude faster than the state-of-the-art methods.
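To make the motif definition in the abstract concrete, the following is a minimal brute-force sketch, not the Quick-Motif framework itself. It assumes that correlation means the Pearson coefficient, that the subsequence length `m` is fixed, and that overlapping pairs are excluded as trivial matches; the names `pearson` and `brute_force_motif` are hypothetical and do not come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: a constant subsequence has undefined correlation; treat it as 0.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def brute_force_motif(series, m):
    """Return (correlation, i, j) for the most correlated pair of
    non-overlapping length-m subsequences starting at i and j (i < j)."""
    best = (-2.0, None, None)  # any real correlation is >= -1
    n = len(series)
    for i in range(n - m + 1):
        # Assumption: start j at i + m so the two subsequences never overlap.
        for j in range(i + m, n - m + 1):
            r = pearson(series[i:i + m], series[j:j + m])
            if r > best[0]:
                best = (r, i, j)
    return best
```

This baseline examines a quadratic number of subsequence pairs and recomputes each correlation from scratch, which is the kind of cost that becomes prohibitive for the very long sequences the abstract mentions.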
Original language | English |
---|---|
Title of host publication | 2015 IEEE 31st International Conference on Data Engineering, ICDE 2015 |
Publisher | IEEE Computer Society |
Pages | 579-590 |
Number of pages | 12 |
Volume | 2015-May |
ISBN (Electronic) | 9781479979639 |
Publication status | Published - 1 Jan 2015 |
Event | 2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of; Duration: 13 Apr 2015 → 17 Apr 2015 |
Conference
Conference | 2015 31st IEEE International Conference on Data Engineering, ICDE 2015 |
---|---|
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 13/04/15 → 17/04/15 |
ASJC Scopus subject areas
- Software
- Signal Processing
- Information Systems