TY - GEN

T1 - Compressed dynamic tries with applications to LZ-compression in sublinear time and space

AU - Jansson, Jesper Andreas

AU - Sadakane, Kunihiko

AU - Sung, Wing Kin

PY - 2007/12/1

Y1 - 2007/12/1

N2 - The dynamic trie is a fundamental data structure which finds applications in many areas. This paper proposes a compressed version of the dynamic trie data structure. Our data-structure is not only space efficient, it also allows pattern searching in o(|P|) time and leaf insertion/deletion in o(log n) time, where |P| is the length of the pattern and n is the size of the trie. To demonstrate the usefulness of the new data structure, we apply it to the LZ-compression problem. For a string S of length s over an alphabet A of size σ, the previously best known algorithms for computing the Ziv-Lempel encoding (LZ78) of S either run in: (1) O(s) time and O(s log s) bits working space; or (2) O(sσ) time and O(sHk+ s log σ/ logσs) bits working space, where Hkis the k-order entropy of the text. No previous algorithm runs in sublinear time. Our new data structure implies a LZ-compression algorithm which runs in sublinear time and uses optimal working space. More precisely, the LZ-compression algorithm uses O(s(log σ + log logσs)logσs) bits working space and runs in O(s(log log s)2/(logσs log log log s)) worst-case time, which is sublinear when σ = 2o(log slog log log s/(log log s)2)).

AB - The dynamic trie is a fundamental data structure which finds applications in many areas. This paper proposes a compressed version of the dynamic trie data structure. Our data-structure is not only space efficient, it also allows pattern searching in o(|P|) time and leaf insertion/deletion in o(log n) time, where |P| is the length of the pattern and n is the size of the trie. To demonstrate the usefulness of the new data structure, we apply it to the LZ-compression problem. For a string S of length s over an alphabet A of size σ, the previously best known algorithms for computing the Ziv-Lempel encoding (LZ78) of S either run in: (1) O(s) time and O(s log s) bits working space; or (2) O(sσ) time and O(sHk+ s log σ/ logσs) bits working space, where Hkis the k-order entropy of the text. No previous algorithm runs in sublinear time. Our new data structure implies a LZ-compression algorithm which runs in sublinear time and uses optimal working space. More precisely, the LZ-compression algorithm uses O(s(log σ + log logσs)logσs) bits working space and runs in O(s(log log s)2/(logσs log log log s)) worst-case time, which is sublinear when σ = 2o(log slog log log s/(log log s)2)).

UR - http://www.scopus.com/inward/record.url?scp=38349142588&partnerID=8YFLogxK

M3 - Conference article published in proceeding or book

SN - 9783540770497

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 424

EP - 435

BT - FSTTCS 2007

T2 - 27th International Conference on the Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2007

Y2 - 12 December 2007 through 14 December 2007

ER -