Abstract
There exist two well-known succinct representations of ordered trees: BP (balanced parentheses) (Munro and Raman, 2001) [20] and DFUDS (depth-first unary degree sequence) (Benoit et al., 2005) [1]. Both use 2n+o(n) bits for n-node trees, which asymptotically matches the information-theoretic lower bound. Importantly, many fundamental operations on trees can be done in constant time on the word RAM when using BP or DFUDS, such as finding the parent, the first child, the next sibling, or the number of descendants of a node. Although the space needed to store the BP or DFUDS representation of an ordered tree matches the lower bound, it is not optimal for certain special classes of trees, such as trees in which every internal node has exactly two children. In this paper, we introduce a new conditional entropy for trees called the tree degree entropy, and give a succinct tree representation with matching size. We call such a representation an ultra-succinct data structure. We show how to modify the DFUDS representation to obtain a "compressed DFUDS"; as a consequence, a tree in which every internal node has exactly two children can be represented in n+o(n) bits. We also describe applications of the compressed DFUDS representation to ultra-succinct compressed suffix trees and labeled trees.
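To make the size claims concrete, here is a minimal Python sketch of the two encodings and of a degree-based entropy. It is not the paper's data structure: it only builds the raw 2n-bit parenthesis strings, without the o(n)-bit indexes needed for constant-time navigation. The function names (`bp`, `dfuds`, `degree_entropy`) and the nested-list tree format are hypothetical choices for illustration. The entropy is assumed to be the empirical entropy of the node-degree distribution, sum over i of (n_i/n) log2(n/n_i), where n_i is the number of nodes with i children; the abstract itself does not state the formula.

```python
from collections import Counter
from math import log2

# A tree is a nested list: each node is the list of its children,
# so [] is a leaf. (Illustrative format, not taken from the paper.)

def bp(node):
    """Balanced parentheses: '(' on entering a node, ')' on leaving it."""
    return "(" + "".join(bp(child) for child in node) + ")"

def dfuds(root):
    """DFUDS: visit nodes in preorder and write each degree d in unary
    as d '(' followed by one ')'. A leading '(' balances the sequence."""
    out = ["("]
    stack = [root]
    while stack:
        node = stack.pop()
        out.append("(" * len(node) + ")")
        stack.extend(reversed(node))  # leftmost child is popped first
    return "".join(out)

def degree_entropy(root):
    """Empirical entropy (bits per node) of the node-degree distribution.
    Assumed definition: sum_i (n_i / n) * log2(n / n_i)."""
    degrees = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        degrees[len(node)] += 1
        stack.extend(node)
    n = sum(degrees.values())
    return sum(c / n * log2(n / c) for c in degrees.values())

# Every internal node has exactly two children; n = 5 nodes.
tree = [[[], []], []]
print(bp(tree))     # 2n = 10 parentheses
print(dfuds(tree))  # 2n = 10 parentheses
print(len(bp(tree)), 5 * degree_entropy(tree))
```

For this 5-node tree both encodings take 10 parentheses (2n bits), while n times the degree entropy is a little under 5 bits, which is in line with the n+o(n) bound the abstract states for trees whose internal nodes all have two children.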
Original language | English |
---|---|
Pages (from-to) | 619-631 |
Number of pages | 13 |
Journal | Journal of Computer and System Sciences |
Volume | 78 |
Issue number | 2 |
DOIs | |
Publication status | Published - 1 Mar 2012 |
Externally published | Yes |
Keywords
- Compressed suffix tree
- Ordered tree
- Succinct data structure
- Tree degree entropy
- XBW
ASJC Scopus subject areas
- Theoretical Computer Science
- Computer Networks and Communications
- Computational Theory and Mathematics
- Applied Mathematics