Algorithms for Finding a Most Similar Subforest

Jesper Andreas Jansson, Zeshan Peng

Research output: Journal article publicationJournal articleAcademic researchpeer-review

2 Citations (Scopus)


Given an ordered labeled forest F ("the target forest") and an ordered labeled forest G ("the pattern forest"), the most similar subforest problem is to find a subforest F′ of F such that the forest edit distance between F′ and G is minimum over all possible F′. This problem generalizes several well-studied problems which have important applications in locating patterns in hierarchical structures such as RNA molecules' secondary structures and XML documents. Algorithms for the most similar subforest problem restricted to subforests which are either rooted subtrees or simple substructures exist in the literature; in this article, we show how to solve the most similar subforest problem for two other types of subforests: sibling substructures and closed subforests.
Original languageEnglish
Pages (from-to)865-887
Number of pages23
JournalTheory of Computing Systems
Issue number4
Publication statusPublished - 1 May 2011
Externally publishedYes


  • Approximate pattern matching
  • Closed subforest
  • Dynamic programming
  • Forest edit distance
  • Sibling substructure
  • Simple substructure

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics

Cite this