Abstract
The Robinson-Foulds distance, a widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two phylogenetic networks N1, N2with n leaf labels and at most m nodes and e edges each, the Robinson-Foulds distance measures the number of clusters of descendant leaves not shared by N1and N2. The fastest known algorithm for computing the Robinson-Foulds distance between N1and N2runs in O(me) time. In this paper, we improve the time complexity to O(ne/log n) for general phylogenetic networks and O(nm/log n) for general phylogenetic networks with bounded degree (assuming the word RAM model with a word length of ⌈logn⌉ bits), and to optimal O(m) time for leaf-outerplanar networks as well as optimal O(n) time for level-1 phylogenetic networks (that is, galled-trees). We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k network is at most k + 1, which implies that for one level-1 and one level-k phylogenetic network, our algorithm runs in O((k + 1)e) time.
Original language | English |
---|---|
Pages (from-to) | 77-90 |
Number of pages | 14 |
Journal | Information Sciences |
Volume | 197 |
DOIs | |
Publication status | Published - 15 Aug 2012 |
Externally published | Yes |
Keywords
- Algorithm
- Cluster representation
- Leaf-outerplanar phylogenetic network
- Minimum spread
- Phylogenetic network
- Robinson-Foulds distance
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Theoretical Computer Science
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence