TY - JOUR
T1 - Computing the Rooted Triplet Distance Between Phylogenetic Networks
AU - Jansson, Jesper Andreas
AU - Mampentzidis, Konstantinos
AU - Rajaby, Ramesh
AU - Sung, Wing Kin
N1 - Funding Information:
JJ was partially funded by RGC/GRF project 15221420. KM acknowledges the support by the Danish National Research Foundation, grant DNRF84, via the Center for Massive Data Algorithmics (MADALGO).
Publisher Copyright:
© 2021, The Author(s).
PY - 2021
Y1 - 2021
N2 - The rooted triplet distance measures the structural dissimilarity of two phylogenetic trees or phylogenetic networks by counting the number of rooted phylogenetic trees with exactly three leaf labels (called rooted triplets, or triplets for short) that occur as embedded subtrees in one, but not both, of them. Suppose that N1= (V1, E1) and N2= (V2, E2) are phylogenetic networks over a common leaf label set of size n, that Ni has level ki and maximum in-degree di for i∈ { 1 , 2 } , and that the networks’ out-degrees are unbounded. Write N= max (| V1| , | V2|) , M= max (| E1| , | E2|) , k= max (k1, k2) , and d= max (d1, d2). Previous work has shown how to compute the rooted triplet distance between N1 and N2 in O (nlog n) time in the special case k≤ 1. For k> 1 , no efficient algorithms are known; applying a classic method from 1980 by Fortune et al. in a direct way leads to a running time of Ω (N6n3) and the only existing non-trivial algorithm imposes restrictions on the networks’ in- and out-degrees (in particular, it does not work when non-binary vertices are allowed). In this article, we develop two new algorithms with no such restrictions. Their running times are O (N2M+ n3) and O (M+ Nk2d2+ n3) , respectively. We also provide implementations of our algorithms, evaluate their performance on simulated and real datasets, and make some observations on the limitations of the current definition of the rooted triplet distance in practice. Our prototype implementations have been packaged into the first publicly available software for computing the rooted triplet distance between unrestricted networks of arbitrary levels.
AB - The rooted triplet distance measures the structural dissimilarity of two phylogenetic trees or phylogenetic networks by counting the number of rooted phylogenetic trees with exactly three leaf labels (called rooted triplets, or triplets for short) that occur as embedded subtrees in one, but not both, of them. Suppose that N1= (V1, E1) and N2= (V2, E2) are phylogenetic networks over a common leaf label set of size n, that Ni has level ki and maximum in-degree di for i∈ { 1 , 2 } , and that the networks’ out-degrees are unbounded. Write N= max (| V1| , | V2|) , M= max (| E1| , | E2|) , k= max (k1, k2) , and d= max (d1, d2). Previous work has shown how to compute the rooted triplet distance between N1 and N2 in O (nlog n) time in the special case k≤ 1. For k> 1 , no efficient algorithms are known; applying a classic method from 1980 by Fortune et al. in a direct way leads to a running time of Ω (N6n3) and the only existing non-trivial algorithm imposes restrictions on the networks’ in- and out-degrees (in particular, it does not work when non-binary vertices are allowed). In this article, we develop two new algorithms with no such restrictions. Their running times are O (N2M+ n3) and O (M+ Nk2d2+ n3) , respectively. We also provide implementations of our algorithms, evaluate their performance on simulated and real datasets, and make some observations on the limitations of the current definition of the rooted triplet distance in practice. Our prototype implementations have been packaged into the first publicly available software for computing the rooted triplet distance between unrestricted networks of arbitrary levels.
KW - Block tree
KW - Contracted block network
KW - Fan graph
KW - Implementation
KW - Phylogenetic network comparison
KW - Resolved graph
KW - Rooted triplet distance
UR - http://www.scopus.com/inward/record.url?scp=85102882195&partnerID=8YFLogxK
U2 - 10.1007/s00453-021-00802-1
DO - 10.1007/s00453-021-00802-1
M3 - Journal article
AN - SCOPUS:85102882195
SN - 0178-4617
VL - 83
SP - 1786
EP - 1828
JO - Algorithmica
JF - Algorithmica
IS - 6
ER -