TY - JOUR

T1 - Exact algorithms for the repetition-bounded longest common subsequence problem

AU - Asahiro, Yuichi

AU - Jansson, Jesper Andreas

AU - Lin, Guohui

AU - Miyano, Eiji

AU - Ono, Hirotaka

AU - Utashima, Tadatoshi

N1 - Funding Information:
This work was partially supported by PolyU Fund 1-ZE8L, the Natural Sciences and Engineering Research Council of Canada, JST CREST JPMJR1402, and Grants-in-Aid for Scientific Research of Japan (KAKENHI) Grant Numbers JP17K00016, JP17K00024, JP17K19960 and JP17H01698.
Publisher Copyright:
© 2020 Elsevier B.V.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

PY - 2020/10/24

Y1 - 2020/10/24

N2 - In this paper, we study exact, exponential-time algorithms for a variant of the classic LONGEST COMMON SUBSEQUENCE problem called the REPETITION-BOUNDED LONGEST COMMON SUBSEQUENCE problem (or RBLCS, for short): Let an alphabet S be a finite set of symbols and an occurrence constraint C_occ be a function C_occ: S→N, assigning an upper bound on the number of occurrences of each symbol in S. Given two sequences X and Y over the alphabet S and an occurrence constraint C_occ, the goal of RBLCS is to find a longest common subsequence of X and Y such that each symbol s∈S appears at most C_occ(s) times in the obtained subsequence. The special case where C_occ(s)=1 for every symbol s∈S is known as the REPETITION-FREE LONGEST COMMON SUBSEQUENCE problem (RFLCS) and has been studied previously; e.g., in [1], Adi et al. presented a simple (exponential-time) exact algorithm for RFLCS. However, they did not analyze its time complexity in detail, and to the best of our knowledge, there are no previous results on the running times of any exact algorithms for this problem. Without loss of generality, we will assume that |X|≤|Y| and |X|=n. In this paper, we first propose a simpler algorithm for RFLCS based on the strategy used in [1] and show explicitly that its running time is O(1.44225^n). Next, we provide a dynamic programming (DP) based algorithm for RBLCS and prove that its running time is O(1.44225^n) for any occurrence constraint C_occ, and even less in certain special cases. In particular, for RFLCS, our DP-based algorithm runs in O(1.41422^n) time, which is faster than the previous one. Furthermore, we prove NP-hardness and APX-hardness results for RBLCS on restricted instances.

AB - In this paper, we study exact, exponential-time algorithms for a variant of the classic LONGEST COMMON SUBSEQUENCE problem called the REPETITION-BOUNDED LONGEST COMMON SUBSEQUENCE problem (or RBLCS, for short): Let an alphabet S be a finite set of symbols and an occurrence constraint C_occ be a function C_occ: S→N, assigning an upper bound on the number of occurrences of each symbol in S. Given two sequences X and Y over the alphabet S and an occurrence constraint C_occ, the goal of RBLCS is to find a longest common subsequence of X and Y such that each symbol s∈S appears at most C_occ(s) times in the obtained subsequence. The special case where C_occ(s)=1 for every symbol s∈S is known as the REPETITION-FREE LONGEST COMMON SUBSEQUENCE problem (RFLCS) and has been studied previously; e.g., in [1], Adi et al. presented a simple (exponential-time) exact algorithm for RFLCS. However, they did not analyze its time complexity in detail, and to the best of our knowledge, there are no previous results on the running times of any exact algorithms for this problem. Without loss of generality, we will assume that |X|≤|Y| and |X|=n. In this paper, we first propose a simpler algorithm for RFLCS based on the strategy used in [1] and show explicitly that its running time is O(1.44225^n). Next, we provide a dynamic programming (DP) based algorithm for RBLCS and prove that its running time is O(1.44225^n) for any occurrence constraint C_occ, and even less in certain special cases. In particular, for RFLCS, our DP-based algorithm runs in O(1.41422^n) time, which is faster than the previous one. Furthermore, we prove NP-hardness and APX-hardness results for RBLCS on restricted instances.

KW - APX-hardness

KW - Dynamic programming

KW - Exponential-time exact algorithms

KW - NP-hardness

KW - Repetition-bounded longest common subsequence problem

KW - Repetition-free

UR - http://www.scopus.com/inward/record.url?scp=85089290906&partnerID=8YFLogxK

U2 - 10.1016/j.tcs.2020.07.042

DO - 10.1016/j.tcs.2020.07.042

M3 - Journal article

AN - SCOPUS:85089290906

SN - 0304-3975

VL - 838

SP - 238

EP - 249

JO - Theoretical Computer Science

JF - Theoretical Computer Science

ER -