Variant of Constrained Longest Common Subsequence Problem

variants of constrained longest common subsequence n.w
1 / 18
Embed
Share

This study introduces the Doubly-Constrained Longest Common Subsequence (DC-LCS) problem, which extends the classical Longest Common Subsequence problem to include constraints on symbol occurrences. The research provides insights into mathematical formulations for sequence comparison in Computational Biology and discusses fixed-parameter algorithms and hardness results for related problems. Various instances of constrained LCS problems are explored, shedding light on the complexity of sequence matching tasks in computational settings.

  • Variant LCS
  • DC-LCS problem
  • Computational Biology
  • Sequence Comparison
  • Fixed-Parameter Algorithm

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Variants of constrained longest common subsequence Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi, Yuri Pirola Information Processing Letters, Volume 110, pp.877 881(2010) Presenter: Shian-Liang Lin 1 2025/6/10

  2. Abstract(1/2) We consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s1and s2over an alphabet , a set Csof strings, and a function Co: N, the DC-LCS problem consists of finding the longest subsequence s of s1and s2such that s is a supersequence of all the strings in Csand such that the number of occurrences in s of each symbol is upper bounded by Co( ). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem that have been introduced previously in the literature: the Constrained LCS and the Repetition-Free LCS. 2025/6/10 2

  3. Abstract(2/2) We present two results for the DC-LCS problem. First, we illustrate a fixed-parameter algorithm where the parameter is the length of the solution which is also applicable to the more specialized problems. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameter is the number of the constraint strings (|Cs|) and the size of the alphabet . This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP- hardness when the size of the alphabet is constant. 2025/6/10 3

  4. CLCS Problem 1 (Constrained Longest Common Subsequence ) Input: two strings s1and s2, a string constraint Cs. Output: a longest common subsequence s of s1and s2, so that each string in Csis a subsequence of s. Example: S1 : a c b a S2 : a b c b a Cs: b a CLCS: c b a , a c b 2025/6/10 4

  5. RF-LCS Problem 2 (Repetition-Free Longest Common Subsequence) Input: two strings s1and s2. Output: a longest common subsequence s of s1and s2, so that s contains at most one occurrence of each symbol . Example: S1 : a c b a S2 : a b c b a RF-LCS: a c b a(4) => a c b, a c a, c b a(3) 2025/6/10 5

  6. DC-LCS(1/2) Problem 3 (Doubly-Constrained Longest Common Subsequence) Input: two strings s1and s2, a string constraint Cs, and an occurrence constraint Co: N. Output: a longest common subsequence s of s1 and s2, so that each string in Csis a subsequence of s and s contains at most Co( ) occurrences of each symbol . 2025/6/10 6

  7. DC-LCS(2/2) Example: S1 : a c b a S2 : a b c b a Cs: b a Co( ) 1 , DC-LCS: a c b, a c a, c b a(3) 2025/6/10 7

  8. A fixed-parameter algorithm for DC-LCS(1/6) We present a fixed-parameter algorithm for the DC-LCS problem when |Cs| 1 (hence the result also holds for the RF-LCS problem). 2025/6/10 8

  9. A fixed-parameter algorithm(2/6) h b a L S2 1 1 1 1 1 a 1 1 1 1 1 b 1 1 1 1 1 c 1 1 1 1 1 b 1 1 1 1 1 a 1 1 1 1 1 { } I S1 a c b a {a} ... {b} S2 0 0 0 0 a 0 0 0 0 b 0 0 0 1 c 0 0 0 1 b 0 0 0 1 a 0 0 0 1 S2 0 0 0 0 a 0 0 0 0 b 0 0 0 1 c 0 0 0 1 b 0 0 0 1 a 0 0 0 1 C C S1 a c b S1 a c b a 0 0 1 1 1 1 a 0 0 1 1 1 1 2025/6/10 9 {c}

  10. A fixed-parameter algorithm(3/6) h b a L {b} . S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 1 1 b 0 0 1 1 a 0 0 1 1 {c} C S1 a c b a 0 0 0 1 1 1 {b,c} S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 1 a 0 0 0 1 C S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 0 a 0 0 0 0 C S1 a c b S1 a c b a 0 0 0 0 1 1 a 0 0 0 0 0 0 2025/6/10 10

  11. A fixed-parameter algorithm(4/6) h b a L {a,b} S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 1 a 0 0 0 1 {b,c} C S1 a c b a 0 0 0 0 1 1 {a,b, c} S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 1 a 0 0 0 1 S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 0 a 0 0 0 0 C C S1 a c b S1 a c b a 0 0 0 0 1 1 a 0 0 0 0 0 1 2025/6/10 11

  12. A fixed-parameter algorithm(5/5) S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 1 1 b 0 0 1 1 a 0 0 1 1 C S1 a c b DC-LCS: c b a a 0 0 0 1 1 1 V[ i , j , h[0]= , L={c} ] S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 0 a 0 0 0 0 S2 0 0 0 0 a 0 0 0 0 b 0 0 0 0 c 0 0 0 0 b 0 0 0 1 a 0 0 0 1 C C S1 a c b S1 a c b a 0 0 0 0 0 1 a 0 0 0 0 1 1 2025/6/10 12 V[ i , j , h[1]= b , L={b,c} ] V[ i , j , h[2]= a , L={a,b,c}]

  13. A fixed-parameter algorithm(6/6) Total number of entries of the matrix V [ , , , ] is |s1||s2||sc|2k. In case 3 and case 4 require at most O(k). => Time complexity = |s1||s2||sc|2k O(k) 2025/6/10 13

  14. Perfect family of hash functions(1/4) Given a set S, a family F of hash functions from S to {1, 2, . . . ,k} is called perfect if, for any S S of size k, there exists an injective hash function f F from S to the set of labels {1, 2, . . . ,k} For example: s1 = aaaa bbbbb cccc dddd s2 = ddddd ccc bbbb aaaa =>Co(a) = Co(b) = 4, Co(c) =Co(d) = 3 2025/6/10 14

  15. Perfect family of hash functions(2/4) Then the set S( ) is equal to {(a, 1), (a, 2), (a, 3), (a, 4), (b, 1), (b, 2), (b, 3), (b, 4), (c, 1), (c, 2), (c, 3), (d, 1),(d, 2), (d, 3)}. fi F S K (a,1) 1 (a,2) 2 . . . . . k (d,3) 2025/6/10 15

  16. Perfect family of hash functions(3/3) S1= a b c b a S2= a c b a Sc= b a =>Co(a) = 2, Co(b) =Co(c) = 1 if k=3: fi(a1)=1 or fi(b)=2 fi(c)=3 fi+1(a2)=1 or fi+2(b)= 1 fi+1(b)=2 fi+1(c)=3 fi+2(a1)= 2 .. fi+2(a2)= 3 2025/6/10 16

  17. Perfect family of hash functions(4/4) By computing the recurrence for all hash functions in F , we are certain to find a solution of length k, if such a solution exists. There exists a perfect family of hash functions of size O(log | |)2O(k) that can be computed in O(| | log | |)2O(k) time. the algorithm has an overall O(|s1| log |s1|)2O(k) +O(|s1||s2||sc|2O(k) log | |) time complexity. 2025/6/10 17

  18. Thanks for listening. 2025/6/10 18

Related


More Related Content