
Genome Rearrangement Paradigm Shift in Molecular Biology
Sequence comparison in molecular biology is undergoing a major shift towards chromosome comparison based on global rearrangements rather than local mutations. This paper explores genome rearrangements and sorting by reversals, presenting approximation algorithms for signed permutations and sorting by reversals. The concepts of Gollan permutation, maximum cycle decomposition, and new lower bounds are discussed, offering insights into the complexity of genetic rearrangement problems.
Uploaded on | 0 Views
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Genome Rearrangements and Sorting by Reversals VINEET BAFNA AND PAVEL A. PEVZNER SIAM J. COMPUT, Vol. 25, No. 2, pp. 272-289(1996) Presenter: Shian-Liang Lin 2025/6/10 0
Abstract Sequence comparison in molecular biology is in the beginning of a major paradigm shift--a shift from gene comparison based on local mutations to chromosome comparison based on global rearrangements. The classical methods of sequence comparison do not work for global rearrangements, and little is known in computer science about the edit distance between sequences if global rearrangements are allowed. In the simplest form, the problem of gene rearrangements corresponds to sorting by reversals, i.e., sorting of an array using reversals of arbitrary fragments. The gene rearrangement problem forces us to consider reversals of signed permutations, as the genes in DNA could be positively or negatively oriented. An approximation algorithm for signed permutation is presented, which provides a performance guarantee of 3/2. Finally, using the signed permutations approach, an approximation algorithm for sorting by reversals is described which achieves a performance guarantee of 7/4. 2025/6/10 1
Reversals over permutations Input: 4 2 1 3 Reversal distance =2 Reverse [1,3] 1 2 4 3 Reverse [3,4] 1 2 3 4 2025/6/10 2
The Gollan permutation G(r12): 0 3 1 5 2 7 4 9 6 11 8 12 10 13 G(r13): 0 3 1 5 2 7 4 9 6 11 8 13 10 12 14 Black edge( Gray edge ( ) : if j and j are not consecutive in ) : if (i, j) is a breakpoint of . 2025/6/10 3
Maximum cycle decomposition Black edge( Gray edge ( ) : if j and j are not consecutive in ) : if (i, j) is a breakpoint of . 2025/6/10 4
Anew lower bound ? ??= ? ?? 1+ 1 ? ?? 1 ? ?? 1 ? ?? ? ?? 2 ? ?? 1 ? ?? + ? ?? 1 ? ?? + ? ?? 1 ? ?? + ? ?? 2 ? ?? 1 + ? ?0 ? ?? ? ?? 2 ? ?? 1 ? ?0 ? ?0 ? ?? Assume cycle are all 4-cycles: each 4-cycle has two black edges(two breakpoints) ? ? ? ? ? ? =? ? 2 2 2025/6/10 5
A new lower bound (1) 0 n+1 cycle 2 (2) cycle size k: (3) edge k cycle edge k=6( cycle size ): the performance ratio of our algorithms is 2025/6/10 6
Approximation algorithm for signed permutations Define a transformation from a signed permutation of order n to an unsigned permutation as follows: replace +i by 2i- 1,2i and -i by 2i, 2i- 1. : 0 -7 +1 +2 +4 +5 +3 -6 +8 9 : 0 14 13 1 2 3 4 7 8 9 10 5 6 12 11 15 16 17 2025/6/10 7
Approximation algorithm for signed permutations : 0 -7 +1 +2 +4 +5 +3 -6 +8 9 +2 11 10 5 15 +3 +4 17 4 +5 14 16 9 8 -6 +8 7 0 9 2 +1 3 1 13 12 6 0 -7 ? ? 6 1=5 ? ? 6 0 = 7 2025/6/10 : 0 14 13 1 2 3 4 7 8 9 10 5 6 12 11 15 16 17 8
SignedSort() At most n-1 breakpoints O(n) time Steps in sorting: b( )- 1/2c4( ) steps Time complexity : O(n2) Performance ratio : 3/2 2025/6/10 9
?: 0 3 1 5 2 6 4 7 2025/6/10 10
?: 1 4 5 2 4 1 5 2 ? : 1 4 2 5 ? : 2025/6/10 11
ImprovedSort(?) O(n3/2) time O(n2) time Steps in sorting: b( )- 1/4c4( ) steps Time complexity: O(n2) Performance ratio : 7/4 2025/6/10 12
ImprovedSort(?) G(r6) ?: 0 3 1 5 2 6 4 7 ?: 0 1 2 3 4 5 6 7 ???: 0 1 5 4 3 2 6 7 ?: 0 4 1 2 5 6 3 7 ?: 0 1 2 3 4 5 6 7 2025/6/10 13
ImprovedSort(?) 2025/6/10 14
ImprovedSort(?) : 0 5 10 6 2 7 3 1 8 9 4 11 4 1 5 0 2 3 6 8 7 10 11 9 2025/6/10 15
ImprovedSort(?) A1 A A2 B1 B B2 2025/6/10 16
ImprovedSort(?) Steps in sorting: b( )- 1/2c4( ) steps Time complexity : O(n2) Performance ratio : 3/2 2025/6/10 17
ImprovedSort(?) O(n3/2) time O(n2) time Steps in sorting: b( )- 1/4c4( ) steps Time complexity: O(n2) Performance ratio : 7/4 2025/6/10 18
: 0 14 13 1 2 3 4 7 8 9 10 5 6 12 11 15 16 17 11 10 5 11 10 5 15 4 4 14 7 7 0 0 1 13 12 6 1 13 12 6 : 0 11 12 6 5 10 9 8 7 4 3 2 1 13 14 15 16 17 2025/6/10 19
: 0 11 12 6 5 10 9 8 7 4 3 2 1 13 14 15 16 17 11 10 5 11 10 5 4 4 7 7 0 12 1 13 6 13 12 6 : 0 1 2 3 4 7 8 9 10 5 6 12 11 13 14 15 16 17 2025/6/10 20
: 0 1 2 3 4 7 8 9 10 5 6 12 11 13 14 15 16 17 11 10 5 10 5 11 4 4 7 7 6 13 12 6 : 0 1 2 3 4 7 8 9 10 5 6 11 12 13 14 15 16 17 2025/6/10 21
: 0 1 2 3 4 7 8 9 10 5 6 11 12 13 14 15 16 17 11 10 5 11 10 5 4 4 7 7 6 6 : 0 1 2 3 4 10 9 8 7 5 6 11 12 13 14 15 16 17 2025/6/10 22
: 0 1 2 3 4 10 9 8 7 5 6 11 12 13 14 15 16 17 11 10 5 5 4 4 7 7 6 6 : 0 1 2 3 4 6 5 7 8 9 10 11 12 13 14 15 16 17 2025/6/10 23
2025/6/10 24