Advanced Techniques for Approximate String Matching with Inversions and Translocations

approximate string matching allowing n.w
1 / 20
Embed
Share

Explore the intricate problem of approximate string matching considering inversions and translocations. Delve into an algorithmic approach, complexities, and efficient implementations for enhanced text analysis.

  • String Matching
  • Algorithm
  • Translocations
  • Inversions
  • Text Analysis

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Approximate String Matching Allowing for Inversions and Translocations Domenico Cantone, Simone Faro, Emanuele Giaquinta Proceedings of Prague Proceedings of Prague Stringology Stringology Conference Conference 2010, pp. 37 2010, pp. 37- -51 51 Presenter: Tzu-Chiang Hsu ( ) 2015/11/03

  2. Abstract (1/3) The approximate string matching problem consists in finding all locations at which a pattern P of length m matches a substring of a text T of length n, after a given finite number of edit operations. In this paper we investigate such problem when the string distance involves translocations of equal length adjacent factors and inversions of factors.

  3. Abstract (2/3) In particular, we devise a O(nm max( , ))-time and O(m2)-space algorithm, where and are respectively the maximum length of the factors involved in any translocation and inversion. Our algorithm is based on the dynamic-programming approach and makes use of the Directed Acyclic Word Graph of the pattern.

  4. Abstract (3/3) Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a O(n log m) average time complexity. Finally, we briefly sketch in an appendix an efficient implementation, based on bit-parallelism.

  5. reversal = inversion attagcct atcgatct

  6. translocation acatgcct agcccatt

  7. Approximate string matching allowing for inversions and translocations cabccbabcc cbaabcbc text: pattern:

  8. abcbc Directed Acyclic Word Graph

  9. abcbc {0} {2,4} {1,3} {1} {2} {4} {3}

  10. abcbc abacbabcc 121121221 {0}{1}{0} {2,4} {1,3} {0}{1}{2} {2,4}

  11. acbabcc 0 abcbc

  12. acbabcc 0 x abcbc

  13. acbabcc 0 x 2 cbcba abcbc

  14. acbabcc 0 x 2 0 abcbc

  15. acbabcc 0 x 2 0 3 abcbc

  16. acbabcc 0 x 2 0 3 abcbc

  17. O(mn max(, ))

  18. c {0,2} c b a b cb, b {1,3} cbc, bc c {2} cbcba, bcba, cba, ba, a a b cbcb, bcb a {4} {3}

Related


More Related Content