Efficient String Matching Algorithm Using Condensed Alphabets

a very fast string matching algorithm based n.w
1 / 27
Embed
Share

"Discover a highly efficient string matching algorithm based on condensed alphabets, presenting a novel approach to reduce verifications during the searching phase. Compare against existing solutions, showcasing superior performance in practical scenarios."

  • Efficient
  • String Matching
  • Algorithm
  • Condensed Alphabets
  • Flexibility

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. A Very Fast String Matching Algorithm Based on Condensed Alphabets Grabowski, Szymon, Simone Faro, and Emanuele Giaquinta. Algorithmic Aspects in Information and Management. AAIM 2016. Lecture Notes in Computer Science, Vol. 9778. Presenter : Shie-Yan Lee 2017/6/27

  2. Abstract String matching is the problem of finding all the substrings of a text which correspond to a given pattern. It s one of the most investigated problem in computer science, mainly due to its various applications in many fields. In recent years most solutions to the problem focused on efficiency and flexibility of the searching procedure and effective techniques appeared to speed-up previous solutions. In this paper we present a simple and very efficient algorithm for string matching. It can be seen as an extension of the Skip-Search algorithm to condensed alphabets with the aim of reducing the number of verifications during the searching phase. From our experimental results it turns out that the new variant obtains in most cases the best running time when compared against the most effective algorithms in literature. This makes the new algorithm one of the most flexible solutions in practical cases.

  3. String matching problem pattern ? ? text ? ?

  4. String matching problem pattern ? ? text ? ? ? ? ?

  5. pattern ? ? The Skip Search Algorithm text ? ? Preprocessing phase Size of alphabet ? Search phase

  6. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ? = all positions of char ? in ? ? ? ? ? 0 , 2 ? 1 ?

  7. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ?), For each positions ??(? ??? ? = ?? 1 , for ? = 1 ,2 , , examine substring ? ? ? ?? ? ?[??] + ? 1 ?11 ?2 ?5 ?8 ? ? ? ? 0 , 2 ????????????? ? 1 ?

  8. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ?), For each positions ??(? ??? ? = ?? 1 , for ? = 1 ,2 , , examine substring ? ? ? ?? ? ?[??] + ? 1 ?2 ?5 ?8 ?11 ? ? ? ? 0 , 2 ????????????? ? 1 ? ???

  9. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ?), For each positions ??(? ??? ? = ?? 1 , for ? = 1 ,2 , , examine substring ? ? ? ?? ? ?[??] + ? 1 ?2 ?5 ?8 ?11 ? ? ? ? 0 , 2 ????????????? ? 1 ??? ? ???

  10. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ?), For each positions ??(? ??? ? = ?? 1 , for ? = 1 ,2 , , examine substring ? ? ? ?? ?2 ?5 ?8 ? ?[??] + ? 1 ?11 ? ? ? ? 0 , 2 ????????????? ? 1 ? ???

  11. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ? ?), For each positions ??(? ??? ? = ?? 1 , for ? = 1 ,2 , , examine substring ? ? ? ?? ?2 ?5 ?8 ? ?[??] + ? 1 ?11 ? ? ? ? 0 , 2 ????????????? ? 1 ?

  12. pattern ? ??? The Skip Search Algorithm text ? ????????????? Preprocessing phase Size of alphabet ? Search phase ?1 ?3 ?2 ????????? ????????? ?????????

  13. Complexity of the Skip Search Algorithm Preprocessing phase : ? ? Searching phase : ? ?? Worst case : ? ??

  14. The Alpha Skip Search Algorithm Preprocessing phase Search phase

  15. The Alpha Skip Search Algorithm pattern ? ? Preprocessing phase text ? ? Search phase Size of alphabet ? Arrange all the substrings of length ? ? = log?? in a trie ?? . ? = log39 = 2 pattern ? ????????? ?? ?? ?? ?? ?? ??

  16. Trie ( / ) A , to , tea , ted , ten , i , inn , in

  17. The Alpha Skip Search Algorithm pattern ? ????????? ? ? ? ?? ?? ? ?? ?? ?? ? ? ? ? ?? ? ?? ?? ?? ?? ?? ?? ?????? ? ? ?? 6 ?? 7 ?? 3 ?? 2 , 5 ?? 0 ?? 1 , 4

  18. The Alpha Skip Search Algorithm pattern ? ????????? Preprocessing phase text ? ??????????????????? Search phase Look into buckets of text ? ? ? + ? 1 , and examine. ( ? = ? ? ? + 1 1, ? = 1,2,3, ) ?1 ?????? ? ? ??????????????????? ?? 6 ?? 7 3 ?? ?? 3 ?? 2 , 5 ????????? ?? 0 ?? 1 , 4

  19. The Alpha Skip Search Algorithm pattern ? ????????? Preprocessing phase text ? ??????????????????? Search phase Look into buckets of text ? ? ? + ? 1 , and examine. ( ? = ? ? ? + 1 1, ? = 1,2,3, ) ?2 ?????? ? ? ??????????????????? ?? 6 ?? 7 7 ?? ?? 3 ?? 2 , 5 ?? 0 ????????? ?? 1 , 4

  20. The Alpha Skip Search Algorithm ?3 ?2 ?1 ????????? ????????? ????????? ?

  21. Complexity of the Alpha Skip Search Algorithm Worst case : ? ?? ? log?? ? log?? Expected time complexity : ?

  22. New Fast Variant of the Skip Search Algorithm pattern ? ? Size of alphabet ? const ? (1 ? ?) text ? ? Preprocessing phase indexes all subsequences of the length ? of the pattern ? (substring of length ? converted into a numeric value.) ? ?-bit ? ?? ? ??+1 ? ? ??+? 2 ??+? 1 ???? ?????????? ? ?

  23. New Fast Variant of the Skip Search Algorithm pattern ? ? Size of alphabet ? const ? (1 ? ?) text ? ? Preprocessing phase indexes all subsequences of the length ? of the pattern ? (substring of length ? converted into a numeric value.) ? ? ? ... 1233 11 , 5 1234 25 1235 2 , 9

  24. New Fast Variant of the Skip Search Algorithm Searching phase examine the candidate substrings ? = ? ? ? + 1 1 for k = 1,2,3, ? ? ? ... ?1 1233 11 , 5 text ? 1234 25 1235 2 , 9 ?1 ?

  25. New Fast Variant of the Skip Search Algorithm Searching phase examine the candidate substrings ? = ? ? ? + 1 1 for k = 1,2,3, ? ? ? ... ?1 1233 11 , 5 text ? 1234 25 1235 2 , 9 ? compared with ? ?

  26. Complexity of this Algorithm Preprocessing phase: ? ?? = ?(?) Search phase: ? ?? (?????)

  27. Running time

Related


More Related Content