
Advanced Pattern Matching Techniques for Efficient Data Streaming
Explore the cutting-edge methods of pattern matching in data streaming, including approximate pattern matching and string decomposition, designed by experts in the field. Discover algorithms that optimize space and time efficiency while swiftly identifying patterns amidst data streams. Dive into the world of pattern matching with advanced algorithms like Knuth-Morris-Pratt and Boyer-Moore, and witness the evolution of techniques such as edit distance sketches and 2-Hamming Approximate Pattern Matching.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Streaming ?-edit approximate pattern matching via string decomposition Sudatta Bhattacharya Michal Kouck Charles U., Prague
Pattern matching [Knuth-Morris-Pratt 77, Boyer-Moore 77] ? a b c d e f g h c d e f n ? c d e f Input: Pattern ? and text ?. Goal: Find all occurrences of pattern ? in text ?. ? = ? ? = ?
?-edit approximate pattern matching ? a b c s e f g h c d i e f n ? c d e f Input: Pattern ? and text ?. Goal: Find all approximate occurrences of pattern ? in text ? up-to ? modifications (symbol insertion/deletion/substitution). ??= min ? ??(?,? ?,? ) report: min(??,? + 1)
Streaming pattern matching Phase 2: ? a b c s e f g h c d i e f n Phase 1: ? c d sketch e f Phase 1: Build sketch from ?. Phase 2: Perform pattern matching using the sketch.
Our result Streaming ?-edit pattern matching algorithm that uses: space ?(?2) time ?(?2) per arriving symbol
Previous results ?-edit Starikovskaya 17: Kociumaka-Porat-Starikovskaya 22: ?(?5) ours: space ?(?8 ?) time per symbol ?(?12? + ?13) ?(?8) ?(?2) ?(?2) ?-Hamming Porat-Porat 09: Clifford-Fontaine-Porat- Sach-Starikovskaya 09: Clifford-Kociumaka-Porat 19: ?(?3) ?(?2) ?(?2) ?(?) ?( ?) ?( ?)
Edit distance sketches ? sketch? ??(?,?) ? sketch? sketch size ?(?8) ?(?3) Belazzougui-Zhang 16: (?3) Jin-Nelson-Wu 21: Kociumaka-Porat-Starikovskaya 21: ?(?2) ?(?2) Bhattacharya-K. 23:
String decomposition [Bhattacharya-K. 23] ? ? ? ? ??? ?1 ?2 ?? 1 1. Each block has grammar of size ?(?). ? ??? ? ? ?? 1 ?1 ?2 2. Can be computed in streaming fashion. 3. For a pair of strings ?, ?: ? ? ? ? ?? ?? 1 ?1 ?2 ? ?) ?,???? ?? ?? ?,? = ??(???? ?? ?=1
Streaming ?-edit approximate pattern matching ?2-Hamm. Approx. Pattern Matching ? ??? ? ? ?? 1 ?? ?+2 ?? ?+1 ? ? ? ? ? ?? ?? 1 ?1 ?2 ? Clifford-Kociumaka-Porat 19 Key message: Pattern matching on grammars with up to ? ?2 mismatches.
Open problems Smaller space: ?(?) ? Recover the edit operations ?