Optimizing Sorting and Hashing Algorithms for Efficient String Manipulation

hkbu icpc seminar3 n.w
1 / 19
Embed
Share

Explore advanced techniques for sorting, counting, and hashing operations in algorithm design to efficiently manage and process strings of characters. Learn how to implement bucket sort, handle collisions, and leverage hash functions effectively to enhance performance.

  • Algorithm Optimization
  • String Manipulation
  • Sorting Techniques
  • Hashing Algorithms

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. HKBU ICPC Seminar3 Zhu Xuliang

  2. Problem Sort a1, a2, , an in O(n), where 1 <= ai <= n

  3. Problem Sort a1, a2, , an in O(n), where 1 <= ai <= n Step1: Use cnt[x] to store the number of ai = x; Step2: Output cnt[x] times of x from 1 to n; Bucket sort

  4. Problem Given a list of string s1, s2, , sn (only contains a to z ), count the number of different strings.

  5. Problem Given a list of string s1, s2, , sn, count the number of different strings. Map string si -> int ai Cnt[ai]++

  6. Hash function If si = sj, Hash(si) = Hash(sj) If si != sj, Hash(si) != Hash(sj) Hash(s) = (s[0] a ) * 260+ (s[1] a ) * 261 + + (s[n] a ) * 26n Is it true?

  7. Hash function If si = sj, Hash(si) = Hash(sj) If si != sj, Hash(si) != Hash(sj) Hash(s) = (s[0] a ) * 270+ (s[1] a ) * 271 + + (s[n] a ) * 27n Hash(s) maybe too large Mod a large prime number

  8. Hash collision If si != sj, Hash(si) maybe equal Hash(sj) Double Hash si = sj <==> Hash1(si) = Hash1(sj) and Hash2(si) = Hash2(sj) si != sj <==> Hash1(si) != Hash1(sj) or Hash2(si) != Hash2(sj)

  9. Hash table Hash table: unordered_map (C++) / hashmap (java) Red-Black tree: map (C++) / treemap (java)

  10. Background of Mod We know a mod p = x, b mod p = y, where a and b are large number (a > b > 10100), p is a prime, and x, y are not zero. Calculate (use x, y, and p): (1) (a + b) mod p (2) (a b) mod p (3) (a * b) mod p (4) (a / b) mod p (if a mod b == 0)

  11. Background of Mod (1) (a + b) mod p = (x + y) mod p (2) (a b) mod p = (x y + p) mod p (3) (a * b) mod p = (x * y) mod p (4) (a / b) mod p (if a mod b == 0) = (a * b-1) mod p = (x * y-1) mod p How to calculate y-1mod p?

  12. Background of Mod Fermat s little theorem ( ) If gcd(a, b) = 1, y-1mod p = yp-2mod p y-1is called Modular Multiplicative Inverse ( / ) (4) (a / b) mod p (if a mod b == 0) = (x * (yp-2 mod p)) mod p

  13. Problem Give a string s1 and s2 (only contains a to z ), judge whether s1 is substring of s2?

  14. Problem Give a string s1 and s2 (only contains a to z ), judge whether s1 is substring of s2? Assume len(s1) = m, len(s2) = n. Check all substrings s of s2 with len(s) = m.

  15. Problem Give a string s1 and s2 (only contains a to z ), judge whether s1 is substring of s2? Assume len(s1) = m, len(s2) = n. Check all substrings s of s2 with len(s) = m. Time complexity O(mn).

  16. Problem Give a string s1 and s2 (only contains a to z ), judge whether s1 is substring of s2? Hash(s1) = x; Check all substrings s of s2 with len(s) = m by hash function. Time complexity?

  17. Problem Check all substrings s of s2 with len(s) = m by hash function. Hash(s2(0, m-1)) = (s2[0] a ) * 270+ (s2[1] a ) * 271 + + (s2[m - 1] a ) * 27n Hash(s2(1, m)) = (s2[1] a ) * 270+ (s2[2] a ) * 271 + + (s2[m] a ) * 27n Hash(s2(1, m)) * 27 - (s2[m] a ) * 27m+1+ (s2[0] a ) * 270= Hash(s2(0, m-1)) Time complexity: O(nlogm) or O(n)?

  18. Problem Another method for hash: Hash(s2(i, j)) * 27i= Hash(s2(0, j)) Hash(s2(0, i-1)) Hash(s2(i, j)) = (Hash(s2(0, j)) Hash(s2(0, i-1))) * 27-i

  19. Problem Give a string s1 and s2 (only contains a to z ), judge whether s1 is substring of s2? Another O(n) method: KMP (Knuth Morris Pratt algorithm)

Related


More Related Content