Optimal Space Hashing Techniques for Efficient Searching

randomized algorithms cs648 n.w
1 / 31
Embed
Share

Learn about hashing techniques for achieving optimal space utilization and fast search times in randomized algorithms. Explore concepts such as perfect hashing, universal hash families, collision definitions, and more. Understand how to construct hash tables effectively for querying data structures in O(1) time.

  • Hashing Techniques
  • Optimal Space
  • Fast Search
  • Randomized Algorithms
  • Perfect Hashing

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Randomized Algorithms CS648 Lecture 12 Hashing - II 1

  2. RECAP OF LAST LECTURE

  3. Problem Definition Examples: ? = 1018, ? = 103 ? = 1,2, ,? called universe ? ?and ? = |?| ? ? Aim Given a set ?, build a data structure storing ? s.t. we can answer in O(1) time : Does? ?? for any given ? ?.

  4. Hashing Answering a Query: Does? ?? 1. ? ?(?); 2. Search the list stored at ?[?]. Hash table: Elements of ? ?: an array of size ?. Hash function ? : ? [?] ? 0 1 Properties of ? : ? ? computable in O(1) time. Space required by ?: O(1). ? ? How many bits needed to encode ? ?

  5. Collision Definition: Two elements ?,? ?are said to collide under hash function ?if ? ? = ? ? ? 0 1 Worst case time complexity of searching an item ? : No. of elements in ? colliding with ?. ? ?

  6. Universal Hash Family Definition: A collection ? of hash-functions is said to be universal if there exists a constant ? such that for any ?,? ?, ? ? ?? ?? ? ? = ? ? This definition appears strange in the beginning! But we shall soon see that there is a very natural way to arrive at this definition.

  7. Perfect hashing using O(??) space Let ? be Universal Hash Family. Let ? : the number of collisions for ? when ? ?? ? Question: What is ?[?] ? ??,?= ? if ? ? = ?(?) otherwise ? ? = ??,? ?<? ??? ?,? ? ? ? = ?[??,?] ?<? ??? ?,? ? = ?[??,?= ?] ?<? ??? ?,? ? ? ? ?<? ??? ?,? ? =? ? ?(? ?) ?

  8. Perfect hashing using O(??) space Let ? be Universal Hash Family. Let ? : the number of collisions for ? when ? ?? ? ? ? ?(? ?) ? Lemma2:For ? = ???, there will be no collision with probability at least 1 Lemma1: ?[?] = 2. Algorithm1: Perfect hashing for ? Fix ? = ???; Repeat 1. Pick ? ?? ; 2. ? the number of collisions for ? under ?. Until ? = ?. Build the hash table. Theorem: A perfect hash function can be computed for ? in expected O(??) time.

  9. HASHING WITH OPTIMAL SPACE AND WORST CASE O(1) SEARCH TIME

  10. Optimal space hashing with worst case O(1) search time ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? ? ? ?(? ?) Lemma1: ?[?] = . ? Question: What is ?[?] when ? = ? ? Answer: ?(? ?) ?.

  11. Optimal space hashing with worst case O(1) search time ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? Lemma1: ?[?] =(? ?) ? ? when ? = ??. 0 Algorithm: Fix? = ??; Repeat 1. Pick? ?? ; 2. ? no. of collisions for ? under ?; Until ? ?; Build the hash table; //primary hash table For each 0 ? < ? If size of list ?[?] > 1 1. Build a perfect hash table for list ?[?]; 2. Make ?[?] point to this hash table; 1 ? ?

  12. Optimal space hashing with worst case O(1) search time ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? Lemma1: ?[?] =(? ?) ? ? when ? = ??. 0 Algorithm: Fix? = ??; Repeat 1. Pick? ?? ; 2. ? no. of collisions for ? under ?; Until ? ?; Build the hash table; //primary hash table For each 0 ? < ? If size of list ?[?] > 1 1. Build a perfect hash table for list ?[?]; 2. Make ?[?] point to this hash table; 1 ? ?

  13. Optimal space hashing with worst case O(1) search time ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? Lemma1: ?[?] =(? ?) ? ? when ? = ??. 0 Algorithm: Fix? = ??; Repeat 1. Pick? ?? ; 2. ? no. of collisions for ? under ?; Until ? ?; Build the hash table; //primary hash table For each 0 ? < ? If size of list ?[?] > 1 1. Build a perfect hash table for list ?[?]; 2. Make ?[?] point to this hash table; 1 ? ?

  14. Optimal space hashing with worst case O(1) search time ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? Lemma1: ?[?] =(? ?) ? ? when ? = ??. 0 Algorithm: Fix? = ??; Repeat 1. Pick? ?? ; 2. ? no. of collisions for ? under ?; Until ? ?; Build the hash table; //primary hash table For each 0 ? < ? If size of list ?[?] > 1 1. Build a perfect hash table for list ?[?]; 2. Make ?[?] point to this hash table; 1 ? ?

  15. ? be Universal Hash Family. ? : no. of collisions for ? when ? ?? ? Lemma1: ?[?] =(? ?) when ? = ??. ? ? ? 0 0 1 2 1 2 . . . . . . ? 1 ? 1 Let ?? : number of elements in ?[?] Extra Space required: ?<? ?????>1??2 ???? 1 Is there any relation between ? and ?? s? ?= ?<? ?????>1 ?<? ?????>1??2 = 2? + ?<? ?????>1?? ?<??????>1??2< ?? 2

  16. Theorem: A given set ? can be preprocessed in expected O(?) time to build a data structure (2-level hash table) of O(?) size such that any search query can be answer in worst case O(1) time.

  17. WHY SUCH A DEFINITION FOR UNIVERSAL HASH FAMILY ?

  18. Why does hashing work so well in Practice ? A simple hash function:? ? = ? ??? ?. ? works so well in practice because the set ? is usually a uniformly random subset of ?. As a result ? ??,? ?? ? ? = ? ? ? It is easy to fool this hash function such that it achieves O(s) search time. This makes us think: Can we achieve expected O(1) search time for any given set ?. similar question while Quick Sort Randomized Quick Sort

  19. Universal Hash Family A simple hash function:? ? = ? ??? ?. ? ??,? ?? ? ? = ? ? ? Definition: A collection ? of hash-functions is said to be universal if there exists a constant ? such that for any ?,? ?, ? ? ?? ?? ? ? = ? ?

  20. A SIMPLE AND COMPACT UNIVERSAL HASH FAMILY

  21. The starting point The simple hash function:? ? = ? ??? ?. Problem: Two elements in ?,? ? are bound to collide if ? divides |? ?| . Is there some operation which when applied over any ? distributes |? ?| randomly uniformly over [0,1, ,? 1] ?

  22. mod operation ?: a non-negative integer ?: a positive integer ?mod? {0,1, ,? 1}. Question: How is |?mod? ?mod?| related to |? ?|mod? ? Consider some Examples: | 55 mod 31 43 mod 31 | = ?? and | 55 43| mod 31 = ?? 12 12 | 91 mod 31 102 mod 31 | = ?? and | 91 102| mod 31 = ?? 20 11 {?, ? ?} Answer: Let ?= |? ?| mod? . Then |?mod? ?mod?| = ??

  23. mod operation ? : a prime number ? : {1,2, ,? 1} Consider any ? ?. Question: What can we say about set ?? = {?? mod ? | ? ?} ? Example: ? = 7, ? = 2. ? 1 2 3 4 5 6 3? mod 7 3 6 2 5 1 4

  24. mod operation ? : a prime number ? : {1,2, ,? 1} Consider any ? ?. Question: What can we say about set ?? = {?? ??? ? |? ?} ? Example: ? = 7, ? = 3. ? 1 2 3 4 5 6 3? mod 7 4? mod 7 6 2 5 1 4 3 4 1 5 2 6 3 Fact: ?? = ? for all ? ?. Proof: ?? ??? ? = ?? ??? ? ? divides (?? ??) ? divides ?(? ?) ? divides ?or? divides (? ?) Not possible

  25. mod operation ? : a prime number ? : {1,2, ,? 1} Consider any ? ?. Define set ?? = {?? ??? ? |? ?} ? Fact: ?? = ? for all ? ?. Question: If ? ??, then what can we say about (?? ??? ?) ? Answer: distributed randomly uniformly over ?. Can you now see, that the above answer plays the key role in formulating the hash function ??? = (?? ??? ?) ??? ? ?

  26. ??? = (?? ??? ?) ??? ? 1 2 Good fact: An element ? is mapped to a random element in {0, ,? 1}. ?? ??? ? . . . Slightly bad fact : Once element ? is mapped to a location, the mapping of ? + is no more random. ? ? + So it is not clear whether |??? + - ??? | is mapped uniformly randomly over {0, ,? 1}. So let us see ??() a bit more closely ? = ? 1

  27. Probability of collision between ?and? + Let ??? = (?? ??? ?) ??? ? ?and? + will collide under ??if |??mod? ? + ?mod?| is divisible by ?. Question: What is relation between |??mod? ? + ?mod?| and ?mod? ? Answer: |??mod? ? + ?mod?| is either ?mod? or ? ?mod?.

  28. Probability of collision between ?and? + Let ??? = (?? ??? ?) ??? ? Lemma: If ?and? + collide under ??, then either ?mod? is divisible by ? or ? ?mod? is divisible by ?. {1, ,? 1} { ?mod? | 1 ? ? 1} = ?? Students must realize that it is a necessary condition and not sufficient condition for collision. To get an idea, study the example given at the last slide of this lecture. Let ? ?{1, , ? 1}. Probability of collision between ? and ? + = P( ?mod? is divisible of ? = 2 P( ?mod? is divisible of ?) or ? ?mod? is divisible by ?) ? 1 ? ? 1 2 ? = 2

  29. Theorem: Let??? = (?? ??? ?) ??? ?, then H={??| 1 ? ? 1} is universal.

  30. Example ? = 7, ? = 4. ??? = (?? ??? 7) ??? 4. ? ? 1 2 3 4 5 6 ? 1 ? 1 2 3 4 5 6 1 2 3 4 5 6 Observe that =1 2 4 6 1 3 5 Question: How many collisions between 2 and 3 ? Answer: two (for ?=3,4). Here ? ??? 7 = 4 for ?=4. And 7 ? ??? 7 = 4 for ?=3 3 6 2 5 1 4 4 1 5 2 6 3 5 3 1 6 4 2 6 5 4 3 2 1 Table storing?? ??? 7 Question: How many collisions between 2 and 4 ? Answer: No collisions! (although ? ??? 7 = 4 for ? = 2 here.)

  31. Homework: Let??,?? = (?? ??? ? + ?) ??? ?, Then prove that H={??,?| 1 ?,? ? 1} is universal. In particular, show that for any ?,? ?, =1 ?? ?? ? ? = ? ? ? Hence it is slightly better than the hash family discussed just now.

Related


More Related Content