Filters in CSCI 333: Bloom, Quotient, and Cuckoo

filters bloom quotient cuckoo n.w
1 / 10
Embed
Share

Explore the concepts of Bloom, Quotient, and Cuckoo filters in CSCI 333, including their operations, growth strategies, and handling of RAM constraints. Learn how Quotient filters work based on techniques from Donald Knuth's renowned book, "The Art of Computer Programming."

  • Filters
  • CSCI 333
  • Quotient
  • Bloom
  • Cuckoo

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Filters (Bloom, Quotient, & Cuckoo) CSCI 333

  2. Bloom Filters Are there any problems with Bloom filters? What operations do they support/not support? How do you grow a Bloom filter? What if your filter itself exceeds RAM (how bad is locality)?

  3. Quotient Filters Based on a technique from a homework question in Donald Knuth s The Art of Computer Programming: Sorting and Searching, volume 3 (Section 6.4, exercise 13) Quotienting Idea: 1 0 1 1 0 0 1 0 1 1 0 1 1 1 0 0 1 0 1 Hash:

  4. Quotient Filters Based on a technique from a homework question in Donald Knuth s The Art of Computer Programming: Sorting and Searching, volume 3 (Section 6.4, exercise 13) Quotienting Idea: Remaining bits are discarded/lost 1 0 1 1 0 0 1 0 1 1 0 1 1 1 0 0 1 0 1 Hash: Quotient: q most significant bits Remainder: r least significant bits

  5. Building a Quotient Filter The quotient is used as an index into an m-bucket array, where the remainder is stored. Essentially, a hashtable that stores the remainders as a value Collisions are resolved using linear probing and 3 extra bits per bucket is_occupied: whether a slot is the canonical slot for some value currently stored in the filter is_continuation: whether a slot holds a remainder that is part of a run (but not the first) is_shifted: whether a slot holds a remainder that is not in its canonical slot

  6. Quotient Filter Example Table of objects with quotients/ remainders for reference Hash table with external chaining Hash table with linear probing + bits [https://www.usenix.org/conference/hotstorage11/dont-thrash-how-cache-your-hash-flash]

  7. Quotient Filter Example [https://www.usenix.org/conference/hotstorage11/dont-thrash-how-cache-your-hash-flash]

  8. Quotient Filter Concept-check What are the possible reasons for a collision? Which collisions are treated as false positives What parameters does the QF give the user? In other words: What knobs can you turn to control the size of the filter? What knobs can you turn to control the false positive rate of the filter?

  9. Why QF over BF? Supports deletes Supports merges Good cache locality How many locations accessed per operation? Some math can show that runs/clusters are expected to be small Don t Thrash, How to Cache Your Hash on Flash also introduces the Cascade filter, a write-optimized filter made up of increasingly large QFs that spill over to disk. Similar idea to Log-structured merge trees, which we will discuss soon!

  10. Cascade Filter [https://www.usenix.org/conference/hotstorage11/dont-thrash-how-cache-your-hash-flash]

More Related Content