Data Structures Overview

1 / 27

Embed Share

"Learn about arrays, linked lists, and binary search trees in this overview of data structures. Understand the performance of different data structures and explore set representations, hash tables, and hash functions."

geck275 Follow

Uploaded on Jul 15, 2025 | 0 Views

Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

Download Presentation

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript

Hash Tables Hash Tables

Overview of Data Structures Overview of Data Structures Arrays Access: O(1) Insertion: O(N) (average case) Deletion: O(N) (average case) Linked lists Access: O(N) Insertion: O(N) (average case O(1) at front) Deletion: O(N) (average case O(1) at front) Binary Search Trees Access: O(logN) if balanced Insertion: O(logN) if balanced Deletion: O(logn) if balanced

BST: Better Performance BST: Better Performance Divide and conquer reduce work by a factor of 2 at each step Can we reduce by a bigger factor?

Set Representation Set Representation To represent keys with no duplicates: Use an array Store value k at index k in array Could just store 0 (not in set) or 1 (in set) at index k (or true/false) Problems memory use, often sparsely filled array; negative ints, ... Big-O analysis: Access: Insertion: Deletion: index 0 1 2 3 4 5 6 7 value 0 0 2 0 4 5 0 0

Hash Tables Hash Tables Hash tables overcome the problems of arrays but maintain fast access, insertion, deletion Use an array and hash functions to determine the index of each element hash: to mix randomly, to chop into pieces hash function: mapping that takes a large data value (not necessarily an integer) and reduces it to a smaller piece of data (usually an integer) maps values to indexes hash code: the output of a hash function for a particular input Ex: On last slide, the hash function was hashCode(i) = i hash table: array that stores elements via hashing Call the array entries "buckets" in previous slide, we had 8 buckets

Another Hash Function Another Hash Function Want to allow negative integers hashCode(i) = abs(i) % capacity, where capacity is number of buckets int hashCode(int i) { return abs(i) % capacity; } Suppose capacity = 8 mySet.add(3); mySet.add(-4); mySet.add(10); mySet.add(115); // collides with 3 index 0 value 1 2 10 3 3 115 4 -4 5 6 7

Collisions Collisions collision: a hash function maps two or more values to same index mySet.add(3); mySet.add(-4); mySet.add(10); mySet.add(115); // collides with 3 Must have way of resolving collisions index 0 1 2 3 4 5 6 7 value 10 3 115 -4

Hash Function Example Hash Function Example Use names as the key Take the second letter of the name, take int value of letter (a=0, b=1, etc.), divide by 8 and take the remainder Ex: What does "Norbert" hash to? o 15 15 % 8 = 7 Mary 1 % 8 = 1 Huy 21 % 8 = 6 Brandon 18 % 8 = 2 Scott 3 % 8 = 3 Shyam 8 % 8 = 0 Bobby 15 % 8 = 7 Colin 15 % 8 = 7 (collision) Not a 1-1 mapping from keys to hash values What's the max number of values that this function can hash perfectly (without collisions)?

Hash Function Hash Function Another example of a hash function: maps a string to the sum of the ascii values for its characters int hashCode(string s) { int sum = 0; for(int i = 0; i < s.length(); i++) sum += s[i]; return sum; } Question: what's the hash code for "BAD"?

Collision Handling Collision Handling How to insert an element when an element is already occupying the bucket it's mapped to

Collision Resolution: Probing Collision Resolution: Probing probing: resolve a collision by moving to another index linear probing: move to next available index, wrapping around if necessary mySet.add(3); mySet.add(-4); mySet.add(10); mySet.add(115); // collides with 3 index 0 value 1 2 10 3 3 4 -4 5 115 6 7 quadratic probing: a variation that adds terms of a polynomial to index when a collision occurs index + 1, index + 4, index + 9, ... Disadvantage: clusters elements at consecutive indexes slows down lookup Resize when load factor reaches some limit load factor of table with n elements: n/capacity

Collision Resolution: Chaining Collision Resolution: Chaining Elements in hash table are another data structure linked list, balanced binary tree Uses more space Everything goes in the right bucket, can't run out of indexes index value 0 NULL 1 NULL 2 3 NULL 4 5 NULL 6 NULL 12 34 74 42 14

Chaining: how to add element Chaining: how to add element Make sure you avoid duplicates Add to linked list of new element's bucket fastest to add to front Insertion: O(1) index value 0 NULL 1 NULL 2 3 NULL 4 5 NULL 6 NULL 12 34 74 42 14

Implement Implement HashSet HashSet Class Class Exercise: Represent a set of integers using a hash table Include the following member functions: HashSet() add(int value) clear() contains(int value) remove(int value)

HashSet.h HashSet.h #ifndef _HASHSET_H #define _HASHSET_H #include<string> struct HashNode { int data; HashNode* next; HashNode(int data = 0; HashNode* next = NULL) { this data = data; this next = next; } };

HashSet.h HashSet.h class HashSet { public: HashSet(); // construct new empty set ~HashSet(); // frees memory allocated by set void add(int value); // adds value to set if not already present void clear(); // removes all elements from set bool contains(int value) const; // returns true if value is in set void print() const; // prints hash table void remove(int value); // Removes value from set, if present private: HashNode** elements; // array of HashNode pointers int capacity; int size; int hashCode(int value) const; // returns integer index of value }; #endif

HashSet.cpp HashSet.cpp #include "HashSet.h" using namespace std; HashSet::HashSet() { elements = new HashNode* [10](); capacity = 10; size = 0; } HashSet::~HashSet() { // to do delete [] elements; } int HashSet::hashCode(int value) const { return abs(value) % capacity; }

add Method add Method Add an element to the hash table Don't add duplicate elements Compute hash code, and add element to corresponding linked list (if it's not already there) index value 0 NULL 1 NULL 2 3 NULL 4 5 NULL 6 NULL 12 34 table.add(52); 74 42 52 14

Exercise: add Method Exercise: add Method Write the add Method for HashSet https://www.youtube.com/watch?v=PCIvOGveIK0

HashSet.cpp HashSet.cpp void HashSet::add(int value) { if(!contains(value)) { int bucket = hashCode(value); // add to linked list HashNode* newNode = new HashNode(value); newNode->next = elements[bucket]; elements[bucket] = newNode; size++; } }

HashSet.cpp HashSet.cpp void HashSet::print() const { for(int i = 0; i < capacity; i++) { cout << "[" << i << "]: "; HashNode* cur = elements[i]; while(cur != NULL) { cout << " -> " << cur->data; cur = cur->next; } cout << endl; } cout << "size = " << size << endl; }

contains Method contains Method Is a value in the hash table? Returns true or false. Compute the value's hash code, and look through linked list at that index. table.contains(42); // true table.contains(54); // false index value 0 NULL 1 NULL 2 3 NULL 4 5 NULL 6 NULL 12 34 74 42 14

Exercise: contains Method Exercise: contains Method Write the contains method for HashSet

contains Method contains Method bool HashSet::contains(int value) { int bucket = hashCode(value); HashNode* cur = elements[bucket]; while(cur != NULL) { if(cur->data == value) return true; cur = cur->next; } return false; }

remove Method remove Method

remove Method remove Method // if value is in hash table, remove it. void HashSet::remove(int value) { if(!contains(value)) return; int bucket = hashCode(value); HashNode* temp = NULL; if(elements[bucket] data == value){ temp = elements[bucket]; elements[bucket] = temp next; } // element to remove isn't first else{ HashNode* cur = elements[bucket]; HashNode* temp = NULL; while(cur next != NULL) { if(cur next data == value){ temp = cur next; cur next = cur next next; } cur = cur next; } } delete temp; }

Rehash Rehash rehash: move to larger array when table too full Can't copy old array into larger array hash codes will change load factor = (# of elements)/(# of buckets) Common to rehash when load factor is around 0.75 Reduces collisions linked lists are shorter

Data Structures Overview

Download Presentation

Presentation Transcript

Related

More Related Content