
Advanced Techniques in Database Systems Management
Explore topics like B+ tree node coalescence, hashing, query execution, and leaf coalescence in database systems. Understand the main purpose of index structures and B+ tree rules to optimize search processes and manage dynamic changes efficiently.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
CSCE-608 Database Systems Spring 2025 Instructor: Jianer Chen Office: PETR 428 Phone: 845-4259 Email: chen@cse.tamu.edu Notes 27: Node coalescence in B+ trees, Hashing
in tables (relations) lock table DDL language DDL complier database administrator concurrency control file logging & recovery manager transaction manager index/file manager buffer manager DML (query) language query execution engine database programmer DML complier main memory buffers secondary storage (disks) DBMS
The Main Purpose of Index Structures Speedup the search process quickly figure out blocks containing the desired tuples index a=6(R) otherwise have to scan the entire R disks But also need to handle dynamic changes of R 3
B+tree rules A B+tree node of order n k2 p2 k1 k1 k2 kn pn pn+1 p1 kn p3 p1 p2 p3 pn+1 pn Rule 1. All leaves are at same lowest level (balanced tree) Rule 2. Pointers in leaves point to records except for sequence pointer Rule 3. Number of keys/pointers in nodes: Max. # pointers n+1 n+1 n+1 Max. # keys n n n Min. # pointers (n+1)/2 (n+1)/2 +1 2 Min. # keys (n+1)/2 1 (n+1)/2 1 Non-leaf Leaf Root 4
Delete in B+tree Basic idea: 1. Find the leaf L where the record r should be deleted; 2. If L remains at least half-full after deleting r, then delete r, and return; 3. Else consider the sibling L of L; 3.1 If L is more than half-full, then move a record from L to L, and return; 3.2 Else combine L and L into a single leaf; 3.3 But now you need to consider the case of deleting a child from L s parent (recursively) 5
Delete in B+tree Simple case: the node remains at least half-full after deletion. Re-distribute keys among siblings Coalesce with a sibling and delete a pointer from its father Observation: when two siblings both are no more than half full, coalesce them into a single node (which is nearly full) 6
Leaf Coalescence p k pkp k i = (n+1)/2 1 t = (n+1)/2 p1 k1 piki --- p* q* q1h1 qt ht --- 8
Leaf Coalescence p k pkp k i = (n+1)/2 1 t = (n+1)/2 p1 k1 piki --- p* q* q1h1 qt ht --- p k pkp k p1 k1 piki q1h1 qt ht -q* q* 9
Leaf Coalescence p k pkp k i = (n+1)/2 1 t = (n+1)/2 p1 k1 piki --- p* q* q1h1 qt ht --- p k pkp k p1 k1 piki q1h1 qt ht -q* q* 10
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 38 60 80 10 40 50 60 75 3 5 10 20 80 90 12
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 38 60 80 10 40 50 60 75 3 5 10 20 80 90 Delete 5 13
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 38 60 80 10 40 50 60 75 3 5 10 20 80 90 14
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 38 60 80 10 40 50 60 75 3 5 10 20 80 90 The sibling is just half-full, so we should coalesce 15
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 38 60 80 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 16
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) Less than half-full 38 60 80 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 17
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) Less than half-full more than half-full, so we can re- distribute pointers at nonleaves 38 60 80 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 18
Leaf coalescence : Delete key 5 order n=3 Delete(prt, 5) Less than half-full more than half-full, so we can re- distribute pointers at nonleaves 38 60 80 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 19
Key re-distribution at Nonleaves order n=3 Delete(prt, 5) Less than half-full more than half-full, so we can re- distribute pointers at nonleaves 38 60 80 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 20
Key re-distribution at Nonleaves order n=3 Delete(prt, 5) Less than half-full more than half-full, so we can re- distribute pointers at nonleaves 38 60 60 80 38 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 21
Key re-distribution at Nonleaves order n=3 Delete(prt, 5) Less than half-full more than half-full, so we can re- distribute pointers at nonleaves 38 60 80 60 80 38 10 Leaf coalescence 40 50 60 75 3 10 20 10 20 80 90 3 5 10 20 22
Key re-distribution at Nonleaves order n=3 60 80 80 38 40 50 60 75 3 10 20 80 90 23
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 25
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 p k pk p k p1 k1 piki pi+1 q1h1 qt ht qt+1 q1h1 qt ht qt+1 26
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 p k pk p k p1 k1 piki pi+1 q1h1 qt ht qt+1 q1h1 qt ht qt+1 ? 27
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 p k pk p k p1 k1 piki pi+1 q1h1 qt ht qt+1 q1h1 qt ht qt+1 ? 28
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 p k pk p k p1 k1 piki pi+1 q1h1 qt ht qt+1 q1h1 qt ht qt+1 k 29
Nonleaf Coalescence p k pk p k i+1 = (n+1)/2 1 t+1= (n+1)/2 p1 k1 piki pi+1 q1h1 qt ht qt+1 p k pk p k p1 k1 piki pi+1 q1h1 qt ht qt+1 q1h1 qt ht qt+1 k 30
Nonleaf coalescence : Delete key 5 order n=3 31
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 10 60 61 72 55 58 10 20 3 5 32
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 10 60 61 72 55 58 10 20 3 5 delete 5 33
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 10 60 61 72 55 58 10 20 3 5 34
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 10 60 leaf coalescence 61 72 55 58 10 20 3 10 20 3 5 10 20 35
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 less than half-full 10 60 leaf coalescence 61 72 55 58 10 20 3 10 20 3 5 10 20 36
Nonleaf coalescence : Delete key 5 order n=3 Delete(prt, 5) 55 less than half-full just half-full, so we need to coalesce 10 60 leaf coalescence 61 72 55 58 10 20 3 10 20 3 5 10 20 37
Nonleaf coalescence : Delete key 5 order n=3 55 60 Delete(prt, 5) 55 nonleaf coalescence 55 60 60 leaf coalescence 61 72 55 58 10 20 3 10 20 3 5 10 20 38
Nonleaf coalescence : Delete key 5 order n=3 55 60 Delete(prt, 5) 55 nonleaf coalescence new root 55 60 60 leaf coalescence 61 72 55 58 10 20 3 10 20 3 5 10 20 39
Nonleaf coalescence : Delete key 5 order n=3 55 60 61 72 55 58 3 10 20 40
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); 41
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a leaf 42
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf 43
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf recursion 44
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf half-full condition is satisfied recursion 45
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf half-full condition is satisfied recursion key re-distribution 46
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf half-full condition is satisfied recursion key re-distribution node coalescence 47
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); delete at a nonleaf half-full condition is satisfied recursion key re-distribution node coalescence decide if a new root 48
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); report if pt is less than half-full 49
Pseudo Code for Deletion in a B+tree Delete(pt, (k,p), belowmin); \\ technically, the smallest key kmin in *pt is also returned \\ (k,p) is the data record to be deleted from the subtree rooted at pt; belowmin = true if \\ after deletion, pt has fewer than the required min # of pointers; Case 1.pt is a leaf delete (k,p) in pt; IFpt has at least (n+1)/2 data pointers OR pt is the root THEN return (belowmin = false) ELSE return (belowmin = true); Case 2.pt is not a leaf find a key ki in pt such that ki-1 k < ki; Delete(pi , (k, p), belowmin'); IF (not belowmin') THEN return(belowmin= false); ELSE IF pi has an adjacent sibling p' that has more than the required min # of pointers THEN move one key-pointer pair from p' to pi; ELSE combine pi with an adjacent sibling of pi into a single node; IFpt is the root with only one pointer pi THENpt = pi; return(belowmin= false); IFpt has at least (n+1)/2 pointers OR pt is the root THEN return(belowmin= false) ELSE return(belowmin= true); 50