Solving RMQ and LCA Problems in Data Structures and Algorithms

two equivalent problems n.w
1 / 17
Embed
Share

Explore the equivalence and practical applications of solving Range Minimum Query (RMQ) and Lowest Common Ancestor (LCA) problems on integer arrays, Cartesian Trees, Euler tours, suffix trees, suffix arrays, LCP arrays, and more. Learn about efficient solutions, data structures, and query complexities for these fundamental problems in computer science.

  • RMQ
  • LCA
  • Data Structures
  • Algorithms
  • Tree Structures

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Two equivalent problems RMQ on integer arrays RMQ on 1 integer arrays LCA on a Cartesian Tree Euler tour of the Cartesian tree RMQ on the array of node-levels LCA on general trees RMQ on 1 integer arrays Euler tour of the Cartesian tree RMQ on the array of node-levels

  2. Search for k-mismatches T CCGTACGATCAGTACAGTACAGTACTTTTTTAAACCGGAGACTACA P If O(1) O(k) time CCGAACTATC Problem: Find longest match between P[i, ] and T[j, ] Data Structure 1. Concatenate P and T into a string X = T$P Construct a data structure on X that retrieves FAST the longest match between any pair of suffixes of X 2. LCA or LCP query

  3. Suffix Tree & LCA LCA 0 # s i 1 si 12 mississippi# 1 p i 3 ssi 2 ppi# ppi# 1 $ 4 $ 13 ppi# # 6 3 i# ppi# pi# 7 4 11 14 8 5 2 1 10 9 Longest match(3,13) T#P = mississippi#si$ 1 2 4 6 8 10 13

  4. Suffix Array & LCP RMQ SA 12 11 14 8 5 2 1 10 9 13 7 4 6 3 Longest match(3,13) Lcp 0 1 1 1 4 0 0 1 0 2 2 1 3 LCP T#P = mississippi#si 1 2 4 6 8 10 13 si sippi# sissippi# ssippi# ssissippi# RMQ LCP Surprisingly, also LCA RMQ

  5. The RMQ problem RMQA(i,j) returns the index of the smallest element in the subarray A[i..j]. A[3] A[4] A[5] A[6] A[7] A[8] A[0] A[1] A[2] A[9] 10 2 12 1 12 13 21 15 14 10 RMQ(2,7) = 3 Trivial solution: Precompute RMQ for every pair of indices. This takes (n2) space, and O(1) query time

  6. RMQ on a general array A[3] A[4] A[5] A[6] A[7] A[8] A[0] A[1] A[2] A[9] 10 25 22 34 7 19 9 12 26 16 Cartesian Tree 4 0 6 2 5 7 1 3 9 8

  7. RMQ(i,j) = LCA(i,j) on Cartesian trees A[3] A[4] A[5] A[6] A[7] A[8] A[0] A[1] A[2] A[9] 10 25 22 34 7 19 9 12 26 16 4 0 6 2 5 7 1 3 9 8

  8. RMQ 1 Generic RMQ LCA LCA(u,v) = shallowest node between u and v during a depth first search traversal of T. Node at the lowest level 0 1 array 3 Node Level 12 1 2 1 2 3 2 3 2 1 2 3 2 1 8 3 9 1 3 2 3 1 4 1 7 1 3 5 6 5 3 5 Euler tour 11 1 2 4 7 10 6 5 2 4 6 7 LCAT(4,6) = 3

  9. We are left with RMQ on 1 array RMQA(i,j) returns the index of the smallest element in the subarray A[i..j]. A[3] A[4] A[5] A[6] A[7] A[8] A[0] A[1] A[2] A[9] 10 11 12 11 12 13 14 15 14 13 RMQ(2,7) = 3 Recall the trivial solution: Precompute RMQ for every pair of indices. This takes (n2) space, and O(1) query time

  10. Sparse Table Preprocess sub arrays of len 2k, for every k=0,1, , log n M(i,j) = index of min value in A[i, i+ 2j -1] A[3] A[4] A[5] A[6] A[7] A[8] A[0] A[1] A[2] A[9] 10 11 12 11 12 13 14 15 14 13 M(2,0)=2 Total space is O(n log n) RMQ query ? M(2,1)=3 M(2,2)=3 M(2,3)=3

  11. Querying the Sparse Table j i 2kelements a1 ... ... 2kelements Total space is O(n log n) RMQ query takes O(1) time = j i log( ) k

  12. Bucketing A [i] = min in the i-th block of A. B [i] is the position (index) of that min. A A [0] A [i] A [2n/logn] A ... ... ... ... ... log n 2 2n blocks ... B[i] log n B[0] B[2n/logn]

  13. Use the Bucketing A [0] A [i] A [2n/logn] A ... ... ... log 2 ... ... n Preprocess A for RMQ using SparseTable Space is (2n/log n) * log (2n/log n) = O(n) RMQ queries on A take O(1) time Preprocess every block of A for border RMQ Space is O(n), border RMQ take O(1) time. RMQ(i,j) takes O(1) time, if i,j lie in distinct blocks

  14. In-block RMQ over 1 arrays , i j RMQ i j = ( , ) ( , ) RMQ i j X Y X 3 4 5 6 5 4 5 6 5 4 Y 0 1 2 3 2 1 2 3 2 1 X = Y +1 +1 +1 -1 -1 +1 +1 -1 -1 = log n 1 2 2 ( ) O n There are normalized blocks ( ) O n Set Table[BlockEnc, i, j] = RMQ(i,j) ( ) = 2 log ( ) O n n O n Table entries

  15. LZ-parsing (gzip) 0 # s i 1 si 12 mississippi# 1 p i 3 ssi 2 ppi# ppi# 1 4 ppi# # 6 3 i# ppi# pi# 7 4 11 8 5 2 1 10 9 <m><i><s><si><ssip><pi> T = mississippi# 1 2 4 6 8 10

  16. LZ-parsing (gzip) It is on the path to 6 0 # s By maximality check only nodes i 1 12 mississippi# 1 p Leftmost occ = 3 < 6 si i 3 ssi 2 ppi# ppi# 1 4 Leftmost occ = 3 < 6 ppi# # 6 3 i# ppi# pi# 7 4 11 8 5 2 1 10 9 <ssip> 1. Longest repeated prefix of T[6,...] 2. Repeat is on the left of 6 T = mississippi# 1 2 4 6 8 10

  17. LZ-parsing (gzip) min-leaf Leftmost copy 0 # s 3 i 1 si 12 2 mississippi# 1 p Parsing: 1.Scan T 3 i 3 ssi 4 9 2.Visit ST and stop when 2 2 ppi# ppi# 1 4 min-leaf current pos ppi# # 6 3 i# ppi# pi# 7 4 11 8 5 2 1 10 9 <m><i><s><si><ssip><pi> T = mississippi# 1 2 4 6 8 10 Precompute the min descending leaf at every node in O(n) time.

More Related Content