Conjunctive Queries
Massive parallelism is crucial for handling large data sets. This work examines theoretical models of massively parallel computation, such as the BSP and LogP models, and proposes a model with strict bounds on communication and data skew, taking the number of synchronization steps as the measure of parallel complexity.
Parallel Evaluation of Conjunctive Queries Paraschos Koutris and Dan Suciu University of Washington PODS 2011, Athens
Motivation
Massive parallelism is necessary nowadays for handling huge amounts of data. Parallelism has been popularized in various forms:
- The MapReduce architecture
- Languages on top of MapReduce: Pig Latin, Hive
- Systems for data analytics: Dremel, SCOPE
What is a good theoretical model to capture computation in such massively parallel systems?
Today's Parallel Models
Classic models for parallelism: circuit complexity, PRAM (Parallel Random Access Machines); the BSP (Bulk-Synchronous Parallel) model [Valiant, 90]; the LogP model [Culler et al., 93].
The main bottlenecks: Communication + Synchronization + Data Skew

                                        Communication                Synchronization   Data Skew
[Afrati and Ullman, EDBT 10]            minimize; memory O(n^ε), ε<1   1 step          n/a
[Karloff et al., SODA 10]               implicit restriction           minimize        n/a
[Hellerstein, SIGMOD 10]
  (Coordination Complexity)             n/a                            minimize        n/a
Our approach                            O(n)                           minimize        load balancing
Our Approach
Strict bounds on communication and data skew; minimize synchronization.
Parallel complexity = number of synchronization steps.
Example: algorithms A and B process the same amount of data, but B needs fewer synchronization steps, so algorithm B is more efficient than algorithm A.
The Massively Parallel Model
A universe U, a relational schema, and a database instance D.
P servers: each relation R is partitioned into fragments R1, R2, …, RP.
A value a from U is generic; we can only:
- Copy a
- Test for equality: is a = b?
- Feed it to a hash function: h(a), h(a, b); hash functions can be chosen randomly at the beginning.
Computation proceeds in parallel steps, each with 3 phases:
- Broadcast phase: the P servers exchange some data B globally, shared among all servers. We require size(B) = O(n^ε), ε < 1.
- Communication phase: each server sends data to other servers.
- Computation phase: local computation.
An algorithm for a query Q is load balanced if the expected maximum load per server is O(n / P), where n = size of input + output data.
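The communication phase of the model can be simulated in a few lines of Python. This is an illustrative sketch, not the paper's implementation: servers are modeled as lists in a dictionary, and `hashlib` stands in for the randomly chosen hash functions.

```python
import hashlib
from collections import defaultdict

P = 8  # number of servers

def h(*values):
    # A hash function over generic values; in the MP model only copying,
    # equality tests, and hashing are allowed on universe values.
    key = "|".join(map(str, values))
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % P

def partition(relation, key_of):
    # Communication phase: route each tuple to server h(key).
    servers = defaultdict(list)
    for t in relation:
        servers[h(*key_of(t))].append(t)
    return servers

R = [(i,) for i in range(1000)]
fragments = partition(R, key_of=lambda t: t)
loads = [len(frag) for frag in fragments.values()]
# With 1000 distinct keys, each server's load should be close to n/P = 125.
```

With distinct keys the hash spreads tuples roughly evenly; the interesting case, treated in the following slides, is when keys repeat (data skew).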
Datalog Notation for MP
R(@s,x,y): the fragment of relation R stored at server s.
Broadcasting to all servers: R(@*,x) :- S(@s,x), T(@s,x)
Point-to-point communication using a hash function h: R(@h(x,y),x,y,z) :- S(@s,x,y,z), T(@s,x)
Local computation at server s: R(@s,x,y) :- S(@s,x,y), T(@s,x)
Intersection Q(x) :- R(x), S(x):
Communication phase: R2(@h(x),x) :- R(@s,x) and S2(@h(x),x) :- S(@s,x)
Computation phase: Q(@s,x) :- R2(@s,x), S2(@s,x)
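The one-step intersection algorithm can be sketched as follows (an illustrative simulation; the server array and hash function are stand-ins): both relations are hashed on x in the communication phase, and each server intersects its local fragments in the computation phase.

```python
import hashlib
from collections import defaultdict

P = 4  # number of servers

def h(x):
    return int(hashlib.md5(str(x).encode()).hexdigest(), 16) % P

def intersect_one_step(R, S):
    # Communication phase: R2(@h(x),x) :- R(@s,x); S2(@h(x),x) :- S(@s,x)
    R2, S2 = defaultdict(set), defaultdict(set)
    for x in R:
        R2[h(x)].add(x)
    for x in S:
        S2[h(x)].add(x)
    # Computation phase: each server intersects its own fragments.
    result = set()
    for s in range(P):
        result |= R2[s] & S2[s]
    return result

out = intersect_one_step({1, 2, 3, 4, 5}, {3, 4, 5, 6})
# out == {3, 4, 5}
```

Because each x-value appears at most once per relation, the intersection is trivially load balanced; the semijoin in the next slides is where skew first bites.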
The Main Result
We study relational queries which are:
- Conjunctive: a conjunction of atoms
- Full: every variable appears in the head of the query
Question: which full conjunctive queries can be answered by a load-balanced algorithm in one MP step?
Main Theorem: Every tall-flat conjunctive query can be evaluated in one MP step by a load-balanced algorithm. Conversely, if a query is not tall-flat, then no algorithm consisting of one MP step can be load balanced.
Tall-Flat Queries
Tall queries: Q(x,y,z) :- R(x), S(x,y), T(x,y,z)
Flat queries: Q(x,y,z,w) :- R(x,y), S(x,z), T(x,w)
Combine them to get the tall-flat queries:
L(x1,x2,x3,x4,y1,y2,y3) :- R1(x1), R2(x1,x2), R3(x1,x2,x3), R4(x1,x2,x3,x4),   [tall part]
                           S1(x1,x2,x3,x4,y1), S2(x1,x2,x3,x4,y2), S3(x1,x2,x3,x4,y3)   [flat part]
Outline
- Algorithms for the semijoin, flat queries, and tall queries
- Combining them for tall-flat queries
- Impossible queries
Semijoin: a naïve approach
Semijoin operator Q(x,y) :- R(x), S(x,y)
Communication phase: send tuples S(a,b) and R(a) to server h(a).
Computation phase: locally perform the semijoin.
(Figure: the tuples S(0,a), S(0,c), S(0,d), S(0,e), S(0,w) all hash to the same server h(0), while S(1,c), S(2,a), S(2,b), S(3,d), S(3,f), S(4,a), S(5,a) spread across the rest.)
Load balanced? No: a value with high frequency in S sends all of its tuples to a single server.
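The failure of the naïve approach is easy to demonstrate (an illustrative sketch with a simulated hash function): when one x-value dominates S, hashing on x piles the whole relation onto one server.

```python
import hashlib
from collections import Counter

P = 8  # number of servers

def h(x):
    return int(hashlib.sha1(str(x).encode()).hexdigest(), 16) % P

# A skewed S: the value 0 appears in every tuple, so h(0) receives all of S.
S = [(0, y) for y in range(1000)]
loads = Counter(h(x) for (x, y) in S)
max_load = max(loads.values())
# max_load == 1000 == n: one server holds everything,
# far above the load-balanced target of n/P = 125.
```

This is exactly the skew the broadcast phase on the next slide is designed to detect.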
A better approach to the semijoin
Broadcast phase: compute the frequent values: set F = frequent(S).
Communication phase:
R2(@h(x),x) :- R(@s,x), not F(@s,x)
S2(@h(x),x,y) :- S(@s,x,y), not F(@s,x)
R2(@*,x) :- R(@s,x), F(@s,x)
S2(@h2(x,y),x,y) :- S(@s,x,y), F(@s,x)
Computation phase:
Q(@s,x,y) :- R2(@s,x), S2(@s,x,y)
This is the same approach as SkewJoin in Pig Latin.
Computing frequent elements: given a relation R(x, ), find the values of x with frequency above a threshold, via sampling or local counting.
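The skew-aware semijoin can be sketched as below. This is an illustrative simulation: the frequency threshold |S|/P, the hash functions, and the exact counting of frequent values (rather than sampling) are all assumptions made for the sketch.

```python
import hashlib
from collections import Counter, defaultdict

P = 8  # number of servers

def h(*vals):
    key = "|".join(map(str, vals))
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % P

def semijoin_skew_aware(R, S):
    # Broadcast phase: find the x-values whose frequency in S exceeds |S|/P.
    freq = Counter(x for (x, y) in S)
    F = {x for x, c in freq.items() if c > len(S) / P}

    # Communication phase.
    R2 = defaultdict(set)    # R fragments placed at each server
    S2 = defaultdict(list)   # S fragments placed at each server
    for x in R:
        if x in F:
            for s in range(P):           # R2(@*,x): broadcast frequent x
                R2[s].add(x)
        else:
            R2[h(x)].add(x)              # R2(@h(x),x)
    for (x, y) in S:
        if x in F:
            S2[h(x, y)].append((x, y))   # S2(@h2(x,y),x,y): spread skewed x
        else:
            S2[h(x)].append((x, y))      # S2(@h(x),x,y)

    # Computation phase: Q(@s,x,y) :- R2(@s,x), S2(@s,x,y)
    return {(x, y) for s in range(P) for (x, y) in S2[s] if x in R2[s]}

Q = semijoin_skew_aware({0, 1}, [(0, y) for y in range(100)] + [(2, 0)])
# The 100 skewed (0, y) tuples are spread over h2(0, y) instead of piling on h(0).
```

Broadcasting the (few) frequent R-values costs O(P) copies each, while the many matching S-tuples spread evenly, which is the trade that restores load balance.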
The Broadcast Phase
Do we really need a broadcast phase before distributing the data to the servers?
Theorem: Any algorithm computing a semijoin in 1 MP step without a broadcast phase is not load balanced.
The purpose of the broadcast phase is to extract information on the data distribution (e.g. identify the frequent values).
Full Join
Join Q(x,y,z) :- R(x,y), S(y,z)
Communication phase:
Case 1: frequent(R)
  HR(@h(x,y),x,y) :- R(@s,x,y), RF(y)
  DS(@*,y,z) :- S(@s,y,z), RF(y)
Case 2: frequent(S), not frequent(R)
  HS(@h2(y,z),y,z) :- S(@s,y,z), SF(y), not RF(y)
  DR(@*,x,y) :- R(@s,x,y), SF(y), not RF(y)
Case 3: not frequent(R), not frequent(S)
  TR(@h3(y),x,y) :- R(@s,x,y), not RF(y), not SF(y)
  TS(@h3(y),y,z) :- S(@s,y,z), not RF(y), not SF(y)
Computation phase:
J1(@s,x,y,z) :- HR(@s,x,y), DS(y,z)
J2(@s,x,y,z) :- DR(x,y), HS(@s,y,z)
J3(@s,x,y,z) :- TR(@s,x,y), TS(@s,y,z)
Q(@s,x,y,z) :- J1(@s,x,y,z); J2(@s,x,y,z); J3(@s,x,y,z)
Similar idea to [Xu et al., SIGMOD 08].
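The three-case routing can be sketched as a simulation (illustrative only: the frequency threshold and hash functions are assumptions, and frequent values are counted exactly rather than estimated). The hashed side of a skewed y-value is spread on extra attributes, while the other side is broadcast, so every matching pair still meets at exactly one server.

```python
import hashlib
from collections import Counter, defaultdict

P = 8  # number of servers

def h(*vals):
    key = "|".join(map(str, vals))
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % P

def skew_join(R, S):
    """Q(x,y,z) :- R(x,y), S(y,z), with skew-aware routing on y."""
    RF = {y for y, c in Counter(y for (_, y) in R).items() if c > len(R) / P}
    SF = {y for y, c in Counter(y for (y, _) in S).items() if c > len(S) / P}

    at_R, at_S = defaultdict(list), defaultdict(list)
    for (x, y) in R:
        if y in RF:                       # Case 1: HR, hash on (x, y)
            at_R[h(x, y)].append((x, y))
        elif y in SF:                     # Case 2: DR, broadcast
            for s in range(P):
                at_R[s].append((x, y))
        else:                             # Case 3: TR, hash on y
            at_R[h(y)].append((x, y))
    for (y, z) in S:
        if y in RF:                       # Case 1: DS, broadcast
            for s in range(P):
                at_S[s].append((y, z))
        elif y in SF:                     # Case 2: HS, hash on (y, z)
            at_S[h(y, z)].append((y, z))
        else:                             # Case 3: TS, hash on y
            at_S[h(y)].append((y, z))

    # Computation phase: a local hash join at each server.
    out = set()
    for s in range(P):
        by_y = defaultdict(list)
        for (y, z) in at_S[s]:
            by_y[y].append(z)
        for (x, y) in at_R[s]:
            for z in by_y[y]:
                out.add((x, y, z))
    return out
```

Checking the cases: if y is frequent in R, each R-tuple sits at one server h(x,y) and meets the broadcast S-tuples there; symmetrically for y frequent only in S; and non-frequent y-values join at h(y), so no pair is produced twice.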
Flat Queries
How can we extend the above ideas to compute flat queries, e.g. Q(x,y,z,w) :- R(x,y), S(x,z), T(x,w)?
We introduce a second step in the broadcast phase to find the frequent values that definitely appear in the final result.
Why is a single step not enough? Suppose a is frequent in R and S but does not exist in T: then the cost of replicating a-tuples would not be justified by the output size.
The idea generalizes to any flat query, with only 2 broadcast steps.
Tall Queries
Compute a tall query Q(x,y,z) :- R(x), S(x,y), T(x,y,z).
Construct a decision tree that decides, for each tuple, whether it is hashed (and on which variables) or broadcast.
Example: a tuple t = S(a,b) is routed by testing membership of x in frequent(S), x in frequent(T), and (x,y) in frequent(T); depending on the outcomes, t is sent to @h(x), @h(x,y), @h(x,y,z), or broadcast. (Figure: decision tree routing t = S(a,b) to h(a,b).)
The Main Algorithm
Reminder: a tall-flat query consists of a tall and a flat part.
- Tall-query techniques (the decision tree) handle the tall part.
- Flat-query techniques handle the flat part.
We can thus design an algorithm which computes any tall-flat query in 1 MP step (with a 2-step broadcast phase).
Main Theorem (Part 1): Every tall-flat conjunctive query can be evaluated in one MP step by a load-balanced algorithm.
Impossibility Theorems
Lemma 1: The query RST(x,y) :- R(x), S(x,y), T(y) cannot be computed in 1 MP step by a load-balanced algorithm.
Lemma 2: The query J(x,y) :- R(x), S(x), T(y) cannot be computed in 1 MP step by a load-balanced algorithm.
Main Theorem (Part 2): No query that is not tall-flat can be computed in 1 MP step by a load-balanced algorithm.
Open Questions
- How can we leverage data statistics (e.g. relation sizes, value distributions) to design better MP algorithms?
- What is the minimum number of parallel steps for any query?
- What is the parallel complexity of other classes of queries (e.g. with union or projections)?
- At what point does it become more expensive in practice to use a broadcast phase instead of 2 steps?