Tractable Orders for Direct Access to Ranked Answers of Conjunctive Queries
This work discusses an innovative approach for direct access to ranked answers of conjunctive queries, focusing on lexicographic orders and sum-of-weights orders. It delves into the problem, background, algorithms, and solutions for achieving efficient access to desired answers without materializing all responses. The study emphasizes quasilinear preprocessing and polylogarithmic access time to manage data complexity effectively.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Tractable Orders for Direct Access to Ranked Answers of Conjunctive Queries Nofar Carmeli1, Nikolaos Tziavelis2, Wolfgang Gatterbauer2, Benny Kimelfeld1, Mirek Riedewald2 1Technion, Israel 2Northeastern University, Boston This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 International License. See https://creativecommons.org/licenses/by-nc-sa/4.0/ for details
Motivation Travel Employees Renumeration Name Role Address Period Role Salary Address Cost Jack Junior dev Boston 11/2020 Junior dev 4000 Boston 50 Jill Senior dev Brookline 11/2020 Senior dev 4500 Brookline 100 Joanna Senior dev Braintree 12/2020 Junior dev 7000 Braintree 200 12/2020 Senior dev 7100 ? Name,Role,Address,Period,Salary,Cost Employees Name,Role,Address ,Renumeration Period,Role,Salary ,Travel(Role,Salary) Query Answers Want: Median Boxplot Jump to any rank without materializing all answers Name Role Address Period Salary Cost Jack Junior dev Boston 11/2020 4000 50 Jill Senior dev Brookline 11/2020 4500 100 sort by salary+cost Joanna Senior dev Braintree 11/2020 4500 200 4th result Jack Junior dev Boston 12/2020 7000 50 2
Outline Direct access: Problem & Background Lexicographic orders Sum-of-weights orders Selection Problem Conclusion 3
Direct Access Problem Ranked Also called: random access, ?th answer Problem: query Input: database instance of size ? Algorithm: Preprocessing Access: given ?, return answer ? in the list of answers (or out-of-bound) & ordering sorted Our focus: quasilinear preprocessing, polylog access time <?polylog?,polylog ?> (data complexity) 4
Answer Orderings Lexicographic Ordering of free variables e.g. [Address, Salary, Cost, Role, Name, Period] or just [Address, Salary] (partial lex. order) Sum-of-weights Weights to domain values of free variables Ranking by the sum of weights Can simulate any lexicographic order Salary w 4000 1 4500 2 Name Role Address Period Salary Cost w 7000 3 Jack Junior dev Boston 11/2020 4000 50 11 Equivalent to [Address, Salary] Jack Junior dev Boston 12/2020 7000 50 13 Address w Joanna Senior dev Braintree 11/2020 4500 200 22 Boston 10 Jill Senior dev Brookline 11/2020 4500 100 32 Braintree 20 Brookline 30 5
Related work (Unranked) Enumeration [BaganDurandGrandjean CSL 07][Brault-Baron PhD Thesis 13] const (or polylog) delay possible * free-connex Ranked enumeration[TAjwaniGatterbauerRiedewaldYang PVLDB 20] sum of weights (or lexicographic), log delay, free-connex Direct access (restricted order support) via elimination order [Brault-Baron PhD Thesis 13] via join tree [CZeeviBerkholzKimelfeldSchweikardt PODS 20] via q-tree (dynamic settings, q-hierarchical only) [Keppeler PhD Thesis 20] All using: data complexity, RAM model 6
Definitions An acyclic CQ has a graph with: A free-connex CQ also requires: 2. tree 3. for every variable X: the nodes containing X are connected 4. a subtree with exactly the free variables 1. a node for every atom possibly also subsets ? ?,? ? ?,?,? ? ?,? ,? ?,? ,?(?,?) ? ? ?,? ?,? 7
Overview of Direct Access CQs Acyclic <?polylog?,polylog ?> Tractable Free-connex LEX, SUM intractable LEX tractable, SUM intractable Not L-connex or disruptive trio L-connex and no disruptive trio Both tractable Free atom 8 * Lower bounds assume: no self-joins, conventional hypotheses in fine-grained complexity
Outline Direct access: Problem & Background Lexicographic orders Sum-of-weights orders Selection Problem Conclusion 9
Dichotomy ?1(?1,?2,?3) ? ?1,?2,?(?2,?3) ?2(?1,?2,?3) ? ?1,?3,?(?3,?2) Given: CQ ?, ordering ? of free(?), lexicographic access in <?polylog?,polylog ?> * free-connex, no disruptive trio Disruptive Trio ?2 ?1 x share an atom * Lower bounds assume: (1) no self-joins (2) hardness of matrix multiplication and hyperclique detection ?3 last out of the three 10
Partial Lexicographical Ordering possible a completion for a feasible full ordering a subset of Given: CQ ?, ordering ? of free(?), partial lexicographic access in <?polylog?,polylog ?> * free-connex, no disruptive trio L-connex, L-connex ?,? L * Lower bounds assume: (1) no self-joins (2) hardness of matrix multiplication and hyperclique detection ?,? ?,? free 11
Hardness Assumption: ?1 cannot be enumerated in Reduction: ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? Enumerate ?1?1,?2 ? ?1,?3,?(?3,?2) binary search for next different ??, ?? values using Lexicographic direct-access ?2?1,?2,?3 ? ?1,?3,?(?3,?2) Log number of direct-access calls between answers ?1 enumeration with polylog delay, contradiction 12
Algorithm Adopt previous approach [C Zeevi Berkholz Kimelfeld Schweikardt PODS 20] Free-connex to full acyclic, then use a join-tree Preprocessing: DP up the tree computes how many answers in a subtree use each tuple Access: recurse down the tree splits the desired index between the children Access 6 ?? a1 w 8 6 = 1 4 + 2 a2 4 Access 2 ?? b1 w 3 Access 1 b2 1 ?? a1 ?? c1 w ? ?1 1 ?? b1 ?? d1 w a1 c2 1 ? ? ?1, ?3 ?2 1 a2 c2 1 b1 d2 1 b1 d3 1 ? ?2, ?4 b2 d4 1 13
Algorithm Adopt previous approach [C Zeevi Berkholz Kimelfeld Schweikardt PODS 20] Free-connex to full acyclic, then use a join-tree Preprocessing: DP up the tree computes how many answers in a subtree use each tuple Access: recurse down the tree splits the desired index between the children Access 6 ?? a1 w 8 a2 4 Access 2 ?? b1 w ?? a1 ?? c1 ?? b1 ?? d1 Resulting order: 3 Access 1 b2 1 a1 c1 b1 d2 ?? a1 ?? c1 w a1 c1 b1 d3 1 a1 c1 b2 d4 ?? b1 ?? d1 w a1 c2 1 a1 c2 b1 d1 1 a2 c2 1 a1 c2 b1 d2 b1 d2 1 a1 c2 b1 d3 b1 d3 1 a1 c2 b2 d4 b2 d4 1 14
Algorithm Adopt previous approach [C Zeevi Berkholz Kimelfeld Schweikardt PODS 20] Free-connex to full acyclic, then use a join-tree Preprocessing: DP up the tree computes how many answers in a subtree use each tuple Access: recurse down the tree splits the desired index between the children ?? a1 w 8 a2 4 ?? b1 w ? ?1 3 1 b2 1 no disruptive trio layered join-tree ?? a1 ?? c1 w ? ? ?1, ?3 ?2 2 1 3 ?? b1 ?? d1 w a1 c2 1 1 a2 c2 1 Problems Tree determines order Independent branches Solutions Specialized trees Modified access ? ?2, ?4 4 b1 d2 1 1. 2. 1. 2. b1 d3 1 b2 d4 1 15
Outline Direct access: Problem & Background Lexicographic orders Sum-of-weights orders Selection Problem Conclusion 16
Dichotomy Given: CQ ?, sum-of-weights access in <?polylog?,polylog ?> * acyclic, an atom contains all free variables ?1(?,?) ? ?,?,? ,?(?,?) * Lower bounds assume: (1) no self-joins (2) hardness of 3-SUM and hyperclique detection ?2(?,?) ?(?,?),?(?,?) 17
Hardness Observation: Binary search finds a weight with logarithmic accesses 3SUM hypothesis given 3 sets of integers ? = ? = ? = ?, deciding ? ?,? ?,? ? s.t. ? + ? + ? = 0 cannot be done in time ? ?2 ? for any ? > 0 Use two independent free variables ? ? ? ? ?2(?,?) ?(?,?),?(?,?) Binary search for ? ( ?) ?1 0 ?1 0 ?2 0 ?2 0 ?1 ?2 ?1 ?2 ?1+ ?1 ?1+ ?2 ?2+ ?1 ?2+ ?2 Direct access impossible in <?2 ?,?1 ?> ? ? ? ? ?1 ?2 0 0 ?1 ?2 0 0 ? ? 18
Outline Direct access: Problem & Background Direct Access Lexicographic orders CQs Acyclic Sum-of-weights orders Selection Problem Free-connex Not L-connex or disruptive trio L-connex and no disruptive trio Conclusion Free atom 19
Selection vs Direct Access Selection Direct Access Data Structure Preprocessing ?(? polylog ?) No Preprocessing Answers ?(? polylog ?) ?(polylog ?) <? polylog ?,polylog ?> <1,n polylog ?> 20
Selection Dichotomy Given: full CQ ?, sum-of-weights selection in O(?????) * At most two maximal atoms w.r.t. hyperedge containment ?1?,?,? ? ?,? ,? ?,? ,?(?) ?2?,?,?,? ? ?,? ,? ?,? ,?(?,?) * Lower bounds assume: (1) no self-joins (2) hardness of 3-SUM and hyperclique detection 21
Tractable Cases for Selection ? ?,?,? ? ?,? ,? ?,? ? ? ? = ? ? = ? ? ? ? ? ?1 ?2 ?3 ?1 ?2 ?3 ?4 ?1 ?2 ?3 a d d f Sort ?,? on weights => Every row/column sorted ?1 ?2 ?3 ?4 b d e f a e e g c e Selection on a union of sorted matrices of dimensions ?? ?? possible in time O( max(??,??)) [Frederickson Johnson 1984] 22 * We do not materialize the matrices, but compute the values of the cells on-the-fly
Tractable Cases for Selection (Lexicographic) ? ?,?,? ? ?,? ,? ?,? Lex Order [?,?,?]: Direct access intractable (disruptive trio) Selection tractable (as sum-of-weights) 23
Intractable Cases for Selection 3-path ?3? (reduction from Boolean triangle ? ) ? () ? ?,? ,? ?,? ,?(?,?) ? ? ? ? ?3?(?,?,?) ? ?,? ,? ?,? ,?(?,?) Sum-of-weights Selection ? ? + ? ? = 0 ? = ? Identify answers to ? with ? = ? Set weights s.t. weight of triangle answers becomes 0 Binary search to find zero weight answer 24
Outline Direct access: Problem & Background Lexicographic orders Sum-of-weights orders Selection Problem Conclusion 25
Overview and Conclusion Selection <1,?polylog? > Direct Access <?polylog?,polylog ?> Tractable Tractable CQs CQs Acyclic Acyclic Maximal hyperedges 2 Explored Free-connex Free-connex Both intractable LEX tractable, SUM intractable Not L-connex or disruptive trio L-connex and no disruptive trio Not L-connex or disruptive trio L-connex and no disruptive trio Both tractable Unexplored SUM intractable Free atom Free atom Both unexplored Full LEX tractable * Upper bounds for direct access are <sorting-cost,log ?> 26 * Lower bounds assume: no-self joins, hypotheses in fine-grained complexity
Thank you! 27