Logic Query Plan Optimization Techniques for Database Systems

csce 608 database systems n.w
1 / 55
Embed
Share

Explore cost estimation for query plans, query optimization via logic and size, and improving logic plans using commutative and associative operators. Learn about efficient algorithms, optimizing logic laws, and proof techniques for optimizing database query performance.

  • Database Optimization
  • Query Plans
  • Logic Laws
  • Improving Efficiency
  • Database Systems

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. CSCE-608 Database Systems Spring 2025 Instructor: Jianer Chen Office: PETR 428 Phone: 845-4259 Email: chen@cse.tamu.edu Notes 20: Cost Estimation for Query Plans

  2. Query Optimization An input database program P Prepare a collection A of efficient algorithms for operations in relational algebra; parser View processing, Semantic checking parse tree preprocessing parse tree parse tree-lqp convertor logic query plan push selections, group joins apply logic laws logic query plan reduce the size of intermediate results Optimization via logic and size logic query plan Lqp-pqp convertor physical query plan take care of issues in optimization and security. choices of algorithms, data structures, and computational modes Optimization via algorithms and cost Machine executable code

  3. Improving logic plan via logic laws Major Steps: 1. Move selections so they can be applied early ( reduces table size); 2. Combine cross product make natural joins and theta joins ( has more efficient algorithms); 3. group commutative and associative binary operations (e.g., , U, ) (for later opt.); 4. May consider other operations (e.g., , , , ) with selections to C

  4. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , U, , : both commutative and associative (but not and ); D

  5. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T)

  6. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T

  7. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T t3 inT t1 t2in R S

  8. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A A t1 t2in R S

  9. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A A t1 t2in R S

  10. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A A t1 t2in R S t1in R t2in S

  11. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A A t1 t2in R S B A1 B t1in R A2 t2in S B

  12. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A A t1 t2in R S B t3 inT A1 B A t1in R t2in S A2 A2 t2in S B B

  13. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT A t2 t3 inS T A t1 t2in R S A2 A1 B B t3 inT A1 B A t1in R t2in S A2 A2 t2in S B B

  14. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t = (t1 t2) t3 in (R S) T A t3 inT B A1 A t1in R t2 t3 inS T A t1 t2in R S A2 A1 B B t3 inT A1 B A t1in R t2in S A2 A2 t2in S B B

  15. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t1 (t2 t3) = t in R (S T) t = (t1 t2) t3 in (R S) T A A A t3 inT B A1 A t1in R t2 t3 inS T A t1 t2in R S A2 A1 B B t3 inT A1 B A t1in R t2in S A2 A2 t2in S B B

  16. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Proof for : (R S) T = R (S T) t1 (t2 t3) = t in R (S T) t = (t1 t2) t3 in (R S) T A A A t3 inT B A1 A t1in R t2 t3 inS T A t1 t2in R S A2 A1 B B t3 inT A1 B A t1in R t2in S A2 A2 t2in S B B to prove the other direction

  17. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Group each commutative and associative binary operator into a group .

  18. Improving logic plan via logic laws Commutative operator: R * S = S * R Associative operator: (R * S) * T = R * (S * T) , , , : both commutative and associative (but not Dand ); Group each commutative and associative binary operator into a group . D C E B C D E A A B

  19. Improving logic plan via logic laws Major Steps: 1. Move selections so they can be applied early ( reduces table size); 2. Combine cross product make natural joins and theta joins C ( has more efficient algorithms); 3. group commutative and associative binary operations (e.g., , , ) (for later opt.); 4. May consider other operations (e.g., , , , ) with selections to

  20. Query Optimization An input database program P Prepare a collection C of efficient algorithms for operations in relational algebra; parser View processing, Semantic checking parse tree preprocessing parse tree parse tree-lqp convertor logic query plan push selections, group joins apply logic laws logic query plan reduce the size of intermediate results Optimization via logic and size logic query plan Lqp-pqp convertor physical query plan take care of issues in optimization and security. choices of algorithms, data structures, and computational modes Optimization via algorithms and cost Machine executable code

  21. Improving logic plan via logic laws Limits for applying pure logic laws S S R R Which one should be used?

  22. Improving logic plan via logic laws Limits for applying pure logic laws could be small if R and S have few matching tuples could be large if R and S have many matching tuples S S R R Which one should be used?

  23. Improving logic plan via logic laws Limits for applying pure logic laws could be significantly smaller if there are many duplicates could be small if R and S have few matching tuples could be large if R and S have many matching tuples S S R R could be insignificant if there are few duplicates Which one should be used?

  24. Query Optimization An input database program P Prepare a collection C of efficient algorithms for operations in relational algebra; parser View processing, Semantic checking parse tree preprocessing parse tree parse tree-lqp convertor logic query plan push selections, group joins apply logic laws logic query plan reduce the size of intermediate results Optimization via logic and size logic query plan Lqp-pqp convertor physical query plan take care of issues in optimization and security. choices of algorithms, data structures, and computational modes Optimization via algorithms and cost Machine executable code

  25. Query Optimization An input database program P Prepare a collection C of efficient algorithms for operations in relational algebra; parser View processing, Semantic checking parse tree preprocessing parse tree parse tree-lqp convertor logic query plan push selections, group joins apply logic laws logic query plan reduce the size of intermediate results Optimization via logic and size logic query plan Lqp-pqp convertor physical query plan take care of issues in optimization and security. choices of algorithms, data structures, and computational modes Optimization via algorithms and cost Machine executable code

  26. Improving logic plan via relation size Major Steps:

  27. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations:

  28. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A)

  29. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators;

  30. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators; 3. Using logic laws to convert a logic query into the one that minimizes the (estimated) sizes of intermediate relations.

  31. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators; 3. Using logic laws to convert a logic query into the one that minimizes the (estimated) sizes of intermediate relations. 150 500 1500 R S S R 2000 5000 5000 2000

  32. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators; 3. Using logic laws to convert a logic query into the one that minimizes the (estimated) sizes of intermediate relations. 150 500 1500 R S S R 2000 5000 5000 2000

  33. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators; 3. Using logic laws to convert a logic query into the one that minimizes the (estimated) sizes of intermediate relations. 150 500 1500 R S S R 2000 5000 5000 2000

  34. Improving logic plan via relation size Major Steps: 1. Collect size parameters for stored relations: T(R), B(R), V(R,A) (the # of different values on attribute A) 2. Set up estimation rules for size parameters on relational algebraic operators; 3. Using logic laws to convert a logic query into the one that minimizes the (estimated) sizes of intermediate relations. 150 500 1500 R S S R 2000 5000 5000 2000

  35. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation)

  36. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations?

  37. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations? Suppose we know T(R), B(R), V(R,A), T(S), B(S), V(S,D)

  38. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations? Suppose we know T(R), B(R), V(R,A), T(S), B(S), V(S,D) Now we compute R S. How do we known T(R S), B(R S), V(R S,A), V(R S,D)?

  39. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations? Suppose we know T(R), B(R), V(R,A), T(S), B(S), V(S,D) Now we compute R S. How do we known T(R S), B(R S), V(R S,A), V(R S,D)? B(R S) can be computed based on T(R S).

  40. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations? Suppose we know T(R), B(R), V(R,A), T(S), B(S), V(S,D) Now we compute R S. How do we known T(R S), B(R S), V(R S,A), V(R S,D)? B(R S) can be computed based on T(R S). How about V(R S,A) and V(R S,D)?

  41. Estimating size parameters (T,B,V) The values T(R), B(R), V(R,A) for a stored relation R can be obtained by statistic analysis (and recorded with the relation) How do we know the values for intermediate relations? Suppose we know T(R), B(R), V(R,A), T(S), B(S), V(S,D) Now we compute R S. How do we known T(R S), B(R S), V(R S,A), V(R S,D)? B(R S) can be computed based on T(R S). How about V(R S,A) and V(R S,D)? How do we use V(R,A) and V(S,D)?

  42. Estimating size parameters (T,B,V)

  43. Estimating size parameters (T,B,V) , , :

  44. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely

  45. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) }

  46. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) } In average, there are (probably) about two copies of each tuple(?) T( (S)) = T(S)/2 (?)

  47. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) } In average, there are (probably) about two copies of each tuple(?) T( (R)) = T(R)/2 (?)

  48. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) } In average, there are (probably) about two copies of each tuple(?) T( (R)) = T(R)/2 (?) (Probably) every combination of different attribute values make a tuple(?) T( (S)) = AV(R,A) (?) (this number could be larger than T(S)!)

  49. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) } In average, there are (probably) about two copies of each tuple(?) T( (R)) = T(R)/2 (?) (Probably) every combination of different attribute values make a tuple(?) T( (R)) = AV(R,A) (?) (this number could be larger than T(S)!)

  50. Estimating size parameters (T,B,V) , , : size parameters can be calculated precisely : T( (S)) = min{ T(S)/2, AV(R,A) } In average, there are (probably) about two copies of each tuple(?) T( (R)) = T(R)/2 (?) (Probably) every combination of different attribute values make a tuple(?) T( (R)) = AV(R,A) (?) (this number could be larger than T(R)!)

Related


More Related Content