
Query Execution and Optimization in SQL: A Comprehensive Overview
Explore the intricacies of query execution, optimization, and relational operators in SQL. Learn how to query multiple relations effectively and understand the semantics and evaluation of SQL queries. Dive into join queries, group by operations, and eliminate duplicates for efficient SQL data retrieval.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
SQL The Query Language R & G - Chapter 5 Based on Slides from UC Berkeley and book.
2 Query Execution We start from here Declarative Query (SQL) Query Optimization and Execution (Relational) Operators File and Access Methods Buffer Management Disk Space Management
Multi- Relation Queries Multi-relation Queries
4 Querying Multiple Relations SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid AND R.bid = 102 Reserves Sailors sid 1 2 1 bid 102 102 101 day 9/12 9/13 10/01 sid 1 2 3 4 sname Popeye OliveOyl Garfield Bob rating age 10 11 1 5 22 39 27 19
5 Querying Multiple Relations SELECT S.sname FROM Sailors S, Reserves R Cartesian product (1, 101, 10/1) X X X X X (2, 102, 9/13) X X X (1, 102, 9/12) X X X X Popeye OliveOyl Garfield Bob
6 Join Queries SELECT [DISTINCT] <column expression list> FROM <table1 [AS t1], ... , tableN [AS tn]> [WHERE <predicate>] [GROUP BY <column list>] [HAVING <predicate>] [ORDER BY <column list>] ;
7 Query Semantics SELECT [DISTINCT] target-list FROM relation-list WHERE qualification FROM: compute cross product of tables. WHERE: Check conditions, discard tuples that fail. SELECT: Specify desired fields in output. DISTINCT (optional): eliminate duplicate rows. Note: this is likely a terribly inefficient strategy! Query optimizer will find more efficient plans.
Conceptual SQL Evaluation 8 SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification Project away columns (just keep those used in SELECT, GBY, HAVING) Eliminate duplicates SELECT [DISTINCT] Apply selections (eliminate rows) Eliminate groups WHERE HAVING Relation cross-product FROM GROUP BY
Find sailors who have reserved at least one boat 9 SELECT S.sid FROM Sailors AS S, Reserves AS R WHERE S.sid = R.sid Will DISTINCT make a difference here?
10 About Range Variables Needed when ambiguity could arise. e.g., same table used multiple times in FROM( self-join ) SELECT x.sname, x.age, y.sname, y.age FROM Sailors AS x, Sailors AS y WHERE x.age > y.age Sailors sid 1 2 3 4 sname Popeye OliveOyl Garfield Bob rating age 10 11 1 5 22 39 27 19
11 Arithmetic Expressions SELECT S.age, S.age-5 AS age1, 2*S.age AS age2 FROM Sailors AS S WHERE S.sname = 'Popeye' SELECT S1.sname AS name1, S2.sname AS name2 FROM Sailors AS S1, Sailors AS S2 WHERE 2*S1.rating = S2.rating - 1
12 String Comparisons SELECT S.sname FROM Sailors s WHERE S.sname LIKE 'P_p%' '_' stands for any one character and '%' stands for 0 or more arbitrary characters. Most DBMSs now support standard regex as well (incl. PostgreSQL)
Find sidof sailors whove reserved a red or green boat 13 SELECT R.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND (B.color='red' OR B.color='green') ... or: OR SELECT R.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color='red' UNION UNION ALL ALL SELECT R.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color='green'
Find sidof sailors whove reserved a red AND a green boat 14 SELECT R.sid FROM Boats B,Reserves R WHERE R.bid=B.bid AND (B.color='red' AND AND B.color='green')
Find sidof sailors whove reserved a red AND a green boat 15 SELECT R.sid FROM Boats B,Reserves R WHERE R.bid=B.bid AND (B.color='red' AND AND B.color='green')
Find sidof sailors whove reserved a red AND a green boat 16 SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red' INTERSECT INTERSECT SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
Find sidof sailors whove reserved a red AND a green boat 17 Could use a self-join: SELECT R1.sid FROM Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE R1.sid=R2.sid AND R1.bid=B1.bid AND R2.bid=B2.bid AND (B1.color='red' AND B2.color='green')
Find sids of sailors who have not reserved a boat 18 SELECT S.sid FROM Sailors S EXCEPT EXCEPT SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid=R.sid
19 Nested Queries: IN Names of sailors who ve reserved boat #102 SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=102) IN
20 Nested Queries: NOT IN Names of sailors who ve not reserved boat #103 SELECT S.sname FROM Sailors S WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid=103) NOT IN
21 Nested Queries with Correlation Names of sailors who ve reserved boat #102 SELECT S.sname FROM Sailors S WHERE EXISTS EXISTS (SELECT * FROM Reserves R WHERE R.bid=102 AND S.sid=R.sid) Subquery must be recomputed for each Sailors tuple. Think of subquery as a function call that runs a query
More on Set-Comparison Operators We have seen: IN, EXISTS can also have: NOT IN, NOT EXISTS Other forms: <op> ANY, <op> ALL 22 Find sailors whose rating is greater than that of some sailor called Popeye SELECT * FROM Sailors S WHERE S.rating > ANY (SELECT S2.rating FROM Sailors S2 WHERE S2.sname='Popeye') > ANY
A Tougher Query 23 Find sailors who ve reserved ALL boats (relational division: no counterexample boats ) SELECT S.sname FROM Sailors S Sailors S such that WHERE NOT EXISTS (SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.bid there is no boat B that FROM Reserves R WHERE R.bid=B.bid AND R.sid=S.sid)) ... is missing a Reserves tuple showing that S reserved B
A Tougher Query 24 Find sailors who ve reserved ALL boats (here we use set difference: from all the boats remove the ones that Sailor S has reserved. If empty, then S is good) SELECT S.sname FROM Sailors S Sailors S such that WHERE NOT EXISTS ((SELECT B.bid FROM Boats B) EXCEPT (SELECT R.bid FROM Reserves R WHERE R.sid=S.sid)) there is no boat B that that has not been reserved by S
A Tougher Query 25 Find sailors who ve reserved ALL boats ( here we use count aggregates: count the total number of boats and the number of boats reserved by S) SELECT S.sname FROM Sailors S Sailors S such that WHERE (SELECT COUNT(B.bid) FROM Boats B) = (SELECT COUNT (DISTINCT R.bid) FROM Reserves R WHERE R.sid=S.sid) total number of boats equal to the number of distinct boats reserved by S
26 ARGMAX? The Sailor with the highest rating What about ties for highest? SELECT MAX(S.rating) FROM Sailors S; -- OK SELECT S.*, MAX(S.rating) FROM Sailors S; -- Not OK
27 ARGMAX? The Sailor with the highest rating What about ties for highest? SELECT * FROM Sailors S WHERE S.rating >= ALL (SELECT S2.rating FROM SELECT * FROM Sailors S WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2) Sailors S2) SELECT * FROM Sailors S ORDER BY rating DESC LIMIT 1;
28 NULL Values Field values are sometimes unknown or inapplicable SQL provides a special value null for such situations. The presence of null complicates many issues. E.g.: Special syntax IS NULL and IS NOT NULL Assume rating IS NULL. Consider predicate rating>8 . True? False? What about AND, OR and NOT connectives? SUM? We need a 3-valued logic (true, false and unknown). Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don t evaluate to true.) New operators (in particular, outer joins) possible/needed.
29 NULL Values: Truth table p q p OR q p AND q p = q TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE Unknown Unknown TRUE Unknown TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE Unknown Unknown FALSE Unknown FALSE TRUE Unknown Unknown Unknown TRUE Unknown FALSE Unknown Unknown FALSE Unknown Unknown Unknown Unknown Unknown
NULLs bname Downtown Boston Perry Mianus Kenmore bcity assets 9M 1.7M .4M NULL Given: branch2= Horse Horse Boston Aggregate operations: SUM -------- 11.1M returns SELECT SUM(assets) FROM branch2 NULL is ignored Same for AVG, MIN, MAX But.... COUNT(assets) retunrs 4! Let branch3 an empty relation Then: SELECT SUM(assets) FROM branch3 returns NULL but COUNT(<empty rel>) = 0
31 ARGMAX? The Sailor with the highest rating What about ties for highest? SELECT MAX(S.rating) FROM Sailors S; -- OK SELECT S.*, MAX(S.rating) FROM Sailors S; -- Not OK
32 ARGMAX? The Sailor with the highest rating What about ties for highest? SELECT * FROM Sailors S WHERE S.rating >= ALL (SELECT S2.rating FROM SELECT * FROM Sailors S WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2) Sailors S2) SELECT * FROM Sailors S ORDER BY rating DESC LIMIT 1;
Joins 33 SELECT (column_list) FROM table_name [INNER | NATURAL | {LEFT | RIGHT | FULL} | {OUTER}] JOIN table_name ON qualification_list WHERE INNER is default SELECT sname FROM sailors S JOIN reserves R ON S.sid=R.sid; SELECT sname FROM sailors S NATURAL JOIN reserves R WHERE R.bid = 102;
35 Constraints Over Multiple Relations CREATE TABLE Sailors ( sid sname CHAR(10), rating INTEGER, age REAL, PRIMARY KEY (sid), CHECK ( (SELECT COUNT (s.sid) FROM Sailors s) + (SELECT COUNT (b.bid) FROM Boats b) < 100 )) Number of boats plus number of sailors is < 100 INTEGER,
Constraints Over Multiple Relations 36 Number of boats plus number of sailors is < 100 CREATE TABLE Sailors ( sid sname CHAR(10), rating INTEGER, age REAL, PRIMARY KEY (sid), ) INTEGER, Awkward and wrong! Only checks sailors! ASSERTION is the right solution; not associated with either table. Unfortunately, not supported in many DBMS. Triggers are another solution. CREATE ASSERTION smallClub CHECK ( (SELECT COUNT (S.sid) FROM Sailors S) + (SELECT COUNT (B.bid) FROM Boats B) < 100 )
Views Views
38 Views: Named Queries CREATE VIEW view_name AS select_statement Makes development simpler Often used for security Not materialized CREATE VIEW Redcount AS SELECT b.bid, COUNT(*) AS scount FROM Boats2 b, Reserves2 r WHERE r.bid = b.bid AND b.color = 'red' GROUP BY b.bid
Views Instead of Relations in Queries CREATE VIEW Redcount AS SELECT b.bid, COUNT(*) AS scount FROM Boats2 b, Reserves2 r WHERE r.bid = b.bid AND b.color = 'red' GROUP BY b.bid 39 Redcount SELECT bname, scount FROM Redcount r, Boats2 b WHERE r.bid = b.bid AND scount < 10
Views create view vs INTO (2) CREATE VIEW branch2 AS SELECT bname, bcity FROM branch (1) SELECT bname, bcity FROM branch INTO branch2 vs (1) creates new table that gets stored on disk (2) creates virtual table (materialized when needed) Therefore: changes in branch are seen in the view version of branch2 (2) but not for the (1) case.
41 Subqueries in FROM Like a view create on the fly SELECT bname, scount FROM Boads2 b, (SELECT b.bid, COUNT(*) FROM Boats2 b, Reserves2 r WHERE r.bid=b.bid AND b.color='red' GROUP BY b.bid) AS Reds(bid, scount) WHERE Reds.bid=b.bid AND scount < 10
42 Common Table Expressions: WITH Another view creation on the fly syntax WITH Reds(bid, scount) AS (SELECT b.bid, COUNT(*) FROM Boats2 b, Reserves2 r WHERE r.bid=b.bid AND b.color='red' GROUP BY b.bid) SELECT bname, scount FROM Boads2 b, Reds WHERE Reds.bid=b.bid AND scount < 10
Find the rating for which the average age of sailors is the minimum over all ratings : SELECT Temp.rating, Temp.avgage FROM (SELECT S.rating, AVG(S.age) AS avgage, FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avgage = (SELECT MIN(Temp.avgage) FROM Temp)
SQL: Modification Commands View Updates: Suppose we have a view: CREATE VIEW branch-loan AS SELECT bname, lno FROM loan And we insert: INSERT INTO branch-loan VALUES( Perry , L-308) Then, the system will insert a new tuple ( Perry , L-308, NULL) into loan
SQL: Modification Commands What about... CREATE VIEW depos-account AS SELECT cname, bname, balance FROM depositor as d, account as a WHERE d.acct_no = a.acct_no INSERT INTO depos-account VALUES( Smith , Perry , 500) How many relations we need to update? Many systems disallow
46 Discretionary Access Control GRANT privileges ON object TO users [WITH GRANT OPTION] Object can be a Database, Table or a View Privileges can be: Select Insert Delete References (cols) allow to create a foreign key that references the specified column(s) All Can later be REVOKED Users can be single users or groups See R&G Chapter 21 for more details.
Example An example in MySQL: CREATE USER 'jeffrey'@ localhost' IDENTIFIED BY 'mypass'; GRANT ALL ON db1.* TO 'jeffrey'@'localhost'; GRANT SELECT ON db2.invoice TO 'jeffrey'@'localhost'; ALTER USER 'jeffrey'@'localhost' WITH MAX_QUERIES_PER_HOUR 90;
Another one from the book CREATE VIEW ActiveSailors(name, age, day) AS SELECT S.sname, S.age, R.day FROM Sailors S, Reserves R WHERE S.sid = R.sid AND S.rating > 6 GRANT INSERT,DELETE ON Reserves TO Yuppy WITH GRANT OPTION GRANT SELECT ON Reserves TO Michael GRANT UPDATE (rating) ON Sailors to Bill GRANT REFERENCES (bid) ON Boats TO Jim
Embedded SQL Embedded SQL
Embedded SQL SQL is not a general purpose programming language. + Tailored for data retrieval and manipulation + Relatively easy to optimize and parallelize - Can t write entire apps in SQL alone Options: Make the query language turing complete Avoids the impedance mismatch but, loses advantages of relational lang simplicity Allow SQL to be embedded in regular programming languages. The SQL standard defines embeddings of SQL in a variety of programming languages such as Pascal, PL/I, Fortran, C, and Cobol . Java and C++.