Relational Algebra and Database Query Operations

cop4710 database systems n.w
1 / 26
Embed
Share

Explore the fundamentals of relational algebra, database querying, and common operations, such as set operations and relational operators. Learn how to manipulate data using high-level query languages like SQL and understand the importance of relational algebra in database systems.

  • Database
  • Relational Algebra
  • Query Operations
  • SQL
  • Set Operations

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. COP4710 Database Systems Relational Algebra Tallahassee, Florida Tallahassee, Florida

  2. Why Do We Learn This? Querying the database: specify what we want from our database Find all the people who earn more than $1,000,000 and pay taxes in Tallahassee Could write in C++/Java, but a bad idea Instead use high-level query languages: Theoretical: Relational Algebra, Datalog Practical: SQL Relational algebra: a basic set of operations on relations that provide the basic principles 1

  3. What is an Algebra? Mathematical system consisting of: Operands --- variables or values from which new values can be constructed Operators --- symbols denoting procedures that construct new values from given values Examples Arithmetic(Elementary) algebra, linear algebra, Boolean algebra What are operands? What are operators? 2

  4. What is Relational Algebra? A specialized algebra Whose operands are relations or variables that represent relations Whose operators are designed to do common things (?) that we need to do with relations in a database Most importantly! can be used as a query language for relations! Operators Based on set algebra (unordered lists with no duplicates) Retrieve and manipulate tuples in a relation Takes one or more relations as its input and outputs a new relation We can chain operators together to create more complex operations 3

  5. Relational Operators at a Glance Five basic R.A. operations: Basic Set Operations union, difference (no intersection, no complement) Selection: Projection: Cartesian Product: X When our relations have attribute names: Renaming: Derived operations: Intersection, complement Joins (natural join, equi-join, theta join, semi-join, ) 4

  6. Set Operations Union: all tuples in R1 or R2, denoted as R1 U R2 R1, R2 must have the same schema R1 U R2 has the same schema as R1, R2 Example: Active-Employees U Retired-Employees If any, is duplicate elimination required Difference: all tuples in R1 but not in R2, denoted as R1 R2 R1, R2 must have the same schema R1 - R2 has the same schema as R1, R2 Example All-Employees - Retired-Employees 5

  7. Selection Returns all tuples (rows) which satisfy a condition c, denoted as c(R) c is a condition: =, <, >, AND, OR, NOT Output schema: same as input schema Find all employees with salary more than $40,000: Salary > 40000(Employee) SSN Name Dept-ID Salary SSN Name Dept-ID Salary 111060000 Alex 1 30K 983210129 Chris 2 45K 754320032 Bob 1 32K 983210129 Chris 2 45K 6

  8. Projection Unary operation: returns certain columns, denoted as A1, ,An(R) Eliminates duplicate tuples ! Input schema R(B1, , Bm) Condition: {A1, , An} {B1, , Bm} Output schema S(A1, , An) Example: project social-security number and names: SSN, Name(Employee) SSN Name Dept-ID Salary SSN Name 111060000 Alex 1 30K 111060000 Alex 754320032 Bob 1 32K 754320032 Bob 983210129 Chris 2 45K 983210129 Chris 7

  9. Selection vs. Projection Think of relation as a table How are they similar? How are they different? Horizontal vs. vertical Duplicate elimination for both? What about in real systems? Why do you need both? 8

  10. Cartesian Product Each tuple in R1 with each tuple in R2, denoted as R1 x R2 Input schemas R1(A1, ,An), R2(B1, ,Bm) Output schema is S(A1, , An, B1, , Bm) Two relations are combined! Very rare in practice (why?); but joins are very common Example: Employee x Dependent 9

  11. Example Employee Dependent Employee-SSN Dependent-Name SSN Name 111060000 Chris 111060000 Alex 754320032 Brandy 754320032 David Employee x Dependent SSN Name Employee-SSN Dependent-Name 111060000 Alex 111060000 Chris 111060000 Alex 754320032 David 754320032 Brandy 111060000 Chris 754320032 Brandy 754320032 David 10

  12. Renaming Does not change the relational instance, denoted as Notation: S(B1, ,Bn)(R) Changes the relational schema only Input schema: R(A1, , An) Output schema: S(B1, , Bn) Example: Soc-sec-num, firstname(Employee) SSN Name Soc-sec-num firstname 111060000 Alex 111060000 Alex 754320032 Bob 754320032 Bob 983210129 Chris 983210129 Chris 11

  13. Set Operations: Intersection Intersection: all tuples both in R1 and in R2, denoted as R1 R2 R1, R2 must have the same schema R1 R2 has the same schema as R1, R2 Example UnionizedEmployees RetiredEmployees Intersection is derived: R1 R2 = R1 (R1 R2) why ? 12

  14. Theta Join A join that involves a predicate denoted as R1 R2 Input schemas: R1(A1, ,An), R2(B1, ,Bm) Output schema: S(A1, ,An,B1, ,Bm) Derived operator: R1 R2 = (R1 x R2) 1. Take the (Cartisian) product R1 x R2 2. Then apply SELECTCto the result As for SELECT, C can be any Boolean-valued condition 13

  15. Theta Join: Example Sells Bar Bar Beer Price Name Address AJ's 1800 Tennessee AJ s Bud 2.5 Michael's Pub 513 Gaines AJ s Miller 2.75 Michael s Pub Bud 2.5 Michael s Pub Corona 3.0 BarInfo := Sells Sells.Bar=Bar.NameBar Bar Beer Price Name Address AJ's 1800 Tennessee AJ s Bud 2.5 AJ's 1800 Tennessee AJ s Miller 2.75 Michael's Pub 513 Gaines Michael s Pub Bud 2.5 Michael's Pub 513 Gaines Michael s Pub Corona 3.0 14

  16. Natural Join Notation: R1 R2 Input Schema: R1(A1, , An), R2(B1, , Bm) Output Schema: S(C1, ,Cp) Where {C1, , Cp} = {A1, , An} U{B1, , Bm} Meaning: combine all pairs of tuples in R1 and R2 that agree on the join attributes: {A1, ,An} {B1, , Bm} (called the join attributes) 15

  17. Natural Join: Examples Employee Dependent SSN Dependent-Name SSN Name 111060000 Chris 111060000 Alex 754320032 Brandy 754320032 David Employee Dependent = SSN, Name, Dependent-Name( Employee.SSN=Dependent.SSN(Employee x Dependent) SSN Name Dependent-Name 111060000 Alex Chris 754320032 Brandy David 16

  18. Natural Join: Examples R S A B B C Z U X Y X Z V W Y Z Z V Z V R S A B C X Z U X Z V Y Z U Y Z V Z V W 17

  19. Equi-join Special case of theta join: condition c contains only conjunctions of equalities Result schema is the same as that of Cartesian product May have fewer tuples than Cartesian product Most frequently used in practice: R1 = = R2 Natural join is a particular case of equi-join A lot of research on how to do it efficiently 18

  20. A Joke About Join A join query walks up to two tables in a restaurant and asks : Mind if I join you? 19

  21. Building Complex Expressions Algebras allow us to express sequences of operations in a natural way Example In arithmetic algebra: (x + 4)*(y - 3) Relational algebra allows the same Three notations, just as in arithmetic: 1. Sequences of assignment statements 2. Expressions with several operators 3. Expression trees 20

  22. Sequences of Assignments Create temporary relation names Renaming can be implied by giving relations a list of attributes Example: R3 := R1 CR2 can be written: R4 := R1 x R2 (R4: temporary relation) R3 := C(R4) 21

  23. Expressions with Several Operators Example: the theta-join R3 := R1 JOINCR2 can be written: R3 := C(R1 x R2) Precedence of relational operators: 1. Unary operators --- selection, projection, renaming --- have highest precedence, bind first 2. Then come products and joins 3. Then intersection 4. Finally, union and set difference bind last But you can always insert parentheses () to force the order you desire 22

  24. Expression Trees Leaves are operands either variables standing for relations or particular constant relations Interior nodes are operators, applied to their child or children 23

  25. Expression Tree: Examples Given Bars(name, addr), Sells(bar, beer, price), find the names of all the bars that are either on Tennessee St. or sell Bud for less than $3 UNION RENAMER(name) PROJECTname PROJECTbar SELECTaddr = Tennessee St. SELECT price<3 AND beer= Bud Bars Sells 24

  26. Summary of Relational Algebra Why bother ? Can write any RA expression directly in C++/Java, seems easy Two reasons: Each operator admits sophisticated implementations (think of and C) Expressions in relational algebra can be rewritten: optimized (age >= 30 AND age <= 35)(Employees) Method 1: scan the file, test each employee Method 2: use an index on age Employees Relatives Iterate over Employees, then over Relatives? Or iterate over Relatives, then over Employees? Sort Employees, Relatives, do merge-join hash-join etc. 25

More Related Content