Design Phase in CMPT 275: Overview of Data Persistence and Relational DB Design

Design Phase in CMPT 275: Overview of Data Persistence and Relational DB Design
Slide Note
Embed
Share

In CMPT 275's Design Phase, focus is placed on data persistence and relational database design. The process involves normalization to ensure data integrity, reduce redundancies, and optimize performance. Various aspects such as low-level design, high-level design, and object-oriented classes are considered. Examples and illustrations are provided to help students understand the concepts effectively.

  • CMPT 275
  • Design Phase
  • Data Persistence
  • Relational Database Design
  • Normalization

Uploaded on Feb 24, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. CMPT 275 Phase: Design 1

  2. Map of design phase DESIGN LOW LEVEL DESIGN HIGH LEVEL DESIGN Data Persistance Subsystem Classes Module Interfaces Class Interfaces Interaction Diagrams Modularization architecture User Interface User Manual Implementation Janice Regan, 2008 2

  3. Implementation issues related to Data Persistence NORMALIZATION 3

  4. Relational DB Design We will structure our relational database table(s) using Normalization process of assigning attributes to tables series of stages called Normal Forms 1st normal form: fixed length records 2nd normal form: remove partial dependencies 3rd normal form: remove transitive dependencies Janice Regan, 2008 4

  5. Relational DB Design We will structure our relational database table(s) using Normalization Advantage: assures equal length records reduces data redundancies hence helps eliminate problems that result from redundancies Disadvantage: decrease performance as we normalize to higher forms, higher forms require more tables Janice Regan, 2008 5

  6. An example To illustrate how to normalize, we shall use the example of a Student Registration System. Here are some requirements: 1. For each student, we need to remember: student-id, name, address, phone, courses 2. For each of the course taken, must remember: credit, semester, grade, room, instructor s office, instructor. Janice Regan, 2008 6

  7. An example To illustrate how to normalize, we shall use the example of a Student Registration System. Here are some requirements: 3. Students can repeat same course in a later semester 4. There is only one offering a a given course in a semester 5. For each of the course attempted, must remember: semester, grade, room, instructor, instructor's office Janice Regan, 2008 7

  8. OO Classes: Class diagram Student Course offering Course Student ID name address Course name semester room Instructor Instructor s office List of students List of grades Course Name Receives a grade for takes 0..* 0..* 0..* 1 phone credit List of courses Janice Regan, 2008 8

  9. Our First Table From our requirements, we could create the following database table: (horizontal lines separate records, representing single student objects) Grade Credit Room Instructor Std-name Std-address Std-phone Coursename Std-id Sem Instructor s office Paul K. Dr. Klaus M. Nole V. Karu W. Loti Dr. Quel Dr. Klaus Dr. Yu 15438 Brook St., Bby AQ2 AQ1 ASB WM EDC AQ1 AQ2 Cmpt 101 Cmpt 150 Bus 152 Engl 102 Biol 234 Cmpt 354 03-2 03-1 03-2 03-1 03-2 03-1 03-2 4 3 3 2 3 3 3 C- B A- A+ B+ D A- ASB985 ASB352 WM543 AQ834 EDC243 ASB985 ASB111 294.2563 Will B. Kim L. Xiao T. 25636 47352 21544 Elf Ave., Van. Mer Cr., Poco Alpha St., Bby 256.2453 939.2766 295.9976 Each record is uniquely identified by the student number (Std-id). The primary key for the table is therefore Std-id. This table is unnormalized (contains records of varying length) Janice Regan, 2008 9

  10. Our First Table: Is there a problem? student-id name address phone course credit semester grade room instructor instructor s office NOT ALL RECORDS ARE OF THE SAME LENGTH !! Group of attributes repeated for each course taken by 1 student Group of attributes repeated each time a particular course is attempted by 1 student So far, attributes are in an unnormalized form. Objects, transformed into DB records, will not all be of same length. Each record contains all information about one student. Janice Regan, 2008 10

  11. What is the problem? Problem: attributes that are lists (multiple courses per student, multiple attempts per course) do not produce fixed length records Solution: remove lists by adding additional rows one to hold each attribute in the list Consider the example: for a student, have 1 complete row per course attempted/taken This results in a table in First Normal Form Janice Regan, 2008 11

  12. First Normal Form Definition of First Normal Form (1NF): Tables do not have repeating groups, i.e., each row/column intersection can contain one and only one value, not a set of values. All the key attributes are defined, no blank (null) values of keys are permitted Janice Regan, 2008 12

  13. Our example Defining primary key attributes Which attributes are needed to assure each record is uniquely identified Std-Id is not enough a student can take multiple courses Std-id and course name is not enough a student can take the same course more than once if they wish Std-id, course name and semester is enough Each time the student takes a course it is uniquely identified as a single record in the table Janice Regan, 2008 13

  14. First Normal Form of Table (1NF) Course-name Semester Grade Credit Room InstructorInstructor s Std-address Std-phone Std-id Std-name office ASB985 ASB352 WM543 AQ834 EDC243 ASB 985 ASB111 Paul K. Paul K. Paul K. Will B. Kim L. Xiao T. Xiao T. Dr. Klaus M. Nole V. Karu W. Loti Dr. Quel Dr. Klaus Dr. Yu 15438 15438 15438 25636 47352 21544 21544 Brook St., Bby Brook St., Bby Brook St., Bby Elf Ave., Van. Merry Cr., Poco Alpha St., BBY Alpha St., Bby AQ2 AQ1 ASB WM EDC AQ1 AQ2 Cmpt 101 Cmpt 150 Bus 152 Engl 102 Biol 234 Cmpt 354 Cmpt 354 03-2 03-1 03-2 03-1 03-2 03-1 03-2 4 3 3 2 3 3 3 C- B A- A+ B+ D A- 294.2563 294.2563 294.2563 256.2453 939.2766 295.9976 295-9976 Result: Single table with compound (multi-attribute) primary key. Primary Key: each row uniquely identified by one single attribute Compound Primary Key: each row uniquely identify by a or group of attributes The compound primary key for the above table is: Std-id, Course-name, Semester Janice Regan, 2008 14

  15. Is there still a problem? Yes! Our table in First Normal Form could still contain data redundancies due to partial dependencies. Partial dependencies are based only on a part of the compound primary key. Consider an attribute A, that is dependent on the compound primary key K If A is dependent on all components of the compound primary key the A is fully dependent on K If A is dependent on some but not all of the components of the primary key then A is partially dependent on K Janice Regan, 2008 15

  16. Redundancy: Examples + problems Examples of redundancy and partial dependence: For each course a student takes the student s name, address and phone number are repeated. A student s name and address are dependent on the student s id but not on the course name or semester For each course a student repeats the course credit is repeated. The course credit is dependent on the course name but not the student s id or the semester Janice Regan, 2008 16

  17. Problems related to Redundancy Redundancy Insert anomalies: e.g. Each time a student takes a course the student information must be entered, this adds to the potential for error Delete anomalies: if delete the row where info about Std-id 47352 is stored will also delete info that cannot be found anywhere else in DB table namely that Dr. Quel s office is EDC243 Update problems: because of redundant data, if a student moves, need to change student's address in all rows corresponding to every instance of every course the student had ever taken. Problems occur if one occurrence is missed or an error is made in one occurrence Janice Regan, 2008 17

  18. Partial Dependencies Definition: non-key attribute(s) dependent on only some of primary key(s) Examples: Phone #, Std-name, and address depend only on Std- id (not course name or semester) Credit depends only on course name (not on Std-id or semester) Instructor, Instructor s office, room, and grade depend on course and semester (not Std-id) Janice Regan, 2008 18

  19. Partial Dependencies Definition: non-key attribute(s) dependent on only some of primary key(s) When an attribute is only partially dependent on the primary keys of the table there may be redundant occurrences of that attribute in the table Therefore, To remove redundancies we should remove partial dependencies Janice Regan, 2008 19

  20. Problem with 1NF non-key attribute(s) may depend on some but not all of the primary key(s) e.g.: primary keys are Std_id, Course- name and semester address depends only on Std-id Janice Regan, 2008 20

  21. From 1NF to 2NF Solution: Remove partial dependencies Determine if there are any partial dependencies. If so, divide 1NF table into several tables such that in each table each non-primary key attribute is dependent on only the primary key (or compound primary key) of that table. Note that if primary key of 1NF table is not a compound primary key, there cannot be partial dependencies and hence the 1NF table is already in 2NF. Janice Regan, 2008 21

  22. Second Normal Form - Example Transform our 1NF table into a 2NF table STEP 1: We determine dependencies on single primary key: Std-id Course-name Semester Phone #, Std-name, Std-address credit none dependent only on semester Janice Regan, 2008 22

  23. 2NF Example - Step 1 DB tables look like: Course Table Student Table Course-name credit Std-id Std-name Std-address Std-phone Cmpt 101 4 15438 Paul K. Brook St. Bby 294.2563 25636 Will B. Elf Ave., Van. 256.2453 Cmpt 150 3 47352 Kim L. Merry Cr., Poco 939.2766 Bus 152 3 21544 Xiao T. Alpha St., Bby 295.9976 Engl 102 2 Biol 234 3 Cmpt 354 3 Janice Regan, 2008 23

  24. Second Normal Form - Example STEP 2: We determine dependencies on pairs of primary keys: Course-name + Semester room, instructor, instructor s office Course-name + Std-id none Semester + Std-id none Janice Regan, 2008 24

  25. 2NF Example - Step 2 Course Offering Table Course-name Semester Room Instructor Instructor s office Cmpt 101 03-2 AQ2 Dr. Klaus ASB985 Cmpt 150 03-1 AQ1 M. Nole ASB352 Bus 152 03-2 ASB V. Karu WM543 Engl 102 03-1 WM W. Loti AQ834 Biol 234 03-2 EDC Dr. Quel EDC243 Cmpt 354 03-1 AQ1 Dr. Klaus ASB985 Cmpt 354 03-2 AQ2 Dr. Yu ASB111 Janice Regan, 2008 25

  26. Second Normal Form - Example Step 3 We determine dependencies on whole compound primary key: Course-name + Semester + Std-id grade Janice Regan, 2008 26

  27. 2NF Example - Step 3 Student Registration Table Std-id Course-name Grade Semester 03-2 03-1 03-2 03-1 03-2 03-1 03-2 C- B A- A+ B+ D A- 15438 15438 15438 25636 47352 21544 21544 Cmpt 101 Cmpt 150 Bus 152 Engl 102 Biol 234 Cmpt 354 Cmpt 354 Janice Regan, 2008 27

  28. 2NF Example, alternate Step 3-1 Introducing Association Looking at the data model (class diagram), we can recognize the many-to-many multiplicity relationship between Student, grade and CourseOffering Student Course offering Course Course Name credit Student ID name address phone List of courses Course name semester List of grades room Instructor Instructor s office List of students Receives a grade for takes 0..* 0..* 1 0..* These 3 attributes are used to implement many-to-many multiplicity relationships Janice Regan, 2008 28

  29. Association Class However, this data model does not lead to DB tables with records of fixed length because these 3 attributes are of varying size for each object of Student and Course Offering class types, so we introduce yet another class that associates 1 student to many attempts at (registrations to) one course and 1 course to many attempts (registrations) per 1 student. For each of these attempts there is one grade Janice Regan, 2008 29

  30. Association Class An association class takes a many-many relation and breaks it into two 1-many relationships Student and Student Registration have a 1-to-many multiplicity relationship Course Offering and Student Registration have a 1-to- many multiplicity relationship The association class will contain the attributes that are lists (that cause the many to may relationship) The association class wil contain the attributes that depend upon all the variables (lists) in the association class. Janice Regan, 2008 30

  31. 2NF Example, alternate Step 3 - 2 This relationship can be broken down into 2 1-to-many multiplicity relationships by creating an association class Student-Registration Student Course offering Course Course Name credit 1 Student ID name address phone Course name semester room Instructor Instructor s office 0..* Student Registration Student ID 1 Course name semester grade 1 0..* 0..* Janice Regan, 2008 31

  32. 2NF Example, alternate Step 3 - 3 Course Offering Table We can therefore store the attributes that depend on this association into a Student Registration table. The compound primary key of this table is the union of the primary keys of the Student and the Course Offering tables: Student Registration Table Std-id Course-name Semester Course-name Semester Room Instructor Instructor s office Cmpt 101 03-2 AQ2 Dr. Klaus ASB985 Cmpt 150 03-1 AQ1 M. Nole ASB352 Bus 152 03-2 ASB V. Karu WM543 Engl 102 03-1 WM W. Loti AQ834 Biol 234 03-2 EDC Dr. Quel EDC243 Cmpt 354 03-1 AQ1 Dr. Klaus ASB985 Cmpt 354 03-2 AQ2 Dr. Yu ASB111 Grade Course Table Course-name credit Cmpt 101 4 Cmpt 150 3 Student Table Std-id Std-name Std-address Std-phone 03-2 03-1 03-2 03-1 03-2 03-1 03-2 Cmpt 101 Cmpt 150 Bus 152 Engl 102 Biol 234 Cmpt 354 Cmpt 354 15438 15438 15438 25636 47352 21544 21544 C- B A- A+ B+ D A- 25636 Will B. Elf Ave., Van. 256.2453 47352 Kim L. Merry Cr., Poco 939.2766 Bus 152 3 21544 Xiao T. Alpha St., Bby 295.9976 Engl 102 2 Biol 234 3 Cmpt 354 3 Janice Regan, 2008 32

  33. Second Normal Form To get Student Registration System in 2NF we need 4 tables (files) Multiplicities come from our Requirement Analysis phase With this 2NF DB, students do not have to register to a course to be admitted to an institution Room and instructor for a course offering can be entered even if there are no students registered yet Less redundancy:Most update problems have been eliminated, but we can still have multiple occurrences of instructor and instructor s office Janice Regan, 2008 33

  34. Second Normal Form Definition of 2NF: The table is in 1NF The table includes no partial dependencies Janice Regan, 2008 34

  35. Is there still a problem? Yes! Our tables in 2NF could still contains data redundancies due to transitive dependencies. When one non-primary key attribute is dependent on another non-primary key attribute, the second non-primary key attribute is transitively dependent on the first non-primary key attribute. Janice Regan, 2008 35

  36. Transitive Dependencies: example instructor s office (non-primary key attribute) is transitively dependent on instructor (another non- primary key attribute) but not on any of the primary key attributes for that particular table (course and/or semester) Solution: Conversion from 2NF to 3NF Determine the transitive dependencies. Split 2NF table containing the transitive dependency such that the dependency is represented by its own table. Janice Regan, 2008 36

  37. 3NF Example Course Offering Table Instructor Table Course Table Course-name credit Course-name Semester Room Instructor Instructor Instructor s office Cmpt 101 03-2 AQ2 Dr. Klaus Cmpt 101 4 Cmpt 150 3 Dr. Klaus ASB985 M. Nole ASB352 V. Karu WM543 Cmpt 150 03-1 AQ1 M. Nole Bus 152 3 Bus 152 03-2 ASB V. Karu W. Loti AQ834 Dr. Quel EDC243 Dr. Yu ASB111 Engl 102 2 Biol 234 3 Engl 102 03-1 WM W. Loti Biol 234 03-2 EDC Dr. Quel Cmpt 354 3 Cmpt 354 03-1 AQ1 Dr. Klaus Cmpt 354 03-2 AQ2 Dr. Yu Student Registration Table Std-id Course-name Semester Grade Student Table Std-id Std-name Std-address Std-phone 03-2 03-1 03-2 03-1 03-2 03-1 03-2 Cmpt 101 Cmpt 150 Bus 152 Engl 102 Biol 234 Cmpt 354 Cmpt 354 15438 15438 15438 25636 47352 21544 21544 C- B A- A+ B+ D A- 25636 Will B. Elf Ave., Van. 256.2453 47352 Kim L. Merry Cr., Poco 939.2766 21544 Xiao T. Alpha St., Bby 295.9976 Janice Regan, 2008 37

  38. Third Normal Form Definition: Every table is in 2NF. There are no transitive dependencies. Janice Regan, 2008 38

  39. Normalization Summary When normalizing, we seek to make sure that attributes depend on the key (1NF) on the whole key (2NF) on nothing but the key (3NF) When normalized: records have fixed length no insert/delete/update anomalies minimize redundancy Janice Regan, 2008 39

Related


More Related Content