Advanced Databases: Unveiling the Evolution of Database Systems

Dive into the realm of advanced databases with a focus on database fundamentals, the relational model, key historical figures, and the efficient implementation of the relational model. Explore the journey from the birth of the relational model to the dominant role it plays in today's data landscape.

  • Databases
  • Relational Model
  • Database Systems
  • Evolution
  • Efficiency

Presentation Transcript


  1. CS345: Advanced Databases Chris R

  2. What this course is: Database fundamentals: Theory; Old, Crusty, Good SQL stuff; No/New/Not-Yet SQL. New stuff: Knowledge bases & Inference. Databases is a strange and beautiful area: Theory, Algorithms, Systems, & Applications. It's a bit scattered, and I love it.

  3. A Brief, Biased Database History

  4. Charles Bachman, Edgar Codd, Jim Gray: Three Turing Award Winners. Seminal contributions made in Industry.

  5. The Birth of the Relational Model (1971). Database: a handful of relations (tables) with fixed schema, e.g. WorksIn(Employee, Dept). Query with a small # of operations: Selection (filter), Projection, Join, Union. Basically, an operational finite model theory.

  6. Data and Query Model
     Data: $R(A,B) = \{(a_1,b_1), \dots, (a_n,b_n)\}$ and $S(B,C,D) = \{(b'_1,c_1,d_1), \dots, (b'_m,c_m,d_m)\}$
     Projection: $\pi_A(R) = \{\, a : \exists b.\ (a,b) \in R \,\}$
     Selection: $\sigma_F(R) = \{\, (a,b) \in R : F((a,b)) \,\}$, where $F : D(R) \to \{\mathrm{True}, \mathrm{False}\}$
     Join: $\mathrm{Join}(R,S) = \{\, (a,b,c,d) : (a,b) \in R \wedge (b,c,d) \in S \,\}$
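
The three operators on slide 6 can be sketched directly over Python sets of tuples. This is only an illustration of the definitions, not how a database engine evaluates them; the function names and the sample rows are assumptions made up for the example.

```python
# Relations are modeled as sets of tuples (an assumption for illustration).

def project_a(r):
    """Projection onto A: { a : exists b. (a, b) in R }."""
    return {a for (a, _b) in r}

def select(r, f):
    """Selection: keep the tuples t of R for which the predicate F(t) is True."""
    return {t for t in r if f(t)}

def join(r, s):
    """Join on the shared attribute B: { (a, b, c, d) : (a, b) in R and (b, c, d) in S }."""
    return {(a, b, c, d) for (a, b) in r for (b2, c, d) in s if b == b2}

# Hypothetical sample data: R(A, B) and S(B, C, D).
R = {("ann", "db"), ("bob", "ml"), ("cat", "db")}
S = {("db", "gates", 4), ("ml", "clark", 2)}

print(project_a(R))
print(select(R, lambda t: t[1] == "db"))
print(join(R, S))
```

The nested-loop join above compares every pair of tuples, which is the simplest correct reading of the definition; faster join algorithms are the subject of Topic 1.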

  7. Key idea of the Relational Model: Declarative. The user says what they want, not how to get it.
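
A small sketch of the declarative idea, using Python's built-in sqlite3 module. The WorksIn(Employee, Dept) schema comes from slide 5; the rows are made up for illustration. The SQL query states which rows are wanted, while the loop below it spells out one particular way to find them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WorksIn (Employee TEXT, Dept TEXT)")
conn.executemany(
    "INSERT INTO WorksIn VALUES (?, ?)",
    [("ann", "db"), ("bob", "ml"), ("cat", "db")],  # hypothetical sample rows
)

# Declarative: describe the result; the engine chooses how to compute it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT Employee FROM WorksIn WHERE Dept = ? ORDER BY Employee", ("db",)
    )
]

# Imperative: spell out the scan, the filter, and the sort by hand.
imperative = []
for employee, dept in conn.execute("SELECT Employee, Dept FROM WorksIn"):
    if dept == "db":
        imperative.append(employee)
imperative.sort()

print(declarative, imperative)
```

Both lists hold the same employees. The difference is that the declarative form leaves the engine free to change its plan (for example, to use an index on Dept) without the query being rewritten.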

  8. Key question: Can one implement the Relational Model efficiently?

  9. System R (Pat Selinger): In 1974, System R shows it is possible to get good performance. 1st implementation of SQL. IBM didn't push it, worried about IMS cannibalization, but ...

  10. Others Come on to the Scene: Larry Ellison hears about IBM's research prototype and founds a company.

  11. Fast Forward to Today: The relational model is the dominant model of data.

  12. Takeaways about Database Research Started with mathematical elegance and with close ties to industry. Improve runtime performance as a proxy to increase programmer productivity.

  13. The Big Ideas

  14. Independence: Declarative languages can improve productivity. Different team members work independently: Backend, Storage, UI, BI, etc. Transactional model. Challenge: Support efficient concurrent access?

  15. Performance Parallel programming is hard; SQL is most popular parallel programming language. How do you deal with asymmetry of memory hierarchy (Disk/MM/Cache)? How do you structure parallel optimization? Concurrency?

  16. Manageability: Systems live over time, and the system should automate many routine tasks. Maintain derived data products (views). Self-monitoring systems (autonomic).
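
One way to picture "maintaining a derived data product" is a view that is updated with each change to the base data instead of being recomputed from scratch. The sketch below is a toy under stated assumptions: the class name, the per-department count view, and the sample rows are all hypothetical, and callers are assumed to delete only tuples that exist.

```python
from collections import Counter

class DeptCountView:
    """Hypothetical materialized view: number of employees per department."""

    def __init__(self, works_in):
        # Initial materialization: one full pass over the base relation.
        self.counts = Counter(dept for (_employee, dept) in works_in)

    def on_insert(self, employee, dept):
        # Apply the change as a delta instead of recomputing the whole view.
        self.counts[dept] += 1

    def on_delete(self, employee, dept):
        # Assumes (employee, dept) is present in the base relation.
        self.counts[dept] -= 1
        if self.counts[dept] == 0:
            del self.counts[dept]

works_in = {("ann", "db"), ("bob", "ml"), ("cat", "db")}
view = DeptCountView(works_in)
view.on_insert("dan", "ml")
view.on_delete("ann", "db")
print(dict(view.counts))
```

Each update costs constant time here, whereas recomputing the view would rescan the whole base relation.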

  17. Course Topics

  18. A user says what they want not how to get it.

  19. Topic 1: QP Fundamentals. Query Processing Fundamentals: 1. Empirical Join evaluation from the 70s! 2. System R: The Archetype (Cardinality) 3. Formal Query Languages 4. Acyclic Query Evaluation (Structure) 5. Worst-case Optimal Join Algorithms (S + C). This will be the most formal part of the course.
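
To give a flavor of item 5, here is a minimal sketch of attribute-at-a-time join evaluation on the triangle query Q(a, b, c) :- R(a, b), S(b, c), T(a, c). It is written in the spirit of worst-case optimal join algorithms but is not claimed to meet their running-time guarantees; the function names and sample relations are assumptions for illustration.

```python
from collections import defaultdict

def index(pairs):
    """Build a hash index: first attribute -> set of second attributes."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx

def triangles(r, s, t):
    """Evaluate Q(a, b, c) :- R(a, b), S(b, c), T(a, c) one attribute at a time."""
    r_idx, s_idx, t_idx = index(r), index(s), index(t)
    out = set()
    for a in r_idx.keys() & t_idx.keys():      # a must occur in both R and T
        for b in s_idx.keys() & r_idx[a]:      # b must follow a in R and occur in S
            for c in s_idx[b] & t_idx[a]:      # c must close the triangle
                out.add((a, b, c))
    return out

# Hypothetical sample relations.
R = {(1, 2), (1, 3), (2, 3)}
S = {(2, 3), (3, 4), (3, 1)}
T = {(1, 3), (1, 4), (2, 1)}

print(triangles(R, S, T))
```

The design choice to note is that each variable is bound by intersecting the candidate sets from every relation that mentions it, rather than joining two relations fully and then filtering against the third.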

  20. Analyzing your data before it was big (when it was just very large)

  21. Topic 2: OLAP-Style Analytics. Building new and old data systems: 1. Theory of Materialized Views 2. Gamma (Parallel DBs) 3. MapReduce & the Rise of NoSQL (2000s) 4. NewSQL & Optimizing Joins on MR (theory) 5. Fagin's Algorithm (theory) 6. Statistical Analytic Systems
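
As a preview of item 5, the following is a minimal sketch of Fagin's Algorithm for top-k queries over several ranked lists, assuming a monotone aggregation function (min by default) and assuming every object appears in every list. The function name, the dictionary used to stand in for random access, and the sample grades are all hypothetical.

```python
def fagin_top_k(lists, k, agg=min):
    """Top-k by aggregated grade.

    lists: one list per attribute of (object, grade) pairs, sorted by grade
    descending. Every object is assumed to appear in every list.
    """
    m = len(lists)
    lookup = [dict(lst) for lst in lists]  # stands in for random access
    seen_in = {}                           # object -> indices of lists where it was seen
    # Phase 1: sorted access in parallel until k objects were seen in all lists.
    for depth in range(len(lists[0])):
        for i, lst in enumerate(lists):
            obj, _grade = lst[depth]
            seen_in.setdefault(obj, set()).add(i)
        if sum(1 for s in seen_in.values() if len(s) == m) >= k:
            break
    # Phase 2: random access to fetch the missing grades of every seen object.
    scored = [(agg(lk[obj] for lk in lookup), obj) for obj in seen_in]
    # Phase 3: rank the seen objects by aggregated grade and keep the best k.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(obj, score) for score, obj in scored[:k]]

# Hypothetical grades for four objects under two attributes.
color = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.1)]
shape = [("c", 0.95), ("b", 0.7), ("a", 0.6), ("d", 0.2)]

print(fagin_top_k([color, shape], k=1))
```

With k=1 the scan stops after two rows per list, so object d is never read; monotonicity of the aggregate is what makes it safe to ignore objects below the stopping depth.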

  22. My biased view of the future

  23. Topic 3: Next-Generation Systems 1. Information Extraction 2. Probabilistic Query Evaluation (Theory) 3. Scalable Inference 4. Knowledge Bases

  24. Transactions.

  25. Topic 4: OLTP Style Transactional Systems 1. The rise of Key-Value Stores 2. The case for determinism 3. CALM & CAPs 4. The Return of Main Memory DBs. 5. Spanner, F1, and Data Centers

  26. Course Logistics

  27. Grading: Course Project (more next): do something interesting with data. Teams OK; form teams soon and email me by Jan 12. Midterm Exam.

  28. Projects in each topic: 1. Knowledgebase Construction: pick a domain and build a KBC system for it with DeepDive. 2. Join Algorithms: certificate versions (see me); MapReduce? GraphLab? Spark? 3. Analytics Systems. 4. Transactional Systems. You are free to choose other projects.

  29. Datasets: Snapshot of the web marked up with NLP tools and structured data (KBP and KBA challenges). 500k+ docs used by paleobiologists, plus structured data. We can mark up even more stuff. Benchmarks (ML, graphs) if you want to work on analytics or join evaluation.

  30. Wednesday: We begin the ancient art of join evaluation. All who pass this way must pass through this ancient topic! Read: Shapiro. Not too carefully; we'll go through details.