Database System Architecture: Organizing Computer Systems

Explore the intricacies of database system architecture and how computer systems are organized, covering aspects such as networking, parallelism, and distribution. Learn about centralized, client-server, parallel, and distributed system architectures, as well as different network types. Dive into topics like centralized systems, parallelism granularity, client-server architecture, and server system architecture. Gain insights into the core principles and structures that underpin modern database systems.

  • Database System Architecture
  • Computer Systems
  • Networking
  • Parallelism
  • Distribution


Presentation Transcript


  1. 1 CHAPTER 17: DATABASE-SYSTEM ARCHITECTURE

  2. 2 17.0: INTRODUCTION

  3. 3 17.0: Introduction Database-system architecture: how do I organize the underlying computer system? Three aspects: networking (task assignment), parallelism (task speed-up), and distribution (data safety).

  4. 4 Outline 17.1: Centralized and Client-Server Architectures 17.2: Server System Architectures 17.3: Parallel Systems 17.4: Distributed Systems 17.5: Network Types

  5. 5 17.1: CENTRALIZED AND CLIENT-SERVER ARCHITECTURES

  6. 6 17.1.1: Centralized systems Centralized system: runs on a single computer system [Diagram: one machine with a multicore processor, four cores sharing a cache memory]

  7. 7 17.1.1: Centralized systems Single-user systems: one processor, one or two hard disks; one user at a time; no concurrency control needed, and crash recovery only by taking a backup before each update. Multiuser systems: more disks, more memory, several processors; a large number of users at a time; concurrency control and crash-recovery provisions are needed.

  8. 8 17.1.1: Centralized systems Parallelism: coarse vs. fine granularity Coarse-granularity parallelism: a few powerful processors (1 or 2) share the same memory; each task runs on a single processor, so throughput is high but the execution time of an individual task is not reduced. Fine-granularity parallelism: many processors (1,000 or more); memory may not be shared; a single task is parallelized (split between processors).

  9. 9 17.1.2: Client-Server architecture

  10. 10 17.1.2: Client-server architecture Two-tier vs. three-tier architectures; transactional remote procedure calls (see the sketch below).
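
A minimal sketch (hypothetical API, not from the slides) of what a transactional remote procedure call adds over plain RPC: a group of remote calls is bracketed by transaction boundaries, so either all of them take effect or none does:

    class TransactionalRPC:
        """Client stub for transactional RPC (hypothetical server API)."""

        def __init__(self, server):
            self.server = server

        def run(self, calls):
            """Execute a list of (procedure, args) pairs as one transaction."""
            tid = self.server.begin_transaction()
            try:
                results = [self.server.call(tid, proc, *args) for proc, args in calls]
                self.server.commit(tid)      # all calls take effect together
                return results
            except Exception:
                self.server.rollback(tid)    # none of the calls take effect
                raise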

  11. 11 17.2: SERVER SYSTEM ARCHITECTURES

  12. 12 17.2: Server System Architecture Transaction-server systems: clients just send transaction requests to the server; the server stores the data and performs the transactions. Data-server systems: clients send data requests to the server and perform the reads and writes on the received data themselves; the server just stores the data (files or pages).

  13. 13 17.2.1: Transaction Servers

  14. 14 17.2.1: Transaction Servers Locks [Diagram: transaction 1 wants to update record r; it checks the lock table (Lock1, Lock2, Lock3, Lock4) and finds no lock blocking r]

  15. 15 17.2.1: Transaction Servers Locks [Diagram: transaction 1 creates Lock_r in the lock table and processes the update on record r; transaction 2, also updating r, checks the table and finds that Lock_r blocks r]

  16. 16 17.2.1: Transaction Servers Locks for locks: mutual exclusion [Diagram: transaction 1 checks the lock table and finds no lock blocking r]

  17. 17 17.2.1: Transaction Servers Locks for locks: mutual exclusion [Diagram: while transaction 1 is still creating Lock_r, transaction 2 checks the table and also finds no lock blocking r: a race condition on the lock table itself]

  18. 18 17.2.1: Transaction Servers Locks for locks: mutual exclusion [Diagram: with a latch on the lock table, transaction 2 must wait until the latch is released while transaction 1 creates Lock_r]

  19. 19 17.2.1: Transaction Servers Getting rid of the lock manager An independent lock-manager process risks becoming a bottleneck (overhead on the lock manager). Alternative: transactions send their requests directly to the lock table, under mutual exclusion on the lock table; when a lock is released, a notification is sent to the waiting processes. A lock manager is still used to handle deadlock detection (see the sketch below).
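
A minimal sketch, in Python with invented names, of this direct-access scheme: a single latch (mutex) gives mutual exclusion on the lock table, and processes waiting for a lock are notified when it is released:

    import threading

    class LockTable:
        """Shared lock table accessed directly by transaction processes."""

        def __init__(self):
            self._latch = threading.Lock()             # mutual exclusion on the table
            self._released = threading.Condition(self._latch)
            self._owner = {}                           # record id -> owning transaction

        def acquire(self, record, txn):
            with self._released:                       # take the latch
                while record in self._owner:           # some Lock_r already blocks r
                    self._released.wait()              # wait for a release notification
                self._owner[record] = txn              # create Lock_r

        def release(self, record):
            with self._released:
                del self._owner[record]
                self._released.notify_all()            # notify the waiting processes

    table = LockTable()
    table.acquire("r", txn=1)    # transaction 1 locks record r
    # ... process the update ...
    table.release("r")           # transaction 2 may now lock r

Deadlock detection would still run in a separate lock-manager process, as the slide notes.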

  20. 20 17.2.2: Data servers Principle: the server just stores the data; clients handle all the back-end functionality. Suitable when there is a high-speed connection, client and server machines have comparable power, and tasks are computation-intensive. Main issue: the cost of communication between client and server.

  21. 21 17.2.2: Data servers Shipping granularity Item shipping: the unit of communication is tuples or objects; good for slow networks (we send only what we need), bad for high-speed networks (message overhead). Page shipping: the unit of communication is pages or files; bad for slow networks (slowed down by useless data), good for high-speed networks (fewer messages). Prefetching: send items likely to be used in the near future (see the worked illustration below).
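
A worked illustration (numbers invented): suppose a client needs 1,000 tuples that fill 10 pages. Item shipping costs 1,000 small messages where page shipping costs 10 larger ones, so page shipping wins on a fast network; but if the client needed only 5 of those tuples, page shipping would still move 10 full pages of mostly useless data, which is exactly what hurts on a slow network.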

  22. 22 17.2.2: Data servers Lock granularity [Diagram: a client requests file f; the server creates a lock Lf on the whole file in its lock table (L1, L2, L3, Lf)]

  23. 23 17.2.2: Data servers Lock granularity [Diagram: the client actually works only on page pi of file f, yet Lf keeps the entire file locked]

  24. 24 17.2.2: Data servers Lock granularity [Diagram: the server reduces the lock granularity, replacing Lf with Lpi so that only page pi remains locked]

  25. 25 17.2.2: Data servers Data caching Data shipped from the server are cached at the client and maintained even after the end of the transaction. Ensuring data coherency: the data may have been modified after caching! Before using cached data, the client sends a message to the server to check that the data are up to date and to acquire a lock on them (see the sketch below).
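
A minimal sketch of this check-on-use protocol, assuming a hypothetical server API (lock_and_get_version and fetch are invented names):

    class ClientCache:
        """Client-side data cache kept across transaction boundaries."""

        def __init__(self, server):
            self.server = server
            self.entries = {}        # key -> (version, data)

        def read(self, key):
            if key in self.entries:
                version, data = self.entries[key]
                # one message: lock the item on the server and validate our version
                if self.server.lock_and_get_version(key) == version:
                    return data      # cached copy is up to date, reuse it
            # cache miss or stale copy: ship the item from the server
            version, data = self.server.fetch(key)
            self.entries[key] = (version, data)
            return data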

  26. 26 17.2.2: Data servers Lock caching [Diagram: on a request for file f, the client first checks its cached lock table; it still holds Lf, so it can use the cached file f without contacting the server]

  27. 27 17.2.3: Cloud-based servers Context: a company running an information system entrusts part of the service to a third party. Outsourcing the entire service: machines and software are owned by the third party. Cloud computing: the software (or applications) is provided by the service provider; the machines belong to a third party; the interface is a set of virtual machines.

  28. 28 17.3: PARALLEL SYSTEMS

  29. 29 17.3: Parallel Systems Designed for very large databases (terabytes) and high transaction throughput (thousands of transactions per second). Two main measures of performance: throughput, the number of tasks the system can complete per second, and response time, the time taken to complete a single task. Gains from using parallel systems: speedup (perform a given task in less time) and scaleup (handle larger tasks).

  30. 30 17.3.1: Speedup and Scaleup Linear speedup and scaleup Linear speedup: the speed at which a given task is performed is proportional to the degree of parallelism. Linear scaleup: if the size of a task is increased N times, and so is the degree of parallelism, the execution time remains the same.
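
In standard textbook notation (not spelled out on the slide), with $T_S$ the execution time of a task on the small system $M_S$ and $T_L$ its time on the $N$-times-larger system $M_L$ (for scaleup, $T_L$ is instead the time of the $N$-times-larger task $Q_N$ on $M_L$):

    \mathrm{speedup} = \frac{T_S}{T_L}, \qquad \text{linear speedup: } \mathrm{speedup} = N

    \mathrm{scaleup} = \frac{T_S}{T_L}, \qquad \text{linear scaleup: } \mathrm{scaleup} = 1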

  31. 31 17.3.1: Speedup and Scaleup Two types of scaleup Batch scaleup: context: the size of the database increases; tasks: large jobs whose execution time depends on the size of the database; measure of problem size: the size of the database. Transaction scaleup: context: the transaction rate increases; tasks: small updates, arriving at a rate that grows with the size of the database; measure of problem size: the transaction rate.

  32. 32 17.3.1: Speedup and Scaleup Factors affecting performance Start-up costs: initiating each single task takes time. Interference: two tasks compete for the same shared resource (bus, disks, locks...). Skew: dividing a task into equal-sized parts is difficult (see the worked illustration below).
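
A worked illustration of skew (numbers invented): split a job of 100 work units across 10 processors. With a perfect division each part takes 10 time units; if skew makes the largest part 20 units, the whole job takes 20 units, and the speedup drops from 10 to 5, because the slowest part determines the completion time.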

  33. 33 17.3.2: Interconnection network

  34. 34 17.3.2: Interconnection network Bus: suitable for few processors; only one component at a time can communicate. Mesh: nodes that are not connected to each other communicate by routing messages; scales better with increasing parallelism. Hypercube: each component is connected to log(n) other components, and a message has at most log(n) links to traverse (see the sketch below).
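
To make the hypercube numbers concrete: label the n = 2^d nodes with d-bit addresses; each node is linked to the d = log2(n) nodes whose addresses differ from its own in exactly one bit, so a message can reach any node in at most log2(n) hops. A small illustrative helper (Python, not from the slides):

    def hypercube_neighbors(i, n):
        """Neighbors of node i in a hypercube of n = 2**d nodes:
        flip each of the d = log2(n) address bits in turn."""
        d = n.bit_length() - 1                # log2(n) for n a power of two
        return [i ^ (1 << b) for b in range(d)]

    print(hypercube_neighbors(0, 8))          # [1, 2, 4]: 3 links, 3 = log2(8)
    print(hypercube_neighbors(5, 8))          # [4, 7, 1]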

  35. 35 17.3.3: Parallel Database architectures

  36. 36 17.3.3.1: Shared Memory Communication between processors is very efficient: with shared memory, a memory write replaces a message. Bottleneck on the memory bus: not scalable, since adding more processors worsens the bottleneck. A memory cache on each processor reduces references to shared memory, but some data cannot be cached, and maintaining cache coherency increases the overhead.

  37. 37 17.3.3.2: Shared Disk The memory bus is no longer a bottleneck. Provides fault tolerance: in case of failure, tasks can be taken over by other processors. Bottleneck on access to the disks; communication across processors is slower.

  38. 38 17.3.3.3: Shared Nothing Requires a high-speed interconnection network. No memory or disk bottleneck. Scalable: no limitation on the number of processors. High cost for communication between nodes.

  39. 39 17.3.3.4: Hierarchical Distributed virtual memory: logically a single shared memory; physically multiple disjoint memory systems; a software layer provides the interface between the processor view and the hardware view.

  40. 40 17.4: DISTRIBUTED SYSTEMS

  41. 41 17.4: Distributed systems The database is stored on several computers. Communication: through high-speed networks or the Internet. No shared memory or disks. The databases are geographically separated and separately administered.

  42. 42 17.4: Distributed system

  43. 43 17.4: Distributed Systems Purposes of distributed systems Sharing data: enables access to data that reside far away geographically and on different machines. Autonomy: a global administrator is responsible for the entire system, and some responsibilities are delegated to local administrators. Availability: data remain available even if a site fails; replication provides this feature; a failed site is ignored by transactions, and a process handles the re-integration of recovered sites.

  44. 44 17.4.1: An Example of a Distributed Database [Diagram: two bank sites hold the accounts of niiyang and arnaud (balances 11,000 and 100,000 after niiyang deposits 10,000)]

  45. 45 17.4.1: An Example of a Distributed Database [Diagram: a deposit into niiyang's account at niiyang's own site touches a single site: a local transaction]

  46. 46 17.4.1: An Example of a Distributed Database [Diagram: arnaud ("I want my money back!!") orders a transfer of 9,000 from niiyang's account to his own: -9,000 at one site, +9,000 at the other]

  47. 47 17.4.1: An Example of a Distributed Database [Diagram: the transfer updates accounts at both sites (balances become 2,000 and 109,000): a global transaction]

  48. 48 17.4.2: Implementation issues Atomicity A transaction may execute on multiple sites: if it aborts at one site, it must abort at all sites; if it commits at one site, it must commit at all sites. Solution: the two-phase commit protocol (2PC).

  49. 49 17.4.2: Implementation issues Atomicity, 2PC [Diagram: transaction T runs on sites S1, S2, and S3; the coordinator sends "Prepare T" to every site]

  50. 50 17.4.2: Implementation issues Atomicity, 2PC [Diagram: each site writes <Ready T> to its log and answers "Ready T"; once all votes arrive, the coordinator logs <Commit T>]
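
A minimal sketch of the coordinator's side of 2PC, with invented send/receive/log helpers: phase 1 sends "Prepare T" and collects the <Ready T> votes; phase 2 logs the decision and broadcasts it, committing only if every site voted Ready:

    def two_phase_commit(coordinator, sites, T):
        """Run 2PC for transaction T across the given sites (hypothetical API)."""
        # Phase 1: ask every site to prepare; each votes Ready or Abort
        for site in sites:
            site.send(("Prepare", T))
        votes = [site.receive() for site in sites]   # blocks for <Ready T>/<Abort T>

        # Phase 2: decide, force the decision to the coordinator's log, broadcast
        if all(vote == ("Ready", T) for vote in votes):
            decision = ("Commit", T)
        else:
            decision = ("Abort", T)
        coordinator.log(decision)                    # e.g. <Commit T> written first
        for site in sites:
            site.send(decision)                      # each site applies it locally
        return decision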
