
Database System Architecture: Organizing Computer Systems
An overview of database-system architecture: how the underlying computer system is organized, covering networking, parallelism, and distribution. The slides cover centralized, client-server, parallel, and distributed system architectures, server system architectures, and the different network types that underpin modern database systems.
Presentation Transcript
1 CHAPTER 17: DATABASE-SYSTEM ARCHITECTURE
2 17.0: INTRODUCTION
3 17.0: Introduction Database-system architecture: how do I organize the underlying computer system? Three aspects: networking (task assignment), parallelism (task speed-up), and distribution (data safety).
4 Outline 17.1: Centralized and Client-Server Architectures 17.2: Server System Architectures 17.3: Parallel Systems 17.4: Distributed Systems 17.5: Network Types
5 17.1: CENTRALIZED AND CLIENT-SERVER ARCHITECTURES
6 17.1.1: Centralized systems Centralized system: runs on a single computer system. (Diagram: a single machine with a multi-core processor, the cores sharing cache memory.)
7 17.1.1: Centralized systems Single-user systems: one processor, one or two hard disks; one user at a time; no concurrency control; crash recovery only by backup before update. Multiuser systems: more disks, more memory, more processors; a large number of users at a time; need concurrency control and crash-recovery provisions.
8 17.1.1: Centralized systems Parallelism: coarse vs fine granularity. Coarse-granularity parallelism: a few powerful processors (1 or 2) share the same memory; a task runs on a single processor; high throughput, but an individual task runs no faster. Fine-granularity parallelism: many processors (1,000 or more); memory may not be shared; a task is parallelized (split between processors).
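As a rough illustration of the two granularities (a sketch only; the function names and the use of Python worker processes are assumptions, not from the slides):

```python
# Coarse vs. fine granularity, illustrated with Python worker processes.
from concurrent.futures import ProcessPoolExecutor

def work_chunk(items):
    return sum(x * x for x in items)        # stand-in for a database task

def run_coarse(data):
    # Coarse granularity: the whole task runs on a single processor.
    return work_chunk(data)

def run_fine(data, workers=8):
    # Fine granularity: the task is split into pieces, one per processor.
    size = len(data) // workers
    chunks = [data[i * size:(i + 1) * size] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(work_chunk, chunks))

if __name__ == "__main__":
    data = list(range(1_000_000))
    assert run_coarse(data) == run_fine(data)   # same result, work split up
```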
9 17.1.2: Client-Server Architecture
10 17.1.2: Client-server architecture Two-tier vs. three-tier architecture; transactional remote procedure call (RPC).
11 17.2: SERVER SYSTEM ARCHITECTURES
12 17.2: Server System Architectures Transaction-server systems: clients just send transaction requests to the server; the server stores the data and performs the transactions. Data-server systems: clients send data requests to the server and perform reads and writes on the received data themselves; the server just stores the data (files or pages).
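The difference in request shape can be sketched as two hypothetical client stubs (conn and all of its methods are invented names, not a real API):

```python
def transaction_server_client(conn):
    # Transaction server: ship the whole operation; the server reads the
    # data, performs the transaction, and returns only the result.
    return conn.execute("UPDATE account SET balance = balance + 100 "
                        "WHERE owner = 'niiyang'")      # hypothetical call

def data_server_client(conn):
    # Data server: fetch the data, do the work locally, write it back.
    page = conn.fetch_page("account", page_no=7)        # hypothetical call
    record = page.find(owner="niiyang")                 # hypothetical call
    record.balance += 100
    conn.write_page(page)                               # hypothetical call
```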
13 17.2.1: Transaction Servers
14 17.2.1: Transaction Servers Locks. (Diagram: transaction 1 wants to update record r; it checks the lock table and finds no lock blocking r.)
15 17.2.1: Transaction Servers Locks. (Diagram: transaction 1 creates Lock_r and processes the update; transaction 2, also updating r, checks the lock table and is blocked by Lock_r.)
16 17.2.1: Transaction Servers Locks for locks: mutual exclusion. (Diagram: transaction 1 checks the lock table and finds no lock blocking r.)
17 17.2.1: Transaction Servers Locks for locks: mutual exclusion. (Diagram: transaction 2 checks the table concurrently, before transaction 1 has finished creating Lock_r, and also finds no lock blocking r; without mutual exclusion both would update r.)
18 17.2.1: Transaction Servers Locks for locks: mutual exclusion. (Diagram: with a latch on the lock table, transaction 2 waits until the latch is released before checking, so only one transaction creates Lock_r.)
19 17.2.1: Transaction Servers Getting rid of the lock-manager process: an independent lock-manager process risks becoming an overhead bottleneck. Instead, transactions send requests directly to the lock table, with mutual exclusion enforced on the table; when a lock is released, a notification is sent to the waiting processes. A lock-manager process is still used to handle deadlock detection.
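A compact sketch of this scheme, assuming Python threads: the lock table is protected by a short-duration latch, transactions access it directly, and a released lock wakes the first waiter (exclusive locks only; lock modes and deadlock detection are omitted):

```python
import threading

class LockTable:
    def __init__(self):
        self.latch = threading.Lock()    # mutual exclusion on the table
        self.locks = {}                  # record id -> (holder, waiters)

    def acquire(self, txn, record):
        granted = threading.Event()
        with self.latch:                 # latch held only briefly
            if record not in self.locks:
                self.locks[record] = (txn, [])   # no lock blocking record
                return
            holder, waiters = self.locks[record]
            waiters.append((txn, granted))
        granted.wait()                   # block until notified

    def release(self, txn, record):
        with self.latch:
            holder, waiters = self.locks[record]
            if waiters:
                next_txn, event = waiters.pop(0)
                self.locks[record] = (next_txn, waiters)
                event.set()              # notify the waiting process
            else:
                del self.locks[record]
```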
20 17.2.2: Data servers Principle: the server just stores the data; clients handle all the back-end functionality. Suitable when there is a high-speed connection, client and server machines have comparable power, and tasks are computation-intensive. Main issue: the cost of communication between client and server.
21 17.2.2: Data servers Shipping granularity. Item shipping: the unit of communication is tuples or objects; good for slow networks (we send only what we need), bad for high-speed networks (message overhead). Page shipping: the unit of communication is pages or files; bad for slow networks (useless information slows things down), good for high-speed networks (fewer messages). Prefetching: send information likely to be used in the near future.
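Two hypothetical fetch routines make the trade-off concrete (conn and its methods are invented for the sketch):

```python
def fetch_by_item(conn, keys):
    # Item shipping: one message per tuple -- minimal data transferred,
    # but many round trips (costly on a high-speed network).
    return [conn.get_tuple(k) for k in keys]         # hypothetical call

def fetch_by_page(conn, page_ids):
    # Page shipping: one message per page -- fewer round trips, but each
    # page may carry tuples the client never uses (costly on a slow link).
    return [conn.get_page(p) for p in page_ids]      # hypothetical call
```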
22 17.2.2: Data servers Lock granularity. (Diagram: a client requests file f; the server creates a lock Lf on the whole file in the lock table.)
23 17.2.2: Data servers Lock granularity. (Diagram: another client requests a single page pi of file f, which the file lock Lf blocks.)
24 17.2.2: Data servers Lock granularity. (Diagram: the server reduces the lock granularity, replacing the file lock Lf with a page lock Lpi.)
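A minimal sketch of that granularity reduction, with invented data structures (a real server would call back the lock holder to learn which pages it is actually using):

```python
# De-escalating a file lock into page locks (assumed structures).
lock_table = {("file", "f"): "client1"}      # coarse lock on whole file f

def request_page(client, file_id, page_id, holder_pages):
    holder = lock_table.get(("file", file_id))
    if holder and holder != client:
        # Replace the file lock with page locks so both clients can work
        # on disjoint pages of the same file.
        del lock_table[("file", file_id)]
        for p in holder_pages:
            lock_table[("page", file_id, p)] = holder
    lock_table[("page", file_id, page_id)] = client

request_page("client2", "f", 3, holder_pages=[1, 2])
print(lock_table)   # client1 holds pages 1 and 2; client2 holds page 3
```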
25 17.2.2: Data servers Data caching: cache data shipped from the server, and retain it even after the transaction ends. Ensuring data coherency: the data may have been modified at the server after it was cached! Before using cached data, send a message to check that the data are up to date and to acquire a lock on them.
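One common way to implement the up-to-date check is a version number per item; the sketch below assumes that scheme (it is not the book's specific protocol, and conn's methods are invented):

```python
cache = {}   # key -> (version, data); retained across transactions

def read(conn, key):
    if key in cache:
        version, data = cache[key]
        if conn.is_current(key, version):   # hypothetical coherency check
            return data                     # cached copy is up to date
    version, data = conn.fetch(key)         # hypothetical call
    cache[key] = (version, data)
    return data
```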
26 17.2.2: Data servers Lock caching. (Diagram: a request for file f first checks the client's cached lock table; if the cached lock Lf is still held, file f can be used without contacting the server.)
27 17.2.3: Cloud-based servers Context: a company running an information system entrusts part of the service to a third party. Outsourcing the entire service: machines and software are owned by the third party. Cloud computing: the software (or applications) is provided by the service provider; the machines belong to a third party; the interface is a set of virtual machines.
28 17.3: PARALLEL SYSTEMS
29 17.3: Parallel Systems Designed for very large databases (terabytes) and high transaction throughput (thousands of transactions per second). Two main measures of performance: throughput, the number of tasks the system can complete per second; and response time, the time taken to complete a single task. Performance gains from parallel systems: speedup (perform a given task in less time) and scaleup (handle larger tasks).
30 17.3.1: Speedup and Scaleup Linear speedup and scaleup. Linear speedup: the speed at which a given task is performed is proportional to the degree of parallelism. Linear scaleup: if the size of a task is increased n-fold, and so is the degree of parallelism, the execution time remains the same.
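In symbols, following the standard textbook definitions:

```latex
\text{speedup} = \frac{T_S}{T_L} \qquad \text{scaleup} = \frac{T_S}{T_L}
% Speedup: T_S and T_L are the times for the SAME task on the smaller and
% larger systems; linear when an N-times larger system gives speedup = N.
% Scaleup: T_S is the time for task Q on the smaller system, T_L the time
% for the N-times larger task Q_N on the N-times larger system; linear
% when T_L = T_S, i.e. scaleup = 1.
```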
31 17.3.1: Speedup and Scaleup Two types of scaleup. Batch scaleup: context: the size of the database increases; tasks: large jobs whose execution time depends on the size of the database; measure of problem size: the size of the database. Transaction scaleup: context: the transaction rate increases; tasks: small updates, at a rate that grows with the size of the database; measure of problem size: the transaction rate.
32 17.3.1: Speedup and Scaleup Factors affecting performance. Start-up costs: the cost of initiating many processes can dominate short tasks. Interference: two tasks compete for the same shared resource (bus, disks, locks, ...). Skew: dividing a task into exactly equal-sized parts is difficult, and the slowest part determines completion time.
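A toy model (an assumption for illustration, not from the slides) shows how start-up cost and skew erode linear speedup:

```python
def speedup(t_serial, n, startup=0.0, max_share=None):
    # max_share: fraction of the work done by the most loaded processor;
    # 1/n when the division into subtasks is perfectly equal.
    share = max_share if max_share is not None else 1.0 / n
    return t_serial / (startup + t_serial * share)

print(speedup(100.0, 10))                  # ideal: 10.0
print(speedup(100.0, 10, startup=5.0))     # start-up cost: ~6.7
print(speedup(100.0, 10, max_share=0.2))   # skew: at best 5.0
```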
33 17.3.2: Interconnection network
34 17.3.2: Interconnection network Bus: suitable for few processors; only one component at a time can communicate. Mesh: nodes not directly connected to each other communicate by routing through intermediate nodes; scales better with increasing parallelism. Hypercube: a component is connected to log(n) other components; a message traverses at most log(n) links.
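The hypercube figures follow from viewing node IDs as bit strings; a short sketch:

```python
# With n = 2^d nodes, node i links to the d nodes whose IDs differ from i
# in exactly one bit: log2(n) neighbours, at most log2(n) links per message.
def neighbours(i, n):
    d = n.bit_length() - 1                # dimension d = log2(n)
    return [i ^ (1 << b) for b in range(d)]

def hops(i, j):
    return bin(i ^ j).count("1")          # Hamming distance = links needed

print(neighbours(0, 8))   # [1, 2, 4]
print(hops(0, 7))         # 3 == log2(8)
```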
35 17.3.3: Parallel Database architectures
36 17.3.3.1: Shared Memory Communication between processors is very efficient: with shared memory, a memory write replaces message passing. Bottleneck on the memory bus: not scalable; adding more processors worsens the bottleneck. A memory cache at each processor reduces references to shared memory, but some data cannot be cached, and maintaining cache coherency increases overhead.
37 17.3.3.2: Shared Disk The memory bus is no longer a bottleneck. Provides fault tolerance: tasks can be taken over by other processors in case of failure. Bottleneck on access to the disks; communication across processors is slower.
38 17.3.3.3: Shared Nothing Requires a high-speed interconnection network. No memory or disk bottleneck. Scalable: no limit on the number of processors. High cost for communication between nodes.
39 17.3.3.4: Hierarchical Distributed virtual memory: logically a single shared memory; physically multiple disjoint memory systems; software provides the interface between the processors' view and the hardware view.
40 17.4: DISTRIBUTED SYSTEMS
41 17.4: Distributed Systems The database is stored on several computers. Communication: through high-speed networks or the internet. No shared memory or disks. The databases are geographically separated and separately administered.
42 17.4: Distributed Systems (Diagram: several database sites connected by a communication network.)
43 17.4: Distributed Systems Purposes of distributed systems. Sharing data: access data that is geographically far away and on different machines. Autonomy: a global administrator is responsible for the entire system, with some responsibilities delegated to local administrators. Availability: data remain available even if a site fails; replication provides this; transactions do not consider a failed site, and a process re-integrates recovered sites.
44 17.4.1: An Example of a Distributed Database (Diagram: two bank accounts at different sites: niiyang's, with 1,000, and arnaud's, with 100,000.)
45 17.4.1: An Example of a Distributed Database Local transaction: a deposit of +10,000 into niiyang's account runs entirely at one site, bringing the balance to 11,000.
46 17.4.1: An Example of a Distributed Database "I want my money back!!" An order to transfer 9,000 from niiyang's account to arnaud's involves both sites: -9,000 at one, +9,000 at the other.
47 17.4.1: An Example of a Distributed Database "Thanks." Global transaction: the transfer completes, leaving 2,000 in niiyang's account and 109,000 in arnaud's.
48 17.4.2: Implementation issues Atomicity: a transaction may execute at multiple sites. If it aborts at one site, it must abort at all sites; if it commits at one site, it must commit at all sites. Solution: the two-phase commit protocol (2PC).
49 17.4.2: Implementation issues Atomicity, 2PC. (Diagram, phase 1: the coordinator sends "prepare T" to sites S1, S2, and S3, each executing transaction T.)
50 17.4.2: Implementation issues Atomicity, 2PC. (Diagram, phase 2: each site writes <ready T> to its log and replies "ready T"; once all replies arrive, the coordinator writes <commit T> and tells the sites to commit.)
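The two diagrams correspond to the two phases of 2PC. A minimal single-process sketch, with Participant standing in for a remote site (a real system would ship these calls over the network and handle timeouts and site recovery):

```python
class Participant:
    def __init__(self, name):
        self.name, self.log = name, []
    def prepare(self, txn):
        self.log.append(("ready", txn))   # force <ready T> to the log
        return True                       # vote yes (a site may vote no)
    def commit(self, txn):
        self.log.append(("commit", txn))
    def abort(self, txn):
        self.log.append(("abort", txn))

def two_phase_commit(coordinator_log, sites, txn):
    # Phase 1: ask every site to prepare; any "no" (or a timeout) aborts T.
    votes = [s.prepare(txn) for s in sites]
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append((decision, txn))   # <commit T> / <abort T>
    # Phase 2: propagate the decision, so T ends the same way everywhere.
    for s in sites:
        (s.commit if decision == "commit" else s.abort)(txn)
    return decision

sites = [Participant("S1"), Participant("S2"), Participant("S3")]
print(two_phase_commit([], sites, "T"))   # -> commit
```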