High Availability Strategy with Microsoft SQL Server 2008

slide1 n.w
1 / 42
Embed
Share

Discover how to implement a high availability strategy for your enterprise using Microsoft SQL Server 2008. Learn about disaster recovery, SQL Server Always On technologies, and developing your availability solution. Understand key terms, business continuity planning, SLAs, recovery objectives, and protection levels against various failures and disasters.

  • SQL Server
  • High Availability
  • Disaster Recovery
  • Microsoft
  • Enterprise

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008 Matt Hollingsworth Principal Program Manager Microsoft Corporation DAT303

  2. Agenda Introduction to High Availability and Disaster Recovery SQL Server Always On Technologies Developing Your Availability Solution Conclusion

  3. Introduction to High Availability and Disaster Recovery Definitions Introduce key terms and concepts Business Continuity Planning Overview of the BCP process SQL Server High Availability Planning How does BCP apply to SQL Server availability?

  4. High Availability and Disaster Recovery: Definition High Availability High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period Disaster Recovery Processes and procedures designed to restore business operations due to a natural or human-induced disaster Typically involves providing redundancy spanning multiple sites or across geographic regions Availability defined in terms of service level agreements (SLA) Recovery Time Data loss during unplanned downtime A highly available application should be accessible by users x% of the time

  5. Defining x and SLA Recovery Time Objective (RTO) guided by availability requirements How much downtime can you tolerate? Availability Class Acceptable Downtime (hrs/yr) OR RTO Acceptable Data Loss (time of last copy) OR RPO >99.99% (1 hr or less) 5 min or less Tier 1 99.9% - 99.99% (1- 8.5 hrs) 5 mins to 8.5 hrs Tier 2 Recovery Point Objective (RPO) guided by criticality of application data How much data can you lose? (<99.9%) (Hours to days) Hours to days Tier 3 RPO Tier1 RTO

  6. Protection Levels Protection against resource failures Machine Database Corruption Disk Regional DR Location Redundancy Building < 10 miles Geographic DR Protection against Natural Disasters Location Redundancy State, Country > 100-200 miles Protection against Network Outages Site Failures Location Redundancy City, County < 100-200 miles Local HA

  7. Business Continuity Planning Impact Analysis Critical Functions Threat Identification Recovery Objectives Solution Design Achieve recovery objectives for relevant threats within specified constraints like budget, human resources etc Cost\Benefit analysis of solutions Implementation Deploy the recommended solution Testing Test to see if the solution meets the recovery requirements Maintenance Yearly testing and review of procedures Analysis Solution Design Maintenance Implementati on Testing

  8. SQL Server High Availability Planning Analysis Application tiers serviced by the databases Causes of database downtime Protection levels: Local HA, Regional DR, Geographic DR Solution Design Need to understand what solutions exists? What are the characteristics and cost of the solution? Implementation What are the deployment steps and best practices? Testing How do I test my implementation? Maintenance How do I monitor and maintain the solution? Analysis Maintenance Solution Design Testing Implementation

  9. Database Downtime Drivers Analysis Failure Protection Unplanned Downtime User Errors Database Downtime Online Administration Planned Downtime Predictable Resourcing

  10. Solution Design Solution Design Understand the solutions and choices before making a decision Solution Architecture HA Capabilities Limitations and Caveats Cost Vector

  11. Solution Design SQL Server Always On Technologies

  12. Always On Technologies Solution Design Provides a full range of options to minimize downtime and maintain appropriate levels of application availability Backup and Restore Log Shipping Database Mirroring Failover Clustering Peer-Peer Replication Increases Availability Online Index Operations Table Partitioning Enhanced Locking Resource Governor Database Snapshot Dedicated Admin Connection Dynamic Configuration Decreased Downtime

  13. Always On Technology Overview Solution Design Architecture Overview How does it work? Solution Characteristics Data Loss Guarantees Failover Characteristics Redundancy Levels and Utilization Cost Limitations and Caveats Backup and Restore Log Shipping Database Mirroring Failover Clustering Peer-Peer Replication Increases Availability

  14. Backup and Restore Solution Design Base availability technology for any solution Protects against failures and recovery from errors Provides Local HA and Site DR Need to ensure the backups are accessible if site goes down High RTO due to restore time RPO=0 can never be guaranteed Types: Full, Differential, and Transaction Log File-group backup/restore for large databases Backup Compression provides faster and smaller backups in SQL Server 2008

  15. Log Shipping Solution Design Automated transaction log backup and restore provides redundancy at the database level SQLLogship.exe provides the underlying framework for doing automated backup, copy and restore Backup on primary instance Restore on secondary instance(s) Scheduling is done through SQL Server Agent jobs SQL Server 2008 provides sub-minute scheduling interval providing the ability to do quick backup and restores No automatic failover capabilities

  16. Database Mirroring Solution Design A database level high availability solution that provides complete protection against data loss and fast recovery through automatic failover Maintains a redundant database by shipping log blocks when the transactions are committed on the principal Synchronous and Asynchronous modes provide the spectrum of options to choose between availability and performance Automatic failover when using witness server

  17. Failover Clustering Solution Design Instance level protection built on Windows Failover Clustering shared disk model Cluster nodes typically co-located within the same site to provide local HA Regional DR possible using VLAN and stretch storage level replication No built in data redundancy like database mirroring and log shipping Data protection has to be provided at the storage level or by combining with other solutions

  18. Transactional Replication Solution Design A high performance data replication solution that provides granular table level replication Logical data movement provides flexibility and better hardware utilization Key scenarios: Customized application-specific DR Real-time reporting on secondary server that be used for Site DR Scale out application queries with ability to use any one database copy for Site DR Two types relevant for HA and DR Transactional and Peer-to-Peer

  19. Always On Solution Characteristics Solution Design Redundancy and Utilization RPO Failover Cost App Perf Impact Manag- eability Solutions No Data Loss (RPO=0) Failover Unit Auto Failover (RTO) Read Mult- iple Write Hard- ware Inst DB Tab * Log Shipping Low Low Low DBM Sync * Low High Low + ** Async * Low Low Low Cluster High*** Low *** Low*** Low Low High Transactional Replication Peer-Peer Replication Low Low High * Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively ** Database Mirroring provides fastest failover to hot secondary *** Depends on SAN technology

  20. Whats New in SQL Server 2008 New Features Resource Governor Manage SQL Server workloads and resources by specifying limits on resource consumption Feature Enhancements Database Mirroring Automatic recovery from page corruption Log stream compression Faster recovery on failover Log Shipping Sub-Minute Log Shipping Backup compression Failover Clustering 16 nodes Rolling upgrade Peer-Peer Replication Hot add new nodes Backup Compression Reduce backup and restore time

  21. Backup Compression Common questions: How much compression will I see? Will it be comparable to, say, SQL Litespeed? We saw an 85 percent reduction in file size using SQL Server 2008 Backup Compression, says Colin Neller, Senior Software Engineer at ServiceU and part of the company s SQL Server 2008 implementation team. A backup file that was previously over 300 GB is now only 40 GB, and the job runs in about half the time. One simple answer: It depends! All data compresses differently the compression ratio achieved depends on: The type of data in the database Whether the data in the database is already compressed Whether the data/database is encrypted

  22. Backup Compression: Backup Performance Backup of a 322 MB Adventureworks database Uncompressed Compressed A LOT more CPU used (avg 25%) BUT runtime = 21.6s (45% improvement) and backup stored in 76.7MB (4.2x compression ratio) Hardly any CPU used (avg 5%), runtime = 39.5s, compression ratio of 0.

  23. demo DEMO: Increasing Availability Using Always On Technologies

  24. Solution Design Developing Your Availability Solution using SQL Server Always On Technologies

  25. Recap Solution Design Application availability requirements or SLA drive primary solution choices RPO and RTO are the key metrics used to define the SLA Application Availability Unplanned downtime Planned Downtime Need mitigation against planned and unplanned downtimes Multiple solution choices that provides varying cost\benefits Database Mirroring Clustering Other requirements apart from application SLA factor into the choice Peer-Peer Replication Log Understand constraints and tradeoffs you can make Shipping

  26. Always On Solution Characteristics Solution Design Redundancy and Utilization RPO Failover Cost App Perf Impact Manag- eability Solutions No Data Loss (RPO=0) Failover Unit Auto Failover (RTO) Read Mult- iple Write Hard- ware Inst DB Tab * Log Shipping Low Low Low DBM Sync * Low High Low + ** Async * Low Low Low Cluster High*** Low *** Low*** Low Low High Transactional Replication Peer-Peer Replication Low Low High * Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively ** Database Mirroring provides fastest failover to hot secondary *** Depends on SAN technology

  27. AdventureWorks Inc Scenario Solution Design Adventureworks Inc is a manufacturing company that manufactures and sells bicycles across the world. There are a number of applications, some that are mission critical that run on multiple SQL Server Instances One datacenter located in Omaha Three applications Manufacturing Tier 1 Finance Tier 2 Scheduling Tier 3 Manufacturing application runs on a dedicated SQL Server 2008 Instance All other applications run on a second instance Availability of manufacturing application is critical Implement a solution at the lowest possible cost The DBA team is run by Darren who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure availability of all application databases in order to meet the application SLA

  28. Application Requirements Solution Design Applications Data Loss RPO=0 RTO in secs Failover Unit Auto Failover Read Multiple Sites Read Write Inst DB Tab Manufacturing Finance Scheduling Manufacturing application has strict SLA s Finance application requires readability on the secondary The reports are run every 4 hours and need to be fresh as of the last one hour. To offload the reporting load from the main system they would like to utilize the mirror

  29. Solution Choice for Manufacturing Application Solution Design Solutions Data Loss RPO=0 Fast RTO Failover Unit Auto Failover Read > 1 Sites\ Copy Read Write Inst DB Tab Cluster SAN Replication DBM - Sync Clustering can provide a zero data loss solution that can also provide fast instance level failover Use RAID configuration to provide data redundancy on the SAN If a redundant copy is required that can provide instance failover with zero data loss use SAN replication High Cost Solution Use synchronous database mirroring if instance failover is not needed DBM - Async Log Shipping Transactional Replication Peer-Peer Replication Clustering with RAID

  30. Solution Choice for Finance Application Solution Design Solutions Data Loss RPO=0 Fast RTO Failover Unit Auto Failover Read > 1 Sites\ Copy Read Write Inst DB Tab Cluster SAN Replication DBM - Sync DBM - Async For database level redundancy with acceptable data loss with minimal perf impact, asynchronous database mirroring is an optimal choice Replication Log Shipping Transactional Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting Reports Peer-Peer Replication Finance Scheduling Db Snapshot every hour Low cost solution Async Database Mirroring Omaha Datacenter

  31. Adding a Regional Datacenter Into the Mix Solution Design Regulatory and compliance requirements drive the need for having a additional datacenter within a 10 mile radius to provide redundancy against site level failure. It is now required that all applications have the ability to failover to the regional datacenter across the river in Council Bluff The SLA need to be maintained for tier 1 applications even in the case of site failures

  32. Regional Site Solution Choices Solution Design Manufacturing Cluster with SAN Sync Mirroring no witness Reports Finance Scheduling Db Snapshot every hour Async Database Mirroring Log Shipping CB Datacenter Omaha Datacenter

  33. A Complete Topology Solution Design Considering the potential of floods and tornadoes destroying the regional data centers, Adventureworks Inc wants to maintain a disaster recovery site in San Antonio, TX The disaster recovery site has lower SLA requirements for all applications The manufacturing application can have an RPO of 1 hour The RTO is set at 4 hours

  34. Topology Diagram Solution Design Sync Mirroring No witness Manufacturing Cluster with SAN Log Shipping

  35. Scale Out and Availability Scenario Solution Design Requirements Geo Redundancy Data Locality High Availability Local Read-Scale Workload Characteristics Mainly reads Few writes Application Characteristics Each user logging in connects to a particular server Adventureworks is building a new web based order management system that allows customers from all over the world access the system and place orders The core group of customers are in Western Europe, South East Asia and North America Partitioned based on user-id and region Writes from a user always happen on one server regardless of the region the user log in from All reads redirected to the closest geo- location Reasonable tolerance for latency (5-10 minutes)

  36. Replication Topology Solution Design Asia1 Asia2 Peer Nodes Read-Only Servers

  37. demo Implementing and maintaining a HA solution

  38. Licensing Facts Passive servers are mirror, log shipped secondary and clustering passive node No license required on passive if it is truly passive A passive server does not need a license if the number of processors in the passive server is equal to or less than the number of processors in the active server. The passive server can take the duties of the active server for 30 days. Afterwards, it must be licensed accordingly.

  39. HA Features Edition Support Feature Express Workgroup Standard Enterprise Comments Advanced high availability solution that includes fast failover and automatic client redirection 1 Database Mirroring 2 Failover Clustering Data backup and recovery solution Backup Log-shipping Includes Hot Add Memory, dedicated administrative connection, and other online operations Online System Changes Online Indexing Online Restore Database available when undo operations begin Fast Recovery Single thread redo Limited to 2 node cluster

  40. Summary There is no one size fits all solution Consider the cost\benefits\constraints and compare that to availability requirements of the organization to determine the best solution Use the charts to understand cost, benefit and constraints of the various SQL Server High Availability solutions TEST the solution to ensure it can meet the availability requirements and meet SLA s

  41. question & answer

Related


More Related Content