
Health IT Systems: Fault Tolerance, Backups, Decommissioning Lecture
Learn about creating fault-tolerant systems, backups, and decommissioning in health IT. Understand the importance of redundancy, fault tolerance, and system reliability. Discover strategies for backup and restoration and the impact of system failure on clinical activities and administrative processes.
Presentation Transcript
Installation and Maintenance of Health IT Systems
Creating Fault-Tolerant Systems, Backups, and Decommissioning, Lecture a
This material (Comp 8 Unit 9) was developed by Duke University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number IU24OC000024. This material was updated in 2016 by The University of Texas Health Science Center at Houston under Award Number 90WT0006. This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.

Creating Fault-Tolerant Systems, Backups, and Decommissioning: Learning Objectives
1. Define availability, reliability, redundancy, and fault tolerance (Lecture a)
2. Explain areas and outline rules for implementing fault-tolerant systems (Lecture a)
3. Perform risk assessment (Lecture a)
4. Follow best practice guidelines for common implementations (Lecture b)
5. Develop strategies for backup and restore of operating systems, applications, configuration settings, and databases (Lecture c)
6. Decommission systems and data (Lecture c)

Redundancy and Fault Tolerance
Dependence on EHRs is increasing.
EHR systems require redundant, or failover, resources and fault tolerance to ensure uptime and data integrity so that they can perform as specified.
Failure vs. fault: a fault is the cause of a failure, that is, of the system not complying with its specifications or precise requirements.
Fault tolerance is resilience in a system, or the ability to continue performing to specification despite problems.
Ask the vendor how fault tolerance is designed and coded into the EHR application.

Creating Fault Tolerance
Computer hardware: servers and workstations
Data storage: hard disks
Network and power: network switches and Internet access; mains, generators, batteries
Virtualization: isolation of the system from hardware
Redundancy: secondary or backup systems
Reliability: infrequent failure; redundant components
Availability: accessible when needed; no downtime
Available systems are reliable and accessible.

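The link between redundancy and availability on this slide can be illustrated with a small calculation. This is a minimal sketch, not part of the lecture: it assumes component failures are independent and that a single working copy is enough to keep the service up. The function name and the 99% figure are hypothetical.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components running in parallel.

    Assumption (not from the lecture): failures are independent and one
    working copy is sufficient, so the group is down only when every
    copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


# Hypothetical example: power supplies that are each available 99% of the time.
for n in (1, 2, 3):
    print(f"{n} supply/supplies -> availability {combined_availability(0.99, n):.6f}")
```

Under these assumptions each added copy multiplies the chance of total failure by the single-component failure rate. Real components often share failure causes (the same power feed, the same rack), so the true gain is usually smaller.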
System Failure and Downtime
Forrester Consulting report on server failure during the prior two years: experienced downtime.
Only 1% of server outages were resolved within five minutes.
68% had an impact on clinical activities.
50+% affected administrative processes.
How much downtime is acceptable? Answering this requires a good understanding of business processes.
Critical system downtime can have a significant negative impact on patient health.
(Forrester Consulting Report, 2010)

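One way to make "how much downtime is acceptable?" concrete is to convert an availability target into hours of downtime per year. The sketch below is illustrative only; the targets listed are common examples, not figures from the Forrester report, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted by an availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability_pct / 100.0)


# Hypothetical targets for comparison.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% available -> about {hours:.2f} hours of downtime per year")
```

Comparing these budgets with how long a clinic or ward can realistically work on paper helps decide which target a given system needs.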
Three Areas for Fault Tolerance
1. Hardware fault tolerance: compensate for hardware failure
  Often simplest to implement
  Extra hardware resources as secondaries or backups
  E.g., secondary network cards, error checking and correcting (ECC) memory, redundant power supplies, redundant disks / file storage
2. Software fault tolerance: compensate for poor programming or data
  Involves program verification (code review) and assertion checking
  Compensating for faults such as poorly formatted input data
  E.g., sanity check, double-entry comparison, and multiple-version programs
3. System fault tolerance: compensate for non-computer or inter-device failures
  Most complex, highest number of variables
  System may include facilities that are not computer-based
  E.g., detection of sensor failure, graceful reaction to intersystem communication failure, graceful shutdown in unexpected circumstances
(A Conceptual Framework for System Fault Tolerance - 1.1 What is a System?, 1995)

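Two of the software fault tolerance examples named on this slide, the sanity check and the double-entry comparison, can be sketched in a few lines. This is an illustration under assumptions: the heart-rate limits, function names, and normalization rules are hypothetical and are not taken from the lecture or from any EHR product.

```python
def sanity_check_heart_rate(value: str) -> int:
    """Parse a heart rate in beats per minute and reject implausible input."""
    try:
        bpm = int(value.strip())
    except ValueError:
        raise ValueError(f"not a whole number: {value!r}") from None
    # Hypothetical plausibility limits, chosen only for illustration.
    if not 20 <= bpm <= 300:
        raise ValueError(f"heart rate out of plausible range: {bpm}")
    return bpm


def _normalize(text: str) -> str:
    """Collapse whitespace and ignore letter case before comparing."""
    return " ".join(text.split()).casefold()


def double_entry_match(first: str, second: str) -> bool:
    """Return True when two independently typed entries agree."""
    return _normalize(first) == _normalize(second)


print(sanity_check_heart_rate(" 72 "))
print(double_entry_match("MRN 00123", "mrn  00123"))
print(double_entry_match("MRN 00123", "MRN 00132"))
```

Rejecting bad input at the point of entry keeps a data fault from becoming a failure later in the record, which is the purpose of this kind of check.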
Six Rules of Fault Tolerance
In A Conceptual Framework for System Fault Tolerance, the Center for High Integrity Software Systems Assurance summarizes six rules:
1. Know precisely what the system is supposed to do.
2. Look at what can go wrong.
3. Study your application and determine appropriate fault containment regions and the earliest feasible time to deal with potential faults.
4. Completely understand application requirements and use them to make appropriate time/space trade-offs.
5. Concentrate on credible faults first.
6. Determine application failure margins.
(A Conceptual Framework for System Fault Tolerance - 5 Putting It All Together, 1995)

Six Rules of Fault Tolerance (cont'd)
Rule 1: Know precisely what the system is supposed to do.
  How long can the system be allowed to deviate from specifications before being declared a failure?
  What abnormal conditions must be accommodated?
Rule 2: Look at what can go wrong.
  Group causes into classes.
  Define the fault floor.
(A Conceptual Framework for System Fault Tolerance - 5 Putting It All Together, 1995)

Six Rules of Fault Tolerance (cont'd 2)
Rule 3: Study your application and determine appropriate fault containment regions and the earliest feasible time to deal with potential faults.
  Fault tolerance generally means more resources (time and space).
Rule 4: Completely understand application requirements and use them to make appropriate time/space trade-offs.
  Consider costs, and classify faults by likelihood.
(A Conceptual Framework for System Fault Tolerance - 5 Putting It All Together, 1995)

Six Rules of Fault Tolerance (cont'd 3)
Rule 5: Concentrate on credible faults first.
  Ignore less likely faults unless they require little additional cost.
  Mitigate the most likely faults first.
Rule 6: Determine application failure margins.
  Balance the degree of fault tolerance needed with the cost of implementation.
  Does a small expenditure now save a great deal later?
(A Conceptual Framework for System Fault Tolerance - 5 Putting It All Together, 1995)

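Rules 5 and 6 can be read as a simple cost comparison. The sketch below is one possible interpretation, not the method of the cited framework: it assumes each fault has an estimated yearly probability and loss, and that mitigation removes the loss entirely. All names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    name: str
    yearly_probability: float  # estimated chance of occurring in a year
    loss_if_occurs: float      # estimated cost of one occurrence
    mitigation_cost: float     # estimated yearly cost of tolerating the fault

    @property
    def expected_yearly_loss(self) -> float:
        return self.yearly_probability * self.loss_if_occurs


def worth_mitigating(faults: list[Fault]) -> list[Fault]:
    """Keep faults whose expected loss exceeds mitigation cost, most likely first.

    Simplifying assumption: mitigation removes the loss entirely.
    """
    chosen = [f for f in faults if f.expected_yearly_loss > f.mitigation_cost]
    return sorted(chosen, key=lambda f: f.yearly_probability, reverse=True)


# Hypothetical figures for illustration only.
candidates = [
    Fault("disk failure", 0.30, 50_000, 2_000),
    Fault("power outage", 0.50, 20_000, 5_000),
    Fault("building destroyed", 0.000001, 5_000_000, 100_000),
]
for fault in worth_mitigating(candidates):
    print(f"{fault.name}: expected yearly loss {fault.expected_yearly_loss:,.0f}")
```

In practice the estimates are rough, and faults that threaten patient safety may justify mitigation even when a purely financial comparison says otherwise.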
Risk Assessment
Identify what is to be protected
  Examples: EHR server, or clinical record
  Include rating of importance
  Types of loss or liability
Identify risks to each component
  Examples: power failure, or record alteration
  Risk = Threat x Probability x Impact
  Intentional or accidental, human or system, internal or external
Identify mitigation strategies for each risk
  Examples: UPS with power monitoring, or automatic backup
  Policies (for people) or controls (for systems or equipment)
(Benson, n.d.; Maniscalchi, 2009)

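The formula Risk = Threat x Probability x Impact can be applied as a simple ranking exercise. The sketch below assumes each factor is rated on a 1 to 5 scale. That scale and the ratings shown are hypothetical; the slide gives only the formula and the example assets, risks, and mitigations.

```python
from typing import NamedTuple


class Risk(NamedTuple):
    asset: str
    description: str
    threat: int       # hypothetical 1-5 rating
    probability: int  # hypothetical 1-5 rating
    impact: int       # hypothetical 1-5 rating
    mitigation: str

    def score(self) -> int:
        """Risk = Threat x Probability x Impact, as given on the slide."""
        return self.threat * self.probability * self.impact


# Assets, risks, and mitigations are the slide's examples; ratings are made up.
register = [
    Risk("clinical record", "record alteration", 4, 2, 5, "automatic backup"),
    Risk("EHR server", "power failure", 3, 4, 5, "UPS with power monitoring"),
]

# Highest score first, so the most pressing risk is reviewed first.
for risk in sorted(register, key=Risk.score, reverse=True):
    print(f"{risk.score():>3}  {risk.asset}: {risk.description} -> {risk.mitigation}")
```

The scores only order the risks relative to one another; they carry no meaning outside the rating scale an organization agrees on.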
Creating Fault-Tolerant Systems, Backups, and Decommissioning: Summary, Lecture a
Fault tolerance is running despite problems.
  Implemented using redundancy to increase reliability and provide availability
  Three areas: hardware, software, and system
Risk assessment identifies assets, risks, and mitigation.

Creating Fault-Tolerant Systems, Backups, and Decommissioning: References, Lecture a
Benson, C. (n.d.). Security Planning. Available from: http://technet.microsoft.com/en-us/library/cc723503.aspx
Maniscalchi, J. (2009, June). Threat vs. Vulnerability vs. Risk. Available from: https://www.pinkerton.com/blog/risk-vs-threat-vs-vulnerability-and-why-you-should-know-the-differences/
Heimerdinger, W. L., & Weinstock, C. B. (1992, October). A Conceptual Framework for System Fault Tolerance. Retrieved from: http://www.sei.cmu.edu/reports/92tr033.pdf
Server Availability Trends In The Time Of Electronic Health Records. (2010, January). Forrester Research, Inc. Available at: http://www.stratus.com/assets/ServerAvailabilityTrends_EHR_ForresterPaper.pdf