
Unsafe Data's Impact on Uberlingen Mid-Air Collision
Explore the role of unsafe data in the tragic mid-air collision over Uberlingen on July 1, 2002, shedding light on the lack of focus on data safety and its consequences in aviation incidents. References and analysis provide insights into the importance of data in aviation safety management.
Presentation Transcript
THE ROLE OF UNSAFE DATA IN THE MID-AIR COLLISION OVER UBERLINGEN, 1ST JULY 2002 (A personal view)
REFERENCES
1. Incident Analysis, Causalis Limited, author Jorn Stuphorn, September 2007
2. Investigation Report, Bundesstelle fur Flugunfalluntersuchung (BFU), May 2004
3. Review of the BFU Uberlingen Accident Report, author Professor Chris Johnson, Glasgow University, December 2004
4. A STAMP Analysis of the LEX COMAIR 5191 Accident, author Paul S. Nelson, Lund University, June 2008
THE MYTH OF RUNNING LIKE CLOCKWORK
EXTRACTED FROM AVIATION SAFETY MANAGEMENT IN SWITZERLAND (2003): http://www.nlr-atsi.nl/downloads/aviation-safety-management-in-switzerland.pdf
"Over the last five years the Swiss aviation sector has been struck by a number of severe aviation accidents. The tragic sequence of accidents started with the crash of a SwissAir MD-11, in Halifax, in 1998. This was followed by a fatal accident with a Crossair Saab 340 near Nassenwil in January of 2000, and a Crossair Avro 146 RJ 100 near Bassersdorf in November of 2001. Finally, on July 1, 2002, two large civil aircraft crashed near Überlingen (Germany) after a mid-air collision in airspace controlled by Skyguide."
THIS PRESENTATION SHOWS THAT A NUMBER OF THE PROBLEMS LEADING TO ONE OF THE ACCIDENTS ARE DUE TO A LACK OF FOCUS ON DATA SAFETY
Note that everything else the SCSC have considered of late, including the lack of safety cultures and analysis outside the physical system, must still be part of the design analysis. Data safety analysis is not a panacea.
PRESENTATION FORMAT
- MOD interest in Data Safety: revised Def Stan 00-55
- Brief recap on the collision and components
- Examine why we are looking at data and the alternatives
- The factors involved in the collision
- Looking at the DSIWG guide: Data Types and Properties
- The BFU recommendations and Professor Johnson's
- Conclusions
A revision of Def Stan 00-55 highlighting data issues
- Revision finalised in the past month
- Now includes data safety in Annex E (formerly X)
- Derives from Def Stan 00-56, which states specifically that data is part of the computer system
- Scope does not explore outside the system (i.e. it is not Leveson's STAMP methodology)
BRIEF ON THE UBERLINGEN CRASH
- Collision occurred at 21.35 on 1st July in clear skies over Southern Germany
- Tupolev TU154 carrying 60 passengers, mostly school children on an exchange visit to Barcelona
- Boeing 757 cargo aircraft with a crew of 2
- ATC working on 2 workstations using two radar screens of different scale and two separate radios, with a phone failure, without full radar computer facilities (no visual STCA), and with the backup controller resting
Uberlingen Accident Components
The question is how to model which interactions happened and which did not happen that should have.
Note the multiple ATCO data sources:
- Control strip
- Phones (3)
- Radio
- STCA audio
- STCA visual
- Radar screen (2)
Diagram labels: "Usual limited SCSC area of concern"; "Major point of failure"
Reason's Model of Sequential Events
- An accident occurs when the Swiss cheese holes line up (synchronous)
- Always true
- Easy to visualise in post-accident analysis
- Difficult to use in preventing accidents (unless you are the world's leading expert on permutations and combinations and own a large Swiss cheese factory)
Treating the causes of accidents as part of an asynchronous system outside the physical computer.
- Qureshi argues that "sequential and epidemiological (the study of patterns and causes) accident models are inadequate to capture the dynamics and non-linear interactions between system components in complex socio-technical systems".
- Leveson's Systems-Theoretic Accident Model and Processes (STAMP) argues for an accident model "based on basic systems theory concepts". Safety is treated as a dynamic control problem.
WHAT CONSTITUTES DATA IN THIS ANALYSIS
Using a similar concept to Qureshi and Leveson:
- The space in which the causes exist is considered as a system or system of systems (as a Systems Analyst might look at a company accountancy office in order to computerise it).
- Anything that might constitute data input to a single-purpose Air Traffic Control System of a hypothetical future, be it current phone technology, procedure manuals, training etc., is considered to be data.
THE FACTORS
The Causalis report lists 100 factors as contributory to the accident. Of these, applying the principle of what constitutes data (previous slide), at least 42 (arguably more, but some are virtual repeats) involve erroneous data as the DSIWG defines data categories. Some of these are raised in this presentation to illustrate the issues associated with data safety.
THE FACTORS 1
1. Reconfiguration of the air traffic control sectors also necessitated reconfiguring and reconnecting the telephones.
2. After 21.23, direct calls to neighbouring ATC were not possible on the usual phone, as both the phone and its substitute were not working. (They were back on a few minutes later.)
3. The ATCO moves to another desk (inadequate staffing an issue) to deal with a separate aircraft, an A320, landing at Friedrichshafen, but the handover fails due to the phone fault. He had not even noticed before leaving his desk that the B757 had reached FL 360.
THE FACTORS 2
4. By 21.34, neighbouring ATCs have made 3 calls to pass on data but none is answered, as the ATCO may have assumed the phone system was still not serviceable.
5. Controller data overload is demonstrated by the necessity to move to a second desk, neglecting the other, to deal with an incident in parallel. (Exacerbated by permission for the other controller to rest during the night, so he was not able to operate the other desk.)
6. Poor system design hid data: switching displays from the same position on the same screen would clearly have been a better idea.
THE FACTORS 3
7. No data was supplied by the maintenance schedule to remind the controller that a mobile phone was also available. (One could argue poor training, but as the mobile is so rarely used one can also argue that the expectation to retain such awareness is unrealistic.)
8. The consequence of looking at a different screen and focussing on the landing plane was neglect of the 2 planes approaching each other. The controller was trying to focus on contacting Friedrichshafen from the other desk.
THE FACTORS 4
9. The visual Short Term Conflict Alert (STCA) was not available due to the maintenance work, so the controller may have assumed all was well.
10. Automatic correlation of new targets did not occur when Radar isolation recommenced at 21:18, a full 17 minutes before collision.
11. The ATC gave an instruction contrary to the TCAS system alert instruction.
SOLUTIONS TO DATA ISSUES 1
1. A hazard analysis of data issues likely to arise due to the maintenance of the system could have output information for the controller on what to be aware of, e.g. mobile phone availability and the absence of the visual STCA.
2. The higher than normal volume of traffic on a night of mostly over-flights could have been planned for in design.
3. An auto-alert could be sent to the sleeping controller when a higher traffic level (likely to cause a single controller problems) occurs.
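Solution 1 above can be sketched in code. This is only an illustration, assuming a hypothetical lookup table that maps each degraded service to the notice the controller would need; the service names, the wording of the notices and the function maintenance_briefing are all invented for this sketch and do not come from any Skyguide system.

```python
# Hypothetical mapping from a degraded service to the notice a controller
# would need. The entries paraphrase factors listed in this presentation;
# the keys and wording are assumptions made for illustration only.
DEGRADED_SERVICE_NOTICES = {
    "visual_stca": "Visual STCA unavailable: conflicts will not be flagged on screen.",
    "direct_phone_lines": "Direct lines to neighbouring ATC unavailable: a mobile phone is available.",
    "radar_correlation": "Automatic correlation of new targets may not occur.",
}


def maintenance_briefing(degraded_services):
    """Return one notice per degraded service, failing loudly on unknown ones."""
    notices = []
    for service in degraded_services:
        if service not in DEGRADED_SERVICE_NOTICES:
            # A service with no briefing text is itself a data hazard.
            raise KeyError(f"No briefing text defined for {service!r}")
        notices.append(DEGRADED_SERVICE_NOTICES[service])
    return notices


for notice in maintenance_briefing(["visual_stca", "direct_phone_lines"]):
    print(notice)
```

Raising an error for an unknown service, rather than silently skipping it, reflects the point of the slide: the absence of briefing data should be visible before the maintenance window starts.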
SOLUTIONS TO DATA ISSUES 2
4. Again, a data hazard analysis may have highlighted the lack of radar correlation that occurred when there was full system restoration at 21.18. The B757 reached FL360 at 21:29:50, 5 minutes before impact, which the controller did not take note of. He therefore did not issue an instruction to descend to flight level 350.
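The intervals quoted in these slides can be checked with a few lines of arithmetic. The sketch below assumes the collision time of 21.35 given earlier means 21:35:00 exactly (the seconds are an assumption), and the event names are labels chosen for this example.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Collision time as given in the slides (21.35); zero seconds is an assumption.
COLLISION = datetime.strptime("21:35:00", FMT)

# Event times quoted in the slides; the key names are illustrative labels.
EVENTS = {
    "system_restoration": datetime.strptime("21:18:00", FMT),
    "b757_reaches_fl360": datetime.strptime("21:29:50", FMT),
}


def minutes_before_collision(event_time):
    """Return how many minutes before the collision an event occurred."""
    return (COLLISION - event_time).total_seconds() / 60


for name, when in EVENTS.items():
    print(f"{name}: {minutes_before_collision(when):.1f} min before collision")
```

Under that assumption the two intervals come out at 17 minutes and a little over 5 minutes, matching the figures quoted in the slides.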
The categories of Data in DSIWG guide
From 1-7, the development issues (DATA TYPE: RELEVANCE):
1. Prediction: Data used to model the system
2. Assumption: Data used to provide context
3. Requirements: Data used to specify the system
4. Interface: Data used to enable interfaces
5. Design and Development: Data produced in the development process
6. Verification: Data used to test the system
7. Configuration: Data used to configure the system (A400 crash)
Reference will be made to these numbers in the next slides.
The categories of Data in DSIWG guide
From 8 onwards, the implementation (DATA TYPE: RELEVANCE):
8. Application: Data processed or produced by the system
9. Instructional: Data used to warn, train or instruct the user
10. Release: Time-specific data, e.g. workarounds
11. Operational: Data produced from operations, e.g. logs
12. Evolution: Maintenance issues
13. System: Includes maintenance requirements
14. Justification: Data used to justify safety claims
15. Staffing and Training: Staff training, competency, certification etc.
16. End of Life: Taking the system out of service
17. Investigation: Data to support accident investigations
18. Standards and Regulatory: Approaches and processes to develop safe system
19. Reference or Look-up: Data used across multiple systems
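Because later slides refer to these categories by number only, a small lookup makes the references easier to read. The following is a minimal sketch: the numbers and names are copied from the two lists above, while the dictionary, the helper functions and the "development"/"implementation" labels are conveniences invented here, not part of the DSIWG guide.

```python
# Data type numbers and names as listed in the slides above.
DSIWG_DATA_TYPES = {
    1: "Prediction", 2: "Assumption", 3: "Requirements", 4: "Interface",
    5: "Design and Development", 6: "Verification", 7: "Configuration",
    8: "Application", 9: "Instructional", 10: "Release", 11: "Operational",
    12: "Evolution", 13: "System", 14: "Justification",
    15: "Staffing and Training", 16: "End of Life", 17: "Investigation",
    18: "Standards and Regulatory", 19: "Reference or Look-up",
}


def phase(type_number):
    """Types 1-7 are development issues; 8 onwards are implementation."""
    if type_number not in DSIWG_DATA_TYPES:
        raise ValueError(f"Unknown DSIWG data type: {type_number}")
    return "development" if type_number <= 7 else "implementation"


def describe(type_numbers):
    """Expand a list of type numbers into readable labels."""
    return [f"{n} {DSIWG_DATA_TYPES[n]} ({phase(n)})" for n in type_numbers]


for label in describe([3, 9, 12]):
    print(label)
```

For example, a reference to "Data Safety Type 3, 9 and 12" expands to Requirements, Instructional and Evolution.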
Uberlingen Accident Components
The question is how to model which interactions happened and which did not that should have.
Note that here the pilot is included in the analysis of the system, i.e. the Leveson approach.
Diagram label: "Major point of failure"
Data Factors in the Causalis Analysis 1
(Columns: Causalis Analysis factor number; Description; Data problem; DSIWG Guide Data Type)

Factor 12
  Description: TU 154 descends when TCAS says climb
  Data problem: Air Traffic Controller gave instruction contrary to TCAS
  DSIWG Guide Data Type: 8

Factor 23
  Description: Controller says conflicting traffic is at 2 o'clock but TU154 sees traffic at 10 o'clock
  Data problem: ATC gives erroneous data as guidance
  DSIWG Guide Data Type: 2, 3, 4, 5, 8 and 15

Factor 31
  Description: Only one analogue channel for radio, meaning both planes listening to same instructions
  Data problem: Easy confusion in an emergency situation. Insufficient bandwidth to accommodate separate data channels
  DSIWG Guide Data Type: 3, 4, 8 and 9

Factor 44
  Description: Controller had the duties of Planning Radar Executive and Radar Controller approach at the same time
  Data problem: Data overload in an emergency situation with a priority third aircraft on landing elsewhere
  DSIWG Guide Data Type: 1, 2, 3, 9 and 15

Factor 52
  Description: Control strips do not warn on crossing flight routes
  Data problem: The system fails to provide data to the controller
  DSIWG Guide Data Type: 8

Factor 55
  Description: Acoustic STCA not heard by staff. Only used at night and may not have been switched on.
  Data problem: Process failure. Absence of any switch-on confirmation log, design failure.
  DSIWG Guide Data Type: 11 and 15

Factor 56
  Description: High work load of controller. Had two consoles but was not aware of assistance available. Had priority of handover of aircraft on approach to Friedrichshafen
  Data problem: Data overload. Erroneous system design priorities possibility. Training data or correct process data would have mitigated lack of awareness.
  DSIWG Guide Data Type: 2 and 15 (appalling example of latter)

Factor 63
  Description: Optical STCA not available due to maintenance
  Data problem: Failure to deliver data. Absence of essential warning data. Maintenance procedure data inadequate. Design process failure.
  DSIWG Guide Data Type: 3, 9 and 12
Data Factors in the Causalis Analysis 2
(Columns: Causalis Analysis factor number; Description; Data problem; DSIWG Guide Data Type)

Factor 64
  Description: Maintenance meant fallback radar computer in operation but it does not engage in data fusion with the system flight plan data
  Data problem: Data fusion impossible so correlation does not occur, leading to absence of essential data
  DSIWG Guide Data Type: 8, 10, 11, 12, 13 and 15

Factor 67
  Description: Aerolloyd handover fails and controller does not use any of the three telephone systems.
  Data problem: Failure to deliver data. Failure of maintenance to inform of restoration of service. Failure of training to provide retained knowledge of mobile phone availability.
  DSIWG Guide Data Type: 8, 13, 15

Factor 80
  Description: Second controller goes for rest as common and tolerated
  Data problem: Failure to provide alarm channel by which data overloaded controller could summon assistance of resting/sleeping controller. Management permission factor
  DSIWG Guide Data Type: 2, 3, 11, 15, 18

Factor 88
  Description: System geared for usual low volume of traffic with only transit flights
  Data problem: Data handling in exceptional circumstances not considered in the design
  DSIWG Guide Data Type: 1, 2, 5, 18

Factor 98
  Description: Emergency handbook lists three phones available
  Data problem: Poor training as controller not aware. Lack of emergency procedure data awareness.
  DSIWG Guide Data Type: 9 and 15

Factor 100
  Description: Direct lines to neighbouring ATCs not available for significant period of the developing situation
  Data problem: Lack of emergency process data formally alerting at the time of necessary procedures and reminding of back-ups
  DSIWG Guide Data Type: 9, 12 and 15
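One way to summarise the two factor tables is to count how often each DSIWG data type is cited. The sketch below is illustrative only: the factor numbers and type lists are transcribed from the two slides above, and because those tables were flattened in this transcript the pairing of each factor with its type list is a best reading rather than a verified one. The names FACTOR_DATA_TYPES and tally_data_types are invented for the example.

```python
from collections import Counter

# Causalis factor number -> DSIWG data types cited against it in the slides.
# The row alignment is reconstructed from a flattened table (an assumption).
FACTOR_DATA_TYPES = {
    12: [8],
    23: [2, 3, 4, 5, 8, 15],
    31: [3, 4, 8, 9],
    44: [1, 2, 3, 9, 15],
    52: [8],
    55: [11, 15],
    56: [2, 15],
    63: [3, 9, 12],
    64: [8, 10, 11, 12, 13, 15],
    67: [8, 13, 15],
    80: [2, 3, 11, 15, 18],
    88: [1, 2, 5, 18],
    98: [9, 15],
    100: [9, 12, 15],
}


def tally_data_types(factor_types):
    """Count how many factors cite each DSIWG data type."""
    counts = Counter()
    for types in factor_types.values():
        counts.update(set(types))  # count each type once per factor
    return counts


counts = tally_data_types(FACTOR_DATA_TYPES)
for data_type, n in counts.most_common(3):
    print(f"Data type {data_type}: cited by {n} factors")
```

On this reading, type 15 (Staffing and Training) is cited most often among the factors shown, followed by type 8 (Application), which is in line with the emphasis on training in the later slides.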
Properties of Data
It is very important that designers examining a system from a data perspective understand that, if the system is to be safe, certain properties must hold for its data. When data is declared as contributing to the safety of a system, during an analysis following the use of this guide, that will only be the case if the properties hold.
Property
1. Integrity
2. Completeness
3. Consistency
4. Format
5. Accuracy
6. Resolution
7. Traceability
8. Timeliness
9. Verifiability
10. Availability
11. Fidelity
12. Priority
13. Sequencing
14. Usage
15. Accessibility
16. Non-accessibility
17. History
18. Lifetime
19. Disposability
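The rule that data only contributes to safety if its properties hold can be expressed as a simple check. This is a minimal sketch, not the DSIWG guide's method: the property names come from the list above, but the DataItem class, its fields and the example item are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

# Property names as listed in the slides above.
DATA_PROPERTIES = {
    "Integrity", "Completeness", "Consistency", "Format", "Accuracy",
    "Resolution", "Traceability", "Timeliness", "Verifiability",
    "Availability", "Fidelity", "Priority", "Sequencing", "Usage",
    "Accessibility", "Non-accessibility", "History", "Lifetime",
    "Disposability",
}


@dataclass
class DataItem:
    """A hypothetical data item with the properties it needs and those shown to hold."""
    name: str
    required: set
    verified: set = field(default_factory=set)

    def __post_init__(self):
        unknown = (self.required | self.verified) - DATA_PROPERTIES
        if unknown:
            raise ValueError(f"Unknown properties: {sorted(unknown)}")

    def missing(self):
        """Required properties that have not been shown to hold."""
        return sorted(self.required - self.verified)

    def supports_safety_claim(self):
        """The item only contributes to safety if every required property holds."""
        return not self.missing()


# Illustrative example: an alert whose availability was never confirmed.
alert = DataItem(
    name="visual STCA alert",
    required={"Availability", "Timeliness", "Accuracy"},
    verified={"Accuracy"},
)
print(alert.supports_safety_claim(), alert.missing())
```

Which properties a given data item requires is a judgement for the analyst; the sketch only shows that a safety claim should fail as soon as one required property is unverified.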
Data Properties example
COMAIR crash cause: take-off on the wrong, short runway.
10 years prior, the FAA's 6 principles on NOTAM data stated: "... text ... shall be written in clear simple language".
NOTAM: Circling Minimums: MDA 1580/HAA 601 All CATS. VIS CAT C 1 ALTITUDE AT HYK 5.00 DME 1580. TEMPORARY FAS CONTROLLING OBSTACLE 1240 MSL/205 AGL TOWER AT 380007.65N-0843132.58W
Data Properties
What this actually meant, when translated to the FAA standard: "Due to a tower 205 above ground level and 1240 MSL, located 4.6 DME from the Lexington VOR (HYK) just north of the final approach course, the approach minimums are increased as follows."
The 4+1 principles of Data Safety
1. Data safety requirements shall be defined to address the data contribution to system hazards (System as per Leveson et al)
2. The intent of the data safety requirements shall be maintained throughout requirements decomposition
3. Data safety requirements shall be satisfied
4. Hazardous system behaviour arising from the system's use of data shall be identified and mitigated (System as per Leveson et al)
5. The confidence established in addressing the data safety principles shall be commensurate to the contribution of the data to system risk (System as per Leveson et al)
BFU SAFETY RECOMMENDATIONS
There are 19 recommendations at the end of the BFU report. Here we look at some that are relevant to data integrity (emphasising again that we are looking at the incident from a Systems Analyst's point of view). They show a focus on management failings of the whole system rather than on computer failings, but they do not cover all the problems.
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 18/2002
ICAO should change the international requirements in Annex 2, Annex 6 and PANS-OPS (DOC 8168) so that pilots flying are required to obey and follow TCAS resolution advisories (RAs), regardless of whether a contrary ATC instruction is given prior to, during or after the RAs are issued.
A problem of poor design and training with ambiguous TCAS manuals (Data Safety Type 1, 2, 3, 5, 6 and 15) and lack of awareness by ATC that TCAS had given instructions.
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 01/2003
The Federal Office for Civil Aviation (FOCA) should ensure that the air traffic control service provider issues and implements procedures to undertake maintenance work on the ATC systems, stipulating operational effects and available redundancies.
A problem of poor training and maintenance procedures (Data Safety Type 3, 6, 9, 10, 12, 13 and 15).
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 02/2003
The FOCA should ensure that ACC Zurich is manned with a minimum number of air traffic controllers.
A problem of poor training and maintenance procedures (Data Safety Type 2, 13 and 15).
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 03/2003
The FOCA should ensure that the air traffic controllers are given initial and recurrent training covering the theoretical and practical (simulator) emergency procedures.
A problem of bad assumptions, bad design and poor training again (Data Safety Type 1, 2, 3, 5, 6 and 15).
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 16/2004
Utilizing its own mechanism and international resources available, ICAO should ensure that all ACAS/TCAS users are consistent in their response to the equipment advice.
A problem of poor training and ambiguous TCAS manuals (Data Safety Type 1, 2, 3, 4, 5, 6 and 15) and lack of awareness by ATC that TCAS had given instructions.
BFU SAFETY RECOMMENDATIONS
RECOMMENDATION Number 09/2004
"To improve the investigation of future accidents ... Require ATS units ... to be equipped with a recording device."
A recommendation highlighting the need to understand what has gone wrong when an accident does occur, about which there are still some doubts in this case (Data Safety Type 1, 2, 3, 5, 6, 10 and 11).
Professor Chris Johnson's report
An interesting conclusion from Prof. Chris Johnson's report: "The BFU report provided few insights into the risk assessment procedures that should be used before any similar upgrades to ATC computers are attempted."
His report acknowledges the focus of the BFU report on:
- the control room staffing and operation
- the conflict between the TCAS and ATC instructions
However, it goes where the BFU does not, into why there was not adequate preparation for the extensive technical procedures that deprived the controllers of necessary support, creating as it did an error-prone debacle. (In DSIWG terms, primarily Maintenance and Training data.)
Conclusions 1
- Data safety is crucial but is not a panacea.
- Use of the DSIWG Guide is recommended for analysis of both the computer and the processes outside the computer involving utilisation, input and output of the data.
- Accident investigations can miss important factors if the whole system through its whole life is not examined. The same follows for system design.
- Data safety analysis is a big contribution to the more comprehensive systems of analysis currently proposed, examining systems in a socio-psychological-technological context.
Conclusions 2
- The MOD's next edition of Def Stan 00-55 includes a much greater focus on data safety and an Annex E dedicated to it.
- The analysis does not stretch outside the computer system, but culturally the UK Armed Forces have the best and most rigorous training and maintenance regimes anywhere on the planet, so the primary focus for improving the safety of data outside the system will probably be, for the moment, the civilian world, where less rigour is applied.