Supporting the ARP4761 Safety Assessment Process with AADL
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213

Peter H. Feiler
Feb 6, 2014

© 2014 Carnegie Mellon University

Copyright 2014 Carnegie Mellon University

This material is based upon work funded and supported by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center.

NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN “AS-IS” BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

This material has been approved for public release and unlimited distribution.

This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at permission@sei.cmu.edu.

DM-0000942

ARP4761 Safety Process in AADL Feiler, Feb 6, 2014 © 2014 Carnegie Mellon University

Outline
• Safety Assessment Challenges
• Automation through Architecture Fault Modeling
• Supporting the Safety Assessment Process
• Example Application
• Conclusions

High Fault Leakage Drives Major Increase in Rework Cost
Aircraft industry has reached limits of affordability due to exponential growth in SW size and complexity.
[Figure: V-model of the development life cycle. Phases shown: Requirements Engineering, System Design, Software Architectural Design, Component Software Design, Code Development, Unit Test, Integration Test, System Test, Acceptance Test. Legend: where faults are introduced, where faults are found, the estimated nominal cost for fault removal. Value groups shown in the figure: 70%, 3.5%, 1x; 20%, 16%, 5x; 10%, 50.5%, 20x; 0%, 9%, 80x; 20.5%, 300-1000x.]
• 70% requirements & system interaction errors
• 80% late error discovery at high rework and repair cost
• Exceptional condition/fault handling is up to 80% of system functionality
• Software system is a hazard contributor
• Rework and certification is 70% of SW cost, and SW is 70% of system cost.
Sources:
• NIST Planning Report 02-3, The Economic Impacts of Inadequate Infrastructure for Software Testing, May 2002.
• D. Galin, Software Quality Assurance: From Theory to Implementation, Pearson/Addison-Wesley (2004).
• B.W. Boehm, Software Engineering Economics, Prentice Hall (1981).

Software-reliant System Safety Assessment Challenges
Safety assessments are rigorous and comprehensive reliability and safety design evaluations
• Required by industry standards and Government policies
• When performed manually are often done once due to cost and schedule
• Are primarily carried out during system engineering
Resulting challenges
• Years between repetition of safety assessment as system evolves
• Prone to inconsistencies with evolving architecture and between analyses
• Software system as a hazard contributor

SAE AADL Error Model Annex: Scope and Purpose
System safety process uses many individual methods and analyses, e.g.
• hazard analysis
• failure modes and effects analysis
• fault trees
• Markov processes
[Figure: architecture levels and what is captured at each. System: capture hazards. Subsystem: capture risk mitigation architecture. Component: capture FMEA model.]
Goal: a general facility for modeling fault/error/failure behaviors that can be used for several modeling and analysis activities.
Annotated architecture model permits checking for consistency and completeness between these various declarations.
Related analyses are also useful for other purposes, e.g.
• maintainability
• availability
• integrity
• security
SAE ARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment
Automation demonstrated in SAVI Wheel Braking System Example
Error Model Annex can be adapted to other ADLs

Error Model V2: Abstraction and Refinement
[Figure: Error Propagation Paths]
Four levels of abstraction:
• Focus on fault interaction with other components
  – Probabilistic error sources, sinks, paths and transformations
  – Fault Propagation and Transformation Calculus (FPTC) from York U.
• Focus on fault behavior of components
  – Probabilistic typed error events, error states, propagations
  – Voting logic, error detection, recovery, repair
• Focus on fault behavior in terms of subcomponent fault behaviors
  – Composite error behavior state logic maps states of parts into (abstracted) states of composite
• Types of malfunctions and propagations
  – Extensible fault ontology

Support of SAE ARP4761 System Safety Assessment Practice
[Figure: analyses supported by EMV2, with the tool or output format and the model elements each one uses]
• FHA (Spreadsheet): uses error sources
• FMEA (Spreadsheet): uses error flows & propagations
• FTA (CAFTA, OpenFTA): uses composite error behavior
• RBD/DD (OSATE plugin): uses composite error behavior
• Markov Chain (PRISM): uses error flows & behavior

Safety Practice in Development Process Context
[Figure: Interaction Between Safety and Development Processes, annotated "Safety assessment of SW System Architecture"]

Iterative Safety Analysis Process with AADL
Refinement of Architecture Layers/Tiers
• Functional Hazard Assessment
  – Architecture layer: System in Environment
  – Model content: Propagation points, hazards
  – Output: FHA Report
• Preliminary System Safety Assessment
  – Architecture layer: System as Subsystems
  – Model content: Error sources, propagations, flows, composite error state
  – Output: PSSA & other reports (FMEA, FTA, DD, MA)
• System Safety Assessment
  – Architecture layer: Subsystem Implementations
  – Model content: Redundancy logic, detection, composite error state
  – Output: SSA Report (FMEA, FMES, CCA)
Consistency is checked between successive steps.

Case Study with SAVI Industry Initiative
SAVI: System Architecture Virtual Integration
[Figure: Functional Architecture and System Architecture models, with the following labels]
• Refinement of Functional Architecture & Fault Model
• Consistency of Functional and System Fault Models
• Function Mappings Imply System Components as Common Error Source
• Refinement of System Architecture

Example System Architecture Fault Model
[Figure labels]
• System Description
• Annotated Architecture Model
• Component specific error behavior specification
• Library of Error Types
• Library of Error State Machines

Architecture Fault Modeling with EMV2
[Figure labels]
• Hazard description
• Error sources, propagation paths & sinks per component
• Functional hazard reports & fault tree analysis
• FMEA reports & fault impact visualization

Original Preliminary System Safety Analysis (PSSA)
[Figure: error propagation diagram. Components: EGI, Flight Mgnt System, Auto Pilot, FMS Processor, FMS Power. Connections: Airspeed Data, Actuator Cmd. Error states and propagations: Oper’l/Operational, Failed, NoData, NoService, Stall.]
• Anticipated: No EGI data
• Anticipated: NoService
• Anticipated: No Stall Propagation
System engineering activity with focus on failing components.
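As an illustration of the kind of check an architecture fault model enables, the sketch below looks for error types that one component propagates and the receiving component does not declare a response to. It is a minimal Python sketch, not EMV2 and not the OSATE analysis. The `Component` class, `unhandled_propagations`, `build_model`, the "Actuator" sink, and the EGI to Auto Pilot to Actuator wiring are all assumptions made for the example. Only the names EGI, Auto Pilot, NoData, NoService, and CorruptedData come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a component with error-propagation annotations."""
    name: str
    # destination component name -> set of error types propagated to it
    emits: dict = field(default_factory=dict)
    # incoming error types the component declares a response to
    handles: set = field(default_factory=set)


def unhandled_propagations(components):
    """Return (source, destination, error_type) triples the destination does not handle."""
    by_name = {c.name: c for c in components}
    findings = []
    for src in components:
        for dest_name, error_types in src.emits.items():
            for error_type in sorted(error_types):
                if error_type not in by_name[dest_name].handles:
                    findings.append((src.name, dest_name, error_type))
    return findings


def build_model(egi_errors):
    """Assumed wiring: EGI -> Auto Pilot -> Actuator (the Actuator sink is hypothetical)."""
    return [
        Component("EGI", emits={"Auto Pilot": set(egi_errors)}),
        Component("Auto Pilot", emits={"Actuator": {"NoService"}}, handles={"NoData"}),
        Component("Actuator", handles={"NoService"}),
    ]


# Model with only the anticipated propagation from the EGI.
baseline = unhandled_propagations(build_model({"NoData"}))
# Same model after the EGI additionally declares CorruptedData.
extended = unhandled_propagations(build_model({"NoData", "CorruptedData"}))
print(baseline)
print(extended)
```

Under these assumptions the baseline model reports no findings, while declaring CorruptedData on the EGI yields one finding against the Auto Pilot. The point is that the check is rerun on the integrated model whenever any component's declarations change.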

Discovery of Unexpected PSSA Hazard through Repeated Virtual Integration
[Figure: error propagation diagram. Components: EGI, EGI Logic, EGI HW, Flight Mgnt System, Auto Pilot, FMS Processor, FMS Power. Connections: Airspeed Data, Actuator Cmd. Error states and propagations: Oper’l/Operational, Failed, Corrupted, CorruptedData, NoData, NoService, Stall.]
• Anticipated: No EGI data
• Anticipated: NoService
• Anticipated: No Stall Propagation
• Vibration causes boards to touch, which causes EGI data corruption
• Unexpected propagation of corrupted Airspeed data results in Stall due to miscorrection
EGI maintainer adds corrupted data hazard to model. Error Model analysis of integrated model detects unhandled propagation.

Recent Automated FMEA Experience
Failure Modes and Effects Analyses are rigorous and comprehensive reliability and safety design evaluations
• Required by industry standards and Government policies
• When performed manually are usually done once due to cost and schedule
When automated allows for
• Multiple iterations from conceptual to detailed design
• Tradeoff studies and evaluation of alternatives
• Early identification of potential problems
Largest analysis of satellite to date consists of 26,000 failure modes
• Includes detailed model of satellite bus
• 20 states perform failure mode
• Longest failure mode sequences have 25 transitions (i.e., 25 effects)
Source: Myron Hecht, Aerospace Corp.
Safety Analysis for JPL, member of DO-178C committee

Benefits of Safety & Reliability Analysis Automation
Automation allows for
• Early identification of potential problems
  – Single points of failure, Unanticipated effects
• Larger set of failure modes and failure mode combinations
• Beyond two levels of effects
• More frequent re-analysis after system changes
• Architecture trade studies
• Safety analysis of system and software architecture
• Consistency across different analyses
Architecture-centric automation in use
• Steven Vestal, Honeywell, MetaH, Error Model, AADL committee, Avionics system trade studies during bidding (1999-)
• Myron Hecht, Aerospace Corp., member of AADL & DO-178C committee
• Thomas Noll, University of Aachen, COMPASS project, Automated safety analysis and verification of satellite systems for ESA (2008-)

Contact Information
Peter Feiler/Julien Delange
Member of the Technical Staff
SSD
Telephone: +1 412-268-7790/9652
Email: phf/jdelange@sei.cmu.edu

U.S. Mail
Software Engineering Institute
Customer Relations
4500 Fifth Avenue
Pittsburgh, PA 15213-2612 USA

Web
www.sei.cmu.edu
www.sei.cmu.edu/contact.cfm
https://wiki.sei.cmu.edu/aadl/

Customer Relations
Email: info@sei.cmu.edu
Telephone: +1 412-268-5800
SEI Phone: +1 412-268-5800
SEI Fax: +1 412-268-6257