Automatic assessment of an industrial safety-critical system

by Hadrien Tournaire | Mar 7, 2022 | Success stories

Introduction

HIMA is an 800-employee company, a hidden champion and market leader specializing in safety-related automation solutions for the global process and railway industries. The electric/electronic and programmable systems produced by the company are used in a wide range of application fields, from offshore facilities to turbines. This article presents a partial safety analysis of one of their products – the HIMax.

The HIMax (see Figure 1) is a “Powerful, Uninterrupted Safety Control” [1] that contains multiple digital output modules, whose signals are intended for use in a programmable electronic system. The modules monitor the voltage going through and shut off when a threshold is exceeded. This feature makes it possible to quickly detect and correct short circuits or line breaks. Moreover, the system monitors itself and gives a visual indication of its status using LEDs. Thus, the HIMax provides a convenient solution for safety-critical production processes which must run continuously and comply with various industrial standards, enabling a quick certification up to SIL3 (Process) & SIL4 (Rail).

Figure 1: View of the HIMax system

With more than 200 components for each channel of one Digital Output (DO), the safety assessment requires an extensive analysis of over 500 possible failure modes. The assessment of the HIMax performed by HIMA consists of computing the system failure rate as the sum of its component failure rates, assuming a diagnostic coverage based on the standards.

As part of an evaluation project of paitron with HIMA, one channel of the HIMax’s DO was analyzed. The project ran on a regular Windows 10 laptop (Lenovo V155 with 4 logical processors at 2.6-3.5 GHz and 8 GB of RAM). The DO system of the HIMax is composed of several function blocks. To determine paitron’s limits, we studied the LSUE part of the DO. This block was chosen because of the number of components it contains (relative to the DO’s other modules) and their wide variety.

This analysis has been performed using a “model-based” approach: the impact of each individual component failure mode on the system behavior is considered. Although such an approach is expected to provide accurate results, it is in practice very laborious and in some cases unrealistic to perform manually. This laborious aspect of the “model-based” approach is what motivated HIMA to adopt paitron.

To overcome the workload issue, paitron automates the study of the system’s failure modes using a numerical model. In the present publication, the LSUE model has been based on HIMA’s OrCAD Capture design.

Among the 182 possible failure modes identified for the LSUE system, paitron automated the assessment of 146, achieving 80% automation of the analysis within 9 hours. The remaining 20% must be verified manually by an expert, which takes only a few hours. The failure modes considered for this study and their distribution were taken from IEC 61709; the components’ FIT rates were taken from SN 29500. Finally, the results generated by paitron were checked and validated by HIMA. More generally, the software quality of paitron has been evaluated in a concept report with TÜV SÜD.

Analysis setup

The main purpose of the LSUE is to detect and report overcurrent (current above 800 mA). The signals emitted upon detection are used to:

  • Inform the user about the overcurrent (REPORT signal)
  • Trigger the shutdown of the line and protect the system integrity (CUTOFF signal)

A major step of paitron’s safety analysis is the definition of the tracked system effects and their criticality. Given the system’s main purpose, the criticality of each studied effect (see Table 1) is determined according to IEC 61508 (2010).

Table 1: LSUE studied effects and their associated criticality

| System effect | Description | Criticality |
|---|---|---|
| REPORT signal: no overcurrent indication | The REPORT signal does not report overcurrent when overcurrent occurs | Dangerous |
| REPORT signal: erroneous overcurrent indication | The REPORT signal reports an overcurrent although no overcurrent occurs | Safe |
| REPORT signal: undetermined state | The REPORT signal is neither high nor low | Safe |
| REPORT signal: overvoltage | The REPORT signal is overvolted | Dangerous |
| CUTOFF signal: no overcurrent indication | The CUTOFF signal does not report overcurrent when overcurrent occurs | Dangerous |
| CUTOFF signal: erroneous overcurrent indication | The CUTOFF signal reports an overcurrent although no overcurrent occurs | Safe |
| CUTOFF signal: undetermined state | The CUTOFF signal is neither high nor low | Dangerous |
| LSUE: no output voltage | The LSUE output is stuck low | Safe |
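
Table 1 can be read as a lookup from observed effect to criticality. The following is a minimal sketch of that idea, not paitron's actual data model: the names `Criticality`, `EFFECT_CRITICALITY`, and `worst_case` are hypothetical, and the rule "a failure mode is dangerous if any of its effects is dangerous" is an assumption made for illustration.

```python
from enum import Enum


class Criticality(Enum):
    SAFE = "Safe"
    DANGEROUS = "Dangerous"


# (signal, effect) -> criticality, transcribed from Table 1.
EFFECT_CRITICALITY = {
    ("REPORT", "no overcurrent indication"): Criticality.DANGEROUS,
    ("REPORT", "erroneous overcurrent indication"): Criticality.SAFE,
    ("REPORT", "undetermined state"): Criticality.SAFE,
    ("REPORT", "overvoltage"): Criticality.DANGEROUS,
    ("CUTOFF", "no overcurrent indication"): Criticality.DANGEROUS,
    ("CUTOFF", "erroneous overcurrent indication"): Criticality.SAFE,
    ("CUTOFF", "undetermined state"): Criticality.DANGEROUS,
    ("LSUE", "no output voltage"): Criticality.SAFE,
}


def worst_case(effects):
    """Assumed aggregation rule: dangerous if any observed effect is dangerous."""
    levels = [EFFECT_CRITICALITY[effect] for effect in effects]
    if Criticality.DANGEROUS in levels:
        return Criticality.DANGEROUS
    return Criticality.SAFE


# An undetermined state is safe on REPORT but dangerous on CUTOFF.
observed = [("REPORT", "undetermined state"), ("CUTOFF", "undetermined state")]
print(worst_case(observed).value)  # Dangerous
```

Keeping the table as data rather than as branching logic makes it easy to review against the standard and to version alongside the design.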

Failure modes effects and diagnostic analysis

With 41 components and the failure modes defined in IEC 61709, a “model-based” safety analysis of the LSUE system requires the study of 182 failure modes (see Figure 2), among which paitron:

  • Identified 156 possible failure modes of the system
  • Was able to propose a model of the system for 152 fault cases
  • Successfully evaluated the effects of 146 of the system’s failure modes

The only components paitron was not able to study are the integrated circuits. The failure modes of integrated circuits were not available in the tested version of paitron, which significantly decreased the automation rate of the study. They account for the 26 missing failure modes (182-156) that paitron was not able to identify, model, and simulate.

Figure 2: Distribution of the 182 failure modes of the studied LSUE system

The analysis performed by paitron lasted 9 hours and covered 80% of all the failure modes to be considered for such an analysis. In 6 cases (152-146), the simulation of a failure mode crashed or did not converge (within 5 minutes) for one or several of the system’s inputs. Such cases are marked in paitron’s generated results with the mention: “<not studied, check manually>”.
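
The counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the text; the variable names are illustrative, and the reason for the 4 cases that were identified but not modelled is not given in the article.

```python
# Failure-mode counts reported for the LSUE study.
total = 182       # failure modes required by the analysis
identified = 156  # failure modes identified by paitron
modelled = 152    # fault cases for which a model was proposed
evaluated = 146   # failure modes whose effects were evaluated

ic_modes = total - identified           # integrated-circuit failure modes
not_modelled = identified - modelled    # identified but not modelled
not_converged = modelled - evaluated    # simulation crashed or timed out
manual_cases = total - evaluated        # left for manual expert review

automation_rate = evaluated / total
ic_share = ic_modes / total

print(ic_modes, not_modelled, not_converged, manual_cases)  # 26 4 6 36
print(f"automation rate: {automation_rate:.1%}")  # automation rate: 80.2%
print(f"IC share: {ic_share:.1%}")                # IC share: 14.3%
```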

A view of the FMEDA sheet generated during this study is given below in Figure 3. Paitron’s analysis made it possible to determine the LSUE safety metrics (see Table 2 and Table 3) within a realistic amount of time:

  • 9 hours for automatically evaluating the effects of 146 failure modes;
  • the time needed to complete the manual analysis of the 36 remaining cases (182-146), which represented a few hours of work (i.e. 2 hours).

Figure 3: View of the FMEDA sheet generated by paitron

Table 2: Safety metrics of the LSUE evaluated using paitron

| Metric | Result |
|---|---|
| Safe Failure Fraction (SFF) | 99.77% |
| Safety Function Failure Rate | 44.8 FIT |
| Device Failure Rate | 48.8 FIT |
| Mean Time Between Failures (MTBF) | 2339 years |
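
The MTBF in Table 2 follows from the device failure rate. As a minimal sketch, assuming a constant failure rate (so that MTBF is its reciprocal), the usual definition of 1 FIT as one failure per 10^9 hours, and a 365-day year; the function name `mtbf_years` is illustrative:

```python
FIT = 1e-9             # 1 FIT = one failure per 10^9 device-hours
HOURS_PER_YEAR = 8760  # assumption: 365-day year


def mtbf_years(rate_fit):
    """MTBF in years for a constant failure rate given in FIT."""
    rate_per_hour = rate_fit * FIT
    return 1.0 / rate_per_hour / HOURS_PER_YEAR


device_rate_fit = 48.8  # device failure rate from Table 2
print(f"MTBF: {mtbf_years(device_rate_fit):.0f} years")  # MTBF: 2339 years
```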

As Table 3 shows, the results obtained with the “model-based” approach differ from those obtained when using assumptions from the standards (e.g. for Diagnostic Coverage). These differences show that results based on assumptions can be considerably refined, and they demonstrate the value of the “model-based” approach for such safety analysis problems. The results of both approaches are compared in the table below. For the DO of the HIMax, HIMA’s experts assume that 50% of the failures are dangerous. Assuming a diagnostic coverage of 99% in accordance with IEC 61508 then enables HIMA to achieve a safe failure fraction of 99.5%.

Table 3: Comparison of the failure rates assessed by paitron (“model-based” method) and HIMA (“part counting” method)

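| Failure rate [FIT] | HIMA | Paitron |
|---|---|---|
| Safe Detected | 24.16 | 15.00 |
| Safe Undetected | 0.24 | 2.96 |
| Dangerous Detected | 24.16 | 18.23 |
| Dangerous Undetected | 0.24 | 0.10 |
| No effect | 0.00 | 8.48 |
| Not relevant | 0.00 | 4.00 |
| Total | 48.80 | 48.80 |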
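
The HIMA column of Table 3 can be reproduced from the stated assumptions. The sketch below takes the 48.80 FIT total, the 50% dangerous share, and the 99% diagnostic coverage from the text; applying the same coverage to the safe failures is an inference from the table values, and the function name `part_count_split` is illustrative.

```python
def part_count_split(total_fit, dangerous_fraction, diagnostic_coverage):
    """Split a total failure rate using fixed, standard-based assumptions."""
    dangerous = total_fit * dangerous_fraction
    safe = total_fit - dangerous
    return {
        "Safe Detected": safe * diagnostic_coverage,
        "Safe Undetected": safe * (1 - diagnostic_coverage),
        "Dangerous Detected": dangerous * diagnostic_coverage,
        "Dangerous Undetected": dangerous * (1 - diagnostic_coverage),
    }


total_fit = 48.80
rates = part_count_split(total_fit, dangerous_fraction=0.50, diagnostic_coverage=0.99)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")

# Safe failure fraction: everything except dangerous undetected failures.
sff = 1 - rates["Dangerous Undetected"] / total_fit
print(f"SFF: {sff:.1%}")  # SFF: 99.5%
```

With these inputs the four categories come out as 24.16, 0.24, 24.16, and 0.24 FIT, matching the HIMA column.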

Our analysis with paitron allowed us to re-estimate the safe failure fraction to be approximately 99.8%. This means a reduction of the most-critical case “Dangerous Undetected” by 58.3%. Additionally, roughly a fourth of the failure modes could be classified as either “Not relevant” or having “no effect” on the system’s safety.
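
These figures can be checked against Tables 2 and 3. The sketch below assumes the safe failure fraction is computed over the 44.8 FIT safety function failure rate from Table 2, which matches the published values but is not stated explicitly in the text; the share computed last is a share of the total failure rate, since per-category failure mode counts are not given.

```python
# Failure rates in FIT, taken from Table 3.
hima_du = 0.24
paitron_du = 0.10
paitron_no_effect = 8.48
paitron_not_relevant = 4.00
total_fit = 48.80

# Safety function failure rate in FIT, taken from Table 2.
safety_function_fit = 44.8

du_reduction = 1 - paitron_du / hima_du
sff_paitron = 1 - paitron_du / safety_function_fit  # assumed denominator
no_impact_share = (paitron_no_effect + paitron_not_relevant) / total_fit

print(f"DU reduction: {du_reduction:.1%}")                       # DU reduction: 58.3%
print(f"SFF (paitron): {sff_paitron:.1%}")                       # SFF (paitron): 99.8%
print(f"No effect / not relevant share: {no_impact_share:.1%}")  # No effect / not relevant share: 25.6%
```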

Conclusions and further work

The safety analysis presented showcased an application of paitron for an industrial safety control unit. Paitron relies on the “model-based” approach, which was easily embedded within HIMA’s processes & workbench by integrating existing OrCAD Capture models. Within the project, it was shown that the analysis with paitron is more accurate than the method currently used by HIMA. HIMA’s experts calculated the safe failure fraction to be 99.5%. Our analysis with paitron allowed us to re-estimate the safe failure fraction to be approximately 99.8% – cutting the residual failure rate in half.

“The quality of the analysis improves extremely” was the feedback from the responsible Functional Safety Engineer, Sven Sulzbach. A better understanding of the system makes it possible to improve it by “discovering and controlling all failures” (Markus Dalheimer, Functional Safety Engineer) and to achieve a diagnostic coverage above the 99% to which HIMA is currently bound by the standards.

Furthermore, the use of paitron is beneficial during design as it enables engineers to directly check their design during the early stages of development – even without a safety expert’s involvement. Jochen Däschler, Functional Safety Expert at HIMA, was impressed by how direct the feedback was: “No human could do 23.000 checks in 40 minutes”. This would not only reduce the number of design iterations before reaching the desired safety level but also reduce the amount of work for safety engineers. Analyses that previously took multiple weeks can now be done within a few days. The reduced workload matters given the shortage of safety engineers on the market, reflected in Jochen Däschler’s statement that “HIMA aims at enabling their designers to perform expert analyses”.

The main hurdle met by paitron was the substantial number of integrated circuits contained in the HIMax. The version of paitron used in this study (1.7) lacks the failure modes of such components, which reduced its automation rate. In the study, the failure modes of the integrated circuits represent 14% of the system’s failure modes. This observation highlights how important it is for modelwise GmbH to support integrated circuit failure modes in future versions. Beyond this aspect, running paitron on dedicated computation servers could reduce analysis time and let it act as a spell-checker for engineering designs, providing feedback within milliseconds.

Paitron detects the exact cause for which the system requirements are violated when facing a failure mode, and protects safety engineers from the human errors that the repetitive nature of the “model-based” safety analysis invites. The data generated by paitron can be used for safety proofs or to speed up product certification. Additionally, all intermediate data generated by paitron is openly accessible to versioning and documentation systems.

The following summarizes the key benefits of paitron:

  • Accelerate the time to market by weeks
  • Enable designers to perform expert analyses
  • Catch design flaws early on, saving 80% of the cost of fixing them

References

[1] “HIMax,” Products & Services | HIMA Paul Hildebrandt GmbH. [Online]. Available: https://www.hima.com/en/products-services/himax [accessed 10.01.2022]

Soft- and Hardware used in this project

  • A regular Windows 10 laptop (Lenovo V155 with 4 logical processors with 2.6-3.5 GHz and 8GB of RAM)
  • Safety Assessment Automation: paitron V1.7 (1.7.7971.31144)
  • Modeling: OrCAD Capture V17.3

Hadrien Tournaire

modelwise Team
