...

You may not know or care much about Hardware Fault Tolerance (HFT) unless you're working in a hazardous industry with Safety Integrity Level (SIL) requirements.


However, just like the multiple safety systems in your motor vehicle, systems used for protecting hazardous process plants are often built with intentional redundancy, both for safety, and to keep things running when stuff fails.


Fault Tolerance for Safety

Levels of Hardware Fault Tolerance (HFT) are specified in functional safety standards IEC 61508 and IEC 61511, primarily for safety reasons. Very generally speaking, the higher the Safety Integrity Level (SIL) required, the more hardware fault tolerance is expected in the design.

Systems or functions with ZERO hardware fault tolerance (HFT = 0) cannot tolerate a single dangerous failure. All such "single channel" systems, by definition, have no ability to tolerate faults.

Systems or functions with ONE LEVEL of hardware fault tolerance (HFT = 1) are designed to tolerate a single dangerous failure. Examples include dual redundancy with 1 out of 2 (1oo2) voting and triple redundancy with 2 out of 3 (2oo3) voting.
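For the common "M out of N" (MooN) voting notation, the fault tolerance follows from simple arithmetic: N channels are fitted, M of them must work for the safety function to act, so N - M dangerous failures can be tolerated. A minimal Python sketch of that rule (the function name is illustrative and does not come from either standard):

```python
def hardware_fault_tolerance(m: int, n: int) -> int:
    """Return the HFT of an M-out-of-N (MooN) voted group.

    M channels must work for the safety function to act, so
    N - M dangerous channel failures can be tolerated.
    """
    if not 1 <= m <= n:
        raise ValueError("need 1 <= M <= N")
    return n - m


# Print the HFT of a few common voting arrangements.
for m, n in [(1, 1), (1, 2), (2, 2), (2, 3), (1, 3)]:
    print(f"{m}oo{n}: HFT = {hardware_fault_tolerance(m, n)}")
```

Note that 2oo2 uses two channels but has HFT = 0 for safety, because both channels must work to trip. Duplicated hardware does not automatically mean fault tolerance.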

So, a typical SIL 1 safety instrumented function (SIF) may not require any level of HFT to achieve the overall safety goal, provided that goal is met by other aspects such as the calculated PFD/PFH. The benefits are lower complexity, lower installation cost and reduced maintenance. Single channel systems are very common when the risk of failure is relatively low.
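To give a feel for the PFD side of that check, here is a rough sketch using the widely quoted simplified formula for a single channel in low demand mode, PFDavg ≈ λDU × TI / 2, and the low demand SIL bands. The failure rate and proof test interval are made-up illustration values, and the function names are hypothetical. A PFD figure inside a band does not establish a SIL on its own; HFT and the other requirements still apply.

```python
def pfd_avg_1oo1(lambda_du: float, proof_test_interval_h: float) -> float:
    """Simplified average PFD of a single channel: lambda_DU * TI / 2."""
    return lambda_du * proof_test_interval_h / 2


def sil_band(pfd_avg: float) -> int:
    """Map a low demand PFDavg to its SIL band (0 = does not reach SIL 1)."""
    if pfd_avg >= 1e-1:
        return 0
    if pfd_avg >= 1e-2:
        return 1
    if pfd_avg >= 1e-3:
        return 2
    if pfd_avg >= 1e-4:
        return 3
    return 4  # SIL 4 is the highest band defined


# Assumed values: 5e-6 dangerous undetected failures per hour,
# proof tested once a year (8760 hours).
pfd = pfd_avg_1oo1(lambda_du=5e-6, proof_test_interval_h=8760)
print(f"PFDavg = {pfd:.2e}, PFD band = SIL {sil_band(pfd)}")
```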

When the integrity requirement increases, some redundancy may need to be added to achieve the SIL target. So, a SIL 2 SIF may require redundant sensors, logic and/or final elements. A SIL 3 SIF will always require some redundant elements in the design.

SIL 3 requirements are fairly uncommon, so in practice the designer's main job is to check that the HFT is sufficient for SIL 1 and SIL 2 requirements. Of course, HFT is not the only thing needed - check our other blog on this topic.

Fault Tolerance for Availability

Another goal for systems and safety functions is AVAILABILITY. A well-designed, fault tolerant system with high availability will keep a plant running even in the presence of single hardware failures.

Adding redundancy for availability can also allow a system to keep running during testing, possibly even without shutting down the plant. This aspect of fault tolerance is often forgotten in the quest for safety integrity, but it is critical to the bottom line.
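The availability benefit can be illustrated with a toy voter. The sketch below is an assumption-laden illustration, not a real logic solver: each channel simply reports True when it votes to trip, and the hypothetical `voter_trips` function trips when at least M of the N channels agree. It uses 2 out of 3 (2oo3) voting as the example of an arrangement that tolerates a false trip signal from one channel.

```python
def voter_trips(m: int, trip_votes: list[bool]) -> bool:
    """MooN voter: trip when at least M of the N channels vote to trip."""
    return sum(trip_votes) >= m


# One channel fails "safe" and votes to trip although the process is healthy.
spurious = [True, False, False]
print("1oo2 trips on one false vote:", voter_trips(1, spurious[:2]))
print("2oo3 trips on one false vote:", voter_trips(2, spurious))

# Real demand, but one channel has failed dangerously and stays silent.
demand = [False, True, True]
print("2oo3 trips on real demand:", voter_trips(2, demand))
```

With 1oo2 a single false vote shuts the plant down, whereas 2oo3 rides through it and still responds to a real demand with one channel failed.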

What are some typical HFT options?

Single channel SIF

A single channel SIF consists of single devices (or elements) connected together in a chain or series. At a "block diagram" level this is shown as a series of connected blocks, with all the active devices from sensor through to final element. Non-active components such as cables and terminal blocks are implicit in the design, but are generally not shown as they do not fail randomly. In the example below there are three active elements at the block diagram level (there may be more active elements in a practical design).

Single channel SIF

1oo2 Voted Sensors

Taking the same elements as the single channel SIF above, but adding in a second identical sensor element to measure the same process parameter provides some "redundancy" in the design.

The idea of a 1 out of 2 (1oo2) vote is that either sensor alone can initiate the trip, so if one sensor fails, the other can still act. The single failure is "tolerated", hence the term "fault tolerant". Note that when only part of the architecture is redundant for safety purposes, the remaining non-redundant parts will typically be the dominant weak points in respect of safety.

1oo2 voted sensors
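A rough calculation shows why the non-redundant parts tend to dominate. The sketch below uses the common simplified low demand formulas, λDU × TI / 2 for a single channel and (λDU × TI)² / 3 for a 1oo2 pair, and it ignores common cause failures. All failure rates are hypothetical values chosen only for illustration.

```python
TI_HOURS = 8760  # assumed proof test interval: one year


def pfd_1oo1(lambda_du: float, ti: float = TI_HOURS) -> float:
    """Simplified single channel PFDavg: lambda_DU * TI / 2."""
    return lambda_du * ti / 2


def pfd_1oo2(lambda_du: float, ti: float = TI_HOURS) -> float:
    """Simplified 1oo2 PFDavg, ignoring common cause: (lambda_DU * TI)^2 / 3."""
    return (lambda_du * ti) ** 2 / 3


# Hypothetical dangerous undetected failure rates, per hour.
contributions = {
    "sensors (1oo2)": pfd_1oo2(2e-6),
    "logic solver (1oo1)": pfd_1oo1(1e-7),
    "valve (1oo1)": pfd_1oo1(5e-6),
}

# The elements are in series, so their PFDs approximately add.
total = sum(contributions.values())
for name, pfd in contributions.items():
    print(f"{name:20s} PFDavg = {pfd:.2e}  ({pfd / total:.0%} of total)")
```

With these assumed numbers, the single valve accounts for almost all of the total, so adding the second sensor does little for the SIF as a whole until the final element is addressed.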

HFT for Valves

When it comes to valves on a pipe, a single fail-closed valve could have an actuator or other failure, rendering it useless to stop the flow. In HFT terms, this is zero (0) fault tolerance for safety.

However, suppose two fail-closed valves are installed in series. This provides an HFT of one (1) for safety purposes, because if either valve fails, the remaining healthy valve should still operate*.

HFT for Valves - Fail Closed

*Note: This assumes that there is no common cause failure that could affect both valves. In practice, common cause failure must be considered.
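One common way to account for common cause is the beta factor model, in which a fraction β of each valve's dangerous undetected failures is assumed to hit both valves at once. The sketch below uses the simplified 1oo2 approximation PFDavg ≈ ((1 - β) × λDU × TI)² / 3 + β × λDU × TI / 2. The failure rate, proof test interval and β values are assumptions for illustration only, and the function name is hypothetical.

```python
def pfd_1oo2_with_ccf(lambda_du: float, ti_hours: float, beta: float):
    """Return (independent, common_cause) PFDavg terms for a 1oo2 pair."""
    independent = ((1 - beta) * lambda_du * ti_hours) ** 2 / 3
    common_cause = beta * lambda_du * ti_hours / 2
    return independent, common_cause


# Assumed: 5e-6 dangerous undetected failures per hour per valve,
# proof tested once a year (8760 hours).
for beta in (0.0, 0.05, 0.10):
    independent, common_cause = pfd_1oo2_with_ccf(5e-6, 8760, beta)
    print(
        f"beta = {beta:.0%}: independent = {independent:.2e}, "
        f"common cause = {common_cause:.2e}, "
        f"total = {independent + common_cause:.2e}"
    )
```

With these assumed numbers, even a β of 5% makes the common cause term larger than the independent term, which is why the second valve cannot simply be treated as a fully independent layer.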

About 

Jon Keswick, CFSE

Jon Keswick is a Certified Functional Safety Expert (CFSE) and founder of eFunctionalSafety. Feel free to make contact via LinkedIn or comment on any of the eFunctionalSafety blog pages.
