Key terminology for safety instrumented systems
3.1 Hazard, harm and risk
Major accident hazards have to be identified before any form of safety system is implemented in a new plant or for a major modification. For existing plant, re-validation exercises can often result in altered requirements, so there is an ongoing need to identify hazards and assess risk.
Given that a hazard exists, for instance in the form of stored flammable material, an event leading to possible loss of containment of that material would be a hazardous event. In some cases, the loss of containment may lead to fire or even explosion, and this may lead to harm to people, the environment and assets. Risk assessment is about estimating the frequency of hazardous events and the severity of the harmful consequences.
3.2 Tolerable risk and ALARP
To determine whether a given risk is tolerable, an agreed framework is needed. In some countries, for instance the United Kingdom, this framework is provided by the national government in the form of ALARP. Note that although this guidance is from the UK, it is specifically referenced in IEC 61511 part 3.
ALARP is the abbreviation for "As Low As is Reasonably Practicable". Reasonably practicable involves weighing a risk against the trouble, time and money needed to control it. In essence, demonstrating that a risk has been reduced to ALARP means weighing the risk against the sacrifice needed to reduce it further. The framework divides individual risk into three regions, illustrated here with the UK example values.
Unacceptable risk. This is the risk of fatality to an individual which is not considered tolerable. An example would be a hazardous fatality event occurring once in 1000 years, or 1 in 1000 per annum. This frequency of undesired event can also be expressed as 1E-3 per year. Any risk estimated as more frequent than this would be considered unacceptable and would require further risk reduction.
Tolerable if ALARP. Risk of fatality to an individual is between 1E-3 and 1E-6 per year. For these fatality risks, some regulators require a cost-benefit argument to demonstrate that sufficient risk reduction measures have been considered and that their cost has been weighed against the risk reduction they provide.
Negligible risk. This is where risk of fatality to an individual is below a certain threshold, in the UK example, less than 1 in 1 million per year. Any hazardous event occurring at this frequency, or lower, is considered to be background risk which we must accept as part of everyday life.
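The three regions can be expressed as a simple classification. The following Python sketch is illustrative only: the function name and constants are hypothetical, the thresholds are the UK example values quoted above, and assigning a risk that sits exactly on a boundary to the lower region is an assumption. An organisation's own risk criteria may differ.

```python
# Example thresholds from the UK values quoted in the text (assumed, not universal).
UNACCEPTABLE_ABOVE = 1e-3  # individual fatality risk per year
NEGLIGIBLE_AT_OR_BELOW = 1e-6  # individual fatality risk per year


def alarp_region(individual_risk_per_year: float) -> str:
    """Return the ALARP region for an individual fatality risk (per year)."""
    if individual_risk_per_year <= 0:
        raise ValueError("risk frequency must be positive")
    if individual_risk_per_year > UNACCEPTABLE_ABOVE:
        return "unacceptable"
    if individual_risk_per_year <= NEGLIGIBLE_AT_OR_BELOW:
        return "negligible"
    return "tolerable if ALARP"


if __name__ == "__main__":
    for risk in (2e-3, 2e-4, 1e-7):
        print(f"{risk:.0e} per year: {alarp_region(risk)}")
```

A risk in the middle region is not automatically acceptable; the classification only indicates that an ALARP demonstration is expected.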
3.3 Risk reduction principles
Inherent risk reduction should always be the first priority in reducing risk. Reducing hazardous inventory is an example. This has the possibility of changing the consequence side of risk if the reduced inventory changes the severity of a potential hazardous event.
Reducing hazardous inventory can reduce the consequence but may still leave a small number of personnel exposed. Another option for altering consequence is to alter facility siting to reduce occupancy in hazardous areas.
When the consequence has been reduced as far as possible, the frequency side of risk must be addressed. For example, a single fatality risk may be estimated to be bordering the unacceptable region at 1E-03 per year. Without any measures to alter the estimated consequence (1 fatality), it is still possible to reduce the expected frequency of the event. This can be achieved by various means, typically referred to as Independent Protection Layers (IPL).
3.4 Independent Protection Layers
There are many types of IPL that can be applied to help reduce the frequency of hazardous events. These include actions by operators, mechanical safety devices designed for specific events like pressure relief, and safety instrumented functions (SIF) designed to actively sense a hazard and automatically take an action to prevent escalation.
Whatever the type of IPL, there are some fundamental principles that must be met if they are claimed as risk reduction measures. These principles were first introduced in the CCPS book "Layer of protection analysis - simplified process risk assessment" published in 2001 by the American Institute of Chemical Engineers (ISBN 0-8169-0811-7). The principles are as follows:
An IPL must:
- be effective in preventing the consequence when it functions as designed (this should include consideration of the process safety time and the ability of the IPL to act within that time);
- be independent of the initiating event and any other IPL claimed for the same scenario;
- be auditable and verifiable in some manner by documentation, review or testing.
When these principles are correctly applied to the selection of IPL, the remaining question is how much risk reduction is provided by a given IPL device, system or action.
There are no formally approved values other than those proposed in textbooks like the one mentioned above, or in more recent CCPS texts such as "Guidelines for Initiating Events and Independent Protection Layers in Layer of Protection Analysis" published in 2015 by the American Institute of Chemical Engineers (ISBN 978-0-470-34385-2).
The usual way to approach quantifying risk reduction is to consider a conservative order of magnitude probability of failure on demand (PFD) for each IPL. Using the assumption that an IPL will be designed to specifically prevent a consequence from occurring, an assumed PFD of 10% (or a factor of 0.1) will represent one order of magnitude of risk reduction.
Taking the earlier example of an unacceptable single fatality risk of 1E-03 per year, a correctly selected IPL with a PFD of 0.1 will reduce this frequency to 1E-04 per year (0.1 x 1E-03 per year = 1E-04 per year), with the same outcome consequence. Of course, this will only be a valid estimate if the IPL fully meets the fundamental principles of effectiveness, independence and auditability.
Following this same principle and applying another IPL with a 1% probability of failure on demand (PFD of 0.01), it follows that the 1E-04 per year risk could be further reduced to 1E-06 per year in order to claim a practically negligible risk.
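The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not a LOPA tool: the function name is hypothetical, and multiplying the PFD values together is only valid under the assumption that every IPL claimed is fully independent of the initiating event and of the other IPL.

```python
import math
from typing import Sequence


def mitigated_frequency(initial_per_year: float, ipl_pfds: Sequence[float]) -> float:
    """Multiply the unmitigated event frequency by the PFD of each claimed IPL.

    Assumes all IPL are independent; shared causes of failure would make
    the result optimistic.
    """
    for pfd in ipl_pfds:
        if not 0.0 < pfd <= 1.0:
            raise ValueError(f"PFD must be in the range (0, 1], got {pfd}")
    return initial_per_year * math.prod(ipl_pfds)


if __name__ == "__main__":
    # Values from the text: 1E-03 per year, then IPL with PFD 0.1 and 0.01.
    one_ipl = mitigated_frequency(1e-3, [0.1])
    two_ipl = mitigated_frequency(1e-3, [0.1, 0.01])
    print(f"With one IPL:  {one_ipl:.0e} per year")
    print(f"With two IPL:  {two_ipl:.0e} per year")
```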
3.5 Safety Instrumented Functions
A special type of IPL is known as a safety instrumented function or SIF. A SIF comprises at least one element for directly sensing a potentially dangerous process condition, a logic solver to decide on the action(s) to be taken, and a final element which will take a direct action on the process to prevent the hazardous condition or stop it escalating further.
A SIF is no different in concept to any other IPL, although it has additional considerations for design integrity. It must meet the same criteria as other IPL: being effective in preventing the consequence (including being fast enough), independent of any other IPL and the initiating event, and audited (tested) on a regular basis.
The additional design integrity that SIF designers need to consider is known as Safety Integrity Level - SIL. Every SIF must have a SIL target, and this is determined using one or more qualitative or quantitative methods. Typical methods employed in the process industry today include risk graphs or risk matrices (qualitative), Layer of Protection Analysis - LOPA (semi-quantitative) or combinations of Event Tree and Fault Tree Analysis (quantitative).
Usually, each SIF should be designed to take action without any human intervention. The premise is to remove the reliance on people responding to alarm conditions as human behaviour is far less predictable.
3.6 Safety Instrumented System
When SIF for different hazardous conditions are collected together into one logic solver, the collective is called a Safety Instrumented System - SIS. The SIS may comprise only a few SIF, or it may have tens or even hundreds. There is actually no limit on numbers of SIF in a SIS, although commercially available logic solvers will always have some capacity limitations.
For reasons of economics and ultimate flexibility, the majority of SIS implemented today use specialist programmable logic controllers as the logic solver. This programmability brings software design into the safety domain. Many additional design requirements apply to software used in safety duties, so this needs careful consideration and project control.
It is important to note that the SIS logic solver will need to be designed for the highest SIL requirement of any SIF which resides within the system.
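As a small illustration of this rule, the sketch below picks the SIL that a shared logic solver would need from the SIL targets of the SIF it hosts. The SIF tags, their SIL targets and the function name are all hypothetical.

```python
from typing import Mapping

# Hypothetical SIF tags and SIL targets, for illustration only.
sif_sil_targets = {
    "SIF-001 high pressure trip": 1,
    "SIF-002 high level trip": 2,
    "SIF-003 low flow trip": 1,
}


def required_logic_solver_sil(sil_targets: Mapping[str, int]) -> int:
    """Return the highest SIL target among the SIF hosted in one logic solver."""
    if not sil_targets:
        raise ValueError("at least one SIF is required")
    return max(sil_targets.values())


if __name__ == "__main__":
    print("Logic solver must be suitable for SIL", required_logic_solver_sil(sif_sil_targets))
```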
3.7 Safety Integrity Level - SIL
The term SIL, Safety Integrity Level, is a measure of the amount of risk reduction provided by a Safety Instrumented Function (SIF) for each specific hazardous event. IEC 61511 requires that each SIF is designed to meet minimum risk reduction factors (RRF), between 10 times risk reduction at SIL 1, and >10,000 times risk reduction at SIL 4.
In practice, SIL requirements in the process industry sector are limited to SIL 3. Even SIL 3 is an extremely hard design requirement to meet, and typically involves diverse equipment or non-programmable systems. A SIL 4 requirement specified for a non-nuclear process industry application often indicates that there is something suspect with the process design or the SIL assessment. For the nuclear industry, different standards apply. For further detail on nuclear industry safety - see IEC 61513: 2011.
For a SIF operating in low demand mode, defined as a demand rate of no more than once per year, the reciprocal of the RRF is known as the average probability of failure on demand (PFD average). Note that not all safety functions operate in low demand, but perhaps upwards of 90% of process industry functions do operate in this mode.
For functions which operate more frequently than once per year (or which operate continuously), SIL is specified in terms of probability of failure per hour, or PFH. Such High Demand or Continuous Mode functions are less commonplace in the process industry, but they do occur.
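For the low demand case, the relationship between RRF, PFD average and SIL can be sketched as a simple lookup. The band limits below are the low demand values as commonly tabulated for IEC 61511 and IEC 61508; the function and constant names are hypothetical, and the sketch does not cover high demand or continuous mode.

```python
from typing import Optional

# Low demand mode bands: SIL -> (lower PFDavg bound inclusive, upper bound exclusive).
LOW_DEMAND_PFD_BANDS = {
    1: (1e-2, 1e-1),
    2: (1e-3, 1e-2),
    3: (1e-4, 1e-3),
    4: (1e-5, 1e-4),
}


def pfd_avg_from_rrf(rrf: float) -> float:
    """PFD average is the reciprocal of the risk reduction factor."""
    if rrf <= 0:
        raise ValueError("RRF must be positive")
    return 1.0 / rrf


def sil_for_pfd_avg(pfd_avg: float) -> Optional[int]:
    """Return the SIL band containing pfd_avg, or None if it is outside SIL 1 to 4."""
    for sil, (lower, upper) in LOW_DEMAND_PFD_BANDS.items():
        if lower <= pfd_avg < upper:
            return sil
    return None


if __name__ == "__main__":
    for rrf in (50, 200, 2000):
        pfd = pfd_avg_from_rrf(rrf)
        print(f"RRF {rrf}: PFDavg {pfd:.1e}, SIL {sil_for_pfd_avg(pfd)}")
```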
3.8 SIL and risk reduction
Consider an undesired hazardous event that occurs only once in 500 years due to good process design and other risk reduction measures. We will call this the unmitigated event frequency. 1 in 500 years equals 2E-3 per year. Despite this appearing to be quite a low frequency event, it is in the unacceptable risk region previously given as an example.
The target frequency for this event is decided to be 1E-5 per year. In practice this target comes from an estimate or calculation of the consequences, compared with corporate risk criteria. These criteria should be set by your organization before embarking on SIL determination. If a SIL 1 SIF is applied with the minimum of 10 times risk reduction, the risk falls to 2E-4 per year and enters the "tolerable if ALARP" (amber) region. As SIL 1 is a band of RRF from 10 to 100 times, the resulting risk could approach 2E-5 per year with a very well designed SIL 1 function. However, this still does not meet the target frequency of 1E-5 per year.
If a SIL 2 function is applied with a risk reduction factor of at least 200, the target frequency can be met. The required RRF is calculated by dividing the unmitigated event frequency by the target frequency (2E-3 / 1E-5 = 200). A well designed SIL 2 function would therefore be sufficient. Negligible risk (the green region) could be achieved with SIL 3, but this would be over-engineering, on the assumption that the cost of a SIL 3 function is grossly disproportionate to the extra risk reduction gained.
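The worked example in this section can be reproduced with the short Python sketch below. The names are hypothetical, and the upper RRF limit of each SIL band is taken from the usual low demand tabulation (SIL 1 up to 100, SIL 2 up to 1,000, and so on). Returning SIL 1 for any required RRF of 100 or less is a simplifying assumption.

```python
import math
from typing import Optional

# Upper RRF limit of each low demand SIL band.
SIL_MAX_RRF = {1: 100, 2: 1_000, 3: 10_000, 4: 100_000}


def lowest_sil_for_rrf(required_rrf: float) -> Optional[int]:
    """Return the lowest SIL whose RRF band can reach required_rrf, else None."""
    for sil, max_rrf in SIL_MAX_RRF.items():
        if required_rrf <= max_rrf:
            return sil
    return None


unmitigated_per_year = 1 / 500  # once in 500 years = 2E-3 per year
target_per_year = 1e-5

required_rrf = unmitigated_per_year / target_per_year
best_case_sil1 = unmitigated_per_year / SIL_MAX_RRF[1]

if __name__ == "__main__":
    print(f"Required RRF: {required_rrf:.0f}")
    print(f"Best case with SIL 1: {best_case_sil1:.0e} per year")
    print(f"Lowest suitable SIL: {lowest_sil_for_rrf(required_rrf)}")
```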