An operational scenario is defined as “Description of an imagined sequence of events that includes the interaction of the product or service with its environment and users, as well as interaction among its product or service components” [32]. The set of operational scenarios shall therefore represent real scenarios that may be encountered when the system is in operation. This set shall comprise a number of defined scenarios, meaningful with respect to the safety requirements of the system, that may occur during the system life.
For a system designed for automated retinal disease diagnosis, an operational scenario can be represented using relevant clinical pathways.
For a self-driving car, different operational scenarios would cover changing lanes on a motorway, navigating a roundabout, stopping at a red light, etc.
The system shall be tested against the defined operational scenarios ([EE]), and the results from the tests assessed against the safety requirements. The results shall be captured explicitly as the integration testing results ([FF]).
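The testing step above can be sketched as a small harness that runs each operational scenario through the integrated system and records, per safety requirement, whether the observed behaviour satisfies it. This is only an illustrative sketch: the names `Scenario`, `Requirement`, `ScenarioResult` and `stub_system`, the requirement identifier `SR-1`, and the braking model and 50 m threshold are all assumptions made for the example, not part of the guidance.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Scenario:
    name: str               # e.g. "red_light_urban"
    inputs: Dict[str, Any]  # stimulus presented to the integrated system


@dataclass
class Requirement:
    req_id: str                              # hypothetical identifier
    check: Callable[[Dict[str, Any]], bool]  # True if the observed output satisfies it


@dataclass
class ScenarioResult:
    scenario: str
    req_id: str
    passed: bool


def run_integration_tests(system, scenarios, requirements) -> List[ScenarioResult]:
    """Run every scenario and assess the output against every requirement."""
    results = []
    for scenario in scenarios:
        observed = system(scenario.inputs)
        for req in requirements:
            results.append(
                ScenarioResult(scenario.name, req.req_id, req.check(observed))
            )
    return results


def stub_system(inputs):
    # Hypothetical stand-in for the integrated system:
    # stopping distance under an assumed constant 6 m/s^2 deceleration.
    return {"stopping_distance_m": inputs["speed_mps"] ** 2 / (2 * 6.0)}


scenarios = [
    Scenario("red_light_urban", {"speed_mps": 12.0}),
    Scenario("motorway_queue", {"speed_mps": 30.0}),
]
requirements = [
    # Assumed requirement: the vehicle stops within 50 m.
    Requirement("SR-1", lambda out: out["stopping_distance_m"] <= 50.0),
]

results = run_integration_tests(stub_system, scenarios, requirements)
failures = [r for r in results if not r.passed]
```

Keeping one record per scenario and requirement pair means the captured results can be traced back to both the operational scenario and the safety requirement they were assessed against.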
Integration testing may take many forms, including simulation and hardware in the loop testing [11]. A shadow deployment can also be used for integration testing to evaluate the actual system in the real operating environment while another stable system is in use.
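A shadow deployment can be sketched as follows, assuming both systems can be called on the same observation. Only the stable system's output is acted upon; the candidate system's output is logged for later comparison. The function names, the threshold-based toy systems and the disagreement metric are hypothetical choices for illustration.

```python
from typing import Any, Callable, List, Tuple

LogEntry = Tuple[Any, Any, Any]  # (observation, stable output, shadow output)


def shadow_step(stable: Callable, candidate: Callable,
                observation: Any, log: List[LogEntry]) -> Any:
    """Return the stable system's output; record the candidate's output only."""
    action = stable(observation)
    try:
        shadow_action = candidate(observation)
    except Exception as exc:
        # A failure of the system under test must not affect operation.
        shadow_action = exc
    log.append((observation, action, shadow_action))
    return action


def disagreement_rate(log: List[LogEntry]) -> float:
    """Fraction of logged steps where the two systems differed."""
    if not log:
        return 0.0
    return sum(1 for _, a, s in log if a != s) / len(log)


# Toy stand-ins: each system flags an observation above its own threshold.
stable_system = lambda x: x > 0.5
candidate_system = lambda x: x > 0.7

log: List[LogEntry] = []
actions = [shadow_step(stable_system, candidate_system, obs, log)
           for obs in [0.2, 0.6, 0.9, 0.4]]
rate = disagreement_rate(log)
```

In this toy run the two systems disagree only on the observation 0.6, and the action taken is always the stable system's.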
When using simulation, sufficient confidence shall be demonstrated that the simulator represents the actual operating environment.
Running SLAM (simultaneous localization and mapping) in a simulated environment such as Gazebo can give very different results from running it on a physical robot such as a TurtleBot navigating in a real-world setting. In particular, simulators often fail to model sensor noise adequately, such as reflections in the case of LIDAR.
Not all properties of the system can be tested in simulation. Physical properties such as friction that varies with wear, or sensor noise, are very difficult to simulate and will require hardware in the loop (HIL) or real system testing. HIL has the advantage of enabling testing in the early stages of a project and can be used to take account of sensor noise or other physical properties in different programmed virtual environments.
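One possible source of evidence for simulator confidence is a statistical comparison of sensor readings produced by the simulator with readings recorded in the real operating environment. The sketch below computes a two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) in plain Python. The range values are made up for illustration, and any acceptance threshold on the statistic would need to be justified for the system concerned; this is not a method prescribed by the guidance.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the difference at every data point.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Illustrative (made-up) LIDAR range readings in metres.
simulated_ranges = [1.0, 2.0, 3.0, 4.0]
recorded_ranges = [3.0, 4.0, 5.0, 6.0]

gap = ks_statistic(simulated_ranges, recorded_ranges)
```

A value near 0 indicates similar distributions and a value near 1 indicates almost no overlap. A comparison of this kind covers only the quantity measured, so it supplements rather than replaces testing on hardware.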
The target system containing the integrated ML component shall be tested in a controlled setting to allow for safe evaluation of the system. This controlled setting may include additional controls, monitoring, or the use of simulation of real-world scenarios. In this way, stakeholders may safely evaluate the behaviour of the component in context.
An automated retinal disease diagnosis system is trialled in a hospital for six months, providing guidance to clinicians who make the diagnosis decisions. This approach allows the clinicians to evaluate the performance of the system whilst maintaining the ability to override any diagnosis results from the system that are felt to be unsafe.
Prior to operation on public highways, an automated lane following system, once integrated into a vehicle, may be tested and tuned on designated test tracks by experienced test drivers. This reduces the risk when evaluating the system in high-speed manoeuvres.
Wherever possible, the integration into the system shall be tested using the actual target system or using a hardware in the loop approach, as this provides results that most closely reflect what will be observed in operation. However, in many cases this may be impractical, in which case simulation and hardware in the loop may be used together.
The integration testing results shall be reported in the integration testing results ([FF]) artefact, providing evidence that the system safety requirements ([A]) are met.
The worst-case execution time of a vehicle control loop that includes a machine learning component shall remain within the specified limit in each of the different scenarios in order to assure the safety of the vehicle.
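A per-scenario timing check of this kind might be sketched as below. Note that measuring execution time only yields the longest time observed during testing, which is a lower bound on the true worst-case execution time rather than a guarantee. The function names, the scenario name, the 10 ms limit and the `slow_step` stand-in are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Iterable, Tuple


def max_observed_time(step: Callable[[Any], Any],
                      inputs: Iterable[Any], repeats: int = 5) -> float:
    """Longest wall-clock duration of step() over all inputs and repeats."""
    worst = 0.0
    for x in inputs:
        for _ in range(repeats):
            start = time.perf_counter()
            step(x)
            worst = max(worst, time.perf_counter() - start)
    return worst


def check_timing(step: Callable[[Any], Any],
                 scenario_inputs: Dict[str, Iterable[Any]],
                 limit_s: float, repeats: int = 5) -> Dict[str, Tuple[float, bool]]:
    """Map each scenario name to (longest observed time, within limit?)."""
    report = {}
    for name, inputs in scenario_inputs.items():
        worst = max_observed_time(step, inputs, repeats)
        report[name] = (worst, worst <= limit_s)
    return report


def slow_step(_observation):
    # Hypothetical control-loop step that takes about 50 ms.
    time.sleep(0.05)


# Assumed limit of 10 ms per control-loop iteration.
report = check_timing(slow_step, {"roundabout": [0]}, limit_s=0.01, repeats=2)
```

Here the `roundabout` scenario is reported as exceeding the limit. Evidence from measurement alone would normally be complemented by analysis on the target hardware.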