To achieve reliability goals, software development teams subject their applications to rigorous verification processes in pursuit of zero defects. Yet Edsger Dijkstra famously observed that testing can show only the presence of errors, never their absence.
If testing cannot prove the absence of critical run-time errors, how can embedded software development teams be certain that their software is free of these errors?
Code verification based on mathematical proof is a solution worth considering. Recent advances in scalable, high-performance mathematical techniques for software verification are making it practical to prove the absence of run-time errors in software.
The complexity of embedded software in high-integrity systems continues to increase. In military applications, the avionics software for the F-22 Raptor consists of 1.7 million lines of code, and the avionics software for the F-35 is expected to reach 5.7 million lines of code. For commercial airliners, the Boeing 787 flight-control system will have about 6.5 million lines of code. The increased software content and complexity of these aircraft compound the risk of failure and complicate the process of achieving high confidence in software reliability.
Studying past embedded device failures can be instructive in understanding coding-related issues. For example, the failure of a test flight of an expendable rocket was traced to a code defect. In this particular situation the launch vehicle self-destructed less than a minute after liftoff because the vehicle encountered high aerodynamic loads as a result of an angle of attack that exceeded the prescribed safety limit.
Post-incident investigation revealed the root cause of failure: an overflow resulting in a run-time error in the embedded software. The overflow, found in a pair of redundant inertial reference systems that determined the rocket's attitude and position, occurred when a 64-bit floating-point number was converted to a 16-bit signed integer; the resulting erroneous data drove the rocket nozzles to an extreme position. The redundant backup system did not help because it implemented the same behavior.
Run-time errors such as the one described above represent a specific class of software errors known as latent faults. The faults exist in the code, but unless specific tests are run under particular conditions, they will not be detected during system testing. The code can therefore appear to function normally yet cause unexpected system failures in the field. Examples of run-time errors include noninitialized data, out-of-bounds array access, null pointer dereference, overflow and underflow, incorrect computation, concurrent access to shared data, and illegal type conversions.
Traditionally, software verification at the source-code level has involved code reviews, static analysis, and dynamic testing. Each of these approaches has its shortcomings.
Code reviews, which rely solely on the expertise of the reviewers, become tedious and error-prone on large code bases. Classic static analysis techniques, which rely mainly on pattern matching to detect unsafe code constructs, cannot prove the absence of run-time errors. And with the increasing complexity of embedded software, dynamically testing all operating conditions may be impossible, which reinforces Dijkstra's position that testing can show only the presence of errors, never their absence.
A verification method known as abstract interpretation builds on static analysis methods by using formal mathematical proofs, which can identify certain run-time errors or prove their absence. Abstract interpretation is applied to the source code without executing it.
Abstract interpretation as a proof-based verification method can be illustrated by multiplying three large integers in the following equation: –4586 × 34,985 × 2389 = ?
Computing the answer by hand is time-consuming, but the rules of multiplication show immediately that the result will be negative. This is abstract interpretation at work: the technique yields certain properties of the final result, such as its sign, without fully performing the multiplication. By the rules of multiplication alone, the result of this computation can never be positive.
In a similar manner, abstract interpretation can be applied to software semantics to prove certain properties of the software. Abstract interpretation bridges the gap between conventional static analysis techniques and dynamic testing by verifying certain dynamic properties of source code without executing the program itself.
Abstract interpretation investigates all possible behaviors of a program, that is, all possible combinations of values, in a single pass to determine how and under what conditions the program may exhibit certain classes of run-time failures. The technique can be proven mathematically to be sound with respect to the classes of errors it considers: if it reports no error of a given class, no such error can occur.
Abstract interpretation can be used as a static analysis tool to detect and mathematically prove the absence of certain run-time errors in source code, such as overflow, divide by zero, and out-of-bounds array access. The verification is performed without requiring program execution, code instrumentation, or test cases.
Polyspace code verification products from MathWorks rely on this type of static analysis. The input to a Polyspace product is C, C++, or Ada source code, which is examined to determine where potential run-time errors could occur. Polyspace then generates a color-coded report that indicates the status of each element in the code.
Checks tagged green are proven free of the run-time errors under consideration. Where checks are flagged red (a proven run-time error), gray (unreachable code), or orange (a possible run-time error), software developers and testers can use the information generated by the verification process to fix the identified problems.
A code verification solution that includes abstract interpretation can be instrumental in establishing a high-quality development process and in achieving high integrity in embedded devices.
Jay Abraham, Polyspace Product Marketing Manager at MathWorks, wrote this article for Aerospace Engineering.