Research and Advances
Computing Applications India Region Special Section: Hot Topics

Impactful Research and Tooling for Program Correctness

Posted
checklist in front of a laptop computer, illustration
  1. Introduction
  2. VeriAbs
  3. VeriFuzz
  4. References
  5. Authors
checklist in front of a laptop computer, illustration

In 2020, poor-quality software systems led to financial losses of approximately USD 2.08 trillion in the U.S. alone.19 Formal methods, such as bounded model checking (BMC), help to improve software quality, but they often fail to scale to the size and complexity of software. To solve these problems, we have been developing two frameworks—VeriAbs and VeriFuzz—with novel techniques, tools, and strategies.

Back to Top

VeriAbs

Aim. Prove code correctness or generate tests that demonstrate correctness violations.

Innovation. VeriAbs implements several novel verification techniques, such as loop shrinking,20 aimed at verifying programs with complex loops, and improving the operational efficiency of popular verification techniques, such as k-induction.3,9 Further, in our experience, the application of an appropriate sequence of techniques is more effective at verifying complex programs than any single technique. As shown in Figure 1, VeriAbs implements four novel strategies aimed at verifying different kinds of code, where each strategy consists of a sequence of verification techniques:

f1.jpg
Figure 1. VeriAbs architecture (S: Program Safe, F: Property Fails, U: Unknown).

Strategy 1 comprises only k-induction, which is effective on loops with unstructured control flow. This does not occur in real-world code but in synthetic programs to challenge verification engines. In practice, we switch off the first strategy and retain k-induction as the final step of the default strategy.

Strategy 2 is useful for verifying quantified properties, such as whether arrays are sorted. This kind of code is commonly found in the automotive sector, where domain-specific parameters and configurations are coded using arrays and are accessed using pointers. This strategy correctly verifies 87% of all quantified properties in automotive applications.17

Strategy 3 aims to verify implementations of reactive systems and email clients that admit inputs to loops with very short ranges. As a result, 50% of the properties of such benchmarks in the International Competition on Software Verification (SV-COMP)5 could be verified using this strategy.

Strategy 4 is the default strategy of VeriAbs and handles programs containing linear acceleration of variables, which are very common in practice. This strategy works well in practice; it helped not only with the verification of real-world software13 but also to eliminate false positives from those generated by static analysis tools,10 such as the TCS Embedded Code Analyzer (ECA).18

Impact. The main technical novelty of VeriAbs is in the various abstractions it implements to scale program verification, which resulted in four publications7,13,17,20 and six patents. The architectural advances in VeriAbs are published in a tool paper1 and several competition-contribution papers.2,9,12,15 To benchmark against other verifiers, VeriAbs has participated in SV-COMP since 2017, competing with more than 30 state-of-the-art verifiers. The results show that VeriAbs solves about 20% more programs than the competition,5 helping VeriAbs win a gold medal in SV-COMP’s ReachSafety category each year since 2019.

VeriAbs has also been used in different industrial projects.8,10,14,16,21 This experience helped us refine the strategies in VeriAbs to successfully verify code sizes of 1.5 MLOC for properties such as array index out of bounds, division by zero, variable value overflow and underflow, and null pointer dereferences. VeriAbs verified around 60% of these properties across domains including automotive, office automation, networking, device drivers, and security.

Back to Top

VeriFuzz

Aim. Automatic test generation for the discovery of complex software errors related to functionality, security, and privacy.

Innovation. Figure 2 shows the architecture of the VeriFuzz framework, which combines coverage guided fuzzing (CGF), BMC, and static analysis (SA) technology in novel ways for automatic test-case generation for programs. VeriFuzz uses the open source engines AFL24 and CBMC6 respectively for CGF and BMC, and TCS’s in-house SA framework PRISM.18

f2.jpg
Figure 2. VeriFuzz architecture.

The input to VeriFuzz is a program P and a desired coverage ∅. First, VeriFuzz extracts some syntactic features and classifies P into a category K, which determines the parameters to the BMC and the fuzzer. Next, P is instrumented to measure coverage, and BMC is then invoked to generate an initial test corpus Tc. The fuzzer invokes Pe, the executable of instrumented P, with the initial corpus Tc and measures the initial coverage. Each test from Tc is mutated several times to generate a set of new test inputs Tg. Pe is repeatedly executed with each test input tgTg and coverage is measured. The fitness check determines whether tg should be added to Tc or not. This process is repeated until ∅ is achieved. VeriFuzz is efficient (takes only a few minutes) and effective (able to achieve ∅).

Impact. VeriFuzz combines the strengths of BMC and CGF for efficient and effective test generation,22 which led to a nomination for a best paper award. The architectural advances in VeriFuzz are published in competition-contribution papers.11,23 To benchmark it against state-of-the-art test engines, VeriFuzz has participated in the International Competition on Software Testing (Test-Comp)4 each year since 2019, competing with more than 10 state-of-the-art test engines and evaluated on more than 3,000 C benchmarks. VeriFuzz has earned seven gold and five silver medals in Test-Comp and discovered errors 5- to 30-times faster than other top tools. In the industry, VeriFuzz was recently evaluated on an already well-tested smart contract implementation, where it exposed security errors, especially in the handling of Unicode string character buffers that were hitherto undiscovered.

    1. Afzal, M. et al. VeriAbs: Verification by abstraction and test generation. In Proceedings of the Intern. Conf. on Automated Software Engineering (2019), 1138–1141.

    2. Afzal, M. et al. Veriabs: Verification by abstraction and test generation. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (2020), 383–387.

    3. Beyer, D., Dangl, M., and Wendler, P. Boosting k-induction with continuously refined invariants. CAV (2015), 622–640.

    4. Beyer, D. Advances in automatic software testing: Test-Comp 2022. In Proceedings of the 25th Intern. Conf. Fundamental Approaches to Software Engineering. LNCS 13241, Springer.

    5. Beyer, D. Progress on software verification: SV-COMP 2022. In Tools and Algorithms for the Construction and Analysis of Systems, LNCS 13244, Springer.

    6. C Bounded Model Checker; https://github.com/diffblue/cbmc.

    7. Chakraborty, S., Gupta, A., and Unadkat, D. Diffy: Inductive reasoning of array programs using difference invariants. Computer Aided Verification (2021), 911–935.

    8. Chimdyalwar, B. and Darke, P. Statically relating program properties for efficient verification. In Proceedings of the Intern. Conf. on Languages, Compilers, and Tools for Embedded Systems (2018), 99–103.

    9. Chimdyalwar, B. et al. Veriabs: Verification by abstraction. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (2017), 457–462.

    10. Chimdyalwar, B. et al. Eliminating static analysis false positives using loop abstraction and bounded model checking. In Proceedings of the Intern. Symp. on Formal Methods (2015), 573–576.

    11. Chowdhury, A.B., Medicherla, R.K., and Venkatesh, R. Verifuzz: Program aware fuzzing. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (April 2019), 244–249.

    12. Darke, P., Agrawal, S., and Venkatesh, R. VeriAbs: A tool for scalable verification by abstraction. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (2021).

    13. Darke, P. et al. Over-approximating loops to prove properties using bounded model checking. Design, Automation, and Test in Europe Conf. (2015), 1407–1412.

    14. Darke, P. et al. Precise analysis of large industry code. Asia-Pacific Conf. on Software Engineering 2 (December 2012), 306–309.

    15. Darke, P. et al. Veriabs: Verification by abstraction and test generation. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (2018), 457–462.

    16. Darke, P. et al. Efficient safety proofs for industry-scale code using abstractions and bounded model checking. In Proceedings of IEEE 2017 Intern. Conf. Software Testing, Verification, and Validation, 468–475.

    17. Jana, A. et al. Scaling bounded model checking by transforming programs with arrays. In Proceedings of the 26th Intern. Symp. on Logic-Based Program Synthesis and Transformation (September 2016), 10184. LNCS Springer, 275–292.

    18. Khare, S., Saraswat, S., and Kumar, S. Static program analysis of large, embedded code base: An experience. In Proceedings of the 2011 India Software Engineering Conf.

    19. Krasner, H. The cost of poor software quality in the US: A 2020 report. Consortium for Information and Software Quality; https://bit.ly/3QgMN4p.

    20. Kumar, S., Sanyal, A., Venkatesh, R., and Shah, P. Property checking array programs using loop shrinking. In Proceedings of the Intern. Conf. Tools and Algorithms for the Construction and Analysis of Systems (2018), 213–231.

    21. Kumar, S., Chimdyalwar, B., and Shrotri, U. Precise range analysis on large industry code. In Proceedings of the ESEC/FSE 2013, 675–678.

    22. Metta, R., Medicherla, R.K., and Chakraborty, S. BMC+Fuzz: Efficient and effective test generation to appear. In Proceedings of the Design, Automation, and Test in Europe Conf. and Exhibition (March 2022). IEEE.

    23. Metta, R., Medicherla, R.K., and Hrishikesh, K. Verifuzz 1.2: Seeds for fuzzing from BMC with remedies. In Proceedings of the 25th Intern. Conf. on Fundamental Approaches to Software Eng. (April 2022). LNCS 13241, Springer.

    24. Zalewski, M. American fuzzy lop; http://lcamtuf.coredump.cx/afl/.

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More