Why you should really care about C/C++ static analysis

Many resources discuss the benefits of static analysis tools and how they can help you improve your code base. They show you what you gain by using them. But have you asked yourself what you lose if you don’t use them?

Let’s take the example of memory corruption caused by freeing a pointer twice, which leads to random crashes. It can take a few hours, or many days, to track down this kind of issue. Many similarly risky problems exist in C/C++, especially concerning memory corruption. A single problem can cost a few dollars or many thousands of dollars.
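To make the double-free case concrete, here is a minimal sketch. The buggy shape is kept in a comment so the snippet stays safe to run, and the function below it shows one possible way to rule the bug out by construction. The names `FreeDeleter`, `CBuffer` and `copy_message` are hypothetical and chosen only for illustration.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Bug-prone shape (comment only, never executed):
//
//   char* buf = static_cast<char*>(std::malloc(16));
//   if (error) { std::free(buf); }  // first free on the error path
//   std::free(buf);                 // second free: undefined behavior
//
// This is the kind of defect a static analyzer is designed to report.

// Hypothetical helper: lets std::unique_ptr release malloc'ed memory.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};
using CBuffer = std::unique_ptr<char, FreeDeleter>;

// Copies msg through a heap buffer. Whatever path is taken, the buffer
// is released exactly once, when buf goes out of scope.
std::string copy_message(const char* msg, bool simulate_error) {
    const std::size_t size = std::strlen(msg) + 1;
    CBuffer buf(static_cast<char*>(std::malloc(size)));
    if (!buf) {
        return std::string();
    }
    std::memcpy(buf.get(), msg, size);
    if (simulate_error) {
        return "error";  // early return, no manual free needed
    }
    return std::string(buf.get());
}
```

Because ownership lives in a single object, there is no second `free` left to write by mistake on the error path.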

The impact of an issue also depends on the nature of the program. A problem in an embedded application controlling a machine does not have the same impact as a crash in a paint application. Sometimes a single problem can cost many millions of dollars or more, as in the case of Ariane 5, where a software bug destroyed the rocket on its maiden flight; the program had cost roughly $7 billion to develop.

What do you lose if you use a static analysis tool?

Let’s take cppcheck as an example. It primarily detects the types of bugs that compilers normally do not detect, and it reports many interesting errors.

You need less than a minute to download it and maybe 20 minutes to configure it. The analysis takes anywhere from a few minutes to many hours, but during that time you are free to do other tasks. After the analysis you may have thousands of potential issues; in the beginning you can focus only on the high-priority errors.

In the end, with a free static analysis tool you lose only about 30 minutes to get a list of potential issues that could cost you many thousands of dollars.

With commercial tools you lose more than time: you have to pay for them, so you also lose money. Suppose you purchase a tool for $1000 and it helps you find a problem that would otherwise take a developer two or three days to find.

Three days of a C/C++ developer’s time can cost more than $1000, depending of course on where the company is located. But if you take into account the hidden costs of an issue, you will be surprised how much a simple issue can cost the company. Many stories on the web describe the cost of simple issues.

Here are some free static analysis tools:

CppCheck (Free): CppCheck provides many checks. Here are some of them:

Out of bounds checking
Checking exception safety
Memory leaks checking
Warn if obsolete functions are used
Check for invalid usage of STL
Check for uninitialized variables and unused functions
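As an illustration of the "invalid usage of STL" category, the sketch below shows a classic iterator misuse in a comment, followed by a corrected loop. Whether a given tool version flags this exact shape is an assumption; the function name `drop_negatives` is hypothetical.

```cpp
#include <vector>

// Bug-prone shape (comment only, never executed):
//
//   for (auto it = v.begin(); it != v.end(); ++it)
//       if (*it < 0) v.erase(it);  // 'it' is invalidated, then incremented
//
// Corrected version: erase() returns the iterator to the next element,
// so the loop only advances manually when nothing was erased.
std::vector<int> drop_negatives(std::vector<int> v) {
    for (auto it = v.begin(); it != v.end();) {
        if (*it < 0) {
            it = v.erase(it);
        } else {
            ++it;
        }
    }
    return v;
}
```

The buggy and corrected loops look almost identical in review, which is why this kind of check is better left to a tool.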

Clang (Free): Clang is a C/C++ compiler whose diagnostics are very interesting. You may be surprised by how relevant the reported issues are. They can concern:

Deprecated usage
Cast problems
Initialisation problems
OpenMP issues and more.
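The following sketch touches two of these categories, deprecated usage and cast problems. It assumes a hypothetical pair of helpers, `unchecked_narrow` and `checked_narrow`; the deprecation warning mentioned in the comment only appears if the deprecated function is actually called.

```cpp
#include <cstdint>
#include <limits>

// Calling this function makes Clang emit a deprecation warning
// (-Wdeprecated-declarations); merely declaring it does not.
[[deprecated("use checked_narrow instead")]]
inline std::int16_t unchecked_narrow(double value) {
    // Undefined behavior when value does not fit in 16 bits.
    return static_cast<std::int16_t>(value);
}

// Safer replacement: returns false and leaves 'out' untouched when the
// value is NaN or outside the int16_t range; otherwise truncates toward
// zero and stores the result.
bool checked_narrow(double value, std::int16_t& out) {
    const double lo = std::numeric_limits<std::int16_t>::min();
    const double hi = std::numeric_limits<std::int16_t>::max();
    if (!(value >= lo && value <= hi)) {  // also rejects NaN
        return false;
    }
    out = static_cast<std::int16_t>(value);
    return true;
}
```

Marking the old function as deprecated lets the compiler point out every remaining call site, which is a cheap way to migrate a code base gradually.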

Clang-tidy (Free): a clang-based C++ “linter” tool. Its purpose is to provide an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis. clang-tidy is modular and provides a convenient interface for writing new checks. The clang-tidy documentation lists all the available checks.

Many other static analysis tools exist. Some are easy to try out; for others you have to contact the vendor and ask for a trial version.

If you can spare just 30 minutes to run cppcheck, you can be sure it will not be a waste of your time.

What about Bug-prone situations?

Static analysis is not only about directly finding bugs, but also about finding bug-prone situations that can decrease code understanding and maintainability. Static analysis can handle many other properties of the code:

Code metrics: for example, methods with too many loops, if, else, switch, case… end up being non-understandable, hence non-maintainable. Counting these through the code metric Cyclomatic Complexity is a great way to assess when a method becomes too complex.
Dependencies: if the classes of your program are entangled, the effects of any change in the code become unpredictable. Static analysis can help to assess when classes and components are entangled.
Immutability: types that are used concurrently by several threads should be immutable, otherwise you’ll have to protect state read/write access with complex lock strategies that will end up being unmaintainable. Static analysis can make sure that some classes remain immutable.
Dead code: dead code is code that can be removed safely, because it is not invoked anymore at runtime. Not only can it be removed, but it must be removed, because this extra code adds unnecessary complexity to the program. Static analysis can find most of the dead code in your program (yet not all).
API breaking change: if you present an API to your clients, it is very easy to remove a public member without noticing, thus breaking your clients’ code. Static analysis can compare two states of a program and warn about this pitfall.
API usage: some APIs are intended to be used carefully. For example, a class that holds disposable fields must in general be disposable itself, except when the disposable field’s lifetime is not aligned with the lifetime of the class instances, which then sounds like a design problem.
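To give a feel for the code metrics item, here is a deliberately rough sketch of a cyclomatic complexity estimate: one plus the number of decision points found in a method body. It is an assumption-laden toy, not how a real analyzer works: it scans raw text rather than an AST, and it does not skip comments or string literals. The function name `approx_cyclomatic_complexity` is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Estimates cyclomatic complexity as 1 + number of decision points.
// Decision points counted: if, for, while, case, catch, &&, || and ?.
int approx_cyclomatic_complexity(const std::string& body) {
    static const std::set<std::string> decision_keywords = {
        "if", "for", "while", "case", "catch"};
    int complexity = 1;
    std::size_t i = 0;
    while (i < body.size()) {
        const unsigned char c = static_cast<unsigned char>(body[i]);
        if (std::isalpha(c) || c == '_') {
            // Read a whole identifier or keyword.
            const std::size_t start = i;
            while (i < body.size() &&
                   (std::isalnum(static_cast<unsigned char>(body[i])) ||
                    body[i] == '_')) {
                ++i;
            }
            if (decision_keywords.count(body.substr(start, i - start)) > 0) {
                ++complexity;
            }
        } else if (i + 1 < body.size() &&
                   ((c == '&' && body[i + 1] == '&') ||
                    (c == '|' && body[i + 1] == '|'))) {
            ++complexity;  // short-circuit operators add a branch
            i += 2;
        } else {
            if (c == '?') {
                ++complexity;  // conditional operator
            }
            ++i;
        }
    }
    return complexity;
}
```

A real tool computes the metric from the parsed code, but the idea is the same: each extra branch raises the number, and a threshold on that number flags methods that have become hard to follow.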

Many interesting tools exist to detect bugs in your C++ code base. But what about the detection of the bug-prone situations?

While the creators of static analysis tools can decide which situations count as bugs, that is not the case for bug-prone situations, which depend on the development team’s choices. For example, one team may consider a method with more than 20 lines to be complex, while another may set the maximum at 30. If a tool detects bug-prone situations, it must also make that detection customizable.
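The 20-versus-30-lines example can be sketched as a rule whose threshold is a parameter rather than a constant. Everything here is hypothetical (`MethodMetrics`, `RuleConfig`, `too_long_methods`); it only illustrates why the limit should belong to the team's configuration and not to the tool.

```cpp
#include <string>
#include <vector>

// Hypothetical per-method data that an analyzer would collect.
struct MethodMetrics {
    std::string name;
    int lines_of_code;
};

// Team-specific settings; the default of 20 is only an example value.
struct RuleConfig {
    int max_method_lines = 20;
};

// Returns the names of the methods longer than the configured limit.
std::vector<std::string> too_long_methods(
    const std::vector<MethodMetrics>& methods, const RuleConfig& config) {
    std::vector<std::string> flagged;
    for (const MethodMetrics& m : methods) {
        if (m.lines_of_code > config.max_method_lines) {
            flagged.push_back(m.name);
        }
    }
    return flagged;
}
```

With the same input, a team using a limit of 20 and a team using a limit of 30 get different reports, and both are right for their own code base.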

Code as Data is a better way to detect bug-prone situations

Static analysis is the idea of analyzing source code for various properties and reporting on those properties, but it’s also, philosophically, the idea of treating code as data. This is deeply weird to us as application developers, since we’re very much used to thinking of source code as instructions, procedures, and algorithms. But it’s also deeply powerful.

After analyzing a source file, we can extract its AST and generate a model containing a lot of interesting information about the code. We can then query this model using a code query language similar to SQL.
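To illustrate the idea of querying code like a database, here is a small stand-in written in plain C++. It is not a real query language: `MethodRecord` and `most_complex_methods` are hypothetical, and the model is assumed to have been filled in by an earlier analysis step. The function plays the role of a "select the top N methods where complexity is above a minimum, ordered by complexity" query.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row of the code model: one record per method.
struct MethodRecord {
    std::string full_name;
    int cyclomatic_complexity;
    int lines_of_code;
};

// Keeps the methods whose complexity is strictly above min_complexity,
// sorts them from most to least complex, and returns at most 'limit'.
std::vector<MethodRecord> most_complex_methods(
    std::vector<MethodRecord> methods, int min_complexity,
    std::size_t limit) {
    // WHERE clause: drop everything at or below the threshold.
    methods.erase(
        std::remove_if(methods.begin(), methods.end(),
                       [min_complexity](const MethodRecord& m) {
                           return m.cyclomatic_complexity <= min_complexity;
                       }),
        methods.end());
    // ORDER BY clause: most complex first, ties keep their input order.
    std::stable_sort(methods.begin(), methods.end(),
                     [](const MethodRecord& a, const MethodRecord& b) {
                         return a.cyclomatic_complexity >
                                b.cyclomatic_complexity;
                     });
    // TOP clause: truncate to the requested number of rows.
    if (methods.size() > limit) {
        methods.erase(methods.begin() + static_cast<std::ptrdiff_t>(limit),
                      methods.end());
    }
    return methods;
}
```

Once the code is available as records like these, changing the rule means changing a filter or a threshold, not writing a new analyzer.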

CppDepend provides a powerful code query language named CQLinq to query the code base like a database. Developers, designers and architects can define their own custom queries to easily find bug-prone situations.

With CQLinq we can combine data from the code metrics, dependencies, API usage and other model information to define very advanced queries that match specific bug-prone situations.

Here’s an example of a CQLinq query that matches the most complex methods:

Summary

It’s better to combine several C++ tools to detect problems in your C++ code base: some tools detect bugs, while others detect bug-prone situations.