Top Reasons Not to Use Static Analysis
We all know that finding bugs as soon as possible saves time and money. Static analysis is one of the best ways of finding bugs early, yet in my experience, very few development teams have built it into their process. This post looks at four of the most common objections to using static analysis and gives some suggestions for overcoming them.
Static analysis is for finding mistakes, not real bugs.
There are two problems with this objection. First, it’s no longer true. One of our customers identified 50 bugs the first time they used static analysis. The cost of those bugs reaching even QA would have exceeded the cost of bringing in static analysis. If any of those bugs had reached production, the costs would have been even higher. At another customer site, we started the day by presenting a brief summary of static analysis results to the development team. Before we left that afternoon, over 20 bugs had been fixed in the code.
There is an ongoing debate in the academic community about the effectiveness of static analysis for identifying bugs. My view is that static analysis is fundamentally limited in its ability to detect bugs. It can go some way towards telling you whether your code reliably implements the algorithm you intended; it has no way of telling you whether that algorithm is appropriate for solving the problem at hand. But even if static analysis can only find some of your bugs, say one in ten, why on earth would you not use it, so that you can concentrate on finding the hard ones?
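To make the distinction concrete, here is an illustrative sketch (not taken from the post; the class and method names are my own invention) of the kind of "shallow" defect static analysis excels at finding, and the kind of algorithmic question it cannot answer:

```java
// A classic Java defect that most static analyzers flag immediately, but
// that unit tests can easily miss because it "works" for interned literals.
public class OrderValidator {

    // Bug: == compares String references, not contents. This appears to work
    // when both sides are compile-time literals, then fails for strings built
    // at runtime. A static analyzer catches this pattern on day one.
    static boolean isExpressBroken(String status) {
        return status == "EXPRESS";          // flagged: use equals()
    }

    static boolean isExpressFixed(String status) {
        return "EXPRESS".equals(status);     // correct, and null-safe
    }

    public static void main(String[] args) {
        // Built at runtime, so it is a distinct object from the literal.
        String runtime = new StringBuilder("EXP").append("RESS").toString();
        System.out.println(isExpressBroken(runtime)); // false -- the bug
        System.out.println(isExpressFixed(runtime));  // true
    }
}
```

Note what the tool cannot do: it can tell you that the comparison is wrong, but it has no idea whether checking for `"EXPRESS"` is the right business rule in the first place. That judgement remains yours.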
It’s too noisy
A common reaction when talking to developers about static analysis tools is a rolling of the eyes and the recounting of the time they tried to use Lint on their project. It ran for hours and produced tens of thousands of warnings. No one had any idea what to do with the reams of data from the tool, so it went into a folder to be looked at some day and that was the end of the experiment.
The solution is to choose, and configure, a tool that only generates data you are actually going to use. Different static analysis tools have different philosophies on how to deal with false positives. A false positive is a situation where the tool reports a problem with the code even though the code is actually correct. Typically, the more ambitious the rule, the greater the likelihood of false positives. For example, highlighting class names that don't match some predefined template is straightforward and will essentially never produce a false positive. Identifying possible race conditions in a multi-threaded program is far more valuable, but also much more likely to produce false positives.
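As an illustrative sketch of why the ambitious rules are noisier (the class and method names here are hypothetical, not from any particular tool's documentation), consider the classic check-then-act pattern that a race-condition detector will flag. Whether it is a true bug depends on context the tool often cannot see, namely whether the object is ever shared between threads:

```java
import java.util.HashMap;
import java.util.Map;

public class Cache {
    private final Map<String, String> entries = new HashMap<>();

    // A detector flags this: between containsKey() and put(), another thread
    // could insert the same key. That is a genuine race if this object is
    // shared across threads -- and a false positive if it never is.
    public void putIfAbsentRacy(String key, String value) {
        if (!entries.containsKey(key)) {
            entries.put(key, value);
        }
    }

    // If the object IS shared, the fix is to make check-then-act atomic.
    public synchronized void putIfAbsentSafe(String key, String value) {
        if (!entries.containsKey(key)) {
            entries.put(key, value);
        }
    }

    public String get(String key) {
        return entries.get(key);
    }
}
```

The naming rule, by contrast, needs nothing but the class declaration itself, which is why it can be checked with near-perfect precision.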
This objection stems primarily from experience of using static analysis in the C++ world. C++ is a terrible language for tool vendors to handle. Only a handful of people in the world are capable of writing an accurate parser that reads and understands C++ source files in all their template-ridden complexity. On top of this, you have to deal with the preprocessor transforming the visible source code into something quite different before it hits the compiler. Unless your static analysis tool sees exactly the same set of preprocessor defines that your compiler is using, you can expect to be overwhelmed by spurious or plain incorrect warnings.
It takes too long
I am still baffled by the number of static analysis tools that need to be explicitly executed by the developer. Eclipse made great strides with its smart incremental compiler, guaranteeing that your Java code is always fully compiled and up to date, yet the static analysis framework within TPTP has to be invoked from a menu.
Developers are busy, and the only way to make static analysis work is to integrate it completely into their existing workflow. If you treat a static analysis violation exactly the same way you would treat a compiler warning, there is no overhead to using static analysis. And if your existing tool won't let you work that way, find one that will.
I’ll use it for my next project
I'm in the middle of a project right now; I'll use it for my next project. I'll write clean code next time, but for now I need to deal with an existing code base that wasn't written with static analysis in place and contains too many violations to manage.
The solution is the same: a zero-tolerance policy. Your entire project must build with no static analysis warnings at all times. That means you must use a tool that supports escapements, i.e., the ability to suppress messages from the static analysis tool in situations where you know the code is correct. This is usually achieved by inserting magic comments or, in Java, annotations.
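As a minimal sketch of an annotation-based escapement in Java (the class, method, and scenario here are hypothetical, chosen only to illustrate the discipline), the standard `@SuppressWarnings` annotation silences a specific warning at the narrowest possible scope, paired with a comment recording why the code is actually correct:

```java
import java.util.ArrayList;
import java.util.List;

public class LegacyAdapter {

    // Escapement: a hypothetical legacy API returns a raw List that its
    // documented contract says only ever contains Strings. Rather than leave
    // an unchecked-cast warning in every build, we suppress it on just this
    // method and record the justification for the next maintainer.
    @SuppressWarnings("unchecked")
    static List<String> fromLegacy(List raw) {
        // Safe: the legacy endpoint is specified to return only Strings.
        return (List<String>) raw;
    }

    public static void main(String[] args) {
        List raw = new ArrayList();
        raw.add("hello");
        System.out.println(fromLegacy(raw).get(0));
    }
}
```

The important points are the narrow scope of the suppression and the written justification; a blanket, project-wide suppression would defeat the zero-tolerance policy entirely. Tool-specific annotations work the same way.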
Some static analysis is more valuable than no static analysis. Most tools have a severity setting, so try enabling only the most serious violations and then running the tool against your code. The chances are it will find few, if any, violations. The most serious violations typically correspond to bugs, and your existing unit and functional tests have probably flushed out those bugs already.
If there are still too many violations, then there's probably a mismatch between your coding style and the default settings for the tool. On your next project you can review your coding standards, but for now, simply disable any rules that are too noisy to deal with. Remember, the goal is to get static analysis up and running on the project that you have today.
Once you get the number of violations down to a manageable level, take the time to check each message and resolve it. This will typically turn up a few bugs, but in most cases the solution will be to mark up the source code to suppress an incorrect warning. That's not a bad thing in itself: if the tool is confused by the code, then the next maintenance programmer probably will be too, and they'll thank you for the explanatory comment.