I’m working at a company that would score 11 on the Joel Test – at least on paper.
In practice, however, nothing works quite as well as expected, and the project has been on DEFCON 1 for half a year. Now, most of my peers are happy if they can go back home at 6pm – on Sunday.
One of the apparently good practices that struck me as not working is the use of static analysis tools. The project tracks both gcc -Wall warnings and the output of a proprietary, very expensive C/C++ tool.
The gcc warnings more often than not point to real (if mostly harmless) bugs.
The proprietary tool, however, lists things such as implicit casts and taking sizeof of a string literal. Implicit casts are also blacklisted in their stylebook.
The standard practice is that people are pressed to make every single warning shut up. Note that this excludes warnings that are predominantly false positives; that is not the problem.
The result is:
- People add type casts to every rvalue and to every argument, hiding real, problematic type mismatches in the process.
- People introduce off-by-one bugs, or reach for a different problematic language feature (strlen instead of sizeof, strncpy instead of strcpy, etc.).
- The warnings are silenced.
- The bug reports start rolling in.
The main point is that the original code worked and was written by people who were playing it safe within their abilities in the language, whereas the fixes were not.
Now, I don’t really think this company can be saved. However, I would like to know whether there is a better, preferably working, way to use the “pro” tools, or whether I should avoid them altogether in case I am the one making the decision in the future.
I’m looking for a solution which doesn’t assume all programmers are geniuses who can’t err. Because if they were, there would be no need for the tools in the first place.
3
In practice, however, nothing works quite as well as expected, and the
project has been on DEFCON 1 for half a year. Now, most of my peers
are happy if they can go back home at 6pm – on Sunday.
This is certainly a good chunk of your problem. A software engineer cannot work productively for more than 40 hours/week. If you go above that, you do serious damage to your capacity to work, to the point at which even working 80 hours/week brings very little value to the project; sometimes projects can even regress because the team introduces more errors than it can fix. If you’ve been doing 60 hours/week or more for half a year, then your whole team needs a good long break, and should come back at 40 hours/week. You can’t solve anything with a workforce that’s barely capable of bashing on the keyboard because they’re so overworked.
The proprietary tools, however, list things such as implicit casts and
sizeof’ing a string literal. Implicit casts are also blacklisted on
their stylebook.
You really need to either just drop the tool entirely, or reconfigure/replace it. Warning on every implicit cast is far too much for anyone to deal with.
3
I’m working at a company that would score 11 on Joel Test – at least on paper.
That test is only mildly relevant to software companies. I don’t understand the hype around it. You can score 12 and still have 100% crap programmers. People matter far more than tools.
The proprietary tools, however, list things such as implicit casts and sizeof’ing a string literal. Implicit casts are also blacklisted in their stylebook. The standard practice is that people are pressed to make every single warning shut up. Note that this excludes warnings that are predominantly false positives; that is not the problem.
This is the largest problem with all static analysers: too many false positives. The only way to handle this is to learn in detail why the tool was set to give a warning for a certain issue. Only then can you make qualified assumptions of whether a warning is false or not.
For the specific case of implicit typecasts, the warning exists because of some very common, subtle and dangerous bugs in the C language, caused by the two (moronic) implicit type promotion rules in C. They are formally known as the integer promotions and the usual arithmetic conversions. I think fewer than one in ten professional C programmers can explain what these two rules do, and what they don’t do. And yet these rules are absolutely fundamental and are applied, between the lines, thousands of times in any normal C program.
The MISRA-C standard devotes a whole chapter to explaining these dangerous implicit conversion rules, and then adds numerous fairly complex rules for how to avoid bugs caused by implicit conversions. This has had quite an impact on all static analysers on the market.
I would recommend that any C programmer read that MISRA-C chapter, or at the very least Google the two promotion rules I mentioned and read as much about them as you can. You can only ignore the static analyser once you know everything there is to know about these rules.
Now, I don’t really think this company can be saved. However, I would like to know if there is a better, preferably working, way to use the “pro” tools or if I should just avoid using them altogether in case I am the one making the decision in the future.
A solution which doesn’t assume all programmers are geniuses that can’t err. Because well, if they are, then there is no need to use the tools in the first place.
There is no easy way around it. The real problem is actually not the tool, but the obscure C language. It may seem an easy language at a brief glance, but there are plenty of illogical and strange mechanisms going on between the lines. And there are also hundreds of cases of undefined, unspecified and implementation-defined behavior in the language. Most of them you must learn and avoid.
So you must either become a veteran C programmer that understands all these obscure warnings, or you must be in a team with at least one such person. Companies that only hire a bunch of junior programmers with no veteran in the team should not use these tools, nor should they work with high-integrity software development.
2
It is not clear whether you are asking how to fix the problem, or how you could have avoided it in the first place. I’m assuming the latter.
It sounds like you have been relying on static analysis as your primary means of detecting and avoiding bugs. Maybe you would have been better off focusing more on unit tests and dynamic tools … such as memory checkers.
It is a bad idea to rely on a tool that generates lots of false positives, especially if you can’t suppress those false positives. It encourages tired / overworked (or lazy) programmers to make “fixes” to make the warnings go away. If you can’t selectively suppress false positives (e.g. with comments) or tailor the ruleset, then you need a better tool. Or at least, you should only use it sparingly.
It sounds like you have a problem with people being careless and / or making mistakes due to overwork. Maybe you should have spent more time / effort in code reviews. This directly addresses your observation that programmers are not geniuses.
Finally, it sounds like you may have been suffering under unrealistic deadlines, poor people resourcing and / or shifting requirements. This is a management problem, and has to be addressed at that level … or else there is a significantly elevated risk of project failure, and long-term damage to staff productivity and morale.
In addition to user29079’s excellent answer, I would like to add one more deficiency of the C language which adds to the problem. I was not aware of this deficiency until I moved on to Java.
The purpose of a warning is to bring to the attention of the writer and future readers of the code the fact that there is something fishy going on. That, as a concept, is perfectly fine: the more warnings you have enabled, the more fishy things you discover.
What is dead wrong, is the notion that every single little fishy thingy must be fixed at all costs, by changing code. That is what caused the problem that the OP is describing: the botched attempts to hastily make warnings go away by altering code which is working just fine.
At the same time, we don’t want our code spewing out hundreds of warnings every time we compile, because pretty soon warnings become meaningless, and nobody pays any attention to new warnings anymore.
So, this seems to be a case of conflicting goals; a contradiction.
A very good solution to this seemingly impossible quandary is to be able to selectively suppress a certain warning on a statement by statement basis, so that we can suppress it precisely there where we want to indicate that we do really know what we are doing, without having to alter any code.
Java has a nice solution to this with the @SuppressWarnings("") annotation. If you are making, say, a so-called unchecked conversion in Java, the compiler issues a warning to bring this to your attention; you examine it, you determine that this is one more of those cases where you know what you are doing, so you precede the offending statement with a @SuppressWarnings("unchecked") annotation, which affects only the statement that immediately follows it, and you proceed with your life.
So, the warning goes away, the code remains unaltered, and the suppression annotation stands there, colored green by syntax highlighting, to document the fact that an iffy conversion is taking place, but the author promises it is good. (*) Everyone is happy.
In Java you can even prefix individual function parameters with @SuppressWarnings(), so as to disable a specific warning that may be issued on that specific parameter only.
However, as far as I know, C has never supported any standard mechanism for selectively suppressing warnings on statements, and the compilers that implemented their own proprietary mechanisms did so in a clumsy fashion, for example the #pragma warning (nnn:N) directives of Microsoft C. The problem with these directives is that you need two lines of directives in order to suppress a warning for a single statement while leaving that warning enabled for the statements that follow. Therefore, C programmers are highly unlikely to pick up the habit of suppressing a warning on an individual-statement basis, even when they know that this particular warning on this particular statement is okay.
(*) Someone might argue that if you allow this, then every programmer in the house will suppress warnings without giving them any thought; the answer to this is that you may do everything you can to help programmers do their job better, but there is nothing you can do if they actively decide to sabotage you.
1
I’d say (without knowing the details) that your company has made more significant mistakes, and this is just a symptom: it failed to properly determine the domain of the software produced, to manage the team properly, and to define reasonable goals.
If the software your team is writing is something critical that human lives depend on, static analysis tools are important and useful. Since the “cost” of a bug, measured in money (ignoring how cynical that is), is high, special tools are required for the job. There are guidelines for both C and C++ on safely using the language features (read: what to avoid and how far to go in avoiding it).
However, in such a situation having employees work on Sunday is a very bad idea; moreover, from the context you’ve provided, I extrapolate that sloppy work is not infrequent. So if this is the case, the problem is very, very big, and your management is trying to solve it by using “better” tools. Unfortunately, that’s not how things work. If I’m even close to correct, I’d go as far as suggesting that you start looking for a different job. Mission-critical software done badly is something that can haunt you and ruin your professional reputation, not to mention the prospect of lawsuits.
If, on the other hand, your team is writing something that is far less critical, ‘static analysis’ tools with a high false-positive rate will only slow you down, and, as you’ve noted, people usually react to this by finding the local maximum in their effectiveness, in this case by simply cheating their way around the warnings, instead of trying to fix the underlying reason. So in your situation, using these tools may actually deteriorate the quality of the source code, and they should generally be avoided.
If you need to manage a software team that is writing C or C++ code, I’d say first determine the main project goals. If writing bug-free code is of the utmost importance (the new super awesome social app for iOS and Android doesn’t fit this), then use static analysis, and maybe even go as far as formal specification and formal verification. But even in that case, choosing good programmers and having them work in a sane environment, with an adequate workload and a proper 40-hour week, is far more important.
People cannot work responsibly if they’re treated like slaves (and someone occupying my holidays is doing exactly that), and good programmers will simply refuse to do this on a regular basis, unless they know no better (and if they’re good, chances are they do).
Bottom line: figure out your goals, then determine the tools and the process you use; don’t let the available tools define the process (and hence the goals), as (bad) management might want to force you to.
As stated in k.steff’s answer, static analysis tools for C are useful if you are building software in a critical context (e.g., airplane software), and they must be used from day one (not after months of development, when an unreproducible bug occurs).
Using these kinds of tools late in the development process is usually a sign of bad design choices early in the process.
In non-critical software, these kinds of tools are useful to:
- Reassure management and make them feel useful (“we have done something, we have spent a lot of money to fix the software”). It gives them actions that are easy to report in their PowerPoint slides.
- Keep the ‘toxic’ developers busy. Let them configure the tool and analyze its results, so that you have time to fix their nasty bugs.
- Enforce coding rules. In that case the tool must be used from day one. But a good code review tool may be a better choice.
So my opinion is that you should avoid these kinds of expensive, time-consuming tools, or use them to keep the ‘toxic’ people busy.
I’m working at a company that would score 11 on Joel Test – at least on paper.
Validation by the Joel Test is not a measure of good practices; invalidation by the Joel Test is, however, a measure of bad or missing practices. In other words, it can be seen as necessary, but not sufficient.
Now, most of my peers are happy if they can go back home at 6pm – on Sunday.
It looks like you have a management problem. Systematic overtime is not a solution for getting things done; if anything, it is a way to make things look like they are working while accumulating technical debt.
Now, I don’t really think this company can be saved. However, I would like to know if there is a better, preferably working, way to use the “pro” tools or if I should just avoid using them altogether in case I am the one making the decision in the future.
Yes, there is a way to use static code analysis in a sane manner.
They can, and usually do, provide a lot of value. You just have to remember that (like compiler warnings) a static analysis tool can at most provide hints that something is going wrong – not bug reports, and no certainty whatsoever.
It sounds like your decision makers confuse static analysis flags with application defects.
Implicit casts are also blacklisted on their stylebook.
This sounds like a management competence problem, not a static analysis problem.
The result is:
People add type casts to every rvalue and to every argument, hiding real, problematic type mismatches in the process.
People introduce off-by-one bugs, or use a different problematic language feature (strlen instead of sizeof, strncpy instead of strcpy, etc.).
The warnings are silenced.
The bug reports start rolling in.
All of these are ways to make the periodic reports look good, not ways to fix the problems in the code (it looks like you have a management competence problem, again).