2016-03-13

Against signed integer types

It is often mentioned that Google's guidelines call to always use signed integers, and avoid unsigned.

One could argue, if I were as smart as Sergey Brin or Larry Page, I'd be as rich as they are. This may be true. But we tend to succeed for the few things we get very right, and despite the many things we don't. I'm pretty sure this one is the latter.

The preference for signed integers opens opportunities for a whole panoply of unnecessary bugs. It's a mistake of the same order as nullable pointers. You almost never need negative values, but now you always need to check for them. Signed overflow is even undefined behavior, whereas unsigned arithmetic is defined to wrap around.
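
To make the overflow point concrete, here is a minimal sketch of how an overflow check differs between the two. The function names are made up for illustration. With unsigned operands the sum can be computed first and inspected afterwards. With signed operands the check has to come before the addition, because computing an overflowing sum is already undefined.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so the sum can be
   computed first and inspected afterwards. */
bool unsigned_add_wraps(unsigned int a, unsigned int b)
{
    unsigned int sum = a + b;   /* well-defined even when it wraps */
    return sum < a;             /* a wrapped sum is smaller than either operand */
}

/* Signed overflow is undefined behavior, so the test must be done
   before the addition, without ever computing a + b. */
bool signed_add_would_overflow(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}
```

The signed version needs two range checks and a branch on the sign of b, where the unsigned version needs a single comparison.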

I have gone the opposite route, and always use unsigned types. I find a signed type is not needed 99% of the time. I would further argue that most uses of negative values are hacks, and potential containment and security issues. Allowing negative values overloads a variable with meanings that it should not have.

There's a distinct difference between MoveForward and MoveBackward, for example. A design that expresses reversion as MoveForward(-15) is both unsafe and unsound.
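
As a rough sketch of what the two-function design could look like, assume a position inside a range of a given length, with position never exceeding length. The signatures and the clamping behavior are my assumptions for illustration, not a fixed design. The direction is carried by the function name, and the step count cannot be negative.

```c
/* Hypothetical helpers: both return the new position.
   Precondition: position <= length. */

/* Move toward the end; clamps at length instead of running past it. */
unsigned int MoveForward(unsigned int position, unsigned int length,
                         unsigned int steps)
{
    unsigned int room = length - position;   /* cannot wrap, by the precondition */
    return steps > room ? length : position + steps;
}

/* Move toward the start; clamps at 0 instead of wrapping around. */
unsigned int MoveBackward(unsigned int position, unsigned int steps)
{
    return steps > position ? 0u : position - steps;
}
```

For example, MoveForward(5, 10, 3) gives 8, and MoveBackward(5, 15) gives 0. Each function has one boundary to check, and no caller can reverse direction by accident.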

But but but...


A counter-argument is that it's dangerous to mix signed and unsigned types, so using only signed types helps. But this is not the only solution, or the best one. An example:
int main() {
    int i = -1;
    unsigned int j = 1;
    if (i > j) return 1;
    return 0;
}
If allowed to compile, this program returns 1. For the comparison, i is implicitly converted to unsigned int, so -1 becomes 0xFFFFFFFF (with a 32-bit int), which is more than 1.
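
The implicit conversion can be written out by hand. In this minimal sketch (function names are hypothetical), the first function is what the comparison above effectively does. The second is one way to get the mathematically expected answer when a mixed comparison really is unavoidable.

```c
#include <stdbool.h>

/* What `i > j` does implicitly: i is converted to unsigned int first,
   so a negative i turns into a very large value. */
bool naive_greater(int i, unsigned int j)
{
    return (unsigned int)i > j;
}

/* Sign-aware version: a negative i is never greater than an unsigned j.
   The cast is only reached when i is non-negative, so it preserves the value. */
bool sign_aware_greater(int i, unsigned int j)
{
    return i >= 0 && (unsigned int)i > j;
}
```

With i = -1 and j = 1, naive_greater returns true and sign_aware_greater returns false.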

But it should not compile. You should be enabling signed / unsigned comparison and conversion warnings. With MSVC, these are C4018 and C4365. With GCC, they are -Wsign-compare and -Wsign-conversion.

In addition, it's good practice to treat warnings as errors: -WX with MSVC, and -Werror with GCC.

Note that with GCC, -Wall does not actually enable all warnings. You have to go the extra step to enable -Wsign-conversion manually.

Serious projects should use -WX -Wall with MSVC, or -Werror with -Wall, -Wextra and -Wsign-conversion with GCC, enabling as many warnings as possible, treating them as errors, and manually disabling only selected, individually weighed warnings.

3 comments:

evgenymuralev.com said...


Nice post! Just wanted to note that using signed integers may be wise from a performance standpoint, because undefined signed overflow makes room for a number of compiler optimizations :)

Although, in code that is not performance-critical, I tend to use an unsigned type if a variable is semantically unsigned.

Risto Lankinen said...

Way too late, I guess, but IMO from day one the signed/unsigned comparison should've worked along the line of this pseudocode:

bool operator<(/*signed*/ type s, unsigned type u)
{
    if (s < 0) return true;   // a negative value is less than any unsigned value
    return operator<((unsigned type)s, u);
}

denis bider said...

Risto - I'm inclined to agree; the outcome likely would have been better today if there were additional unsigned/signed and signed/unsigned forms of machine instructions. But alas, transistor budgets were tighter when these decisions were made. :)

Maybe this decision would have been different if it were made again today.