Your production code should assert

If you have C or C++ experience, you're familiar with asserts. Here's just one example use case in a Visual Studio 2013 include file:

bool _HasCapturedContext() const
{
    _ASSERTE(_M_context._M_captureMethod != _S_captureDeferred);
    return (_M_context._M_pContextCallback != nullptr);
}

Pretty normal, right? Let's see how _ASSERTE is defined.

#ifndef _DEBUG
#ifndef _ASSERTE
#define _ASSERTE(expr) ((void)0)
#endif  /* _ASSERTE */
#else  /* _DEBUG */
#ifndef _ASSERTE
#define _ASSERTE(expr)  _ASSERT_EXPR((expr), _CRT_WIDE(#expr))
#endif  /* _ASSERTE */
#endif  /* _DEBUG */

This is good, right? Perform checks while testing; then, when most bugs have been weeded out, omit the tests for production, to reap major performance benefits. So speed! Such performance! Wheeee!

No. This is wrong. Do not do this.

First of all: assert checks are nearly free. In most applications - i.e. anywhere outside of algorithms that do hard-core, cache-optimized data crunching - the CPU is not tied up executing instructions; it is waiting on memory access. The instruction cost of an assert check is therefore nearly zero: the failure branch is never taken, which makes it easy work for the CPU's branch predictor.

On the other hand, here is what you pay for leaving the assert out:
  • You will fail to detect bugs that your testing failed to exercise until your program crashes in production. You will then have a jolly old time debugging without the benefit of asserts.
  • Better yet - the bug might not be detected at all, and may result in silent corruption of data.
  • If your code is public, anyone with access to it can look at your asserts, and use them literally as a guidebook for exploits. Each assert corresponds to a weakness in production!
You're introducing diagnostic issues, risking data corruption, and making your program exploitable - all for the awesome benefit of a few CPU cycles! Wheeeee!
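To make the silent-corruption point concrete, here is a minimal sketch. The names are made up for illustration: DEBUG_ONLY_ASSERT mimics the shape of _ASSERTE under a hypothetical MY_DEBUG flag, and the two Withdraw functions are not from any real code base. With the flag undefined, as it would be in a release build, the check vanishes and an overdraft quietly wraps around. The always-on version reports the same bug instead.

```cpp
#include <cstdint>
#include <cstdlib>
#include <stdexcept>

// Hypothetical debug-only assert, shaped like _ASSERTE: it expands to nothing
// unless MY_DEBUG is defined. Both names are invented for this sketch.
#ifdef MY_DEBUG
#define DEBUG_ONLY_ASSERT(expr) ((expr) ? (void)0 : std::abort())
#else
#define DEBUG_ONLY_ASSERT(expr) ((void)0)
#endif

// Release-style behavior: the check is compiled out, so withdrawing more than
// the balance wraps around and returns a huge, wrong value.
std::uint32_t WithdrawUnchecked(std::uint32_t balance, std::uint32_t amount)
{
    DEBUG_ONLY_ASSERT(amount <= balance);
    return balance - amount;   // unsigned wraparound: well-defined, but wrong
}

// Always-on check: the same bug is reported instead of corrupting the balance.
std::uint32_t WithdrawChecked(std::uint32_t balance, std::uint32_t amount)
{
    if (!(amount <= balance))
        throw std::logic_error("WithdrawChecked: amount <= balance failed");
    return balance - amount;
}
```

Calling WithdrawUnchecked(100, 150) hands back a balance of over four billion without a peep, while WithdrawChecked(100, 150) throws. The extra cost is one comparison and one branch that is never taken.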

Keep your asserts in production

The cost is negligible, and the benefits are many. You should nearly always keep your asserts.

In fact - just to make sure no one thinks these checks should be removed in Release mode - I have stopped calling them "asserts" at all. I call them "ensures". In my latest code, this is how I define them:

// By default, an interactive EnsureAbortHandler is set. The interactive handler will output
// information about an Ensure failure either to STDOUT (if available), or otherwise, using
// an interactive dialog box.
// If the application is running non-interactively, call this to set an EnsureAbortHandler
// that will output information about any ensure failures to the Application section of the
// Windows Event Log.

void SetEnsureAbortHandler_EventLog(wchar_t const* sourceName, DWORD eventId);

struct OnFail { enum E { Throw, Abort }; };

struct InternalInconsistency : public std::exception {};

// Handles an Ensure failure:
// - if running under a debugger, always causes a debug breakpoint;
// - otherwise, if onFail == OnFail::Throw, and there is no uncaught exception,
//   throws an exception deriving from InternalInconsistency;
// - otherwise, if onFail == OnFail::Abort, or there is an uncaught exception,
//   calls the registered EnsureAbortHandler.

void EnsureFailure(OnFail::E onFail, char const* test, char const* funcorfile, uint line);

#define EnsureThrow(TEST) \
  ((TEST) ? 0 : (At::EnsureFailure(OnFail::Throw, #TEST, __FUNCSIG__, __LINE__), 0))

#define EnsureAbort(TEST) \
  ((TEST) ? 0 : (At::EnsureFailure(OnFail::Abort, #TEST, __FUNCSIG__, __LINE__), 0))

There are two types of ensures because there are two types of unexpected conditions: those you can recover from, and those you can't.

You can recover from an Ensure failure if you detect it before damage has taken place. In this case, you call EnsureThrow, which throws an InternalInconsistency exception, which isn't caught until it reaches a fairly base-level exception handler. An Ensure failure is an unexpected condition, so it means you must stop and potentially restart that part of your program which encountered it, in order to restore your program to a known state. In a server program, an appropriate thing to do is to terminate any worker thread or client connection or session related to the Ensure failure, but allow the rest of the server to run if you believe it to be in a good state.
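Here is a rough sketch of that server pattern. It is a simplified, portable stand-in and not my actual implementation: this EnsureThrow always throws (no debugger break, no uncaught-exception check), it uses __func__ rather than __FUNCSIG__, and Request, HandleRequest and RunSession are hypothetical names invented for the example.

```cpp
#include <exception>
#include <iostream>
#include <string>
#include <utility>

// Simplified stand-in: carries a message describing the failed test.
struct InternalInconsistency : public std::exception
{
    explicit InternalInconsistency(std::string msg) : m_msg(std::move(msg)) {}
    char const* what() const noexcept override { return m_msg.c_str(); }
    std::string m_msg;
};

[[noreturn]] void EnsureThrowFailure(char const* test, char const* func, unsigned line)
{
    throw InternalInconsistency(
        std::string(test) + " failed in " + func + ", line " + std::to_string(line));
}

// Always compiled in, regardless of build configuration.
#define EnsureThrow(TEST) \
    ((TEST) ? (void)0 : EnsureThrowFailure(#TEST, __func__, __LINE__))

// Hypothetical request type for the example.
struct Request { int sessionId; int payloadLength; };

void HandleRequest(Request const& r)
{
    // A negative length "cannot happen"; detect it before touching any data.
    EnsureThrow(r.payloadLength >= 0);
    // ... normal processing would go here ...
}

// Base-level handler for one session. Returns false if the session was dropped.
bool RunSession(Request const& r)
{
    try
    {
        HandleRequest(r);
        return true;
    }
    catch (InternalInconsistency const& e)
    {
        // Terminate only this session; the rest of the server keeps running.
        std::cerr << "session " << r.sessionId << " terminated: " << e.what() << '\n';
        return false;
    }
}
```

The point of catching at the session boundary is that everything the failed session touched gets thrown away, so the rest of the server never sees a half-updated state.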

You cannot recover from an Ensure failure if you detect it after damage has taken place. If you detect that a pointer is pointing somewhere it shouldn't, that memory might have been written to incorrectly - you should call EnsureAbort. This version does its best to report the problem - either via standard output, via dialog box, or via Windows Event Log - and then ends the program. If data may be at risk, the program shouldn't continue to run. It's better to have a denial of service than corruption.
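The abort path can be sketched the same way. Again, this is a simplified stand-in under stated assumptions: the reporter only writes to stderr (no dialog box, no Windows Event Log), and DescribeEnsureFailure, AbortReporter, SetAbortReporter, Buffer and Append are hypothetical names for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Builds the report text, so any reporter can reuse it.
std::string DescribeEnsureFailure(char const* test, char const* func, unsigned line)
{
    return std::string("Ensure failed: ") + test + " in " + func +
           ", line " + std::to_string(line);
}

// Hypothetical reporter hook; a service might install one that writes to a log.
using AbortReporter = void (*)(std::string const& report);

void ReportToStderr(std::string const& report)
{
    std::fprintf(stderr, "%s\n", report.c_str());
}

AbortReporter g_abortReporter = ReportToStderr;

void SetAbortReporter(AbortReporter reporter) { g_abortReporter = reporter; }

[[noreturn]] void EnsureAbortFailure(char const* test, char const* func, unsigned line)
{
    g_abortReporter(DescribeEnsureFailure(test, func, line));  // report first...
    std::abort();                                              // ...then stop, always
}

#define EnsureAbort(TEST) \
    ((TEST) ? (void)0 : EnsureAbortFailure(#TEST, __func__, __LINE__))

// Hypothetical fixed-size buffer for the example.
struct Buffer { char data[16]; unsigned used; };

void Append(Buffer& b, char c)
{
    // If 'used' is already past the end, an earlier write may have gone out of
    // bounds. The damage is done, so there is nothing safe to recover to.
    EnsureAbort(b.used <= sizeof(b.data));
    if (b.used < sizeof(b.data))
        b.data[b.used++] = c;
}
```

Note that EnsureAbortFailure calls std::abort() even if the reporter returns normally, so a misbehaving reporter cannot keep a damaged process alive.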

