Win32 calling conventions: Usage cases

Last time, I talked about some of the general concepts behind the varying calling conventions in use on Win32 (x86).  This posting focuses on the implications and usage cases behind each of the calling conventions, in an effort to provide a better understanding as to when you’ll see them used.

When looking at the different calling conventions, we can see that there are a number of differences between them: stack usage vs register parameters, caller vs callee cleans stack, member function calls vs “plain C” function calls, and so forth.  These differences lend each calling convention to specific cases where it is best suited.

To begin with, consider the __cdecl calling convention.  We know that it is a stack based calling convention where the caller cleans the stack.  Furthermore, we know that it is the default calling convention for CL (big hint as to when you’ll see this being used!).  These attributes make it well suited for a couple of cases:

  • Variadic functions, or functions with an ellipsis (…) terminating the argument list.  These functions have a variable number of arguments, which is not known at compile time of the callee.  __cdecl is useful for these functions because the compiler needs to implement a stack displacement to clean arguments off of the stack, but it doesn’t know how many arguments there are.  By leaving the argument-disposal up to the caller, who does know the number of arguments at compile time, the compiler doesn’t need special help from the programmer to correctly adjust the stack when the variadic function is going to return – it “just works”.  __cdecl is the only calling convention on Win32 x86 that supports variadic functions when used with CL.
  • Old-style C functions without prototypes.  For compatibility with legacy C code, the C compiler needs to support making function calls to unprototyped functions.  These must be treated as if they were variadic functions, because the compiler doesn’t know whether the function takes a fixed number of arguments or not (because there is no prototyped argument list).
  • Any other case where the programmer does not explicitly override the calling convention.  The default Visual Studio build environment will use the compiler default calling convention if you do not explicitly tell it otherwise, and that default is __cdecl.  Some build environments (the DDK/build.exe platform in particular) default to different calling conventions, but programs built with Visual Studio will default to __cdecl if you are using CL and have not changed the default.
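
To make the variadic case concrete, here is a minimal sketch.  The names sum_ints and CDECL_CC are hypothetical and exist only for this illustration; the macro expands to __cdecl when building 32-bit x86 code with CL and to nothing elsewhere, so the example stays portable.

```cpp
#include <cstdarg>

// CDECL_CC is a hypothetical helper macro for this sketch: it expands to
// __cdecl only for 32-bit x86 builds with CL, and to nothing elsewhere.
#if defined(_MSC_VER) && defined(_M_IX86)
#define CDECL_CC __cdecl
#else
#define CDECL_CC
#endif

// Sums `count` int arguments.  When this function is compiled, the compiler
// has no idea how many arguments any given call site will push, so the
// callee cannot emit a fixed stack adjustment on return.
int CDECL_CC sum_ints(int count, ...)
{
    va_list args;
    va_start(args, count);

    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }

    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 10, 20, 30) pushes four arguments, while sum_ints(0) pushes one; in both cases it is the call site, which knows exactly what it pushed, that cleans the stack afterwards.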

Next, we’ll take a look at __stdcall.  This calling convention is the standard for Win32 APIs; virtually all system APIs are __stdcall (typically decorated as “WINAPI”, “NTAPI”, or “CALLBACK” in the headers, which are macros that expand to __stdcall).  Here are the typical usage cases for __stdcall (and the “why” behind them):

  • Library functions.  Excepting the C runtime libraries, virtually all Microsoft-shipped Win32 libraries use __stdcall.  The main reason for this (if you discount the “that’s the way it has always been” argument) is that you save some instruction code space by using __stdcall and not __cdecl for library functions.  With __cdecl, the caller typically needs to adjust the stack pointer after every “call” instruction to a __cdecl function, which takes up instruction code space (typically an “add esp, imm8” instruction).  With __stdcall, you only pay this penalty once, in the “retn imm16” instruction at the end of the function, as opposed to once for every call site.  For frequently called functions (say, ReadFile), this begins to add up.  You also theoretically save a bit of processor time and cache space, as there is one less instruction to be executed per “call”.
  • COM functions.  COM uses __stdcall with the “this” pointer being the first argument, which is a required part of the COM API contract for publicly accessible functions.
  • Functions that need to be called from a language other than C/C++.  This also ties back into the COM and library function purposes, but of all of the calling conventions discussed here, only __stdcall has practically universal support among non-Microsoft or non-C/C++ compilers for x86 Win32 (such as Visual Basic, or Delphi).  As a result, it is advantageous to use __stdcall if you are expecting to be called from other languages.
  • Microsoft-built programs.  Microsoft defaults their programs to __stdcall and not __cdecl virtually everywhere, even in images that don’t export functions, or in internal, non-exported functions within a system library.  This also applies to Microsoft kernel mode code, such as the HAL and the kernel itself.
  • Programs built with the DDK.  The DDK defaults to __stdcall and not __cdecl.
  • NT kernel drivers.  These are always (or at least should always be!) built with the DDK, which again, defaults to __stdcall.
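
As a rough sketch of what the library and COM cases look like in source, consider the following.  Everything named here (STDCALL_CC, VisitProc, for_each_value, add_to_total, ICounter, Counter) is hypothetical and not part of any real Windows header; the macro plays the same role that WINAPI or CALLBACK play in the system headers, and expands to nothing outside of 32-bit x86 CL builds.

```cpp
// STDCALL_CC is a hypothetical stand-in for macros like WINAPI/CALLBACK:
// it expands to __stdcall only for 32-bit x86 builds with CL.
#if defined(_MSC_VER) && defined(_M_IX86)
#define STDCALL_CC __stdcall
#else
#define STDCALL_CC
#endif

// Callback type: the convention is part of the function pointer type, so
// the library and the callback author must agree on it.
typedef int (STDCALL_CC *VisitProc)(int value, void* context);

// Library-style function: calls `visit` for each value until it returns 0.
// Returns the number of values visited.
int STDCALL_CC for_each_value(const int* values, int count,
                              VisitProc visit, void* context)
{
    int visited = 0;
    for (int i = 0; i < count; ++i) {
        ++visited;
        if (!visit(values[i], context)) {
            break;
        }
    }
    return visited;
}

// Example callback: adds each value into the int that `context` points to.
int STDCALL_CC add_to_total(int value, void* context)
{
    *static_cast<int*>(context) += value;
    return 1; // keep going
}

// COM-style interface: the method is explicitly __stdcall, so "this" is
// passed as a hidden first stack argument rather than in a register.
struct ICounter {
    virtual int STDCALL_CC Add(int amount) = 0;
};

class Counter : public ICounter {
public:
    Counter() : total_(0) {}
    int STDCALL_CC Add(int amount) override
    {
        total_ += amount;
        return total_;
    }
private:
    int total_;
};
```

Because the convention is spelled out on every declaration, callers built with a different default convention (or written in another language that understands __stdcall) can still call these correctly.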

There yet remains __fastcall to discuss.  This calling convention is not used as extensively as the other two (no Microsoft build environment that I am aware of defaults to it), so most of the cases for it being used are the result of a programmer explicitly requesting it.  The cases where it tends to pay off are:

  • Functions that do not call other functions (“leaf functions”).  These are good candidates for __fastcall because the register arguments are passed in volatile registers.  A __fastcall function that calls subfunctions and needs its arguments across those calls must first save them somewhere nonvolatile (i.e. the stack), and that defeats the whole purpose of __fastcall.
  • Functions that do not use their arguments after the first subfunction call.  These can still benefit from __fastcall without the penalty mentioned above relating to preserving arguments across function calls.
  • Short functions that call other functions and then return.  If you can make both functions __fastcall, then sometimes the compiler can be clever and not need to re-load the argument registers when a __fastcall function calls a __fastcall subfunction.  This can be useful for “wrapper” functions in some cases.
  • Functions that interface with assembly code.  Sometimes it can be more convenient to make a C function called by assembler code __fastcall, because this can save you the work of manually tracking stack displacements.  It can also sometimes be more convenient to make assembler functions called by C code __fastcall as well, for similar reasons.
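
The leaf and wrapper cases might look something like the sketch below.  The function names and the FASTCALL_CC macro are hypothetical; the macro expands to __fastcall only for 32-bit x86 CL builds, where the first two arguments of these functions would travel in ecx and edx.  Whether the compiler actually avoids reloading the registers in the wrapper depends on the optimizer, so treat that part as a possibility rather than a guarantee.

```cpp
#include <cstdint>

// FASTCALL_CC is a hypothetical helper macro for this sketch: it expands
// to __fastcall only for 32-bit x86 builds with CL.
#if defined(_MSC_VER) && defined(_M_IX86)
#define FASTCALL_CC __fastcall
#else
#define FASTCALL_CC
#endif

// Leaf function: it calls nothing else, so its two register arguments
// never need to be spilled to the stack to survive a call.
std::uint32_t FASTCALL_CC rotate_left(std::uint32_t value, std::uint32_t count)
{
    count &= 31; // keep the shift amounts in range
    if (count == 0) {
        return value;
    }
    return (value << count) | (value >> (32 - count));
}

// Wrapper: forwards its arguments in the same order to another __fastcall
// function and does not use them afterwards, so they may already be in the
// right registers when the inner call is made.
std::uint32_t FASTCALL_CC rotate_left_byte(std::uint32_t value, std::uint32_t bytes)
{
    return rotate_left(value, bytes * 8);
}
```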

In general, __fastcall is relatively rare.  There are a couple of kernel functions that use it (KfRaiseIrql, for instance).  A couple of software vendors (such as Blizzard Entertainment) seem to like to ship things compiled with __fastcall as the default calling convention, but this is not the common case, and usually not a good idea.

Finally, there is __thiscall.  This is the default calling convention for non-static C++ member functions, so you will only see it for member functions that do not explicitly specify some other calling convention.  Note that for member functions that are not accessible cross-program (e.g. not exported somehow), the compiler will sometimes replace ecx with ebx for the “this” pointer as a custom calling convention, depending on your optimization settings.
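
A small sketch of the difference, using a hypothetical Accumulator class and a hypothetical STDCALL_CC macro (which expands to __stdcall only for 32-bit x86 CL builds):

```cpp
// STDCALL_CC is a hypothetical helper macro for this sketch: it expands
// to __stdcall only for 32-bit x86 builds with CL.
#if defined(_MSC_VER) && defined(_M_IX86)
#define STDCALL_CC __stdcall
#else
#define STDCALL_CC
#endif

class Accumulator {
public:
    explicit Accumulator(int start) : total_(start) {}

    // No convention specified: with CL on x86 this member function gets
    // the default member function convention, __thiscall, with "this"
    // passed in ecx.
    int Add(int amount)
    {
        total_ += amount;
        return total_;
    }

    // Convention specified explicitly: this member function is __stdcall
    // instead, and "this" becomes a hidden first stack argument, which is
    // the arrangement COM methods use.
    int STDCALL_CC AddStdcall(int amount)
    {
        total_ += amount;
        return total_;
    }

private:
    int total_;
};
```

Both member functions behave identically at the source level; only the way “this” and the arguments reach the function differs.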

That’s all for this installment.  In the next posting, I’ll discuss what the various calling conventions look like at a low level (assembly), and what this means to you.

4 Responses to “Win32 calling conventions: Usage cases”

  1. nksingh says:

    Awesome post! I’m trying to become conversant with reading unassembled program snippets and this kind of information is just what I like.

    Thanks a lot!

  2. Skywing says:

    Thanks for the feedback. I’ll have some more posts about this subject out shortly. Good to know that people are finding them interesting.

  3. […] As I have previously described, __cdecl is an entirely stack based calling convention, in which arguments are cleaned off the stack by the caller. Given this, you can expect to see all of the arguments for a function placed onto the stack before a function call is made. If you are using CL, then this is almost always done by using the “push” instruction to place arguments on the stack. […]

  4. OJ says:

    Interesting is an understatement :) Your posts are very informative, well structured and a joy to read. I only stumbled across your blog today, and I’ll make sure that I continue to do so in the future.

    I’ve always loved RCE, and seeing what’s going on under the hood, but it’s been so long since I dabbled in it that I had forgotten how much fun it is. Thanks for the reminder! Time to break out the disassembler/debugger and have a dabble in some things I haven’t dabbled in for ages.

    All the best!