Introduction to x64 debugging, part 1

There are some subtle differences between using the Debugging Tools for Windows (DTW) toolset on x86 and x64 that are worth mentioning, especially if you are new to doing x64 debugging. Most of this post applies to all of the debuggers shipped in the DTW package, which is why I avoid talking about WinDbg or ntsd or cdb specifically, and often just refer to the “DTW debuggers”. This is the first post in a multipart series, and it provides a general overview of the options you have for doing 32-bit and 64-bit debugger on an x64 machine, and how to setup the debugger properly to support both, using either the 32-bit or 64-bit packages.

There are many ways to do x64 debugging, which can get confusing, simply because there are so many different choices. You can use both the 32-bit and 64-bit DTW packages, with some restrictions. Here’s a summary of the most common cases (including “cross-debugging” scenarios, where you are using the 32-bit debugger to debug 64-bit processes). For now, I’ll just limit this to user mode, although you can use many of these options for kernel debugging too.

  • Natively debugging 64-bit processes on the same computer using the 64-bit DTW package
  • Natively debugging 32-bit (Wow64) processes on the same computer using the 64-bit DTW package
  • Debugging 32-bit (Wow64) processes on the same computer using the 32-bit DTW package (running the debugger itself under Wow64)
  • Debugging 64-bit processes or 32-bit (Wow64) processes on the same or a different computer using either the 64-bit or 32-bit DTW package, with the remote debugging support (e.g. dbgsrv.exe, or -remote/-server). This requires a 64-bit remote debugger server.
  • Debugging 32-bit (Wow64) processes on the same or a different computer using either the 64-bit or 32-bit DTW package, with the remote debugger support (e.g. dbgsrv.exe, or -remote/-server). This works with a 32-bit remote debugging server.
  • Debugging a 64-bit or 32-bit dump file using the 32-bit or 64-bit DTW package. Both DTW packages are capable of doing this task natively.

There are actually even more combinations, but to keep it simple, I just listed the major ones. Now, as for which setup you want to use, there are a couple of considerations to keep in mind. Most of the important differences for the actual debugging experience stem from whether the process that is making the actual Win32 debugger API calls is a 64-bit or 32-bit process. For the purposes of this discussion, I’ll call the process that makes the actual debugger API calls (e.g. DebugActiveProcess) the actual debugger process.

If the actual debugger process is a 32-bit process under Wow64, then it will be unable to interact meaningfully with 64-bit processes (if you are using WinDbg, 64-bit processes will all show as “System” in the process list). For 32-bit processes, it will see them exactly as you would under an x86 Win32 system; there is no direct indication that they are running under Wow64, and the extra Wow64 functionality is completely isolated from the debugger (and the person driving the debugger). This can be handy, as the extra Wow64 infrastructure can in many cases just get in the way if you are debugging a pure 32-bit program running under Wow64 (unless you suspect a bug in Wow64 itself, which is fairly unlikely to be the case).

If the actual debugger process is a native 64-bit process, then the whole debugging environment changes. The native 64-bit debugging environment allows you to debug both 32-bit (Wow64) and 64-bit targets. However, when you are debugging 32-bit targets, the experience is not the same as if you were just debugging a 32-bit program on a 32-bit Windows installation. The 64-bit debugger will see all of the complexities of Wow64, which often gets confusing and can get in your way. I’ll go into specifics of what exactly is different and how the 64-bit debugger can sometimes be annoying when working with Wow64 processes in a moment; for now, stick with me.

So, if you need to do development on 64-bit computers, which debugging package is the best for you to use? Well, that really depends on what you are doing, but I would recommend installing both the 32-bit and 64-bit DTW packages. The main reason to do this is that it will allow you to debug 32-bit processes without having to deal with the Wow64 layer all the time, but it at the same time it will allow you to debug native 64-bit processes.

After you have installed the DTW packages, one of the familiar first steps with setting up the debugger tools on a new system is to register WinDbg as your default post-portem debugger. This turns out to be a bit more complicated on 64-bit systems than on 32-bit systems, however, in large part due to a new concept added to Windows to support Wow64: registry reflection. Registry reflection allows for 32-bit and 64-bit applications to have their own virtualized view of several key sections of the registry, such as HKEY_LOCAL_MACHINE\Software. What this means in practice is that if you write to the registry from a 32-bit process, you might not see the changes from 64-bit processes (and vice versa), depending on which registry keys are changed. Alternatively, you might see different changes than you made, such as if you are registering a COM interface in HKEY_CLASSES_ROOT.

So, what does all of this mean to you, as it relates to doing debugging on 64-bit systems? Well, the main difference that impacts you is that there are different JIT handlers for 32-bit and 64-bit processes. This means that if you register a 32-bit DTW debugger as a default postmortem debugger, it won’t be activated for 64-bit processes. Conversely, if you register a 64-bit DTW debugger as a default postmortem debugger, it won’t be activated for 32-bit processes.

This leaves you with a couple of options: Register both the 32-bit and 64-bit DTW packages as default postmortem debuggers (if you only want to use the 64-bit DTW package on 64-bit processes and not 32-bit (Wow64) processes as a JIT debugger), or register the 64-bit DTW debugger as a default postmortem debugger for both 32-bit and 64-bit processes. If you want to do the former, then what you need to do is as simple as logging in as an administrator and running both the 32-bit and 64-bit DTW debuggers with the -I command line option (install as default postmortem debugger), and then you’re set. However, if you want to use the 64-bit debugger for both 64-bit and 32-bit processes as a JIT debugger, then things are a bit more complicated. The best way to set this up is to install the 64-bit DTW debugger as a default postmortem debugger (run it with -I), and then open the 64-bit version of regedit.exe, navigate to HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug, and copy the value of the “Debugger” entry into the clipboard. Then, navigate to the 32-bit view of this key, located at HKEY_LOCAL_MACHINE\Software\Wow6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug, create (or modify, if it already exists) the “Auto” string value and set it to “1”, then create (or modify, if it already exists) the “Debugger” string value and set it to the value you copied from the 64-bit view of the AeDebug key. For my system, the “Debugger” value is set to something like “C:\Program Files\Debugging Tools for Windows 64-bit\WinDbg.exe” -p %ld -e %ld -g. If you don’t see a Wow6432Node registry key under HKEY_LOCAL_MACHINE\Software, then you are probably accidentally running the 32-bit version of regedit.exe and not the 64-bit version of regedit.exe.

Now, there are a couple of other considerations to take into account when picking whether to use the 32-bit or 64-bit DTW tools on 32-bit processes. Besides the ease of use consideration (which I’ll come back to in more detail shortly), many third party extension DLLs (including my own SDbgExt, for the moment) are only available as 32-bit binaries. While these extension DLLs might support 64-bit targets, they will only run under a 32-bit debugger host.

I said I’d describe some of the reasons why debugging Wow64 processes under the native 64-bit debugger can be cumbersome. The main problem with doing this is that you need to be careful with whether the debugger is active as a 32-bit or 64-bit debugger. This is controlled by something that the DTW package calls the effective machine, which is a way to tell the debugger that it should be treating the program as a 32-bit or 64-bit program. If you are using the native 64-bit debugger on a Wow64 process, you will often find yourself having to manually switch between the native (x64) machine mode and the Wow64 (x86) mode.

To give you an idea of what I mean, let’s take a simple example of breaking into the 32-bit version of CMD.EXE, and getting a call stack of the first thread (thread 0). If you are experienced with the DTW tools, then you probably already know how to do this on x86-based systems: the “~0k” command, which means “show me a stack trace for thread 0”. If you run this on the 32-bit CMD.exe process, though, you won’t quite get what you were expecting:

0:000> ~0k
Child-SP          RetAddr           Call Site
00000000`0013e318 00000000`78ef6301 ntdll!ZwRequestWaitReplyPort+0xa
00000000`0013e320 00000000`78bc0876 ntdll!CsrClientCallServer+0x9f
00000000`0013e350 00000000`78ba1394 wow64win!ReadConsoleInternal+0x236
00000000`0013e4c0 00000000`78be6866 wow64win!whReadConsoleInternal+0x54
00000000`0013e510 00000000`78b83c7d wow64!Wow64SystemServiceEx+0xd6
00000000`0013edd0 00000000`78be6a5a wow64cpu!ServiceNoTurbo+0x28
00000000`0013ee60 00000000`78be5e0d wow64!RunCpuSimulation+0xa
00000000`0013ee90 00000000`78ed8501 wow64!Wow64LdrpInitialize+0x2ed
00000000`0013f6c0 00000000`78ed6416 ntdll!LdrpInitializeProcess+0x17d9
00000000`0013f9d0 00000000`78ef3925 ntdll!LdrpInitialize+0x18f
00000000`0013fab0 00000000`78d59630 ntdll!KiUserApcDispatch+0x15
00000000`0013ffa8 00000000`00000000 0x78d59630
00000000`0013ffb0 00000000`00000000 0x0
00000000`0013ffb8 00000000`00000000 0x0
00000000`0013ffc0 00000000`00000000 0x0
00000000`0013ffc8 00000000`00000000 0x0
00000000`0013ffd0 00000000`00000000 0x0
00000000`0013ffd8 00000000`00000000 0x0
00000000`0013ffe0 00000000`00000000 0x0
00000000`0013ffe8 00000000`00000000 0x0

Hey, that doesn’t look like the 32-bit CMD at all! Well, the reason for the strange call stack is that the 32-bit CMD’s first thread is sleeping in a system call to the 64-bit kernel, and the last active processor state for that thread was native 64-bit mode, and NOT 32-bit mode. You will find that this is the common case for threads that are not spinning or doing actual work when you break in with the debugger.

In order to get the more useful 32-bit stack trace, we’ll have to use a debugger command that is probably unfamiliar to you if you haven’t done Wow64 debugging before: .effmach. This command controls the “effective machine” of the debugger, which I previously described. We’ll want to tell the debugger to show us the 32-bit state of the debugger, which we can do with the “.effmach x86” command. Then, we can get a 32-bit stack trace for the first thread with the “~0k” command:

0:002> .effmach x86
Effective machine: x86 compatible (x86)
0:002:x86> ~0k
ChildEBP          RetAddr
002dfd68 7d542f32 KERNEL32!ReadConsoleInternal+0x15
002dfdf4 4ad0fe14 KERNEL32!ReadConsoleW+0x42
002dfe5c 4ad15803 cmd!ReadBufFromConsole+0xb5
002dfe88 4ad02378 cmd!FillBuf+0x174
002dfe8c 4ad02279 cmd!GetByte+0x11
002dfea8 4ad026c5 cmd!Lex+0x6b
002dfeb8 4ad02783 cmd!GeToken+0x20
002dfec8 4ad02883 cmd!ParseStatement+0x36
002dfedc 4ad164c0 cmd!Parser+0x46
002dff44 4ad04cdd cmd!main+0x1d6
002dffc0 7d4e6e1a cmd!mainCRTStartup+0x12f
002dfff0 00000000 KERNEL32!BaseProcessStart+0x28

Much better! That’s more in line with what we’d be expecting an idle CMD.EXE to be doing. We can now treat the target as a 32-bit process, including things like displaying and altering registry contexts, disassembling, and soforth. For instance:


0:002:x86> ~0s
00000000`7d54e9c3 c22000  ret     0x20
0:000:x86> r
eax=00000001 ebx=002dfe84 ecx=00000000 edx=00000000 esi=00000003 edi=4ad2faa0
eip=7d54e9c3 esp=002dfd6c ebp=002dfdf4 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
00000000`7d54e9c3 c22000  ret     0x20
0:000:x86> u poi(esp)
00000000`7d542f32 8b4dfc  mov     ecx,[ebp-0x4]
00000000`7d542f35 5f      pop     edi
00000000`7d542f36 5e      pop     esi
00000000`7d542f37 5b      pop     ebx
00000000`7d542f38 e8545df9ff call KERNEL32!__security_check_cookie (7d4d8c91)
00000000`7d542f3d c9      leave
00000000`7d542f3e c21400  ret     0x14
00000000`7d542f41 90      nop

If we want to switch the debugger back to the 64-bit view of the process, we can use “.effmach .” to change to the native processor type:

0:000:x86> .effmach .
Effective machine: x64 (AMD64)

Now, we’re back to 64-bit mode, and all of the debugger commands will reflect this:

0:000> r
rax=000000000000000c rbx=000000000013e3a0 rcx=0000000000000000
rdx=00000000002df1f4 rsi=0000000000000000 rdi=00000000003e0cd0
rip=0000000078ef148a rsp=000000000013e318 rbp=00000000002dfdf4
r8=000000007d61c929  r9=000000007d61caf1 r10=0000000000000000
r11=00000000002df1f4 r12=00000000002dfe34 r13=0000000000000001
r14=00000000002dfe84 r15=000000004ad2faa0
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000244
00000000`78ef148a c3               ret

That should give you a basic idea as to what you will be needing to do most of the time when you are doing Wow64 debugging. If you are running the 32-bit debugger packages, then all of this extra complexity is hidden and the process will appear to be a regular 32-bit process, with all of the transitions to Wow64 looking like 32-bit system calls (these typically happen in places like ntdll or user32.dll/gdi32.dll).

That’s the end of this post. The next in this series will go into more detail as to what has changed when you take the plunge and start debugging things on a 64-bit system.

One Response to “Introduction to x64 debugging, part 1”

  1. andy says:

    pal, excelent post, thank you ;o)! keep on like this, you’re the 1st hit in google when searching for stuff like that. i had to dive inside and “felt” the fastcall being somehow mixed-in in the WINAPI but couldn’t find any exact info; so thank you man! andy