Debugger internals: How loaded module names are communicated to the debugger

If you’ve ever used the Win32 debugging API, you’ll have noticed that the WaitForDebugEvent routine, when returning a LOAD_DLL_DEBUG_EVENT style of event, gives you the address of an optional debuggee-relative string pointer (the lpImageName member of LOAD_DLL_DEBUG_INFO) that refers to the name of the DLL being loaded. In case you’ve ever wondered just where that string comes from, you’ll be comforted to know that this mechanism for communicating module name strings to the remote debugger is built upon a giant hack.
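
To make the “address of a string pointer” part concrete, here is a minimal sketch of the two reads a debugger has to perform. It is a portable simulation and not real debugger code: FakeDebuggee, read_debuggee, and read_image_name are names made up for illustration, the debuggee is modelled as a flat byte array, and the pointer is assumed to be 32 bits wide with a UTF-16 string behind it. A real debugger would use ReadProcessMemory for both reads and would check the fUnicode member of LOAD_DLL_DEBUG_INFO rather than assume the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the debuggee's address space: a flat byte array
 * addressed by offset. A real debugger would call ReadProcessMemory. */
typedef struct {
    const uint8_t *bytes;
    size_t size;
} FakeDebuggee;

/* Copy len bytes from debuggee offset addr; returns 0 on an out-of-range read. */
static int read_debuggee(const FakeDebuggee *d, size_t addr, void *out, size_t len)
{
    if (addr > d->size || len > d->size - addr)
        return 0;
    memcpy(out, d->bytes + addr, len);
    return 1;
}

/* Follow the two levels of indirection:
 *   image_name_addr -> 32-bit string pointer -> NUL-terminated UTF-16 string.
 * Returns the number of UTF-16 code units copied (excluding the terminator),
 * or -1 if the name is absent or unreadable. */
static int read_image_name(const FakeDebuggee *d, size_t image_name_addr,
                           uint16_t *out, size_t out_units)
{
    uint32_t string_ptr = 0;
    size_t i;

    if (out_units == 0)
        return -1;

    /* Step 1: read the pointer itself. It is optional and may be NULL. */
    if (!read_debuggee(d, image_name_addr, &string_ptr, sizeof string_ptr) ||
        string_ptr == 0)
        return -1;

    /* Step 2: read the string one code unit at a time until the terminator. */
    for (i = 0; i + 1 < out_units; i++) {
        if (!read_debuggee(d, (size_t)string_ptr + 2 * i, &out[i], sizeof out[i]))
            return -1;
        if (out[i] == 0)
            return (int)i;
    }
    out[i] = 0; /* truncate if the caller's buffer is too small */
    return (int)i;
}
```

Both reads can fail independently, which is one reason debuggers tend to treat this name as a hint and not as something guaranteed to be present.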

To give a bit of background on how the loading of DLLs works: most of the heavy lifting (referred to as “mapping an image”) is done by the memory manager subsystem in kernel mode, specifically in the “MiMapViewOfImageSection” internal routine. This routine is responsible for taking a section object (known as a file mapping object in the Win32 world) that represents a PE image on disk, and setting up the in-memory layout of the PE image in the specified process address space (in the case of Win32, always the address space of the caller). This includes setting up PE image subsections with the correct alignment, zero-filling “bss”-style sections, and setting up the protections of each PE image subsection. It is also responsible for supplying the “magic” necessary to allow shared PE subsections to work. All of this behavior is selected by the SEC_IMAGE flag, which is supplied when the section object is created (NtCreateSection) and takes effect when the section is mapped via NtMapViewOfSection. The same behavior is visible to Win32 by passing SEC_IMAGE to CreateFileMapping and then calling MapViewOfFile, which can be used to achieve the same result of “just” mapping an image in memory without going through the loader. Internally, the loader routine in NTDLL (LdrLoadDll and its associated subfunctions, which are called by the LoadLibrary family of routines in kernel32) utilizes NtMapViewOfSection to create the in-memory layout of the DLL being requested. After performing this task, the user-mode NTDLL-based loader then performs tasks such as applying base relocations, resolving imports to other modules (and loading dependent modules if necessary), allocating TLS data slots, making DLL initializer callouts, and so forth.

Now, the way that the debugger is notified of module load events is via a kernel mode hook that is called by NtMapViewOfSection (DbgkMapViewOfSection). This hook is responsible for detecting if a debugger (user mode or kernel mode) is present, and if so, forwarding the event to the debugger.

This is all well and good, but there’s a catch here. Both the user mode and kernel mode debuggers display the full path name to the DLL being loaded, but we’re now at the wrong level of abstraction, so to speak, to retrieve this information. All MiMapViewOfImageSection has is a handle to a section object (in actuality, a PSECTION_OBJECT and not a handle at this point). Now, the section object *does* have a reference to the PFILE_OBJECT associated with the file backing the section object (the reference is stored in the CONTROL_AREA of the section object), but there isn’t necessarily a good way to get the original filename that was passed to LoadLibrary out of the FILE_OBJECT (for starters, at this point, that path has already been converted to a native path instead of a Win32 path, and there is some potential ambiguity when trying to convert native paths back to Win32 paths).

To work around this little conundrum, the solution the developers chose is to temporarily borrow a field of the NT_TIB portion of the TEB of the calling thread as a way to signal the name of a DLL that is being loaded (when an image section, that is, one created with SEC_IMAGE, is being mapped through NtMapViewOfSection). Specifically, NT_TIB.ArbitraryUserPointer is temporarily replaced with a string pointer (in Windows NT, this is always a Unicode string) to the original filename passed to LdrLoadDll. Normally, the ArbitraryUserPointer field is reserved exclusively for use by user mode as a sort of “free TLS slot” that is available at a known location for every thread. Although this particular value is rarely used in Windows, the loader does make the effort to preserve its value across calls to LdrLoadDll. This works (since the loader knows that none of the code that it is calling will use NT_TIB.ArbitraryUserPointer), so long as you don’t have cross-thread accesses to a different thread’s NT_TIB.ArbitraryUserPointer. To date, I have never seen a program that tries to do this, and a good thing too, or it would randomly fail when DLLs are being loaded. Because the original value of NT_TIB.ArbitraryUserPointer is restored, the calling thread is typically none the wiser that this substitution has been performed.
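
In C-like terms, the borrow-and-restore pattern looks roughly like the sketch below. This is an illustration only and not the loader’s actual source: FakeTib, MapHook, map_with_published_name, and capture_name_hook are all hypothetical names, the thread block is reduced to the single field of interest, and the mapping call is replaced by a callback so that the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the per-thread block; the real NT_TIB has
 * several more fields around this one. */
typedef struct {
    void *ArbitraryUserPointer;
} FakeTib;

/* Hypothetical stand-in for the mapping call. It receives the thread block
 * so that it can inspect the slot while the name is published. */
typedef int (*MapHook)(FakeTib *tib, void *context);

/* Save the slot, publish the DLL name through it for the duration of the
 * mapping call, then restore the caller's value whatever the result was. */
static int map_with_published_name(FakeTib *tib, const void *dll_name,
                                   MapHook map, void *context)
{
    void *saved = tib->ArbitraryUserPointer;
    int status;

    tib->ArbitraryUserPointer = (void *)dll_name;
    status = map(tib, context);
    tib->ArbitraryUserPointer = saved;
    return status;
}

/* Example hook playing the part of the debugger notification: it records
 * whatever pointer is in the slot at the time of the call. */
static int capture_name_hook(FakeTib *tib, void *context)
{
    *(const void **)context = tib->ArbitraryUserPointer;
    return 0;
}
```

The window in which the slot holds the DLL name is exactly the duration of the callback, which is why only another thread peeking at the slot (or code running inside the call) could ever observe the substitution.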

Disassembling the part of the NTDLL loader responsible for mapping the DLL into the address space via NtMapViewOfSection (a subroutine named “LdrpMapViewOfDllSection” on Windows Vista), we can see this behavior in action:

ntdll!LdrpMapViewOfDllSection:
[...]
;
; Find the TEB address for the current thread.
; esi = NtCurrentTeb()->NtTib.Self
;
77f0e2ee 648b3518000000  mov     esi,dword ptr fs:[18h]
77f0e2f5 8365fc00        and     dword ptr [ebp-4],0
77f0e2f9 57              push    edi
77f0e2fa bf00000020      mov     edi,20000000h
77f0e2ff 857d18          test    dword ptr [ebp+18h],edi
77f0e302 c745f804000000  mov     dword ptr [ebp-8],4
77f0e309 0f85ce700400    jne     LdrpMapViewOfDllSection+0x26

ntdll!LdrpMapViewOfDllSection+0x42:
77f0e30f 8b4514          mov     eax,dword ptr [ebp+14h]
;
; Save away the previous ArbitraryUserPointer value.
;
; ebx = Teb->NtTib.ArbitraryUserPointer
77f0e312 8b5e14          mov     ebx,dword ptr [esi+14h]
77f0e315 6a04            push    4
77f0e317 ff7518          push    dword ptr [ebp+18h]
;
; Set the ArbitraryUserPointer value to the string pointer
; referring to the DLL name passed to LdrLoadDll.
; Teb->NtTib.ArbitraryUserPointer = (PVOID)DllNameString;
; 
77f0e31a 894614          mov     dword ptr [esi+14h],eax
77f0e31d 6a01            push    1
77f0e31f ff7510          push    dword ptr [ebp+10h]
77f0e322 33c0            xor     eax,eax
77f0e324 50              push    eax
77f0e325 50              push    eax
77f0e326 50              push    eax
77f0e327 ff750c          push    dword ptr [ebp+0Ch]
77f0e32a 6aff            push    0FFFFFFFFh
77f0e32c ff7508          push    dword ptr [ebp+8]
;
; Call NtMapViewOfSection to map the image and perform the
; debugger notification.
;
77f0e32f e830180300      call    NtMapViewOfSection
77f0e334 857d18          test    dword ptr [ebp+18h],edi
77f0e337 5f              pop     edi
;
; Restore the previous value of
; Teb->NtTib.ArbitraryUserPointer.
;
77f0e338 895e14          mov     dword ptr [esi+14h],ebx
77f0e33b 5e              pop     esi
77f0e33c 894514          mov     dword ptr [ebp+14h],eax
77f0e33f 5b              pop     ebx
77f0e340 0f85bc700400    jne     LdrpMapViewOfDllSection+0x75

Sure enough, the user mode loader uses the current thread’s NT_TIB.ArbitraryUserPointer to communicate the DLL name string pointer (in this context, the “eax” value stored into NT_TIB.ArbitraryUserPointer is the DLL name string). We can easily verify this in the debugger:

Breakpoint 0 hit
eax=0017ecfc ebx=00000000 ecx=0017ecd8
edx=774951b4 esi=c0000135 edi=0017ed80
eip=773fe2e5 esp=0017ec10 ebp=0017ed18
iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b
gs=0000             efl=00000246
ntdll!LdrpMapViewOfDllSection:
773fe2e5 8bff            mov     edi,edi
0:000> g 773fe31a 
eax=001db560 ebx=00000000 ecx=0017ecd8
edx=774951b4 esi=7ffdf000 edi=20000000
eip=773fe31a esp=0017ebf0 ebp=0017ec0c
iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b
gs=0000             efl=00000246
ntdll!LdrpMapViewOfDllSection+0x4d:
773fe31a 894614          mov     dword ptr [esi+14h],eax
0:000> du @eax
001db560  "C:\Windows\system32\CLBCatQ.DLL"

Looking in the kernel, we can clearly see the call to DbgkMapViewOfSection:

ntoskrnl!NtMapViewOfSection+0x21a:
0060a9b6 50              push    eax
0060a9b7 8b55e0          mov     edx,dword ptr [ebp-20h]
0060a9ba 8b4dd8          mov     ecx,dword ptr [ebp-28h]
0060a9bd e86e1c0100      call    ntoskrnl!DbgkMapViewOfSection

Additionally, we can see the references to NT_TIB in DbgkMapViewOfSection:

ntoskrnl!DbgkMapViewOfSection+0x65:
;
; Load eax with the address of the current thread's
; KTHREAD object.
;
; Here, fs refers to the KPCR.
;    +0x120 PrcbData         : _KPRCB
;  (in KPRCB)
;    +0x004 CurrentThread    : Ptr32 _KTHREAD
;
0061c695 64a124010000    mov     eax,dword ptr fs:[00000124h]
;
; Load esi with the address of the current thread's
; user mode PTEB.
;
; Here, we have the following layout in KTHREAD:
;    +0x084 Teb              : Ptr32 Void
;
0061c69b 8bb084000000    mov     esi,dword ptr [eax+84h]
0061c6a1 eb02            jmp     DbgkMapViewOfSection+0x75
ntoskrnl!DbgkMapViewOfSection+0x75:
0061c6a5 3bf3            cmp     esi,ebx
0061c6a7 7421            je      DbgkMapViewOfSection+0x9a
0061c6a9 3b8a44010000    cmp     ecx,dword ptr [edx+144h]
0061c6af 7519            jne     DbgkMapViewOfSection+0x9a
0061c6b1 56              push    esi
0061c6b2 e82c060200      call    DbgkpSuppressDbgMsg
0061c6b7 85c0            test    eax,eax
0061c6b9 0f85bf000000    jne     DbgkMapViewOfSection+0x144
0:000> u
ntoskrnl!DbgkMapViewOfSection+0x8f:
;
; Recall that 14h is the offset of the
; ArbitraryUserPointer member in NT_TIB,
; and that NT_TIB is the first member of TEB.
;
;    +0x000 NtTib            : _NT_TIB
;  (in NT_TIB)
;    +0x014 ArbitraryUserPointer : Ptr32 Void
;
0061c6bf 83c614          add     esi,14h
;
; [ebp-90h] now holds the address of the current thread's
; NtCurrentTeb()->NtTib.ArbitraryUserPointer field (not its value).
;
0061c6c2 89b570ffffff    mov     dword ptr [ebp-90h],esi
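
The offsets used in these listings can be double-checked with a small, portable sketch of the 32-bit NT_TIB layout. The structure name NtTib32Sketch and the helper name_pointer_address are made up for this example, and pointers are modelled as uint32_t so that the offsets come out the same on any host. The helper mirrors the “add esi,14h” step: it produces the address of the field, not its contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the 32-bit NT_TIB layout. Pointers are modelled as uint32_t so
 * the offsets do not depend on the machine compiling this. */
typedef struct {
    uint32_t ExceptionList;        /* +0x00 */
    uint32_t StackBase;            /* +0x04 */
    uint32_t StackLimit;           /* +0x08 */
    uint32_t SubSystemTib;         /* +0x0c */
    uint32_t FiberData;            /* +0x10 (shares storage with Version) */
    uint32_t ArbitraryUserPointer; /* +0x14 */
    uint32_t Self;                 /* +0x18, what fs:[18h] reads in user mode */
} NtTib32Sketch;

/* Given a 32-bit TEB base address, compute the debuggee-relative address of
 * the ArbitraryUserPointer field. NT_TIB is the first member of the TEB, so
 * no further adjustment is needed. */
static uint32_t name_pointer_address(uint32_t teb_base)
{
    return teb_base + (uint32_t)offsetof(NtTib32Sketch, ArbitraryUserPointer);
}
```

With the TEB address seen in the user mode debugger session earlier (esi=7ffdf000), the field would sit at 7ffdf014.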

Such is the story of how the filename that you pass to LoadLibrary ends up being communicated to the debugger, in a rather roundabout and hackish way.

It is also worth noting that the kernel cannot trust the user mode supplied filename when opening the file handle to the DLL that is passed to the debugger process. This is because the kernel uses ZwOpenFile, which bypasses normal security checks. As a result, the kernel needs to retrieve the filename by querying the section’s associated PFILE_OBJECT anyway, although for a different purpose than providing the filename to the debugger.

One Response to “Debugger internals: How loaded module names are communicated to the debugger”

  1. afei says:

    This is really good stuff. It’s glad to know there are somebody else digging into MiMapViewOfSection(). :-)