What is the “lpReserved” parameter to DllMain, really? (Or a crash course in the internals of user mode process initialization)

One of the parameters to the DllMain function is the enigmatic lpReserved argument. According to MSDN, this parameter is used to convey whether a DLL is being loaded (or unloaded) as part of process startup or termination, or as part of a dynamic DLL load/unload operation (e.g. LoadLibrary/FreeLibrary).

Specifically, MSDN says that for static DLL operations, lpReserved contains a non-null value, whereas for dynamic DLL operations, it contains a null value.

There’s actually a little bit more to this parameter, though. While it’s true that in the case of DLL_PROCESS_DETACH operations, it is little more than a boolean value, it has some more significance in DLL_PROCESS_ATTACH operations.
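
As a point of reference, the documented (boolean) reading of lpReserved can be sketched as below. This is only a minimal illustration: the load_kind enum and the classify_reserved helper are names made up for this sketch, not anything from the Windows headers, and the DllMain body is compiled only when building for Windows.

```c
#include <stddef.h>

/* Hypothetical labels for this sketch; not part of any Windows header. */
typedef enum {
    LOAD_KIND_STATIC,   /* process startup or termination */
    LOAD_KIND_DYNAMIC   /* LoadLibrary / FreeLibrary */
} load_kind;

/* Documented behavior: non-null means static, null means dynamic. */
static load_kind classify_reserved(const void *reserved)
{
    return reserved != NULL ? LOAD_KIND_STATIC : LOAD_KIND_DYNAMIC;
}

#ifdef _WIN32
#include <windows.h>

BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID lpReserved)
{
    (void)instance;
    switch (reason) {
    case DLL_PROCESS_ATTACH:
    case DLL_PROCESS_DETACH:
        /* Only the null / non-null distinction is used here. */
        OutputDebugStringA(classify_reserved(lpReserved) == LOAD_KIND_STATIC
                               ? "static load/unload\n"
                               : "dynamic load/unload\n");
        break;
    }
    return TRUE;
}
#endif
```

Keeping the check in a tiny helper separates the part MSDN actually promises from anything that peeks at what the pointer refers to.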

To understand this, you need to know a little bit more about how process/thread initialization occurs. When a new user mode thread is started, the kernel queues a user mode APC to it, pointing to a function in ntdll called LdrInitializeThunk (ntdll is always mapped into the address space of a new process before any user mode code runs). The kernel also arranges for the thread to dispatch the user mode APC the first time it begins execution.

One of the arguments to the APC is a pointer to a CONTEXT structure describing the initial execution state of the new thread. (The CONTEXT structure itself is stored on the thread’s stack.)

When the thread is first resumed, the APC executes and control transfers to ntdll!LdrInitializeThunk. From there, depending on whether the process is already initialized or not, either the process initialization code is executed (loading DLLs statically linked to the process, and so forth), or per-thread initialization code is run (for instance, making DLL_THREAD_ATTACH callouts to loaded DLLs).

If a user mode thread always actually begins execution at ntdll!LdrInitializeThunk, then you might be wondering how it ever starts executing at the start address specified in a CreateThread call. The answer is that eventually, the code called by LdrInitializeThunk passes the context record argument supplied by the kernel to the NtContinue system call, which you can think of as simply taking that context and transferring control to it. Because the context record argument to the APC contained the information necessary for control to be transferred to the starting address supplied to CreateThread, the thread then begins executing at the expected thread starting address*.

(*: Actually, there is typically another layer here – usually, control would go to a kernel32 or ntdll function (depending on whether you are running on a downlevel platform or on Vista), which sets up a top level exception handler and then calls the start routine supplied to CreateThread. But, for the purposes of this discussion, you can consider it as just running the requested thread start routine.)

As all of this relates to DllMain, the value of the lpReserved parameter to DllMain (when process initialization is occurring and static-linked DLLs are being loaded and initialized) corresponds to the context record argument supplied to the LdrInitializeThunk APC, and is thus representative of the initial context that will be set for the thread after initialization completes. In fact, it’s actually the context that will be used after process initialization is complete, and not just a copy of it. This means that by treating the lpReserved argument as a PCONTEXT, the initial execution state for the first thread in the process can be examined (or even altered) from the DllMain of a static-linked DLL.
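
A rough sketch of what that might look like from inside a static-linked DLL on x64 follows. Treat it strictly as an experiment: the cast to CONTEXT relies on the undocumented behavior described here, format_initial_state is a helper name invented for this example, and which registers are interesting (Rip and Rcx below) is an assumption based on the x64 debugger session in this post.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper for this sketch: formats two register values.
   Returns the number of characters written (excluding the terminator). */
static int format_initial_state(char *buf, size_t size,
                                unsigned long long pc,
                                unsigned long long first_arg)
{
    assert(buf != NULL);
    return snprintf(buf, size, "rip=%016llx rcx=%016llx", pc, first_arg);
}

#if defined(_WIN32) && (defined(_M_X64) || defined(__x86_64__))
#include <windows.h>

BOOL WINAPI DllMain(HINSTANCE instance, DWORD reason, LPVOID lpReserved)
{
    (void)instance;
    if (reason == DLL_PROCESS_ATTACH && lpReserved != NULL) {
        /* ASSUMPTION (undocumented): during process initialization,
           lpReserved points at the initial thread's CONTEXT record. */
        const CONTEXT *initial = (const CONTEXT *)lpReserved;
        char line[64];

        format_initial_state(line, sizeof line, initial->Rip, initial->Rcx);
        OutputDebugStringA(line);
    }
    return TRUE;
}
#endif
```

The sketch only reads the record. Writing to it is where things get interesting (and risky), since whatever is stored there is what the first thread will eventually resume with.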

This can be verified experimentally by using some trickery to step into process initialization DllMain (more on just how to do that with the user mode debugger in a future entry, as it turns out to be a bit more complicated than what one might imagine):

1:001> g
Breakpoint 3 hit
000007fe`feb15580 48895c2408      mov     qword ptr [rsp+8],rbx ss:00000000`001defc0=0000000000000000
1:001> r
rax=0000000000000000 rbx=00000000002c4770 rcx=000007fefeaf0000
rdx=0000000000000001 rsi=000007fefeb15580 rdi=0000000000000003
rip=000007fefeb15580 rsp=00000000001defb8 rbp=0000000000000000
 r8=00000000001df530  r9=00000000001df060 r10=00000000002c1310
r11=0000000000000246 r12=0000000000000000 r13=00000000001df0f0
r14=0000000000000000 r15=000000000000000d
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b
000007fe`feb15580 48895c2408      mov     qword ptr [rsp+8],rbx ss:00000000`001defc0=0000000000000000
1:001> k
RetAddr           Call Site
00000000`77414664 ADVAPI32!DllInitialize
00000000`77417f29 ntdll!LdrpRunInitializeRoutines+0x257
00000000`7748e974 ntdll!LdrpInitializeProcess+0x16af
00000000`7742c4ee ntdll! ?? ::FNODOBFM::`string'+0x1d641
00000000`00000000 ntdll!LdrInitializeThunk+0xe
1:001> .cxr @r8
rax=0000000000000000 rbx=0000000000000000 rcx=00000000ffb3245c
rdx=000007fffffda000 rsi=0000000000000000 rdi=0000000000000000
rip=000000007742c6c0 rsp=00000000001dfa08 rbp=0000000000000000
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=0000  es=0000  fs=0000  gs=0000
00000000`7742c6c0 4883ec48        sub     rsp,48h
1:001> u @rcx
00000000`ffb3245c 4883ec28        sub     rsp,28h

If you’ve been paying attention thus far, then you might be able to explain why, when you set a hardware breakpoint at the initial process breakpoint, the debugger warns you that it will not take effect. For example:

(16f0.1890): Break instruction exception - code 80000003 (first chance)
00000000`7742fdf0 cc              int     3
0:000> ba e1 kernel32!CreateThread
        ^ Unable to set breakpoint error
The system resets thread contexts after the process
breakpoint so hardware breakpoints cannot be set.
Go to the executable's entry point and set it then.
 'ba e1 kernel32!CreateThread'
0:000> k
RetAddr           Call Site
00000000`774974a8 ntdll!DbgBreakPoint
00000000`77458068 ntdll!LdrpDoDebuggerBreak+0x35
00000000`7748e974 ntdll!LdrpInitializeProcess+0x167d
00000000`7742c4ee ntdll! ?? ::FNODOBFM::`string'+0x1d641
00000000`00000000 ntdll!LdrInitializeThunk+0xe

Specifically, this message occurs because the debugger knows that after the process breakpoint occurs, the current thread context will be discarded at the call to NtContinue. As a result, hardware breakpoints (which rely on the debug register state) will be wiped out when NtContinue restores the expected initial context of the new thread.

If one is clever, it’s possible to apply the necessary modifications to the debug register values in the context record image that is given as an argument to LdrInitializeThunk; they will then take effect when the NTDLL initialization code for the thread hands that context to NtContinue.
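
To make the debug register part a little more concrete, here is a hedged sketch of the arithmetic involved. The DR7 layout used (local enable bit for slot n at bit 2n, and a 4-bit R/W plus LEN field per slot starting at bit 16) is the standard x86 one. The function names are invented for this example, and touching ContextFlags is my assumption about what is needed for the debug registers to be restored, not something verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns dr7 with an execute breakpoint enabled
   in the given slot (0..3), leaving the other slots untouched. */
static uint64_t dr7_enable_execute_bp(uint64_t dr7, unsigned slot)
{
    assert(slot < 4);
    /* R/W = 00 (break on execution) and LEN = 00 (required for
       execution breakpoints): clear the 4-bit field for this slot. */
    dr7 &= ~((uint64_t)0xF << (16 + 4 * slot));
    /* Set the local enable bit (L0..L3) for this slot. */
    dr7 |= (uint64_t)1 << (2 * slot);
    return dr7;
}

#if defined(_WIN32) && (defined(_M_X64) || defined(__x86_64__))
#include <windows.h>

/* Hypothetical helper: arms slot 0 in the initial context record. */
static void arm_initial_breakpoint(PCONTEXT initial, void *target)
{
    initial->Dr0 = (DWORD64)(ULONG_PTR)target;
    initial->Dr7 = dr7_enable_execute_bp(initial->Dr7, 0);
    /* ASSUMPTION: the debug register portion must be flagged as valid
       for it to be applied when the context is restored. */
    initial->ContextFlags |= CONTEXT_DEBUG_REGISTERS;
}
#endif
```

For example, dr7_enable_execute_bp(0, 0) yields 0x1 and dr7_enable_execute_bp(0, 1) yields 0x4, i.e. just the local enable bit for the chosen slot.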

The fact that every user mode thread actually begins life at ntdll!LdrInitializeThunk also explains why, exactly, you can’t create a new thread in the current process from within DllMain and attempt to synchronize with it; by virtue of the fact that the current thread is executing DllMain, it must have the enigmatic loader lock acquired. Because the new thread begins execution at LdrInitializeThunk (even before the start routine you supply is called), for the purpose of making DLL_THREAD_ATTACH callouts, it too will become almost immediately blocked on the loader lock. This results in a classic deadlock if the thread already in DllMain tries to wait for the new thread.

Parting shots:

  • Win9x is, of course, completely dissimilar as far as DllMain goes. None of this information applies there.
  • The fact that lpReserved is a PCONTEXT is only very loosely documented by a couple of ancient DDK samples that used the PCONTEXT argument type, and some more recent SDK samples that name the “lpReserved” parameter “lpvContext”. As far as I know, it’s been around on all versions of NT (including Vista), but like other pseudo-documented things, it isn’t necessarily guaranteed to remain this way forever.
  • Oh, and in case you’re wondering why I used advapi32 instead of kernel32 in this example: due to a rather interesting quirk in ntdll on recent versions of Windows, kernel32 is always dynamic-loaded for every Win32 process (regardless of whether or not the main process image is static-linked to kernel32). To make things even more interesting, kernel32 is dynamic-loaded before static-linked DLLs are loaded. As a result, I decided it would be best to steer clear of it for the purposes of keeping this posting simple; be suitably warned, then, about trying this out on kernel32 at home.
