Programming against the x64 exception handling support, part 5: Collided unwinds

Previously, I discussed the internal workings of RtlUnwindEx. While that posting covered most of the inner details regarding unwind support, I didn’t fully cover some of the corner cases.

Specifically, I haven’t yet discussed just what a “collided unwind” really is, other than providing vague hints as to its existance. A collided unwind occurs when an unwind handler initiates a secondary unwind operation in the context of an unwind notification callback. In other words, a collided unwind is what occurs when, in the process of a stack unwind, one of the call frames changes the target of an unwind. This has several implications and requirements in order to operate as one might expect:

  1. Some unwind handlers that were on the original unwind path might no longer be called, depending on the new unwind target.
  2. The current unwind call stack leading into RtlUnwindEx will need to be interrupted.
  3. The new unwind operation should pick up where the old unwind operation left off. That is, the new unwind operation shouldn’t start unwinding the exception handler stack; instead, it must unwind the original stack, starting from the call frame after the unwind handler which initiated the new unwind operation.

Because of these conditions, the implementation of collided unwinds is a bit more complicated than one might expect. The main difficulty here is that the second unwind operation is initiated within the call stack of an existing unwind operation, but what the unwind handler “wants” to do is to unwind the stack that was already being unwound, except to a different target and with different parameters.

From an unwind handler’s perspective, all that needs to be done to accomplish this is to make a call to RtlUnwindEx in the context of an unwind handler callback for an unwind operation, and RtlUnwindEx magically takes care of all of the work necessary to make the collided unwind “just work”.

Allowing this sort of unwind operation to “just work” requires a bit of creative thinking from the perspective of RtlUnwindEx, however. The main difficult here is that RtlUnwindEx, when called from the unwind handler, somehow needs a way to recover the original context that was being unwound in order to “pick up” where the original call to RtlUnwindEx “left off” (when it called an unwind handler that initiated a collided unwind). Because there is no provision for passing a context record to RtlUnwindEx and indicating that RtlUnwindEx should use it as a starting point for an unwind operation (RtlUnwindEx always initiates the unwind in the current call stack), this poses a problem; how is RtlUnwindEx to recover the original unwind parameters from where it should initiate the “real” unwind?

The way that Microsoft decided to solve this problem is an elegant little hack of sorts. The solution all comes down to that mysterious exception handler around RtlpExecuteHandlerForUnwind: RtlpUnwindHandler. Recall from the previous article that RtlUnwindEx calls RtlpExecuteHandlerForUnwind in order to invoke an exception handler for unwind purposes, and that RtlpExecuteHandlerForUnwind sets up an exception handler (RtlpUnwindHandler) before calling the requested exception handler for unwind. At the time, these extra steps (the use of RtlpExecuteHandlerForUnwind, and its exception handler) probably looked a bit redundant, and in the process of a “conventional” unwind operation, the extra work that RtlUnwindEx goes through before calling an unwind handler doesn’t even come into play as adding any value.

That all changes when a collided unwind occurs, however. In the collided unwind case, RtlpExecuteHandlerForUnwind and RtlpUnwindHandler are critical to solving the problem of how to recover the original unwind parameters so that RtlUnwindEx can perform an unwind operation on the correct call stack. In order to understand just how RtlpUnwindHandler and friends come into play with a collided unwind, it’s necessary to take a closer look about just what RtlUnwindEx will do when it is called from the context of an unwind handler.

Since RtlUnwindEx always begins a call frame unwind from the currently active call stack, the second call to RtlUnwindEx will start unwinding the call stack of the unwind handler that called RtlUnwindEx. But wait, you might say – this isn’t what is supposed to happen! It turns out that unwinding the unwind handler’s call stack will actually lead up to the “right thing” happening, through a bit of clever use of how “conventional” unwind operations work. To better understand what I mean, it’s helpful to look at the stack of a secondary call to RtlUnwindEx (initiating a collided unwind operation). For this purpose, I’ve put together a small problem that initiates a collided unwind (more on how and why you might see a collided unwind in the “real world” later). I’ve set a breakpoint on RtlUnwindEx, and skipped forward until I encountered the nested call to RtlUnwindEx that was initiating a collided unwind operation:

0:000> k
Child-SP          Call Site
00000000`0012e058 ntdll!RtlUnwindEx
00000000`0012e060 ntdll!local_unwind+0x1c
00000000`0012e540 TestApp!`FaultingFunction2'::`1'::fin$2+0x34
00000000`0012e570 ntdll!_C_specific_handler+0x140
00000000`0012e5e0 ntdll!RtlpExecuteHandlerForUnwind+0xd
00000000`0012e610 ntdll!RtlUnwindEx+0x236
00000000`0012ec90 TestApp!UnwindExceptionHandler2+0xf8
00000000`0012f1b0 TestApp!`FaultingFunction2'::`1'::filt$1+0xe
[...]

At this point, given what we know about RtlUnwindEx, it will start unwinding the stack downward. Since the target of the collided unwind will by definition be lower in the stack than the unwind handler’s stack pointer itself, RtlUnwindEx will continue unwinding downward, calling unwind handlers (if any) for each successive frame. Taking a look at the call stack, we can determine that there are no frames with an exception handler marked for unwind (denotated by a [ U ] in the !fnseh output):

0:000> !fnseh ntdll!RtlUnwindEx
ntdll!RtlUnwindEx L295 22,0A [   ]  (none)
0:000> !fnseh ntdll!local_unwind
ntdll!local_unwind L24 07,02 [   ]  (none)
0:000> !fnseh 00000000`01001f04 
1001ed0 L3a 06,02 [   ]  (none)
0:000> !fnseh ntdll!_C_specific_handler+0x140
ntdll!_C_specific_handler L16a 20,0C [   ]  (none)

(Here, 00000000`01001f04 corresponds to TestApp!`FaultingFunction2′::`1′::fin$2+0x34).

Because none of these call frames have an exception handler marked for unwind callbacks, we can surmise that RtlUnwindEx will blissfully unwind past all of these call frames just as one might expect. At this point, RtlUnwindEx is still unwinding the “wrong” stack though; we’d like it to be unwinding the stack passed to the original call to RtlUnwindEx, and not the unwind/exception handler call stack.

Something that one might not immediately expect happens when RtlUnwindEx reaches the next frame, however. Remember that the current call frame is now _C_specific_handler – the C-language exception handler for the current function that was originally being unwound after an exception occured. This means that the next call frame will be the original RtlUnwindEx, or more precisely, RtlpExecuteHandlerForUnwind.

This is where RtlpExecuteHandlerForUnwind and RtlpUnwindHandler get to shine. If we take a look at the next call frame in the debugger, we see that it is indeed RtlpExecuteHandlerForUnwind, and that it also has (as expected) an exception handler marked for unwind support: RtlpUnwindHandler.

0:000> !fnseh ntdll!RtlpExecuteHandlerForUnwind+0xd
ntdll!RtlpExecuteHandlerForUnwind L13 04,01 [EU ]
  ntdll!RtlpUnwindHandler (assembler/unknown)

Because this call frame does have an exception handler that supports unwind callouts, it will be returned to RtlUnwindEx by RtlVirtualUnwind. This, in turn, will lead to RtlUnwindEx calling RtlpUnwindHandler, as registered by RtlpExecuteHandlerForUnwind in the original call stack (by RtlUnwindEx). We can verify this in the debugger:

0:000> bp ntdll!RtlpUnwindHandler
0:000> g
Breakpoint 1 hit
ntdll!RtlpUnwindHandler:
00000000`779507e0 488b4220  mov rax,qword ptr [rdx+20h]
0:000> k
Child-SP          Call Site
00000000`0012d9a8 ntdll!RtlpUnwindHandler
00000000`0012d9b0 ntdll!RtlpExecuteHandlerForUnwind+0xd
00000000`0012d9e0 ntdll!RtlUnwindEx+0x236
00000000`0012e060 ntdll!local_unwind+0x1c
00000000`0012e540 TestApp!`FaultingFunction2'::`1'::fin$2+0x34
00000000`0012e570 ntdll!_C_specific_handler+0x140
00000000`0012e5e0 ntdll!RtlpExecuteHandlerForUnwind+0xd
00000000`0012e610 ntdll!RtlUnwindEx+0x236
00000000`0012ec90 TestApp!UnwindExceptionHandler2+0xf8
00000000`0012f1b0 TestApp!`FaultingFunction2'::`1'::filt$1+0xe
[...]

This is where things start to get a little interesting. From the discussion in the previous article, we know that RtlpUnwindHandler essentially does the following:

  1. Retrieve the PDISPATCHER_CONTEXT argument that RtlpExecuteHandlerForUnwind (the original instance, from the original unwind operation initiated by the first call to RtlUnwindEx) saved on its stack. This is done via the use of the EstablisherFrame argument to RtlpUnwindHandler.
  2. Copy the contents of RtlpExecuteHandlerForUnwind’s DISPATCHER_CONTEXT over the DISPATCHER_CONTEXT of the current RtlUnwindEx instance, through the PDISPATCHER_CONTEXT argument provided to RtlpUnwindHandler. Note that the TargetIp member of the DISPATCHER_CONTEXT is not copied from RtlpExecuteHandlerForUnwind’s DISPATCHER_CONTEXT.
  3. Return the manifest ExceptionCollidedUnwind constant to the caller (RtlpExecuteHandlerForUnwind, which will in turn return this value to RtlUnwindEx).

After all this is done, control returns to RtlUnwindEx. Because RtlpExecuteHandlerForUnwind returned ExceptionCollidedUnwind, though, a previously unused code path is activated. This code path (as described previously) copies the contents of the DISPATCHER_CONTEXT structure whose address was passed to RtlpExecuteHandlerForUnwind back into the internal state of RtlUnwindEx (including the context record), and then attempts to re-start unwinding of the current stack frame.

If you’ve been paying attention so far, then you probably understand what is going to happen next.

Because of the fact that RtlpUnwindHandler copied the DISPATCHER_CONTEXT from the original call to RtlUnwindEx over the DISPATCHER_CONTEXT from the current (collided unwind) call to RtlUnwindEx, the current instance of RtlUnwindEx now has access to all of the state information that the original RtlUnwindEx instance had placed into the PDISPATCHER_CONTEXT passed to RtlpExecuteHandlerForUnwind. Most importantly, this includes access to the original context record descibing the call frame that the original instance of RtlUnwindEx was in the process of unwinding.

Since all of this information has now been copied over the current RtlUnwindEx instance’s internal state, in effect, the current instance of RtlUnwindEx will (for the next unwind iteration) start unwinding the stack where the original RtlUnwindEx instance stopped; in other words, the stack being unwound “jumps” from the currently active call stack to the exception (or other) call stack that was originally being unwound.

At this point, the second instance of RtlUnwindEx is all setup to unwind the call stack to the new unwind target frame (and target instruction pointer; remember that TargetIp was omitted from the copying performed on the PDISPATCHER_CONTEXT in RtlpUnwindHandler) like a “conventional” unwind. The rest is, as they say, history.

Now that we know how collided unwinds work, it is important to know when one would ever see such a thing (after all, interrupting an unwind in-progress is a fairly invasive and atypical operation).

It turns out that collided unwinds are not quite as far-fetched as they might seem; the easiest way to cause such an event is to do something sleazy like execute a return/goto/continue/break to transfer control out of a __finally block. This, in effect, requires that the compiler stop the current unwind operation and transfer control to the target location (which is usually within the function that contained the __finally that the programmer jumped out of). Nevertheless, the compiler still has to deal with the fact that it has been called in the context of an unwind operation, and as such it needs a way to “break out” of the unwind call stack. This is done by executing a “local unwind”, or an unwind to a location within the current function. In order to do this, the compiler calls a small, runtime-supplied helper function known as local_unwind. This function is described below, and is essentially an extremely thin wrapper around RtlUnwindEx that, in practice, adds no value other than providing some default argument values (and scratch space on the stack for RtlUnwindEx to use to store a CONTEXT structure):

0:000> uf ntdll!local_unwind
ntdll!local_unwind:
00000000`7796f580 4881ecd8040000 sub  rsp,4D8h
00000000`7796f587 4d33c0         xor  r8,r8
00000000`7796f58a 4d33c9         xor  r9,r9
00000000`7796f58d 4889642420     mov  qword ptr [rsp+20h],rsp
00000000`7796f592 4c89442428     mov  qword ptr [rsp+28h],r8
00000000`7796f597 e844b0fdff     call ntdll!RtlUnwindEx
00000000`7796f59c 4881c4d8040000 add  rsp,4D8h
00000000`7796f5a3 c3             ret

When the compiler calls local_unwind as a result of the programmer breaking out of a __finally block in some fashion, then execution will eventually end up in RtlUnwindEx. From there, RtlUnwindEx eventually detects the operation as a collided unwind, once it unwinds past the original call to the original unwind handler that started the new unwind operation via local_unwind.

As a result, breaking out of a __finally block instead of allowing it to run to completion (which may result in control being transferred out of the “current function”, from the programmer’s perspective, and “into” the next function in the call stack for unwind processing) is how every-day programs can end up causing a collided unwind.

Next time: More unwind estorica, including details on how RtlUnwindEx and RtlRestoreContext lay the groundwork used to build C++ exception handling support.

2 Responses to “Programming against the x64 exception handling support, part 5: Collided unwinds”

  1. […] In the last post in the programming x64 exception handling series, I described how collided unwinds were implemented, and just how they operate. That just about wraps up the guts of unwinding (finally), except for one last corner case: So-called frame consolidation unwinds. […]