Debugging (or reverse engineering…) a real life Windows Vista compatibility problem: CreateIpForwardEntry in iphlpapi

Since I’m at the Microsoft Vista compatibility lab, it only makes sense that I’ve fixed a few Vista compatibility bugs in our product today.

Some of these are real bugs, but I ran into one that is particularly infuriating: a completely undocumented, seemingly arbitrary restriction placed on a publicly documented API that has been around since Windows 98.

In this particular case, I was running into a problem where one of our products was unable to add routes on Vista. This worked fine on the prior platforms we supported, so I started looking into it as a compatibility problem. First things first, I narrowed the problem down to a particular API that was failing.

We have a function that wraps the various details of creating routes. The function in question went approximately like so:

// Add a route through the desired gateway.
// (The wrapper's name below is a placeholder.)

DWORD
AddRoute(
	__in unsigned long Network,
	__in unsigned long Mask,
	__in unsigned long Gateway
	)
{
	DWORD            Status, ForwardType;
	unsigned long    InterfaceIp, InterfaceIndex;
	MIB_IPFORWARDROW Row;

[...]	// (Code to determine the local
	// interface to add the route on)

	// Setup the IP forward row.  Zero it first so
	// that the fields we do not set are not garbage.

	ZeroMemory(&Row, sizeof(Row));

	Row.dwForwardDest    = Network;
	Row.dwForwardMask    = Mask;
	Row.dwForwardPolicy  = 0;
	Row.dwForwardNextHop = Gateway;
	Row.dwForwardIfIndex = InterfaceIndex;
	Row.dwForwardType    = ForwardType;
	Row.dwForwardProto   = PROTO_IP_NETMGMT;
	Row.dwForwardAge     = INFINITE;
	Row.dwForwardMetric1 = 0;

	// Create the route.

	if ((Status = CreateIpForwardEntry(&Row))
		!= NO_ERROR)
	{
		wprintf(L"Creation failed, %lu.\n",
			Status);
		return Status;
	}

[...]	// (More unrelated boilerplate code)

	return Status;
}

Essentially, the problem here was that CreateIpForwardEntry was failing. Checking logs, the error code logged was 0xA0.

Using the handy Microsoft error code lookup utility (err.exe), it was easy to determine what this error code means:

C:\>err a0
# for hex 0xa0 / decimal 160 :
  INTERNAL_POWER_ERROR                            bugcodes.h
  LLC_STATUS_BIND_ERROR                           dlcapi.h
  SQL_160_severity_15                             sql_err
# Rule does not contain a variable.
  ERROR_BAD_ARGUMENTS                             winerror.h
# One or more arguments are not correct.
# Too much incoming data%0
# 5 matches found for "a0"

The only error that makes sense in this context is ERROR_BAD_ARGUMENTS. Unfortunately, that is not really all that helpful. Checking the latest MSDN documentation for CreateIpForwardEntry, there is, of course, no mention of this error code whatsoever.

Additionally, nothing in the Microsoft documentation immediately suggested what the problem might be.

The Microsoft people here for the Vista lab did offer to see about getting me in touch with someone on the product team who might have an explanation for this behavior. In the meanwhile, I decided to take a crack at digging into the internals of CreateIpForwardEntry and understanding the problem myself, to see if I might be able to come up with a fix sooner. After searching around a bit on Google and not coming up with any good explanation for what was going wrong, I stepped into iphlpapi!CreateIpForwardEntry in the debugger to see just what was going wrong first-hand.

0:000> bu iphlpapi!CreateIpForwardEntry
breakpoint 0 redefined
0:000> g
Breakpoint 0 hit
eax=0012fd6c ebx=00000004 ecx=00000000 edx=00000000
esi=01040a0a edi=00000003
eip=751bdfc1 esp=0012fd58 ebp=0012fdb0 iopl=0
nv up ei pl nz ac pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000
751bdfc1 8bff            mov     edi,edi

Looking at the disassembly of CreateIpForwardEntry, it’s clear that this function is now just a stub that forwards the call onto another function that performs the real work:

0:000> u @eip
751bdfc1 8bff       mov     edi,edi
751bdfc3 55         push    ebp
751bdfc4 8bec       mov     ebp,esp
751bdfc6 6a01       push    1
751bdfc8 ff7508     push    dword ptr [ebp+8]
751bdfcb e820ffffff call    CreateOrSetIpForwardEntry
751bdfd0 5d         pop     ebp
751bdfd1 c20400     ret     4

So, I pressed onward, stepping into iphlpapi!CreateOrSetIpForwardEntry:

0:000> tc
751bdfcb e820ffffff call    CreateOrSetIpForwardEntry
0:000> t
eax=0012fd6c ebx=00000004 ecx=00000000 edx=00000000
esi=01040a0a edi=00000003
eip=751bdef0 esp=0012fd48 ebp=0012fd54 iopl=0
nv up ei pl nz ac pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000
751bdef0 8bff            mov     edi,edi

Looking at the disassembly, there appears to be only one place where the error code ERROR_BAD_ARGUMENTS is returned (disassembly truncated for better viewing):

0:000> uf @eip
751bdef0 8bff            mov     edi,edi
751bdef2 55              push    ebp
751bdef3 8bec            mov     ebp,esp
751bdef5 83ec48          sub     esp,48h
751bdef8 8365b800        and     dword ptr [ebp-48h],0
751bdefc 56              push    esi
751bdefd 6a2c            push    2Ch
751bdeff 8d45bc          lea     eax,[ebp-44h]
751bdf02 6a00            push    0
751bdf04 50              push    eax
751bdf05 e8f053ffff      call    memset
751bdf0a 8b7508          mov     esi,dword ptr [ebp+8]


; Convert the interface index we passed in with
; the pRoute structure (pRoute->dwForwardIfIndex,
; at [esi+10h]) into an interface LUID, stored at
; [ebp-30].

751bdf36 8d45d0          lea     eax,[ebp-30h]
751bdf39 50              push    eax
751bdf3a ff7610          push    dword ptr [esi+10h]
751bdf3d e86590ffff      call    ConvertInterfaceIndexToLuid
751bdf42 85c0            test    eax,eax
751bdf44 7571            jne     751bdfb7

; Get the interface metric for the requested interface,
; and store it at [ebp+8].  We pass in the address of
; the LUID of the requested interface in order to make
; the check.

751bdf46 8d4508          lea     eax,[ebp+8]
751bdf49 50              push    eax
751bdf4a 8d45d0          lea     eax,[ebp-30h]
751bdf4d 50              push    eax
751bdf4e e802f4ffff      call    GetInterfaceMetric


; Load esi with pRoute->dwForwardMetric1

751bdf6c 8b7624          mov     esi,dword ptr [esi+24h]
751bdf6f 6a06            push    6
751bdf71 8945e0          mov     dword ptr [ebp-20h],eax
751bdf74 83c8ff          or      eax,0FFFFFFFFh
751bdf77 3b7508          cmp     esi,dword ptr [ebp+8]
751bdf7a 59              pop     ecx
751bdf7b 8d7de8          lea     edi,[ebp-18h]
751bdf7e f3ab            rep stos dword ptr es:[edi]
751bdf80 8945ec          mov     dword ptr [ebp-14h],eax
751bdf83 8945f0          mov     dword ptr [ebp-10h],eax
751bdf86 5f              pop     edi

; Check that esi is not less than [ebp+8]
; ... in other words, verify that
; pRoute->dwForwardMetric1 >= InterfaceMetric,
; where InterfaceMetric is set by GetInterfaceMetric()

751bdf87 7229            jb      751bdfb2 ; failure

751bdf89 2b7508          sub     esi,dword ptr [ebp+8]
751bdf8c 6a18            push    18h
751bdf8e 8d45e8          lea     eax,[ebp-18h]
751bdf91 50              push    eax
751bdf92 6a30            push    30h
751bdf94 8d45b8          lea     eax,[ebp-48h]
751bdf97 50              push    eax
751bdf98 6a10            push    10h
751bdf9a 6864331b75      push    751b3364
751bdf9f ff750c          push    dword ptr [ebp+0Ch]
751bdfa2 8975f4          mov     dword ptr [ebp-0Ch],esi
751bdfa5 6a01            push    1
751bdfa7 c645ff01        mov     byte ptr [ebp-1],1

; Call the NsiSetAllParameters internal API to create the
; route, and return its return value to the caller.

751bdfab e86857ffff      call    NsiSetAllParameters
751bdfb0 eb05            jmp     751bdfb7

751bdfb2 b8a0000000      mov     eax,0A0h

751bdfb7 5e              pop     esi
751bdfb8 c9              leave
751bdfb9 c20800          ret     8

From this annotated disassembly, we can conclude that there are only two possibilities that might result in this behavior. The first is that GetInterfaceMetric(&InterfaceLuid, &InterfaceMetric) is returning an InterfaceMetric greater than the metric we are supplying. The second is that NsiSetAllParameters is returning ERROR_BAD_ARGUMENTS.

To test this theory, we need to examine the comparison at 751bdf87 to determine if that is taking the failure branch, and we need to check the return value of NsiSetAllParameters. This is fairly easy to do with a couple of breakpoints:

0:000> bu 751bdf87 
0:000> bu 751bdfb0 
0:000> g
Breakpoint 1 hit
eax=ffffffff ebx=00000004 ecx=00000000 edx=7707e524
esi=00000000 edi=00000003
eip=751bdf87 esp=0012fcf8 ebp=0012fd44 iopl=0
nv up ei ng nz ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000
751bdf87 7229            jb      751bdfb2 [br=1]

Our first breakpoint, the one on the comparison between the interface metric and the route metric we supplied in pRoute->dwForwardMetric1, was the one that hit first (as expected). The register context supplied by WinDbg makes the outcome clear: esi (our metric) is 0, the carry flag is set (cy), and the branch is annotated with [br=1], so the program is going to take the branch and head down the code path that returns ERROR_BAD_ARGUMENTS. Problem identified!

There still remains the issue of solving the problem, though. Looking at [ebp+8], it appears that the undocumented iphlpapi!GetInterfaceMetric returned 10:

0:000> ? dwo(@ebp+8)
Evaluate expression: 10 = 0000000a

This makes sense. We supplied a metric of 0, which is obviously less than 10. Unfortunately, now we need a good way to determine whether we should use a zero metric (for previous OS versions) or a different metric (for Vista), assuming we want our route to take precedence over other routes for a particular network/mask value.

Unfortunately, MSDN doesn’t turn up any hits on GetInterfaceMetric, and neither does Google. Well, that sucks: it looks like, for Vista, unless I want to hardcode 10, I’ll have to go off into undocumented land to use a publicly documented API. There seems to be something a bit ironic about that to me, but, nonetheless, the problem remains to be solved.

Update: There is a (minimally) documented solution that was very recently made available. See the bottom of the post for details.

So, all that we need to do is reverse engineer the parameters to this undocumented GetInterfaceMetric function and call it, right?

Well, no, not exactly. Things actually get worse. It turns out that GetInterfaceMetric is not even exported from iphlpapi.dll; it’s a purely internal function!

The only other option at this point, aside from hardcoding 10 as a minimum metric, is to reimplement all of the functionality of GetInterfaceMetric ourselves. Taking a look at GetInterfaceMetric, things look unfortunately rather complicated:

0:000> uf iphlpapi!GetInterfaceMetric
751bd355 8bff            mov     edi,edi
751bd357 55              push    ebp
751bd358 8bec            mov     ebp,esp
751bd35a 6a1c            push    1Ch
751bd35c 6a04            push    4
751bd35e ff750c          push    dword ptr [ebp+0Ch]
751bd361 6a00            push    0
751bd363 6a08            push    8
751bd365 ff7508          push    dword ptr [ebp+8]
751bd368 6a07            push    7
751bd36a 6864331b75      push    NPI_MS_IPV4_MODULEID
751bd36f 6a01            push    1
751bd371 e88f5fffff      call    NsiGetParameter
751bd376 5d              pop     ebp
751bd377 c20800          ret     8

NPI_MS_IPV4_MODULEID is a global variable of some sort in iphlpapi:

0:000> db iphlpapi!NPI_MS_IPV4_MODULEID l8
751b3364  18 00 00 00 01 00 00 00  ........

Using the x command with ascending order, we can make an educated guess as to the size of this global by enumerating all symbols in iphlpapi in address space order:

0:000> x /a iphlpapi!*
751b3364 iphlpapi!NPI_MS_IPV4_MODULEID = <no type information>
751b3381 iphlpapi!NsiAllocateAndGetTable = <no type information>

So, we know that NPI_MS_IPV4_MODULEID must be no more than 0x1d bytes long (0x751b3381 - 0x751b3364). Taking a look around NPI_MS_IPV4_MODULEID, we see that past 0x18 bytes in, there appears to be code (nop instructions), making it likely that the global is 0x18 bytes long.

0:000> db 751b3364 
751b3364  18 00 00 00 01 00 00 00-00 4a 00 eb 1a 9b d4 11
751b3374  91 23 00 50 04 77 59 bc-90 90 90 90 90 ff 25 94

(The repeated 90 90 90 90 bytes are a typical sign of code. 90 is the opcode for the nop instruction on x86, which the compiler typically uses for padding out function start offsets for alignment.)

Given this, we should be able to replicate the behavior of GetInterfaceMetric, as the only function it calls, NsiGetParameter, is exported by nsi.dll (of course, it isn’t documented…). From the above disassembly, we can see that NsiGetParameter takes, in order: a ulong-sized argument (constant 0x1), a pointer argument (address of NPI_MS_IPV4_MODULEID), a ulong-sized argument (constant 0x7), a pointer that is the address of the interface LUID (argument 1 of GetInterfaceMetric, which we saw earlier), a ulong-sized argument (constant 0x8), a ulong- or pointer-sized argument (constant 0x0), a pointer-sized argument (address of a ULONG that receives the “interface metric”), a ulong-sized argument (constant 0x4), and (finally!) a ulong-sized argument (constant 0x1c). I would surmise that the 0x8 and 0x4 constants are the sizes of the LUID and output buffer, though I haven’t bothered to confirm that at this point.

From our knowledge of __stdcall, we can quickly identify NsiGetParameter as __stdcall by looking at the disassembly of GetInterfaceMetric and noticing the behavior after the function call: the caller does not remove the arguments from the stack, which means it assumes that the callee (NsiGetParameter) performs that task.

Given all of this, we can make our own function that implements GetInterfaceMetric. Now, just to be clear, I would not recommend actually using this, unless Microsoft fails to provide a documented mechanism to determine the minimum metric permitted for CreateIpForwardEntry (or removes the restriction) prior to Vista RTM. I am going to try to do whatever I can to find out what ISVs are supposed to do about this particular problem (and whether it can be fixed before RTM) before this week is up. In the event that I don’t get anywhere, I’ll have a backup plan, as ugly and hackish as it may be. That is better than not being able to manipulate the route table, period, on Vista.

Anyway, the basic idea is that we call ConvertInterfaceIndexToLuid on the InterfaceIndex that we already have from iphlpapi, to convert this into a NET_LUID structure (new to Vista). It does so happen that ConvertInterfaceIndexToLuid is a documented API, which makes that the easy part.

Then, we simply replicate the call that we saw in GetInterfaceMetric inside iphlpapi.dll. For brevity, I am not posting the entire source code for my implementation of GetInterfaceMetric inline; you can, however, download it. With this reverse engineered implementation, all that is left is to call it to get the minimum metric for the interface we are about to add a route on, and place that metric in the MIB_IPFORWARDROW that we pass to CreateIpForwardEntry.

I’ll post back when I hear from Microsoft as to the official word as to how one is to handle this situation; I fully expect that there will be a documented API (or the restriction will go away) before RTM, at this point, given that this is a rather bad compatibility bug that breaks a long-existing documented API in such a way that requires you to go into undocumented hackery to continue to use it (especially since there is no other good way that I know of to replicate the functionality of the API in question).

Update: You can use the GetIpInterfaceEntry routine (new to Vista, in iphlpapi) to find the minimum metric for an interface. Note that you will very likely need to search on MSDN to find information on this function, as it’s not been included in recent SDKs to my knowledge.

(Note: Some of the debugger output was slightly modified or truncated by me to keep the formatting sane.)

6 Responses to “Debugging (or reverse engineering…) a real life Windows Vista compatibility problem: CreateIpForwardEntry in iphlpapi”

  1. Paul says:

    I definitely have seen something about changes in metric assignment in Vista; and that it is related to some properties of the interface, such as link speed. Try to ask network folks while you’re there.


  2. Hasan Jamal says:

    This is an excellent publication. I’m testing one of our application with Vista RC2 and CreateIpForwardEntry was also failing for me, I read all MSDN doc and did not get much help, so I thought to reverse engineer CreateIpForwardEntry. We use CreateIpForwardEntry to add a static route, the same is achieved from command prompt using ‘route add’. Interestingly, I found that when I set breakpoint to iphlpapi!CreateIpForwardEntry in WinDBG, it doesn’t break when I execute ‘route add’, I’m surprised that ‘route add’ doesn’t use CreateIpForwardEntry. So, I decided to look at the symbol in it’s exe file and found that CreateIpForwardEntry is not in it’s pe format of import table. May be if you reverse engineer Vista’s route.exe then we may find another way to implement ‘route add’. I’m sure many programmers around the world will be benefitted by it given the number of questions\queries coming out of googling for CreateIpForwardEntry. Keep us posted if Microsoft publishes new API GetInterfaceMetric.

    Keep up good work.


  3. Skywing says:

    Unfortunately, the Microsoft guys never got back to me about why there appears to be no documented way to get the minimum interface metric. I was told that the most likely person to pick the question up was out on vacation, so I’m still hoping for a solid answer on the question.

    Things don’t look good for this being fixed before RTM though, which means that this kludge may be required until perhaps SP1 (ick!). I’ll post an update if and when the MS appcompat guys get a response back from the appropriate team.

    Oh, and about how Route.exe works — I actually did reverse engineer just what it does before delving into how iphlpapi was making the metric check. In Vista, Route.exe uses the undocumented NSI APIs that iphlpapi calls on to both find out the minimum metric for the interface and actually create the route. So, yes, it is expected that you won’t see any calls to iphlpapi.

  4. Skywing says:

    The official response is that you must use GetIpInterfaceEntry (new to Vista) to do this. Unfortunately, at least as far back as August of this year this function was not even documented at all. It is a breaking change that will tend to disable programs that use CreateIpForwardEntry without them otherwise calling this new API.

  5. vineeta says:

    thnx this reverse engineering helps me a lot in studying the vista problem regarding CreateIpForwardEntry. i ‘ll try it and then if i ‘ll get some new result i’ll surely give u the reply

  6. devnull says:

    Thanks, you have really helped me. I got stuck with the same problem.