NWScript JIT engine: Under the hood of a generated MSIL subroutine

Yesterday, I expounded on the basics of how assemblies for scripts are structured, and how variables, subroutines, and IR instructions are managed throughout this process.

Nothing beats a good concrete example, though, so let’s examine a sample subroutine, both in NWScript source text form, and then again in MSIL form, and finally in JIT’d amd64 form.

Example subroutine

For the purposes of this example, we’ll take the following simple NWScript subroutine:

int g_randseed = 0;

int rand()
	return g_randseed =
     (g_randseed * 214013 + 2531101) >> 16;

Here, we have a global variable, g_randseed, that is used by our random number generator. Because this is a global variable, it will be stored as an instance variable on the main program class of the script program, as we’ll see when we crack open the underlying IL for this subroutine:

MSIL version

.method private instance int32  
NWScriptSubroutine_rand() cil managed
  // Code size       110 (0x6e)
  .maxstack  8
  .locals init (int32 V_0,
           uint32 V_1,
           int32 V_2,
           int32 V_3,
           int32 V_4)
  IL_0000:  ldarg.0
  IL_0001:  ldarg.0
  IL_0002:  ldfld      uint32 m_CallDepth
  IL_0007:  ldc.i4.1
  IL_0008:  add
  IL_0009:  dup
  IL_000a:  stloc.1
  IL_000b:  stfld      uint32 m_CallDepth
  IL_0010:  ldloc.1
  IL_0011:  ldc.i4     0x80
  IL_0016:  clt.un
  IL_0018:  brtrue.s   IL_0025
  IL_001a:  ldstr      "Maximum call depth exceeded."
  IL_001f:  newobj     instance void
  IL_0024:  throw
  IL_0025:  ldarg.0
  IL_0026:  ldfld      int32 m__NWScriptGlobal4
  IL_002b:  stloc.2
  IL_002c:  ldc.i4     0x343fd
  IL_0031:  stloc.3
  IL_0032:  ldloc.2
  IL_0033:  ldloc.3
  IL_0034:  mul
  IL_0035:  stloc.s    V_4
  IL_0037:  ldc.i4     0x269f1d
  IL_003c:  stloc.2
  IL_003d:  ldloc.s    V_4
  IL_003f:  ldloc.2
  IL_0040:  add
  IL_0041:  stloc.3
  IL_0042:  ldc.i4     0x10
  IL_0047:  stloc.s    V_4
  IL_0049:  ldloc.3
  IL_004a:  ldloc.s    V_4
  IL_004c:  shr
  IL_004d:  stloc.2
  IL_004e:  ldloc.2
  IL_004f:  stloc.3
  IL_0050:  ldarg.0
  IL_0051:  ldloc.3
  IL_0052:  stfld      int32 m__NWScriptGlobal4
  IL_0057:  ldloc.2
  IL_0058:  stloc.0
  IL_0059:  br         IL_005e
  IL_005e:  ldarg.0
  IL_005f:  ldarg.0
  IL_0060:  ldfld      uint32 m_CallDepth
  IL_0065:  ldc.i4.m1
  IL_0066:  add
  IL_0067:  stfld      uint32 m_CallDepth
  IL_006c:  ldloc.0
  IL_006d:  ret
// end of method
// ScriptProgram::NWScriptSubroutine_rand

That’s a lot of code! (Actually, it turns out to be not that much when the IL is JIT’d, as we’ll see.)

Right away, you’ll probably notice some additional instrumentation in the generated subroutine; there is an instance variable on the main program class, m_CallDepth, that is being used. This is part of the best-effort instrumentation that the JIT backend inserts into JIT’d programs so as to catch obvious programming mistakes before they take down the script host completely.

In this particular case, the JIT’d code is instrumented to keep track of the current call depth in an instance variable on the main program class, m_CallDepth. Should the current call depth exceed a maximum limit (which, incidentally, is the same limit that the interpretive VM imposes), the a System.Exception is raised to abort the script program.

This brings up a notable point, in that the generated IL code is designed to be safely aborted at any time by raising a System.Exception. An exception handler wrapping the entry point catches the exception, and the default return code for the script is returned up to the caller if a script is aborted in this way.

Looking back to the generated code, we can see that the basic operations that we would expect are all there; there is code to load the current value of g_randseed (m__NWScriptGlobal4 in this case), multiply it with a fixed constant (0x343fd, or 214013 as we see in the NWScript source text), then perform the addition and right shift, before finally storing the result back to g_randseed (m__NWScriptGlobal4 again) and returning. (Whew, that’s it!)

Even though there are a lot of loads and stores here still, most of these actually disappear once the CLR JIT compiles the MSIL to native code. To see this in action, let’s look at the same code, now translated into amd64 instructions by the CLR JIT. Here, I used !sos.u from the sos.dll debugger extensions (the instructions are colored using the same coloring scheme as I used above):

0:007> !u 000007ff`001cbac0
Normal JIT generated code
Begin 000007ff001cbac0, size 7e
push    rbx
push    rdi
sub     rsp,28h
mov     rdx,rcx
mov     eax,dword ptr [rdx+1Ch]
lea     ecx,[rax+1]
mov     dword ptr [rdx+1Ch],ecx
xor     eax,eax
cmp     ecx,80h
setb    al
test    eax,eax
je      000007ff`001cbb07
mov     eax,dword ptr [rdx+34h]
imul    eax,eax,343FDh
lea     ecx,[rax+269F1Dh]
sar     ecx,10h
mov     dword ptr [rdx+34h],ecx
mov     eax,dword ptr [rdx+1Ch]
dec     eax
mov     dword ptr [rdx+1Ch],eax
mov     eax,ecx
add     rsp,28h
pop     rdi
pop     rbx
lea     rdx,[000007ff`001f3fd8]
mov     ecx,70000005h
call    clr!JIT_StrCns
mov     rbx,rax
lea     rcx,[mscorlib_ni+0x4c6d28]
call    clr!JIT_TrialAllocSFastMP_InlineGetThread
mov     rdi,rax
mov     rdx,rbx
mov     rcx,rdi
call    mscorlib_ni+0x376e20
mov     rcx,rdi
call    clr!IL_Throw

(If you’re curious, this was generated with the .NET 4 JIT.)

Essentially each and every one of the fundamental operations was turned into just a single amd64 instruction by the JIT compiler — not bad at all! (The rest of the code you see here is the recursion guard.)


One Response to “NWScript JIT engine: Under the hood of a generated MSIL subroutine”

  1. […] Nynaeve Adventures in Windows debugging and reverse engineering. « NWScript JIT engine: Under the hood of a generated MSIL subroutine […]