{"id":453,"date":"2010-08-17T07:00:32","date_gmt":"2010-08-17T12:00:32","guid":{"rendered":"http:\/\/www.nynaeve.net\/?p=453"},"modified":"2019-12-13T17:41:56","modified_gmt":"2019-12-13T22:41:56","slug":"nwscript-jit-engine-under-the-hood-of-a-generated-msil-subroutine","status":"publish","type":"post","link":"http:\/\/www.nynaeve.net\/?p=453","title":{"rendered":"NWScript JIT engine: Under the hood of a generated MSIL subroutine"},"content":{"rendered":"<p><a title=\"NWScript JIT engine: Generating a .NET assembly for a JIT'd script\" href=\"http:\/\/www.nynaeve.net\/?p=435\">Yesterday<\/a>, I expounded on the basics of how assemblies for scripts are structured, and how variables, subroutines, and IR instructions are managed throughout this process.<\/p>\n<p>Nothing beats a good concrete example, though, so let&#8217;s examine a sample subroutine, both in NWScript source text form, and then again in MSIL form, and finally in JIT&#8217;d amd64 form.<\/p>\n<p><u><em>Example subroutine<\/em><\/u><\/p>\n<p>For the purposes of this example, we&#8217;ll take the following simple NWScript subroutine:<\/p>\n<pre>\r\nint g_randseed = 0;\r\n\r\nint rand()\r\n{\r\n\treturn g_randseed =\r\n     (g_randseed * 214013 + 2531101) >> 16;\r\n}<\/pre>\n<p>Here, we have a global variable, <em>g_randseed<\/em>, that is used by our random number generator.  
Because this is a global variable, it will be stored as an instance variable on the main program class of the script program, as we&#8217;ll see when we crack open the underlying IL for this subroutine:<\/p>\n<p><u><em>MSIL version<\/em><\/u><\/p>\n<pre>.method private instance int32  \r\nNWScriptSubroutine_rand() cil managed\r\n{\r\n  \/\/ Code size       110 (0x6e)\r\n  .maxstack  8\r\n  .locals init (int32 V_0,\r\n           uint32 V_1,\r\n           int32 V_2,\r\n           int32 V_3,\r\n           int32 V_4)\r\n  IL_0000:  ldarg.0\r\n  IL_0001:  ldarg.0\r\n  IL_0002:  ldfld      uint32 m_CallDepth\r\n  IL_0007:  ldc.i4.1\r\n  IL_0008:  add\r\n  IL_0009:  dup\r\n  IL_000a:  stloc.1\r\n  IL_000b:  stfld      uint32 m_CallDepth\r\n  IL_0010:  ldloc.1\r\n  IL_0011:  ldc.i4     0x80\r\n  IL_0016:  clt.un\r\n  IL_0018:  brtrue.s   IL_0025\r\n  IL_001a:  ldstr      \"Maximum call depth exceeded.\"\r\n  IL_001f:  newobj     instance void\r\n                        System.Exception::.ctor(string)\r\n  IL_0024:  throw\r\n<span style=\"color:red\">  IL_0025:  ldarg.0\r\n  IL_0026:  ldfld      int32 m__NWScriptGlobal4\r\n  IL_002b:  stloc.2<\/span>\r\n<span style=\"color:green\">  IL_002c:  ldc.i4     0x343fd\r\n  IL_0031:  stloc.3\r\n  IL_0032:  ldloc.2\r\n  IL_0033:  ldloc.3\r\n  IL_0034:  mul\r\n  IL_0035:  stloc.s    V_4<\/span>\r\n<span style=\"color:blue\">  IL_0037:  ldc.i4     0x269f1d\r\n  IL_003c:  stloc.2\r\n  IL_003d:  ldloc.s    V_4\r\n  IL_003f:  ldloc.2\r\n  IL_0040:  add\r\n  IL_0041:  stloc.3<\/span>\r\n<span style=\"color:teal\">  IL_0042:  ldc.i4     0x10\r\n  IL_0047:  stloc.s    V_4\r\n  IL_0049:  ldloc.3\r\n  IL_004a:  ldloc.s    V_4\r\n  IL_004c:  shr\r\n  IL_004d:  stloc.2<\/span>\r\n<span style=\"color:darkorchid\">  IL_004e:  ldloc.2\r\n  IL_004f:  stloc.3\r\n  IL_0050:  ldarg.0\r\n  IL_0051:  ldloc.3\r\n  IL_0052:  stfld      int32 m__NWScriptGlobal4<\/span>\r\n<span style=\"color:darkorange\">  IL_0057:  ldloc.2\r\n  IL_0058:  stloc.0\r\n  
IL_0059:  br         IL_005e<\/span>\r\n  IL_005e:  ldarg.0\r\n  IL_005f:  ldarg.0\r\n  IL_0060:  ldfld      uint32 m_CallDepth\r\n  IL_0065:  ldc.i4.m1\r\n  IL_0066:  add\r\n  IL_0067:  stfld      uint32 m_CallDepth\r\n<span style=\"color:darkorange\">  IL_006c:  ldloc.0\r\n  IL_006d:  ret<\/span>\r\n}\r\n\/\/ end of method\r\n\/\/ ScriptProgram::NWScriptSubroutine_rand\r\n<\/pre>\n<p>That&#8217;s a lot of code!  (Actually, it turns out to be not that much when the IL is JIT&#8217;d, as we&#8217;ll see.)<\/p>\n<p>Right away, you&#8217;ll probably notice some additional instrumentation in the generated subroutine; the code reads and updates m_CallDepth, an instance variable on the main program class.  This is part of the best-effort instrumentation that the JIT backend inserts into JIT&#8217;d programs so as to catch obvious programming mistakes before they take down the script host completely.<\/p>\n<p>In this particular case, the JIT&#8217;d code is instrumented to increment m_CallDepth on entry to the subroutine and decrement it again just before returning.  Should the current call depth reach the maximum limit (0x80, or 128, in the IL above; incidentally, the same limit that the interpretive VM imposes), a System.Exception is raised to abort the script program.<\/p>\n<p>This brings up a notable point, in that the generated IL code is designed to be safely aborted at any time by raising a System.Exception.  
An exception handler wrapping the entry point catches the exception, and the default return code for the script is returned up to the caller if a script is aborted in this way.<\/p>\n<p>Looking back to the generated code, we can see that the basic operations that we would expect are all there; there is code to <span style=\"color:red\">load the current value of g_randseed<\/span> (m__NWScriptGlobal4 in this case), <span style=\"color:green\">multiply it by a fixed constant<\/span> (0x343fd, or 214013 as we see in the NWScript source text), then <span style=\"color:blue\">perform the addition<\/span> and <span style=\"color:teal\">right shift<\/span>, before finally <span style=\"color:darkorchid\">storing the result back to g_randseed<\/span> (m__NWScriptGlobal4 again) and <span style=\"color:darkorange\">returning<\/span>.  (Whew, that&#8217;s it!)<\/p>\n<p>Even though the IL still contains a lot of loads and stores, most of them disappear once the CLR JIT compiles the MSIL to native code.  To see this in action, let&#8217;s look at the same code, now translated into amd64 instructions by the CLR JIT.  
Here, I used !sos.u from the sos.dll debugger extensions (the instructions are colored using the same coloring scheme as I used above):<\/p>\n<pre>0:007> !u 000007ff`001cbac0\r\nNormal JIT generated code\r\nNWScriptSubroutine_rand()\r\nBegin 000007ff001cbac0, size 7e\r\npush    rbx\r\npush    rdi\r\nsub     rsp,28h\r\nmov     rdx,rcx\r\nmov     eax,dword ptr [rdx+1Ch]\r\nlea     ecx,[rax+1]\r\nmov     dword ptr [rdx+1Ch],ecx\r\nxor     eax,eax\r\ncmp     ecx,80h\r\nsetb    al\r\ntest    eax,eax\r\nje      000007ff`001cbb07\r\n<span style=\"color:red\">mov     eax,dword ptr [rdx+34h]<\/span>\r\n<span style=\"color:green\">imul    eax,eax,343FDh<\/span>\r\n<span style=\"color:blue\">lea     ecx,[rax+269F1Dh]<\/span>\r\n<span style=\"color:teal\">sar     ecx,10h<\/span>\r\n<span style=\"color:darkorchid\">mov     dword ptr [rdx+34h],ecx<\/span>\r\nmov     eax,dword ptr [rdx+1Ch]\r\ndec     eax\r\nmov     dword ptr [rdx+1Ch],eax\r\n<span style=\"color:darkorange\">mov     eax,ecx\r\nadd     rsp,28h\r\npop     rdi\r\npop     rbx\r\nret<\/span>\r\nlea     rdx,[000007ff`001f3fd8]\r\nmov     ecx,70000005h\r\ncall    clr!JIT_StrCns\r\nmov     rbx,rax\r\nlea     rcx,[mscorlib_ni+0x4c6d28]\r\ncall    clr!JIT_TrialAllocSFastMP_InlineGetThread\r\nmov     rdi,rax\r\nmov     rdx,rbx\r\nmov     rcx,rdi\r\ncall    mscorlib_ni+0x376e20\r\n  (System.Exception..ctor(System.String)\r\nmov     rcx,rdi\r\ncall    clr!IL_Throw\r\nnop\r\n<\/pre>\n<p>(If you&#8217;re curious, this was generated with the .NET 4 JIT.)<\/p>\n<p>Essentially each and every one of the fundamental operations was turned into just a single amd64 instruction by the JIT compiler &#8212; not bad at all!  (The rest of the code you see here is the recursion guard.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yesterday, I expounded on the basics of how assemblies for scripts are structured, and how variables, subroutines, and IR instructions are managed throughout this process. 
Nothing beats a good concrete example, though, so let&#8217;s examine a sample subroutine, both in NWScript source text form, and then again in MSIL form, and finally in JIT&#8217;d amd64 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[51],"_links":{"self":[{"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/posts\/453"}],"collection":[{"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=453"}],"version-history":[{"count":10,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/posts\/453\/revisions"}],"predecessor-version":[{"id":498,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=\/wp\/v2\/posts\/453\/revisions\/498"}],"wp:attachment":[{"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=453"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=453"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.nynaeve.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=453"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}