Yesterday, we examined the IR instruction raising process, at a high level. With a basic understanding of both how the IR is generated, and the IR design itself, we can begin to talk about the JIT backends supported by the JIT system.
The first supported JIT backend is the MSIL JIT backend, which emits a .NET assembly (containing pure MSIL/CIL). The code generation process is performed via standard System.Reflection.Emit calls; one .NET assembly is created for each script program JIT’d (such that there may be many assemblies for a script host using multiple scripts).
There are two supporting components for this backend; a mixed C++/CLI DLL that defines most of the JIT system’s logic (NWNScriptJIT.dll), and a pure/verifiable C++/CLI DLL that describes interfaces referenced by the JIT’d code (NWNScriptJITIntrinsics.dll), created to work around an issue with referencing mixed mode CLR types in Reflection-emitted code. We’ll discuss the NWNScriptJITIntrinsics.dll module in further detail in a later post.
The backend encapsulates both logic to generate an assembly for a script, and supporting logic to serve as an execution environment for the generated assembly. To that end, the MSIL JIT backend (NWNScriptJIT.dll) exposes an external C API that can be used to generate JIT’d code for a script, and then repeatedly invoke the script.
Before we go into details as the extenal API that the JIT backend exposes, it’s important to understand the design philosophies behind the MSIL JIT backend, which are the following:
- The JIT backend should be agnostic to the underlying action service call API exposed by a script host. The action service call API prototypes are provided to the JIT backend from the script host via datatables at runtime. This ensures maximum flexibility for the backend.
- The backend should support efficient re-use of JIT’d code. In particular, the preferred and supported paradigm is to generate code for a script once during program execution, and then reuse the same code many times.
- The backend should support all NWScript programs that are compiler-generated (i.e. it should be feature complete from a language perspective).
- The backend should be compatible with the NWScriptVM logic that I had written, for easy drop-in replacement (or side by side execution).
- The backend should support the creation of “best-effort” safeguards to protect the script host from trivial programming errors (such as unbounded recursion or infinite loops) in a script program. These safeguards are best effort only with respect to resource consumption denial of service issues; it is accepted that a maliciously constructed script may be able to perform at most a denial of service attack against the script host (such as by resource consumption), but the script must not be able to execute arbitrary native instructions on the host.
In the future, a more strict quota system could be created, but the usage scenarios for the script program do not require it at this time (in the case of NWN-derived programs, any user who can supply an arbitrary script to a server (scripts are server-side) can control the behavior of the game world within the server completely; scripts merely need defend against a module malicious builder attempting to gain code execution against an end-user client in a scenario such as a downloadable single player module).
With these principles in mind, let’s take a look at the backend from the perspective of its external interface first. Although several APIs are exposed, the two most interesting are NWScriptGenerateCode, which takes raw NWScript instructions and produces an abstract NWSCRIPT_JITPROGRAM handle, and NWScriptExecuteScript, which takes a NWSCRIPT_JITPROGRAM handle and executes the underlying script using a given set of parameters.
The prototypes for these APIs are as follows:
// // Generate a JIT'd native program handle // for a script program, given an // analyzer instance which represents a // representation of the program's // function. // // The program may be re-used // across multiple executions. However, the script // program itself is single threaded and // does not support concurrent execution // across multiple threads. // // The routine returns TRUE on success, // else FALSE on failure. // BOOLEAN NWSCRIPTJITAPI NWScriptGenerateCode( __in NWScriptReaderState * Script, __in_ecount( ActionCount ) PCNWACTION_DEFINITION ActionDefs, __in NWSCRIPT_ACTION ActionCount, __in ULONG AnalysisFlags, __in_opt IDebugTextOut * TextOut, __in ULONG DebugLevel, __in INWScriptActions * ActionHandler, __in NWN::OBJECTID ObjectInvalid, __in_opt PCNWSCRIPT_JIT_PARAMS CodeGenParams, __out PNWSCRIPT_JITPROGRAM GeneratedProgram ); // // Execute a script program, // returning the results to the caller. // int NWSCRIPTJITAPI NWScriptExecuteScript( __in NWSCRIPT_JITPROGRAM GeneratedProgram, __in INWScriptStack * VMStack, __in NWN::OBJECTID ObjectSelf, __in_ecount_opt( ParamCount ) const NWScriptParamString * Params, __in size_t ParamCount, __in int DefaultReturnCode, __in ULONG Flags );
(For the curious, you can peek ahead and examine the full API set and associated comments (C++/CLI source); all functions defined in this source module are public APIs exposed by NWNScriptJIT.dll.)
Internally, both of these routines are thin wrappers around a C++/CLI class called NWScriptProgram, which represents the code generator and execution environment for a NWScript program. (The NWScriptProgram object instance created by NWScriptGenerateCode is returned to the caller in the form of a GCHandle cast to a NWSCRIPT_JITPROGRAM opaque type.)
This interface is explicitly designed to allow convenient use from C/C++ applications that are not integrated with the CLR. Although in its most optimized form, the backend generates code specific to a particular instance of a script host’s process, the backend does expose the capability to save the generated script assemblies to disk for debugging purposes.
There are several C++-level interfaces that the caller provides to the backend; IDebugTextOut (which simply abstracts the concept of a debug console print), INWScriptStack (which abstracts the concept of placing items on a VM stack or pulling them off a VM stack), and INWScriptActions, which facilitates interfacing with action service handler implementations, plus management of engine structure types.
The IDebugTextOut and INWScriptActions objects have lifetimes at least exceeding that of a NWSCRIPT_JITPROGRAM handle, whereas the INWScriptStack object has a lifetime at least exceeding that of a NWScriptExecuteScript invocation.
(Recall that as a design guideline, the JIT system should be compatible with action service handlers written against the interpretive script VM. Thus, the JIT system must place items on a dummy VM stack in order to interface with an action service handler in order for a compatible call interface to be retained, hence the INWScriptStack interface.)
In conventional usage, a script host will typically generate code for a script via NWScriptGenerateCode, and then make many repeated calls to the script’s generated code via NWScriptExecuteScript over the lifetime of the script host. Parameters are passed in the form of strings that are internally converted to their final types before calling the script entry point according to the standard conversion conventions for NWScript parameterized scripts. If the script program returns a value (only int-typed return values from entry point symbols are supported by the script compiler), then the return value of the script is returned from NWScriptExecuteScript; otherwise, the DefaultReturnCode parameter is returned.
Similarly, a set of APIs exist to deal with concepts such as aborting an executing script, or dealing with script saved states. This enables the JIT backend to support the same feature set that the interpretive NWScriptVM system does.
Next time, we’ll look at how the generated assembly representing a complete script program is made, including how the various fundamental data types are represented in MSIL and how the generated code corresponding to a given script is laid out structurally.