Side projects: NWScript JIT engine (introduction)

As an interesting experiment in program analysis, one of the side projects that I’ve been working on in my spare time is an execution environment for NWScript, the scripting language for Neverwinter Nights / Neverwinter Nights 2 and the Dragon Age series of games. (I happen to have spent some time analyzing NWN/NWN2 recently, hence this project.)

NWScript 101

From a source-level perspective, NWScript provides a C-like programming environment (there are no native arrays or pointers in the NWN/NWN2 iterations of NWScript). Like C, NWScript source text must be compiled to an object-code form in order for it to be consumed by the execution environment.

As a result, only the source text compiler for NWScript must deal with the program source code directly; the actual execution environment only deals with the NWScript object code. I’ll go into more details as to how the object code is structured in a future posting, but here’s a brief overview necessary to understand the scope of the project:

The NWScript object code format is essentially simply a raw instruction stream (think an old-style .com file), consisting of a series of NWScript instructions. The instruction set (in NWN/NWN2) is a stack-based, strongly-typed, small-complexity set of 45 distinct, variable-length instructions. (Some instructions perform several overloaded functions based on their operand types.) All control transfers in NWScript must be performed to fixed locations hardwired at compile time.

These properties combine to make the NWScript object code format readily analyzable (given sufficient effort), despite the fact that it is essentially provided in a completely unstructured, raw instruction-stream based form.

Project scope

The execution environment that I set out on creating involves several different components:

  • A NWScript VM, capable of interpretive execution of the NWScript object code format. (The BioWare games in question also use an interpretive VM style of execution environment.) While potentially slow(er), the NWScript VM supports even atypical, hand-built NWScript programs (as opposed to those emitted by a compiler).
  • A NWScript analyzer, capable of inspecting a NWScript program in object code form and performing several levels of analysis, up to producing a high-level, full-featured IR (or intermediate representation) describing the NWScript program’s behavior. The IR raising phase implemented by the NWScript analyzer is agnostic to any particular following backend phase, and it provides an opportunity to optimize the generated IR while it is in a backend-neutral form.
  • A NWScript JIT backend, capable of translating the high-level NWScript IR into machine-level instructions for native execution. The JIT backend is intended to be essentially a simple translator to a JIT environment that consumes NWScript IR (generated by the analysis phase), and ultimately emits native code. For this project, the first NWScript JIT backend that I have created emits MSIL that can be compiled to native code by the .NET JIT environment.

Out of scope for this particular posting series is the concept of a NWScript host, which is a program that invokes the NWScript execution environment in order to run a script. The NWScript host provides a series of extension points (action service handlers) that may be called by the script program in order to interface with the outside world. An example of a NWScript host might be the NWN1/NWN2 game servers.

An advanced NWScript host might utilize both the NWScript VM and the NWScript JIT system provided by this project (for example, scripts could be interpretively executed by the script VM until background JIT compilation finishes, after which subsequent execution could fall over to the JIT’d version of the script program).

Next time, I’ll go into some more gory details on the nature of the NWScript object code.

Credits

Tim Smith (Torlack), who now works at BioWare, documented the NWScript instruction set several years ago, and Justin Olbrantz contributed much of the code to the IR-raising portion of this project.

Tags:

3 Responses to “Side projects: NWScript JIT engine (introduction)”

  1. Nadav says:

    This sounds awesome. I wonder why you decided to use MSIL and not LLVM . LLVM has many advantages for JIT writers. It would be interesting to read your next posts.

  2. Skywing says:

    Justin actually suggested LLVM for this project. (Given how the IR generation phase is architected, it should be quite portable to other JIT code emit systems other than MSIL — the analyzer itself shouldn’t need any changes in that regard.)

    I went with MSIL for two reasons:

    1) Currently, all planned consumers of the JIT are Windows targets, and
    2) MSIL will get me support for all three architectures I support on Windows off the bat (x86, amd64, and ia64). [Yes, the ia64 build of the VM actually works and has been tested — though I would not be surprised if I end up being the only one to have ever tried it on an ia64.] Based on my understanding, LLVM’s Windows support isn’t fully mature and it seemed best to reduce the number of unknown components involved given all the moving parts between analyzing the program, generating IR, and translating it into a JIT-able form, at least for the initial bring-up work.

  3. […] Nynaeve Adventures in Windows debugging and reverse engineering. « Side projects: NWScript JIT engine (introduction) […]