As an interesting experiment in program analysis, one of the side projects that I’ve been working on in my spare time is an execution environment for NWScript, the scripting language for Neverwinter Nights / Neverwinter Nights 2 and the Dragon Age series of games. (I happen to have spent some time analyzing NWN/NWN2 recently, hence this project.)
From a source-level perspective, NWScript provides a C-like programming environment (there are no native arrays or pointers in the NWN/NWN2 iterations of NWScript). Like C, NWScript source text must be compiled to an object-code form in order for it to be consumed by the execution environment.
As a result, only the source text compiler for NWScript must deal with the program source code directly; the actual execution environment only deals with the NWScript object code. I’ll go into more details as to how the object code is structured in a future posting, but here’s a brief overview necessary to understand the scope of the project:
The NWScript object code format is essentially simply a raw instruction stream (think an old-style .com file), consisting of a series of NWScript instructions. The instruction set (in NWN/NWN2) is a stack-based, strongly-typed, small-complexity set of 45 distinct, variable-length instructions. (Some instructions perform several overloaded functions based on their operand types.) All control transfers in NWScript must be performed to fixed locations hardwired at compile time.
These properties combine to make the NWScript object code format readily analyzable (given sufficient effort), despite the fact that it is essentially provided in a completely unstructured, raw instruction-stream based form.
The execution environment that I set out on creating involves several different components:
- A NWScript VM, capable of interpretive execution of the NWScript object code format. (The BioWare games in question also use an interpretive VM style of execution environment.) While potentially slow(er), the NWScript VM supports even atypical, hand-built NWScript programs (as opposed to those emitted by a compiler).
- A NWScript analyzer, capable of inspecting a NWScript program in object code form and performing several levels of analysis, up to producing a high-level, full-featured IR (or intermediate representation) describing the NWScript program’s behavior. The IR raising phase implemented by the NWScript analyzer is agnostic to any particular following backend phase, and it provides an opportunity to optimize the generated IR while it is in a backend-neutral form.
- A NWScript JIT backend, capable of translating the high-level NWScript IR into machine-level instructions for native execution. The JIT backend is intended to be essentially a simple translator to a JIT environment that consumes NWScript IR (generated by the analysis phase), and ultimately emits native code. For this project, the first NWScript JIT backend that I have created emits MSIL that can be compiled to native code by the .NET JIT environment.
Out of scope for this particular posting series is the concept of a NWScript host, which is a program that invokes the NWScript execution environment in order to run a script. The NWScript host provides a series of extension points (action service handlers) that may be called by the script program in order to interface with the outside world. An example of a NWScript host might be the NWN1/NWN2 game servers.
An advanced NWScript host might utilize both the NWScript VM and the NWScript JIT system provided by this project (for example, scripts could be interpretively executed by the script VM until background JIT compilation finishes, after which subsequent execution could fall over to the JIT’d version of the script program).
Next time, I’ll go into some more gory details on the nature of the NWScript object code.
Tim Smith (Torlack), who now works at BioWare, documented the NWScript instruction set several years ago, and Justin Olbrantz contributed much of the code to the IR-raising portion of this project.