NWScript JIT engine: Subroutines, extension calls, and saved state

Previously, I discussed the fundamentals of how the NWScript instruction set operates, particularly with respect to the stack, the primary data store of the NWScript instruction set.

I left several topics out though, including subroutine calls, extension calls to APIs exposed by the script host (action service handler invocations, in NWScript parlance), and saved state (pseudo-coroutines). It’s worth discussing these before continuing further as the nature of how these constructs are implemented shapes the implementation of the JIT environment, in turn.

Subroutine calls

As with most programming languages, NWScript allows the programmer to define subroutines. NWScript’s support for subroutines is most akin to C-style functions (there are no classes in NWScript and thus no concept of member functions).

Unlike C, though, NWScript does not allow the control transfer address in the subroutine call instruction (JSR) to come from a non-constant source. Instead, the subroutine target must be hardcoded into the instruction operand (the instruction stream data space is not accessible to the instruction set itself).

One upshot of this is that, despite the fact that there is no structural listing of all subroutines in a compiled NWScript program, one can still determinstically identify such a listing by walking the instruction stream opcode-by-opcode.

Subroutines in NWScript are all invoked with a uniform calling convention, described as follows:

The caller first allocates stack space for the return values (if any) of the callee. (NWScript subroutines have a single return value from a source level perspective, but a subroutine that returns an aggregate type is implemented by simply returning multiple values.) Return value storage is created on the stack from right to left, and is created typed based on the return types of the subroutine. (Canonically, every stack cell is strongly typed in the NWScript instruction set.)
The caller then pushes the arguments (if any) to the callee in right-to-left order.
The caller executes the subroutine call via a JSR (jump subroutine) instruction. The return program counter is stored in a separate, dedicated stack that is invisible to the NWScript program (the stack pointer is not modified).
The callee completes its processing, and then copies any return values, if any existed, down the stack (to where the caller allocated return value space).
The callee cleans the parameters off the stack, and then executes a RET instruction to return to the caller. If there was a return program counter on the invisible return stack, control is transferred to it, otherwise the script program ends (i.e. return from main).

The main points to keep in mind are that the callee allocates the return value and parameter slot space, and the return values are left on the stack after the call.

The user’s entry point symbol (e.g. main) acts the same as any other subroutine from a calling convention perspective. However, there are some additional considerations when it comes to passing parameters to the entry point symbol, which we’ll discuss later.

Action service handler invocations (extension points)

In order for a script program to be particularly useful, it must have a means to interface with the outside world. This is accomplished via a set of action service handlers that a script host can expose to script programs.

Action service handlers can be logically thought of as an array of function pointers with a well-defined, uniform calling convention that can be invoked by the script VM on behalf of NWScript. As with subroutine calls, parameters and return values of action service handler calls are passed on the VM stack. There is a difference in the VM stack calling convention versus standard subroutine calls, however, in that the action servicer handler is responsible for pushing the return values (if any) on to the VM stack after it cleans the parameters. (Recall that for a standard subroutine call, the callee allocates space for the return value.)

In this respect, one can think of an action service handler call as simply an intrinsic, extended instruction that may be invoked.

To call an action service handler, a NWScript program uses the ACTION instruction, which includes several pieces of information:

An action ordinal, indicating which action service handler is to be invoked by the VM.
A parameter count, indicating how many (source-level) arguments have been passed to the action service routine. This is the count of arguments that the compiler recognized for the action handler at compile time and may not necessarily reflect the actual count of stack cells (for example, a ‘vector’ argument consumes three stack cells but increments the action parameter count by one).

Because there is no mechanism to perform structured linking of external procedure calls, the action service ordinal embedded into the instruction opcode indicates the extension function to invoke in the script host (similar to how a BIOS might expose functionality to real mode code via a specific interrupt number).

This mechanism provides two mechanisms to allow the set of actions defined by a script host to be expanded without breaking already compiled script programs:

Because actions are identified by ordinal, a script host can define a new action service hander and place it at the end of the action service handler array. Previously compiled script programs will still function, because prior action ordinals remain unchanged.
The ‘parameter count’ operand embedded in the ACTION instruction allows new parameters to be added (defaulted) to the end of the parameter list for a pre-existing action. In this case, by inspecting the parameter count, an action service handler can detect which parameters should be set to a defaulted value (if an old compiled script is being executed), or which are actually on the VM stack (if a new compiled script is being executed).

Additionally, in this design, the script source text compiler and the script VM themselves do not need to have any specific, hardcoded knowledge of the action service handlers. (The source text compiler looks for a list of prototypes in a well-defined header file, and assumes that these prototypes are actual service handler definitions, with the first prototype being action ordinal zero, and so forth.) This means that the script system itself can be structured to be logically isolated from the actual extension API exposed by a script host, making the core script compilation and runtime environment easily portable to new API sets (i.e. new games).

A typical implementation of the script environment would expose an API for action service handlers to push and pop data values from the VM stack, leaving the actual parameter and return value manipulation up to the implementation of a particular action service handler.

For example, an action service handler named ‘AngleToVector’ that might be prototyped like so in NWScript source text:

// Convert fAngle to a vector
vector AngleToVector(float fAngle);

… might typically be implemented like so by a script host:

#define SCRIPT_ACTION( Name )                                  \
	void                                                       \
	NWScriptHost::OnAction_##Name(                             \
	    __in NWScriptVM & ScriptVM,                            \
	    __in NWScriptStack & VMStack,                          \
	    __in NWSCRIPT_ACTION ActionId,                         \
	    __in size_t NumArguments                               \
	    )                                                      
// ...
SCRIPT_ACTION( AngleToVector )
/*++

Routine Description:

	This script action converts an angle to a vector.

Arguments:

	fAngle - Supplies the angle.

Return Value:

	The routine returns the converted vector.

Environment:

	User mode.

--*/
{
	float        fAngle = VMStack.StackPopFloat( ) * (PI / 180.0f);
	NWN::Vector3 Heading;

	Heading.x = cos( fAngle );
	Heading.y = sin( fAngle );
	Heading.z = 0.0f;

	VMStack.StackPushVector( Heading );
}

In a properly structured script execution environment, all of the application-specific extension points, such as the ‘AngleToVector’ action service handler described above, can thus be cleanly separated out from the core execution environment.

Saved states

As I mentioned previously, NWScript is designed to implement an easy-to-program concept for deferred execution. From a source-level, this is implemented by the concept of an ‘action’ argument type that may be passed to an action service handler (this type cannot be used for any purpose except in an action service handler prototype, as far as the source text compiler is concerned).

When the source text compiler encounters an ‘action’-typed value being referenced, it does not generate a typical stack data store instruction. Instead, it emits a special instruction, STORE_STATE, that directs the execution environment to save the current state of the script environment. For a typical VM-based implementation, this can very easily be accomplished by making a copy of the VM stack (which implicitly duplicates the entire data state of the script program).

The STORE_STATE instruction includes a (constant) operand that specifies a program counter to resume execution at. Using a saved VM execution state, the execution of the currently running script can be resumed at the STORE_STATE resume label. Because almost all of the state associated with the logical script invocation is present on the VM stack, resuming execution at a STORE_STATE resume label is very simple to implement for the VM-based execution environment.

From a NWScript source-text compiler perspective, any time that an ‘action’ type is passed to an action service handler, the source compiler allows an arbitrary statement that returns void to be substituted. (Typically, this might be a call to a subroutine that passes local or global variables on the argument list.)

When such a saved state construct is resumed, this arbitrary statement represents the STORE_STATE resume label, allowing a section of the script to be forked off for later execution (in a way that is conceptually similar in many respects to the standard POSIX ‘fork’ system call). The resume label has access to the global and local variables of the script program at the time when the script program’s state was saved, making maintaining state across a deferred execution context easy for the programmer.

From a script execution environment perspective, the execution environment would typically provide a special-purpose API to retrieve a snapshot of the most recently saved state (as created by a STORE_STATE instruction). This API would typically be used by action service handlers that accept an ‘action’-typed arguments.

To see how this paradigm can be used by a script host to provide a very easy to use way to delay execution, consider the following action service handler in NWN1/NWN2:

// Delay aActionToDelay by fSeconds.
void DelayCommand(float fSeconds, action aActionToDelay);

From a source level, it might be common to utilize this action service handler as follows:

DelayCommand( 1.0, DelayedFunctionCall( params ) );

The effect of such a source-level statement is that the ‘DelayedFunctionCall( params )’ section of logic is executed once the DelayCommand time expires (if we are speaking of the NWN1/NWN2 extension API set). Note that the script author is now freed of complex logic to package state up a timer callback or similar construct as other languages might often require; the deferred execution “just works” in a very intuitive manner.

Tags: NWN2

This entry was posted on Thursday, August 5th, 2010 at 7:00 am and is filed under Programming. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

One Response to “NWScript JIT engine: Subroutines, extension calls, and saved state”

NWScript JIT engine: Parameterized script entry point invocation and subroutine structural analysis « Nynaeve says:

August 9, 2010 at 10:33 pm

[…] Nynaeve Adventures in Windows debugging and reverse engineering. « NWScript JIT engine: Subroutines, extension calls, and saved state […]