Debugger internals: Why do ntoskrnl and ntdll have type information?

Johan Johansson sent me a mail asking why nt and ntdll have partial type information included. (If you’re at all experienced with debugging on Windows, you’ll know that public symbols, such as what Microsoft ships on the public symbol server, don’t typically include type information. Instead, one typically needs access to private symbols in order to view types.)

However, nt and ntdll are an exception to this rule, on Windows XP or later. Unlike all the other PDBs shipped by Microsoft, the ones corresponding to ntdll and ntoskrnl do include type information for a seemingly arbitrary mix of types, some publicly documented and some undocumented. There is, however, a method to the madness with respect to what symbols are included in the public nt/ntdll PDBs.

To understand what symbols are chosen and why, though, it’s necessary to know a bit of history.

Way back in the days when Windows NT was still called Windows NT (and not Windows 2000 or Windows XP), the debugger scene was a much less friendly place. In those days, “remote debugging” involved dialing up your debugger person with a modem, and if you wanted to kernel debug a target, you had to run a different kernel debugger program specific to the architecture on the target computer.

Additionally, you had to use a debugger version that was newer than the operating system on the target computer or things wouldn’t work out very well. Furthermore, one had to load architecture-specific extension DLLs for many kernel debugging tasks. One of the reasons for these restrictions is that the size and layout of many internal structures used by the debugger (and debugger extension modules) to do their work varied across architectures and OS releases. In other words, the size and layout of, say, EPROCESS might not be the same on Windows NT 3.1 x86 vs Windows NT 4.0 for the DEC Alpha.

When Windows 2000 was released, things became a little bit better: Windows 2000 only publicly supported x86, which reduced the number of different architectures that WinDbg needed to publicly support going forward. However, Windows XP and Windows Server 2003 reintroduced mainstream support for non-x86 architectures (first IA64 and then x64).

At some point on the road to Windows XP and Windows Server 2003, a decision was made to clean things up from the debugger perspective and introduce a more manageable way of supporting a large matrix of operating systems and target architectures.

Part of the solution devised involved providing a unified, future-compatible method for accessing data on the remote system. (This holds only where possible; obviously, new or radically redesigned OS functionality would require debugger extension changes, but things like simple structure size changes shouldn’t require such drastic measures.) Since all that a debugger extension does in the common case is reformat data on the target system into an easily human-readable format (such as the process list returned by !process), a unified way to communicate structure sizes and layouts to debugger extensions (and the debugger engine itself) would greatly reduce the headache of supporting the debugger on an ever-expanding set of platforms and operating systems. This solution involved putting the structures used by the debugger engine itself and many debugger extensions into the symbol files shipped with ntoskrnl and ntdll, and then providing a well-defined API for retrieving that type information from the perspective of a debugger extension.
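To make the idea a bit more concrete, here is a minimal sketch in Python of what "look the layout up in the symbols instead of hardcoding it" buys you. This is not the real debugger API, just a toy model: the structure name `_DEMO_PROCESS`, its fields, the two "builds", and every offset and size are invented for illustration and correspond to no actual Windows release.

```python
import struct

# Stand-in for the type information carried in the nt/ntdll PDBs.
# Everything here is hypothetical: the structure name, its fields, and
# the offsets and sizes are invented and match no real Windows build.
TYPE_INFO = {
    "build_a": {
        "_DEMO_PROCESS": {
            "size": 16,
            "fields": {"UniqueId": (0, "<I"), "ThreadCount": (4, "<I")},
        }
    },
    "build_b": {
        # Same fields, but the structure grew and the fields moved.
        "_DEMO_PROCESS": {
            "size": 24,
            "fields": {"UniqueId": (8, "<I"), "ThreadCount": (16, "<I")},
        }
    },
}


def get_type_size(build, type_name):
    """Return the size in bytes of a type, as described by the symbols."""
    return TYPE_INFO[build][type_name]["size"]


def read_field(memory, base, build, type_name, field):
    """Read one field by name; the offset comes from the symbols, not the code."""
    offset, fmt = TYPE_INFO[build][type_name]["fields"][field]
    return struct.unpack_from(fmt, memory, base + offset)[0]


def make_target(build, unique_id, thread_count):
    """Fabricate 'target memory' holding one _DEMO_PROCESS for the given build."""
    memory = bytearray(get_type_size(build, "_DEMO_PROCESS"))
    fields = TYPE_INFO[build]["_DEMO_PROCESS"]["fields"]
    for name, value in (("UniqueId", unique_id), ("ThreadCount", thread_count)):
        offset, fmt = fields[name]
        struct.pack_into(fmt, memory, offset, value)
    return bytes(memory)


def format_process(memory, build):
    """The 'extension': reformat raw target data into readable text."""
    uid = read_field(memory, 0, build, "_DEMO_PROCESS", "UniqueId")
    threads = read_field(memory, 0, build, "_DEMO_PROCESS", "ThreadCount")
    return f"PROCESS id={uid} threads={threads}"


if __name__ == "__main__":
    for build in ("build_a", "build_b"):
        print(build, format_process(make_target(build, 4, 87), build))
```

The point of the sketch is that `format_process` never mentions an offset. The two builds lay the structure out differently, yet the same "extension" code formats both correctly, because the layout travels with the symbols rather than being compiled into the extension.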

Fast-forward to 2007. Now, there is a single, unified debugger engine for all target platforms and architectures (the i386kd.exe and ia64kd.exe that ship with the DTW distribution are essentially the same as the plain kd.exe and are vestigial remains of a bygone era; these files are simply retained for backwards compatibility with scripts and programs that drive the debugger), real remote debugging exists, and your debugger doesn’t break every time a service pack is released. All of this is made possible in part due to the symbol information available in the ntoskrnl and ntdll PDB files. This is why you can use WinDbg to debug Windows Vista RTM, despite the fact that WinDbg was released months before the final RTM build was shipped.

Symbol support is also a reason why there are no ‘srv03ext’, ‘srv03fre’, or ‘srv03chk’ extension directories under your Debugging Tools for Windows folder. The nt4chk/nt4fre/w2kchk/w2kfre directories contain debugger extensions specific to those Windows builds. Due to the new unified debugger architecture, there is no longer a need to tie a binary to a particular operating system build going forward. Because Windows 2000 and Windows NT 4.0 don’t include type data, however, the old directories still remain for backwards compatibility with those platforms.

So, to answer Johan’s question: All of the symbols in the ntoskrnl or ntdll PDBs should be used by the debugger engine itself, or some debugger extension, somewhere. This is the final determining factor as to what types are exposed via those PDBs, to my knowledge: whether or not a public debugger extension DLL (or the debugger engine itself) uses them.
