C++ with its object features appears to be a natural match for the semantics of Microsoft Windows Driver Model (WDM) and Windows Driver Foundation (WDF) drivers. However, some C++ language features can cause problems for kernel-mode drivers that can be difficult to find and solve. To help you make an informed choice, this paper shares current insights and recommendations from Microsoft's ongoing investigation of using C++ to write kernel-mode drivers for the Windows family of operating systems.
This information applies to the following operating systems: Microsoft Windows 2000, Microsoft Windows XP, Microsoft Windows Server 2003, Microsoft Windows Vista, and Microsoft Windows Server 2008.
With its object features, C++ appears to be a natural match for the semantics of Microsoft Windows Driver Model (WDM) and Windows Driver Foundation (WDF) drivers, and it is appealing for the added convenience and expressive power it provides to developers. However, technical issues associated with writing kernel-mode code in C++ using the currently available Microsoft compilers can cause problems in driver code.
Many developers use the C++ compiler as a "super-C" without using the full C++ language, because the C++ compiler enforces certain rules more strictly than standard C compilers and provides some additional features that happen to be safe for use in the context of drivers. Using the C++ compiler in this way is typically expected to work for kernel-mode code. It is "advanced" C++ features such as non-POD ("plain ol' data", as defined by the C++ standard) classes and inheritance, templates, and exceptions that present problems for kernel-mode code. These problems are due more to the C++ implementation and the kernel environment than to the inherent properties of the C++ language.
Microsoft is investigating issues related to using C++ to write kernel-mode drivers for the Microsoft Windows family of operating systems. This paper shares current insights from Microsoft developers about the tradeoffs of writing drivers in C++.
The information in this paper applies to the standard Windows Driver Development Kit (DDK) build environment for creating kernel-mode drivers as of the Windows Server 2003 Service Pack 1 (SP1) DDK. If you are using build environments or compilers other than those provided with the DDK or Windows Driver Kit (WDK), you should determine whether any of the issues noted here apply to your development environment and whether there are additional concerns. The information to determine this might be available as documentation from the compiler provider, but it is more likely that you will have to inspect generated code and link maps, as discussed below.
This paper does not attempt to explain how to write kernel-mode drivers in C++. This paper assumes that you understand the basic principles of writing kernel-mode drivers. For general information about writing kernel-mode drivers, see the Kernel-Mode Architecture Guide and device-specific information in the Windows DDK documentation.
Kernel-mode code must take into account the following considerations, to avoid data corruption, unstable systems, and operating system crashes.
The kernel manages its own memory pages: You must manage the two conflicting requirements of correct operation and minimizing memory footprint.
Code and data must be in memory if code is to be executed when paging is not allowed. That is, when the system is running at IRQL DISPATCH_LEVEL or higher, the pages that contain the currently executing routine, any routines that it calls, or data it accesses (and so on down the chain of function calls) must be locked into memory until the IRQL drops below DISPATCH_LEVEL. Otherwise, a page fault occurs and the system crashes.
To increase the amount of memory available for user applications, drivers should make their code and data segments pageable where reasonable. This can improve system performance.
Not all processor resources are available all the time.
On x86 systems, the floating-point and multimedia units are not available in kernel mode unless specifically requested. Trying to use them improperly may or may not cause a floating-point fault at raised IRQL (which will crash the system). Even when no fault occurs, improper use can silently corrupt data in other, unrelated processes; such problems are often difficult to debug.
On Intel Itanium systems, not all of the floating-point registers are available.
Resources, particularly the stack, are severely constrained. Resources that are "inexpensive" in user space may be expensive or require different methods to obtain in kernel mode. Specifically, the kernel stack is only 3 pages (12K on x86 systems).
Not all of the standard libraries (C or C++) are present in kernel mode.
Versions of standard libraries provided with the build environment for use in kernel mode are not necessarily the same as those provided in user mode, because they cannot rely on the Win32 API and because they must be written to conform to kernel mode requirements. Kernel-mode implementations of standard libraries may have limited functionality or be constrained by other properties of kernel mode.
User-mode implementations of library routines might not work in kernel mode. Some do not link, some do not run, and some might appear to run, but with unintended side-effects.
It is important to remember that the compiler generates correct object code, but it may not be the code you expect, organized in the way that you expect. This is always true, but it is more likely to be a problem for C++ than for C. You must examine the object code to be sure it matches your expectations, or at least will work correctly in the kernel environment.
Output from the currently available C++ compiler is not guaranteed to work in kernel mode across all platforms and versions. The more your code uses the "advanced" C++ features, the more you risk problems with interoperability.
Key Areas for Kernel-Mode Code
The following areas require particular care in kernel-mode drivers. These apply to both languages (C and C++), but may be more problematic in C++ code because the compiler does more things automatically, and you may not realize that it has created a problem.
Floating-point instructions must be properly protected, for example with KeSaveFloatingPointState and KeRestoreFloatingPointState or other mechanisms described in Windows DDK documentation.
The InterlockedXxx functions should insert memory barrier instructions in the generated code. Check the output to ensure that the barriers you require are present.
The semantics of the volatile keyword must be carefully understood so that the intended level of indirection is the volatile object. Sometimes the volatile item is the pointer, sometimes it is the object itself, and sometimes both pointer and object are volatile. Applying the volatile keyword to the wrong thing is a common error, so carefully check your use of this keyword. For example, if you intend to use a non-volatile pointer to a volatile location, ensure (by careful code reading) that your code does not implement a volatile pointer to a non-volatile location.
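The three placements of volatile described above can be written out and checked at compile time. The following is a minimal sketch, not a definitive pattern: the Reg type and the typedef names are hypothetical, and the type_traits header is used only as a host-side check (it is not assumed to be available in a kernel-mode build).

```cpp
#include <type_traits>

typedef unsigned long Reg;  // hypothetical width of a device register

// Non-volatile pointer to a volatile location (the usual intent for a register).
typedef volatile Reg* PtrToVolatileReg;
// Volatile pointer to a non-volatile location (the common mistake).
typedef Reg* volatile VolatilePtrToReg;
// Both the pointer and the location are volatile.
typedef volatile Reg* volatile VolatilePtrToVolatileReg;

// Compile-time checks that each typedef means what its name says.
static_assert(!std::is_volatile<PtrToVolatileReg>::value,
              "the pointer itself is not volatile");
static_assert(std::is_volatile<std::remove_pointer<PtrToVolatileReg>::type>::value,
              "the location pointed to is volatile");
static_assert(std::is_volatile<VolatilePtrToReg>::value,
              "the pointer itself is volatile");
static_assert(!std::is_volatile<std::remove_pointer<VolatilePtrToReg>::type>::value,
              "the location pointed to is not volatile");

// Reads the location twice and adds the results. Because the location is
// volatile, the compiler may not merge the two reads into one.
Reg ReadTwiceAndSum(PtrToVolatileReg reg) {
    Reg first = *reg;
    Reg second = *reg;
    return first + second;
}
```

Reading the declaration from right to left is a useful habit: in "volatile Reg*" the qualifier belongs to the Reg, and in "Reg* volatile" it belongs to the pointer.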
Stack frames are severely limited. For example, on x86 systems the total stack available to a thread is 12K.
Non-obvious jumps or memory usage in function source code creates the risk of an unexpected page fault. Specifically, the compiler can generate functions and data objects whose existence is not immediately obvious. For details about objects that might not be expected, see "Code in Memory" later in this paper.
Using inline functions (including __forceinline) to ensure that code is resident in memory interacts with the compiler's optimization rules:
A function you expect to be inlined might not be inlined. Consequently, using the function might cause a page fault.
The compiler might generate inline code for a function when you do not expect it to.
Safe and Unsafe C++ Constructs
Although it is not currently possible to provide a strict and testable definition of the "completely safe" subset of C++ for use in kernel-mode code, some useful guidelines are available for constructs that are usually safe and those that are usually not.
A good rule of thumb is that a C++ construct is probably safe if there is an obvious way to rearrange the code to make it legal C. An example is the relaxed ordering of declarations, including declaring variables in for statements.
The stricter type checking in C++ may disallow a technically legal, but semantically incorrect, construct. Such stricter type checking is a useful means of improving the reliability of the driver.
Anything involving class hierarchies or templates, exceptions, or any form of dynamic typing is likely to be unsafe. Using these constructs requires extremely careful analysis of the generated object code. Limiting use of classes to POD classes significantly reduces the risks.
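One way to keep a class from drifting out of the POD subset is to assert the property at compile time. The sketch below assumes a compiler that provides static_assert and the type_traits header (neither is guaranteed in older DDK build environments); DeviceContext, PolymorphicContext, and IsPod are hypothetical names introduced only for illustration.

```cpp
#include <type_traits>

// Hypothetical per-device context: plain data only, so it is POD and could
// be declared identically in C.
struct DeviceContext {
    unsigned long Flags;
    void*         Buffer;
    unsigned long BufferLength;
};

// Hypothetical polymorphic class: the virtual function adds a hidden table
// pointer and compiler-generated code, so the class is no longer POD.
struct PolymorphicContext {
    unsigned long Flags;
    virtual void Reset() { Flags = 0; }
};

// A type is treated as POD here if it is both standard-layout and trivial.
template <typename T>
struct IsPod {
    static const bool value =
        std::is_standard_layout<T>::value && std::is_trivial<T>::value;
};

// The build fails if someone later adds a constructor, a virtual function,
// or a non-POD member to DeviceContext.
static_assert(IsPod<DeviceContext>::value, "DeviceContext must stay POD");
static_assert(!IsPod<PolymorphicContext>::value,
              "virtual functions make a class non-POD");
```

A check like this does not replace reading the generated code, but it makes an accidental step outside the POD subset visible at build time rather than at run time.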
Reviewing Generated Code
One of the original design goals for C was that it be fairly easy to determine what the generated object code would be, thus making it quite suitable for kernel-mode work. C++ is a much more complex language, and consequently making it work in the kernel environment has proven to be much more difficult.
To write drivers in C++, you must understand the code generated by the compiler, ensure that it meets kernel-mode requirements, and ensure that it avoids the problems discussed in this paper. Be prepared to read the object code and to scan the link map to be sure that code and data are placed in the proper locations and that only kernel-safe libraries are used. Check code for pageability, inline functions, and correct program ordering.
We strongly recommend that you begin such code reading and testing now, rather than waiting until the source code is complete. Check early prototypes and test potentially troublesome usage so that if you encounter an insurmountable problem with C++, you have time to find and implement an alternative solution.
Microsoft developers have discovered a number of areas where C++ presents particular problems for kernel-mode drivers.
Code in Memory
The most severe problem with using C++ for writing kernel-mode drivers is the management of memory pages, particularly code in memory, rather than data. It is important that large drivers be pageable, and paged code is not always in memory. All of the code that will be needed must be resident before the system enters a state in which paging cannot occur.
The way the C++ compiler generates code for non-POD classes and templates makes it particularly difficult to know where all the code required to execute a function might go and thus difficult to make the code safely pageable. The compiler automatically generates code for at least the following objects. These objects are put "out of line," and the developer has no direct control over the section in which they are inserted, which means they could happen to be paged out when needed.
Compiler-generated code such as constructors, destructors, casts, and assignment operators. (These can often be explicitly provided, but it requires taking care to recognize that they need to be provided.)
Adjustor thunks, used to convert between various classes in a hierarchy.
Virtual function thunks, used to implement calls to virtual functions.
Virtual function table thunks, used to manage base classes and polymorphism.
Template code bodies, which are emitted at first use unless explicitly instantiated.
The virtual function tables themselves.
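The hidden pointer adjustment behind an adjustor thunk can be observed from ordinary code. The following sketch uses hypothetical classes (Readable, Writable, Device); the layout it demonstrates is what common C++ implementations produce, not something the language standard guarantees.

```cpp
// Two hypothetical, unrelated base classes, each with its own data and a
// virtual function.
struct Readable {
    int readCount;
    Readable() : readCount(0) {}
    virtual int Read() { return ++readCount; }
    virtual ~Readable() {}
};

struct Writable {
    int writeCount;
    Writable() : writeCount(0) {}
    virtual int Write() { return ++writeCount; }
    virtual ~Writable() {}
};

struct Device : Readable, Writable {
    int id;
    explicit Device(int i) : id(i) {}
    // When this override is called through a Writable*, the 'this' pointer
    // must first be moved back to the start of the Device object. Compilers
    // typically emit a small out-of-line adjustor thunk to do that.
    virtual int Write() { ++writeCount; return id; }
};

// Returns true if converting a Device to its second base changes the address.
bool SecondBaseIsOffset(Device& d) {
    void* whole = &d;
    void* second = static_cast<Writable*>(&d);
    return whole != second;
}

// The caller sees only a Writable; the call is dispatched through the
// virtual function table and, on typical implementations, through a thunk.
int WriteThroughBase(Writable& w) { return w.Write(); }
```

Nothing in the source of WriteThroughBase names the thunk or the table it goes through, which is why the link map, rather than the source, is the place to find out where they were emitted.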
The C++ compiler does not provide mechanisms for direct control of where these entities are placed in memory. The pragmas necessary to control memory placement were not designed with C++ in mind. #pragma alloc_text cannot be used to control the location of a member function because (for several reasons) there is no way to name the member function. The scope of #pragma code_seg is ambiguous for compiler-generated functions, expanded template bodies, and compiler-generated thunks. There is no mechanism at all for controlling the location of virtual function tables, since they are not quite either code or data from the point of view of the compiler (they go into a section all their own).
If a function in a header is declared inline, but the compiler does not generate inline code for it, the function may be emitted in more than one code segment depending on where the function is used. When a class template is instantiated, it is generated in the section that is current at the point of first use, and it is not always immediately obvious which section that is. Both of these issues can lead to code being pageable when it should not be, or vice versa.
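Explicit instantiation is one way to make the point of emission for a template deliberate rather than dependent on first use. The sketch below uses a hypothetical FixedStack template; whether the section pragmas in effect at the explicit instantiation are actually honored is an assumption that still has to be confirmed in the link map for your compiler.

```cpp
// Hypothetical fixed-capacity stack: no dynamic allocation, no exceptions.
template <typename T, unsigned Capacity>
class FixedStack {
public:
    FixedStack() : count_(0) {}
    bool Push(const T& value);
    bool Pop(T* out);
    unsigned Count() const { return count_; }
private:
    T        items_[Capacity];
    unsigned count_;
};

// Member functions are defined outside the class body so that they are not
// implicitly inline.
template <typename T, unsigned Capacity>
bool FixedStack<T, Capacity>::Push(const T& value) {
    if (count_ == Capacity) return false;  // full: report failure, do not throw
    items_[count_++] = value;
    return true;
}

template <typename T, unsigned Capacity>
bool FixedStack<T, Capacity>::Pop(T* out) {
    if (count_ == 0) return false;         // empty
    *out = items_[--count_];
    return true;
}

// Explicit instantiation definition: the member function bodies for this
// specialization are emitted here, in this translation unit, instead of
// wherever the first use happens to appear.
template class FixedStack<unsigned long, 4>;
```

Keeping every explicit instantiation in one source file gives you a single place to look for when checking which section the expanded code landed in.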
If a class hierarchy is in use, whether code for a base class needs to be in memory when the derived class is accessed depends on exactly which functions in the base class are called from the derived class (and whether the compiler can inline them), as well as what sections they were emitted in. For example, if the derived class provides a method that uses no base class methods, the base class code need not be in memory. However, it is difficult to know when that is the case. Additionally, any thunks used with the hierarchy and its classes might also need to be resident in memory.
Stack
The compiler has always been free to generate additional data on the stack, such as creating temporary objects, deferring call cleanup, and other actions that use the stack in a hidden fashion. There are few differences between C and C++ with respect to the way a single function uses the stack, but because of the additional mechanisms that usually result in more function calls, C++ will often use more total stack. You should keep stack size in mind, as you would in any programming language when stack space is limited.
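A simple compile-time budget can catch the most obvious stack problem, a large object declared as a local variable. The following sketch is only illustrative: the 12K total is the x86 figure given earlier in this paper, while the per-frame budget of 512 bytes and the type names are hypothetical values chosen for the example. It cannot see the hidden temporaries the compiler adds.

```cpp
#include <cstddef>

const std::size_t kKernelStackBytes = 12 * 1024;  // x86 figure from this paper
const std::size_t kPerFrameBudget   = 512;        // hypothetical per-function limit

// Hypothetical types: a small header that is fine as a local variable, and a
// large scratch buffer that is not.
struct SmallHeader  { unsigned short type; unsigned short length; };
struct ParseScratch { unsigned char bytes[2048]; };

template <typename T>
struct FitsOnStack {
    static const bool value = sizeof(T) <= kPerFrameBudget;
};

static_assert(FitsOnStack<SmallHeader>::value,
              "small headers may be declared as locals");
static_assert(!FitsOnStack<ParseScratch>::value,
              "large scratch buffers belong in pool memory, not on the stack");

// Upper bound on how many nested frames of a given size the whole stack
// could hold, ignoring every other use of the stack.
std::size_t MaxNestedFrames(std::size_t frameBytes) {
    return frameBytes == 0 ? 0 : kKernelStackBytes / frameBytes;
}
```

With these numbers, six nested frames that each hold a ParseScratch would consume the entire stack, before counting return addresses, saved registers, or anything the rest of the call chain needs.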
Exceptions also have an effect on the stack. See "Exceptions and RTTI" later in this paper.
Dynamic Memory
Driver development tools such as Driver Verifier rely on tagged memory to validate memory usage in drivers. Using operator new and operator delete to allocate and free memory weakens the ability of these tools to detect memory leaks and other problems in driver code.
In user space, operator new and operator delete are convenient, but they can become cumbersome in drivers that use multiple memory pools or tagged memory. Because "placement new" takes additional operands, it is possible to pass the information needed to select memory pools or generate tags into an overloaded operator new, but this is not much easier than using the memory functions directly. A delete expression, however, cannot pass additional arguments: there is no "placement delete" form that accepts a tag or a pool type. Consequently, there is no way to pass in a tag (or memory control, if needed) when using operator delete, which makes it impossible to check that the tag at the point of release was the intended one and defeats much of the benefit of using tagged memory. It is possible to delete memory without providing a tag, but in each case you will need to decide whether the risks and disadvantages of not using tags in driver code outweigh the apparent convenience.
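The asymmetry between allocation and release can be seen in a small user-mode sketch. Everything named here (TagHeader, TaggedAlloc, TaggedFree, RecordedTag, PoolTag, Request, DestroyTagged) is a hypothetical stand-in written for this example; in a driver, the allocation and free helpers would call the tagged pool routines such as ExAllocatePoolWithTag and ExFreePoolWithTag instead of malloc and free.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical user-mode stand-in for a tagged pool: the tag is stored in a
// header placed immediately before the caller's block.
struct alignas(std::max_align_t) TagHeader {
    unsigned long tag;
};

void* TaggedAlloc(std::size_t bytes, unsigned long tag) noexcept {
    void* raw = std::malloc(sizeof(TagHeader) + bytes);
    if (raw == nullptr) return nullptr;
    TagHeader* header = static_cast<TagHeader*>(raw);
    header->tag = tag;
    return header + 1;
}

unsigned long RecordedTag(const void* block) noexcept {
    return (static_cast<const TagHeader*>(block) - 1)->tag;
}

// Frees the block only if the caller's tag matches the recorded one.
bool TaggedFree(void* block, unsigned long tag) noexcept {
    if (RecordedTag(block) != tag) return false;
    std::free(static_cast<TagHeader*>(block) - 1);
    return true;
}

struct PoolTag { unsigned long value; };

// Placement form of operator new: the new-expression can pass the tag.
void* operator new(std::size_t bytes, PoolTag tag) noexcept {
    return TaggedAlloc(bytes, tag.value);
}

// Matching deallocation function. The language calls it only if the
// constructor throws; a delete-expression can never reach it.
void operator delete(void* block, PoolTag tag) noexcept {
    TaggedFree(block, tag.value);
}

struct Request {
    int id;
    explicit Request(int i) : id(i) {}
};

// Explicit replacement for "delete p": the caller must supply the tag, and
// the object is left untouched if the tag does not match.
template <typename T>
bool DestroyTagged(T* object, PoolTag tag) noexcept {
    if (object == nullptr) return true;
    if (RecordedTag(object) != tag.value) return false;
    object->~T();
    return TaggedFree(object, tag.value);
}
```

A caller would write `Request* r = new (PoolTag{0x31716552UL}) Request(7);` to allocate and `DestroyTagged(r, PoolTag{0x31716552UL});` to release. The release side is an ordinary function call rather than a delete expression, which is exactly the loss of convenience described above.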
Memory tracing tools often record the return address of the function that made an allocation. Some C++ compilers implement operator new as a function, causing all allocations to appear to come from a single location and defeating the purpose of that aspect of the memory tracing tool. This can be addressed, but you will have to determine for yourself if there is a benefit in doing so over using memory allocation directly.
Libraries
There are a number of distinct concerns in creating and using libraries:
The name of exported C++ functions can vary from one release to another.
Not all of the functions available in user mode are available in the kernel-mode libraries.
The Standard Template Library is designed to work with data objects from a single DLL.
C++ functions are exported based on their entire signature, not on their name alone (as C functions are). The name of a C++ function is "mangled" to contain type information, which becomes part of its signature. Although the rules for name mangling are fairly stable, there is no guarantee that the mangled names will be the same from release to release of the compiler. Therefore, C++ functions cannot be reliably exported to a library from one release to the next, although functions that can be represented as extern "C" functions can. In addition, the use of a .def file can help mitigate the problem. Note that extern "C" functions are unique only on the basis of name, not the entire signature as in C++.
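The difference between the two kinds of linkage can be sketched as follows. The function names are hypothetical; the point is that the two C++ overloads share a source-level name but receive different decorated names, whereas each extern "C" function must have a name of its own.

```cpp
// C++ overloads: both are called Checksum in the source, but each gets a
// different decorated (mangled) linker name that encodes its parameter types.
unsigned long Checksum(const unsigned char* data, unsigned long length) {
    unsigned long sum = 0;
    for (unsigned long i = 0; i < length; ++i) sum += data[i];
    return sum;
}

unsigned long Checksum(unsigned long value) {
    unsigned long sum = 0;
    for (; value != 0; value >>= 8) sum += value & 0xFF;  // add each byte
    return sum;
}

// extern "C" wrappers: the linker name is based on the function name alone,
// so overloading is not possible and each export needs a distinct name.
extern "C" unsigned long ChecksumBuffer(const unsigned char* data,
                                        unsigned long length) {
    return Checksum(data, length);
}

extern "C" unsigned long ChecksumValue(unsigned long value) {
    return Checksum(value);
}
```

Exporting only the extern "C" wrappers keeps the exported names independent of the compiler's decoration scheme, at the cost of giving up overloading at the library boundary.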
Not all library functions are available in kernel mode, particularly those associated with the "advanced" C++ language features. The Standard Template Library is the "usual" way to implement many C++ concepts such as variably sized arrays. However, it is unsafe to simply assume that the Standard Template Library is present and usable. Although much of the Standard Template Library is implemented as source code in headers, it occasionally uses library functions or other features that are not available or usable in the kernel environment.
The Standard Template Library is also based on the assumption that each data object it uses exists in only a single DLL. Although in most cases it works to pass references to POD objects across DLL boundaries, passing references to more complex structures such as lists may cause runtime failures that can be hard to diagnose. Known issues include the fact that freeing memory in a DLL other than the one in which the memory was allocated can cause failures (at least for debug-mode compiles) and that the "end of list" marker differs between DLLs, which can cause unexpected runaway list searches. You must be aware of these problems and take steps to prevent them.
We do not recommend using Standard Template Library functions in a kernel-mode driver, because it is not possible to assume that the Standard Template Library is there and "just works." In the case of kernel-mode code, understanding precisely how a particular data structure is implemented helps assure that it does not violate the requirements of kernel space. It is also possible that a specialized implementation will be smaller than the more general Standard Template Library functions, although the library is often very good in that regard.
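As an illustration of a specialized data structure whose implementation is fully visible, the sketch below shows a minimal intrusive, circular, doubly linked list over POD types. It is similar in spirit to the list support that the DDK already provides, but every name here (ListLink, ListInit, ListPushBack, ListPopFront, WorkItem) is hypothetical and written only for this example.

```cpp
// Hypothetical link structure embedded in each element. The list head is a
// ListLink that points to itself when the list is empty.
struct ListLink {
    ListLink* next;
    ListLink* prev;
};

void ListInit(ListLink* head) {
    head->next = head;
    head->prev = head;
}

bool ListIsEmpty(const ListLink* head) { return head->next == head; }

// Links 'entry' in just before the head, that is, at the tail of the list.
void ListPushBack(ListLink* head, ListLink* entry) {
    entry->next = head;
    entry->prev = head->prev;
    head->prev->next = entry;
    head->prev = entry;
}

// Unlinks and returns the first entry, or a null pointer if the list is empty.
ListLink* ListPopFront(ListLink* head) {
    if (ListIsEmpty(head)) return nullptr;
    ListLink* first = head->next;
    head->next = first->next;
    first->next->prev = head;
    first->next = nullptr;
    first->prev = nullptr;
    return first;
}

// Hypothetical POD work item. The link is the first member, so a pointer to
// the link is also a pointer to the enclosing WorkItem.
struct WorkItem {
    ListLink link;
    int      code;
};

WorkItem* WorkItemFromLink(ListLink* link) {
    return reinterpret_cast<WorkItem*>(link);
}
```

Because the list never allocates memory, throws, or instantiates templates, every instruction it executes comes from functions you can name and therefore place in a section of your choosing. Any locking that concurrent access requires is deliberately left to the caller.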
Exceptions and RTTI
It is tempting to use C++ exceptions, but they are difficult to implement in kernel mode. C++ exceptions require a kernel-mode-safe library, which does not currently exist. They also present an unavoidable runtime problem, because exception records that are generated when an exception is thrown are large objects on the very limited stack. On x86 systems, exception records are not particularly large (although they are large compared with many typical stack frames), but on Intel Itanium systems they are quite large: 3K to 4K, or one-eighth to one-sixth of the available 24K stack space. To preserve portability of a driver to 64-bit platforms, exceptions would have to be used in a very limited way, even on the x86 architecture. The rethrow operator can cause multiple exception records on the stack. Note that Structured Exception Handling (__try/__except/__finally) is available in kernel mode, although the space concerns remain. C++ exceptions have a number of semantic subtleties that prevent them from simply mapping onto Structured Exception Handling.
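Without C++ exceptions, errors are usually reported through return values, as C drivers already do. The sketch below shows that style with a hypothetical Status enumeration standing in for NTSTATUS values; the function names are likewise invented for the example.

```cpp
// Hypothetical status codes standing in for NTSTATUS values.
enum Status {
    StatusSuccess = 0,
    StatusInvalidParameter,
    StatusBufferTooSmall
};

// Each step reports failure through its return value.
Status ValidateLength(unsigned long length, unsigned long capacity) {
    if (length == 0) return StatusInvalidParameter;
    if (length > capacity) return StatusBufferTooSmall;
    return StatusSuccess;
}

Status CopyPayload(const unsigned char* source, unsigned long length,
                   unsigned char* destination, unsigned long capacity) {
    if (source == nullptr || destination == nullptr) {
        return StatusInvalidParameter;
    }
    Status status = ValidateLength(length, capacity);
    if (status != StatusSuccess) {
        return status;  // propagate to the caller; no exception record, no unwinding
    }
    for (unsigned long i = 0; i < length; ++i) {
        destination[i] = source[i];
    }
    return StatusSuccess;
}
```

The cost of an error path written this way is visible in the source: a comparison and a return, with no hidden stack usage beyond the frame itself.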
Run-time type information (RTTI) also requires a library that does not currently exist for C++ in kernel mode. So far, there have been few, if any, requests for this in kernel-mode code. Whether this lack of demand is a consequence of the other problems masking it or because it is not useful in kernel mode is unknown.
Compiler Versions
Although the C++ language standard is stable, implementation techniques are still evolving. Consequently, compiler versions may change the way generated code operates. Such changes are unlikely to affect user-mode code, but they can affect kernel-mode code in which more of the underlying implementation is exposed to (and sometimes provided by) driver developers; version-to-version interoperability of kernel-mode code is not guaranteed.
You should carefully control any interface between two drivers or a driver and the operating system, usually by writing the interface in C instead of C++. Otherwise, version-to-version incompatibilities in the C++ implementation may cause interoperability failures.
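One way to keep such an interface in C is to expose only a POD structure of function pointers plus an opaque context, and to keep the C++ class private to the driver that implements it. The following is a sketch under that assumption; CounterInterface, Counter, and the helper functions are hypothetical names, not part of any DDK interface.

```cpp
extern "C" {

// Hypothetical interface shared by two drivers. It contains only C types, so
// its layout does not depend on how any C++ compiler implements classes.
struct CounterInterface {
    unsigned short Size;     // sizeof(CounterInterface) as seen by the provider
    unsigned short Version;  // interface revision
    void*          Context;  // opaque to the consumer
    unsigned long (*Increment)(void* context);
};

}  // extern "C"

const unsigned short kCounterInterfaceVersion = 1;

// Provider side: the C++ class never crosses the driver boundary.
class Counter {
public:
    Counter() : value_(0) {}
    unsigned long Increment() { return ++value_; }
private:
    unsigned long value_;
};

// C-linkage entry point that forwards to the C++ object.
extern "C" unsigned long CounterIncrementEntry(void* context) {
    return static_cast<Counter*>(context)->Increment();
}

void FillCounterInterface(Counter* counter, CounterInterface* out) {
    out->Size = static_cast<unsigned short>(sizeof(CounterInterface));
    out->Version = kCounterInterfaceVersion;
    out->Context = counter;
    out->Increment = CounterIncrementEntry;
}

// Consumer side: check the size and version before trusting the contents.
bool CounterInterfaceIsUsable(const CounterInterface* iface) {
    return iface->Size >= sizeof(CounterInterface) &&
           iface->Version == kCounterInterfaceVersion &&
           iface->Increment != nullptr;
}
```

The consumer calls `iface.Increment(iface.Context)` and never sees a virtual function table, a decorated name, or a class layout, so a change of compiler version on either side does not alter the contract.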
Static and Global Scope Variables and Initialization
C++ static variables (declared at either global or local scope) present a number of problems for drivers.
The C++ standard allows static variables declared at local scope to be initialized at the time of first use (the first time the scope is entered). The way this is implemented creates both the possibility of race conditions during initialization and a particularly high risk of unintended data sharing between threads, because variables declared static are globally static, not per-thread. For globally static data (shared among threads), it is best to declare the data explicitly at global scope, to make sure that access protections appropriate to the situation are applied.
If a C++ global object that requires initialization (that is, one with a global constructor) is declared, there is no mechanism for the constructor to be called. Either do not use global objects that require constructors, or develop a mechanism to ensure that their constructors are called. Several sources on the Web claim to have solved this problem, and one of those solutions might work for you.
The order of initialization of global objects is not specified by the C++ standard, so even if there were a mechanism to call their constructors, either the order of initialization must be explicitly controlled by the driver code, or it must not matter.
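One possible mechanism is to reserve raw static storage and construct the objects explicitly, in a fixed order, from the driver's initialization routine. The sketch below is a user-mode illustration with hypothetical classes (Logger, Cache). It relies on the placement form of operator new from the standard new header, which is not assumed to exist in a kernel-mode build; a driver would have to supply its own equivalent.

```cpp
#include <new>

// Hypothetical non-POD types that would otherwise rely on global constructors.
class Logger {
public:
    Logger() : lines_(0) {}
    void Write() { ++lines_; }
    unsigned long Lines() const { return lines_; }
private:
    unsigned long lines_;
};

class Cache {
public:
    // Cache depends on Logger, so Logger must be constructed first.
    explicit Cache(Logger* logger) : logger_(logger) { logger_->Write(); }
    Logger* GetLogger() const { return logger_; }
private:
    Logger* logger_;
};

// Raw, suitably aligned static storage. It needs no constructor, so nothing
// has to run before the driver's own initialization code.
alignas(Logger) unsigned char g_loggerStorage[sizeof(Logger)];
alignas(Cache)  unsigned char g_cacheStorage[sizeof(Cache)];

Logger* g_logger = nullptr;
Cache*  g_cache  = nullptr;

// Intended to be called once from the driver's initialization routine, before
// any other thread can reach the globals. The order is explicit.
void InitializeGlobals() {
    g_logger = new (g_loggerStorage) Logger();
    g_cache  = new (g_cacheStorage) Cache(g_logger);
}

// Intended to be called once at unload; destroys in reverse order.
void DestroyGlobals() {
    g_cache->~Cache();
    g_cache = nullptr;
    g_logger->~Logger();
    g_logger = nullptr;
}
```

Because construction happens in one function, the order of initialization is a matter of reading that function from top to bottom, and any synchronization the globals need afterward can be applied explicitly.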
Microsoft neither endorses nor prohibits the use of C++ for kernel-mode drivers. This conservative position is driven in part by the issues described in this paper, and in part by the need to support all platforms. You must be aware of known problems and risks described in this paper before attempting any development in C++ for kernel mode, and you should be alert for other issues not yet identified.
Microsoft is actively investigating ways of making C++ more usable in the kernel. It is not yet known whether all of the C++ features that can be applied to user-mode code can be made available for kernel-mode code.
The use of the C++ compiler as a "super-C" is typically expected to work, but such use of the compiler is at the developer's own risk.
It is currently impractical to identify problematic C++ constructs mechanically, so developers must carefully analyze the compiler output to ensure that the generated code is suitable for kernel mode.
Before committing to the use of C++, you should carefully assess whether it will work for you. In particular, you should test C++ constructs early in the development process, to ensure that the constructs do not cause the problems described in this paper or otherwise violate the principles of kernel-mode driver writing.
Some of the problems discussed in this paper might not become apparent until near the end of development, and solving them might require code to be completely rewritten.
Several of the most insidious problems are extremely difficult to reproduce on demand while testing the driver, so a driver with an inherent unreliability might appear to run for extended periods with no problems and fail at random times. This reinforces the need for careful analysis.
It is possible to avoid many problems by careful coding and close examination of generated code. Other problems are very difficult to overcome. All of them require extra care and careful analysis on the part of the developer.