
Tuesday, April 02, 2013

C++/CLI and mixed mode programming


Until recently I had only a limited idea of how mixed mode programming on .NET works. In mixed mode a single binary can contain both native and managed code. Such binaries are generally written in a special variant of the C++ language called C++/CLI, and the sources need to be compiled with the /CLR switch.

For some recent work I had to ramp up on Managed C++ usage and on how the .NET runtime supports the mixed mode assemblies it generates. I wrote up some notes for myself and later thought they might be helpful for others trying to understand the inner workings.

History

The initial foray of C++ into the managed world was via the Managed Extensions for C++, or MC++. It shipped in the Visual Studio .NET 2002/2003 era and is now deprecated. The MC++ syntax turned out to be too confusing and wasn't widely adopted, so it was soon replaced with C++/CLI. C++/CLI adds a limited set of extensions to C++ and is better designed, so the language feels more in sync with the general C++ language specification.

C++/CLI

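C++/CLI code looks like this:

ref class CFoo
{
public:
    CFoo()
    {
        pI = new int;   // native heap allocation
        *pI = 42;
        str = L"Hello"; // managed string
    }

    ~CFoo()
    {
        delete pI;      // release the native allocation so it does not leak
    }

    void ShowFoo()
    {
        printf("%d\n", *pI);     // native CRT call
        Console::WriteLine(str); // managed BCL call
    }

    int *pI;     // native pointer
    String^ str; // managed handle
};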

In this code we are defining a reference type class CFoo. This class uses both managed (str) and native (pI) data types and seamlessly calls into managed and native code. There is no special code required to be written by the developer for the interop.


Managed types use special handles denoted by ^, as in String^, while native pointers continue to use *, as in int*. A nice comparison between C++/CLI and C# syntax is available at the end of http://msdn.microsoft.com/en-US/library/ms379617(v=VS.80).aspx. Junfeng also has a good post at http://blogs.msdn.com/b/junfeng/archive/2006/05/20/599434.aspx


The benefits of using mixed mode



  1. Easy to port over C++ code and take the benefit of integrating with other managed code
  2. Access to the extensive managed API surface area
  3. Seamless managed to native and native to managed calls
  4. Static-type checking is available (so no mismatched P/Invoke signatures)
  5. Performance of native code where required
  6. Predictable cleanup of native resources (e.g. stack based deterministic cleanup)

 


Implicit Managed and Native Interop


Seamless, statically type-checked, implicit interop between managed and native code is the biggest draw of C++/CLI.


Calls from managed to native and vice versa are handled transparently and can be intermixed. E.g. managed --> unmanaged --> managed calls work without the developer having to do anything special. This technology is called IJW (It Just Works). We will use the following code to understand the flow.

#pragma managed
void ManagedAgain(int n)
{
    Console::WriteLine(L"Managed again {0}", n);
}

#pragma unmanaged
void NativePrint(int n)
{
    wprintf(L"Native Hello World %u\n\n", n);
    ManagedAgain(n);
}

#pragma managed

void ManagedPrint(int n)
{
    Console::WriteLine(L"Managed {0}", n);
    NativePrint(n);
}

The call flow goes from ManagedPrint --> NativePrint --> ManagedAgain


Native to Managed


For every managed method, the C++ compiler creates both a managed and an unmanaged entry point. The unmanaged entry point is a thunk/call-forwarder: it sets up the right managed context and calls into the managed entry point. It is called the IJW thunk.


When a native function calls into a managed function, the compiler actually binds the call to the native forwarding entry point of the managed function. If we inspect the disassembly of NativePrint, we see that the following code is generated to call into the ManagedAgain function:

00D41084  mov   ecx,dword ptr [n]         // Store NativePrint argument n to ECX
00D41087  push  ecx                       // Push n onto stack
00D41088  call  ManagedAgain (0D4105Dh)   // Call IJW Thunk


The native entry point lives at address 0x0D4105D. It forwards the call to the actual managed implementation:

ManagedAgain:
00D4105D jmp dword ptr [__mep@?ManagedAgain@@$$FYAXH@Z (0D4D000h)]
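
The forwarding step can be modelled in portable C++ as an indirect call through a pointer slot. This is only an analogy, not what the compiler emits: the names mep_slot, managed_again_impl, managed_again_thunk and native_print are hypothetical, and the real thunk also sets up the managed context, which this sketch ignores.

```cpp
#include <cstdio>

// Stand-in for the managed implementation of ManagedAgain (hypothetical name).
int managed_again_impl(int n)
{
    std::printf("Managed again %d\n", n);
    return n;
}

// Stand-in for the __mep slot: a pointer that the thunk jumps through.
// Assumption: it is simply initialised statically here; how the real slot
// gets its value is outside the scope of this sketch.
int (*mep_slot)(int) = managed_again_impl;

// Stand-in for the native entry point. It does nothing but forward through
// the slot, like `jmp dword ptr [__mep@...]` in the disassembly above.
int managed_again_thunk(int n)
{
    return mep_slot(n);
}

// Native caller: it is bound to the thunk, never to the implementation.
int native_print(int n)
{
    std::printf("Native Hello World %d\n", n);
    return managed_again_thunk(n);
}
```

Calling native_print(42) goes through the thunk and then the slot before reaching the implementation, which mirrors the call and jmp pair seen in the disassembly.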

Managed to Native


When a managed function calls into a native function, standard P/Invoke is used. The compiler simply defines a P/Invoke signature for the native function in MSIL:

.method assembly static pinvokeimpl(/* No map */) 
void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl)
NativePrint(int32 A_0) native unmanaged preservesig
{
.custom instance void [mscorlib]System.Security.SuppressUnmanagedCodeSecurityAttribute::.ctor() = ( 01 00 00 00 )
// Embedded native code
// Disassembly of native methods is not supported.
// Managed TargetRVA = 0x00001070
} // end of method 'Global Functions'::NativePrint


The managed to native call in IL looks like this

Managed IL:
IL_0010: ldarg.0
IL_0011: call void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) NativePrint(int32)

At runtime the virtual machine (CLR) generates the correct thunk so that the managed code can P/Invoke into native code. It also takes care of other things, such as marshaling arguments from managed to native and vice versa.


Managed to Managed


While it would seem this should be easy, it was a bit more convoluted. Essentially, the compiler always bound to the native entry point of a given managed method. So a managed to managed call degenerated to managed -> native -> managed and hence resulted in a suboptimal double P/Invoke (also known as double thunking). See http://msdn.microsoft.com/en-us/library/ms235292(v=VS.80).aspx


This was fixed in later versions by using dynamic checks to ensure that managed calls go directly to managed targets. However, in some cases managed to managed calls still degenerate to a double P/Invoke. So an additional knob was provided: the __clrcall calling convention keyword. It stops the native entry point from being generated at all. The pitfall is that such methods are not callable from native code. So if I stick a __clrcall in front of ManagedAgain, I get the following build error while compiling NativePrint.

Error	2	error C3642: 'void ManagedAgain(int)' : cannot call a function with
__clrcall calling convention from native code <filename>
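
The cost difference between the two binding strategies can be illustrated with a toy model in portable C++. This is a sketch under stated assumptions, not a measurement of the CLR: the counter and all function names are hypothetical, and each simulated boundary crossing is just an increment.

```cpp
// Simulated count of managed/native boundary crossings.
int transitions = 0;

// Stand-in for the managed implementation.
int managed_target(int n) { return n + 1; }

// Stand-in for the native entry point (thunk) of managed_target.
int native_entry_point(int n)
{
    ++transitions;                 // native -> managed
    return managed_target(n);
}

// Managed caller bound to the native entry point: two crossings per call.
int call_via_native_entry(int n)
{
    ++transitions;                 // managed -> native
    return native_entry_point(n);
}

// Managed caller bound directly to the managed target: no crossings.
// This models the binding that __clrcall forces.
int call_direct(int n) { return managed_target(n); }
```

In this model call_via_native_entry adds two crossings to the counter for every call while call_direct adds none, which is the overhead the double P/Invoke description refers to.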

/CLR:PURE


If a C++ file is compiled with this flag, a pure MSIL assembly is generated instead of a mixed mode assembly (one that has both native code and MSIL). All methods are __clrcall, and the C++ code is compiled into MSIL and NOT into native code.


This comes with some benefits: the assembly becomes a standard MSIL-based assembly that is no different from any other managed-only assembly. It also comes with some limitations. Native code cannot call into the managed code in this assembly because there is no native entry point to call into. However, native data is supported, and the managed code can transparently call into other native code. Let's see a sample.


I moved all the unmanaged code to a separate C++/CLI DLL as

void NativePrint(int n)
{
    wprintf(L"Native Hello World %u\n\n", n);
}

Then I moved my managed C++ code to a new project and compiled it with /CLR:PURE

#include "stdafx.h"
#include <stdio.h> // assumed: the header name is missing in the source; printf needs <stdio.h>

#include "..\Unmanaged\Unmanaged.h"
using namespace System;

void ManagedPrint(int n)
{
    char str[30] = "some cool number";     // native data
    str[5] = 'f';                          // modifying native data
    Console::WriteLine(L"Managed {0}", n); // call to BCL
    NativePrint(n);                        // call to my own native methods
    printf("%s %d\n\n", str, n);           // CRT
}

int main(array<System::String ^> ^args)
{
    ManagedPrint(42);
    return 0;
}

The above builds and works fine. So even with /CLR:PURE I was able to



  1. Use native data like a char array and modify it
  2. Call into BCL (Console::WriteLine)
  3. Call transparently into other native code without having to hand generate P/Invoke signatures
  4. Use native CRT (printf)

However, no native code can call into ManagedPrint. Also note that even though pure MSIL is generated, the code is unverifiable (think C# unsafe). So it doesn't get the added safety that the managed runtime provides (e.g. I can just do str[200] = 0 and not get any bounds check error).
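
For contrast, the kind of bounds check that verifiable managed code gets from the runtime can be sketched in portable C++ with std::array::at. This is an analogy only: checked_write is a hypothetical helper, and the CLR's own check raises a managed exception (IndexOutOfRangeException) rather than std::out_of_range.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Checked write: reports failure instead of writing past the end of the
// buffer when the index is out of range.
bool checked_write(std::array<char, 30>& buf, std::size_t index, char value)
{
    try {
        buf.at(index) = value; // at() validates the index, throws std::out_of_range
        return true;
    } catch (const std::out_of_range&) {
        return false;          // the out-of-range write was rejected
    }
}
```

With a 30 element buffer, checked_write(buf, 5, 'f') succeeds while checked_write(buf, 200, 0) is rejected. The raw str[200] = 0 on a native char array gets no such check.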


/CLR:Safe


The /CLR:Safe compiler switch generates MSIL-only assemblies whose IL is fully verifiable. The output is no different from anything generated by, say, the C# or VB.NET compilers. This makes the code more secure, but at the same time it loses several capabilities compared to the PURE variant



  1. No support for CRT
  2. Only explicit P/Invokes

So for /CLR:Safe we need to do the following

using namespace System::Runtime::InteropServices; // for DllImport

[DllImport("Unmanaged.dll")]
void NativePrint(int i);

void ManagedPrint(int n)
{
    //char str[3000] = "some cool number"; // will fail to compile with
    //str[5] = 'f';                        // "this type is not verifiable"

    Console::WriteLine(L"Managed {0}", n);

    NativePrint(n); // Hand coded P/Invoke
}

Migration


MSDN has some nice articles for people trying to migrate from /CLR



  1. To /CLR:Pure http://msdn.microsoft.com/en-US/library/ms173253(v=vs.80).aspx
  2. To /CLR:Safe http://msdn.microsoft.com/en-US/library/ykbbt679(v=vs.80).aspx
