Friday, December 27, 2013

Getting Arduino Uno work on Windows 8

I keep needing to do this and I couldn’t find one place where all the instructions are placed, so capturing it here. Also the standard instructions at http://arduino.cc/en/Guide/Windows didn’t work for me.

Get the Software

  1. Get Arduino Software from http://arduino.cc/en/Main/Software. Choose the Windows (ZIP file) and unzip it to local PC. I used D:\Skydrive\bin\arduino-1.0.5

Disable Driver Signature Enforcement

Unfortunately this step does disable a security feature of the OS, but I couldn’t find a way to do this otherwise.

  1. Open an command prompt and run the command
    shutdown.exe /r /o /f /t 00
  2. System restarts with Choose an option screen
  3. Select Troubleshoot
  4. Select Advanced options
  5. Select Windows Startup Settings
  6. Click Restart and it will restart into the Advanced Boot Options Screen
  7. Press the keyboard button for the number for Disable Driver Signature Enforcement (which was 7 in my case)
  8. System will restart with driver signature enforcement disabled.

Install The Driver

Press Windows key + W and type “Devices and Printers” and open that. Connect the Arduino board over USB. You should see something called Unknown Device shown in it.

image

Run the installer "D:\Skydrive\bin\arduino-1.0.5\drivers\dpinst-amd64.exe” or locate corresponding path from your installation folder. The window above should get updated as below.

image

You can also verify by again hitting Windows Key + W and typing Device Manager and launching it. Then expand to see the following

image

Saturday, December 14, 2013

.NET: NGEN, explicit loads and load-context promotion

Sunset over the Pacific

If you want to know the conclusion and want to skip the details jump to the end for the climax :). If you care to see this feature in, please vote for this at http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/5194915-ngen-should-support-explicit-path-based-assembly-l

Load-Context

In my previous post on how NGEN loads Native images I mentioned that NGEN images are supported only in the default load context. Essentially there are 3 load contexts (excluding Reflection-only context) and based on how you load an assembly it lands in one of those 3 contexts. You can read more about the load contexts at http://blogs.msdn.com/b/suzcook/archive/2003/05/29/57143.aspx. However for our purposes
  1. Default context: This is the context where assembly loaded through implicit assembly references or Assembly.Load(…) call lands
  2. LoadFrom context is where assemblies loaded with Assembly.LoadFrom call is placed
  3. Null-context or neither context is where assemblies loaded with Assembly.LoadFile, reflection-emit (among other APIs) are placed.
Even though a lot of people view the contexts only in the light of how they impact searching of assembly dependencies, they have other critical impact. E.g. native images of an assembly (generated via NGEN) is only loaded if that assembly is loaded in the default context.

#1 and #3 are pretty simple to understand. If you use Assemby.Load or if your assembly has other implicit assembly dependency then for those assemblies NativeBinder will search for their native images. If you try to load an assembly through Assembly.LoadFile(“c:\foo\some.dll”) then it will be loaded in null-context and will definitely not get native image support. Things get weird for #2 (LoadFrom).

Experiments

Lets see an simple example where I have an executable loadfrom.exe which has the following call
Assembly assem = Assembly.LoadFrom(@"c:\temp\some.dll");
some.dll has been NGEN’d as

c:\temp>ngen install some.dll
Microsoft (R) CLR Native Image Generator - Version 4.0.30319.17929
Copyright (c) Microsoft Corporation.  All rights reserved.
1>    Compiling assembly c:\temp\some.dll (CLR v4.0.30319) ...

Now we run the loadfrom.exe as follows
c:\temp>loadfrom.exe
Starting
Got assembly
In the the fusion log I can see among others the following messages

WRN: Native image will not be probed in LoadFrom context. Native image will only be probed in default load context, like with Assembly.Load().
LOG: Start validating all the dependencies.
LOG: [Level 1]Start validating native image dependency mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089.
Native image has correct version information.
LOG: Validation of dependencies succeeded.
LOG: Bind to native image succeeded.
Attempting to use native image C:\Windows\assembly\NativeImages_v4.0.30319_32\some\804627b300f73759069f96bac51811a0\some.ni.dll.
Native image successfully used.


Interestingly the native image was loaded for some.dll even though it was loaded using Assembly.LoadFrom. This was done in spite of loader clearly warning in the log that it will not attempt to load the native image.

Now lets trying running this same program just ensuring that the exe and dll is not in the same folder

c:\temp>copy loadfrom.exe ..
        1 file(s) copied.

c:\temp>cd ..
c:\>loadfrom.exe
Starting
Got assembly
In this case the log says something different

WRN: Native image will not be probed in LoadFrom context. Native image will only be probed in default load context, like with Assembly.Load().
LOG: IL assembly loaded from c:\temp\some.dll.

As you can see the NI image was not loaded.

The reason is Load Context promotion. When LoadFrom is used on a path from which a Load would’ve anyway found an assembly the LoadFrom results in loading the assembly in the default context. Or the load-context is promoted to the default context. In our first example since c:\temp\some.dll was on the applications base path (APPBASE) the load landed in default-context and ni was loaded. The same didn’t happen in the second example.

Conclusion

  1. NGEN images is only supported on the default context. E.g. for Assemblies loaded for implicit references or through Assemb.Load() API call
  2. NGEN images is not supported on explicit-loads done via Assembly.LoadFile(path)
  3. NGEN images is not reliably supported on explicit-loads done via Assembly.LoadFrom(path)
Given the above there is no real way to load Assemblies from arbitrary paths and get NGEN native image support. In the modern programming world a lot of large applications are moving away from the traditional GAC based approach to a more plug-in based, loosely-coupled-components approach. These large application locate it’s plug-ins via its own proprietary probing logic and loads them using one of the explicit path based load mechanisms. For these there is no way to get the performance boost based on native images. I think this is a limitation which CLR needs to address in the future.

Thursday, December 12, 2013

.NET: Loading Native (NGEN) images and its interaction with the GAC


It’s common for people to think that NGEN works with strong named assemblies only and it places output files or uses GAC closely. This is mostly not true.
If you are new to this there’s a quick primer on NGEN that I wrote http://blogs.msdn.com/b/abhinaba/archive/2013/12/10/net-ngen-gac-and-their-interactions.aspx.

GAC

The Global Assembly Cache or GAC is a central repository where managed assemblies can be placed either using the command like gacutil tool or programmatically using Fusion APIs. The main benefits of GAC is
  1. Avoid dll hell
  2. Provide for a central place to discover dependencies (place your binary in the central place and other applications will find it)
  3. Allow side-by-side publication of multiple versions of the same assembly
  4. Way to apply critical patches, especially security patches that will automatically flow all app using that assembly
  5. Sharing of assemblies across processes. Particularly helpful for system assemblies that are used in most managed assemblies
  6. Provide custom versioning mechanisms (e.g. assembly re-directs / publisher policies)
While GAC has it’s uses it has its problems as well. One of the primary problem being that an assembly has to be strongly named to be placed in GAC and it’s not always possible to do that. E.g. read here and here.

NIC

The NIC or Native Image Cache is the location where NGEN places native images. When NGEN is run to create a native image as in
c:\Projects>ngen.exe install MyMathLibrary.dll

The corresponding MyMathLibrary.ni.dll is placed in the NIC. The NIC has a similar purpose as GAC but is not the same location. NIC is placed at <Windows Dir>\assembly\NativeImages_<CLRversion>_<arch>. E.g. a sample path is

c:\Windows\assembly\NativeImages_v4.0.30319_64\MyMathLibrary\7ed4d51aae956cce52d0809914a3afb3\MyMathLibrary.ni.dll

NGEN places the files it generates in NIC along with other metadata to ensure that it can reliably find the right native image corresponding to an IL image.

How does the .NET Binder find valid native images

The CLR module that finds assemblies for execution is called the Binder. There are various kinds of binders that CLR uses. The one used to find native images for a given assembly is called the NativeBinder.

Finding the right native image involves two steps. First the IL image and the corresponding potential native image is located on the file system and then verification is made to ensure that the native image is indeed a valid image for that IL. E.g. the runtime gets a request to bind against an assembly MyMathLibrary.dll as another assembly program.exe has dependency on it. This is what will happen
  1. First the standard fusion binder will kick in to find that assembly. It can find it either in 
    1. GAC, which means it is strongly named. The way files are placed in GAC ensures that the binder can extract all the required information about the assembly without physically opening the file
    2. Find it the APPBASE (E.g. the local folder of program.exe). It will proceed to open the IL file and read the assemblies metadata
  2. Native binding will proceed only in the default context (more about this in a later post)
  3. The NativeBinder finds the NI file from the NIC. It reads in the NI file details and metadata
  4. Verifies the NI is indeed for that very same IL assembly. For that it goes through a rigorous matching process which includes (but not limited to) full assembly name match (same name, version,  public key tokens, culture), time stamp matching (NI has to be newer than IL), MVID (see below)
  5. Also verifies that the NI has been generated for the same CLR under which it is going to be run (exact .NET version, processor type, etc…) .
  6. Also ensures that the NI’s dependencies are also valid. E.g. when the NI was generated it bound against a particular version of mscorlib. If that mscorlib native image is not valid then this NI image is also rejected
The question is what happens if the assembly is not strongly named? The answer is in that case MVID is used to match instead of relying on say the signing key tokens. MVID is a guid that is embedded in an IL file when a compiler compiles it. If you compile an assembly multiple times, each time the IL file is generated it has an unique MVID. If you open any managed assembly using ildasm and double lick on it’s manifest you can see the MVID

.module MyMathLibrary.dll
// MVID: {EEEBEA21-D58F-44C6-9FD2-22B57F4D0193}

If you re-compile and re-open you should see a new id. This fact is used by the NativeBinder as well. NGEN stores the mvid of the IL file for which a NI is generated. Later the native binder ensures that the MVID of the IL file matches with the MVID of the IL file for which the NI file was generated. This step ensures that if you have multiple common.dll in your PC and all of which has version 0.0.0.0 and is not signed, even then NI for one of the common.dll will not get used for another common.dll.

The Double Loading Problem

In early version of .NET when a NI file was opened the corresponding IL file was also opened. I found a 2003 post from Jason Zander on this. However, currently this is partially fixed. In the above steps look at step 1. To match NI with its IL a bunch of information is required from the IL file. So if that IL file comes from the GAC then the IL file need not be opened to get those information. Hence no double loading happens. However, if the IL file comes from outside the GAC then it is indeed opened and kept open. This causes significant memory overhead in large applications. This is something which the CLR team needs to fix in the future.

Summary

  1. Unsigned (non strong-named) assemblies can also be NGEN’d
  2. Assemblies need not be placed in GAC to ngen them or to consume the ngen images
  3. However, GAC’d files provide better startup performance and memory utilization while using NI images because it avoids double loading
  4. NGEN captures enough metadata on an IL image to ensure that if its native image has become stale (no longer valid) it will reject the NI and just use the IL

Wednesday, December 11, 2013

NGEN Primer


I am planning to write couple of NGEN/GAC related posts. I thought I’d share out some introductory notes about NGEN. This is for the a beginner managed developer.

Primer

Consider I have a math-library which has this simple C# code.
namespace Abhinaba
{
    public class MathLibrary
    {
        public static int Adder(int a, int b)
        {
            return a + b;
        }
    }
}

The C# compiler compiles this code into processor independent CIL (Common Intermediate Language) instead of a machine specific (e.g. x86 or ARM) code. That CIL code can be seen by opening the dll generated by C# compiler in a IL disassembler like the default ildasm that comes with .NET. The CIL code looks as follows
.method public hidebysig static int32  Adder(int32 a,
                                             int32 b) cil managed
{
  // Code size       9 (0x9)
  .maxstack  2
  .locals init ([0] int32 CS$1$0000)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldarg.1
  IL_0003:  add  IL_0004:  stloc.0
  IL_0005:  br.s       IL_0007
  IL_0007:  ldloc.0
  IL_0008:  ret
} // end of method MathLibrary::Adder

To abstract away machine architecture the .NET runtime defines a generic stack based processor and generates code for this make-belief processor. Stack based means that this virtual processor works on a stack and it has instructions to push/pop values on the stack and instructions to operate on the values already inside the stack. E.g. in this particular case to add two values it pushes both the arguments onto the stack using ldarg instructions and then issues an add instruction which automatically adds the value on the top of the stack and pushes in the result. The stack based architecture places no assumption on the number of registers (or even if the processor is register based) the final hardware will have.

Now obviously there is no processor in the real world which executes these CIL instructions. So someone needs to convert those to object code (machine instructions). These real world processors could be from the x86, x64 or ARM families (and many other supported platforms). To do this .NET employs Just In Time (JIT) compilation. JIT compilers responsibility is to generate native machine specific instructions from the generic IL instructions on demand, that is as a method is called for the first time JIT generates native instructions for it and hence enables the processor to execute that method. On my machine the JIT produces the following x86 code for the add
02A826DF  mov         dword ptr [ebp-44h],edx  
02A826E2  nop  
02A826E3  mov         eax,dword ptr [ebp-3Ch]  
02A826E6  add         eax,dword ptr [ebp-40h]  

This process happens on-demand. That is if Main calls Adder, Adder will be JITed only when it is actually being called by Main. If a function is never called it’s in most cases never JITed. The call stack clearly shows this on-demand flow.
clr!UnsafeJitFunction <------------- This will JIT Abhinaba.MathLibrary.Adder 
clr!MethodDesc::MakeJitWorker+0x535
clr!MethodDesc::DoPrestub+0xbd3
clr!PreStubWorker+0x332
clr!ThePreStub+0x11
App!ConsoleApplication1.Program.Main()+0x3c <----- This managed code drove that JIT

The benefits of this approach are
  1. It provides for a way to develop applications with a variety of different languages. Each of these languages can target the MSIL and hence interop seamlessly
  2. MSIL is processor architecture agnostic. So the MSIL based application could be made to run on any processor on which .NET runs (build once, run many places)
  3. Late binding. Binaries are bound to each other (say an exe to it’s dlls) late which results in allowing more significant lee-way on how loosely couple they could be
  4. Possibility of very machine specific optimization. As the compilation is happening on the exact same machine/device on which the application will run

JIT Overhead

The benefits mentioned above comes with the overhead of having to convert the MSIL before execution. The CLR does this on demand, that is when a method is just going to execute it is converted to native code. This “just in time” dynamic compilation or JITing adds to both application startup cost (a lot of methods are executing for the first time) as well as execution time performance. As a method is run many times, the initial cost of JITing fades away. The cost of executing a method n times can expressed as

Cost JIT + n * Cost Execution

At startup most methods are executing for the first time and n is 1. So the cost of JIT pre-dominates. This might result in slow startup. This effects scenarios like phone where slow application startup results in poor user experience or servers where slow startup may result in timeouts and failure to meet system SLAs.

Also another problem with JITing is that it is essentially generating instructions in RW data pages and then executing it. This does not allow the operating system to share the generated code across processes. So even if two applications is using the exact same managed code, each contains it’s own copy of JITed code.

NGEN: Reducing or eliminating JIT overhead

From the beginning .NET supports the concept of pre-compilation by a process called NGEN (derived from Native image GENeration). NGEN consumes a MSIL file and runs the JIT in offline mode and generates native instructions for all managed IL functions and store them in a native or NI file. Later applications can directly consume this NI file. NGEN is run on the same machine where the application will be used and run during installation of that application. This retains all the benefits of JIT and at the same time removes it’s overhead. Also since the file generated is a standard executable file the executable pages from it can be shared across processes.
c:\Projects\ConsoleApplication1\ConsoleApplication1\bin\Debug>ngen install MyMathLibrary.dll
Microsoft (R) CLR Native Image Generator - Version 4.0.30319.33440
Copyright (c) Microsoft Corporation.  All rights reserved.
1>    Compiling assembly c:\Projects\bin\Debug\MyMathLibrary.dll (CLR v4.0.30319) ...

One of the problem with NGEN generated executables is that the file contains both the IL and NI code. The files can be quiet large in size. E.g. for mscorlib.dll I have the following sizes

Directory of C:\Windows\Microsoft.NET\Framework\v4.0.30319

09/29/2013  08:13 PM         5,294,672 mscorlib.dll
               1 File(s)      5,294,672 bytes

Directory of C:\Windows\Microsoft.NET\Framework\v4.0.30319\NativeImages

10/18/2013  12:34 AM        17,376,344 mscorlib.ni.dll
               1 File(s)     17,376,344 bytes

Read up on MPGO tool on how this can be optimized (http://msdn.microsoft.com/library/hh873180.aspx)

NGEN Fragility

Another problem NGEN faces is fragility. If something changes in the system the NGEN images become invalid and cannot be used. This is true especially for hardbound assemblies.

Consider the following code
class MyBase
{
    public int a;
    public int b;
    public virtual void func() {}
}

static void Main()
{
    MyBase m = new MyBase();
    mb.a = 42;
    mb.b = 20;
}

Here we have a simple class whose variables have been modified. If we look into the MSIL code of the access it looks like
L_0008: ldc.i4.s 0x2a
L_000a: stfld int32 ConsoleApplication1.MyBase::a
L_000f: ldloc.0 
L_0010: ldc.i4.s 20
L_0012: stfld int32 ConsoleApplication1.MyBase::b

The native code for the variable access can be as follows
            mb.a = 42;
0000004b  mov         eax,dword ptr [ebp-40h] 
0000004e  mov         dword ptr [eax+4],2Ah 
            mb.b = 20;
00000055  mov         eax,dword ptr [ebp-40h] 
00000058  mov         dword ptr [eax+8],14h 

The code generation engine essentially took a dependency of the layout of MyBase class while generating code to modify and update that. So the hard coded layout dependency is that compiler assumes that MyBase looks like

<base>MethodTable
<base> + 4a
<base> + 8b

The base address is stored in eax register and the updates are made at an offset of 4 and 8 bytes from that base. Now consider that MyBase is defined in assembly A and is accessed by some code in assembly B, and that Assembly A and B are NGENed. So if for some reason the MyBase class (and hence assembly A is modified so that the new definition becomes.
class MyBase
{
    public int foo;
    public int a;
    public int b;
    public virtual void func() {}
}

If we looked from the perspective of MSIL code then the reference to these variables are on their symbolic names ConsoleApplication1.MyBase::a, so if the layout changes the JIT compiler at runtime will find their new location from the metadata located in the assembly and bind it to the correct updated location. However, from NGEN this all changes and hence the NGEN image of the accessor is invalid and have to be updated to match the new layout

<base>MethodTable
<base> + 4foo
<base> + 8a
<base> + 12b

This means that when the CLR picks up a NGEN image is needs to be absolutely sure about it’s validity. More about that in a later post.

Saturday, December 07, 2013

.NET: Figuring out if your application is exception heavy

Ocean beach
In the past I worked on an application which used modules from different teams. Many of these modules raised and caught a ton of exceptions. So much so that performance data was showing that these exceptions were causing issues. So I had to figure out an easy way to programmatically find out these code and inform their owners that exception is for exceptional scenarios and shouldn’t be used for normal code-flow :)
Thankfully CLR provides an easy hook in the form of an AppDomain event. I just need to hook into the AppDomain.FirstChanceException event and CLR notifies upfront when the exception is raised. It does that even before any managed code gets a chance to handle it (and potentially suppresses it).
The following is a plugin which throws and catches exception.
namespace Plugins
{
    public class FunkyPlugin
    {
        public static void ThrowingFunction()
        {
            try
            {
                Console.WriteLine("Just going to throw");
                throw new Exception("Cool exception");
            }
            catch (Exception ex)
            {
                Console.WriteLine("Caught a {0}", ex.Message);
            }
        }
    }
}

In the main application I added code to subscribe to the FirstChanceException event before calling the plugins
using System;
using System.Runtime.ExceptionServices;
using System.Reflection;

namespace foo
{
    public class Program
    {
        static void Main()
        {
            // Register handler
            AppDomain.CurrentDomain.FirstChanceException += FirstChanceHandler; 
            Plugins.FunkyPlugin.ThrowingFunction();
        }
        
        static void FirstChanceHandler(object o, 
                                       FirstChanceExceptionEventArgs e)
        {
            MethodBase site = e.Exception.TargetSite;
            Console.WriteLine("Thrown by : {0} {1}({2})", site.Module, 
                                                          site.DeclaringType, 
                                                          site.ToString());
            Console.WriteLine("Stack: {0}", e.Exception.StackTrace);
        }
    }
}

The FirstChanceHandler just dumps out the name of the assembly and type that raises the exception. The output of this program is as follows
Just going to throw
Thrown by : some.dll Plugins.FunkyPlugin(Void ThrowingFunction())
Stack:    at Plugins.FunkyPlugin.ThrowingFunction()
Caught a Cool exception

As you can see the handler runs even before the catch block executes and I have the full information of the assembly, type and method that throws the exception.

Behind the Scene

For most it might suffice to know that the event handler gets called before anyone gets a chance to handle the exception. However if you care about when this is fired, then its in the first pass (first chance) just after the runtime notifies the debugger/profiler.

The managed exception system piggy backs on native OS exception handling system. Though the x86 exception handling (FS:0 based chaining) is significantly different from the x64 (PDATA) it has the same basic idea
  1. From outside a managed exception looks exactly like a native exception and hence the OSes normal exception handling mechanism kicks in
  2. Exception handling requires some mechanism to walk the thread callstack on which the exception is thrown. So that it can find an up-level catch block as well as call the finally block of all functions in-between the catch and the point of exception being thrown. The mechanism varies in between x86 and x64 but is not super relevant for our discussion. (a series of data-structures pushed onto the stack in case of x86 or a series of data-structure table registered with OS in x64).
  3. On an exception the OS walks the stack and for managed function frames calls into CLR’s registered personality routine (that's what its called :)). This routine knows how to handle managed exceptions
  4. This routine notifies the profiler then the debugger of this first-chance exception, so that debugger can potentially break on the exception and do other relevant operations. If debugger did not handle the first chance exception the processing of the exception continues
  5. If there is a registered handler for FirstChanceException that is called
  6. JIT is consulted to find appropriate catch block for the exception (none might be found)
  7. The CLR returns the right set of information to the OS indicating that indeed the exception will be processed
  8. The OS initiates the second-pass
  9. For every function in between the frame of exception and the found catch block the CLR’s handler routine is called and the CLR consults the JIT to find the appropriate finally blocks and proceeds to call them for cleanup. In this phase the stack actually starts unwinding
  10. This continues till the frame in which the catch was initially found is reached. CLR proceeds to execute the catch block.
  11. If all is well the exception has been caught and processed and peace is restored to the world.
As it should be evident from the above basic flow the FirstChanceHandler will get called before any code gets the chance to catch it and also in case the exception will go unhandled.

PS: Please don’t throw an exception in the FirstChance handler :)

Thursday, December 05, 2013

Bing it on

1468682_10151739093417274_1187424887_n

In early 2008 I joined the CLR team to clean garbage (or to write Garbage Collectors :)). It has been a wild ride writing generational Garbage Collectors, memory managers and tinkering with runtime performance or memory models. It’s been great to see people on the street use stuff that I had helped build or to see internal team reports as .NET updates/downloads went out to 100s of millions of machines. In one case I was under a medical diagnostic device clearly running on .NET. I didn’t run out indicating my faith in the stuff we built (or maybe I was sedated, who knows).

I decided to change things up a bit. So I decided to move from the world of devices, desktops and consoles to that of the cloud. From this week I have begun working in the Bing team. From now on I will no longer be a part of the team that builds CLR but will become part of the team which really pushes the usage of CLR to the extreme. Using .NET to serve billions of queries on thousands of machines.

I hope to continue blogging about CLR/.NET and provide a users perspective of the best managed runtime in the world.

BTW the photo above is the fortune cookie I got at my farewell lunch with the CLR team. Very appropriate.