
Saturday, May 24, 2008

StyleCop has been released


Microsoft has released its internal tool StyleCop to the public under the fancy yet boring name of Microsoft Source Analysis for C#. Even though the name is boring, the product is not.

You'll love this tool when it imposes a consistent coding style across your team. You'll hate this tool when it imposes the same on you. The result is stunning-looking, consistently styled code which your whole team can follow uniformly.

StyleCop has been in use for a long time internally at Microsoft and many teams mandate its usage. My previous team, VSTT, used it as well. The only crib I had was that it didn't allow single-line getters and setters (and our team didn't agree to disable this rule either).

// StyleCop didn't like this one
// (foo is the private backing field; returning Foo itself would recurse forever)
public int Foo
{
    get { return this.foo; }
}

// StyleCop wanted this instead
public int Foo
{
    get
    {
        return this.foo;
    }
}


 


Read more about using StyleCop here. You can set it up to run as part of your build process, as documented here. Since it plugs in as an MSBuild project, you can use it as part of the Team Foundation Build process as well.


Let the style wars begin in team meetings :)

Lambda the ultimate


Whatever I said about lambda before is crap. Each time I use it I feel happy.

I had to fire a timer every so often, and this is what I can write with a lambda...

dataStoreTimer = new Timer(new TimerCallback(
    (obj) => { (obj as AutoResetEvent).Set(); }), pollEvent, 100, 1000);

Sweet!!!


BTW, you can't blame me for having my shortest post just after the longest!!

Friday, May 23, 2008

Cell phone assault


In the last two weeks my cell phone got assaulted thrice. First it was someone sending me a virus over Bluetooth (a .sis file, actually). This happened when I was taking a photograph of my daughter with the cell phone camera in a restaurant (Aromas of China, City Center mall in Hyderabad).

The next one was Bluetooth-based advertisement messages in the Forum Mall in Bangalore. They were actually sending offers of the hour over Bluetooth and I got 2 such messages.

The third incident was in the airport, when someone was again trying to send me a trojan app and make me open it.

I was really surprised by the rapid growth of cell phone based attacks. Worse, few people know about this. My wife had no idea that you can actually send applications over Bluetooth and that they can infect the phone.

Thursday, May 22, 2008

Building Scriptable Applications by hosting JScript


If you have played around with large applications, I'm sure you have been intrigued by how they are built to be extensible. There are multiple options:

  1. Develop your own extension mechanism where you pick up extension binaries and execute them.
    One managed code example is here, where the application loads DLLs (assemblies) from a folder and runs specific types from them. A similar unmanaged approach is to allow registration of GUIDs and use COM to load types that implement the corresponding interfaces.
  2. Roll out your own scripting mechanism:
    One managed example is here, where on-the-fly compilation is used. With the DLR hosting mechanism coming up, this will be very easy going forward.
  3. Support a standard scripting mechanism:
    This involves hosting JScript/VBScript inside the application and exposing a document object model (DOM) to it. Anyone can then write standard JScript to extend the application, very much like how JScript in a web page can extend/program the HTML DOM.

Obviously the 3rd is the best choice if you are developing a native (unmanaged) solution. The advantages are many: a low learning curve (any JScript programmer can write extensions), built-in security, and low cost.

In this post I'll try to cover how you go about doing exactly that. I found little online documentation and took the help of Kaushik from the JScript team to hack up some code to do this.

The Host Interface

To host JScript you need to implement the IActiveScriptSite interface. The code below shows how we do that, stripping out the details we do not want to discuss here (no fear :) all the code is present in the download pointed to at the end of the post). The code below is in the file ashost.h

class IActiveScriptHost : public IUnknown
{
public:
    // IUnknown
    virtual ULONG __stdcall AddRef(void) = 0;
    virtual ULONG __stdcall Release(void) = 0;
    virtual HRESULT __stdcall QueryInterface(REFIID iid,
                                             void **obj) = 0;

    // IActiveScriptHost
    virtual HRESULT __stdcall Eval(const WCHAR *source,
                                   VARIANT *result) = 0;
    virtual HRESULT __stdcall Inject(const WCHAR *name,
                                     IUnknown *unkn) = 0;
};

class ScriptHost :
    public IActiveScriptHost,
    public IActiveScriptSite
{
private:
    LONG _ref;
    IActiveScript *_activeScript;
    IActiveScriptParse *_activeScriptParse;

    ScriptHost(...){}

    virtual ~ScriptHost(){}

public:
    // IUnknown
    virtual ULONG __stdcall AddRef(void);
    virtual ULONG __stdcall Release(void);
    virtual HRESULT __stdcall QueryInterface(REFIID iid, void **obj);

    // IActiveScriptSite
    virtual HRESULT __stdcall GetLCID(LCID *lcid);
    virtual HRESULT __stdcall GetItemInfo(LPCOLESTR name,
        DWORD returnMask, IUnknown **item, ITypeInfo **typeInfo);

    virtual HRESULT __stdcall GetDocVersionString(BSTR *versionString);
    virtual HRESULT __stdcall OnScriptTerminate(const VARIANT *result,
        const EXCEPINFO *exceptionInfo);
    virtual HRESULT __stdcall OnStateChange(SCRIPTSTATE state);
    virtual HRESULT __stdcall OnEnterScript(void);
    virtual HRESULT __stdcall OnLeaveScript(void);
    virtual HRESULT __stdcall OnScriptError(IActiveScriptError *error);

    // IActiveScriptHost
    virtual HRESULT __stdcall Eval(const WCHAR *source,
                                   VARIANT *result);
    virtual HRESULT __stdcall Inject(const WCHAR *name,
                                     IUnknown *unkn);

public:

    static HRESULT Create(IActiveScriptHost **host)
    {
        ...
    }
};

Here we are defining an interface, IActiveScriptHost. ScriptHost implements IActiveScriptHost and also the required hosting interface, IActiveScriptSite. IActiveScriptHost exposes 2 extra methods (Eval and Inject) that will be used from outside to easily host JS scripts.


In addition, ScriptHost also implements a factory method, Create. This method does the heavy lifting of using COM querying to get the various interfaces it needs (IActiveScript, IActiveScriptParse) and stores them in the corresponding pointers.


Instantiating the host


So the client of this host class creates the ScriptHost instance by using the following (see ScriptHostBase.cpp)

IActiveScriptHost *activeScriptHost = NULL;
HRESULT hr = S_OK;
HRESULT hrInit = S_OK;

// Check the result of CoInitializeEx itself (hrInit), not hr
hrInit = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
if(FAILED(hrInit)) throw L"Failed to initialize";

hr = ScriptHost::Create(&activeScriptHost);
if(FAILED(hr)) throw L"Failed to create ScriptHost";


 


With this, the script host is available through the activeScriptHost pointer and we already have the JScript engine hosted in our application.


Evaluating Scripts


Once the engine is hosted, we need to make it do something interesting. This is where the IActiveScriptHost::Eval method comes in.

HRESULT __stdcall ScriptHost::Eval(const WCHAR *source,
                                   VARIANT *result)
{
    assert(source != NULL);

    if (source == NULL)
        return E_POINTER;

    // Treat the text as an expression so that its value comes back in result
    return _activeScriptParse->ParseScriptText(source, NULL,
                                               NULL, NULL, 0, 1,
                                               SCRIPTTEXT_ISEXPRESSION,
                                               result, NULL);
}

Eval accepts the text of the script, executes it using IActiveScriptParse::ParseScriptText and returns the result.


So effectively we can accept input from the console and evaluate it (or read a file and interpret the complete script in it).

while (true)
{
    wcout << L">> ";
    getline(wcin, input);
    if (quitStr.compare(input) == 0) break;

    if (FAILED(activeScriptHost->Eval(input.c_str(), &result)))
    {
        throw L"Script Error";
    }
    // Only 32-bit integer results (VT_I4, i.e. 3) are printed in this sample
    if (result.vt == VT_I4)
        wcout << result.lVal << endl;
}

So all this is fine, and at the end you can run the app (which BTW is a console app) and this is what you can do.

JScript sample Host
q! to quit

>> Hello = 7
7
>> World = 6
6
>> Hello * World
42
>> q!
Press any key to continue . . .


So you have extended your app to do maths for you, or rather to run basic scripts, which is exciting but not of much value.


Extending your app


Once we are past hosting the engine and running scripts inside the application, we need to go ahead with actually building the application's DOM and injecting it into the hosting engine so that JScript can extend it.


If you already have a native application which is built on COM (IDispatch) then you have nothing more to do. But let's pretend that we actually have nothing and need to build the DOM.


To build the DOM you need to create an IDispatch-based DOM tree. There can be more than one root. In this post I'm not trying to cover how to build IDispatch-based COM objects (which you'd do using ATL or some such other means). However, for simplicity we will roll out a hand-written implementation which implements the interface below.

class IDomRoot : public IDispatch
{
public:   // class members are private by default, so expose the interface
    // IUnknown
    virtual ULONG __stdcall AddRef(void) = 0;
    virtual ULONG __stdcall Release(void) = 0;
    virtual HRESULT __stdcall QueryInterface(REFIID iid,
                                             void **obj) = 0;

    // IDispatch
    virtual HRESULT __stdcall GetTypeInfoCount(UINT *pctinfo) = 0;
    virtual HRESULT __stdcall GetTypeInfo(UINT iTInfo, LCID lcid,
                                          ITypeInfo **ppTInfo) = 0;
    virtual HRESULT __stdcall GetIDsOfNames(REFIID riid,
                                            LPOLESTR *rgszNames,
                                            UINT cNames, LCID lcid,
                                            DISPID *rgDispId) = 0;

    virtual HRESULT __stdcall Invoke(DISPID dispIdMember, REFIID riid,
                                     LCID lcid, WORD wFlags,
                                     DISPPARAMS *pDispParams,
                                     VARIANT *pVarResult,
                                     EXCEPINFO *pExcepInfo,
                                     UINT *puArgErr) = 0;

    // IDomRoot
    virtual HRESULT __stdcall Print(BSTR str) = 0;
    virtual HRESULT __stdcall get_Val(LONG* pVal) = 0;
    virtual HRESULT __stdcall put_Val(LONG pVal) = 0;
};


 


At the top we have the standard IUnknown and IDispatch methods and at the end we have our DOM root's methods (Print, get_Val and put_Val). It exposes a Print method that prints a string and a property called Val (with a set and a get method for that property).


The class DomRoot implements this interface and an additional method named Create, which is the factory to create it. Once we are done creating it, we inject this object into the JScript scripting engine. So our final script host code looks as follows

IActiveScriptHost *activeScriptHost = NULL;
IDomRoot *domRoot = NULL;
HRESULT hr = S_OK;
HRESULT hrInit = S_OK;

// Check the result of CoInitializeEx itself (hrInit), not hr
hrInit = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
if(FAILED(hrInit)) throw L"Failed to initialize";

// Create the host
hr = ScriptHost::Create(&activeScriptHost);
if(FAILED(hr)) throw L"Failed to create ScriptHost";

// Create the DOM Root
hr = DomRoot::Create(&domRoot);
if(FAILED(hr)) throw L"Failed to create DomRoot";

// Inject the created DOM Root into the scripting engine
hr = activeScriptHost->Inject(L"DomRoot", (IUnknown*)domRoot);
if(FAILED(hr)) throw L"Failed to inject DomRoot";

What happens inside Inject is shown below

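map<std::wstring, IUnknown*> rootList;
typedef map<std::wstring, IUnknown*>::iterator MapIter;
typedef pair<std::wstring, IUnknown*> InjectPair;

HRESULT __stdcall ScriptHost::Inject(const WCHAR *name,
                                     IUnknown *unkn)
{
    assert(name != NULL);

    if (name == NULL)
        return E_POINTER;

    // Register the name with the engine and bail out if that fails
    HRESULT hr = _activeScript->AddNamedItem(name,
        SCRIPTITEM_GLOBALMEMBERS | SCRIPTITEM_ISVISIBLE);
    if (FAILED(hr))
        return hr;

    // Remember the object so that GetItemInfo can hand it back later
    rootList.insert(InjectPair(std::wstring(name), unkn));

    return S_OK;
}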




In Inject we store the name of the object and the corresponding IUnknown in a map. Each time the script engine encounters such an object in the script, it calls GetItemInfo with that object's name; we then look the name up in the map and return the corresponding IUnknown

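HRESULT __stdcall ScriptHost::GetItemInfo(LPCOLESTR name,
                                          DWORD returnMask,
                                          IUnknown **item,
                                          ITypeInfo **typeInfo)
{
    MapIter iter = rootList.find(name);
    if (iter == rootList.end())
        return TYPE_E_ELEMENTNOTFOUND;   // no item was injected under this name

    // Hand the object out only when the engine asked for it. COM rule: a
    // pointer returned through an out parameter must be AddRef'd by us;
    // the engine releases it when it is done.
    if ((returnMask & SCRIPTINFO_IUNKNOWN) && iter->second != NULL)
    {
        *item = iter->second;
        (*item)->AddRef();
    }
    return S_OK;
}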

After that the script calls into that IDispatch to look for properties and methods and calls into them.
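
If you want to play with the Inject/GetItemInfo bookkeeping without COM in the way, here is a minimal portable sketch. It is not the code from the download: RefCounted is a made-up stand-in for the reference counting part of IUnknown, and NamedItemRegistry is a hypothetical class playing the role of rootList. The one thing it does that the sample host above doesn't is take its own reference when it stores an object, which is what you'd probably want if the injected object could go away before the host does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Made-up stand-in for IUnknown: only the reference counting part.
struct RefCounted
{
    long refs = 1;
    long AddRef() { return ++refs; }
    long Release() { return --refs; }
};

// Hypothetical registry mirroring rootList, Inject and GetItemInfo.
class NamedItemRegistry
{
    std::map<std::wstring, RefCounted*> items_;

public:
    ~NamedItemRegistry()
    {
        // Give back the references taken in Inject
        for (auto& entry : items_)
            entry.second->Release();
    }

    // Stores the object under a name; the registry keeps its own reference.
    bool Inject(const wchar_t* name, RefCounted* obj)
    {
        assert(name != nullptr);
        if (obj == nullptr) return false;
        auto inserted = items_.emplace(name, obj);
        if (inserted.second) obj->AddRef();
        return inserted.second;   // false if the name was already taken
    }

    // Looks the name up; on success the caller receives its own reference.
    bool GetItem(const wchar_t* name, RefCounted** item)
    {
        assert(name != nullptr && item != nullptr);
        *item = nullptr;
        auto iter = items_.find(name);
        if (iter == items_.end()) return false;
        iter->second->AddRef();
        *item = iter->second;
        return true;
    }
};
```

Injecting an object bumps its count by one, each successful GetItem bumps it again, and the caller is expected to Release what it got back, just like the script engine does with the real IUnknown.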


The Whole Flow


By now we have seen a whole bunch of code. Let's see how the whole thing works together. Let's assume we have an extension written in JScript and it calls DomRoot.Val = 5; this is what happens to get the whole thing to work



  1. During initialization we created the DomRoot object (DomRoot::Create), which implements IDomRoot, injected it into the script engine via AddNamedItem and stored it at our end in the rootList map.
  2. We call activeScriptHost->Eval(L"DomRoot.Val = 5;", ...) to evaluate the script. Eval calls _activeScriptParse->ParseScriptText.
  3. When the script parse engine sees the "DomRoot" name, it figures out that the name is a valid name added with AddNamedItem and hence it calls its host's ScriptHost::GetItemInfo("DomRoot");
  4. The host we have written looks up the same map filled during Inject and returns the object's IUnknown to the scripting engine. So at this point the scripting engine has a handle to our DOM root via an IUnknown to the DomRoot object
  5. The scripting engine does a QueryInterface on that IUnknown to get the IDispatch interface from it
  6. Then the engine calls IDispatch::GetIDsOfNames with the name of the property, "Val"
  7. Our DomRoot's implementation of GetIDsOfNames returns the required dispatch ID of the Val property (which is 2 in our case)
  8. The script engine calls IDispatch::Invoke with that dispatch ID and a flag telling whether it wants the get or the set. In this case it's the set. Based on this, DomRoot redirects the call to DomRoot::put_Val
  9. With this we have the full flow from the host to the script and back to the DOM
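
Steps 6 to 8 are the heart of late binding, and they are easy to try out without any COM. The sketch below is only an illustration under my own assumptions: MiniDomRoot, GetIdOfName and the InvokeKind flags are hypothetical names, not the IDispatch signatures, and the ID for Print is made up (the post only says that Val maps to 2).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical IDs; only "Val is 2" comes from the post.
enum MemberId { kIdPrint = 1, kIdVal = 2 };
enum InvokeKind { kPropertyGet, kPropertyPut };

class MiniDomRoot
{
    long val_ = 0;

public:
    // Steps 6 and 7: translate a member name into its ID.
    bool GetIdOfName(const std::wstring& name, int* id) const
    {
        assert(id != nullptr);
        static const std::map<std::wstring, int> ids = {
            {L"Print", kIdPrint}, {L"Val", kIdVal}};
        auto iter = ids.find(name);
        if (iter == ids.end()) return false;
        *id = iter->second;
        return true;
    }

    // Step 8: route the call by ID and by the get/put flag.
    bool Invoke(int id, InvokeKind kind, long* value)
    {
        assert(value != nullptr);
        if (id != kIdVal) return false;      // only the Val property is sketched
        if (kind == kPropertyPut)
            val_ = *value;                   // plays the role of put_Val
        else
            *value = val_;                   // plays the role of get_Val
        return true;
    }
};
```

The script engine never sees put_Val or get_Val directly; it only knows a name, asks for its ID once, and then keeps calling Invoke with that ID. That is why any IDispatch object can be scripted without the engine knowing its type up front.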

In action

JScript sample Host
q! to quit

>> DomRoot.Val = 5;
5
>> DomRoot.Val = DomRoot.Val * 10
50
>> DomRoot.Val
50
>> DomRoot.Print("The answer is 42");
The answer is 42

 


Source Code


First of all, the disclaimer. Let me get it off my chest by saying that the DomRoot code is a super simplified COM object. It commits nothing less than sacrilege. You shouldn't treat it as sample code. I intentionally didn't do a full implementation so that you can step into it without the muck of IDispatchImpl or ATL getting in your way.


However, you can treat the script hosting part (ashost, ScriptHostBase) as sample code (that is the idea of the whole post :) )


The code organization is as follows


ashost.cpp, ashost.h - The Script host implementation
DomRoot.cpp, DomRoot.h - The DOM Root object injected into the scripting engine
ScriptHostBase.cpp - Driver


Note that in a real-life example the driver should load JScript files from a given folder and execute them.


Download from here

Monday, May 19, 2008

Model, View, Controller


These days the whole world is abuzz with the Model, View, Controller (MVC) architecture. This is not something new and has been known to computer scientists for close to 30 years. I guess the new-found popularity is due to the fact that it has heavy application in web development and a lot of mainstream web development platforms are putting in support for it. Ruby on Rails and ASP.NET MVC are classic examples.

Coding Horror has a nice post on this topic. I liked the following statement it made

"Skinnability cuts to the very heart of the MVC pattern. If your app isn't "skinnable", that means you've probably gotten your model's chocolate in your view's peanut butter, quite by accident."

I actually use a very similar concept. The moment I see an application's architecture (be it an interview candidate or a friend showing off something) I ask the question "Can you write a console version of this easily?". If the answer is no or it needs a re-design it means that the separation of model, view and controller is not correct. You are going to have a nightmare if you write and maintain that software.
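
To make the console test concrete, here is a tiny sketch of what passing it looks like. Everything in it (CounterModel, IView, TextView, CounterController, RunConsoleSession) is a made-up example, not taken from any real application: the point is only that the model and the controller never mention how things are displayed, so a text view drops in without touching them.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Model: pure state and rules, no idea how it gets displayed.
class CounterModel
{
    int value_ = 0;

public:
    void Increment() { ++value_; }
    int Value() const { return value_; }
};

// View: the only piece that knows about output.
class IView
{
public:
    virtual ~IView() {}
    virtual void Render(const CounterModel& model) = 0;
};

// A console-style view; any std::ostream will do (std::cout included).
class TextView : public IView
{
    std::ostream& out_;

public:
    explicit TextView(std::ostream& out) : out_(out) {}
    void Render(const CounterModel& model) override
    {
        out_ << "Count: " << model.Value() << "\n";
    }
};

// Controller: turns user actions into model updates and view refreshes.
class CounterController
{
    CounterModel& model_;
    IView& view_;

public:
    CounterController(CounterModel& model, IView& view)
        : model_(model), view_(view) {}
    void OnIncrement()
    {
        model_.Increment();
        view_.Render(model_);
    }
};

// The "console version": same model and controller, text view swapped in.
std::string RunConsoleSession(int clicks)
{
    assert(clicks >= 0);
    std::ostringstream console;
    CounterModel model;
    TextView view(console);
    CounterController controller(model, view);
    for (int i = 0; i < clicks; ++i)
        controller.OnIncrement();
    return console.str();
}
```

If writing RunConsoleSession for your application means touching the model or the controller, that is the redesign signal I'm talking about.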

Saturday, May 17, 2008

You need to be careful about how you use count in for-loops

Let's consider the following code

MyCollection myCol = new MyCollection();
myCol.AddRange(new int[] { 1, 2, 3, 4, 5 });
for (int i = 0; i < myCol.Count; ++i)
{
    Console.WriteLine("{0}", i);
}


 


What most people forget to consider is the condition check of the for loop (i < myCol.Count). In this case it is actually a call to the method get_Count. The compiler doesn't optimize this away (even when inlined)!! Since the condition is evaluated before every iteration, there's actually a method call happening each time. For most .NET code it is low-cost, because ultimately it's just a small method returning the count from an instance variable (if inlined, the method call overhead also goes away). Something like this is typically written for Count.

public int Count
{
    get
    {
        return this.count;
    }
}

Someone wrote something very similar in C, where he used strlen in the for-loop condition. This means that the following code is actually O(n²), because the length is recalculated on every iteration...

for(int i = 0; i < strlen(str); ++i) {...}

Why the compiler cannot always optimize it is another question. For one, it's a method call and it is difficult to predict whether there's going to be any side effect. So optimizing the call away could have disastrous effects.


So store the count and re-use it if you know that the count is not going to change...
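
Here is a minimal sketch of the two variants side by side for the strlen case. The function names are mine, made up for illustration. Both return the same answer; the difference is only in how often the string gets scanned for its length (an optimizing compiler may hoist the call by itself when it can prove the string doesn't change in the loop, but you shouldn't count on that).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Counts occurrences of ch with strlen in the loop condition: unless the
// compiler proves the string is unchanged, the length is rescanned on
// every iteration, which makes the loop quadratic.
std::size_t CountCharRescanning(const char* str, char ch)
{
    assert(str != nullptr);
    std::size_t count = 0;
    for (std::size_t i = 0; i < std::strlen(str); ++i)
        if (str[i] == ch) ++count;
    return count;
}

// Same loop with the length computed once up front: linear.
std::size_t CountCharHoisted(const char* str, char ch)
{
    assert(str != nullptr);
    std::size_t count = 0;
    const std::size_t length = std::strlen(str);
    for (std::size_t i = 0; i < length; ++i)
        if (str[i] == ch) ++count;
    return count;
}
```

The hoisted version is only valid because nothing inside the loop changes the string. If the loop body can shorten or grow it, the stored length goes stale and you are back to needing the check each time.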

Tuesday, May 13, 2008

Alternatives to XML


Though not as much as Jeff Atwood, I don't like the overuse of XML either. In our last project we used XML in a bunch of places where it made sense and also planned to use it in a bunch of other places where it didn't. For some strange reason some folks think it's actually readable and suggested we use XML to dump the user actions we recorded because it's easy to parse and is human readable/editable. I'm perfectly fine with doing it in XML, but definitely not for that reason.

Anyway, sense prevailed, and even though we do store it in XML we dump out the automation source code in obviously more readable C#/VB.NET.

Before I get completely sidetracked, let me state that this post is not about XML or about JSON but about the fact that there exist many alternatives to both. Head on to here (via Coding Horror)...

What is the similarity between Windows and Textile Industry

...they both use threads and fibers :)

Most people are aware of processes and threads. Windows offers an even finer granularity of execution, called a fiber. To quote MSDN

A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them.

The obvious question that should come to managed code developers is whether .NET supports fibers. The answer is that from 2.0 onwards the CLR does support fiber mode. This means that there are hosting APIs using which a host can make the CLR use fibers to run its threads. So in effect there's no requirement that a .NET thread be tied to one of the underlying OS's threads. A classic example is that of SQL Server, which hosts .NET in fiber mode because it wants to take care of scheduling directly. Head over to here (scroll down to the SQL Server section) for an excellent read on this topic.

There's also the book Customizing the Microsoft .NET Framework Common Language Runtime written by Steven Pratschner which has a chapter on customizing CLR to use Fibers. I have already ordered the book. Once it comes in and I get a chance to read it, I'll post more about this.

Monday, May 05, 2008

My workstation layout

Our team just moved to a new building in the Microsoft India campus. A lot of people were going around checking out other people's offices. I got asked a couple of times about my workstation layout and thought I'd do a quick post on that.

Workstation

Like most people in our team I use a dual monitor setup. Last time I estimated, I spend about 12% of my life looking for things (40% of it for my car keys). So even though there are people who use 7 monitors, I'm never going to join that gang and bump that number to 30% by adding the time to search for my app window. And Microsoft will definitely not fund that many monitors either :). So for me 2 is enough.

Both monitors I have are standard HP1965s (19" monitors) hooked on to ATI Radeon cards. One of the monitors (the one on the left) is looped through a KVM switch and I can rotate it among the other 2 machines that I have. The other I have rotated into portrait (vertical) mode and use primarily for coding. The image below should explain why

Workstation

This provides a much better code view. In the font size I use (Consolas 9pt) I can see 74 lines of code vs 54 in landscape mode. So this means 37% more!!! Since I have an ATI card I use Catalyst Control Center to rotate the display.

I also prefer a dark background and use white/light-colored text on it. My eyes feel better with it. I keep both Visual Studio and GVim in dark color mode. You can download my vssettings from here and .vimrc from here.

That kind of rounds up the workstation layout that I use in office. I try my best not to work on the laptop directly. I TS on to it in case I need to use it for any reason. When I took the picture it was quietly napping on the other side of my office :)

Friday, May 02, 2008

Struggling with email overload

It literally rains email at Microsoft (if you've been to Seattle/Redmond you know why :) )

I've always struggled to keep up with the email at Microsoft. When I joined I was stunned by the downpour. The number of emails I got on the first day was more than what I got in a month at Adobe. The situation worsened when I went through a team transition last month, because for some time I had to listen to the email threads of both teams' DLs (distribution lists).

I have tried using various techniques to cope before. This included a complex labyrinth of folders (I've met folks with 9-level-deep folder nesting), rules, search folders, you name it!!

All of them failed until I saw this post from John Lam. It talks about reverse pimping Outlook. Even though I didn't go to the extreme he did, I basically got the following done

  1. Removed all rules, toolbars, folders
  2. Created 3 simple folders Archive, Followup, Automated
  3. Created one big fat rule to move all emails from automated DLs (e.g. checkin notices) to the automated folder
  4. Copy-pasted macros from John's blog, set up toolbar buttons to launch these macros and also associated shortcuts with them. Go to John's blog for the macros or download from here
  5. Effectively all emails from human beings land up in my inbox.
  6. When an email comes I read it. After that I have only 3 options
    1. Delete it
    2. Hit Alt+R to archive it (this launches a macro to mark the email as read and moves it to the archive folder)
    3. Hit Alt+U for follow up (this launches a macro to flag the email to be replied by EOD and moves it to the follow up folder)

This ensures that I read all emails that come to me; I never miss an email now. I hit zero emails in the inbox a couple of times a day. A couple of times a day I also scan the follow-up folder to ensure that I have replied to or taken action on all the emails in it.

Even though the process sounds complex, it has been working miraculously for me for the last two months. I can finally forget about email overload.

My outlook looks as shown below. It's more cluttered than John's version because I need to see upcoming meetings in the right pane.

Thursday, May 01, 2008

Forcing a Garbage Collection is not a good idea

Someone asked on a DL about when to force a GC using GC.Collect. This has been answered by many experts before, but I wanted to reiterate. The simple answer is

"extremely rarely from production code, and if used, ensure you have consulted the GC folks of your platform".

Let's dissect the response...

Production Code

The "production code" bit is key here. It is always fine to call GC.Collect from test/debug code when you want to ensure your application performs fine when a sudden GC comes up or you want to verify all your objects have been disposed properly or the finalizers behave correctly. All discussion below is relevant only to shipping production code.

Rarely

A lot of folks jumped into the thread giving examples of where they have done/seen GC.Collect being used successfully. I tried understanding each of the scenarios and explaining why, in my opinion, it is not required and doesn't qualify as one of the rare scenarios. I have copy-pasted some of these scenarios with my responses below (with some modifications).

  1. For example, your process has a class which wraps a native handle and implements the Dispose pattern, and the handle will be used in exclusive mode. The client of this class forgets to call Dispose/Close to release the native handle (they rely on the finalizer). Then other processes (suppose the native handle is an inter-process resource) have to wait until the next GC, or even a full GC, runs the finalizer. Since it cannot be predicted when the finalizer will run, other processes will suffer from waiting on such an exclusively shared resource…
    This is a program bug. You shouldn’t be covering a dispose pattern misuse with a GC call. You are essentially shipping buggy code or, in case you provide the framework, allowing users to write buggy code. This should be fixed by ensuring that the clients call Dispose and not by forcing a GC. I would suggest adding an Assert in the finalizer in your debug bits to ensure that you fail in the test scenario. In the case of a framework, write the right code and let performance issues surface so that users also write the right code.
  2. Robotics might be another example: you might want time-certain sampling and processing of data.
    .NET is not a real-time system. If you assume or try to simulate real-time operations on it then I have only sympathy to offer :). Is the next suggestion to call all methods in advance so that they are already jitted?
  3. Another case I can think of is the program is either ill-designed or designed specially to have a lot of finalizers (they wrap a lot of native resources in the design?). Objects with a finalizer cannot be collected in generation 0, only in generation 1 at the earliest, and have a great chance of going to generation 2…
    This is not correct. The dispose pattern is there exactly for this reason. Any reason why you are not using the dispose pattern and calling GC.SuppressFinalize in the Dispose method?
  4. Well, one “real world” scenario that I know of is in a source control file diff creation utility. It loops through processing each file in the pack, and loads that entire file into memory in order to do so. It calls GC.Collect when it’s finished with each file, so that the GC can reclaim the large strings that are allocated.
    Why can't it simply do nothing, and is there a perf measurement to indicate otherwise? GC has per-run overhead. So in case nothing is done, it may so happen that for a short diff creation the GC never runs, or at least runs only once for every 10 files handled, leading to fewer runs and hence better perf. For a batch system where there is no user interaction happening in the middle, what is the issue if there is a system-decided GC in the middle of the next file?
  5. A rare case in my mind is you allocate a lot of large objects > 85k bytes, and objects of such size will be treated as generation 2 objects. You do not want to wait for the next full GC to run (normally GC clears generation 0 or generation 1); you want to compact the managed heap as soon as possible.
    Is it paranoia or some real reason? If it holds native resources then you are covered by the dispose pattern, and if you are considering memory pressure then isn’t the GC there to figure out when to do it for you?

In effect, most of these usages are redundant.

The question then is what qualifies as a rare scenario where you do want to call GC.Collect. This has been explained by Rico Mariani (here) and Patrick Dussud (here).

“In a nutshell, don’t call it, unless your code is unloading large amounts of data at well-understood, non-repeating points (like at the end of a level in a game), where you need to discard large amounts of data that will no longer be used.”

It's almost always when you know for sure that a GC run is coming up (which you completely understand and have maybe confirmed with the GC guys of your framework) and you want to control the exact point at which it happens. E.g. at the end of a game level you have burned through all the data and you know that you can discard it. If you don’t, the GC will start 6 frames into rendering your next level, and you are better off doing it now while the system is idle, as you’d drop a frame or two if it happened in the middle of the next level.

And obviously you call GC.Collect if you found an issue reported/discussed in the forums and you have figured out that it is a GC bug which you want to work around.

I would highly recommend watching this video, where Patrick Dussud, the father of the .NET GC, explains why apparent GC issues may actually be side effects of other things (e.g. finalizers stuck trying to delete behind-the-scenes STA COM objects).

What is the problem with calling GC.Collect

So why are folks against calling GC.Collect? There are multiple reasons

  1. There's an inherent assumption that the user knows better than the runtime when the GC should run. This cannot be true, because according to the CLR spec there is no standard time. See here. Since the GC is plugged into the execution engine, it knows the system state best and knows when to fire. With Silverlight and other cross-platform technologies becoming mainstream, it will become harder and harder to predict where your app will run. There are already 3 separate GCs: the desktop, server and Compact Framework ones. Silverlight will bring in more and your assumptions can be totally wrong.
  2. GC has some cost (rather large):
    GC is run by first marking all the objects and then cleaning them. So, whether garbage or not, the objects will be touched, and it takes an awful amount of time to do that. I've seen folks time the call to GC.Collect and take that as the cost of the GC. This is not correct, because GC.Collect fires the collection and returns immediately. Later the GC goes about freezing all the threads. So the GC time is way more than what Collect takes, and you need to monitor the performance counters to figure out what is going on.
  3. GC could be self-tuning:
    The desktop GC, for example, tunes itself based on historical data. Let's assume that a large collection just happened which cleaned up 100 MB of data. Incidentally, exactly after that a forced GC happened which found no data to clean up. The GC learns that collection is not helping, and the next time a real collection is to be fired (a low-memory condition) it simply backs off based on the historical data. However, if the forced GC hadn't occurred, it would have remembered that 100 MB got cleared and would have jumped in right away.

Both 2 and 3 are GC implementation specific (they differ across the desktop and Compact Framework GCs), which stresses the first point: most assumptions are implementation details of the GC and may/will change, jeopardizing any attempt to out-guess the GC about when to run.