Sam Gentile's Radio Weblog : Partying with .NET

 


 
 

Introduction to the .NET CLR

What is the CLR?

The CLR is the heart and soul of .NET. The CLR is the runtime. In a nutshell, the CLR is a run-time environment in which .NET applications run. The CLR is layered on top of an operating system, and exists to provide services by being a layer between .NET applications and the operating system.

The CLR loads your code, manages it, runs it and provides a number of support services. These vital support services include resource management, thread management and remoting, as well as enforcement of code safety and security constraints. Code that is loaded and running under the control of the CLR is referred to as managed code. Compiled code in .NET does not contain native machine instructions. Rather, code is compiled into assemblies that contain Microsoft Intermediate Language (MSIL). MSIL is a low-level language, similar in idea to Java byte-code. The MSIL is NOT interpreted. It is JIT-compiled into native machine code.

In principle, the CLR somewhat resembles the runtimes of languages like Smalltalk and Java. However, the similarity is only in principle: the CLR is not an interpreter. The difference is that all the .NET language compilers emit MSIL, an abstract, intermediate form that is independent of any programming language, OS or target machine. It is because of this MSIL and the Common Type System (CTS) that .NET languages can interoperate very closely and easily. However, the target processor always executes native machine code: the MSIL is always JITed in some form before it runs. That's the big difference from the other runtimes of the past.

What is MSIL?
The abstract intermediate representation of a .NET application is made up of two main pieces: metadata and managed code. The managed code represents the functionality of the application encoded in a binary form known as Microsoft Intermediate Language, or MSIL. MSIL is a series of opcodes. For a thorough examination of MSIL, please see Inside Microsoft .NET IL Assembler. You need this book if you want to truly understand some major concepts in .NET.
The IL is "managed" by the runtime. The runtime offers an impressive list of features. These include self-describing components through the use of metadata, trust and security sandboxing, memory management, cross-language integration, simple deployment, and versioning.
Relationship to Metadata and CTS
In addition to the program's logic, .NET compilers all emit metadata. So when a PE file (DLL or EXE) is built as part of an assembly, that file also contains metadata. That metadata describes the types, members and references in the code. What is the metadata used for? Well, one use is that the CLR uses it to locate and load classes, prepare space in memory, resolve method invocations, generate native code and enforce security constraints. An assembly is a group of resources and types, along with metadata about those resources and types, that is deployed as a unit. The assembly-level part of that metadata is called the assembly manifest and includes information such as a list of the types and resources visible outside the assembly. The manifest also includes information about dependencies, such as the versions of the assemblies used when the assembly was built.
A .NET application targeted for execution under the runtime consists of one or more managed executables, each of which has metadata and (optionally) managed code in the form of IL. Whoa, you say! Yes, it is possible to build a managed executable that has no methods! Managed .NET applications are built from assemblies.
Assemblies can be private to an application or shared by multiple applications. Multiple versions of an assembly can be deployed on a machine at the same time. Application configuration information defines where to look for assemblies, so the runtime can load different versions of the same assembly for two different applications that are running concurrently. This eliminates issues that arise from incompatibilities between component versions, improving overall system stability. If necessary, administrators can add configuration information, such as a different versioning policy, to assemblies at deployment time, but the original information provided at build time is never lost.
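To make the side-by-side idea concrete, here is a minimal conceptual sketch in Python. It is not how the CLR's binder is implemented; the names `AssemblyRef`, `INSTALLED` and `resolve`, the `Acme.Util` assembly and the paths are all hypothetical. It only illustrates that a per-application policy can redirect a build-time reference to another installed version without altering the reference itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyRef:
    """A reference recorded at build time: assembly name plus version."""
    name: str
    version: str


# Hypothetical shared store holding two versions of one assembly side by side.
INSTALLED = {
    AssemblyRef("Acme.Util", "1.0.0.0"): "/shared/Acme.Util/1.0.0.0/Acme.Util.dll",
    AssemblyRef("Acme.Util", "2.0.0.0"): "/shared/Acme.Util/2.0.0.0/Acme.Util.dll",
}


def resolve(build_time_ref, redirects):
    """Apply an optional per-application version redirect, then locate the file.

    `redirects` maps (name, built-against version) -> version to load instead.
    The build-time reference is frozen, so it is never modified.
    """
    key = (build_time_ref.name, build_time_ref.version)
    version = redirects.get(key, build_time_ref.version)
    return INSTALLED[AssemblyRef(build_time_ref.name, version)]


# Two applications built against the same version, with different policies.
ref = AssemblyRef("Acme.Util", "1.0.0.0")
app_a_path = resolve(ref, {})                                    # no policy
app_b_path = resolve(ref, {("Acme.Util", "1.0.0.0"): "2.0.0.0"})  # redirected
```

Both applications can run at the same time, each against its own version, and the version recorded at build time is still there in `ref`.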

Because assemblies are self-describing, no explicit registration with the operating system is required. Application deployment can be as simple as copying files to a directory tree. This is what is meant by XCOPY deployment. Configuration information is stored in XML files that can be edited with any text editor.
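Perhaps the most touted and needed service of the runtime is automatic memory management. The runtime automatically handles object layout, and reclaims memory through garbage collection. Objects that are no longer in use are released automatically. This, of course, eliminates the classic memory leak, where the programmer forgets to free an object, as well as related programming errors such as freeing an object twice or using it after it has been freed. An object that is still referenced from somewhere is, by definition, still "in use" and will not be reclaimed.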
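What "no longer in use" means can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the CLR's collector: it assumes a tracing collector that treats anything unreachable from a set of roots as garbage, and the names `reachable` and `collect` and the string object ids are made up for illustration.

```python
def reachable(roots, references):
    """Return the ids of every object reachable from the roots.

    `references` maps an object id to the ids it holds references to.
    """
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(references.get(obj, ()))
    return seen


def collect(heap, roots, references):
    """Return the set of objects a tracing collector could reclaim."""
    live = reachable(roots, references)
    return {obj for obj in heap if obj not in live}


# "a" is a root and refers to "b"; "c" and "d" refer only to each other.
heap = {"a", "b", "c", "d"}
references = {"a": ["b"], "c": ["d"], "d": ["c"]}
garbage = collect(heap, ["a"], references)  # the unreachable cycle: c and d
```

Note that the cycle between "c" and "d" is reclaimed even though each object is still referenced by the other; what matters is reachability from the roots.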


The CLR Code Execution Model
So, the managed executable consists of metadata and IL. The two important relevant subsystems of the CLR are the class loader and the Just-In-Time (JIT) compiler. It is the job of the loader to read the assembly's metadata and create, in memory, an internal representation of the layout of the classes and methods.
Assemblies are usually demand-loaded, and their methods are JIT-ed later, as they are called. Demand loading simply means that the class loader does its thing only when a class is referenced. At that time, we say that it is "demand loaded" and laid out in memory. At load time, the assembly goes through some level of verification. The most basic check is that the CLR has to be able to make sense of the IL in order to generate machine code. Higher levels of checking enable the execution engine to ensure that the assembly is memory-safe.
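Demand loading can be pictured with a small Python sketch. It is only a conceptual model, not the CLR's class loader: the `ClassLoader` class, its `get_type` method and the shape of the "metadata" dictionary are hypothetical, and the verification step is left out.

```python
class ClassLoader:
    """Toy loader that builds a type's in-memory layout on first reference."""

    def __init__(self, metadata):
        # Hypothetical metadata: type name -> list of method names.
        self._metadata = metadata
        self.loaded = {}    # type name -> in-memory layout
        self.load_log = []  # order in which types were actually loaded

    def get_type(self, name):
        """Return the layout for `name`, loading it only if needed."""
        if name not in self.loaded:
            layout = {"name": name, "methods": list(self._metadata[name])}
            self.loaded[name] = layout
            self.load_log.append(name)
        return self.loaded[name]


loader = ClassLoader({"Customer": ["Save", "Load"], "Report": ["Print"]})
loader.get_type("Customer")  # first reference: the type is loaded now
loader.get_type("Customer")  # second reference: served from memory
# "Report" was never referenced, so it was never loaded.
```

The point of the sketch is the `if name not in self.loaded` check: nothing is laid out in memory until code actually refers to it.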
Remember: The runtime is NOT an interpreter. It does NOT execute IL directly. Instead, the IL is compiled in memory into native code, and the native code is the thing that is actually executed by the processor. The CLR uses the JIT compiler to do this, and the JIT compiler also works on demand: a method will only be compiled when it is called. However, for performance and other reasons, it is possible to "pre-compile" an assembly from IL to native code using the NGEN utility.
The CLR maintains an in-memory representation, similar to a v-table, built from the metadata. The metadata tells the loader all the information it needs to know about a type and where to find it. When the runtime loads a type, it replaces the address of each method in this v-table-like structure with a piece of stub code. When, and only when, the method is called, the JIT compiler is invoked on the method, compiling the method's IL into native code for x86 or whatever platform the CLR is running on. This code is cached in memory, and the CLR now changes the stub code to point to this native code. So, the next time the method is referenced, the call will not invoke the JIT compiler and will instead use the cached native code.
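The stub trick is easier to see in code. Here is a minimal Python sketch of the idea, offered as a toy model rather than the CLR's actual mechanism: the `MethodTable` class is hypothetical, the "IL" is just the source of a Python expression in `x`, and Python's built-in `compile` and `eval` stand in for the JIT compiler.

```python
class MethodTable:
    """Toy v-table-like structure whose slots start out as stubs."""

    def __init__(self, il_methods):
        # Hypothetical "IL": method name -> source of an expression in x.
        self.jit_count = 0
        self.slots = {
            name: self._make_stub(name, il) for name, il in il_methods.items()
        }

    def _make_stub(self, name, il_source):
        def stub(x):
            # First call: compile, patch the slot, then run the real code.
            native = self._jit(il_source)
            self.slots[name] = native
            return native(x)
        return stub

    def _jit(self, il_source):
        """Stand-in for JIT compilation: turn source text into a callable."""
        self.jit_count += 1
        return eval(compile("lambda x: " + il_source, "<il>", "eval"))

    def call(self, name, x):
        return self.slots[name](x)


table = MethodTable({"Double": "x * 2", "Square": "x * x"})
first = table.call("Double", 21)   # stub runs, method is compiled once
second = table.call("Double", 21)  # slot now points at the compiled code
# "Square" was never called, so it was never compiled.
```

After the first call, `table.slots["Double"]` no longer holds the stub, so later calls go straight to the compiled function and `jit_count` stays at 1.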
The JITed code sticks around in memory and is discarded when the type is unloaded. This is, of course, different in the case of pre-JITing with NGEN, as I referred to above. In that case, the cost of compilation is paid up front, before the application runs, and in return the code gets going quicker because the JIT step is skipped.
Of course, people who remember the Java fiasco are wary of all this stuff. Will this run slower than my native code, and will it take minutes to paint a window (à la Java)? The answer is no. For one, IL is never interpreted. It is always JITed to native code, by design, not as an afterthought like Java. Also, there is the possibility that the JITed code could, in many circumstances, actually run faster than statically compiled native code! How can this be? The JIT compiler has the advantage of running on the machine where the code will be executed. This means that the JIT compiler can examine things like the processor it is running on and the amount of memory in the machine, and use this to tune the native code. Native compilers do not have this advantage: they must either create code for the general case, or the developer has to do a whole bunch of work to tune the compiler and produce several versions.
 
 
 




© Copyright 2002 Sam Gentile.
Last update: 6/26/2002; 8:56:20 AM.
