Welcome to MasterCSharp.com - Master C#, the easy way... - by Saurabh Nandu

 


Exploring your first C# application - Hello World

Introduction
In the previous article you learned to compile a C# program. In this article you will understand the source code written in the previous article viz. the HelloWorld program.
Later, you will skinny dip into Intermediate Language (IL) code for the HelloWorld program which will give you a greater understanding of the internal working of the .NET Platform.

HelloWorld - Source Code
Below is the full source code of the HelloWorld program written in the previous article:
/*
 HelloWorld.cs - First C# Program
 Written by - Saurabh Nandu
Compilation:
 csc HelloWorld.cs
*/

public class HelloWorld
{
  public static void Main()
  {
    //Print Hello World
    System.Console.WriteLine( "Hello World !" ) ;
  }
}

Code Components
Let's understand the source code:
Code Comments
The first few lines in the source code are code comments. Comments are nothing but additional text a programmer supplies to add some meaningful information for the program. Adding comments to code is optional, but a good practice since over time people tend to forget important details about the program. Liberal usage of comments helps tremendously when you want to understand a program later.

When you compile the program, the compiler totally ignores all the comments you have added, hence comments do not appear in the final executable generated.

Comments can be inserted anywhere in the source code. While writing applications, programmers often use comments to hide parts of a program from the compiler.

C# supports 3 kinds of comments:

1) Delimited comments
Delimited comments start with /* and end with */. This style of commenting will be familiar to C++ programmers. All the text between the delimiters is treated as a comment and ignored by the compiler. This kind of comment can span multiple lines.
The snip below shows an example of a delimited comment used in the HelloWorld program.
/*
 HelloWorld.cs - First C# Program
 Written by - Saurabh Nandu
Compilation:
 csc HelloWorld.cs
*/

Note: Don't forget to close this kind of comment with the */ delimiter, else all the code below it will be treated as part of the comment and the compiler will report an error.

2) Single Line Comment
Single line comments are useful to comment things in-place. They start with // and end when the line ends. As the name suggests they cover a single line; if you want several lines of such comments you need to prefix every line with //.
These comments are generally placed above or beside the code statement. The line below shows an example from the HelloWorld program.

//Print Hello World

3) XML Documentation Comments
The third type of comments supported by C# is XML documentation comments. This style of commenting is used to document the code while coding. I will cover this style of comments in detail in a separate article.
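
As a rough sketch, the three comment styles described above can sit side by side in one file. The class name CommentsDemo and the GetMessage method are made up for this illustration; they are not part of the HelloWorld program.

```csharp
/*
 CommentsDemo.cs - a delimited comment that spans
 several lines, like the header of HelloWorld.cs
*/

/// <summary>
/// XML documentation comment describing the class below.
/// </summary>
public class CommentsDemo
{
  /// <summary>Returns the text that Main prints.</summary>
  public static string GetMessage()
  {
    // Single line comment placed above a statement
    return "Hello World !" ; // ...or beside a statement
  }

  public static void Main()
  {
    System.Console.WriteLine( GetMessage() ) ;

    // The statement below is hidden from the compiler:
    // System.Console.WriteLine( "This line never runs" ) ;

    /* A delimited comment can hide several statements at once:
    System.Console.WriteLine( "Neither does this one" ) ;
    System.Console.WriteLine( "Nor this one" ) ;
    */
  }
}
```

When compiled and run, this program should print only the one message, since everything inside the comments is ignored by the compiler.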

Class Definition
A class is a blueprint or template of an object. All the behavior and data for an object is packaged into the class. You can also think of the class as the basic building block of your application, since all the code you write lives within the context of a class.
The following lines from the HelloWorld code consist of the class definition:

public class HelloWorld
{
..
..
}

The keyword class is used to define a class in C#. The keyword public before the word class is the access modifier for the class; it indicates that code in any assembly can use the class, for example to create an instance of it.
After the class name, braces { } mark the scope of the class. All the behavior and data of the class has to be contained within these braces.
I will discuss Object Oriented Programming as well as classes in more detail in the following articles.
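
To make "behavior and data packaged into the class" a little more concrete, here is a minimal sketch. The Greeter and GreeterDemo classes are hypothetical examples, and the sketch uses a constructor, which is only introduced later in this article.

```csharp
public class Greeter
{
  // Data: every Greeter object carries its own name
  private string name ;

  // Constructor: stores the name when the object is created
  public Greeter( string name )
  {
    this.name = name ;
  }

  // Behavior: a method that works with the object's data
  public string Greet()
  {
    return "Hello " + name + " !" ;
  }
}

public class GreeterDemo
{
  public static void Main()
  {
    // Two objects built from the same blueprint, each with its own data
    Greeter first = new Greeter( "World" ) ;
    Greeter second = new Greeter( "C#" ) ;

    System.Console.WriteLine( first.Greet() ) ;   // Hello World !
    System.Console.WriteLine( second.Greet() ) ;  // Hello C# !
  }
}
```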

Main method - Entry point of an application
The behavior of the object is broken down into various methods within the class. In my example I have one method that is the Main method defined as follows:

public static void Main( )
{
..
..
}

The definition of the Main method starts with its access modifier, public, so code in any class can call this method. It is followed by the keyword static, which turns the method into a class level member. Hence you do not have to first create an object of the class to call the Main method. I understand things might seem a bit cloudy now, but they will get clarified as we move on.
Then there is the return type. After a method finishes execution it returns control to its caller, and the return type states what kind of value, if any, it hands back. Since this method returns no value, the void keyword is used. Finally, there is the method name, Main, with a set of empty parentheses. Like classes, methods use braces { } to define their scope. All the code of the method has to be placed within those braces.

There is a special significance attached to the Main method in C#. When you click on an application to run it, the runtime has to know where (which class, which method) it should start executing the code. In C#, as in many other programming languages, there is a Main method which the runtime uses by default to start running the application. Hence the Main method is also known as the entry point of the application.

Note: C / C++ / Java programmers, in C# the method is called Main with a capital M, unlike in those languages.

Note: Unlike in C++, the Main method is not a global function; it has to be defined as a class member.
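
The difference between a class level (static) method and an ordinary method, and between void and a real return type, may be easier to see in a small sketch. The Calculator class and its Add and Twice methods are invented for this example.

```csharp
public class Calculator
{
  // Class level (static) method: called on the class itself.
  // Its return type is int, so it must hand a value back.
  public static int Add( int a, int b )
  {
    return a + b ;
  }

  // Instance method: can only be called on an object of the class
  public int Twice( int value )
  {
    return value * 2 ;
  }

  // void: Main returns control to the caller without any value
  public static void Main()
  {
    // No object is needed to call a static method
    int sum = Calculator.Add( 2, 3 ) ;

    // An object has to be created first to call an instance method
    Calculator calc = new Calculator() ;
    int doubled = calc.Twice( sum ) ;

    System.Console.WriteLine( sum ) ;      // 5
    System.Console.WriteLine( doubled ) ;  // 10
  }
}
```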

Code Statement - Method Call
Within the Main method there is a single code statement which actually performs the work of displaying the Hello World message on the console. This statement defines the behavior of the Main method.

System.Console.WriteLine( "Hello World !" ) ;

In the above line a call is made to the WriteLine method of the Console class by passing a string parameter to it.

System is a namespace; its closest analogy in Java is the package. A namespace is a logical grouping of related classes. Namespaces are used to resolve the issue of name clashes, i.e. two classes from different libraries having the same name. Namespaces are discussed in detail in the following articles.

Console is a class, just like HelloWorld, and it is defined in the System namespace. Within the Console class there is a class level WriteLine method that is called in the example. The dot (.) operator is used both to reference a class belonging to a namespace and to reference a method within a class.
Since the WriteLine method is a class level method like the Main method, it can be called directly without creating an instance of the Console class.

A string parameter ("Hello World !") is passed to the WriteLine method. The behavior of the WriteLine method is to take this parameter and print it on the console screen.
Since the only statement in the HelloWorld application prints the message to the screen, the program ends automatically once the message is printed; it has no more tasks to perform.

Note: String values are delimited by double quotes (" ").

Note: All statements in C# end with a semi-colon. If you forget to add a semi-colon after a C# statement the compiler will complain.
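
The sketch below shows the dot operator at both levels (namespace to class, class to method). It assumes a made-up namespace called Greetings with a Messages class, and it also uses a using directive, which this article has not covered, to shorten System.Console to Console.

```csharp
using System ;  // lets the code below write Console instead of System.Console

// Hypothetical namespace that groups related classes
namespace Greetings
{
  public class Messages
  {
    public static string Hello()
    {
      return "Hello World !" ;
    }
  }
}

public class NamespaceDemo
{
  public static void Main()
  {
    // Fully qualified: namespace . class . method
    System.Console.WriteLine( Greetings.Messages.Hello() ) ;

    // Same Console class, found through the using directive above
    Console.WriteLine( "Hello again !" ) ;
  }
}
```

Because Messages lives in the Greetings namespace, another library could define its own Messages class without the two names clashing.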

This ends the overview of various components of the HelloWorld program.

Understanding the compilation and execution process of a C# application.
Compilation Process
In the previous article you learned to compile a C# source code file and execute it. Let's see what happens underneath when you compile a C# source code file. The diagram below represents the compilation process.


Figure 1: C# Compilation Process

During the compilation of a source code file, the C# compiler (csc.exe) converts the C# code into Microsoft Intermediate Language (MSIL) code. It packages the MSIL code as a Win32 executable file with some extended features. The header table of the executable has been expanded to accommodate additional metadata about the assembly. Also, the code contained within it is not native machine code but MSIL.

Note: MSIL and IL refer to the same thing, that is, the intermediate code generated by the language compiler. Java people can compare it with Java byte code.

Execution Process
When the user either double-clicks the executable assembly or calls it from the command prompt, the execution process starts. First, the operating system loads the executable assembly. The C# compiler crafts the assembly in such a way that as soon as the OS loads it, control jumps to the Microsoft .NET Runtime, also called the Common Language Runtime (CLR). The CLR has a Just-In-Time (JIT) compiler which is very smart: it does not convert the whole assembly up front, but compiles the IL of each method into native code only when that method is first needed. The native code then executes on the hardware, generating the final output.

Note: Since C# executable assemblies contain IL code, they cannot execute without the .NET Runtime. Hence you cannot run your C# applications on client machines that do not have the .NET Framework installed.


Figure 2: C# Application Execution Process

Skinny Dipping in IL
There are a lot of articles available today that teach you the C# language. I was looking for ways to add some interesting learning besides just the language. To get an in-depth understanding of how Microsoft's C# compiler emits the IL code that is consumed by the .NET Runtime, it's necessary to dig into IL. Please be aware that the information in this section depends entirely on the version of the C# compiler being used; Microsoft can choose at any time to optimize or change the IL generated from the source code.

The goal of this section is not to teach you to write IL, but to show you some nifty things in IL and to reconfirm a few things we have discussed above. Digging into IL will give you a clearer understanding of the concepts.

Note: Exploring IL code can be an addictive hobby!! Caution is advised.

Intermediate Language (IL) - An Overview
As described in the compilation process section above, when a C# source code file is compiled by the C# compiler it's converted into an assembly containing IL code.
What makes this step so important is that all managed languages like VB.NET, JScript.NET, J# etc. follow the same process. That is, on compilation of any managed language the compiler produces an assembly with IL code. IL is a language in itself, which has been standardized by ECMA.
The impact of this is that the .NET Runtime can run any managed assembly without caring in which language it was created. IL abstracts the language specific details away from the .NET Runtime. Hence you could even invent a new language and write a compiler for it that generates IL code, and everyone who has just the .NET Runtime installed would be able to execute your application.
IL can also be considered the assembly level language of the .NET Runtime. Even though it's a bit cryptic, it's verbose enough to make some sense of. In fact, once you learn to read and understand IL you can explore the internals of the libraries provided by Microsoft.

IL Disassembler (ILDASM) - Tool to explore IL
In order to extract the IL code out of a managed assembly you need to use the ILDASM tool. This tool, along with other development tools, can generally be found in the C:\Program Files\Microsoft .NET\FrameworkSDK\v1.1\Bin directory.
This tool has a lot of features. In this article it will only be used to extract the IL code from the assembly.

Analyzing the HelloWorld.exe assembly
Let us extract the IL for the HelloWorld.exe assembly we generated in the previous article. Open the command prompt and navigate to the directory containing the HelloWorld.exe assembly (c:\csharp).
On the command prompt give the following command to extract the IL from the assembly.

ildasm HelloWorld.exe /output:HelloWorld.il


Figure 3: Decompiling HelloWorld.exe

This command will extract the IL from the HelloWorld.exe assembly into the HelloWorld.il file.

Note: Please ensure you have compiled the HelloWorld.cs source code to produce the HelloWorld.exe assembly before you can disassemble it.

Open the HelloWorld.il file in Notepad (or any other text editor). Now don't get scared by all the code in that file.
Scan through the file till you find the Class Member Definition section. The snip below displays the IL from this section.

.class public auto ansi beforefieldinit HelloWorld
       extends [mscorlib]System.Object
{
  .method public hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       11 (0xb)
    .maxstack  1
    IL_0000:  ldstr      "Hello World !"
    IL_0005:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000a:  ret
  } // end of method HelloWorld::Main

  .method public hidebysig specialname rtspecialname 
          instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  1
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method HelloWorld::.ctor

} // end of class HelloWorld

Class Definition in IL
The .class IL keyword defines a class. Note that the access modifier public is also present in the class definition. The extends keyword denotes that the HelloWorld class inherits from the System.Object class. Every class in .NET implicitly inherits from the base class Object defined in the System namespace, even if that is not explicitly mentioned (we will learn about inheritance in future articles). So even though the HelloWorld source code does not name any base class, HelloWorld implicitly inherits from System.Object. The [mscorlib] prefix indicates that the System.Object class is implemented in the mscorlib assembly. Because .NET clearly marks the assembly from which each referenced class comes, .NET assemblies are self-describing.
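
The implicit System.Object base class can also be observed from C# itself. The sketch below uses a hypothetical class named InheritanceDemo and the reflection members Type.BaseType and Type.Assembly. The assembly name printed by the last statement depends on the runtime in use, so no output is claimed for it.

```csharp
public class InheritanceDemo
{
  // Returns the base class the compiler chose for a type
  public static System.Type BaseOf( System.Type type )
  {
    return type.BaseType ;
  }

  public static void Main()
  {
    // No base class is named above, yet one exists
    System.Console.WriteLine( BaseOf( typeof( InheritanceDemo ) ).FullName ) ;  // System.Object

    // ToString is never declared here; it is inherited from System.Object
    InheritanceDemo demo = new InheritanceDemo() ;
    System.Console.WriteLine( demo.ToString() ) ;

    // Name of the assembly that implements System.Object on this runtime
    System.Console.WriteLine( typeof( object ).Assembly.GetName().Name ) ;
  }
}
```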

Main Method in IL
As noted above, the Main method has a special significance in C#, since it is the first method that is called when the application starts. The snip below shows the IL for the Main method.

.method public hidebysig static void  Main() cil managed
  {
    .entrypoint
    // Code size       11 (0xb)
    .maxstack  1
    IL_0000:  ldstr      "Hello World !"
    IL_0005:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000a:  ret
  } // end of method HelloWorld::Main


The .method IL keyword defines a method. The access modifier public, the method type static and the method return type void can also be observed.
The most interesting IL keyword to note is .entrypoint. It's the .entrypoint keyword which tells the .NET Runtime to start executing the assembly from this method.

It's interesting to note that it's languages like C# and VB.NET that lay down the rule that an application starts from a method called Main. At the IL level, any static method marked with the .entrypoint keyword will be called first when the application starts, whatever its name.

Note: There cannot be two .entrypoint keywords defined within a single assembly.

Note: All executables (*.exe) need to have an entry point. DLLs don't have or need an entry point.
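
The method that carries the .entrypoint marker can be looked up at run time as well. This is only a sketch: the EntryPointDemo class and its Describe helper are invented for the example, while Assembly.GetExecutingAssembly and the Assembly.EntryPoint property come from the System.Reflection namespace.

```csharp
using System ;
using System.Reflection ;

public class EntryPointDemo
{
  // Formats a method the way ILDASM names it: Class::Method
  public static string Describe( MethodInfo entry )
  {
    if ( entry == null )
    {
      // Assemblies without an entry point (DLLs) report null
      return "no entry point" ;
    }
    return entry.DeclaringType.Name + "::" + entry.Name ;
  }

  public static void Main()
  {
    // EntryPoint is the method marked with .entrypoint in the IL
    MethodInfo entry = Assembly.GetExecutingAssembly().EntryPoint ;

    Console.WriteLine( Describe( entry ) ) ;          // EntryPointDemo::Main
    Console.WriteLine( "static: " + entry.IsStatic ) ; // static: True
  }
}
```

The outputs shown in the comments assume the file is compiled on its own into an executable, so that this Main is the entry point.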

On the line labeled IL_0000, the ldstr instruction pushes a reference to the string "Hello World !" onto the evaluation stack.

Line IL_0005 then uses the call instruction to call the WriteLine method, which takes its string argument from the stack. The WriteLine method belongs to the System.Console class in the mscorlib assembly; this is clearly expressed in the IL.

Lastly, line IL_000a uses the ret instruction to return to the caller. As mentioned earlier, all methods return after completing execution.

Default Constructor in IL
Another implicit declaration is the default constructor: every class that does not declare a constructor of its own gets one. A constructor is a special method that is called when an object is created. All the code needed to initialize an object before it can be used goes in the constructor. The default constructor for the HelloWorld class is shown below.

.method public hidebysig specialname rtspecialname 
          instance void  .ctor() cil managed
  {
    // Code size       7 (0x7)
    .maxstack  1
    IL_0000:  ldarg.0
    IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0006:  ret
  } // end of method HelloWorld::.ctor


By default, the C# compiler adds a constructor to the IL code if your class does not define one.
Here the .method keyword defines a new method, but the method has the special name .ctor, indicating that it's the class's constructor.

This method also returns no value, as indicated by the void return type.
On line IL_0001 the default constructor of HelloWorld's base class, i.e. System.Object, is called. Line IL_0006 returns control after the method has finished executing.
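
As a small check of the compiler-generated constructor, the sketch below declares an empty class and then asks reflection what constructors it has. The class names NoConstructor and ConstructorDemo are made up for the example; Type.GetConstructors returns the public instance constructors of a type.

```csharp
using System ;
using System.Reflection ;

// No constructor is written here; the compiler supplies one
public class NoConstructor
{
}

public class ConstructorDemo
{
  // Counts the public constructors of a type
  public static int CountConstructors( Type type )
  {
    return type.GetConstructors().Length ;
  }

  public static void Main()
  {
    // Creating an object works only because a constructor exists
    NoConstructor obj = new NoConstructor() ;
    Console.WriteLine( obj.GetType().Name ) ;  // NoConstructor

    ConstructorInfo[] ctors = typeof( NoConstructor ).GetConstructors() ;
    Console.WriteLine( ctors.Length ) ;                     // 1
    Console.WriteLine( ctors[0].Name ) ;                    // .ctor
    Console.WriteLine( ctors[0].GetParameters().Length ) ;  // 0
  }
}
```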

Note: On scanning the HelloWorld.il file you must have noticed that comments added in source code do not appear in the IL code.

I hope this insight into IL will help you understand the internals of C# better and provide a deeper understanding on why things have been implemented in a certain way.

C# Keywords Encountered
1) class - Defines a class
2) public - Access modifier for classes and methods
3) static - Defines the method to be a class level member
4) void - Used to indicate that a method returns no value

Points to Remember
1) C# has multiple types of comments. A liberal usage of comments is recommended. Comments are omitted by the C# compiler while compiling.
2) A class is the blueprint / template of an object. All behavior and data is wrapped inside the class.
3) A Main method is required in every executable application and is the first method the .NET Runtime uses to start executing the application.
4) The WriteLine method of the System.Console class is used to print messages to the console window.
5) The C# compiler csc.exe converts the C# source code into an assembly containing IL code.
6) When you execute a managed assembly the JIT compiler within the .NET Runtime is invoked and it compiles the necessary IL code from the assembly into native language code.
7) All managed language compilers compile to produce an assembly containing IL code.
8) ILDasm is the tool that can be used to disassemble a managed assembly into IL.

Next Step
Read this article that provides a quick overview of Object Oriented Programming (OOPS).

Curious Minds
1) Try to modify the HelloWorld program to print your own message to the screen. Remember to save the source code after each code change and recompile the program to reflect the changes.
2) Try adding more comments in the C# source code.
3) What happens when 2 or more classes in the same assembly have the Main method defined? Which method is used by the runtime while loading the application?
(Hint: Look at the C# compiler options documentation for the answer.)
4) Explore the System.Console class and try using its different methods.
