Understanding LINQ (C#)


This article is about LINQ, which I think is one of the most exciting features in Orcas. LINQ makes querying a first-class programming concept in .NET. The data to be queried can take the form of XML (LINQ to XML), databases (LINQ-enabled ADO.NET: LINQ to SQL, LINQ to DataSet and LINQ to Entities) and objects (LINQ to Objects). LINQ is also highly extensible and allows you to build custom LINQ-enabled data providers (e.g. LINQ to Amazon, LINQ to NHibernate, LINQ to LDAP).

I will discuss some of the new language features and improvements introduced in C# 3.0. These features enable the full power of LINQ and make it possible to write something like this:

var result = from c in Customers
             where c.City == "Boston"
             orderby c.LastName descending
             select new { c.FirstName, c.LastName, c.Address };

Remember that if you want to play around with LINQ or try the examples yourself, you will need to download Visual Studio Orcas Beta 1.
In case you don't want to download Visual Studio, you can check the LINQ Preview (May 2006 CTP) which runs on top of Visual Studio 2005 (there are a few changes in Beta 1 from the way LINQ worked in the May CTP).

New Language Features

I. Automatic Properties

public class Point {
    private int _x, _y;
    public int X {
        get { return _x; }
        set { _x = value; }
    }
    public int Y {
        get { return _y; }
        set { _y = value; }
    }
}

The above code simply defines a class with basic properties. With the new C# compiler in Orcas, we can write a shorter, cleaner version using Automatic Properties, where the compiler automatically generates the private fields and the get/set operations:

public class Point {
    public int X { get; set; }
    public int Y { get; set; }
}

The code above is more readable and less verbose.
Note that this feature has nothing to do with LINQ. I just thought it would be appropriate to list it with the other new language features.
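
As a small sketch of how automatic properties behave in practice, the class below mixes a read/write property with one whose setter is private. The Account type and its members are made up for illustration and are not part of the article's examples.

```csharp
using System;

// Hypothetical type, used only to illustrate automatic properties.
public class Account {
    // The compiler generates a hidden backing field for each property.
    public string Owner { get; set; }

    // Accessors may differ in visibility: readable by anyone,
    // writable only from inside the class.
    public decimal Balance { get; private set; }

    public void Deposit(decimal amount) {
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        Balance += amount;
    }
}
```

Code outside the class can read Balance, but an assignment such as account.Balance = 0 is rejected by the compiler, so the only way to change it is through Deposit().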

II. Local Variable Type Inference

var num = 50;
var str = "simple string";
var obj = new myType();
var numbers = new int[] {1,2,3};
var dic = new Dictionary<int,myType>();

The compiler would generate the same IL as if we compiled:
int num = 50;
string str = "simple string";
myType obj = new myType();
int[] numbers = new int[] {1,2,3};
Dictionary<int,myType> dic = new Dictionary<int,myType>();

Note that there is no untyped variable reference or late binding happening here. Instead, the compiler infers and declares the type of the variable from the right-hand side of the assignment. As a result, the var keyword produces a strongly typed variable reference.
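
One way to convince yourself of this is to ask each variable for its type. The sketch below does that; the VarDemo class is a hypothetical wrapper, and Dictionary<int, string> stands in for the article's myType, which is not defined here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper class, for illustration only.
public static class VarDemo {
    // Returns the names of the types the compiler inferred for each variable.
    public static string[] InferredTypeNames() {
        var num = 50;                            // inferred as int
        var str = "simple string";               // inferred as string
        var numbers = new int[] { 1, 2, 3 };     // inferred as int[]
        var dic = new Dictionary<int, string>(); // inferred as Dictionary<int, string>

        // num = "text";   // would not compile: num is an int, not a variant

        return new string[] {
            num.GetType().Name,
            str.GetType().Name,
            numbers.GetType().Name,
            dic.GetType().Name
        };
    }
}
```

The commented-out assignment is the important part: once the type has been inferred, the variable behaves exactly like one declared with an explicit type.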

III. Object Initializers & Collection Initializers

Point p = new Point();
p.X = 0;
p.Y = 0;

This could be rewritten using Object Initializers and combined into:

Point p = new Point() { X = 0, Y = 0 };

This feature can also be used with collections. Take a look at this example:

List<Point> points = new List<Point> {
    new Point { X = 2, Y = 5 },
    new Point { X = 1, Y = -10 },
    new Point { X = 3, Y = 0 }
};

Note that the compiler generates the longhand code equivalent to the above. It makes calls to the Add() method to add the elements to the collection one at a time.
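
To make that expansion concrete, here is a rough sketch of the two forms side by side. The InitializerDemo wrapper is hypothetical, Point is nested in it only to keep the sample self-contained, and the exact temporaries the compiler emits are an assumption. What matters is the sequence of property assignments and Add() calls.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper class, for illustration only.
public static class InitializerDemo {
    public class Point {
        public int X { get; set; }
        public int Y { get; set; }
    }

    // Object and collection initializer syntax.
    public static List<Point> WithInitializers() {
        return new List<Point> {
            new Point { X = 2, Y = 5 },
            new Point { X = 1, Y = -10 }
        };
    }

    // Roughly the longhand the initializer syntax stands for.
    public static List<Point> Longhand() {
        List<Point> points = new List<Point>();
        Point p1 = new Point();
        p1.X = 2;
        p1.Y = 5;
        points.Add(p1);
        Point p2 = new Point();
        p2.X = 1;
        p2.Y = -10;
        points.Add(p2);
        return points;
    }
}
```

Both methods return lists with the same two points in the same order.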

IV. Anonymous Types
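
The select new { c.FirstName, c.LastName, c.Address } clause in the opening query creates an anonymous type: a class the compiler generates for us, with read-only properties named after the members used to initialize it. Because we cannot write the name of that type, var is needed to hold the result. A minimal sketch follows; the AnonymousDemo wrapper and the sample values are made up for illustration.

```csharp
using System;

// Hypothetical wrapper class, for illustration only.
public static class AnonymousDemo {
    public static string Describe() {
        // The compiler generates a class with read-only FirstName and LastName
        // properties; var is required because the type has no usable name.
        var person = new { FirstName = "Ada", LastName = "Lovelace" };

        // person.FirstName = "X";   // would not compile: the properties are read-only

        // Anonymous objects with the same property names and types, in the same
        // order, share one generated type, and Equals compares them by value.
        var same = new { FirstName = "Ada", LastName = "Lovelace" };
        bool equal = person.Equals(same);

        return person.LastName + ", " + person.FirstName + " (equal: " + equal + ")";
    }
}
```

Describe() returns "Lovelace, Ada (equal: True)".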

V. Lambda Expressions

Suppose we declare the following delegate type:
delegate R MyDeleg<A,R>(A arg);
We can then write, using anonymous methods:
MyDeleg<int,bool> IsPositive = delegate(int num) {
                                   return num > 0;
                               };

Or we can use the new lambda expressions to write:
MyDeleg<int,bool> IsPositive = num => num > 0;
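
The same comparison can be sketched with the generic Func delegates that ship in the System namespace, so no custom delegate type is needed. The LambdaDemo class and its member names are hypothetical.

```csharp
using System;

// Hypothetical wrapper class, for illustration only.
public static class LambdaDemo {
    // Func<int, bool> is a built-in generic delegate: takes an int, returns a bool.
    public static readonly Func<int, bool> IsPositiveAnonymous =
        delegate(int num) { return num > 0; };

    // Same behavior as a lambda; the parameter type is inferred from Func<int, bool>.
    public static readonly Func<int, bool> IsPositiveLambda = num => num > 0;

    // Lambdas can also take several parameters and have a statement body.
    public static readonly Func<int, int, int> Max = (a, b) => {
        if (a > b) return a;
        return b;
    };
}
```

Calling LambdaDemo.IsPositiveLambda(5) returns true, and LambdaDemo.Max(2, 7) returns 7.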

VI. Extension Methods

Extension methods make it possible to extend existing types and constructed types with additional methods, without having to derive from them or recompile the original type. So instead of writing helper methods that take the object as an argument, we can call them as if they were part of the object itself.
As an example, suppose we want to check a string to see if it is a valid email address. We would normally do this by writing a function that takes a string as an argument and returns true or false. With Extension Methods, we can do the following:
using System.Text.RegularExpressions;

public static class MyExtensions {
    public static bool IsValidEmailAddress(this string s) {
        Regex regex = new Regex( @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$" );
        return regex.IsMatch(s);
    }
}

We defined a static class containing a static method, which is the Extension Method. Note the this keyword before the first parameter, which is of type string. It tells the compiler that this particular Extension Method should be made available on objects of type string. We can then call it on a string as if it were a member function:

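// The extension method is in scope once the namespace that declares
// the MyExtensions class is imported with a using directive.

string email = Request.QueryString["email"];
if ( email.IsValidEmailAddress() ) {
    // ...
}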

It is worth mentioning that LINQ makes use of built-in Extension Methods (e.g. Where(), OrderBy(), Select(), Sum(), Average() and many more) that reside in the new System.Linq namespace in Orcas. They define the standard query operators, which can be used against relational databases, XML and any .NET object that implements IEnumerable<T>.
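
For LINQ to Objects, these operators are ordinary static methods on the System.Linq.Enumerable class, so the extension-method call syntax is just a more convenient spelling. The sketch below writes the same pipeline both ways; the OperatorDemo class and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, for illustration only.
public static class OperatorDemo {
    // Extension-method call syntax: operators chain on the sequence itself.
    public static int[] EvenSquares(IEnumerable<int> source) {
        return source.Where(n => n % 2 == 0)
                     .Select(n => n * n)
                     .ToArray();
    }

    // The same operators invoked as plain static methods on Enumerable.
    public static int[] EvenSquaresStatic(IEnumerable<int> source) {
        IEnumerable<int> evens = Enumerable.Where(source, n => n % 2 == 0);
        IEnumerable<int> squares = Enumerable.Select(evens, n => n * n);
        return Enumerable.ToArray(squares);
    }
}
```

For the input { 1, 2, 3, 4 }, both methods return { 4, 16 }.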

VII. Query Syntax

var result = Customers.Where( c => c.City.StartsWith("B") )
                      .OrderBy( c => c.LastName  )
                      .Select( c => new { c.FirstName, c.LastName, c.Address } );

The advantage of using Query Syntax is that the code is easier to write and more readable.
Also note that a query expression begins with a from clause and ends with either a select or a group clause.
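
To compare the two styles directly, here is a sketch that runs the same query over an in-memory collection, once as a query expression and once as explicit method calls. The Customer class, the QueryDemo wrapper and the projection to a single string are assumptions made to keep the sample self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class and Customer type, for illustration only.
public static class QueryDemo {
    public class Customer {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string City { get; set; }
    }

    // Query Syntax: starts with from, ends with select.
    public static List<string> QuerySyntax(IEnumerable<Customer> customers) {
        var result = from c in customers
                     where c.City.StartsWith("B")
                     orderby c.LastName
                     select c.FirstName + " " + c.LastName;
        return result.ToList();
    }

    // The equivalent chain of standard query operator calls.
    public static List<string> MethodSyntax(IEnumerable<Customer> customers) {
        return customers.Where(c => c.City.StartsWith("B"))
                        .OrderBy(c => c.LastName)
                        .Select(c => c.FirstName + " " + c.LastName)
                        .ToList();
    }
}
```

Both methods return the same list for any input, since the compiler translates the query expression into this kind of method chain.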
