Unity in diversity

This excerpt is from the book .NET 4.0 Generics Beginner's Guide, authored by Sudipta Mukherjee. ISBN 1849690782, Copyright 2012, Release Date January 2012. For more info, please visit the publisher site: http://www.packtpub.com/net-generics-4-0-beginners-guide/book

In the last two chapters, we learned about lists and dictionaries. These are two very important data structures that .NET Generics has to offer. The ability to query them efficiently is an important factor in the overall efficiency of an application.

Imperative programming languages enforce the how part of the program more than the what part. Let me explain that a little bit. Suppose you want to sort a list of student objects. In an imperative style, you are bound to use a looping construct. So the what part is to sort the elements and the how part is the instructions that you write as part of the program, to achieve sorting.

However, we are living in a very exciting time where a purist programming approach is fast becoming a thing of the past. Computing power is more readily available than ever. Programming languages are going through a change. Declarative syntax, which allows programmers to concentrate on the what part more than the how part, is the new trend. It allows compiler developers to tune the compiler to emit optimized machine code for any given declarative syntax, which can eventually improve the performance of the application.

.NET 3.5 came up a step ahead with the introduction of Language Integrated Query (LINQ). This allows programmers to query in-memory objects and databases (in fact any data source) alike, while it does the required optimization in the background.

In this chapter, we shall learn:

  • How to add your own methods to existing classes using Extension methods
  • How to use Functors
  • How to use Actions
  • Lambda expressions
  • LINQ Standard Query Operators (LSQO)

Some of the LSQO for joining have been left out deliberately as they are not very useful for LINQ to Objects in general. A couple of other operators such as Aggregate and Average are also left out as they are not very applicable to LINQ to Objects in general; they find specific usage while dealing with numeric data types.

After reading this chapter, you should be able to appreciate how LINQ drastically reduces the complexity of querying in-memory collections, and how it removes the need for looping constructs and their side effects. There is a lot more in LINQ to Objects than we can discuss in this chapter. We shall concentrate on the LSQO, as we need them to query generic collections in a more optimized way.

What makes LINQ?

LINQ is not a single technology in itself; instead, it is made up of several building blocks. These are the things that make LINQ.

Extension methods

.NET Generics follows a good design philosophy: the API is open for extension but closed for modification. Think about the .NET string API. You can't change the source code of any member method. However, you can always add a new method that behaves as if it were built in.

The technology that allows us to do this legal stretching (if you will) is known as the Extension method. Here are a few very important details about Extension methods:

  • Every Extension method must be in a static class.
  • Every Extension method must be static.
  • Every Extension method must take at least one parameter, passed as this Type <parameter name>. One example is this string name.
  • The parameter declared with this has to be the first one in the parameter list. Any other parameters are optional and follow it.
  • This makes the newly introduced method feel like one of the built-in ones. If we want to inject a method into all concrete types of an interface, an Extension method declared on the interface is the only option.
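The rules above, including the point about interfaces, can be sketched in a few lines. This is only an illustration: the IShape interface, the Square and Circle classes, and the Describe() method are hypothetical names invented for this example and are not part of the StringEx project built in this chapter.

```csharp
using System;
using System.Globalization;

// Hypothetical interface and implementations, used only for illustration.
public interface IShape
{
    double Area { get; }
}

public class Square : IShape
{
    private readonly double side;
    public Square(double side) { this.side = side; }
    public double Area { get { return side * side; } }
}

public class Circle : IShape
{
    private readonly double radius;
    public Circle(double radius) { this.radius = radius; }
    public double Area { get { return Math.PI * radius * radius; } }
}

// Rule 1: the Extension method lives in a static class.
public static class ShapeExtensions
{
    // Rule 2: the method itself is static.
    // Rules 3 and 4: "this IShape shape" is the first parameter; "unit" follows it.
    public static string Describe(this IShape shape, string unit)
    {
        string area = shape.Area.ToString("F2", CultureInfo.InvariantCulture);
        return shape.GetType().Name + " with area " + area + " " + unit;
    }
}

public static class Program
{
    public static void Main()
    {
        // Describe() is available on every type that implements IShape.
        IShape[] shapes = { new Square(2), new Circle(1) };
        foreach (IShape shape in shapes)
            Console.WriteLine(shape.Describe("sq. units"));
    }
}
```

Because the method is declared on the interface, neither Square nor Circle had to change to gain Describe().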

Time for action - creating an Extension Method

Let's say that you have a List of strings and you want to find out how many match a given regular expression pattern. This is a very common activity:

  1. Create a class library project. Call it StringEx. We are trying to extend string class functionalities, so I named it StringEx.
  2. Rename the generated class file to StringExtensions.cs.
  3. Mark the class as static: public static class StringExtensions
  4. Add the following using directive at the top of the file: using System.Text.RegularExpressions;
  5. Add the following method to the class:
public static bool IsMatching(this string input, string pattern)
{
    return Regex.IsMatch(input, pattern, RegexOptions.CultureInvariant);
}

What just happened?

We just created an Extension method called IsMatching() for the string class. The first argument of the method is the calling object. pattern is the regular expression against which we want to validate the input string. We pass RegexOptions.CultureInvariant because we want the comparison to ignore culture-specific differences.

So, while calling IsMatching(), we shall just have to pass the pattern and not the string.

Time for action - consuming our new Extension method

Now, let's see how we can consume this method:

  1. Stay in the project you created. Add a console application. Call it StringExTest.
  2. Add a reference to the StringEx class library in this project.
  3. Add the following using directive in Program.cs: using StringEx; (see figure below)
  4. Add the following string variable in the Main() method. This validates Swedish zip codes:
string pattern = @"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$";

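  5. Add a few test input strings in a List of strings:
List<string> testInputs = new List<string>()
{ "12345", "425611", "932 68", "S-621 46", "5367", "31 545" };

  6. Check each test input against the pattern using the Extension method:
foreach (string testInput in testInputs)
{
    if (testInput.IsMatching(pattern))
        Console.WriteLine(testInput + " is a valid Swedish ZIP code");
    else
        Console.WriteLine(testInput + " is NOT a valid Swedish ZIP code");
}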

What just happened?

Because IsMatching() is declared as an Extension method in StringEx, and StringEx is referenced in the StringExTest project, IsMatching() is available for invocation on any string object in the StringExTest project.

Any Extension method appears with a downward arrow, as shown in the following screenshot. IntelliSense (if it is enabled) also mentions that this is an Extension method:

Notice that while calling the Extension method, we do not need to pass the string argument. testInput is the string object on which we call this Extension method.

The keyword this in the parameter list did the trick.

It is possible to define Extension methods for any .NET type. All you have to do is add a static class and add the static Extension methods to that class. You can find an API for string processing written using Extension methods at http://www.codeplex.com/StringDefs. I created this for different kinds of string processing needs. This will give you an idea about how Extension methods can be written for several purposes.

To use LINQ to Objects, we shall need the System.Linq namespace. The standard query operators of LINQ are Extension methods that extend any type that implements IEnumerable<T>.
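To see what extending IEnumerable<T> buys us, here is a minimal sketch that returns to the original goal of counting how many strings match a pattern. CountMatching() and the StringSequenceExtensions class are hypothetical names introduced for this example. The sketch uses the Count() operator from System.Linq together with a Lambda expression, which is covered later in this chapter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringSequenceExtensions
{
    // Hypothetical helper: counts the strings in any sequence that match the pattern.
    // Declared on IEnumerable<string>, so it works for lists, arrays, and so on.
    public static int CountMatching(this IEnumerable<string> inputs, string pattern)
    {
        // Count() is itself an Extension method on IEnumerable<T> from System.Linq.
        return inputs.Count(input => Regex.IsMatch(input, pattern, RegexOptions.CultureInvariant));
    }
}

public static class Program
{
    public static void Main()
    {
        string pattern = @"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$";
        List<string> testInputs = new List<string>
        {
            "12345", "425611", "932 68", "S-621 46", "5367", "31 545"
        };
        string[] sameInputsAsArray = testInputs.ToArray();

        // The same Extension method is callable on both collection types.
        Console.WriteLine(testInputs.CountMatching(pattern));        // prints 3
        Console.WriteLine(sameInputsAsArray.CountMatching(pattern)); // prints 3
    }
}
```

Neither List<string> nor string[] knows anything about CountMatching(); both receive it only because they implement IEnumerable<string>.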

Check out these guidelines for when not to use Extension methods

Although an Extension method might seem to solve your problem, heavy use of Extension methods is probably an indication that you should re-think your design a little bit. Remember, they are meant to extend the functionality of an existing type. So, if the functionality you want to achieve can't be done using already existing methods, you are better off creating your own type by inheriting the type you wanted to extend.

Object initializers

Assume that you have a class called Student and you want to initialize student objects using different variables at different times. Sometimes, you will only have name and age, while at other times you might also have the course they are enrolled in as part of the input. These situations call for a set of parameterized constructors in the Student class.

Otherwise, we can have a parameterless constructor and public properties for the Student class attributes. In this situation, the code to initialize a student object might look similar to the following:

Student sam = new Student();
sam.FirstName = "Sam";
sam.LastName = "Hood";

Student dorothy = new Student();
dorothy.Gender = "F";
dorothy.LastName = "Hudson";

This type of object initialization spans multiple lines of code and, at times, is difficult to read and maintain. Thus, C# 3.0 came up with a feature called object initializers that allows programmers to construct and assign an object in a single statement. Any public field or property can be assigned in the initialization statement by assigning a value to that property name. Multiple assignments can be made using a comma-separated assignment list. The C# compiler calls the default constructor behind the scenes and assigns each public field or property as if they were declared and assigned one after another. So, the previous two initializations will be changed to the following:
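A minimal sketch of the two initializations in object-initializer form, assuming a Student class with public FirstName, LastName, and Gender properties (the class itself is not shown in this excerpt, so its shape below is an assumption):

```csharp
using System;

// Assumed minimal shape of the Student class; only the properties used above.
public class Student
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Gender { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // The parameterless constructor runs first, then each listed property is assigned.
        Student sam = new Student { FirstName = "Sam", LastName = "Hood" };
        Student dorothy = new Student { Gender = "F", LastName = "Hudson" };

        Console.WriteLine(sam.FirstName + " " + sam.LastName);
        // Properties that are not listed keep their default value (null for string).
        Console.WriteLine(dorothy.FirstName == null);
    }
}
```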

Although this technical advancement might, at first, look like syntactical sugar, while writing LINQ queries, we shall find ourselves using this feature quite often to construct object instances as part of the result set by setting property values on the fly.

Collection initializers

Collection initializers take object initializers a step further by letting us add elements to a collection at the time of creating it. The only constraint is that the collection has to implement the IEnumerable interface and provide an appropriate Add() method implementation that will be used to add the elements.

The good thing is that collection initializers can use object initialization technique to create objects on the fly while adding them to the collection. Here are some examples:

List<string> names = new List<string>() { "Anders", "David", "James", "Jeff", "Joe", "Erik" };
List<int> numbers = new List<int>() { 56, 12, 134, 113, 41, 1, 0 };
List<Student> myStudents = new List<Student>()
{
    new Student() { Name = "Sam", Course = "C#" },
    new Student() { Name = "Dorothy", Course = "VB.NET" }
};
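The constraint mentioned above (implement IEnumerable and expose a suitable Add() method) applies to our own types too. The following is a minimal sketch; the Playlist class and its two Add() overloads are hypothetical and exist only to show how the compiler maps each initializer element to an Add() call.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type, used only for illustration.
public class Playlist : IEnumerable<string>
{
    private readonly List<string> tracks = new List<string>();

    // Called once for every plain element in the collection initializer.
    public void Add(string title)
    {
        tracks.Add(title);
    }

    // Called for elements written as { title, repeat } in the initializer.
    public void Add(string title, int repeat)
    {
        for (int i = 0; i < repeat; i++)
            tracks.Add(title);
    }

    public IEnumerator<string> GetEnumerator() { return tracks.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static void Main()
    {
        // Compiles to: Add("Intro"); Add("Chorus", 2); Add("Outro");
        Playlist playlist = new Playlist { "Intro", { "Chorus", 2 }, "Outro" };

        foreach (string track in playlist)
            Console.WriteLine(track);
    }
}
```

Without the IEnumerable implementation, or without an accessible Add() method, the initializer syntax would not compile for Playlist.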

Implicitly typed local variables

C# 3.0 introduced a new way to declare variables using the var keyword. The type of such a variable is inferred from its initialization expression. These variables are still strongly typed.

Here are a few examples:

var shirtSize = "L";
var shoeSize = 10;
var age = 29;
var coursesTaken = new List<string>() { "C#", "C++", "Java", "Ruby" };

Console.WriteLine(shirtSize + " " + shirtSize.GetType().Name);
Console.WriteLine(shoeSize + " " + shoeSize.GetType().Name);
Console.WriteLine(age + " " + age.GetType().Name);
Console.WriteLine(coursesTaken + " " + coursesTaken.GetType().Name);

And here is the output for the preceding snippet:

L String
10 Int32
29 Int32
System.Collections.Generic.List`1[System.String] List`1

Although declaring a variable in this way may seem to have little value, it is a necessary feature for declaring some types that can't be declared in any other way; for example, anonymous types.

Anonymous types

Anonymous types are compile-time generated types where the public properties are inferred from the object initialization expression at compile time. These types serve the purpose of temporary storage. This saves us from building specific classes for any operation.

Anonymous types are declared using the var keyword as an implicit variable and their type is inferred from the expression used to initialize.

Anonymous types are declared by omitting the type after the new keyword. Here are a few examples:

var item = new { Name = "Sam", Age = 30 };
var car = new { Make = "Honda", Price = 30000 };

These types are used in queries where collections are built using a subset of properties from an existing type, also known as projections. We shall see how these help in LINQ in a short while.
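As a small preview of such a projection, here is a hedged sketch. The Student class is an assumed minimal shape (the Age property is invented for this example), and NameAndCourseLines() is a hypothetical helper. Select() is one of the standard query operators from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal Student class; Age is added only to have a property to leave out.
public class Student
{
    public string Name { get; set; }
    public string Course { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    // Hypothetical helper: projects each Student onto an anonymous type
    // that keeps only Name and Course, then formats one line per item.
    public static List<string> NameAndCourseLines(IEnumerable<Student> students)
    {
        // The anonymous type has no name we can write, so var is required here.
        var projected = students.Select(s => new { s.Name, s.Course });

        List<string> lines = new List<string>();
        foreach (var item in projected)
            lines.Add(item.Name + " takes " + item.Course);
        return lines;
    }

    public static void Main()
    {
        List<Student> students = new List<Student>
        {
            new Student { Name = "Sam", Course = "C#", Age = 30 },
            new Student { Name = "Dorothy", Course = "VB.NET", Age = 27 }
        };

        foreach (string line in NameAndCourseLines(students))
            Console.WriteLine(line);
    }
}
```

The property names of the anonymous type (Name and Course) are inferred from the properties used in the initializer.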

Lambda expressions

Lambda expressions are built on anonymous methods. They are basically delegates written in a very brief syntax. Most of the time, these expressions are passed as arguments to the LINQ Extension methods, also known as LINQ Standard Query Operators, or LSQO for short. A Lambda expression can be compound and can span multiple lines.

Lambda expressions use the operator => to separate parameters from expressions.

Before Lambda expressions came into existence, we had to write queries using delegates as follows:

Students.Find(delegate(Student s) {return s.Course == "C#";});

Using the Lambda expression, the query will be as follows:

Students.Find(s => s.Course == "C#");

The argument s => s.Course == "C#" in the second code line is the Lambda expression. One way to visualize this is to think about mathematical sets. Assume => stands for a rule of membership. So the preceding line is basically trying to find a student whose course is C#, and s denotes a temporary variable for each element of the data source being queried; in this case, the Students collection. Note that Find() returns only the first matching element; FindAll() returns all of them.

This is a very important concept for understanding and using LINQ.
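The following sketch puts the Lambda syntax to work on a List<Student>. The Student class is an assumed minimal shape, and StudentQueries with its three methods is a hypothetical wrapper written for this example. Find() and FindAll() are existing List<T> methods.

```csharp
using System;
using System.Collections.Generic;

// Assumed minimal Student class for this illustration.
public class Student
{
    public string Name { get; set; }
    public string Course { get; set; }
}

public static class StudentQueries
{
    // Expression Lambda: Find() returns the first match, or null if there is none.
    public static Student FirstInCourse(List<Student> students, string course)
    {
        return students.Find(s => s.Course == course);
    }

    // The same Lambda with FindAll() returns every match.
    public static List<Student> AllInCourse(List<Student> students, string course)
    {
        return students.FindAll(s => s.Course == course);
    }

    // Statement Lambda: braces allow several statements and an explicit return.
    public static List<Student> WithShortNames(List<Student> students, int maxLength)
    {
        return students.FindAll(s =>
        {
            string trimmed = s.Name.Trim();
            return trimmed.Length <= maxLength;
        });
    }
}

public static class Program
{
    public static void Main()
    {
        List<Student> students = new List<Student>
        {
            new Student { Name = "Sam", Course = "C#" },
            new Student { Name = "Dorothy", Course = "VB.NET" },
            new Student { Name = "Jeff", Course = "C#" }
        };

        Console.WriteLine(StudentQueries.FirstInCourse(students, "C#").Name); // prints Sam
        Console.WriteLine(StudentQueries.AllInCourse(students, "C#").Count);  // prints 2
        Console.WriteLine(StudentQueries.WithShortNames(students, 4).Count);  // prints 2
    }
}
```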

Functors

Functors are basically placeholders for Lambda expressions. Func<> is not a keyword but a family of generic delegate types; .NET 4.0 defines 17 of them, from Func<TResult>, which takes no input, up to a version that takes 16 inputs. A Functor is created using one of the Func<> types as follows:

Notice the tooltip. It is expecting the name of a method that accepts no parameters but returns a Boolean value. So boolMethod can hold any method that matches that signature.

So if we have a method as shown in the following snippet:

private static bool IsTime()
{
return DateTime.Today.Day == 8 && DateTime.Today.Month == 7;
}

the Functor can be created using the method name as an argument to the constructor, as follows:

Func<bool> boolMethod = new Func<bool>(Program.IsTime);

The last type parameter of a Functor is the output, and all the rest are inputs. So the constructor of Func<int, bool>, for example, expects a method that accepts an integer and returns a bool.

Functors of this kind, which take an argument and return a Boolean value (in most cases as the result of some operation on the input argument), are also known as Predicates.

You can also use Lambda expression syntax to declare Functors as follows:

//Declaring the Functor that takes an integer and checks if it is even.
//When the Functor takes only one input, the input does not need to be
//wrapped in parentheses. The part after the = sign is the Lambda expression.
//So you see, Lambda expressions can be used directly wherever the
//expected type is a Func<>.
Func<int, bool> isEven = x => x % 2 == 0;
    
//Invoking the Functor
bool result = isEven.Invoke(8);

Most of the LSQO accept Functors as selector functions. A selector function is a function that operates on each element in a collection. The outcome of these operations decides which values will be projected into the resulting collection. This will become clearer as we discuss all the LSQO in a while.
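Here is a brief sketch of both ideas: a Functor with more than one input, and a Functor handed to a standard query operator. IsEven and Repeat are illustrative names chosen for this example. Where() is the LINQ operator that accepts a Func<T, bool>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class Program
{
    // One input (int), one output (bool): the last type parameter is the result.
    public static readonly Func<int, bool> IsEven = x => x % 2 == 0;

    // Two inputs (string, int), one output (string). More than one input
    // requires parentheses, and braces allow a multi-line body.
    public static readonly Func<string, int, string> Repeat = (text, count) =>
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
            builder.Append(text);
        return builder.ToString();
    };

    public static void Main()
    {
        int[] numbers = { 56, 12, 134, 113, 41, 1, 0 };

        // Where() calls IsEven for each element and keeps those that return true.
        IEnumerable<int> evens = numbers.Where(IsEven);

        Console.WriteLine(string.Join(", ", evens)); // prints 56, 12, 134, 0
        Console.WriteLine(Repeat("ab", 3));          // prints ababab
    }
}
```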

Predicates

A Predicate<T> is essentially a Func<T, bool>; for example, Predicate<int> has the same signature as Func<int, bool>. You can use Lambda expressions to initialize Predicates, just like Functors, as follows:

Predicate<int> isEven = new Predicate<int>(c => c % 2 == 0);

Here is how you can call this Predicate:

int[] nums = Enumerable.Range(1, 10).ToArray();
foreach (int i in nums)
    if (isEven(i))
        Console.WriteLine(i.ToString() + " " + " is even ");

Here the Lambda expression is hidden behind the isEven delegate. It is advisable to use Func<> for LINQ-related operations, as it matches the signatures of the LSQO. Predicate<T> was introduced in .NET Framework 2.0, along with the generic collections in System.Collections.Generic, and Func<> was introduced in .NET 3.5.

Although they behave similarly, there are a couple of key differences between a Predicate and a Functor:

  1. Predicate always returns a Boolean value whereas Functors can be programmed to return a custom type:
public delegate bool Predicate<T>(T obj);
public delegate TResult Func<TResult>();

  2. Predicates can't take more than one parameter, while Functors can take up to 16 input parameters in .NET 4.0 (up to four in .NET 3.5). Here is the four-parameter version:
public delegate TResult Func<T1, T2, T3, T4, TResult> (T1 arg1, T2 arg2, T3 arg3, T4 arg4);
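One practical consequence, sketched below under the assumption that we want to reuse the same test with both kinds of API: Predicate<int> and Func<int, bool> have the same signature but are distinct delegate types, so one cannot be assigned directly to the other. The method names in this sketch are illustrative. List<T>.FindAll() expects a Predicate<T>, while the LINQ operator Where() expects a Func<T, bool>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // List<T>.FindAll() is declared with Predicate<T>.
    public static List<int> EvensViaFindAll(List<int> numbers)
    {
        Predicate<int> isEven = n => n % 2 == 0;
        return numbers.FindAll(isEven);
    }

    // Enumerable.Where() is declared with Func<T, bool>.
    public static List<int> EvensViaWhere(List<int> numbers)
    {
        Func<int, bool> isEven = n => n % 2 == 0;
        return numbers.Where(isEven).ToList();
    }

    // An existing Predicate can be adapted by pointing a Func at its Invoke method.
    public static List<int> EvensViaAdaptedPredicate(List<int> numbers)
    {
        Predicate<int> isEven = n => n % 2 == 0;
        Func<int, bool> adapted = isEven.Invoke;
        return numbers.Where(adapted).ToList();
    }

    public static void Main()
    {
        List<int> numbers = Enumerable.Range(1, 10).ToList();

        Console.WriteLine(string.Join(" ", EvensViaFindAll(numbers)));          // prints 2 4 6 8 10
        Console.WriteLine(string.Join(" ", EvensViaWhere(numbers)));            // prints 2 4 6 8 10
        Console.WriteLine(string.Join(" ", EvensViaAdaptedPredicate(numbers))); // prints 2 4 6 8 10
    }
}
```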

Actions

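Action is a family of built-in delegates for methods that do not return a value. Action<T> has been available since .NET Framework 2.0; .NET 3.5 added the parameterless Action and versions with up to four parameters, and .NET Framework 4.0 extended this to 16 parameters. There are several overloaded versions to handle different styles of delegates.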

Here is an example:

Action<string> sayHello = new Action<string> (c=>Console.WriteLine("Hello " + c));
    
List<string> names = new List<string>() { "Sam", "Dave", "Jeff", "Erik"};
names.ForEach(sayHello);

ForEach() is a method in the List<T> class that takes an argument of the type Action<T>. The preceding snippet prints the following output:

Hello Sam
Hello Dave
Hello Jeff
Hello Erik
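The other Action overloads work the same way. The following minimal sketch uses a two-parameter Action and a method group in place of a Lambda expression; the Greet() helper is a hypothetical name introduced for this example.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // Hypothetical helper that collects greetings instead of printing them directly.
    public static List<string> Greet(IEnumerable<string> names, string greeting)
    {
        List<string> lines = new List<string>();

        // Action<string, string>: two inputs, no return value.
        Action<string, string> addLine = (g, name) => lines.Add(g + " " + name);

        foreach (string name in names)
            addLine(greeting, name);

        return lines;
    }

    public static void Main()
    {
        List<string> names = new List<string> { "Sam", "Dave", "Jeff", "Erik" };

        // A method group is converted to Action<string> for ForEach().
        Greet(names, "Hello").ForEach(Console.WriteLine);
    }
}
```

Passing Console.WriteLine without parentheses lets the compiler pick the overload that matches Action<string>, so no Lambda expression is needed.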