As a developer, you should not be surprised to encounter unexpected behavior when working with a new technology.

Microsoft added the Entity Framework (EF) to ADO.NET with the .NET 3.5 Service Pack 1, released in 2008, enabling developers to incorporate a data model directly in their application and interact with their data through the model rather than working directly against the database. For background on EF, see my previous article, “Introducing ADO.NET Entity Framework” in the Nov/Dec 2007 issue of CODE Magazine.

Because ADO.NET Entity Framework approaches data access from a perspective that is new to many developers, what might be expected behavior in EF can stump developers who are new to working with the technology. I have been working with EF for over two years and spent a good part of that time writing an in-depth book on the topic (Programming Entity Framework, O’Reilly), so along the way I encountered many bumps. Once I had a better understanding of EF, I was able to avoid many of these issues or at least understand them quickly when I saw an error or exception message. Because EF is a version 1 product, there are also a number of surprises that are not “by design”.

In this article, I will introduce you to some of the more painful “gotchas” that you are likely to encounter while working with Entity Framework. These are not the obvious pain points about EF that you may have heard about frequently, such as lack of lazy loading or the difficulties of change tracking across tiers. Instead I will focus on some lesser known gotchas which I have experienced myself, and whose repeated occurrence in the MSDN Forums [http://social.msdn.microsoft.com/forums/en-US/adodotnetentityframework/threads/] assures me that I am not alone with these surprises. For each one of the gotchas, I’ll explain the behavior, why it is occurring and how to avoid or work around it, if possible. My goal is that you will be aware of and prepared for these behaviors so that your workflow won’t be severely interrupted by them.

1. Runtime LINQ Query Failures

LINQ is an enhancement to both the Visual Basic and C# languages. LINQ provides you with strongly typed query syntax and IntelliSense to aid you in constructing queries. With LINQ to Entities, and LINQ to SQL for that matter, there is something that you need to be considerate of when constructing LINQ queries. Not everything that is valid in the eyes of the .NET Framework is equally valid in the eyes of your database engine. It is quite possible to write a LINQ to Entities query that compiles perfectly well, yet throws an exception at run time.

Take, for example, this very tempting query that tries to format my return values as part of the query using a very convenient .NET method, ToShortDateString.

C#
from c in context.Customers
select new {
 FirstName = c.FirstName.Trim(),
 LastName = c.LastName.Trim(),
 ModDate = c.ModifiedDate.ToShortDateString()
 }
    
VB
From c In context.Customers
Select
 FirstName = c.FirstName.Trim,
 LastName = c.LastName.Trim,
 ModDate = c.ModifiedDate.ToShortDateString

This feels right when I construct it and both the C# and VB compilers are happy. The ToShortDateString method is a huge convenience and I expect it to return nice looking results.

But at runtime, the query throws a NotSupportedException when EF attempts to compile the query. The exception message is straightforward.

LINQ to Entities does not recognize the method 'System.String ToShortDateString()' method, and this method cannot be translated into a store expression.

Why is this? SQL Server has no method, operator or function that is comparable to ToShortDateString.

Entity Framework and LINQ to SQL can only compile methods and functions that have some counterpart in the target store. In the case of Entity Framework, the store could just as easily be Firebird or Oracle as it could be SQL Server. Even if SQL Server had a comparable method, every other EF provider would need to target a database that supports it as well. Therefore, LINQ to Entities does not support the method.

But the .NET compiler has no way of knowing this at compile time because of the way Visual Studio evaluates LINQ expressions. All the compiler sees is valid .NET syntax. It is not until runtime that the process of translating the LINQ to Entities query into a store command (e.g., T-SQL or PL/SQL) is performed; and it is at that time that the problem with the query is detected.

As a side note, because Entity SQL is a string-based syntax, the compilation of its queries is completely deferred until runtime. Entity SQL does, however, have the advantage of leveraging provider-specific functions and operators.

There’s no way to avoid this gotcha except to have an awareness of what methods are reasonable store functions. Other than that, testing your application and all of your queries will protect you from releasing these problems into the wild.
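If the goal is simply to display formatted values, one option is to let the store return the raw columns and apply the .NET formatting afterwards in memory. The following C# sketch assumes the same context and Customer entity as the query above, and that ModifiedDate is a non-nullable DateTime. AsEnumerable switches the rest of the query to LINQ to Objects, so Trim and ToShortDateString run on the client instead of being translated into a store command.

```csharp
// Sketch only: "context" is the ObjectContext used in the query above.
// Requires the System.Linq namespace.
var customers =
    (from c in context.Customers
     // Plain property access only, so EF can translate this part.
     select new { c.FirstName, c.LastName, c.ModifiedDate })
    .AsEnumerable() // everything below runs in memory, not in the store
    .Select(c => new
    {
        FirstName = c.FirstName.Trim(),
        LastName = c.LastName.Trim(),
        ModDate = c.ModifiedDate.ToShortDateString()
    });
```

The trade-off is that any filtering or sorting placed after AsEnumerable also happens on the client, so Where and OrderBy clauses belong in the first part of the query.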

2. Modified SSDL Gotcha

The Entity Framework metadata (the Entity Data Model (EDM) schema along with the mapping and store schema) support a wide variety of customization possibilities. However, the current iteration of the EF designer tools does not support all of these features. One of the interesting features of the Store Schema Definition Language (SSDL) is that while it is a representation of the schema of your database, you can modify the SSDL to enhance your model. A change to the SSDL does not impact your database, but allows you to build logic into your model that is not part of the data store.

Some of these enhancements are DefiningQueries (which allow you to inject store commands), modifying attributes of columns or even defining new parameters in stored procedures.

You can solve a lot of quandaries by modifying the SSDL directly. However, there is an expensive gotcha when taking advantage of this powerful capability. Some (but not all) of these manually added enhancements to the SSDL will be overwritten when you use the Update Model Wizard.

Why would this happen? The Update Model Wizard recreates any element in the SSDL which represents an object in the database. For example, if you have a Customer table in the database, the wizard’s job is to ensure that the SSDL correctly represents that table. If you have made any manual changes to the elements or attributes of that table’s definition in the SSDL, you will lose them completely as the wizard rebuilds the description of the Customer table.

The wizard will not touch elements that it does not recognize. For example, if you create a DefiningQuery to define a “virtual” stored procedure, in other words a procedure that does not exist in the database, but one which you want Entity Framework to be able to execute, this DefiningQuery will remain intact when you update the model. There is nothing to overwrite it with.

In all, the Update Model Wizard does not completely rewrite the SSDL, but only updates pieces of it. Those pieces that have a true counterpart in the database will be rewritten from scratch and any changes to those particular elements will be lost.

I’ve suffered from this overwrite enough times to make me think twice about customizing entities in the SSDL.

There is, unfortunately, no workaround for this problem. I tend to be very strict with my use of SSDL customizations and when I do insert them, I keep a separate file that I can use to copy and paste them back in if necessary. I hope that someday, some clever developer will find a way to create a tool for merging SSDL elements, perhaps something like “partial classes for XML”.

3. Function Mapping to Stored Procedures in Inheritance Hierarchies

While the EDM does support stored procedures in a number of scenarios, the rules by which these can be implemented are very strict. You may find yourself in a situation where you have happily mapped some stored procedures (called “functions” in the model) and then suddenly realize that you are about to unravel a large ball of yarn.

I have hit one particular function mapping rule numerous times. The rule is that if you have a stored procedure mapped to an entity that is part of an inheritance hierarchy, you must also map stored procedures to every entity in that hierarchy. If you have a lot of inheritance built into your model and you have not planned ahead for this, you will certainly be surprised.

Imagine that in your database, you have a Person table, a Customer table and a SalesPerson table. The Customer and SalesPerson tables provide additional details for those people who are either a customer or a salesperson, as shown in Figure 1.

Figure 1: Table schema in database.

In your model you have identified the Customer and SalesPerson as entities which derive from Person in a Table per Type inheritance (Figure 2).

Figure 2: SalesPerson and Customer entities inheriting from Person.

This is very nice because you can easily query data and interact with these entities without having to constantly navigate through the Person entity in order to get to properties such as FirstName and LastName. For example, rather than writing a query that looks like this:

C#
from c in context.Customer.Include("Person")
orderby c.Person.LastName
select c;
    
VB
From c In context.Customer.Include("Person")
Order By c.Person.LastName

You could express the query without navigating through Person to get to the LastName property.

C#
from c in context.People.OfType<Customer>()
orderby c.LastName
select c;
    
VB
From c In context.People.OfType(Of Customer)
Order By c.LastName

Similarly, when interacting with the resulting Customer objects, you don’t need to go through the Person navigation property to get at the scalar properties of Person.

Now imagine that you have a nice set of insert, update and delete procedures in your database for the Person table. You can map them directly to the Person entity in the model and SaveChanges will call those stored procedures when persisting modifications to a Person entity. However, when you validate the model, you’ll get an error telling you:

If an EntitySet mapping includes a function binding, function bindings must be included for all types. The following types do not have function bindings: Model.Customer, Model.SalesPerson.

The message says that you must also map insert, update and delete procedures to Customer and SalesPerson.

This is not a limitation of the designer or a problem with the model. Rather, it is an effect of how the model mappings work and more importantly, how the Entity Framework interprets the model in order to send the updates from your application to the database. As you grow to understand EF more thoroughly, this will not come as a surprise but might act as a deterrent from mapping the functions if you were hoping to quickly build a model and get to the task of building your program.

To me, this rule is not a deterrent or annoyance. A well-designed data model is an investment in your application. Either way, it is definitely something that will surprise you if you are not prepared for it.

NOTE: This rule applies to associated entities as well in certain scenarios. If you map functions to an entity which is the 1 or 0..1 end of a relationship you are also required to map the associated entities. Therefore if you mapped functions to Customer and Customer has a one-to-many relationship with Order, you must also map functions to the Order entity. However, if you map functions to the Order entity, which is the “many” end of this association, you don’t have to map functions to Customer.

4. ObjectContext.Refresh and Entity Graphs

A convenient and important feature of the ObjectContext class is the ability to refresh cached objects from the database. This can be used in response to an OptimisticConcurrencyException or as part of the general logic of your application. ObjectContext.Refresh supports both ClientWins and StoreWins options.

A ClientWins refresh keeps the property values in the cached entities, updates their original values from the database and flags the properties as modified, so that the next call to SaveChanges writes the client values to the database. A StoreWins refresh will pull the current server values into the target entities and set their EntityState to Unchanged. In effect, StoreWins overwrites any changes made by the user with the database values.

When calling Refresh, you must identify which cached entities are to be refreshed. This could be a single entity or some set of entities that are contained in a variable, for example, a variable that points to a list of customer entities. What Refresh does not do is simply refresh every entity in the cache. The following code snippet shows Refresh being used with a specific list of Customer entities.

C#
List<Customer> custList=
    context.Customers
   .Where(c=>c.FirstName=="Julie")
   .ToList();
context.Refresh
 (RefreshMode.ClientWins,custList);
    
VB
Dim custList As List(Of Customer) = _
    context.Customers _
   .Where(Function(c) c.FirstName="Julie") _
   .ToList
context.Refresh _
 (RefreshMode.ClientWins,custList)

However there is something else that Refresh does not do. I discovered this when hammering on Refresh relentlessly while I was writing a chapter about Entity Framework Exceptions for my book. Refresh will not update graphs. It will only update the parent entity in a graph. In other words if you call Refresh and pass in a Customer graph, e.g., a Customer with Orders and LineItems attached to it as shown in the following example, only the Customer entity will get refreshed.

C#
Customer cust=
 context.Customers.Include("Orders.Items")
 .Where(c=>c.FirstName=="Julie")
 .First();
context.Refresh
 (RefreshMode.ClientWins,cust);
    
VB
Dim cust As Customer = _
 context.Customers.Include("Orders.Items") _
 .Where(Function(c) c.FirstName="Julie") _
  .First
context.Refresh _
 (RefreshMode.ClientWins,cust)

Looking in SQL Profiler at the store command sent by the Refresh, you can see that only the Customer was refreshed and not the other entities in the graph.

SELECT
1 AS [C1],
[Extent1].[PersonID] AS [PersonID],
[Extent1].[AccountNumber] AS [AccountNumber],
[Extent1].[ModifiedDate] AS [ModifiedDate],
[Extent1].[TimeStamp] AS [TimeStamp],
[Extent1].[CustomerTypeID] AS [CustomerTypeID]
FROM [dbo].[Customer] AS [Extent1]
WHERE [Extent1].[PersonID] = 54

What’s worse is that there is no exception thrown. I call this a quiet failure. The way Refresh works is by design and the fact that you have requested that the full graph be refreshed is simply ignored. I imagine that many applications out there use Refresh with graphs and the developers have no idea that this method is not doing what they had expected.

I don’t know of a workaround for this except to be aware of the behavior.
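Because Refresh accepts a collection of entities, one approach that may be worth testing is to pass the related entities to it explicitly. The following C# sketch is an assumption rather than a verified fix: it reuses the cust graph from the example above and assumes the navigation properties are named Orders and Items, as in the Include path.

```csharp
// Sketch only: "context" and "cust" come from the example above.
// Requires System.Linq; RefreshMode lives in System.Data.Objects.
RefreshMode mode = RefreshMode.ClientWins;

// Refresh the root entity, as before.
context.Refresh(mode, cust);

// Copy the children that are already in memory into lists and
// refresh each level of the graph explicitly.
var orders = cust.Orders.ToList();
if (orders.Count > 0)
{
    context.Refresh(mode, orders);
}

var items = orders.SelectMany(o => o.Items).ToList();
if (items.Count > 0)
{
    context.Refresh(mode, items);
}
```

This only touches entities that are already in the cache. Rows added to the database since the original query would still have to be loaded by a new query.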

5. Destroying an ObjectContext without Detaching Its Cached Entities

This gotcha really surprised me to the tune of many hours of wondering what on earth I was doing wrong. Finally, I asked about it in the MSDN forums (see the MSDN Forum Thread sidebar) and learned that it is purely the result of the logic of how Entity Framework’s ObjectContext manages its entities.

You will likely discover this gotcha if you are attempting to cache entities to be used by different contexts throughout the lifetime of your application.

Here’s how I first approached this task. I executed a query that retrieved some Product entities that I would use in various places in my application.

C#
prodList =
 context.Products
        .OrderBy(p => p.ProductName)
        .ToList();
    
VB
prodList = _
 context.Products _
        .OrderBy(Function(p) p.ProductName) _
        .ToList

Then I disposed the ObjectContext, leaving the objects in memory. Next I spun up a new context and within that context I created a new Order and then a Line Item. As part of the Line Item definition, I attached one of the products from the prodList.

C#
newItem.Product= prodList[0];
    
VB
newItem.Product=prodList(0)

At runtime, when I finally attached the new order graph to the context, an InvalidOperationException was thrown with the message:

An entity object cannot be referenced by multiple instances of IEntityChangeTracker.

It took me a while to understand where it was coming from. The problem was caused by the Product I had attached to the LineItem. Although the original context which was used to query the Products was long gone, deep within the Product object there was a lingering reference to that ObjectContext. When I attempted to add the Order graph, including the particular Product that I had attached to the LineItem, it was not possible to attach the Product to the new context.

I would have had to explicitly detach each of the products from that original context before disposing the context. Unfortunately, that was not something that seemed necessary because this is very subtle behavior. As explained by Jeff Derstadt in the MSDN thread mentioned earlier, the fact that Detach didn’t happen implicitly when the context was disposed was a design decision.
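For reference, the explicit detach described above might look like the following C# sketch. It assumes prodList and context are the variables from the earlier Product query.

```csharp
// Sketch only: "context" and "prodList" come from the earlier query.
// Detach every cached Product so that it no longer references
// this context's change tracker.
foreach (Product product in prodList)
{
    context.Detach(product);
}

// The products in prodList can now be used with another context.
context.Dispose();
```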

Now that I understand EF much better, it makes perfect sense to me that I should use the NoTracking MergeOption for the initial Product query when I intend to use the results as a reference list. This solution completely avoids attaching the products to the context and therefore, no detach would be necessary.

Because I originally used a LINQ to Entities query to retrieve the Products, the following code first casts that query to an ObjectQuery, then applies the MergeOption before executing the query.

C#
IEnumerable<Product> linqQuery =
 context.Products
 .OrderBy(p => p.ProductName);
((ObjectQuery)linqQuery).MergeOption =
 MergeOption.NoTracking;
prodList = linqQuery.ToList();
    
VB
Dim linqQuery As IEnumerable(Of Product) = _
  context.Products _
  .OrderBy(Function(p) p.ProductName)
CType(linqQuery, ObjectQuery).MergeOption = _
 MergeOption.NoTracking
prodList = linqQuery.ToList

This solves the problem of the context. But allow me to point out two other problems with this code that may surprise you as well.

First, note that the Product is now attached to the new context. Prior to disposing the new context, you’ll need to explicitly detach the product from this context. In order to do this, you’ll want to have a variable by which to reference the instance.

Rather than merely assigning the product using prodList[0], assign it to a variable first and use that variable to set the LineItem’s Product. Then before disposing the context, call Detach (see Listing 1).

That’s the first problem. The second is so common that it deserves to be highlighted as my next gotcha.

6. Adding Graphs with Both New and Existing Entities to the ObjectContext

If you run the code in Listing 1, you will still run into one more gotcha, one that developers frequently ask about not only in the forums, but also in e-mails that I receive.

When the ObjectContext executes the AddToOrders method, the runtime will throw an InvalidOperationException with the message:

The object cannot be added to the ObjectStateManager because it already has an EntityKey. Use ObjectContext.Attach to attach an object that has an existing key.

The message is clear, yet it still leaves a mystery. Most often, when presented with this error, the developer will focus on the Order because that is the target of the AddToOrders method. But, in fact, there are three entities in the Order graph: newOrder, newItem, and existingProduct.

The existingProduct is the entity that already has an EntityKey and is creating the problem. This EntityKey was defined when the Product entity was initially materialized as a result of the earlier query.

The code-generated AddToOrders method leans on the ObjectContext.AddObject method. AddObject applies the same logic to everything in the graph. So internally it tries to Add the newOrder to the context and then call Add with newItem, and finally call Add with the existingProduct.

And there is your problem. Entity Framework is calling Add on existingProduct when that object is not in the proper state to be added. You need to attach the existingProduct object, not add it.

Attach Reference Entities to the Context before Attaching them to a New Entity

There is an easy way to solve this problem, but the pattern you need for the solution is not one that is obvious or discoverable. You need to separate your steps into two sets of logic. First, explicitly attach the Product to the new context and add the Order graph (with its LineItem) to the context. Then, once both are being tracked by the context, you can hook the pre-existing Product up to the added Order’s LineItem. With this pattern, everything will align properly.

C#
context.Attach(existingProduct);
context.AddToOrders(newOrder);
newItem.Product = existingProduct;
context.SaveChanges();
context.Detach(existingProduct);
    
VB
context.Attach(existingProduct)
context.AddToOrders(newOrder)
newItem.Product = existingProduct
context.SaveChanges()
context.Detach(existingProduct)

Interestingly, if you attempt to set the newItem.Product when only one of these has been bound to the ObjectContext (either the existingProduct or the Order graph, but not both), then you will get yet another exception.

The relationship between the two objects cannot be defined because they are attached to different ObjectContext objects.

The logic of this is certainly not apparent. When you detach one entity in the graph, its ObjectContext property evaluates to null. The other, which is now attached to the context, returns the ObjectContext instance. Entity Framework considers the null ObjectContext to be a “different” context and therefore won’t allow objects from different contexts to be joined. This looks to me like an oversight in the EF classes. I have spent plenty of time trying to figure out the one true and simple way to join pre-existing entities to new entities. Hopefully this saves you some time and aggravation.

Avoid Attaching Issues by Working Directly with EntityKeys

I’ll now offer another more direct way to get around this problem. Rather than working with the reference entity, existingProduct, to identify the LineItem’s product, you can work directly with its EntityKey. Using the product’s EntityKey is akin to simply assigning a foreign key value to the LineItem.Product property. And since you will be working with an EntityKey and not the entity itself, you will avoid the collision caused by the EntityState of the various entities in the graph.

To solve this problem, you will work directly with the EntityReference property that is created by the code generator along with the navigation reference property from the model. In this case, LineItem has a Product property as well as a ProductReference property. Product points to the actual product instance while ProductReference contains only the EntityKey for that product.

With this pattern you won’t ever have to worry about the product getting attached to the context and then having to detach it again so that you can reuse it.

Listing 2 shows the code that creates the Order and LineItem, assigns the LineItem’s Product using the EntityKey and then saves the graph back to the database using a new ObjectContext instance.

Note that all of the extra code dealing with the product’s binding to the context has gone away, including the last bit of code to detach the product from the context.

Additionally, the code assigns the order’s pre-existing customer using the same logic and the assumption that you have an existing list of Customer entities available.
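As a rough illustration of the EntityKey pattern (a sketch, not the listing itself), the C# below sets only the keys on the reference properties. MyEntities, the Items collection on Order, CustomerReference and the existingCustomer variable are assumed names; existingProduct is the cached Product used earlier.

```csharp
// Sketch only: entity and property names are assumptions based on
// the examples in this article.
using (var context = new MyEntities())
{
    var newOrder = new Order();
    var newItem = new LineItem();
    // Set any required scalar properties of the order and item here.

    // Assign keys instead of entity instances, so the cached Product
    // and Customer objects never join this context.
    newItem.ProductReference.EntityKey = existingProduct.EntityKey;
    newOrder.CustomerReference.EntityKey = existingCustomer.EntityKey;

    newOrder.Items.Add(newItem);
    context.AddToOrders(newOrder);
    context.SaveChanges();
}
```

Because only keys are assigned, newItem.Product itself stays null in this context unless the product is loaded separately.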

7. ObjectContext Does Not Track Changes Naturally in Web Apps

The Visual Studio compiler does not alert you when you are using bad patterns. There is a very bad pattern that you could easily code using EF in websites without realizing that EF will completely drop the ball. This is more likely to impact developers who are building RAD Web apps using EF but not using the EntityDataSource control.

Here’s the setup. Suppose you declare an ObjectContext (MyEntities) in the Page class. In the Page Load event, you query for some data and bind it to one or more UI controls. Then, in the Click event handler of a Save button on the page, you call ObjectContext.SaveChanges.

It looks right (see Listing 3) and, in fact, is a pattern that works perfectly with Windows Forms and WPF apps.

What’s wrong with this innocent-looking code? The problem is specific to how Web pages work. While this may be obvious to experienced Web developers, it is not so obvious to those who are newer to creating Web pages. When the Page Load event is run, this is occurring on the server. The page class does all of its work including executing the EF query and then creates the HTML that represents the page. The HTML is sent down to the browser and the server is finished with its job. At that point, the page class and all of its objects are disposed. That means the ObjectContext that was created and the objects are also disposed.

When the user does their work on the Web page and then clicks the Save button, the server creates a brand new page class, processes all of the required code and creates a new set of HTML to send back to the browser. That means a brand new ObjectContext is created.

The ObjectContext that is used to call SaveChanges in the Save button event is not the same one that was used to execute the query. What you lose when working with the Web page is all of the change tracking capabilities that you would normally have with a long-running ObjectContext. Therefore, when SaveChanges is called, there are no changes for the context to save. Nothing will happen and the user’s edits will not be persisted to the database.

If you are doing RAD development for a website, you have a number of options to allow changes to be saved. The EntityDataSource control is similar to SqlDataSource and LinqDataSource, but is designed specifically to work with Entity Framework. It is configured as a UI control and, by default, hides all of the work that it does to ensure that the users’ modifications are persisted to the database. You can customize the behavior on the server side.

Another technology, ASP.NET Dynamic Data, allows you to create dynamic sites using pre-defined page templates and controls which work with a data model (e.g., LINQ to SQL or Entity Data Model) to automate website creation. By default, Dynamic Data templates and controls hide all of the plumbing so that you don’t have to worry about getting data to the page or back to the database.

You can also use entities with ASP.NET MVC applications or the ASP.NET ObjectDataSource control, where you can provide more separation of business logic in your entities. Finally, if you are building a highly layered application, you can use Web Services, WCF Services, ADO.NET Data Services or construct your own business logic for persisting changes from the UI through to the server.
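For a hand-coded page, one simple form of constructing your own logic is to query the entity again during the postback and apply the posted values to it, so that the new context has changes to track. The C# sketch below is only an illustration. The page class, the CustomerIdField and CompanyNameTextBox controls, and the string CustomerID key are all hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical code-behind; the controls are assumed to be declared
// in the page markup.
public partial class CustomerPage : System.Web.UI.Page
{
    protected void SaveButton_Click(object sender, EventArgs e)
    {
        // The context from the first request is gone, so create a
        // new one for this postback.
        using (var context = new MyEntities())
        {
            string id = CustomerIdField.Value;

            // Re-query so that this context tracks the entity.
            Customer cust = context.Customers
                .Where(c => c.CustomerID == id)
                .First();

            // Apply the posted value; the context now sees a change.
            cust.CompanyName = CompanyNameTextBox.Text;
            context.SaveChanges();
        }
    }
}
```

This costs an extra query per save and does not detect edits made by other users in the meantime.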

8. Strongly Typed Variables in a Query Break Eager Loading

Here’s a gotcha that came up in the MSDN forums as I was writing this article.

Entity Framework provides the capability to easily query for graphs. In other words, you can request customers along with their orders using the ObjectQuery.Include method, for example context.Customers.Include("Orders"). This is referred to as eager loading.

A developer had written a query that used this method to eager load some data along with its related data. In the query, he had strongly typed the reference variable. The query below demonstrates this strong-typing by declaring the reference variable, c, as a Customer type in the query.

C#
IQueryable<Customer> query =
 from Customer c
 in context.Customers.Include("Orders")
 select c;
List<Customer> customers = query.ToList();
    
VB
Dim query As IQueryable(Of Customer) = _
  From c As Customer _
  In context.Customers.Include("Orders")
Dim customers As List(Of Customer) = _
 query.ToList()

When the query executed, it only returned Customers. None of the related Orders were in the graph. Digging a little further, you can see in the T-SQL snippet below that the store query that hit the database only asks for Customers. There is no indication that the client requested that the Orders be eager-loaded with the Customers.

SELECT
[Extent1].[CustomerID] AS [CustomerID],
[Extent1].[CompanyName] AS [CompanyName],
[Extent1].[ContactName] AS [ContactName],
[Extent1].[ContactTitle] AS [ContactTitle],
[Extent1].[Address] AS [Address],
[Extent1].[City] AS [City],
[Extent1].[Region] AS [Region],
[Extent1].[PostalCode] AS [PostalCode],
[Extent1].[Country] AS [Country],
[Extent1].[Phone] AS [Phone],
[Extent1].[Fax] AS [Fax]
FROM [dbo].[Customers] AS [Extent1]

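Strong typing causes the problem when used in tandem with an Include method. The fact that the query specifically indicates to return Customers overrides the Include, which is then completely ignored when the query is compiled. In C#, for instance, an explicitly typed range variable is compiled into a call to Cast<Customer>() on the query source, which appears to be where the Include gets lost.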

By removing the strong typing, the Include works as expected and the Orders are brought back with their Customers.

C#
IQueryable<Customer> query =
 from c
 in context.Customers.Include("Orders")
 select c;
    
VB
Dim query As IQueryable(Of Customer) = _
  From c _
  In context.Customers.Include("Orders")

This is another quiet failure. The compiler does not indicate that the strong typing and the eager loading are not compatible.

Summary

Entity Framework is an important addition to ADO.NET, but as with any new technology, it takes a while to work out the kinks and to build up enough shared knowledge about how the tool works for developers to use it effectively.

I find the MSDN forums to be a great resource when I get stuck on something that just doesn’t make sense. More and more developers are gaining expertise with Entity Framework and many of them, along with quite a few members of the Entity Framework team, spend time on the forums.

The gotchas that I covered in this article may be aggravating, but once you are familiar with them and prepared for them, you’ll have a much smoother journey with Entity Framework and be able to focus on its many benefits.