Overview of the ADO.NET Entity Framework
ADO.NET Entity Framework
What is it?
Microsoft definition:
The ADO.NET Entity Framework is designed to enable developers to create data access applications by programming against a conceptual application model instead of programming directly against a relational storage schema. The goal is to decrease the amount of code and maintenance required for data-oriented applications. Entity Framework applications provide the following benefits:
- Applications can work in terms of a more application-centric conceptual model, including types with inheritance, complex members, and relationships.
- Applications are freed from hard-coded dependencies on a particular data engine or storage schema.
- Mappings between the conceptual model and the storage-specific schema can change without changing the application code.
- Developers can work with a consistent application object model that can be mapped to various storage schemas, possibly implemented in different database management systems.
- Multiple conceptual models can be mapped to a single storage schema.
- Language-integrated query (LINQ) support provides compile-time syntax validation for queries against a conceptual model.
So what does that mean?
A way to reference data as "plain old CLR objects" (POCO), without regard to how the data is actually retrieved or persisted (unless you want to know). You can query, create, update, and delete using .NET objects, with either extension methods or LINQ to Entities query syntax.
History
Data Readers - serial data access, no type safety, manual loading into objects.
DataSets - disconnected access, sorting, filtering, no type safety, manual loading or databinding.
Typed DataSets - disconnected, strongly typed, still datasets ("heavy", lots of dataset methods, easily abused)
LINQ to SQL - connected, first attempt at using CLR code to access data. Still maps one-to-one to database tables. Lazy loading is both good and bad.
Entity Framework - connected or disconnected, doesn't need to map directly to data model, lazy or eager loading, "mostly" POCO.
So why would I want to use this?
Less code. More time to focus on what matters, not data access logic. Bindable: entities can be used directly for databinding in ASP.NET or WPF. Partial classes, allowing easy extension.
Examples
Acme Construction Consulting. An application written in WPF using SQL Server Express 2008 for data storage and Entity Framework for data access. Middle-tier components accept scalar parameters and return Entity Framework objects to the front end. The DataContext in the WPF UI is set directly to Entity Framework objects. State management plays nicely with databinding, allowing changes to be tracked for persistence at a time of the developer's choosing.
Very connected model. Entire object graph can be retrieved with a single statement and bound to the page and elements on the page.
PropertyChanged events allow "TurboTax"-style updates whenever the relevant data elements are changed.
Large amount of data, can be slow to retrieve, but ok due to the nature of the application.
Partial classes extended to allow custom formulas to be exposed as properties.
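public partial class SoftCost
{
    #region Calculated properties

    // Lump-sum amount: construction cost (without escalation) times this
    // item's share of the project. Returns null when either input is missing.
    public double? LumpSum
    {
        get
        {
            // Check the same property that is dereferenced below; testing
            // TotalConstructionCost here would not guard the .Value call.
            if (Scenario.TotalConstructionCostWithoutEscalation == null ||
                PercentofProject == null)
            {
                return null;
            }

            return Scenario.TotalConstructionCostWithoutEscalation.Value *
                   PercentofProject.Value;
        }
    }

    #endregion
}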
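The PropertyChanged note and the calculated property above work together: when an input changes, the object also has to announce that the calculated value changed, or bound controls will not refresh. The following is a minimal, self-contained sketch of that idea. SoftCostSketch is a hypothetical stand-in, not the generated SoftCost entity, and it raises the event by hand rather than through whatever the generated classes do internally.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical stand-in for an entity with a calculated property.
public class SoftCostSketch : INotifyPropertyChanged
{
    private double? _totalConstructionCost;
    private double? _percentOfProject;

    public event PropertyChangedEventHandler PropertyChanged;

    public double? TotalConstructionCost
    {
        get { return _totalConstructionCost; }
        set
        {
            _totalConstructionCost = value;
            OnPropertyChanged("TotalConstructionCost");
            // LumpSum depends on this input, so announce it as changed too.
            OnPropertyChanged("LumpSum");
        }
    }

    public double? PercentOfProject
    {
        get { return _percentOfProject; }
        set
        {
            _percentOfProject = value;
            OnPropertyChanged("PercentOfProject");
            OnPropertyChanged("LumpSum");
        }
    }

    // Calculated, read-only: null until both inputs are present.
    public double? LumpSum
    {
        get
        {
            if (_totalConstructionCost == null || _percentOfProject == null)
            {
                return null;
            }
            return _totalConstructionCost.Value * _percentOfProject.Value;
        }
    }

    private void OnPropertyChanged(string propertyName)
    {
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}

public static class PropertyChangedDemo
{
    public static void Main()
    {
        var cost = new SoftCostSketch();
        var raised = new List<string>();
        cost.PropertyChanged += (sender, e) => raised.Add(e.PropertyName);

        cost.TotalConstructionCost = 1000.0;
        cost.PercentOfProject = 0.25;

        // Each setter reports its own name followed by "LumpSum".
        Console.WriteLine(string.Join(", ", raised));
        Console.WriteLine(cost.LumpSum);
    }
}
```

In a WPF binding, the extra "LumpSum" notification is what lets a bound total update as soon as either input is edited.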
Different ways to get at the data (LINQ query syntax, extension methods)
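// LINQ query syntax: returns a composable query over all renovation types.
public IQueryable<DataObjects.RenovationType> GetRenovationTypes()
{
    return
        from renovationType in db.RenovationType
        select renovationType;
}

// Extension-method syntax: First throws if no row has the given id.
public DataObjects.RenovationType GetRenovationType(int id)
{
    return db.RenovationType.First(r => r.Id == id);
}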
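The two styles above are interchangeable: query syntax is compiled into the same extension-method calls. As a small illustration, here is a sketch that runs both forms over an in-memory list (LINQ to Objects, not Entity Framework). RenovationTypeSketch and its Description property are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the RenovationType entity.
public class RenovationTypeSketch
{
    public int Id { get; set; }
    public string Description { get; set; }
}

public static class QueryStyles
{
    // Query (comprehension) syntax.
    public static IEnumerable<string> WithQuerySyntax(
        IEnumerable<RenovationTypeSketch> source, int minId)
    {
        return from r in source
               where r.Id >= minId
               orderby r.Description
               select r.Description;
    }

    // The same query written with extension methods and lambdas.
    public static IEnumerable<string> WithExtensionMethods(
        IEnumerable<RenovationTypeSketch> source, int minId)
    {
        return source
            .Where(r => r.Id >= minId)
            .OrderBy(r => r.Description)
            .Select(r => r.Description);
    }

    public static void Main()
    {
        var data = new List<RenovationTypeSketch>
        {
            new RenovationTypeSketch { Id = 1, Description = "Minor" },
            new RenovationTypeSketch { Id = 2, Description = "Major" },
            new RenovationTypeSketch { Id = 3, Description = "Gut" }
        };

        Console.WriteLine(string.Join(", ", WithQuerySyntax(data, 2)));
        Console.WriteLine(string.Join(", ", WithExtensionMethods(data, 2)));

        // First throws when nothing matches; FirstOrDefault returns null instead.
        Console.WriteLine(data.FirstOrDefault(r => r.Id == 99) == null);
    }
}
```

Which form to use is mostly a matter of readability: query syntax tends to read better with joins and ordering, while single-call lookups such as First are only available as extension methods.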
More complex queries possible
public IQueryable<DataObjects.DepartmentGroupType> GetDepartmentGroupTypes(int sectorId)
{
    return
        from departmentGroupType in db.DepartmentGroupType
            .Include("DepartmentType")
        where departmentGroupType.Sector.Id == sectorId
            || departmentGroupType.Sector.Id == 0 // Include "All" sectors
        orderby departmentGroupType.Description
        select departmentGroupType;
}
Lessons learned
Performance. If you have an object that has nested related objects, you have a few choices as to how to retrieve that related information. You can use the Include method to have the database server return the full object graph in a single query. Or, you can use the Load method to load related objects individually after retrieving the initial object. The problem with Include is that it creates a large UNION statement in the database that returns one row for each of the lowest-level objects included. This can be VERY slow if there are many child objects or a lot of data. Include is certainly the easier approach if performance is not an issue, but in this application the "Gatling gun" approach (calling Load many times instead of using Include) was much faster.
Object context. Entities are not always fully disconnected, even when the context is gone (for example, disposed at the end of a using block). The state tracker (ObjectStateManager) is still there and will complain if you try to "mix and match" objects that were retrieved from different contexts.
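To make the Performance note above more concrete, here is a rough back-of-the-envelope model of why one flattened result can be so much larger than several narrow ones. It is only arithmetic under stated assumptions (a parent with children, each child with at least one grandchild, and made-up column counts); it does not reproduce the SQL that Entity Framework actually generates, and it ignores per-query overhead other than counting round trips.

```csharp
using System;

public static class LoadingCostSketch
{
    // Values returned when one query flattens parent -> children -> grandchildren:
    // there is one row per grandchild, and every row repeats the parent and
    // child columns.
    public static long FlattenedValues(
        int children, int grandchildrenPerChild,
        int parentCols, int childCols, int grandchildCols)
    {
        long rows = (long)children * grandchildrenPerChild;
        return rows * (parentCols + childCols + grandchildCols);
    }

    // Values returned when each level is fetched on its own: nothing is repeated.
    public static long SeparateValues(
        int children, int grandchildrenPerChild,
        int parentCols, int childCols, int grandchildCols)
    {
        long grandchildren = (long)children * grandchildrenPerChild;
        return parentCols + (long)children * childCols + grandchildren * grandchildCols;
    }

    // Round trips for the "many loads" approach: one query for the parent,
    // one for its children, and one per child for that child's grandchildren.
    public static int SeparateRoundTrips(int children)
    {
        return 1 + 1 + children;
    }

    public static void Main()
    {
        // Assumed shape: 50 children with 20 grandchildren each,
        // and 30 / 15 / 10 columns per level.
        Console.WriteLine(FlattenedValues(50, 20, 30, 15, 10));
        Console.WriteLine(SeparateValues(50, 20, 30, 15, 10));
        Console.WriteLine(SeparateRoundTrips(50));
    }
}
```

With these assumed numbers, the flattened result carries 55000 values against 10780 for the separate loads, at the price of 52 round trips instead of one. Which side wins depends on row width, fan-out, and network latency, so it is worth measuring both in the real application.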
Value Added Website. An ASP.NET application using Entity Framework for database access.
Different model than the WPF application. Everything is stateless. Each page request uses a new connection to the database, and all entities are disconnected. Change tracking works differently.
You can use ObjectDataSource with methods that accept or return Entity Framework objects, which can speed up your development time.
If you want to assemble objects that come from different business objects into one object graph, they must all come from the same context, or you will get an exception.
"Unit of Work Scope" makes it easy to ensure that you are getting one instance.
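// Uses the ambient unit-of-work context when one is active;
// otherwise lazily creates a private context for this object.
protected DataObjects.Entities db
{
    get
    {
        if (UnitOfWorkScope.CurrentObjectContext != null)
        {
            return UnitOfWorkScope.CurrentObjectContext;
        }

        if (_db == null)
        {
            _db = new DataObjects.Entities();
        }
        return _db;
    }
}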
To use:
using (new Service.UnitOfWorkScope(true))
{
...
}
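The notes do not show how UnitOfWorkScope itself is built. As a rough idea of the pattern, here is a hypothetical sketch of an ambient scope: it creates one context, exposes it through a static property while the scope is alive, and cleans up on Dispose. Everything here is an assumption made for illustration: FakeContext stands in for the generated ObjectContext, the bool is assumed to mean "save changes when the scope ends", nesting is simply rejected, and the scope is tracked per thread (a web application would more likely keep it in per-request storage).

```csharp
using System;

// Stand-in for the generated ObjectContext (DataObjects.Entities in the notes).
public sealed class FakeContext : IDisposable
{
    public bool Saved { get; private set; }
    public bool Disposed { get; private set; }

    public void SaveChanges() { Saved = true; }
    public void Dispose() { Disposed = true; }
}

// Hypothetical ambient scope; the real UnitOfWorkScope may differ.
public sealed class UnitOfWorkScopeSketch : IDisposable
{
    // One active scope per thread (an assumption of this sketch).
    [ThreadStatic]
    private static UnitOfWorkScopeSketch _current;

    private readonly FakeContext _context = new FakeContext();
    private readonly bool _saveOnDispose;
    private bool _disposed;

    public UnitOfWorkScopeSketch(bool saveOnDispose)
    {
        if (_current != null)
        {
            throw new InvalidOperationException(
                "Nested scopes are not supported in this sketch.");
        }
        _saveOnDispose = saveOnDispose;
        _current = this;
    }

    // Null when no scope is active, which is the case the db property checks for.
    public static FakeContext CurrentObjectContext
    {
        get { return _current == null ? null : _current._context; }
    }

    public void Dispose()
    {
        if (_disposed)
        {
            return;
        }
        _disposed = true;
        try
        {
            if (_saveOnDispose)
            {
                _context.SaveChanges();
            }
        }
        finally
        {
            // Always release the context and clear the ambient reference.
            _context.Dispose();
            _current = null;
        }
    }
}

public static class ScopeDemo
{
    public static void Main()
    {
        Console.WriteLine(UnitOfWorkScopeSketch.CurrentObjectContext == null);

        using (new UnitOfWorkScopeSketch(true))
        {
            // Every caller inside the scope sees the same context instance.
            FakeContext a = UnitOfWorkScopeSketch.CurrentObjectContext;
            FakeContext b = UnitOfWorkScopeSketch.CurrentObjectContext;
            Console.WriteLine(ReferenceEquals(a, b));
        }

        Console.WriteLine(UnitOfWorkScopeSketch.CurrentObjectContext == null);
    }
}
```

Because every business object inside the using block resolves its context through the same static property, the entities they return share one state tracker and can be combined into one graph.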
Criticisms
Performance. Can be slower than raw data access code, unless you put in the extra time to optimize. However, you can use stored procedures, decide how things are loaded, view the actual queries to be run, and compile LINQ queries.
Not really POCO. Still has a lot of extra methods. Still has connected capabilities, even when you don't want them.
Context can be a pain, as can state management. Mixing and matching (updating some entities while adding others and doing nothing to still others) can be very complicated. MSDN August 2009 has a lot of good information on this.
No easy way to reattach complex object graphs, though there are patterns for doing it for the more simple entities.
Designer is sometimes "too" careful. Removing entities from the EDM designer doesn't always get rid of all their parts. Sometimes you need to go into the XML and manually remove elements. In many cases it can be easier to remove everything and regenerate.
Future
POCO.
Improved n-tier support. Attach and change states.
T4-based code generation. Use built-in templates for common patterns or create your own.
Foreign key properties. Right now, you can't set the foreign key (Id) directly; you can only set the related object. This also helps with n-tier and "mix and match" issues.