Entity Framework 4.1 Virtual Properties

If I have declared an entity relationship in my model as virtual, then there is no need to use the Include statement in my LINQ query, right?
For example, this is my model class:
public class Brand
{
public int BrandID { get; set; }
public string BrandName { get; set; }
public string BrandDesc { get; set; }
public string BrandUrl { get; set; }
public virtual ICollection<Product> Products { get; set; }
}
Now, for the above model class, I don't need to use var brandsAndProduct = pe.Brands.Include("Products").Single(brand => brand.BrandID == 22);
Instead, I can just use the simpler var brandsAndProduct = pe.Brands.Where(brand => brand.BrandID == 22); and I will automatically have the related entities available when accessed.
Am I correct in my understanding?
Also, please tell me in what situations I should prefer one over the other.

You are correct, but more rules must hold for it to really work as expected. If you define your navigation property as virtual, EF will at runtime create a new class (a dynamic proxy) derived from your Brand class and use it instead. This dynamically created class contains the logic to load the navigation property when it is accessed for the first time. This feature is called lazy loading (or, better, transparent lazy loading).
The rules that must be met to make this work:
The navigation property must be virtual, and the class must be public and not sealed so that EF can derive a proxy from it.
Dynamic proxy creation must not be disabled (context.Configuration.ProxyCreationEnabled). It is enabled by default.
Lazy loading must not be disabled (context.Configuration.LazyLoadingEnabled). It is enabled by default.
The entity must be attached to the context (the default if you load the entity from the database) and the context must not be disposed. In other words, lazy loading works only within the lifetime of the context used to load the entity from the database (or the context to which the proxied entity was attached).
The opposite of lazy loading is called eager loading and that is what Include does. If you use Include your navigation property is loaded together with main entity.
Whether to use lazy loading or eager loading depends on your needs and on performance. Include loads all data in a single database query, but it can result in a huge data set when you use a lot of includes or load a lot of entities. If you are sure that you will need a Brand and all its Products for processing, you should use eager loading.
Lazy loading, in turn, is used if you are not sure which navigation properties you will need. For example, if you load 100 brands but only need the products of one brand, there is no need to load the products for all brands in the initial query. The disadvantage of lazy loading is a separate query (database roundtrip) for each navigation property: if you load 100 brands without Include and then access the Products property on each Brand instance, your code will generate another 100 queries to populate these navigation properties. Eager loading would use just a single query, whereas lazy loading uses 101 queries (this is called the N + 1 problem).
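The query counts above can be reproduced with a small simulation. This is only a sketch that does not use EF at all: FakeDatabase, LoadingDemo and all method names are hypothetical stand-ins, and each method call that would hit the database simply increments a counter.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database; it only counts round trips.
public class FakeDatabase
{
    public int QueryCount { get; private set; }

    // One query that returns the brands without their products.
    public List<int> LoadBrandIds(int count)
    {
        QueryCount++;
        return Enumerable.Range(1, count).ToList();
    }

    // One query per brand: what the first access to a lazy Products property costs.
    public List<string> LoadProductsFor(int brandId)
    {
        QueryCount++;
        return new List<string> { "Product of brand " + brandId };
    }

    // One query that returns brands together with products: what Include does.
    public Dictionary<int, List<string>> LoadBrandsWithProducts(int count)
    {
        QueryCount++;
        return Enumerable.Range(1, count)
            .ToDictionary(id => id, id => new List<string> { "Product of brand " + id });
    }
}

public static class LoadingDemo
{
    // Lazy loading: 1 query for the brands plus 1 per brand whose products are touched.
    public static int CountLazyQueries(int brandCount)
    {
        var db = new FakeDatabase();
        foreach (int id in db.LoadBrandIds(brandCount))
        {
            db.LoadProductsFor(id);
        }
        return db.QueryCount;
    }

    // Eager loading: everything arrives in a single query.
    public static int CountEagerQueries(int brandCount)
    {
        var db = new FakeDatabase();
        db.LoadBrandsWithProducts(brandCount);
        return db.QueryCount;
    }
}
```

Calling LoadingDemo.CountLazyQueries(100) gives 101 and LoadingDemo.CountEagerQueries(100) gives 1, matching the numbers in the answer. The simulation says nothing about the size of each result set, which is the other half of the trade-off.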
In more complex scenarios you may find that neither of these strategies performs as you need, and you can use either a third strategy called explicit loading, or separate queries to load the brands and then the products for all the brands you need.
Explicit loading has similar disadvantages to lazy loading, but you must trigger it manually:
context.Entry(brand).Collection(b => b.Products).Load();
The main advantage of explicit loading is the ability to filter the relation. You can call Query() before Load() and apply any filtering or even eager loading of nested relations.

Related

Saving breezejs entities with child/related entities populated to save in a single transaction scope

I'm trying to save a breezejs entity which has a collection of entities within it, a selection of 'choices' if you will.
Something crudely like:
public class Form{
public class Choice{
public string Name {get;set;}
public bool Selected {get;set;}
}
[Key]
public Guid Id{get;set;}
public ICollection<Choice> Choices{get;set;}
}
When breezejs saves the changes to the entities it batches them out to respective odata controllers, one for "Form" and one for "Choice". This would be fine, but I want/need to make the change within a transaction on the server - so ideally I would be able to get a Form model in the Form odata controller which has a collection of Choices populated within it. Then I can make my changes within a single transaction scope.
I spent a few hours digging, but I can't find a way to ask breezejs to 'embed' the collection of 'Choices' within the 'Form' to get a single Post with a fully populated 'Form' model.
Any suggestions?
Thank you!
The current server-side OData controllers from MS don't really support transactions involving multiple entity type saves. (This is a known MS issue, but they have been very slow to address it.)
However, breeze's standard WebApi controller does handle transactions involving multiple entity type saves. And provided that you are using EF, the transition between the two is relatively simple.
See:
http://www.getbreezenow.com/documentation/odata-vs-webapi and
http://www.getbreezenow.com/documentation/aspnet-web-api

Guidance for synchronising reverse associations in Entity Framework 4.1

EF 4.1 synchronises reverse associations when you create your instances. Is there any documentation of or best practices guidance available for this behaviour?
What I mean by synchronising the reverse association is that given:
public class Blog
{
public Blog() { Posts = new List<Post>(); }
public int Id { get; set; }
public ICollection<Post> Posts { get; private set; }
}
public class Post
{
public Blog Blog { get; set; }
public int Id { get; set; }
}
Then after the following lines the Post will have its Blog property set.
var blog = new Blog();
context.Blogs.Add(blog);
blog.Posts.Add(new Post());
I believe, but I'm not sure, that by "synchronising the reverse association" you mean a feature in Entity Framework which is called Relationship Fix-up or Relationship Span and which is responsible for automatically assigning navigation properties between objects in the ObjectContext. This is not specific to EF 4.1; it also exists in older versions.
I don't know of any comprehensive documentation for this feature, but here are a few resources which may give a bit more insight, especially the second one:
A brief definition: http://blogs.msdn.com/b/alexj/archive/2009/04/03/tip-10-understanding-entity-framework-jargon.aspx
A more detailed explanation (Zeeshan Hirani): http://www.daltinkurt.com/upload/dosyalar/file/Diger/entity_framework_learning_guide.pdf (Chapter 3.4 at page 125 - 133)
About situations where one wants to avoid relationship span: http://blogs.msdn.com/b/alexj/archive/2009/04/07/tip-11-avoiding-relationship-span.aspx
Edit
I am not able to give a comprehensive explanation of relationship span and all its impacts. But I can try to give a few examples where I feel reasonably sure that what I say is not completely wrong:
In the answer you have linked in the comment Morteza makes a difference between entities which are derived from EntityObject (only ObjectContext in EF 4.0, not possible with DbContext in EF 4.1) and POCOs (possible with ObjectContext and DbContext).
If you have POCOs then adding a new object to a navigation collection of another object which is already loaded into the context would not attach the new object to the context. This is not surprising because POCOs are, well..., POCOs, which means that they don't know anything about the EF context. Adding an object to a navigation collection is really nothing more than something like List<T>.Add(...). This generic Add method doesn't do any operation on the EF context.
The situation is different with EntityObject and EntityCollection, which both hold references to the context internally and can therefore attach to the context immediately when you add to the collection.
One conclusion from this consideration is that the last code example in your question would not actually set the Blog property in the Post when you use POCOs. But it will be set after you have called DetectChanges or SaveChanges (which calls DetectChanges internally). In this situation DetectChanges (which is probably a very complex method) looks into the context to see which objects are there (it will find the Blog parent object), then runs through the whole object graph (the Posts collection in our case) and checks whether the other objects in the graph (the Post objects) are also in the context. If not, and this is the case in your example, it will attach them to the context in the Added state and (here relationship span comes into play) also fix the navigation properties in the object graph.
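If the reverse navigation property has to be set immediately with POCOs, before DetectChanges runs, one option is to synchronise both ends in the entity itself. The following is only a sketch of that idea and not something EF provides: the AddPost method is a hypothetical helper added to the Blog class from the question.

```csharp
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
    public Blog Blog { get; set; }
}

public class Blog
{
    public Blog() { Posts = new List<Post>(); }

    public int Id { get; set; }
    public ICollection<Post> Posts { get; private set; }

    // Hypothetical helper: sets both ends of the association by hand,
    // so the result does not depend on DetectChanges or SaveChanges.
    public void AddPost(Post post)
    {
        post.Blog = this;
        if (!Posts.Contains(post))
        {
            Posts.Add(post);
        }
    }
}
```

With this in place, blog.AddPost(new Post()) leaves the Post with its Blog property set straight away. Code that calls blog.Posts.Add(...) directly still bypasses the helper, so this only works if callers agree to go through AddPost.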
Another situation where relationship span also acts with POCOs is when you load objects into the context.
For example: If you have a Blog with id = x and a Post with id = y which belongs to this Blog in the database then this code ...
var blog = context.Blogs.Find(x); // no eager loading of the Posts collection!
var post = context.Posts.Find(y); // no eager loading of the Blog property!
would automatically build up the navigation properties in each object, so the Posts collection of the Blog will suddenly contain the post and the Blog property in Post will refer to the blog. This relationship fix-up depends on the fact that the objects are indeed loaded into the context. If you suppress this by using AsNoTracking for example ...
var blog = context.Blogs.AsNoTracking().Where(b => b.Id == x).Single();
var post = context.Posts.AsNoTracking().Where(p => p.Id == y).Single();
... relationship span doesn't work and the navigation properties will stay null.
A last note: Relationship span, as in the example above, only works if the association has a cardinality of 0..1 on at least one end (one-to-one or one-to-many associations). It never works for many-to-many associations. This was recently discussed here (with EF 4.1): EF 4.1 loading filtered child collections not working for many-to-many

Data Repository Organization

So, I'm developing some software, and trying to keep myself using TDD and other best practices.
I'm trying to write tests to define the classes and repository.
Let's say I have the classes, Customer, Order, OrderLine.
Now, do I create the Order class as something like
abstract class Entity {
int ID { get; set; }
}
class Order : Entity {
Customer Customer { get; set; }
List<OrderLine> OrderLines { get; set; }
}
This will serialize nicely, but if I don't care about the OrderLines or the Customer details, it is not as lightweight as one would like. Or do I just store IDs to items and add a function for getting them?
class Order : Entity {
int CustomerID { get; set; }
List<OrderLine> GetOrderLines() {};
}
class OrderLine : Entity {
int OrderID { get; set; }
}
And how would you structure the repository for something like this?
Do I use an abstract CRUD repository with methods GetByID(int), Save(entity), Delete(entity) that each item's repository inherits from and extends with its own specific methods, something like this?
public abstract class RepositoryBase<T, TID> : IRepository<T, TID> where T : AEntity<TID>
{
private static List<T> Entities { get; set; }
public RepositoryBase()
{
Entities = new List<T>();
}
public T GetByID(TID id)
{
return Entities.Where(x => x.Id.Equals(id)).SingleOrDefault();
}
public T Save(T entity)
{
Entities.RemoveAll(x => x.Id.Equals(entity.Id));
Entities.Add(entity);
return entity;
}
public T Delete(T entity)
{
Entities.RemoveAll(x => x.Id.Equals(entity.Id));
return entity;
}
}
What's the 'best practice' here?
Entities
Let's start with the Order entity. An order is an autonomous object, which isn't dependent on a 'parent' object. In domain-driven design this is called an aggregate root; it is the root of the entire order aggregate. The order aggregate consists of the root and several child entities, which are the OrderLine entities in this case.
The aggregate root is responsible for managing the entire aggregate, including the lifetime of the child entities. Other components are not allowed to access the child entities; all changes to the aggregate must go through the root. Also, if the root ceases to exist, so do the children, i.e. order lines cannot exist without a parent order.
The Customer is also an aggregate root. It isn't part of an order, it's only related to an order. If an order ceases to exist, the customer doesn't. And the other way around, if a customer ceases to exist, you'll want to keep the orders for bookkeeping purposes. Because Customer is only related, you'll want to have just the CustomerId in the order.
class Order
{
int OrderId { get; }
int CustomerId { get; set; }
IEnumerable<OrderLine> OrderLines { get; private set; }
}
Repositories
The OrderRepository is responsible for loading the entire Order aggregate, or parts of it, depending on the requirements. It is not responsible for loading the customer. If you need the customer, load it from the CustomerRepository, using the CustomerId from the order.
class OrderRepository
{
Order GetById(int orderId)
{
// implementation details
}
Order GetById(int orderId, OrderLoadOptions loadOptions)
{
// implementation details
}
}
enum OrderLoadOptions
{
All,
ExcludeOrderLines,
// other options
}
If you ever need to load the order lines afterwards, you should use the tell, don't ask principle. Tell the order to load its order lines, and which repository to use. The order will then tell the repository the information it needs to know.
class Order
{
int OrderId { get; }
int CustomerId { get; set; }
IEnumerable<OrderLine> OrderLines { get; private set; }
void LoadOrderLines(IOrderRepository orderRepository)
{
// simplified implementation
this.OrderLines = orderRepository.GetOrderLines(this.OrderId);
}
}
Note that the code uses an IOrderRepository to retrieve the order lines, rather than a separate repository for order lines. Domain-driven design states that there should be a repository for each aggregate root. Methods for retrieving child entities belong in the repository of the root and should only be accessed by the root.
Abstract/base repositories
I have written abstract repositories with CRUD operations myself, but I found that it didn't add any value. Abstraction is useful when you want to pass instances of subclasses around in your code. But what kind of code will accept any BaseRepository implementation as a parameter?
Also, the CRUD operations can differ per entity, making a base implementation useless. Do you really want to delete an order, or just set its status to deleted? If you delete a customer, what will happen to the related orders?
My advice is to keep things simple. Stay away from abstraction and generic base classes. Sure, all repositories share some kind of functionality and generics look cool. But do you actually need it?
I would divide my project up into the relevant parts: Data Transfer Objects (DTO) and Data Access Objects (DAO). I would want the DTOs to be as simple as possible. Terms like POJO (Plain Old Java Object) and POCO (Plain Old CLR Object) are used here; simply put, they are container objects with very little, if any, functionality built into them.
The DTOs are basically the building blocks of the whole application and will marry up the layers. For every object that is modeled in the system, there should be at least one DTO. How you then put these into collections is entirely up to the design of the application. Obviously there are natural one-to-many relationships floating around, such as Customer has many Orders. But the fundamentals of these objects are what they are. For example, an order has a relationship with a customer, but can also stand alone and so needs to be separate from the customer object. All many-to-many relationships should be resolved down into one-to-many relationships, which is easy when dealing with nested classes.
Presumably there should be CRUD objects that appear within the Data Access Objects category. This is where it gets tricky, as you have to manage all the relationships that have been discovered in design and the lifetime models of each. When fetching DTOs back from the DAO, the loading options are essential, as this can mean the difference between your system running like a dog from overly eager loading, and high network traffic from fetching data back and forth between your application and the store by lazy loading.
I won't go into flags and loading options as others here have done all that.
class OrderDAO
{
public OrderDTO Create(OrderDTO order)
{
// Code here creates the actual order and stores it, updating the
// fields in the OrderDTO where necessary, one being the GUID field of the new ID.
// I stress GUID as this makes for better scalability.
return order;
}
}
As you can see the OrderDTO is passed into the Create Method.
For the Create method, when dealing with brand new nested objects, there will have to be some code that marries up the new data with data that has already been stored, for example a customer with old orders and a new order. The system will have to deal with the fact that some of the operations are update statements, whilst others are creates.
However, one piece of the puzzle that is often missed is that of multi-user environments, where DTOs (plain objects) are disconnected from the application and returned to the DAO for CRUD. This usually involves some concurrency control, which can be nasty and can get complicated. A simple mechanism such as a DateTime or version number works here, although when doing CRUD on a nested object you must develop the rules on what gets updated and in what order. Also, if an update fails the concurrency check, you have to decide whether to fail the whole operation or only part of it.
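The version number mechanism mentioned above can be sketched roughly as follows. This is a minimal illustration under assumptions, not a definitive implementation: InMemoryOrderDAO, ConcurrencyException and the Status and Version properties are hypothetical names, and a dictionary stands in for the real store.

```csharp
using System;
using System.Collections.Generic;

public class OrderDTO
{
    public Guid Id { get; set; }
    public string Status { get; set; }
    public int Version { get; set; }
}

// Hypothetical exception type signalling a lost update.
public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

// In-memory stand-in for the real DAO; a dictionary plays the role of the store.
public class InMemoryOrderDAO
{
    private readonly Dictionary<Guid, OrderDTO> _store = new Dictionary<Guid, OrderDTO>();

    public OrderDTO Create(OrderDTO order)
    {
        order.Id = Guid.NewGuid();
        order.Version = 1;
        _store[order.Id] = Copy(order);
        return order;
    }

    public OrderDTO Update(OrderDTO order)
    {
        OrderDTO stored;
        if (!_store.TryGetValue(order.Id, out stored))
        {
            throw new KeyNotFoundException("Unknown order " + order.Id);
        }
        // The caller must present the version it originally read.
        if (stored.Version != order.Version)
        {
            throw new ConcurrencyException("Order was changed by someone else.");
        }
        order.Version++;
        _store[order.Id] = Copy(order);
        return order;
    }

    // Store copies so that callers cannot change stored state by accident.
    private static OrderDTO Copy(OrderDTO o)
    {
        return new OrderDTO { Id = o.Id, Status = o.Status, Version = o.Version };
    }
}
```

If two users read version 1 of the same order and both send an update, the first Update succeeds and bumps the stored version to 2, while the second throws ConcurrencyException because it still carries version 1. What to do with nested objects after such a failure remains the design decision described above.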
Why not create separate Order classes? It sounds to me like you're describing a base Order object, which would contain the basic order and customer information (or maybe not even the customer information), and a separate Order object that has line items in it.
In the past, I've done as Niels suggested, and either used boolean flags or enums to describe optionally loading child objects, lists, etc. In Clean Code, Uncle Bob says that these variables and function parameters are excuses that programmers use to not refactor a class or function into smaller, easier to digest pieces.
As for your class design, I'd say that it depends. I assume that an Order could exist without any OrderLines, but could not exist without a Customer (or at least a way to reference the customer, like Niels suggested). If this is the case, why not create a base Order class and a second FullOrder class. Only FullOrder would contain the list of OrderLines. Following that thought, I'd create separate repositories to handle CRUD operations for Order and FullOrder.
If you are interested in domain driven design (DDD) implementation with POCOs along with explanations take a look at the following 2 posts:
http://devtalk.dk/2009/06/09/Entity+Framework+40+Beta+1+POCO+ObjectSet+Repository+And+UnitOfWork.aspx
http://www.primaryobjects.com/CMS/Article122.aspx
There is also a project called NCommon that implements domain-driven patterns (repository, unit of work, etc.) for various persistence frameworks (NHibernate, Entity Framework, etc.).

Separating data from the UI code with Linq to SQL entities

If it's important to keep data access 'away' from business and presentation layers, what alternatives or approaches can I take so that my LINQ to SQL entities can stay in the data access layer?
So far I seem to be simply duplicating the classes produced by sqlmetal and passing those objects around instead, simply to keep the two layers apart.
For example, I have a table in my DB called Books. If a user is creating a new book via the UI, the Book class generated by sqlmetal seems like a perfect fit although I'm tightly coupling my design by doing so.
What I do is have all my DataAccess (LINQ-to-SQL in your case) in one project, and then have another business project which uses the DataAccess project, thereby segregating the DataAccess project from the UI layer.
In your example for books, my business layer would have a class called Book:
public class Book
{
private IAuthorRepository _authorRepository = new LinqToSqlAuthorRepository();
private IBookRepository _bookRepository = new LinqToSqlBookRepository();
public int BookId { get { return _bookId; }}
private int _bookId;
public virtual string BookName{get;set;}
public virtual string ISBN {get;set;}
// ...Other properties
public Book()
{
// When creating a new book
_bookId = 0;
}
public Book(int id)
{
// For an existing book
_bookId = id;
Load();
}
protected void Load()
{
BookEntity book = _bookRepository.GetBook(BookId);
BookName = book.BookName;
ISBN = book.ISBN;
}
public void Save()
{
BookEntity book = MapEntityFromThisClass();
_bookRepository.Save(book);
}
public Author GetAuthor()
{
return _authorRepository.GetAuthor();
}
}
This then means that your UI is totally separated from the actual data access and that all of your Book logic is contained sensibly within a class.
You can separate this further by using IoC with a container such as Microsoft Unity or Castle, so that you don't have to write = new LinqToSqlXYZ(); and can instead write something along the lines of IoC.Resolve<IBookRepository>(); (depending on your implementation). This then means your Book class is not tied down to LINQ-to-SQL anymore either.
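A closely related variant, sketched here under assumptions, is to hand the repository to Book through its constructor rather than resolving it inside the class. This swaps the service-locator call for constructor injection, which a container such as Unity or Castle can also perform. InMemoryBookRepository is a hypothetical test double, and the interface shape is inferred from the calls made in the Book class earlier in this answer.

```csharp
using System.Collections.Generic;

public class BookEntity
{
    public int BookId { get; set; }
    public string BookName { get; set; }
    public string ISBN { get; set; }
}

// Interface shape assumed from the calls in the original Book class.
public interface IBookRepository
{
    BookEntity GetBook(int bookId);
    void Save(BookEntity book);
}

public class Book
{
    private readonly IBookRepository _bookRepository;

    public int BookId { get; private set; }
    public string BookName { get; set; }
    public string ISBN { get; set; }

    // The repository is supplied from outside, so Book never names LINQ-to-SQL.
    public Book(IBookRepository bookRepository, int id)
    {
        _bookRepository = bookRepository;
        BookId = id;
        BookEntity entity = _bookRepository.GetBook(id);
        BookName = entity.BookName;
        ISBN = entity.ISBN;
    }

    public void Save()
    {
        _bookRepository.Save(new BookEntity { BookId = BookId, BookName = BookName, ISBN = ISBN });
    }
}

// Hypothetical in-memory implementation, e.g. for unit tests.
public class InMemoryBookRepository : IBookRepository
{
    private readonly Dictionary<int, BookEntity> _books = new Dictionary<int, BookEntity>();

    public BookEntity GetBook(int bookId) { return _books[bookId]; }
    public void Save(BookEntity book) { _books[book.BookId] = book; }
}
```

A test can then construct new Book(new InMemoryBookRepository(), id) against pre-seeded data, while production code lets the container supply the LINQ-to-SQL implementation.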
Linq to Sql offers a 1:1 mapping between entities and your database tables. It could be argued that the entities themselves are a level of abstraction away from the database, and that is what you are tied down to.
If you are making a 1:1 duplication of the entities offered up by linq to sql, then it may not be worth having them there, because you are still just as tied to those classes as you are to the entities offered by linq to sql.
By creating another layer, you are also eliminating the benefits of change tracking provided by linq to sql, meaning you have to copy any changes from your classes into the entities provided by linq to sql to perform data operations.
If you would like to abstract away the DataContext type code from any presentation or business layers, and control the interface to your data more tightly, then the repository pattern is good. You can always have your repository return the entity types created by linq to sql, which means you are not duplicating types, you also get change tracking, but you are still keeping the code that controls the DataContext inside the repository.
You may consider projecting the data into a different class for the benefit of your presentation (a view model), or business logic. This is the route I tend to go down, if I want to use linq to sql, but I don't want a 1:1 mapping between the entities and my view models.
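As a rough illustration of such a projection, the sketch below maps entities onto a smaller view model with a LINQ Select. BookListItemViewModel and BookProjections are hypothetical names, and the example runs against an in-memory sequence; against a LINQ to SQL table a projection of this shape would normally be written directly in the query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the entity class that sqlmetal would generate.
public class BookEntity
{
    public int BookId { get; set; }
    public string BookName { get; set; }
    public string ISBN { get; set; }
}

// Hypothetical view model holding only what a list screen needs.
public class BookListItemViewModel
{
    public int BookId { get; set; }
    public string DisplayTitle { get; set; }
}

public static class BookProjections
{
    // Projects entities into view models; the entity type does not leak into the UI.
    public static List<BookListItemViewModel> ToListItems(IEnumerable<BookEntity> books)
    {
        return books
            .Select(b => new BookListItemViewModel
            {
                BookId = b.BookId,
                DisplayTitle = b.BookName + " (" + b.ISBN + ")"
            })
            .ToList();
    }
}
```

The view model is free to differ from the table layout, which is the point of avoiding a 1:1 copy. The cost is the one noted above: changes made to a view model are not tracked and have to be mapped back onto an entity before saving.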

Domain Driven Design (Linq to SQL) - How do you delete parts of an aggregate?

I seem to have gotten myself into a bit of confusion over this whole DDD\LinqToSql business. I am building a system using POCOs and linq to sql, and I have repositories for the aggregate roots.
So, for example, if you had the classes Order->OrderLine, you have a repository for Order but not for OrderLine, as Order is the root of the aggregate. The repository has the delete method for deleting the Order, but how do you delete OrderLines?
You would have thought you had a method on Order called RemoveOrderLine which removed the line from the OrderLines collection, but it also needs to delete the OrderLine from the underlying l2s table. As there isn't a repository for OrderLine, how are you supposed to do it?
Perhaps have specialized public repositories for querying the roots and internal generic repositories that the domain objects actually use to delete stuff within the aggregates?
public class OrderRepository : Repository<Order> {
public Order GetOrderByWhatever();
}
public class Order {
public List<OrderLine> Lines {get; set;} //Will return a readonly list
public void RemoveLine(OrderLine line) {
Lines.Remove(line);
//************* NOW WHAT? *************//
//(new Repository<OrderLine>(uow)).Delete(line) Perhaps??
// But now we have to pass in the UOW and object is not persistent ignorant. AAGH!
}
}
I would love to know what other people have done, as I can't be the only one struggling with this... I hope. Thanks
You call RemoveOrderLine on the Order, which calls the related logic. This does not include making changes to the persisted version of it.
Later on you call a Save/Update method on the repository, which receives the modified order. The specific challenge lies in knowing what has changed in the domain object, for which there are several options (I am sure there are more than the ones I list):
Have the domain object keep track of the changes, which would include keeping track that x needs to be deleted from the order lines. Something similar to the entity tracking might be factored out as well.
Load the persisted version. Have code in the repository that recognizes the differences between the persisted version and the in-memory version, and run the changes.
Load the persisted version. Have code in the root aggregate, that gets you the differences given an original root aggregate.
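The second option, comparing the persisted version with the in-memory version, could be sketched like this. It is only an illustration under assumptions: OrderDiff and FindRemovedLineIds are hypothetical names, order lines are matched by an assumed OrderLineId key, and a real repository would go on to issue deletes for the returned ids.

```csharp
using System.Collections.Generic;
using System.Linq;

public class OrderLine
{
    public int OrderLineId { get; set; }
    public string Product { get; set; }
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public int OrderId { get; set; }
    public IEnumerable<OrderLine> Lines { get { return _lines; } }

    public void AddLine(OrderLine line) { _lines.Add(line); }

    // Only changes the in-memory aggregate; persistence happens later in the repository.
    public void RemoveLine(int orderLineId)
    {
        _lines.RemoveAll(l => l.OrderLineId == orderLineId);
    }
}

public static class OrderDiff
{
    // Returns the ids of lines that exist in the persisted order
    // but are no longer present in the in-memory order.
    public static List<int> FindRemovedLineIds(Order persisted, Order current)
    {
        var currentIds = new HashSet<int>(current.Lines.Select(l => l.OrderLineId));
        return persisted.Lines
            .Select(l => l.OrderLineId)
            .Where(id => !currentIds.Contains(id))
            .ToList();
    }
}
```

The Order class stays persistence ignorant here: RemoveLine only edits the collection, and the repository's Save decides what to delete by calling OrderDiff.FindRemovedLineIds with a freshly loaded copy. The extra load per save is the price of this approach.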
First, you should be exposing interfaces to obtain references to your Aggregate Root (i.e. Order()). Use the Factory pattern to new up an instance of the Aggregate Root (i.e. Order()).
With that said, the methods on your Aggregate Root control access to its related objects, not to the root itself. Also, never expose complex types as public on Aggregate Roots (i.e. the Lines() IList collection in your example). This violates the Law of Demeter, which says you should not "dot walk" your way to methods, such as Order.Lines.Add().
You also violate the rule about clients holding a reference to an internal object of an Aggregate Root. An Aggregate Root can return a reference to an internal object, as long as the external client is not allowed to hold on to that reference. This applies to the OrderLine you pass into RemoveLine(): you cannot allow the external client to control the internal state of your model (i.e. Order() and its OrderLines()). Therefore, you should treat the OrderLine passed in as a new instance and act upon it accordingly.
public interface IOrderRepository
{
Order GetOrderByWhatever();
}
internal interface IOrderLineRepository
{
OrderLines GetOrderLines();
void RemoveOrderLine(OrderLine line);
}
public class Order
{
private IOrderRepository orderRepository;
private IOrderLineRepository orderLineRepository;
internal Order()
{
// constructors should not be exposed in your model.
// Use the Factory method to construct your complex Aggregate
// Roots. And/or use a container factory, like Castle Windsor
orderRepository =
ComponentFactory.GetInstanceOf<IOrderRepository>();
orderLineRepository =
ComponentFactory.GetInstanceOf<IOrderLineRepository>();
}
// you are allowed to expose this Lines property within your domain.
internal IList<OrderLine> Lines { get; set; }
public void RemoveOrderLine(OrderLine line)
{
if (this.Lines.Contains(line))
{
orderLineRepository.RemoveOrderLine(line);
}
}
}
Don't forget your factory for creating new instances of the Order():
public class OrderFactory
{
public Order CreateComponent(Type type)
{
// Create your new Order.Lines() here, if need be.
// Then, create an instance of your Order() type.
}
}
Your external client does have the right to access the IOrderLineRepository directly, via the interface, to obtain a reference to a value object within your Aggregate Root. But I try to block that by forcing all my references to go through the Aggregate Root's methods. So, you could mark the IOrderLineRepository above as internal so that it is not exposed.
I actually group all of my Aggregate Root creations into multiple Factories. I did not like the approach of, "Some aggregate roots will have factories for complex types, others will not". Much easier to have the same logic followed throughout the domain modeling. "Oh, so Sales() is an aggregate root like Order(). There must be a factory for it too."
One final note: if you have a combination, i.e. SalesOrder(), that uses the two models Sales() and Order(), you would use a Service to create and act on that instance of SalesOrder(), as neither the Sales() nor the Order() Aggregate Roots, nor their repositories or factories, own control over the SalesOrder() entity.
I highly, highly recommend this free book by Abel Avram and Floyd Marinescu on Domain-Driven Design (DDD), as it directly answers your questions in a short 100 pages of large print. It also covers how to further decouple your domain entities into modules and such.
Edit: added more code
After struggling with this exact issue, I've found the solution. After looking at what the l2s designer generates, I realized that the solution is in the two-way associations between order and orderline. An order has many orderlines and an orderline has a single order. The solution is to use two-way associations and a mapping attribute called DeleteOnNull (which you can google for complete info). The final thing I was missing was that your entity class needs to register for the Add and Remove events from the l2s EntitySet. In these handlers, you have to set the Order association on the order line to null. You can see an example of this if you look at some of the code that the l2s designer generates.
I know this is a frustrating one, but after days of struggling with it, I've got it working.
As a follow up....
I have switched to using NHibernate (rather than LINQ to SQL), but in effect you don't need the repository for the OrderLine. If you just remove the OrderLine from the collection in Order, it will delete the OrderLine from the database (assuming you have done your mapping correctly).
As I am swapping out with in-memory repositories, if you want to search for a particular order line (without knowing the order parent) you can write a linq to nhibernate query that links order to orderline where orderlineid = the value. That way it works when querying from the db and from in memory. Well there you go...