Doctrine2 Best Practice, Should Entities use Services? - datamapper

I asked a similar question a while back: Using the Data Mapper Pattern, Should the Entities (Domain Objects) know about the Mapper? However, it was generic and I'm really interested in how to accomplish a few things with Doctrine2 specifically.
Here's a simple example model: Each Thing can have a Vote from a User, a User may cast more than one Vote but only the last Vote counts. Because other data (Msssage, etc) is related to the Vote, when the second Vote is placed the original Vote can't just be updated, it needs to be replaced.
Currently Thing has this function:
public function addVote($vote)
{
$vote->entity = $this;
}
And Vote takes care of setting up the relationship:
public function setThing(Model_Thing $thing)
{
$this->thing = $thing;
$thing->votes[] = $this;
}
It seems to me that ensuring a User only has the last Vote counted is something the Thing should ensure, and not some service layer.
So to keep that in the Model, the new Thing function:
public function addVote($vote)
{
foreach($this->votes as $v){
if($v->user === $vote->user){
//remove vote
}
}
$vote->entity = $this;
}
So how do I remove the Vote from within the Domain Model? Should I relax Vote::setThing() to accept a NULL? Should I involve some kind of service layer that Thing can use to remove the vote? Once the votes start accumulating, that foreach is going to be slow - should a service layer be used to allow Thing to search for a Vote without having to load the entire collection?
I'm definitely leaning toward using a light service layer; however, is there a better way to handle this type of thing with Doctrine2, or am I heading in the right direction?

I vote for the service layer. I've often struggled with trying to add as much logic on the Entity itself, and simply frustrated myself. Without access to the EntityManager, you're simply not able to perform query logic, and you'll find yourself using a lot of O(n) operations or lazy-loading entire relationship sets when you only need a few records (which is super lame when compared to all the advantages DQL offers).
If you need some assistance getting over the idea that the Anemic Domain Model is always an anti-pattern, see this presentation by Matthew Weier O'Phinney or this question.
And while I could be misinterpreting the terminology, I'm not completely convinced that Entities have to be the only objects allowed in your Domain Model. I would easily consider that the sum of Entity objects and their Services constitutes the Model. I think the anti-pattern arises when you end up writing a service layer that pays little to no attention to separation of concerns.
I've often flirted with the idea of having all my entity objects proxy some methods to the service layer:
public function addVote($vote)
{
$this->_service->addVoteToThing($vote, $thing);
}
However, since Doctrine does not have any kind callback event system on object hydration, I haven't found an elegant way to inject the service object.

My advice would be to put all the query logic into an EntityRepository and then make an interface out of it sort of like:
class BlogPostRepository extends EntityRepository implements IBlogPostRepository {}
that way you can use the interface in your unit-tests for the service objects and no dependency on the EntityManager is required.

Related

Naming conventions for methods which must be called in a specific order?

I have a class that requires some of its methods to be called in a specific order. If these methods are called out of order then the object will stop working correctly. There are a few asserts in the methods to ensure that the object is in a valid state. What naming conventions could I use to communicate to the next person to read the code that these methods need to be called in a specific order?
It would be possible to turn this into one huge method, but huge methods are a great way to create problems. (There are a 2 methods than can trigger this sequence so 1 huge method would also result in duplication.)
It would be possible to write comments that explain that the methods need to be called in order but comments are less useful then clearly named methods.
Any suggestions?
Is it possible to refactor so (at least some of) the state from the first function is passed as a paramter to the second function, then it's impossible to avoid?
Otherwise, if you have comments and asserts, you're doing quite well.
However, "It would be possible to turn this into one huge method" makes it sound like the outside code doesn't need to access the intermediate state in any way. If so, why not just make one public method, which calls several private methods successively? Something like:
FroblicateWeazel() {
// Need to be in this order:
FroblicateWeazel_Init();
FroblicateWeazel_PerformCals();
FroblicateWeazel_OutputCalcs();
FroblicateWeazel_Cleanup();
}
That's not perfect, but if the order is centralised to that one function, it's fairly easy to see what order they should come in.
Message digest and encryption/decryption routines often have an _init() method to set things up, an _update() to add new data, and a _final() to return final results and tear things back down again.

Should I return IEnumerable<T> or IQueryable<T> from my DAL?

I know this could be opinion, but I'm looking for best practices.
As I understand, IQueryable<T> implements IEnumerable<T>, so in my DAL, I currently have method signatures like the following:
IEnumerable<Product> GetProducts();
IEnumerable<Product> GetProductsByCategory(int cateogoryId);
Product GetProduct(int productId);
Should I be using IQueryable<T> here?
What are the pros and cons of either approach?
Note that I am planning on using the Repository pattern so I will have a class like so:
public class ProductRepository {
DBDataContext db = new DBDataContext(<!-- connection string -->);
public IEnumerable<Product> GetProductsNew(int daysOld) {
return db.GetProducts()
.Where(p => p.AddedDateTime > DateTime.Now.AddDays(-daysOld ));
}
}
Should I change my IEnumerable<T> to IQueryable<T>? What advantages/disadvantages are there to one or the other?
It depends on what behavior you want.
Returning an IList<T> tells the caller that they've received all of the data they've requested
Returning an IEnumerable<T> tells the caller that they'll need to iterate over the result and it might be lazily loaded.
Returning an IQueryable<T> tells the caller that the result is backed by a Linq provider that can handle certain classes of queries, putting the burden on the caller to form a performant query.
While the latter gives the caller a lot of flexibility (assuming your repository fully supports it), it's the hardest to test and, arguably, the least deterministic.
One more thing to think about: where is your paging/sorting support? If you are providing paging support within your repository, returning IEnumerable<T> is fine. If you are paging outside of your repository (like in the controller or service layer) then you really want to use IQueryable<T> because you don't want to load the entire dataset into memory before it's paged.
HUUUUGGGE difference. I see this quite a bit.
You build up an IQueryable before it hits the database. The IQueryable only hits the DB once an eager function is called (.ToList() for example) or you actually try to pull values out. IQueryable = lazy.
An IEnumerable will execute your lambda against the DB right away. IEnumerable = eager.
As for which to use with the Repository pattern, I believe it's eager. I usually see ILists being passed but someone else will need to iron that out for you. EDIT - You usually see IEnumerable instead of IQueryable because you don't want layers past your Repository A) determining when the database hit will happen or B) Adding any logic to the joins outside the Repository
There is a very good LINQ video that I enjoy a lot- it hits more than just IEnumerable v IQueryable, but it really has some fantastic insight.
http://channel9.msdn.com/posts/matthijs/LINQ-Tips-Tricks-and-Optimizations-by-Scott-Allen/
You can use IQueryable and accept that someone could create a scenario where a SELECT N+1 could happen. This is a disadvantage, along with the fact that you may end up with code that is specific to your repository implementation in the layers above your repository. The advantage of this is that you are allowing the delegation common operations like paging and sorting to be expressed outside of your respository, therefore alleviating it of such concerns. It is also more flexible if you need to join the data with other database tables, as the query will remain an expression, so can be added to before its resolved into a query and hits the database.
The alternative is to lock down your repository so that it returns materialised lists by calling ToList(). With the example of paging and sorting, you will need to pass in skip, take and a sort expression as parameters to the methods of your repository, and use the parameters to return only a window of results. This means that the repository is taking on the responsibility of paging and sorting, and all of the projection of your data.
This is a bit of a judgement call, do you give your application the power of linq, and have less complexity in the repository, or do you control your data access. For me it depends on the number of queries associated with each entity, and combinations of entities, and where I want to manage that complexity.

Should repositories expose IQueryable to service layer or perform filtering in the implementation?

I'm trying to decide on the best pattern for data access in my MVC application.
Currently, having followed the MVC storefront series, I am using repositories, exposing IQueryable to a service layer, which then applies filters. Initially I have been using LINQtoSQL e.g.
public interface IMyRepository
{
IQueryable<MyClass> GetAll();
}
Implemented in:
public class LINQtoSQLRepository : IMyRepository
{
public IQueryable<MyClass> GetAll()
{
return from table in dbContext.table
select new MyClass
{
Field1 = table.field1,
... etc.
}
}
}
Filter for IDs:
public static class TableFilters
{
public static MyClass WithID(this IQueryable<MyClass> qry, string id)
{
return (from t in qry
where t.ID == id
select t).SingleOrDefault();
}
}
Called from service:
public class TableService
{
public MyClass RecordsByID(string id)
{
return _repository.GetAll()
.WithID(id);
}
}
I ran into a problem when I experimented with implementing the repository using Entity Framework with LINQ to Entities. The filters class in my project contains some more complex operations than the "WHERE ... == ..." in the example above, which I believe require different implementations depending on the LINQ provider. Specifically I have a requirement to perform a SQL "WHERE ... IN ..." clause. I am able to implement this in the filter class using:
string[] aParams = // array of IDs
qry = qry.Where(t => aParams.Contains(t.ID));
However, in order to perform this against Entity Framework, I need to provide a solution such as the BuildContainsExpression which is tied to the Entity Framework. This means I have to have 2 different implementations of this particular filter, depending on the underlying provider.
I'd appreciate any advice on how I should proceed from here.
It seemed to me that exposing an IQueryable from my repository, would allow me to perform filters on it regardless of the underlying provider, enabling me to switch between providers if and when required. However the problem I describe above makes me think I should be performing all my filtering within the repositories and returning IEnumerable, IList or single classes.
Many thanks,
Matt
This is a very popular question. One that I constantly ask myself. I've always felt it best to return IEnumerable rather than IQueryable from a repository.
The purpose of a repository is to encapsulate the database infrastructure so the client need not worry about the data source. However, if you return IQueryable you are at the mercy of the consumer as to what kind of query will get run against your db, and whether they will do something that the LINQ provider doesn't support.
Take paging for example. Lets say you have a Customer entity and your database could have hundreds of thousands of customers. Which code would you rather have your client write?
var customers = repos.GetCustomers().Skip(skipCount).Take(pageSize).ToList();
OR
var customers = repos.GetCustomers(pageIndex, pageSize);
In the first approach you make it impossible for the repository to restrict the number of records retrieved from the data source. Also, your consumer has to calculate the skipCount.
In the second approach you provide a more coarse grained interface to your client. Now your repository can enforce some constraints on the pageSize in order to optimize the query. You also encapsulate the calculation of the skipCount.
However, that being said, in your situation your client is your service. So I suppose the question really comes down to a separation of concerns. Where is it better to perform such validation logic? Well that answer may very well be "in the service". But what about the answer to "Where is it better to contain query logic?". To me the answer is clearly "The Repository". That is its intended area of expertise.

When working with domain models and POCO classes, where do queries go?

I am new to domain models, POCO and DDD, so I am still trying to get my head around a few ideas.
One of the things I could not figure out yet is how to keep my domain models simple and storage-agnostic but still capable of performing some queries over its data in a rich way.
For instance, suppose that I have an entity Order that has a collection of OrdemItems. I want to get the cheapest order item, for whatever reason, or maybe a list of order items that are not currently in stock. What I don't want to do is to retrieve all order items from storage and filter later (too expensive) so I want to end up having a db query of the type "SELECT .. WHERE ITEM.INSTOCK=FALSE" somehow. I don't want to have that SQL query in my entity, or any variation of if that would tie me into a specific platform, like NHibernate queries on Linq2SQL. What is the common solution in that case?
Entities are the "units" of a domain. Repositories and services reference them, not vice versa. Think about it this way: do you carry the DMV in your pocket?
OrderItem is not an aggregate root; it should not be accessible through a repository. Its identity is local to an Order, meaning an Order will always be in scope when talking about OrderItems.
The difficulty of finding a home for the queries leads me to think of services. In this case, they would represent something about an Order that is hard for an Order itself to know.
Declare the intent in the domain project:
public interface ICheapestItemService
{
OrderItem GetCheapestItem(Order order);
}
public interface IInventoryService
{
IEnumerable<OrderItem> GetOutOfStockItems(Order order);
}
Declare the implementation in the data project:
public class CheapestItemService : ICheapestItemService
{
private IQueryable<OrderItem> _orderItems;
public CheapestItemService(IQueryable<OrderItem> orderItems)
{
_orderItems = orderItems;
}
public OrderItem GetCheapestItem(Order order)
{
var itemsByPrice =
from item in _orderItems
where item.Order == order
orderby item.Price
select item;
return itemsByPrice.FirstOrDefault();
}
}
public class InventoryService : IInventoryService
{
private IQueryable<OrderItem> _orderItems;
public InventoryService(IQueryable<OrderItem> orderItems)
{
_orderItems = orderItems;
}
public IEnumerable<OrderItem> GetOutOfStockItems(Order order)
{
return _orderItems.Where(item => item.Order == order && !item.InStock);
}
}
This example works with any LINQ provider. Alternatively, the data project could use NHibernate's ISession and ICriteria to do the dirty work.
Domain objects should be independent of storage, you should use the Repostiory pattern, or DAO to persist the objects. That way you are enforcing separation of concerns, the object itself should not know about how it is stored.
Ideally, it would be a good idea to put query construction inside of the repository, though I would use an ORM inside there.
Here's Martin Fowler's definition of the Repository Pattern.
As I understand this style of design, you would encapsulate the query in a method of an OrderItemRepository (or perhaps more suitably OrderRepository) object, whose responsibility is to talk to the DB on one side, and return OrderItem objects on the other side. The Repository hides details of the DB from consumers of OrderItem instances.
I would argue that it doesn't make sense to talk about "an Order that contains only the OrderItems that are not in stock". An "Order" (I presume) represents the complete list of whatever the client ordered; if you're filtering that list you're no longer dealing with an Order per se, you're dealing with a filtered list of OrderItems.
I think the question becomes whether you really want to treat Orders as an Aggregate Root, or whether you want to be able to pull arbitrary lists of OrderItems out of your data access layer as well.
You've said filtering items after they've come back from the database would be too expensive, but unless you're averaging hundreds or thousands of OrderItems for each order (or there's something else especially intensive about dealing with lots of OrderItems) you may be trying to optimize prematurely and making things more difficult than they need to be. I think if you can leave Order as the aggregate root and filter in your domain logic, your model will be cleaner to work with.
If that's genuinely not the case and you need to filter in the database, then you may want to consider having a separate OrderItem repository that would provide queries like "give me all of the OrderItems for this Order that are not in stock". You would then return those as an IList<OrderItem> (or IEnumerable<OrderItem>), since they're not a full Order, but rather some filtered collection of OrderItems.
In the service layer.

Domain Driven Design (Linq to SQL) - How do you delete parts of an aggregate?

I seem to have gotten myself into a bit of a confusion of this whole DDD\LinqToSql business. I am building a system using POCOS and linq to sql and I have repositories for the aggregate roots.
So, for example if you had the classes Order->OrderLine you have a repository for Order but not OrderLine as Order is the root of the aggregate. The repository has the delete method for deleting the Order, but how do you delete OrderLines?
You would have thought you had a method on Order called RemoveOrderLine which removed the line from the OrderLines collection but it also needs to delete the OrderLine from the underlying l2s table. As there isnt a repository for OrderLine how are you supposed to do it?
Perhaps have specialized public repostories for querying the roots and internal generic repositories that the domain objects actually use to delete stuff within the aggregates?
public class OrderRepository : Repository<Order> {
public Order GetOrderByWhatever();
}
public class Order {
public List<OrderLines> Lines {get; set;} //Will return a readonly list
public RemoveLine(OrderLine line) {
Lines.Remove(line);
//************* NOW WHAT? *************//
//(new Repository<OrderLine>(uow)).Delete(line) Perhaps??
// But now we have to pass in the UOW and object is not persistent ignorant. AAGH!
}
}
I would love to know what other people have done as I cant be the only one struggling with this.... I hope.... Thanks
You call the RemoveOrderLine on the Order which call the related logic. This does not include doing changes on the persisted version of it.
Later on you call a Save/Update method on the repository, that receives the modified order. The specific challenge becomes in knowing what has changed in the domain object, which there are several options (I am sure there are more than the ones I list):
Have the domain object keep track of the changes, which would include keeping track that x needs to be deleted from the order lines. Something similar to the entity tracking might be factored out as well.
Load the persisted version. Have code in the repository that recognizes the differences between the persisted version and the in-memory version, and run the changes.
Load the persisted version. Have code in the root aggregate, that gets you the differences given an original root aggregate.
First, you should be exposing Interfaces to obtain references to your Aggregate Root (i.e. Order()). Use the Factory pattern to new-up a new instance of the Aggregate Root (i.e. Order()).
With that said, the methods on your Aggregate Root contros access to its related objects - not itself. Also, never expose a complex types as public on the aggregate roots (i.e. the Lines() IList collection you stated in the example). This violates the law of decremeter (sp ck), that says you cannot "Dot Walk" your way to methods, such as Order.Lines.Add().
And also, you violate the rule that allows the client to access a reference to an internal object on an Aggregate Root. Aggregate roots can return a reference of an internal object. As long as, the external client is not allowed to hold a reference to that object. I.e., your "OrderLine" you pass into the RemoveLine(). You cannot allow the external client to control the internal state of your model (i.e. Order() and its OrderLines()). Therefore, you should expect the OrderLine to be a new instance to act upon accordingly.
public interface IOrderRepository
{
Order GetOrderByWhatever();
}
internal interface IOrderLineRepository
{
OrderLines GetOrderLines();
void RemoveOrderLine(OrderLine line);
}
public class Order
{
private IOrderRepository orderRepository;
private IOrderLineRepository orderLineRepository;
internal Order()
{
// constructors should be not be exposed in your model.
// Use the Factory method to construct your complex Aggregate
// Roots. And/or use a container factory, like Castle Windsor
orderRepository =
ComponentFactory.GetInstanceOf<IOrderRepository>();
orderLineRepository =
ComponentFactory.GetInstanceOf<IOrderLineRepository>();
}
// you are allowed to expose this Lines property within your domain.
internal IList<OrderLines> Lines { get; set; }
public RemoveOrderLine(OrderLine line)
{
if (this.Lines.Exists(line))
{
orderLineRepository.RemoveOrderLine(line);
}
}
}
Don't forget your factory for creating new instances of the Order():
public class OrderFactory
{
public Order CreateComponent(Type type)
{
// Create your new Order.Lines() here, if need be.
// Then, create an instance of your Order() type.
}
}
Your external client does have the right to access the IOrderLinesRepository directly, via the interface to obtain a reference of a value object within your Aggregate Root. But, I try to block that by forcing my references all off of the Aggregate Root's methods. So, you could mark the IOrderLineRepository above as internal so it is not exposed.
I actually group all of my Aggregate Root creations into multiple Factories. I did not like the approach of, "Some aggregate roots will have factories for complex types, others will not". Much easier to have the same logic followed throughout the domain modeling. "Oh, so Sales() is an aggregate root like Order(). There must be a factory for it too."
One final note is that if have a combination, i.e. SalesOrder(), that uses two models of Sales() and Order(), you would use a Service to create and act on that instance of SalesOrder() as neither the Sales() or Order() Aggregate Roots, nor their repositories or factories, own control over the SalesOrder() entity.
I highly, highly recommend this free book by Abel Avram and Floyd Marinescu on Domain Drive Design (DDD) as it directly answers your questions, in a shrot 100 page large print. Along with how to more decouple your domain entities into modules and such.
Edit: added more code
After struggling with this exact issue, I've found the solution. After looking at what the designer generates with l2sl, I realized that the solution is in the two-way associations between order and orderline. An order has many orderlines and an orderline has a single order. The solution is to use two way associations and a mapping attribute called DeleteOnNull(which you can google for complete info). The final thing I was missing was that your entity class needs to register for Add and Remove events from the l2s entityset. In these handlers, you have to set the Order association on the order line to be null. You can see an example of this if you look at some code that the l2s designer generates.
I know this is a frustrating one, but after days of struggling with it, I've got it working.
As a follow up....
I have switched to using nhibernate (rather than link to sql) but in effect you dont need the repos for the OrderLine. If you just remove the OrderLine from the collection in Order it will just delete the OrderLine from the database (assuming you have done your mapping correctly).
As I am swapping out with in-memory repositories, if you want to search for a particular order line (without knowing the order parent) you can write a linq to nhibernate query that links order to orderline where orderlineid = the value. That way it works when querying from the db and from in memory. Well there you go...