I know this could be opinion, but I'm looking for best practices.
As I understand, IQueryable<T> implements IEnumerable<T>, so in my DAL, I currently have method signatures like the following:
IEnumerable<Product> GetProducts();
IEnumerable<Product> GetProductsByCategory(int categoryId);
Product GetProduct(int productId);
Should I be using IQueryable<T> here?
What are the pros and cons of either approach?
Note that I am planning on using the Repository pattern so I will have a class like so:
public class ProductRepository {
    DBDataContext db = new DBDataContext(<!-- connection string -->);
    public IEnumerable<Product> GetProductsNew(int daysOld) {
        return db.GetProducts()
            .Where(p => p.AddedDateTime > DateTime.Now.AddDays(-daysOld));
    }
}
Should I change my IEnumerable<T> to IQueryable<T>? What advantages/disadvantages are there to one or the other?
It depends on what behavior you want.
Returning an IList<T> tells the caller that they've received all of the data they've requested.
Returning an IEnumerable<T> tells the caller that they'll need to iterate over the result and it might be lazily loaded.
Returning an IQueryable<T> tells the caller that the result is backed by a Linq provider that can handle certain classes of queries, putting the burden on the caller to form a performant query.
While the latter gives the caller a lot of flexibility (assuming your repository fully supports it), it's the hardest to test and, arguably, the least deterministic.
One more thing to think about: where is your paging/sorting support? If you are providing paging support within your repository, returning IEnumerable<T> is fine. If you are paging outside of your repository (like in the controller or service layer) then you really want to use IQueryable<T> because you don't want to load the entire dataset into memory before it's paged.
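To make the paging point concrete, here is a minimal Python sketch. A Python generator stands in for a deferred query; `fetch_rows`, `rows_read` and the page size are illustrative assumptions, not part of any real data layer. It counts how many rows are read when a page is taken from a lazy source versus a fully loaded list.

```python
from itertools import islice

rows_read = 0  # counts how many rows the fake "database" has produced


def fetch_rows(total):
    """Hypothetical lazy data source: yields one row at a time."""
    global rows_read
    for i in range(total):
        rows_read += 1
        yield {"id": i}


def page(source, skip, take):
    """Take one page from any iterable without consuming the rest."""
    return list(islice(source, skip, skip + take))


# Paging a lazy source: only the rows up to the end of the page are read.
rows_read = 0
lazy_page = page(fetch_rows(1000), skip=20, take=10)
lazy_reads = rows_read

# Paging after materializing: every row is read before the page is cut.
rows_read = 0
eager_page = page(list(fetch_rows(1000)), skip=20, take=10)
eager_reads = rows_read

print(lazy_reads, eager_reads)
```

Both calls return the same ten rows, but the second one reads the whole data set first. The analogy is loose: a generator still walks the skipped rows, whereas an IQueryable provider can usually push the skip and take into the SQL itself.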
HUUUUGGGE difference. I see this quite a bit.
You build up an IQueryable before it hits the database. The IQueryable only hits the DB once an eager function is called (.ToList() for example) or you actually try to pull values out. IQueryable = lazy.
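Once a query is exposed as IEnumerable, any further lambda you add runs in memory (LINQ to Objects) instead of being translated to SQL, so the rows are fetched from the DB first and filtered afterwards. Strictly speaking the enumeration is still deferred until you iterate, but for the purpose of composing DB queries you can treat IEnumerable = eager.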
As for which to use with the Repository pattern, I believe it's eager. I usually see ILists being passed but someone else will need to iron that out for you. EDIT - You usually see IEnumerable instead of IQueryable because you don't want layers past your Repository A) determining when the database hit will happen or B) Adding any logic to the joins outside the Repository
There is a very good LINQ video that I enjoy a lot- it hits more than just IEnumerable v IQueryable, but it really has some fantastic insight.
http://channel9.msdn.com/posts/matthijs/LINQ-Tips-Tricks-and-Optimizations-by-Scott-Allen/
You can use IQueryable and accept that someone could create a scenario where a SELECT N+1 could happen. This is a disadvantage, along with the fact that you may end up with code that is specific to your repository implementation in the layers above your repository. The advantage is that you allow common operations like paging and sorting to be expressed outside of your repository, relieving it of such concerns. It is also more flexible if you need to join the data with other database tables, as the query remains an expression and can be added to before it's resolved into a query and hits the database.
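For readers unfamiliar with the SELECT N+1 problem mentioned here, the following Python sketch shows the shape of it. `FakeDb` and its methods are hypothetical stand-ins that only count round trips; no real database or ORM is involved.

```python
class FakeDb:
    """Hypothetical data source that counts round trips."""

    def __init__(self):
        self.queries = 0
        self._invoices = {1: [100, 250], 2: [75], 3: []}

    def customer_ids(self):
        self.queries += 1
        return list(self._invoices)

    def invoices_for(self, customer_id):
        self.queries += 1
        return list(self._invoices[customer_id])

    def invoices_for_many(self, customer_ids):
        self.queries += 1
        return {cid: list(self._invoices[cid]) for cid in customer_ids}


# N+1: one query for the customers, then one more per customer.
db = FakeDb()
totals_n_plus_1 = {cid: sum(db.invoices_for(cid)) for cid in db.customer_ids()}
n_plus_1_queries = db.queries

# Batched: two queries regardless of how many customers there are.
db = FakeDb()
ids = db.customer_ids()
batch = db.invoices_for_many(ids)
totals_batched = {cid: sum(batch[cid]) for cid in ids}
batched_queries = db.queries
```

Both versions compute the same totals. The difference is that the first issues a round trip per customer, which is what a caller can accidentally trigger when it iterates a lazily loaded relationship.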
The alternative is to lock down your repository so that it returns materialised lists by calling ToList(). With the example of paging and sorting, you will need to pass in skip, take and a sort expression as parameters to the methods of your repository, and use the parameters to return only a window of results. This means that the repository is taking on the responsibility of paging and sorting, and all of the projection of your data.
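A locked-down repository of this kind might look like the Python sketch below. `Product`, `ProductRepository` and the `skip`/`take`/`sort_key` parameter names are assumptions chosen for illustration, and the in-memory list stands in for whatever query a real repository would build.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Product:
    id: int
    name: str
    price: float


class ProductRepository:
    """Hypothetical in-memory repository; a real one would build a DB query."""

    def __init__(self, products):
        self._products = list(products)

    def get_products(self, skip: int, take: int,
                     sort_key: Callable[[Product], object]) -> List[Product]:
        # Sorting and windowing happen inside the repository,
        # and the caller receives a fully materialized list.
        ordered = sorted(self._products, key=sort_key)
        return ordered[skip:skip + take]


repo = ProductRepository([
    Product(1, "widget", 9.5),
    Product(2, "gadget", 3.0),
    Product(3, "doohickey", 7.25),
])
second_cheapest = repo.get_products(skip=1, take=1, sort_key=lambda p: p.price)
```

The caller can only ask for what the method signature allows, which is the trade-off the paragraph describes: less flexibility above the repository, more responsibility inside it.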
This is a bit of a judgement call: do you give your application the power of LINQ and have less complexity in the repository, or do you control your data access? For me it depends on the number of queries associated with each entity, and combinations of entities, and where I want to manage that complexity.
Using Linq I would like to return an object that contains customers and invoices they have.
I understand returning a single type from a method:
public IQueryable<customers> GetCustomers()
{
    return from c in customers
           select c;
}
But I am having trouble figuring out multiple objects:
public IQueryable<???> GetCustomersWithInvoices()
{
    return from c in customers
           from inv in c.invoices
           select new {c, inv}; // or I may specify columns, but rather not.
}
I have a feeling I am approaching this the wrong way. The goal is to call these objects from a controller and pass them up to a view, either direct or using a formViewModel class.
In the second case you are creating an anonymous type, which has method scope. To pass an anonymous type outside the method boundary you need to change the return type to object. This, however, defeats the purpose of the anonymous type (you lose the strong typing it provides) and requires reflection to access the type's properties and their values.
If you want to maintain this structure as your return type you should create a class or struct consisting of properties to hold the customer and invoice values.
You cannot return an anonymous type from a function; they are strictly "inline" classes. You will need to create a concrete type to hold your members if you want to encapsulate them in a function.
Using a view model, as you mentioned, would be a good place to put them.
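As a language-neutral illustration of the "create a concrete type" advice, here is a small Python sketch. `Customer`, `CustomerInvoice` and `get_customers_with_invoices` are hypothetical names and the data is in memory; the point is only that a named result type can cross the function boundary where an inline, unnamed shape cannot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    name: str
    invoices: List[float]


@dataclass(frozen=True)
class CustomerInvoice:
    """Named result type that can cross the method boundary."""
    customer_name: str
    invoice_total: float


def get_customers_with_invoices(customers: List[Customer]) -> List[CustomerInvoice]:
    # One result row per (customer, invoice) pair, like the nested "from" clauses.
    return [
        CustomerInvoice(c.name, inv)
        for c in customers
        for inv in c.invoices
    ]


rows = get_customers_with_invoices([
    Customer("Ann", [10.0, 25.0]),
    Customer("Bob", []),
])
```

A view model plays the same role in the controller and view scenario from the question: it is the named shape the caller and the view both agree on.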
Here is a scottgu article about anonymous types. From the conclusion of the article:
"Anonymous types are a convenient language feature that enable developers to concisely define inline CLR types within code, without having to explicitly provide a formal class declaration of the type. Although they can be used in lots of scenarios, there are particularly useful when querying and transforming/shaping data with LINQ."
There's some good discussion in the comment thread on that page.
If you really want to, you can do this, but it is rather awkward.
public IQueryable<T> GetCustomersWithInvoices<T>(T exampleObject)
{
    // The cast only succeeds if T is the same anonymous type the query produces
    // (same property names, types and order, declared in the same assembly).
    return (IQueryable<T>)(from c in customers
                           from inv in c.invoices
                           select new {c, inv});
}
var exampleObject = new {
    c = new Customer(),
    inv = new Invoice()
};
var returnedObjectOfAnonymousType = GetCustomersWithInvoices(exampleObject);
In this way, you can take advantage of type inference to get your method to return an anonymous type. You have to use this ugly method of passing in an example object to get it to work. I don't really recommend that you do this, but I believe that this is the only way to do it.
When you're in a situation where you need to return two things in a single method, what is the best approach?
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
The options I have come up with:
1. Use global variables to hold returns. I personally try to avoid globals where I can.
2. Pass in two empty variables as parameters, then assign the variables inside the method, which is now void. I don't like the idea of methods that have side effects.
3. Return a collection that contains two variables. This can lead to confusing code.
4. Build a container class to hold the double return. This is more self-documenting than a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
This is not entirely language-agnostic: in Lisp, you can actually return any number of values from a function, including (but not limited to) none, one, two, ...
(defun returns-two-values ()
(values 1 2))
The same thing holds for Scheme and Dylan. In Python, I would actually use a tuple containing 2 values like
def returns_two_values():
    return (1, 2)
As others have pointed out, you can return multiple values using the out parameters in C#. In C++, you would use references.
void
returns_two_values(int& v1, int& v2)
{
v1 = 1; v2 = 2;
}
In C, your method would take pointers to locations, where your function should store the result values.
void
returns_two_values(int* v1, int* v2)
{
*v1 = 1; *v2 = 2;
}
For Java, I usually use either a dedicated class or a pretty generic little helper (currently, there are two in my private "commons" library: Pair<F,S> and Triple<F,S,T>, both nothing more than simple immutable containers for two and three values, respectively).
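The generic helper described here is easy to sketch. The version below is in Python rather than Java, and `Pair`, `min_and_max` and the field names `first`/`second` are assumptions for illustration, not the contents of the author's private library.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

F = TypeVar("F")
S = TypeVar("S")


@dataclass(frozen=True)
class Pair(Generic[F, S]):
    """Immutable holder for two values of possibly different types."""
    first: F
    second: S


def min_and_max(values) -> "Pair[int, int]":
    # Both results come back in one immutable object.
    return Pair(min(values), max(values))


result = min_and_max([4, 1, 9])
```

Because the dataclass is frozen, assigning to `result.first` raises an error, which matches the "simple immutable container" idea.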
I would create data transfer objects. If it is a group of information (first and last name) I would make a Name class and return that. #4 is the way to go. It seems like more work up front (which it is), but makes it up in clarity later.
If it is a list of records (rows in a database) I would return a Collection of some sort.
I would never use globals unless the app is trivial.
Not my own thoughts (Uncle Bob's):
If there's cohesion between those two variables - I've heard him say, you're missing a class where those two are fields. (He said the same thing about functions with long parameter lists.)
On the other hand, if there is no cohesion, then the function does more than one thing.
I think the most preferred approach is to build a container (may it be a class or a struct - if you don't want to create a separate class for this, struct is the way to go) that will hold all the parameters to be returned.
In the C/C++ world it would actually be quite common to pass two variables by reference (an example, your no. 2).
I think it all depends on the scenario.
Thinking from a C# mentality:
1: I would avoid globals as a solution to this problem, as it is accepted as bad practice.
4: If the two return values are uniquely tied together in some way or form that it could exist as its own object, then you can return a single object that holds the two values. If this object is only being designed and used for this method's return type, then it likely isn't the best solution.
3: A collection is a great option if the returned values are the same type and can be thought of as a collection. However, if the specific example needs 2 items, and each item is its own thing (maybe one represents the beginning of something and the other represents the end) and the returned items are not being used interchangeably, then this may not be the best option.
2: I like this option the best, if 4, and 3 do not make sense for your scenario. As stated in 3, if you wanted to get two objects that represent the beginning and end items of something. Then I would use parameters by reference (or out parameters, again, depending on how it's all being used). This way your parameters can explicitly define their purpose: MethodCall(ref object StartObject, ref object EndObject)
Personally I try to use languages that allow functions to return something more than a simple integer value.
First, you should distinguish what you want: an arbitrary-length return or fixed-length return.
If you want your method to return an arbitrary number of values, you should stick to collection returns, because collections, whatever your language, are designed for exactly that task.
But sometimes you just need to return two values. How does returning two values, when you're sure it's always two values, differ from returning one value? It doesn't, I say! And modern languages, including Perl, Ruby, C++, Python, OCaml etc., allow functions to return tuples, either built in or as third-party syntactic sugar (yes, I'm talking about boost::tuple). It looks like this:
tuple<int, int, double> add_multiply_divide(int a, int b) {
    return make_tuple(a+b, a*b, double(a)/double(b));
}
Specifying an "out parameter" is, in my opinion, overused because of the limitations of older languages and the habits learned back then. But there are still many cases where it's useful (if your method needs to modify an object passed as a parameter, and that object is not the class that contains the method).
The conclusion is that there's no generic answer: each situation has its own solution. But there is one common thing: a function returning several items does not violate any paradigm. That's a language limitation that was later somehow transferred to the human mind.
Python (like Lisp) also allows you to return any number of values from a function, including (but not limited to) none, one, two
def quadcube(x):
    return x**2, x**3
a, b = quadcube(3)
Some languages make doing #3 native and easy. Example: Perl. "return ($a, $b);". Ditto Lisp.
Barring that, check if your language has a collection suited to the task, ala pair/tuple in C++
Barring that, create a pair/tuple class and/or collection and re-use it, especially if your language supports templating.
If your function has return value(s), it's presumably returning it/them for assignment to either a variable or an implied variable (to perform operations on, for instance.) Anything you can usefully express as a variable (or a testable value) should be fair game, and should dictate what you return.
Your example mentions a row or a set of rows from a SQL query. Then you reasonably should be ready to deal with those as objects or arrays, which suggests an appropriate answer to your question.
When you're in a situation where you need to return two things in a single method, what is the best approach?
It depends on WHY you are returning two things.
Basically, as everyone here seems to agree, #2 and #4 are the two best answers...
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
If the two pieces of data from the database are related, such as a customer's First Name and Last Name, I would indeed still consider this to be doing "one thing."
On the other hand, suppose you have come up with a strange SELECT statement that returns your company's gross sales total for a given date, and also reads the name of the customer that placed the first sale for today's date. Here you're doing two unrelated things!
If it's really true that performance of this strange SELECT statement is much better than doing two SELECT statements for the two different pieces of data, and both pieces of data really are needed on a frequent basis (so that the entire application would be slower if you didn't do it that way), then using this strange SELECT might be a good idea - but you better be prepared to demonstrate why your way really makes a difference in perceived response time.
The options I have come up with:
1 Use global variables to hold returns. I personally try and avoid globals where I can.
There are some situations where creating a global is the right thing to do. But "returning two things from a function" is not one of those situations. Doing it for this purpose is just a Bad Idea.
2 Pass in two empty variables as parameters then assign the variables inside the method, which now is a void.
Yes, that's usually the best idea. This is exactly why "by reference" (or "output", depending on which language you're using) parameters exist.
I don't like the idea of methods that have a side effects.
Good theory, but you can take it too far. What would be the point of calling SaveCustomer() if that method didn't have a side-effect of saving the customer's data?
By Reference parameters are understood to be parameters that contain returned data.
3 Return a collection that contains two variables. This can lead to confusing code.
True. It wouldn't make sense, for instance, to return an array where element 0 was the first name and element 1 was the last name. This would be a Bad Idea.
4 Build a container class to hold the double return. This is more self-documenting than a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
Yes and no. As you say, I wouldn't want to create an object called FirstAndLastNames just to be used by one method. But if there was already an object which had basically this information, then it would make perfect sense to use it here.
If I was returning two of the exact same thing, a collection might be appropriate, but in general I would usually build a specialized class to hold exactly what I needed.
And if you are returning two things today from those two columns, tomorrow you might want a third. Maintaining a custom object is going to be a lot easier than any of the other options.
Use var/out parameters or pass variables by reference, not by value. In Delphi:
function ReturnTwoValues(out Param1: Integer):Integer;
begin
Param1 := 10;
Result := 20;
end;
If you use var instead of out, you can pre-initialize the parameter.
With databases, you could have an out parameter per column and the result of the function would be a boolean indicating if the record is retrieved correctly or not. (Although I would use a single record class to hold the column values.)
As much as it pains me to do it, I find the most readable way to return multiple values in PHP (which is what I work with, mostly) is using an (associative) array, like this:
function doStuff($someThing)
{
// do stuff
$status = 1;
$message = 'it worked, good job';
return array('status' => $status, 'message' => $message);
}
Not pretty, but it works and it's not terribly difficult to figure out what's going on.
I generally use tuples. I mainly work in C# and it's very easy to design generic tuple constructs. I assume it would be very similar for most languages which have generics. As an aside, 1 is a terrible idea, and 3 only works when you are getting two returns that are the same type unless you work in a language where everything derives from the same basic type (i.e. object). 2 and 4 are also good choices. 2 doesn't introduce any side effects a priori, it's just unwieldy.
Use std::vector, QList, or some managed library container to hold however many X you want to return:
QList<X> getMultipleItems()
{
QList<X> returnValue;
for (int i = 0; i < countOfItems; ++i)
{
returnValue.push_back(<your data here>);
}
return returnValue;
}
For the situation you described, pulling two fields from a single table, the appropriate answer is #4 given that two properties (fields) of the same entity (table) will exhibit strong cohesion.
Your concern that "it might be confusing to create a class just for the purpose of a return" is probably not that realistic. If your application is non-trivial you are likely going to need to re-use that class/object elsewhere anyway.
You should also consider whether the design of your method is primarily returning a single value, and you are getting another value for reference along with it, or if you really have a single returnable thing like first name - last name.
For instance, you might have an inventory module that queries the number of widgets you have in inventory. The return value you want to give is the actual number of widgets. However, you may also want to record how often someone is querying inventory and return the number of queries so far. In that case it can be tempting to return both values together. However, remember that you have class vars available for storing data, so you can store an internal query count, and not return it every time, then use a second method call to retrieve the related value. Only group the two values together if they are truly related. If they are not, use separate methods to retrieve them separately.
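The inventory example can be sketched in a few lines of Python. The class and method names (`Inventory`, `widget_count`, `query_count`) are made up for illustration; the point is that the unrelated counter lives on the instance and gets its own accessor instead of riding along in the return value.

```python
class Inventory:
    """Hypothetical inventory module that tracks how often it is queried."""

    def __init__(self, widgets: int):
        self._widgets = widgets
        self._query_count = 0

    def widget_count(self) -> int:
        # Primary result: the number of widgets in stock.
        self._query_count += 1
        return self._widgets

    def query_count(self) -> int:
        # Secondary information is kept on the instance and read separately.
        return self._query_count


inventory = Inventory(widgets=42)
inventory.widget_count()
inventory.widget_count()
```

After the two calls above, `inventory.query_count()` reports 2, and callers that only care about stock levels never see the counter.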
Haskell also allows multiple return values using built in tuples:
sumAndDifference :: Int -> Int -> (Int, Int)
sumAndDifference x y = (x + y, x - y)
> let (s, d) = sumAndDifference 3 5 in s * d
-16
Because Haskell is a pure language, options 1 and 2 are not available.
Even using a state monad, the return value contains (at least conceptually) a bag of all relevant state, including any changes the function just made. It's just a fancy convention for passing that state through a sequence of operations.
I will usually opt for approach #4, as I prefer the clarity of knowing that what the function produces or calculates is its return value (rather than byref parameters). It also lends itself to a rather "functional" style in program flow.
The disadvantage of option #4 with generic tuple classes is it isn't much better than returning a collection (the only gain is type safety).
public IList CalculateStuffCollection(int arg1, int arg2)
public Tuple<int, int> CalculateStuffTuple(int arg1, int arg2)
var resultCollection = CalculateStuffCollection(1,2);
var resultTuple = CalculateStuffTuple(1,2);
resultCollection[0] // Was it index 0 or 1 I wanted?
resultTuple.A // Was it A or B I wanted?
I would like a language that allowed me to return an immutable tuple of named variables (similar to a dictionary, but immutable, typesafe and statically checked). But, sadly, such an option isn't available to me in the world of VB.NET, it may be elsewhere.
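For what it's worth, Python's `typing.NamedTuple` comes fairly close to this wish, with the caveat that the type annotations are only enforced by an external checker such as mypy and not at runtime. A minimal sketch, where `SumAndProduct` and `calculate_stuff` are illustrative names:

```python
from typing import NamedTuple


class SumAndProduct(NamedTuple):
    total: int
    product: int


def calculate_stuff(a: int, b: int) -> SumAndProduct:
    # The caller reads the results by name instead of by position.
    return SumAndProduct(total=a + b, product=a * b)


result = calculate_stuff(3, 4)
total, product = result  # still unpacks like a plain tuple
```

Here `result.total` and `result.product` answer the "was it A or B I wanted?" question, and attempting to assign to either field raises an error because the tuple is immutable.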
I dislike option #2 because it breaks that "functional" style and forces you back into a procedural world (when often I don't want to do that just to call a simple method like TryParse).
I have sometimes used continuation-passing style to work around this, passing a function value as an argument, and returning that function call passing the multiple values.
In languages without first-class functions, objects can be used in place of function values.
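A minimal Python sketch of the continuation-passing idea, assuming a hypothetical `divide_with_remainder` function: instead of returning two values, the function hands both to a callback supplied by the caller and returns whatever the callback returns.

```python
def divide_with_remainder(dividend, divisor, continuation):
    """Pass both results to the continuation instead of returning a pair."""
    quotient, remainder = divmod(dividend, divisor)
    return continuation(quotient, remainder)


# The caller names the two results through the callback's parameters.
description = divide_with_remainder(
    17, 5, lambda q, r: f"quotient={q}, remainder={r}"
)
```

The callback's parameter list documents what the two values are, which is the readability gain this style offers over an anonymous pair.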
My choice is #4. Define a reference parameter in your function. That parameter references a Value Object.
In PHP:
class TwoValuesVO {
public $expectedOne;
public $expectedTwo;
}
/* parameter $_vo references a TwoValuesVO instance */
function twoValues( & $_vo ) {
    $_vo->expectedOne = 1;
    $_vo->expectedTwo = 2;
}
In Java:
class TwoValuesVO {
public int expectedOne;
public int expectedTwo;
}
class TwoValuesTest {
void twoValues( TwoValuesVO vo ) {
vo.expectedOne = 1;
vo.expectedTwo = 2;
}
}
I am very interested in Linq to SQL with Lazy load feature. And in my project I used AutoMapper to map DB Model to Domain Model (from DB_RoleInfo to DO_RoleInfo). In my repository code as below:
public DO_RoleInfo SelectByKey(Guid Key)
{
return SelectAll().Where(x => x.Id == Key).SingleOrDefault();
}
public IQueryable<DO_RoleInfo> SelectAll()
{
Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>();
return from role in _ctx.DB_RoleInfo
select Mapper.Map<DB_RoleInfo, DO_RoleInfo>(role);
}
The SelectAll method runs fine, but when I call SelectByKey, I get the error:
Method “RealMVC.Data.DO_RoleInfo MapDB_RoleInfo,DO_RoleInfo” could not translate to SQL.
Is it that Automapper doesn't support Linq completely?
Instead of Automapper, I tried the manual mapping code below:
public IQueryable<DO_RoleInfo> SelectAll()
{
return from role in _ctx.DB_RoleInfo
select new DO_RoleInfo
{
Id = role.id,
name = role.name,
code = role.code
};
}
This method works the way I want it to.
While @Aaronaught's answer was correct at the time of writing, the world has changed, as it often does, and AutoMapper with it. In the meantime, QueryableExtensions were added to the code base, which added support for projections that get translated into expressions and, finally, SQL.
The core extension method is ProjectTo [1]. This is what your code could look like:
using AutoMapper.QueryableExtensions;
public IQueryable<DO_RoleInfo> SelectAll()
{
    Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>();
    return _ctx.DB_RoleInfo.ProjectTo<DO_RoleInfo>();
}
and it would behave like the manual mapping. (The CreateMap statement is here for demonstration purposes. Normally, you'd define mappings once at application startup.)
Thus, only the columns that are required for the mapping are queried and the result is an IQueryable that still has the original query provider (linq-to-sql, linq-to-entities, whatever). So it is still composable and this will translate into a WHERE clause in SQL:
SelectAll().Where(x => x.Id == Key).SingleOrDefault();
[1] Project().To<T>() prior to v. 4.1.0
Change your second function to this:
public IEnumerable<DO_RoleInfo> SelectAll()
{
Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>();
return from role in _ctx.DB_RoleInfo.ToList()
select Mapper.Map<DB_RoleInfo, DO_RoleInfo>(role);
}
AutoMapper works just fine with Linq to SQL, but it can't be executed as part of the deferred query. Adding ToList() at the end of your Linq query causes it to immediately evaluate the results, instead of trying to translate the AutoMapper segment as part of the query.
Clarification
The notion of deferred execution (not "lazy load") does not make any sense once you've changed the resulting type to something that's not a data entity. Consider these two classes:
public class DB_RoleInfo
{
public int ID { get; set; }
public string Name { get; set; }
}
public class DO_RoleInfo
{
public Role Role { get; set; } // Enumeration type
}
Now consider the following mapping:
Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>()
    .ForMember(dest => dest.Role, opt => opt.MapFrom(src =>
        (Role)Enum.Parse(typeof(Role), src.Name)));
This mapping is completely fine (unless I made a typo), but let's say you write the SelectAll method in your original post instead of my revised one:
public IQueryable<DO_RoleInfo> SelectAll()
{
Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>();
return from role in _ctx.DB_RoleInfo
select Mapper.Map<DB_RoleInfo, DO_RoleInfo>(role);
}
This actually kind of works, but by calling itself a "queryable", it lies. What happens if I try to write this against it:
public IEnumerable<DO_RoleInfo> SelectSome()
{
return from ri in SelectAll()
where (ri.Role == Role.Administrator) ||
(ri.Role == Role.Executive)
select ri;
}
Think really hard about this. How could Linq to SQL possibly be able to successfully turn your where into an actual database query?
Linq knows nothing about the DO_RoleInfo class. It doesn't know how to do the mapping backward - in some cases, that may not even be possible. Sure, you may look at this code and go "Oh, that's easy, just search for 'Administrator' or 'Executive' in the Name column", but you're the only one who knows that. As far as Linq to SQL is concerned, the query is pure nonsense.
Imagine that somebody gave you these instructions:
Go to the supermarket and bring back the ingredients for making Morton Thompson Turkey.
Unless you've made it before, and most people haven't, your response to that instruction is most likely going to be:
What the hell is that?
You can go to the market, and you can get specific ingredients by name, but you can't evaluate the condition I've given you while you're over there. I have to "un-map" the criteria first. I have to tell you, here are the ingredients we need for this recipe - now go and get them.
To summarize, this is not some simple incompatibility between Linq to SQL and AutoMapper. It is not unique to either of those two libraries. It doesn't matter how you actually do the mapping to a non-entity type - you could just as easily do the mapping manually, and you'd still get the same error, because you are now giving Linq to SQL a set of instructions that are no longer comprehensible, dealing with mysterious classes that don't have an intrinsic mapping to any particular entity type.
This issue is fundamental to the concept of O/R Mapping and deferred query execution. A projection is a one-way operation. Once you project, you can no longer go back to the query engine and say oh by the way, here are some more conditions for you. It's too late. The best you can do is take what it already gave you and evaluate the extra conditions yourself.
Last but not least, I'll leave you with a workaround. If the only thing you want to be able to do from your mapping is filter the rows, you can write this:
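// requires: using System.Linq.Expressions;
public IEnumerable<DO_RoleInfo> SelectRoles(Expression<Func<DB_RoleInfo, bool>> selector)
{
    Mapper.CreateMap<DB_RoleInfo, DO_RoleInfo>();
    return _ctx.DB_RoleInfo
        .Where(selector)   // an expression tree, so the filter is translated to SQL
        .AsEnumerable()    // switch to in-memory LINQ before mapping
        .Select(dbr => Mapper.Map<DB_RoleInfo, DO_RoleInfo>(dbr));
}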
This is a utility method that handles the mapping for you and accepts a filter on the original entity, and not the mapped entity. It might be useful if you have many different kinds of filters but always need to do the same mapping.
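The "filter on the source shape, then project one way" idea is not specific to Linq to SQL. Here is a rough Python sketch of the same utility, with in-memory records standing in for the table; `DbRoleInfo`, `DoRoleInfo`, `Role` and `select_roles` are hypothetical names that mirror the classes discussed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class Role(Enum):
    Administrator = 1
    Executive = 2
    Clerk = 3


@dataclass
class DbRoleInfo:       # stand-in for the storage-side record
    id: int
    name: str


@dataclass
class DoRoleInfo:       # stand-in for the domain-side object
    role: Role


def select_roles(rows: Iterable[DbRoleInfo],
                 selector: Callable[[DbRoleInfo], bool]) -> List[DoRoleInfo]:
    # Filter on the source shape first, then project one way into the domain shape.
    return [DoRoleInfo(Role[r.name]) for r in rows if selector(r)]


table = [DbRoleInfo(1, "Administrator"), DbRoleInfo(2, "Clerk")]
admins = select_roles(table, lambda r: r.name == "Administrator")
```

The predicate is written against `DbRoleInfo`, the shape the data source understands, so nothing ever has to be "un-mapped" from the domain type.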
Personally, I think you will be better off just writing out the queries properly, by first determining what you need to retrieve from the database, then doing any projections/mappings, and then, finally, if you need to do further filtering (which you shouldn't), then materialize the results with ToList() or ToArray() and write more conditions against the local list.
Don't try to use AutoMapper or any other tool to hide the real entities exposed by Linq to SQL. The domain model is your public interface. The queries you write are an aspect of your private implementation. It's important to understand the difference and maintain a good separation of concerns.
I am new to domain models, POCO and DDD, so I am still trying to get my head around a few ideas.
One of the things I could not figure out yet is how to keep my domain models simple and storage-agnostic but still capable of performing some queries over its data in a rich way.
For instance, suppose that I have an entity Order that has a collection of OrderItems. I want to get the cheapest order item, for whatever reason, or maybe a list of order items that are not currently in stock. What I don't want to do is to retrieve all order items from storage and filter later (too expensive), so I want to end up having a db query of the type "SELECT .. WHERE ITEM.INSTOCK=FALSE" somehow. I don't want to have that SQL query in my entity, or any variation of it that would tie me to a specific platform, like NHibernate queries or Linq2SQL. What is the common solution in that case?
Entities are the "units" of a domain. Repositories and services reference them, not vice versa. Think about it this way: do you carry the DMV in your pocket?
OrderItem is not an aggregate root; it should not be accessible through a repository. Its identity is local to an Order, meaning an Order will always be in scope when talking about OrderItems.
The difficulty of finding a home for the queries leads me to think of services. In this case, they would represent something about an Order that is hard for an Order itself to know.
Declare the intent in the domain project:
public interface ICheapestItemService
{
OrderItem GetCheapestItem(Order order);
}
public interface IInventoryService
{
IEnumerable<OrderItem> GetOutOfStockItems(Order order);
}
Declare the implementation in the data project:
public class CheapestItemService : ICheapestItemService
{
private IQueryable<OrderItem> _orderItems;
public CheapestItemService(IQueryable<OrderItem> orderItems)
{
_orderItems = orderItems;
}
public OrderItem GetCheapestItem(Order order)
{
var itemsByPrice =
from item in _orderItems
where item.Order == order
orderby item.Price
select item;
return itemsByPrice.FirstOrDefault();
}
}
public class InventoryService : IInventoryService
{
private IQueryable<OrderItem> _orderItems;
public InventoryService(IQueryable<OrderItem> orderItems)
{
_orderItems = orderItems;
}
public IEnumerable<OrderItem> GetOutOfStockItems(Order order)
{
return _orderItems.Where(item => item.Order == order && !item.InStock);
}
}
This example works with any LINQ provider. Alternatively, the data project could use NHibernate's ISession and ICriteria to do the dirty work.
Domain objects should be independent of storage; you should use the Repository pattern, or a DAO, to persist the objects. That way you enforce separation of concerns: the object itself should not know how it is stored.
Ideally, it would be a good idea to put query construction inside of the repository, though I would use an ORM inside there.
Here's Martin Fowler's definition of the Repository Pattern.
As I understand this style of design, you would encapsulate the query in a method of an OrderItemRepository (or perhaps more suitably OrderRepository) object, whose responsibility is to talk to the DB on one side, and return OrderItem objects on the other side. The Repository hides details of the DB from consumers of OrderItem instances.
I would argue that it doesn't make sense to talk about "an Order that contains only the OrderItems that are not in stock". An "Order" (I presume) represents the complete list of whatever the client ordered; if you're filtering that list you're no longer dealing with an Order per se, you're dealing with a filtered list of OrderItems.
I think the question becomes whether you really want to treat Orders as an Aggregate Root, or whether you want to be able to pull arbitrary lists of OrderItems out of your data access layer as well.
You've said filtering items after they've come back from the database would be too expensive, but unless you're averaging hundreds or thousands of OrderItems for each order (or there's something else especially intensive about dealing with lots of OrderItems) you may be trying to optimize prematurely and making things more difficult than they need to be. I think if you can leave Order as the aggregate root and filter in your domain logic, your model will be cleaner to work with.
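If filtering in the domain logic is acceptable, the queries from the question become ordinary methods on the aggregate. A small Python sketch, assuming hypothetical `Order` and `OrderItem` classes whose items have already been loaded with the order:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OrderItem:
    sku: str
    price: float
    in_stock: bool


@dataclass
class Order:
    items: List[OrderItem] = field(default_factory=list)

    def cheapest_item(self) -> Optional[OrderItem]:
        # Returns None for an empty order instead of raising.
        return min(self.items, key=lambda i: i.price, default=None)

    def out_of_stock_items(self) -> List[OrderItem]:
        return [i for i in self.items if not i.in_stock]


order = Order([
    OrderItem("A-1", 12.0, True),
    OrderItem("B-2", 4.5, False),
    OrderItem("C-3", 8.0, True),
])
```

This keeps Order as the only entry point and involves no storage code at all, at the cost of loading every item of the order first.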
If that's genuinely not the case and you need to filter in the database, then you may want to consider having a separate OrderItem repository that would provide queries like "give me all of the OrderItems for this Order that are not in stock". You would then return those as an IList<OrderItem> (or IEnumerable<OrderItem>), since they're not a full Order, but rather some filtered collection of OrderItems.
In the service layer.