My take on CQRS is that when followed strictly, your commands don't return anything (return type void), which leads to a straightforward question: how do you retrieve an ID when creating something?
For example, when creating a credit card transaction it seems rather important to return a transaction ID, and when creating a customer it would be much easier if you got the created customer (or the customer ID) back, so that a browser could, for example, navigate automatically to that customer's page.
One solution could be to first ask for an ID and then create the customer or transaction with that ID, but that seems pretty weird.
Does anyone have any experience with this, or know how it is best done? Maybe I have misunderstood something?
CQRS is all about fire-and-forget, and since GUIDs are very reliable (low risk of collision) there is no problem sending in a GUID that you generate yourself.
The steps would basically be:
Create your command
Generate and assign your identity (GUID) to it
Fire the command
Return the identity generated earlier (see the sketch below)
Read more about GUIDs on Wikipedia
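A rough sketch of those steps in C# (the command class, ICommandBus, and CustomerService below are hypothetical placeholders I've invented for illustration, not a particular framework's API):

using System;

// Hypothetical command carrying a client-generated identity.
public class CreateCustomerCommand
{
    public Guid CustomerId { get; private set; }
    public string Name { get; private set; }

    public CreateCustomerCommand(Guid customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }
}

// Assumed dispatch abstraction; commands return nothing, per strict CQRS.
public interface ICommandBus
{
    void Send(object command);
}

public class CustomerService
{
    private readonly ICommandBus bus;

    public CustomerService(ICommandBus bus)
    {
        this.bus = bus;
    }

    public Guid CreateCustomer(string name)
    {
        Guid id = Guid.NewGuid();                      // steps 1-2: create command, generate identity
        bus.Send(new CreateCustomerCommand(id, name)); // step 3: fire the command
        return id;                                     // step 4: return the identity generated earlier
    }
}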
Integer IDs, GUIDs, and byte arrays of any size can be reliable enough in practice, but none of them satisfies the theoretical requirement (collisions happen), while a valid theoretical solution exists and can be applied most of the time.
I'd formulate the solution as: when systems co-operate as equals, their identities should be guaranteed by a system of a higher level. The higher-level system is the one which manages the lifetime of the co-operating systems.
Example:
using System;

class John
{
    private readonly int id;

    public John(int id)
    {
        this.id = id;
    }

    public void UseSite(Site site)
    {
        site.CreateAccount(id, "john");
        site.SetPassword(id, "john", "123");
        /* ... */
    }
}

class Site
{
    public void CreateAccount(int humanId, string accName) { /* ... */ }
    public void SetPassword(int humanId, string accName, string pwd) { /* ... */ }
    /* ... */
}

class Program
{
    static void Main(string[] args)
    {
        Site s = new Site();
        // It's easy to guarantee the identity while there's only one John object
        John j = new John(4);
        j.UseSite(s);
        Console.ReadLine();
    }
}
Program is the higher-level module. It is responsible for using John and Site correctly. Providing John with a unique identifier is part of this responsibility.
You will find that it is impossible, or very hard, to deal with the identity of some real-life systems, like a human. That happens when those systems are on the same level as your system. A typical example is a human and a web site. Your site will never have a guarantee that the right human is requesting the page. In that case you should use the probability-based approach with a reliable hash.
This is a common requirement in many web-based projects: an entity has to show information about another, related entity. For example, a book in an e-commerce site has to show relevant information about its author.
Let's say I model both the book and the author as entities. How should I implement a feature which displays a book and its author's information on the same page?
I can make a call to the BookRepo to retrieve the book's information, and then another call to the AuthorRepo to retrieve the author, using the authorId inside the book entity. That is two queries.
I can write a query where I join the Book and Author tables together and retrieve both pieces of information in one query. But which repo does this query go in? Does this break DDD, because I am assuming details about the Book and Author entities?
Which is the 'best practice' here, and what other ways could I approach this problem?
(I am assuming the use of standard SQL queries [such as PHP + MySQL], since in EF 4 you would define associations between Book and Author, which would solve the problem rather easily.)
There is no silver bullet solution to this, but you have a few options.
Your first proposed solution of making a call to two repositories is perfectly valid and happens all the time in practice. For example, it takes over a hundred different services to render an Amazon product page, each responsible for providing data specific to its bounded context. You can create a service called something like BookService which calls the two repositories and returns a reporting object or a DTO that has all the data you need for the particular view. If you feel that the performance cost of two repository calls is going to be an issue, you can employ caching, or CQRS to create appropriate read models, but don't jump to those solutions prematurely.
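As a sketch of such a service (the repository interfaces, domain types, and the DTO are illustrative assumptions, not a prescribed API):

// Illustrative domain types and repository contracts.
public class Book
{
    public int BookId { get; set; }
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public interface IBookRepository { Book GetById(int bookId); }
public interface IAuthorRepository { Author GetById(int authorId); }

// Page-specific DTO: a flat projection, not an editable entity.
public class BookPageDto
{
    public string Title { get; set; }
    public string AuthorName { get; set; }
}

public class BookService
{
    private readonly IBookRepository books;
    private readonly IAuthorRepository authors;

    public BookService(IBookRepository books, IAuthorRepository authors)
    {
        this.books = books;
        this.authors = authors;
    }

    public BookPageDto GetBookPage(int bookId)
    {
        Book book = books.GetById(bookId);              // query 1
        Author author = authors.GetById(book.AuthorId); // query 2, via the id the book holds
        return new BookPageDto { Title = book.Title, AuthorName = author.Name };
    }
}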
But which repo does this query goes?
I would just add it to the BookRepository, or even to a whole new repository called something like BookDetailsReportingRepository, perhaps with a method called GetBookDetails. This method would not return an editable entity, but a reporting object which is a projection of values from multiple entities.
Does this break DDD because I am assuming details about the Book and Author entity?
This does not violate DDD and, in my opinion, makes it easier to apply. Just regard the data returned by the aforementioned repository as reporting objects, not entities.
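For instance, a single-query version of GetBookDetails might look like this sketch (the SQL schema and the ADO.NET usage are my assumptions, not something prescribed by DDD):

using System.Data;

// Read-only projection across Book and Author; not an entity.
public class BookDetails
{
    public string Title { get; set; }
    public string AuthorName { get; set; }
}

public class BookDetailsReportingRepository
{
    private readonly IDbConnection connection;

    public BookDetailsReportingRepository(IDbConnection connection)
    {
        this.connection = connection;
    }

    public BookDetails GetBookDetails(int bookId)
    {
        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.CommandText =
                @"SELECT b.Title, a.Name
                  FROM Book b
                  JOIN Author a ON a.AuthorId = b.AuthorId
                  WHERE b.BookId = @bookId";

            IDbDataParameter p = cmd.CreateParameter();
            p.ParameterName = "@bookId";
            p.Value = bookId;
            cmd.Parameters.Add(p);

            using (IDataReader reader = cmd.ExecuteReader())
            {
                if (!reader.Read())
                    return null; // no such book
                return new BookDetails
                {
                    Title = reader.GetString(0),
                    AuthorName = reader.GetString(1)
                };
            }
        }
    }
}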
But even though you don't seem to be using an ORM, you probably have to populate your entities from your SQL queries, and your entities have to relate to each other in some way (collections, navigation properties, etc.).
If your domain contains entities that have no associations with each other, I wouldn't call it DDD, since you'd lack some important ingredients like aggregates, value objects, and bi-/unidirectional relationships.
But what do I know :-) Maybe you have assembled your puzzle well and this last piece is to merge entities into a "view" that can be useful for your clients.
Since repositories normally operate on an aggregate-root entity, you can have repository methods like ListBooksByAuthor (BookRepository) or ListAuthors (AuthorRepository).
When you want to display complex data from many different aggregates on a web page, I recommend using Data Transfer Objects. Let that DTO be unique to that page or use case, a "view" that carries all (or most of) the data the page needs.
I also recommend NOT using DTOs everywhere unless you're using web services. Using DTOs has both pros and cons. Together with a service layer they give you a nice anti-corruption layer, and also a place to inject the Book and Author repositories. From the service layer you can then assemble and disassemble DTOs (look at AutoMapper or similar tools; they help a lot).
BUT too many DTOs everywhere also gives you overhead in maintaining the application. It adds another layer to maintain.
I prefer to use them only for certain clients/web pages.
I hope you understand what I'm trying to explain :-)
If you look at this page, it describes two ways to load and relate the aggregate roots. Linking this back to your example:
The Book class would encapsulate the relevant Author information as a value type, so when the book's information is displayed on the web page it has all the author information it needs. If the user decides to view more information about the author, they can follow a link to an author page (whatever the requirement is).
If you have a service method called FindBookByTitle, then loading the Book entity would also load the relevant author information from the BookRepo.
using System.Collections.Generic;

class Author
{
    public Author(int authorId, FullName name)
    {
        AuthorID = authorId;
        Name = name;
    }

    public int AuthorID { get; }
    public FullName Name { get; }
    public List<BookDetails> AuthoredBooks { get; set; }
}

class BookDetails
{
    public BookDetails(int id, string title)
    {
        BookID = id;
        Title = title;
    }

    public int BookID { get; }
    public string Title { get; }
}

class Book
{
    public Book(int id, string title)
    {
        BookID = id;
        Title = title;
    }

    public int BookID { get; }
    public string Title { get; }
    public List<AuthorDetails> Writers { get; set; }
}

class AuthorDetails
{
    public AuthorDetails(int id, FullName name)
    {
        AuthorID = id;
        Name = name;
    }

    public int AuthorID { get; }
    public FullName Name { get; }
}

class FullName
{
    public FullName(string name, string surname)
    {
        Name = name;
        Surname = surname;
    }

    public string Name { get; }
    public string Surname { get; }
}
So, I'm developing some software, and trying to hold myself to TDD and other best practices.
I'm trying to write tests to define the classes and repository.
Let's say I have the classes, Customer, Order, OrderLine.
Now, do I create the Order class as something like
abstract class Entity {
    int ID { get; set; }
}

class Order : Entity {
    Customer Customer { get; set; }
    List<OrderLine> OrderLines { get; set; }
}
Which will serialize nicely, but if I don't care about the OrderLines or the Customer details, it is not as lightweight as one would like. Or do I just store the IDs of related items and add a function for getting them?
class Order : Entity {
    int CustomerID { get; set; }
    List<OrderLine> GetOrderLines() { /* ... */ }
}

class OrderLine : Entity {
    int OrderID { get; set; }
}
And how would you structure the repository for something like this?
Do I use an abstract CRUD repository with methods GetByID(int), Save(entity), and Delete(entity) that each item's repository inherits from and adds its own specific methods to, something like this?
public abstract class RepositoryBase<T, TID> : IRepository<T, TID> where T : AEntity<TID>
{
    // Note: not static. A static list here would be re-created by every
    // constructor call, silently discarding entities saved through other
    // instances.
    private List<T> Entities { get; set; }

    public RepositoryBase()
    {
        Entities = new List<T>();
    }

    public T GetByID(TID id)
    {
        return Entities.SingleOrDefault(x => x.Id.Equals(id));
    }

    public T Save(T entity)
    {
        Entities.RemoveAll(x => x.Id.Equals(entity.Id));
        Entities.Add(entity);
        return entity;
    }

    public T Delete(T entity)
    {
        Entities.RemoveAll(x => x.Id.Equals(entity.Id));
        return entity;
    }
}
What's the 'best practice' here?
Entities
Let's start with the Order entity. An order is an autonomous object, which isn't dependent on a 'parent' object. In domain-driven design this is called an aggregate root; it is the root of the entire order aggregate. The order aggregate consists of the root and several child entities, which are the OrderLine entities in this case.
The aggregate root is responsible for managing the entire aggregate, including the lifetime of the child entities. Other components are not allowed to access the child entities; all changes to the aggregate must go through the root. Also, if the root ceases to exist, so do the children, i.e. order lines cannot exist without a parent order.
The Customer is also an aggregate root. It isn't part of an order, it's only related to an order. If an order ceases to exist, the customer doesn't. And the other way around, if a customer ceases to exist, you'll want to keep the orders for bookkeeping purposes. Because Customer is only related, you'll want to have just the CustomerId in the order.
class Order
{
    int OrderId { get; }
    int CustomerId { get; set; }
    IEnumerable<OrderLine> OrderLines { get; private set; }
}
Repositories
The OrderRepository is responsible for loading the entire Order aggregate, or parts of it, depending on the requirements. It is not responsible for loading the customer. If you need the customer, load it from the CustomerRepository, using the CustomerId from the order.
class OrderRepository
{
    Order GetById(int orderId)
    {
        // implementation details
    }

    Order GetById(int orderId, OrderLoadOptions loadOptions)
    {
        // implementation details
    }
}

enum OrderLoadOptions
{
    All,
    ExcludeOrderLines,
    // other options
}
If you ever need to load the order lines afterwards, you should use the tell, don't ask principle. Tell the order to load its order lines, and which repository to use. The order will then tell the repository the information it needs to know.
class Order
{
    int OrderId { get; }
    int CustomerId { get; set; }
    IEnumerable<OrderLine> OrderLines { get; private set; }

    void LoadOrderLines(IOrderRepository orderRepository)
    {
        // simplified implementation
        this.OrderLines = orderRepository.GetOrderLines(this.OrderId);
    }
}
Note that the code uses an IOrderRepository to retrieve the order lines, rather than a separate repository for order lines. Domain-driven design states that there should be a repository for each aggregate root. Methods for retrieving child entities belong in the repository of the root and should only be accessed by the root.
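As a sketch, that repository contract might look like this (only GetOrderLines actually appears above; the other members are assumptions carried over from the earlier OrderRepository example):

interface IOrderRepository
{
    Order GetById(int orderId);
    Order GetById(int orderId, OrderLoadOptions loadOptions);

    // Child entities are only reachable through the aggregate root's
    // repository, and only the root is expected to call this.
    IEnumerable<OrderLine> GetOrderLines(int orderId);
}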
Abstract/base repositories
I have written abstract repositories with CRUD operations myself, but I found that it didn't add any value. Abstraction is useful when you want to pass instances of subclasses around in your code. But what kind of code will accept any BaseRepository implementation as a parameter?
Also, the CRUD operations can differ per entity, making a base implementation useless. Do you really want to delete an order, or just set its status to deleted? If you delete a customer, what will happen to the related orders?
My advice is to keep things simple. Stay away from abstraction and generic base classes. Sure, all repositories share some kind of functionality and generics look cool. But do you actually need it?
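To make that point concrete, here is a hedged sketch of how "delete" can mean different things per aggregate (the status values and method bodies are assumptions for illustration):

enum OrderStatus { Open, Shipped, Cancelled }

class Order
{
    public int OrderId { get; set; }
    public OrderStatus Status { get; set; }
}

class Customer
{
    public int CustomerId { get; set; }
}

class OrderRepository
{
    // 'Deleting' an order really means cancelling it, so that it
    // stays available for bookkeeping.
    public void Delete(Order order)
    {
        order.Status = OrderStatus.Cancelled;
        // ... persist the status change ...
    }
}

class CustomerRepository
{
    // Deleting a customer must not cascade to the customer's orders;
    // only the customer record itself is removed (or anonymized).
    public void Delete(Customer customer)
    {
        // ... remove the customer record only ...
    }
}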
I would divide my project up into the relevant parts: Data Transfer Objects (DTOs) and Data Access Objects (DAOs). I would want the DTOs to be as simple as possible; terms like POJO (Plain Old Java Object) and POCO (Plain Old CLR Object) apply here. Simply put, they are container objects with very little, if any, functionality built into them.
The DTOs are basically the building blocks of the whole application, and will marry up the layers. For every object that is modeled in the system, there should be at least one DTO. How you then put these into collections is entirely up to the design of the application. Obviously there are natural one-to-many relationships floating around, such as a customer having many orders. But the fundamentals of these objects are what they are. For example, an order has a relationship with a customer, but can also stand alone, and so needs to be separate from the customer object. All many-to-many relationships should be resolved down into one-to-many relationships, which is easy when dealing with nested classes.
Presumably there should be CRUD objects within the Data Access Objects category. This is where it gets tricky, as you have to manage all the relationships discovered in design, and the lifetime models of each. When fetching DTOs back from the DAO, the loading options are essential, as they can mean the difference between your system running like a dog due to over-eager loading, or high network traffic from fetching data back and forth between your application and the store via lazy loading.
I won't go into flags and loading options, as others here have covered all that.
class OrderDAO
{
    public OrderDTO Create(IOrderDTO order)
    {
        // Code here that will create the actual order and store it,
        // updating the fields in the OrderDTO where necessary, one of
        // them being the GUID field of the new ID. I stress GUID, as
        // this makes for better scalability.
        OrderDTO createdOrder = new OrderDTO(); // populated from the stored data
        return createdOrder;
    }
}
As you can see, the order DTO is passed into the Create method.
For the Create method, when dealing with brand-new nested objects, there will have to be some code that marries up data that has already been stored (for example, a customer with old orders) with the new order. The system will have to deal with the fact that some of the operations are updates, whilst others are creates.
However, one piece of the puzzle that is always missed is that of multi-user environments, where DTOs (plain objects) are disconnected from the application and returned to the DAO for CRUD. This usually involves some concurrency control, which can be nasty and get complicated. A simple mechanism such as a DateTime or a version number works here, although when doing CRUD on a nested object you must define the rules for what gets updated and in what order, and if an update fails the concurrency check, you have to decide whether to fail the whole operation or only part of it.
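A minimal sketch of the version-number mechanism, assuming a hypothetical Orders table with a Version column (plain ADO.NET; all names are illustrative): the update only succeeds if the row still carries the version the DTO was loaded with.

using System;
using System.Data;

public static class OrderConcurrency
{
    // Returns true if the update won the optimistic-concurrency check.
    public static bool Update(IDbConnection connection, Guid orderId,
                              string status, int expectedVersion)
    {
        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.CommandText =
                @"UPDATE Orders
                  SET Status = @status, Version = Version + 1
                  WHERE OrderId = @id AND Version = @expectedVersion";

            AddParam(cmd, "@status", status);
            AddParam(cmd, "@id", orderId);
            AddParam(cmd, "@expectedVersion", expectedVersion);

            // Zero rows affected means another user updated the row first.
            return cmd.ExecuteNonQuery() == 1;
        }
    }

    private static void AddParam(IDbCommand cmd, string name, object value)
    {
        IDbDataParameter p = cmd.CreateParameter();
        p.ParameterName = name;
        p.Value = value;
        cmd.Parameters.Add(p);
    }
}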
Why not create separate Order classes? It sounds to me like you're describing a base Order object, which would contain the basic order and customer information (or maybe not even the customer information), and a separate Order object that has line items in it.
In the past, I've done as Niels suggested, and either used boolean flags or enums to describe optionally loading child objects, lists, etc. In Clean Code, Uncle Bob says that these variables and function parameters are excuses that programmers use to not refactor a class or function into smaller, easier to digest pieces.
As for your class design, I'd say that it depends. I assume that an Order could exist without any OrderLines, but could not exist without a Customer (or at least a way to reference the customer, as Niels suggested). If this is the case, why not create a base Order class and a second FullOrder class? Only FullOrder would contain the list of OrderLines. Following that thought, I'd create separate repositories to handle CRUD operations for Order and FullOrder, as sketched below.
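A sketch of that split, with the members invented for illustration:

using System.Collections.Generic;

class OrderLine
{
    public int OrderLineID { get; set; }
}

// Lightweight order: header data plus a reference to the customer.
class Order
{
    public int OrderID { get; set; }
    public int CustomerID { get; set; }
}

// Full order: adds the line items, loaded only when they are needed.
class FullOrder : Order
{
    public List<OrderLine> OrderLines { get; set; }
}

class OrderRepository
{
    public Order GetByID(int id) { /* load the header only */ return null; }
}

class FullOrderRepository
{
    public FullOrder GetByID(int id) { /* load header and lines */ return null; }
}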
If you are interested in a domain-driven design (DDD) implementation with POCOs, along with explanations, take a look at the following two posts:
http://devtalk.dk/2009/06/09/Entity+Framework+40+Beta+1+POCO+ObjectSet+Repository+And+UnitOfWork.aspx
http://www.primaryobjects.com/CMS/Article122.aspx
There is also a project called NCommon that implements domain-driven patterns (repository, unit of work, etc.) for various persistence frameworks (NHibernate, Entity Framework, etc.).
I'm trying to decide on the best pattern for data access in my MVC application.
Currently, having followed the MVC storefront series, I am using repositories, exposing IQueryable to a service layer, which then applies filters. Initially I have been using LINQtoSQL e.g.
public interface IMyRepository
{
    IQueryable<MyClass> GetAll();
}
Implemented in:
public class LINQtoSQLRepository : IMyRepository
{
    public IQueryable<MyClass> GetAll()
    {
        return from table in dbContext.table
               select new MyClass
               {
                   Field1 = table.field1,
                   // ... etc.
               };
    }
}
Filter for IDs:
public static class TableFilters
{
    public static MyClass WithID(this IQueryable<MyClass> qry, string id)
    {
        return (from t in qry
                where t.ID == id
                select t).SingleOrDefault();
    }
}
Called from service:
public class TableService
{
    public MyClass RecordsByID(string id)
    {
        return _repository.GetAll()
                          .WithID(id);
    }
}
I ran into a problem when I experimented with implementing the repository using Entity Framework with LINQ to Entities. The filters class in my project contains some more complex operations than the "WHERE ... == ..." in the example above, which I believe require different implementations depending on the LINQ provider. Specifically I have a requirement to perform a SQL "WHERE ... IN ..." clause. I am able to implement this in the filter class using:
string[] aParams = // array of IDs
qry = qry.Where(t => aParams.Contains(t.ID));
However, in order to perform this against Entity Framework, I need to provide a solution such as the BuildContainsExpression which is tied to the Entity Framework. This means I have to have 2 different implementations of this particular filter, depending on the underlying provider.
I'd appreciate any advice on how I should proceed from here.
It seemed to me that exposing an IQueryable from my repository would allow me to perform filters on it regardless of the underlying provider, enabling me to switch between providers if and when required. However, the problem I describe above makes me think I should be performing all my filtering within the repositories and returning IEnumerable, IList, or single classes.
Many thanks,
Matt
This is a very popular question. One that I constantly ask myself. I've always felt it best to return IEnumerable rather than IQueryable from a repository.
The purpose of a repository is to encapsulate the database infrastructure so the client need not worry about the data source. However, if you return IQueryable you are at the mercy of the consumer as to what kind of query will get run against your db, and whether they will do something that the LINQ provider doesn't support.
Take paging, for example. Let's say you have a Customer entity and your database could have hundreds of thousands of customers. Which code would you rather have your client write?
var customers = repos.GetCustomers().Skip(skipCount).Take(pageSize).ToList();
OR
var customers = repos.GetCustomers(pageIndex, pageSize);
In the first approach you make it impossible for the repository to restrict the number of records retrieved from the data source. Also, your consumer has to calculate the skipCount.
In the second approach you provide a more coarse grained interface to your client. Now your repository can enforce some constraints on the pageSize in order to optimize the query. You also encapsulate the calculation of the skipCount.
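A sketch of the second approach, with the repository encapsulating the skip calculation and capping the page size (the Customer type and the backing IQueryable source are placeholders, not a specific ORM's API):

using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
}

public class CustomerRepository
{
    private const int MaxPageSize = 100;             // repository-enforced constraint
    private readonly IQueryable<Customer> customers; // backing data source (ORM table, etc.)

    public CustomerRepository(IQueryable<Customer> customers)
    {
        this.customers = customers;
    }

    public List<Customer> GetCustomers(int pageIndex, int pageSize)
    {
        if (pageSize > MaxPageSize)
            pageSize = MaxPageSize;                  // cap what any caller can pull at once
        int skipCount = pageIndex * pageSize;        // encapsulated here, not the caller's job

        return customers
            .OrderBy(c => c.Id)                      // paging requires a stable order
            .Skip(skipCount)
            .Take(pageSize)
            .ToList();
    }
}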
However, that being said, in your situation your client is your service. So I suppose the question really comes down to a separation of concerns. Where is it better to perform such validation logic? Well that answer may very well be "in the service". But what about the answer to "Where is it better to contain query logic?". To me the answer is clearly "The Repository". That is its intended area of expertise.
My first question here so be gentle.
I would like arguments for the following code:
public class Example {
    private String name;
    private int age;
    ...
    // copy constructor here
    public Example(Example e) {
        this.name = e.name; // accessing a private attribute of an instance
        this.age = e.age;
    }
    ...
}
I believe this breaks the modularity of the instance passed to the copy constructor.
This is what I believe to be correct:
public class Example {
    private String name;
    private int age;
    ...
    // copy constructor here
    public Example(Example e) {
        this.setName(e.getName());
        this.setAge(e.getAge());
    }
    ...
}
A friend has raised a valid point, saying that in the copy constructor we should create the object as fast as possible, and that going through getter/setter methods would add unnecessary overhead.
I stand at a crossroads. Can you shed some light?
Access is class based, not object based.
The rationale for making a member private is that other classes should not know the details of the implementation outside of a well-defined API, so that the rest of the system can tolerate a change of implementation. However, the copy constructor is not "the rest of the system" - it is your own class.
The first example is not improperly accessing a private attribute of another object, because both objects are instances of the same class.
However, if you add accessor methods/properties, any decent compiler should optimise them away into simple inlines, in which case the second method is cleaner code (all accesses go via your accessor), but both approaches should end up being equally efficient (probably identical) memberwise copies.
If you really want a copy constructor to be efficient, then a lower level binary copy will be faster than a memberwise copy. But significantly "dirtier".
In general I prefer to access all member fields via properties/accessors as that encapsulates them better, allowing you to change the underlying implementation/storage of a field without having to change any of the code that accesses it, except for the property/accessor itself.
While building my DAL repository, I stumbled upon a concept called Pipes and Filters. I read about it here, here, and saw a screencast from here. I am still not sure how to go about implementing this pattern. Theoretically it all sounds good, but how do we really implement it in an enterprise scenario?
I would appreciate any resources, tips, or examples and explanations of this pattern in the context of the data mappers/ORMs mentioned in the question.
Thanks in advance!!
Ultimately, LINQ on IEnumerable<T> is a pipes and filters implementation. IEnumerable<T> is a streaming API - meaning that data is lazily returned as you ask for it (via iterator blocks), rather than everything being loaded at once and returned as one big buffer of records.
This means that your query:
var qry = from row in source // IEnumerable<T>
          where row.Foo == "abc"
          select new { row.ID, row.Name };
is:
var qry = source.Where(row => row.Foo == "abc")
                .Select(row => new { row.ID, row.Name });
as you enumerate over this, it will consume the data lazily. You can see this graphically with Jon Skeet's Visual LINQ. The only things that break the pipe are things that force buffering: OrderBy, GroupBy, etc. For high-volume work, Jon and I worked on Push LINQ for doing aggregates without buffering in such scenarios.
IQueryable<T> (exposed by most ORM tools - LINQ-to-SQL, Entity Framework, LINQ-to-NHibernate) is a slightly different beast; because the database engine is going to do most of the heavy lifting, the chances are that most of the steps are already done - all that is left is to consume an IDataReader and project this to objects/values - but that is still typically a pipe (IQueryable<T> implements IEnumerable<T>) unless you call .ToArray(), .ToList() etc.
With regard to use in enterprise... my view is that it is fine to use IQueryable<T> to write composable queries inside the repository, but they shouldn't leave the repository - as that would make the internal operation of the repository subject to the caller, so you would be unable to properly unit test / profile / optimize / etc. I've taken to doing clever things in the repository, but return lists/arrays. This also means my repository stays unaware of the implementation.
This is a shame - as the temptation to "return" IQueryable<T> from a repository method is quite large; for example, this would allow the caller to add paging/filters/etc - but remember that they haven't actually consumed the data yet. This makes resource management a pain. Also, in MVC etc you'd need to ensure that the controller calls .ToList() or similar, so that it isn't the view that is controlling data access (otherwise, again, you can't unit test the controller properly).
A safe (IMO) use of filters in the DAL would be things like:
public Customer[] List(string name, string countryCode) {
    using (var ctx = new CustomerDataContext()) {
        IQueryable<Customer> qry = ctx.Customers.Where(x => x.IsOpen);
        if (!string.IsNullOrEmpty(name)) {
            qry = qry.Where(cust => cust.Name.Contains(name));
        }
        if (!string.IsNullOrEmpty(countryCode)) {
            qry = qry.Where(cust => cust.CountryCode == countryCode);
        }
        return qry.ToArray();
    }
}
Here we've added filters on-the-fly, but nothing happens until we call ToArray. At this point, the data is obtained and returned (disposing the data-context in the process). This can be fully unit tested. If we did something similar but just returned IQueryable<T>, the caller might do something like:
var custs = customerRepository.GetCustomers()
                              .Where(x => SomeUnmappedFunction(x));
And all of a sudden our DAL starts failing (cannot translate SomeUnmappedFunction to TSQL, etc). You can still do a lot of interesting things in the repository, though.
The only pain point here is that it might push you to have a few overloads to support different calling patterns (with/without paging, etc.). Until optional/named parameters arrive, I find the best answer is to use extension methods on the interface; that way, I only need one concrete repository implementation:
class CustomerRepository : ICustomerRepository {
    public Customer[] List(
        string name, string countryCode,
        int? pageSize, int? pageNumber) {...}
}

interface ICustomerRepository {
    Customer[] List(
        string name, string countryCode,
        int? pageSize, int? pageNumber);
}

static class CustomerRepositoryExtensions {
    public static Customer[] List(
        this ICustomerRepository repo,
        string name, string countryCode) {
        return repo.List(name, countryCode, null, null);
    }
}
Now we have virtual overloads (as extension methods) on ICustomerRepository - so our caller can use repo.List("abc","def") without having to specify the paging.
Finally - without LINQ, pipes and filters become a lot more painful. You'll be writing some kind of text-based query (TSQL, ESQL, HQL). You can obviously append strings, but it isn't very "pipe/filter"-ish. The "Criteria API" is a bit better - but still not as elegant as LINQ.
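For contrast, a sketch of what the string-appending version of the earlier List method might look like (the table and column names are assumptions); each filter mutates a SQL string rather than composing a query object:

using System.Collections.Generic;
using System.Text;

public static class CustomerQueryBuilder
{
    // Builds a parameterized WHERE clause by string concatenation; the
    // caller binds the returned parameters before executing the query.
    public static string BuildListQuery(string name, string countryCode,
                                        IDictionary<string, object> parameters)
    {
        var sql = new StringBuilder("SELECT * FROM Customer WHERE IsOpen = 1");

        if (!string.IsNullOrEmpty(name))
        {
            sql.Append(" AND Name LIKE @name");
            parameters["@name"] = "%" + name + "%";
        }
        if (!string.IsNullOrEmpty(countryCode))
        {
            sql.Append(" AND CountryCode = @countryCode");
            parameters["@countryCode"] = countryCode;
        }
        return sql.ToString();
    }
}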