Linq To Sql: Compiled Queries and Extension Methods - linq-to-sql

I'm interested in how Linq2Sql handles a compiled query that returns IQueryable.
If I call an extension method on a compiled query, like "GetEntitiesCompiled().Count()" or "GetEntitiesCompiled().Take(x)", what does Linq2Sql do in the background?
Does it load the whole result (in this case "GetEntitiesCompiled()") into memory, mapped to the entity class as with "ToList()"? That would be very bad, so in this situation I should write a separate compiled query like "CountEntitiesCompiled".
So in which situations does it make sense for a compiled query to return IQueryable, if the query cannot be modified before the request is sent to SQL Server?
In my opinion I could just as well return a List.
Thanks for answers!

As I understand it - if it can't use the pre-compiled query exactly (because you have composed it further), it just runs it as it would any regular IQueryable query - so it will indeed still issue a SELECT COUNT(1) FROM ... (it shouldn't iterate the entire table / whatever).
But the real answer is: profile it; you can hook .Log to see the TSQL, for example:
myDataContext.Log = Console.Out; // write TSQL to the console
or just use SQL trace to see what goes up and down the wire.

Linq2Sql is not smart enough in such cases. In my experience it always executes the compiled part as is. In the case of GetEntitiesCompiled().Count() it will fetch all the records and then perform an in-memory Count().

Related

Custom `returnFormat` in ColdFusion 10 or 11?

I've a function which is called from different components, .cfms or remotely. It returns the results of a query.
Sometimes the response from this function is manually inspected - a person may want to see the ID of a specific record so they can use it elsewhere.
The provided return formats (wddx, json, plain) aren't very readable for a layman.
I'd love to be able to create a new return format, dump, where the result is first passed through writeDump and then returned to the caller.
I know there'd be more complicated ways of solving this, like writing a function dump, and calling that like a proxy by providing the component, function and parameters so it can call that function and return the results.
However I don't think it's worth going that far. I figured it'd be great if I could just write a new return format, because that's just... intuitive and nice, and I may also be able to use that technique to solve different problems or improve various workflows.
Is there a way to create custom function returnFormats in ColdFusion 10 or 11?
(From comments)
AFAIK, you cannot add a custom returnFormat to a cffunction, but take a look at OnCFCRequest. You might be able to use it to build something more generic that responds differently whenever a custom URL parameter is passed, i.e. url.returnformat=yourType. Same net effect as dumping and/or manipulating the result manually, just a little more automated.
From the comments, the return type of the function is query. That being the case, there is simply no need for a custom return format. If you want to dump the query results, do so.
queryVar = objectName.nameOfFunction(arguments);
writeDump (queryVar);

Why use an exception instead of if...else

For example, in the case of "The array index out of bound" exception, why don't we check the array length in advance:
if (array.length < countNum)
{
    // logic
}
else
{
    // replace using exception
}
My question is: why choose to use an exception, and when should an exception be used instead of if-else?
Thanks.
It depends on acceptable practices for a given language.
In Java, the convention is to check conditions whenever possible and not to use exceptions for flow control. In Python, by contrast, using exceptions in this manner is not only acceptable but often the preferred practice.
They are used to inform the code that calls your code that an exceptional condition occurred. Exceptions are more expensive than well-formed if/else logic, so you use them in exceptional circumstances, such as reaching a condition in your code that you cannot handle locally, or to give the caller of your code the choice of how to handle the error condition.
Usually if you find yourself throwing and catching exceptions in your own function or method, you can probably find a more efficient way of doing it.
There are many answers to that question. As a single example, from Java, when you are using multiple threads, sometimes you need to interrupt a thread, and the thread will see this when an InterruptedException is thrown.
Other times, you will be using an API that throws certain exceptions. You won't be able to avoid it. If the API throws, for example, an IOException, then you can catch it, or let it bubble up.
Here's an example where it would actually be better to use an exception instead of a conditional.
Say you had a list of 10,000 strings and you only want the items that are integers. You know that a very small number of them won't be integers (in string form). So should you check whether every string is an integer before trying to convert it? Or should you just try to convert each one and catch the exception when one isn't an integer? The second way is more efficient when failures are rare, but if the strings were mostly non-integers it would be more efficient to use an if-statement.
Most of the time, however, you should not use exceptions if you can replace them with a conditional.
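The 10,000-strings case above can be sketched in Python, where the try-and-catch style is idiomatic. This is only an illustration: the function names are made up for the example, and the pre-check version is a simplification (Python's int() accepts a few forms, such as "1_000", that this check rejects).

```python
def keep_integers_eafp(items):
    """Try to convert every string and skip the ones that fail."""
    result = []
    for text in items:
        try:
            result.append(int(text))
        except ValueError:
            # Rare case: the string is not an integer, so skip it.
            continue
    return result


def keep_integers_lbyl(items):
    """Check each string first and convert only the ones that pass."""
    result = []
    for text in items:
        stripped = text.strip()
        # Allow one leading sign, then require ASCII digits only.
        digits = stripped[1:] if stripped[:1] in ("+", "-") else stripped
        if digits.isascii() and digits.isdigit():
            result.append(int(stripped))
    return result
```

Which version is faster depends on how often the conversion fails, so it is worth measuring on representative data before choosing.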
As someone has already said, exceptions in programming languages are for exceptional cases, not for setting the logical flow of your program. For the code snippet in your question, you have to look at what the enclosing method or function is meant to do. Is checking array.length < countNum part of the business logic or not? If yes, then a pair of if/else is the way to go. If that condition is not part of the business logic and the enclosing method's intention is something else, then write code for that something else and throw an exception instead of going the if/else way. For example, say you develop an application for a school, and in your application you have a method GetClassTopperGrades, which is responsible for the part of the business logic that returns the highest marks of any student in a certain class. The method/function definition would be something like this:
int GetClassTopperGrades(string classID)
In this case the method's intention is to return the grades for a valid class, which will always be a positive integer according to the business logic of the application. Now if someone calls your method and passes a garbage string or null, what should it do? It should throw an exception, e.g. ArgumentException or ArgumentNullException, because this is an exceptional case in this particular context. The method assumes that a valid class ID will always be passed, and null or an empty string is NOT a valid class ID (a deviation from the business logic).
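A rough Python rendering of that idea is below. It is a sketch under assumptions, not the method from the answer: the grades_by_class dictionary is a hypothetical stand-in for whatever data source the application uses, and ValueError plays the role of ArgumentException.

```python
def get_class_topper_grades(class_id, grades_by_class):
    """Return the highest grade recorded for a valid class ID.

    grades_by_class is a hypothetical dict mapping class IDs to lists of grades.
    """
    # Invalid arguments are exceptional here, so raise instead of branching.
    if class_id is None:
        raise ValueError("class_id must not be None")
    if not isinstance(class_id, str) or not class_id.strip():
        raise ValueError("class_id must be a non-empty string")
    if class_id not in grades_by_class:
        raise ValueError(f"unknown class_id: {class_id!r}")
    # Normal path: only the business logic remains.
    return max(grades_by_class[class_id])
```

The caller decides what to do with the error, which keeps the method itself focused on returning grades.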
Apart from that, in some conditions there is no prior knowledge about the outcome of a given code and no defined way to prevent an exceptional situation. For example, querying some remote database, if the network goes down, you don't have any other option there apart from throwing an exception. Would you check network connectivity before issuing every SQL query to the remote database?
There is a strong and indisputable reason to use exceptions, regardless of language. I strongly believe that the decision whether or not to use exceptions has nothing to do with the particular language used.
Exceptions are a universal way to notify another part of the code, in a loosely coupled way, that something went wrong. Imagine handling an exceptional condition with if/else instead: you would need to thread arbitrary variables and other plumbing through different parts of your code, which would easily lead to spaghetti code.
Next, imagine that you are using an external library/package whose author decided to handle error states in some other arbitrary way. That would force you to adapt to the library's way of dealing with them; for example, you would need to check whether particular methods return true or false or whatever. Exceptions make handling errors much easier: you just assume that if something goes wrong the other code will throw an exception, so you wrap the call in a try block and handle the possible exception in your own way.
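The loose coupling described above can be sketched as follows. All names here (ConfigError, read_setting, and so on) are invented for the illustration; the point is only that the middle layer needs no error-handling code of its own.

```python
class ConfigError(Exception):
    """Hypothetical error type raised by the lowest layer."""


def read_setting(settings, key):
    # Lowest layer: report the problem and let someone else decide what to do.
    if key not in settings:
        raise ConfigError(f"missing setting: {key}")
    return settings[key]


def build_address(settings):
    # Middle layer: no status flags to check, failures simply propagate.
    host = read_setting(settings, "host")
    port = read_setting(settings, "port")
    return f"{host}:{port}"


def connect(settings):
    # Top layer: the one place that chooses how to handle the error.
    try:
        return build_address(settings)
    except ConfigError as error:
        return f"fallback (reason: {error})"
```

With return codes instead, build_address would have to check and forward a status after every call.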

LINQ-SQL reuse - CompiledQuery.Compile

I have been playing about with LINQ-SQL, trying to get re-usable chunks of expressions that I can hot plug into other queries. So, I started with something like this:
Func<TaskFile, double> TimeSpent = (t =>
    t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Then, we can use the above in a LINQ query like the below (LINQPad example):
TaskFiles.Select(t => new {
    t.TaskId,
    TimeSpent = TimeSpent(t),
})
This produces the expected output, except, a query per row is generated for the plugged expression. This is visible within LINQPad. Not good.
Anyway, I noticed the CompiledQuery.Compile method. Although this takes a DataContext as a parameter, I thought I would ignore it and try the same Func. So I ended up with the following:
static Func<UserQuery, TaskFile, double> TimeSpent =
    CompiledQuery.Compile<UserQuery, TaskFile, double>(
        (UserQuery db, TaskFile t) =>
            t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Notice here that I am not using the db parameter. However, now when we use this updated Func, only 1 SQL query is generated. The Expression is successfully translated to SQL and included within the original query.
So my ultimate question is, what makes CompiledQuery.Compile so special? It seems that the DataContext parameter isn't needed at all, and at this point I am thinking it is more of a convenience parameter for generating full queries.
Would it be considered a good idea to use the CompiledQuery.Compile method like this? It seems like a big hack, but it seems like the only viable route for LINQ re-use.
UPDATE
Using the first Func within a Where statement, we see the following exception:
NotSupportedException: Method 'System.Object DynamicInvoke(System.Object[])' has no supported translation to SQL.
The call that triggers it looks like this:
.Where(t => TimeSpent(t) > 2)
However, when we use the Func generated by CompiledQuery.Compile, the query is successfully executed and the correct SQL is generated.
I know this is not the ideal way to re-use Where statements, but it shows a little how the Expression Tree is generated.
Exec Summary:
Expression.Compile generates a CLR method, whereas CompiledQuery.Compile generates a delegate that is a placeholder for SQL.
One of the reasons you did not get a correct answer until now is that some things in your sample code are incorrect. And without the database or a generic sample someone else can play with chances are further reduced (I know it's difficult to provide that, but it's usually worth it).
On to the facts:
Expression<Func<TaskFile, double>> TimeSpent = (t =>
    t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Then, we can use the above in a LINQ query like the below:
TaskFiles.Select(t => new {
    t.TaskId,
    TimeSpent = TimeSpent(t),
})
(Note: Maybe you used a Func<> type for TimeSpent. This yields the same situation as the scenario outlined in the paragraph below. Make sure to read and understand it though.)
No, this won't compile. Expressions can't be invoked (TimeSpent is an expression). They need to be compiled into a delegate first. What happens under the hood when you invoke Expression.Compile() is that the Expression Tree is compiled down to IL which is injected into a DynamicMethod, for which you get a delegate then.
The following would work:
var q = TaskFiles.Select(t => new {
    t.TaskId,
    // DynamicInvoke needs the TaskFile passed in as the delegate's argument
    TimeSpent = TimeSpent.Compile().DynamicInvoke(t)
});
"This produces the expected output, except, a query per row is generated for the plugged expression. This is visible within LINQPad. Not good."
Why does that happen? Well, Linq To Sql will need to fetch all TaskFiles, hydrate TaskFile instances and then run your selector against them in memory. You get a query per TaskFile, likely because they contain one or more 1:m mappings.
While LTS allows projecting in memory for Selects, it does not do so for Wheres (citation needed; this is to the best of my knowledge). When you think about it, this makes perfect sense: you would likely transfer a lot more data by filtering the whole database in memory than by transforming a subset of it in memory. (It does create query performance issues, as you see, which is something to be aware of when using an ORM.)
CompiledQuery.Compile() does something different. It compiles the query to SQL and the delegate it returns is only a placeholder Linq to SQL will use internally. You can't "invoke" this method in the CLR, it can only be used as a node in another expression tree.
So why does LTS generate an efficient query with the CompiledQuery.Compile'd expression then? Because it knows what this expression node does, since it knows the SQL behind it. In the Expression.Compile case, it's just an InvocationExpression that invokes the DynamicMethod, as I explained previously.
Why does it require a DataContext Parameter? Yes, it's more convenient for creating full queries, but it's also because the Expression Tree compiler needs to know the Mapping to use for generating the SQL. Without this parameter, it would be a pain to find this mapping, so it's a very sensible requirement.
I'm surprised you've got no answers on this so far. CompiledQuery.Compile compiles and caches the query. That is why you see only one query being generated.
Not only is this NOT a hack, it is the recommended way!
Check out these MSDN articles for detailed info and example:
Compiled Queries (LINQ to Entities)
How to: Store and Reuse Queries (LINQ to SQL)
Update: (exceeded the limit for comments)
I did some digging in reflector & I do see DataContext being used. In your example, you're simply not using it.
Having said that, the main difference between the two is that the former creates a delegate (for the expression tree) and the latter creates the SQL that gets cached and actually returns a function (sort of). The first two expressions produce the query when you call Invoke on them, this is why you see multiple of them.
If your query doesn't change, but only the DataContext and Parameters, and if you plan to use it repeatedly, CompiledQuery.Compile will help. It is expensive to Compile, so for one off queries, there is no benefit.
TaskFiles.Select(t => new {
    t.TaskId,
    TimeSpent = TimeSpent(t),
})
This isn't a LinqToSql query, as there is no DataContext instance. Most likely you are querying some EntitySet, which does not implement IQueryable.
Please post complete statements, not statement fragments. (I see invalid comma, no semicolon, no assignment).
Also, try this:
var query = myDataContext.TaskFiles
    .Where(tf => tf.Parent.Key == myParent.Key)
    .Select(t => new {
        t.TaskId,
        TimeSpent = TimeSpent(t)
    });
// where myParent is the source of the EntitySet and Parent is a relational property.
// and Key is the primary key property of Parent.

Will manual Linq-To-Sql mapping with Expressions work?

I have this problem:
The Vehicle type derives from the EntityObject type which has the property "ID".
I think I get why L2S can't translate this into SQL: it does not know that the WHERE clause should include WHERE VehicleId == value. VehicleId, by the way, is the PK on the table, whereas the property in the object model, as above, is "ID".
Can I even win on this with an Expression tree? Because it seems easy enough to create an Expression to pass to the SingleOrDefault method but will L2S still fail to translate it?
I'm trying to be DDD friendly so I don't want to decorate my domain model objects with ColumnAttributes etc. I am happy however to customize my L2S dbml file and add Expression helpers/whatever in my "data layer" in the hope of keeping this ORM-business far from my domain model.
Update:
I'm not using the object initialization syntax in my select statement. Like this:
private IQueryable<Vehicle> Vehicles()
{
    return from vehicle in _dc
           select new Vehicle() { ID = vehicle.VehicleId };
}
I'm actually using a constructor and from what I've read this will cause the above problem. This is what I'm doing:
private IQueryable<Vehicle> Vehicles()
{
    return from vehicle in _dc
           select new Vehicle(vehicle.VehicleId);
}
I understand that L2S can't translate the expression tree from the screen grab above because it does not know the mappings, which it would usually infer from the object initialization syntax. How can I get around this? Do I need to build an Expression with the attribute bindings?
From further experience, I have concluded that this is not possible.
L2S simply cannot create the correct WHERE clause when a parameterized ctor is used in the mapping projection. It's the initializer syntax in conventional L2S mapping projections that gives L2S the context it needs.
Short answer - use NHibernate.
Short answer: Don't.
I once tried to apply IQueryable<IEntity> to Linq2Sql. I got burned badly.
As you said, L2S (and EF too, in this regard) doesn't know that ID is mapped to the column VehicleId. You could get around this by renaming your Vehicle.ID to Vehicle.VehicleID. (Yes, it works if they have the same name.) However, I still don't recommend it.
Use L2S with the objects it provides. Masking it with an extra layer while working with IQueryable ... is bad IMO (from my experience).
Another way is to call .ToList() after the select statement. This loads all the vehicles into memory. Then you run the .Where statement against the LINQ to Objects collection. Of course, this won't be as efficient as letting L2S handle the whole query, and it uses more memory.
Long story short: don't use Sql IQueryable with any object other than the ones it was originally designed for. It just doesn't work (well).

Should I return IEnumerable<T> or IQueryable<T> from my DAL?

I know this could be opinion, but I'm looking for best practices.
As I understand, IQueryable<T> implements IEnumerable<T>, so in my DAL, I currently have method signatures like the following:
IEnumerable<Product> GetProducts();
IEnumerable<Product> GetProductsByCategory(int categoryId);
Product GetProduct(int productId);
Should I be using IQueryable<T> here?
What are the pros and cons of either approach?
Note that I am planning on using the Repository pattern so I will have a class like so:
public class ProductRepository {
    DBDataContext db = new DBDataContext(<!-- connection string -->);
    public IEnumerable<Product> GetProductsNew(int daysOld) {
        return db.GetProducts()
            .Where(p => p.AddedDateTime > DateTime.Now.AddDays(-daysOld));
    }
}
Should I change my IEnumerable<T> to IQueryable<T>? What advantages/disadvantages are there to one or the other?
It depends on what behavior you want.
Returning an IList<T> tells the caller that they've received all of the data they've requested.
Returning an IEnumerable<T> tells the caller that they'll need to iterate over the result and it might be lazily loaded.
Returning an IQueryable<T> tells the caller that the result is backed by a Linq provider that can handle certain classes of queries, putting the burden on the caller to form a performant query.
While the last option gives the caller a lot of flexibility (assuming your repository fully supports it), it's the hardest to test and, arguably, the least deterministic.
One more thing to think about: where is your paging/sorting support? If you are providing paging support within your repository, returning IEnumerable<T> is fine. If you are paging outside of your repository (like in the controller or service layer) then you really want to use IQueryable<T> because you don't want to load the entire dataset into memory before it's paged.
HUUUUGGGE difference. I see this quite a bit.
You build up an IQueryable before it hits the database. The IQueryable only hits the DB once an eager function is called (.ToList() for example) or you actually try to pull values out. IQueryable = lazy, and everything composed onto it is translated to SQL.
An IEnumerable is also evaluated lazily, but anything composed onto it runs in memory via LINQ to Objects instead of being translated to SQL, so the rows are pulled from the DB before your lambda is applied. In practice, a repository that returns IEnumerable has usually already materialized the results. IEnumerable = in-memory.
As for which to use with the Repository pattern, I believe it's the in-memory one. I usually see ILists being passed, but someone else will need to iron that out for you. EDIT - You usually see IEnumerable instead of IQueryable because you don't want layers past your Repository A) determining when the database hit will happen or B) adding any logic to the joins outside the Repository.
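Deferred versus materialized evaluation can be illustrated with a small Python analogy. This is not LINQ and shows nothing about SQL translation; read_rows is a made-up data source, and the fetched list exists only to count how many rows were actually read.

```python
from itertools import islice

fetched = []  # records which rows were actually read from the fake source


def read_rows():
    """Hypothetical data source: yields rows one at a time, on demand."""
    for row_id in range(1, 1001):
        fetched.append(row_id)
        yield row_id


# Lazy: building the pipeline reads nothing yet.
lazy_query = (row for row in read_rows() if row % 2 == 0)
rows_read_before_iteration = len(fetched)

# Pulling three values reads only as many rows as are needed.
first_three = list(islice(lazy_query, 3))
rows_read_lazily = len(fetched)

# Materialized: building a list reads every row up front.
fetched.clear()
all_rows = list(read_rows())
rows_read_eagerly = len(fetched)
```

The lazy pipeline touches six rows to produce three results, while the materialized version reads all one thousand before the caller sees any of them.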
There is a very good LINQ video that I enjoy a lot- it hits more than just IEnumerable v IQueryable, but it really has some fantastic insight.
http://channel9.msdn.com/posts/matthijs/LINQ-Tips-Tricks-and-Optimizations-by-Scott-Allen/
You can use IQueryable and accept that someone could create a scenario where a SELECT N+1 could happen. This is a disadvantage, along with the fact that you may end up with code that is specific to your repository implementation in the layers above your repository. The advantage is that you allow common operations like paging and sorting to be expressed outside of your repository, relieving it of such concerns. It is also more flexible if you need to join the data with other database tables, as the query remains an expression and can be added to before it's resolved into a query and hits the database.
The alternative is to lock down your repository so that it returns materialised lists by calling ToList(). With the example of paging and sorting, you will need to pass in skip, take and a sort expression as parameters to the methods of your repository, and use the parameters to return only a window of results. This means that the repository is taking on the responsibility of paging and sorting, and all of the projection of your data.
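The locked-down alternative might look roughly like this in Python. It is a sketch under assumptions: the class, parameter names, and sample data are invented, and a plain list stands in for the database, so the sort and the window are applied in memory here rather than in a query.

```python
class ProductRepository:
    """In-memory stand-in for a repository that owns paging and sorting."""

    def __init__(self, products):
        self._products = list(products)

    def get_products(self, skip=0, take=10, sort_key="name", descending=False):
        # A real repository would push the sort and the window into the
        # database query; this sketch sorts and slices a list instead.
        ordered = sorted(self._products, key=lambda p: p[sort_key], reverse=descending)
        # Return a materialized list so callers cannot compose further.
        return ordered[skip:skip + take]


repository = ProductRepository([
    {"name": "pear", "price": 3},
    {"name": "apple", "price": 5},
    {"name": "fig", "price": 9},
    {"name": "kiwi", "price": 2},
])
second_page = repository.get_products(skip=2, take=2, sort_key="name")
```

Every new way of slicing the data needs another parameter or method on the repository, which is the complexity trade-off the answer describes.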
This is a bit of a judgement call: do you give your application the power of LINQ and have less complexity in the repository, or do you control your data access? For me it depends on the number of queries associated with each entity, and combinations of entities, and where I want to manage that complexity.