Why do I get a NullReferenceException when using ToDictionary on an Entity Framework query? - linq-to-sql

I'm very surprised, it seems my lambda expressions are executed as C# code, instead of being converted to SQL.
If that is really the case, it's a bit sad. For example:
context.Set<Post>().ToDictionary(post => post.Id, post => post.Comments.Count())
This code will apparently load the posts into C# objects first, and then count the comments. I came to that conclusion because in a similar piece of real-world code, I was getting a NullReferenceException because post.Comments was null (note that in my code, the posts were loaded without the Comments relation just before executing this line of code).
Using this instead would then be much more efficient:
context.Set<Post>()
    .Select(post => new { Key = post.Id, Value = post.Comments.Count() })
    .ToDictionary(entry => entry.Key, entry => entry.Value)
Since I believe this code is generic enough to work in any situation, I wonder:
Am I understanding correctly what is happening?
Why hasn't it been implemented as a generic solution for ToDictionary, as it has been for ToArray and ToList?

There is no Queryable.ToDictionary method, so the call binds to Enumerable.ToDictionary, which takes context.Set<Post>() as an IEnumerable<Post>. That means that, as you correctly understood, context.Set<Post>() is first evaluated and then processed in memory.
That's highly inefficient: if lazy loading is enabled, the comments of each Post are loaded by a separate query; otherwise Post.Comments is null, which explains the NullReferenceException.
So projecting to an anonymous type is the only option to do this efficiently.
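The difference in round trips can be illustrated with a small simulation. This is only a sketch in JavaScript, not Entity Framework: the `db` object, its query counter, and every function name below are hypothetical stand-ins for "one query for the posts plus one lazy load per post" versus "one query that computes the counts".

```javascript
// Toy in-memory "database" that counts how many queries are issued.
// All names here are hypothetical and not part of Entity Framework.
const db = {
  queries: 0,
  posts: [{ id: 1 }, { id: 2 }, { id: 3 }],
  comments: [{ postId: 1 }, { postId: 1 }, { postId: 3 }],
};

// One query that returns the posts without their comments.
function loadPosts() {
  db.queries += 1;
  return db.posts.map((post) => ({ ...post }));
}

// One query per post, which is what lazy loading amounts to.
function loadCommentsFor(postId) {
  db.queries += 1;
  return db.comments.filter((c) => c.postId === postId);
}

// One query in which the "server" computes the counts (the projection).
function loadCommentCounts() {
  db.queries += 1;
  return db.posts.map((post) => ({
    key: post.id,
    value: db.comments.filter((c) => c.postId === post.id).length,
  }));
}

// In-memory approach: materialize the posts, then count per post.
function countsInMemory() {
  const result = new Map();
  for (const post of loadPosts()) {
    result.set(post.id, loadCommentsFor(post.id).length);
  }
  return result;
}

// Projection approach: the counting happens inside the single query.
function countsByProjection() {
  return new Map(loadCommentCounts().map((e) => [e.key, e.value]));
}

db.queries = 0;
const inMemory = countsInMemory();
const inMemoryQueries = db.queries; // 1 query for the posts + 1 per post

db.queries = 0;
const projected = countsByProjection();
const projectedQueries = db.queries; // a single query
```

Both approaches produce the same dictionary; only the number of queries differs, and the gap grows with the number of posts.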


Maquette cannot read property "class" of undefined

Chrome debug console snapshot
I basically am unsure as to what is causing this error ^^.
I've done a little digging, and it seems that previousProperties is passed in as previous.properties by updateDom(). previous, in turn, is passed in by update, where it is labeled as just vnode. This VNode is a valid VNode, but it lacks the properties.
I'm pretty sure I've made everything distinguishable (by setting unique key properties) that would need to be distinguishable, so I don't think that's the problem, although I could be mistaken.
So I had this question, wrote it, did more looking and found my answer before even posting it. I'm still posting this question, and answering it myself in hopes that it might help save someone else some heartache in the future.
In this case, the error is caused by a projector rendering and receiving an invalid value in return from the renderMaquette function. In my component-based framework, I've been using ternary operators as if-else statements inside the return blocks of renderMaquette functions. For example:
function renderMaquette(){
    return h('div',
        showTitle ?
            h('h1', 'My Title')
        : []
    )
}
Passing an empty array as a parameter to a hyperscript function is perfectly acceptable, as it renders nothing. However, returning an empty array from the render function itself is not. For example:
function renderMaquette(){
    return showTitle ?
        h('h1', 'My Title')
    : []
}
This generates an error.
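One way to avoid the problem is to always return a single root VNode and make only its children conditional. The sketch below uses a stand-in `h` that builds plain objects so it runs without Maquette; the stand-in, the `isVNode` helper, and passing `showTitle` as a parameter are assumptions made for illustration, not Maquette's actual implementation.

```javascript
// Stand-in for Maquette's h(); it only builds a plain object so the
// example runs without the library. The real h() does more than this.
function h(selector, ...children) {
  return { vnodeSelector: selector, children: children.flat() };
}

// Problematic shape: the render function itself may return [].
function renderBroken(showTitle) {
  return showTitle ? h('h1', 'My Title') : [];
}

// Safer shape: always return one root VNode; only its children vary.
function renderMaquette(showTitle) {
  return h('div', showTitle ? h('h1', 'My Title') : []);
}

// Hypothetical helper for checking what a render function returned.
function isVNode(value) {
  return value !== null
    && typeof value === 'object'
    && !Array.isArray(value)
    && typeof value.vnodeSelector === 'string';
}
```

A check like `isVNode(render())` during development can catch a render function that falls through to an empty array before the projector fails on it.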

Why is the return function called return?

The description is:
Inject a value into the monadic type.
The name not only doesn't make sense (to me), it is confusing for people coming from an imperative language where return is a language keyword that returns from the function.
Why is it called that? Because it's usually the very last function in a monadic block of code. Usually the only good reason to use return is to set the final return value from your monadic action.
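To make "inject a value into the monadic type" concrete, here is a rough analogy in JavaScript rather than Haskell. The Maybe-like type and the names `unit` and `bind` are invented for this sketch; `unit` plays the role of return and `bind` the role of >>=. The point is that `unit` only wraps a value and does not end the computation, even though it conventionally appears last.

```javascript
// A tiny Maybe-like type, written only for illustration.
const nothing = { tag: 'nothing' };

// Plays the role of Haskell's `return`: wraps a plain value, nothing else.
function unit(value) {
  return { tag: 'just', value };
}

// Plays the role of >>=: run the next step only if there is a value.
function bind(maybe, next) {
  return maybe.tag === 'just' ? next(maybe.value) : nothing;
}

// `unit` is used before the final step here, and the later steps still run.
const steps = [];
const result = bind(unit(1), (a) => {
  steps.push('first');
  return bind(unit(a + 1), (b) => {
    steps.push('second');
    return unit(a + b); // the conventional final position
  });
});
```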
I too think that this is a very, very poor name choice. But it's not like we can fix it now...
It's purely historical. Most Haskell developers agree it's a bad name. It breaks the principle of least surprise. Quite a few of the older library functions are a bit wonky (the plethora of error handling schemes and a few other typeclass element names come to mind).
As @bheklilr says, there is a restructuring underway which should help:
http://www.haskell.org/haskellwiki/Functor-Applicative-Monad_Proposal
These are good places to start if you are interested in the meta of Haskell:
http://www.haskell.org/haskellwiki/Future_of_Haskell
http://www.haskell.org/haskellwiki/Category:History
The answer is that it returns something. In PHP, for example, a function can echo something, which outputs that text or data directly. But the primary power of functions is not in echoing data directly; it is in storing data and returning the variable, array, or similar structure where the data is stored.
You can also return true or false based on data or a calculation. In classes, functions are called methods and do the same thing: return something. In Java, a method's return type can be void (nothing is returned) or a specific data type (boolean, for example, or String, an array, etc.).
Code after the return statement is not executed.

LINQ-SQL reuse - CompiledQuery.Compile

I have been playing about with LINQ-SQL, trying to get re-usable chunks of expressions that I can hot plug into other queries. So, I started with something like this:
Func<TaskFile, double> TimeSpent = (t =>
t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Then, we can use the above in a LINQ query like the below (LINQPad example):
TaskFiles.Select(t => new {
t.TaskId,
TimeSpent = TimeSpent(t),
})
This produces the expected output, except, a query per row is generated for the plugged expression. This is visible within LINQPad. Not good.
Anyway, I noticed the CompiledQuery.Compile method. Although this takes a DataContext as a parameter, I thought I would ignore it and try the same Func. So I ended up with the following:
static Func<UserQuery, TaskFile, double> TimeSpent =
    CompiledQuery.Compile<UserQuery, TaskFile, double>(
        (UserQuery db, TaskFile t) =>
            t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Notice here that I am not using the db parameter. However, when we use this updated Func, only one SQL query is generated. The expression is successfully translated to SQL and included within the original query.
So my ultimate question is: what makes CompiledQuery.Compile so special? It seems that the DataContext parameter isn't needed at all, and at this point I am thinking it is more of a convenience parameter for generating full queries.
Would it be considered a good idea to use the CompiledQuery.Compile method like this? It seems like a big hack, but it seems like the only viable route for LINQ re-use.
UPDATE
Using the first Func within a Where statement, like the following:
.Where(t => TimeSpent(t) > 2)
we see the following exception:
NotSupportedException: Method 'System.Object DynamicInvoke(System.Object[])' has no supported translation to SQL.
However, when we use the Func generated by CompiledQuery.Compile, the query is successfully executed and the correct SQL is generated.
I know this is not the ideal way to re-use Where statements, but it shows a little how the Expression Tree is generated.
Exec Summary:
Expression.Compile generates a CLR method, whereas CompiledQuery.Compile generates a delegate that is a placeholder for SQL.
One of the reasons you did not get a correct answer until now is that some things in your sample code are incorrect. And without the database or a generic sample someone else can play with chances are further reduced (I know it's difficult to provide that, but it's usually worth it).
On to the facts:
Expression<Func<TaskFile, double>> TimeSpent = (t =>
t.TimeEntries.Sum(te => (te.DateEnded - te.DateStarted).TotalHours));
Then, we can use the above in a LINQ query like the below:
TaskFiles.Select(t => new {
t.TaskId,
TimeSpent = TimeSpent(t),
})
(Note: Maybe you used a Func<> type for TimeSpent. That yields the same situation as the scenario outlined in the paragraph below. Make sure to read and understand it, though.)
No, this won't compile. Expressions can't be invoked (TimeSpent is an expression). They need to be compiled into a delegate first. What happens under the hood when you invoke Expression.Compile() is that the Expression Tree is compiled down to IL which is injected into a DynamicMethod, for which you get a delegate then.
The following would work:
var q = TaskFiles.Select(t => new {
    t.TaskId,
    TimeSpent = TimeSpent.Compile().DynamicInvoke(t)
});
"This produces the expected output, except, a query per row is generated for the plugged expression. This is visible within LINQPad. Not good."
Why does that happen? Well, LINQ to SQL will need to fetch all TaskFiles, hydrate TaskFile instances and then run your selector against them in memory. You likely get a query per TaskFile because they contain one or more 1:m mappings.
While LTS allows projecting in memory for selects, it does not do so for Wheres (citation needed, this is to the best of my knowledge). When you think about it, this makes perfect sense: you would likely transfer a lot more data by filtering the whole database in memory than by transforming a subset of it in memory. (Though it creates query performance issues, as you see, which is something to be aware of when using an ORM.)
CompiledQuery.Compile() does something different. It compiles the query to SQL and the delegate it returns is only a placeholder Linq to SQL will use internally. You can't "invoke" this method in the CLR, it can only be used as a node in another expression tree.
So why does LTS generate an efficient query with the CompiledQuery.Compile'd expression then? Because it knows what this expression node does: it knows the SQL behind it. In the Expression.Compile case, it's just an InvocationExpression that invokes the DynamicMethod, as I explained previously.
Why does it require a DataContext Parameter? Yes, it's more convenient for creating full queries, but it's also because the Expression Tree compiler needs to know the Mapping to use for generating the SQL. Without this parameter, it would be a pain to find this mapping, so it's a very sensible requirement.
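The distinction between an opaque delegate and an inspectable expression can be sketched with a toy translator. This is JavaScript and a deliberately simplified model, not how LINQ to SQL is implemented; the node shapes and the `toSql` function are hypothetical. It shows why a query provider can splice a tree into a larger query but has nothing to translate when it meets an already compiled function.

```javascript
// Toy model, not LINQ to SQL: an "expression" is plain data that a
// translator can inspect, while a compiled function is opaque.
function field(name) { return { kind: 'field', name }; }
function constant(value) { return { kind: 'constant', value }; }
function binary(op, left, right) { return { kind: 'binary', op, left, right }; }

// Translate an expression tree into a SQL fragment.
function toSql(node) {
  if (typeof node === 'function') {
    // Comparable to meeting an invocation of a compiled delegate.
    throw new Error('opaque function has no translation to SQL');
  }
  switch (node.kind) {
    case 'field': return node.name;
    case 'constant': return String(node.value);
    case 'binary': return `(${toSql(node.left)} ${node.op} ${toSql(node.right)})`;
    default: throw new Error(`unknown node kind: ${node.kind}`);
  }
}

// Reusable fragment kept as a tree, so it can be spliced into a larger query.
const timeSpent = binary('-', field('DateEnded'), field('DateStarted'));
const filter = binary('>', timeSpent, constant(2));
const sql = toSql(filter);

// The same logic as an ordinary function can only be run in memory.
const timeSpentFn = (row) => row.DateEnded - row.DateStarted;
```

Calling `toSql(timeSpentFn)` throws, which mirrors the "no supported translation to SQL" failure: the translator can run the function, but it cannot look inside it.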
I'm surprised you've had no answers on this so far. CompiledQuery.Compile compiles and caches the query. That is why you see only one query being generated.
Not only is this not a hack, it is the recommended way!
Check out these MSDN articles for detailed info and example:
Compiled Queries (LINQ to Entities)
How to: Store and Reuse Queries (LINQ to SQL)
Update: (exceeded the limit for comments)
I did some digging in reflector & I do see DataContext being used. In your example, you're simply not using it.
Having said that, the main difference between the two is that the former creates a delegate (for the expression tree) and the latter creates the SQL, which gets cached, and actually returns a function (sort of). The first two expressions produce the query when you call Invoke on them, which is why you see several of them.
If your query doesn't change, but only the DataContext and parameters do, and if you plan to use it repeatedly, CompiledQuery.Compile will help. Compiling is expensive, so for one-off queries there is no benefit.
TaskFiles.Select(t => new {
t.TaskId,
TimeSpent = TimeSpent(t),
})
This isn't a LinqToSql query, as there is no DataContext instance. Most likely you are querying some EntitySet, which does not implement IQueryable.
Please post complete statements, not statement fragments. (I see invalid comma, no semicolon, no assignment).
Also, try this:
var query = myDataContext.TaskFiles
    .Where(tf => tf.Parent.Key == myParent.Key)
    .Select(t => new {
        t.TaskId,
        TimeSpent = TimeSpent(t)
    });
// where myParent is the source of the EntitySet and Parent is a relational property.
// and Key is the primary key property of Parent.

Fetching strategy encapsulation for Entity Framework 4.1 and NHibernate

I created a project to test out NHibernate 3+ vs. Entity Framework 4.1, wrapping it in a repository, making it very testable using interfaces etc.
I do not want to expose either ORM outside of the repositories (I do not even expose IQueryables). Everything should be handled in that layer and until I tried to handle fetching in an abstract way, everything was good.
Microsoft's implementation of adding eager loading uses either magic strings (yuck) or Linq expressions (yay) on the Include function. Their syntax follows something like this:
IQueryableThing.Include(o => o.Person);
IQueryableThing.Include(o => o.Company.Contact);
IQueryableThing.Include(o => o.Orders.Select(p => p.LineItem.Cost));
The first will just load the associated person. (parent)
The second will load the associated company and each company's contact. (parent and grandparent).
The third will load all associated orders, line items and costs for each order.
It's a pretty slick implementation.
NHibernate uses a slightly different approach. They still use Linq expressions, but they make heavier use of extension methods (fluent approach).
IQueryableThing.Fetch(o => o.Person);
IQueryableThing.Fetch(o => o.Company).ThenFetch(o => o.Contact);
IQueryableThing.FetchMany(o => o.Orders).ThenFetch(p => p.LineItem).ThenFetch(q => q.Cost);
(I'm not sure if the third line is the correct syntax.)
I can encapsulate a list of expressions in a separate class and then apply those expression to the IQueryable within that class. So what I would need to do is standardize on the Microsoft expression syntax and then translate that into NHibernate's syntax by walking the expression tree and rebuilding each expression.
This is the part that's really tricky. I have to maintain a particular order of operations in order to call the correct function for the IQueryable (must start with either Fetch or FetchMany, with each subsequent being a "ThenFetch" or "ThenFetchMany"), which stops me from using the built-in ExpressionVisitor class.
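The ordering constraint itself can be sketched independently of the expression-tree walking. The following JavaScript is only an illustration: `planFetchCalls` and the `isCollection` flag are hypothetical, and a real implementation would derive that information from the expression or the mapping and then build the actual NHibernate calls.

```javascript
// Hypothetical sketch: turn one include path into the ordered list of
// fetch calls. `isCollection` is made-up metadata for this example.
function planFetchCalls(path) {
  return path.map((step, index) => {
    // The first step must be Fetch or FetchMany, every later step a ThenFetch variant.
    const prefix = index === 0 ? 'Fetch' : 'ThenFetch';
    const suffix = step.isCollection ? 'Many' : '';
    return { method: prefix + suffix, property: step.property };
  });
}

// The third Include example, o => o.Orders.Select(p => p.LineItem.Cost), as a path.
const plan = planFetchCalls([
  { property: 'Orders', isCollection: true },
  { property: 'LineItem', isCollection: false },
  { property: 'Cost', isCollection: false },
]);
```

For this path the plan comes out as FetchMany(Orders), ThenFetch(LineItem), ThenFetch(Cost), which is the same chain as the third Fetch line in the question.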
Edit:
I finally created an expression parser that will take any level of nesting of properties, collections, and selects on collections and produce an array of expressions. Unfortunately, the built-in Fetch extension methods do not take a LambdaExpression as a parameter.
The part I am currently stuck on is not being able to use the built-in Fetch definitions from NHibernate. It looks like I may have to hit the Remotion library's functions directly or register my own extension methods that will satisfy their parser.
Funky.
Have you tried using NHibernateUtil.Initialize()? I haven't attempted to do what you are doing, but I think Initialize will work akin to Include().

Group functions of similar functionality

Sometimes I come across this problem where you have a set of functions that obviously belong to the same group. Those functions are needed at several places, and often together.
To give a specific example: consider the filemtime, fileatime and filectime functions. They all provide a similar functionality. If you are building something like a filemanager, you'll probably need to call them one after another to get the info you need. This is the moment that you get thinking about a wrapper. PHP already provides stat, but suppose we don't have that function.
I looked at the php sourcecode to find out how they solved this particular problem, but I can't really find out what's going on.
A naive implementation of such a grouping function, say filetimes, would look like this:
function filetimes($file) {
    return array(
        'filectime' => filectime($file)
        ,'fileatime' => fileatime($file)
        ,'filemtime' => filemtime($file)
    );
}
This would work, but incurs overhead since you would have to open a file pointer for each function call. (I don't know if it's necessary to open a file pointer, but let's assume that for the sake of the example).
Another approach would be to duplicate the code of the fileXtime functions and let them share a file pointer, but this obviously introduces code duplication, which is probably worse than the overhead introduced in the first example.
The third, and probably best, solution I came up with is to add an optional second parameter to the fileXtime functions to supply a filepointer.
The filetimes function would then look like this:
function filetimes($file) {
    $fp = fopen($file, 'r');
    $times = array(
        'filectime' => filectime($file, $fp)
        ,'fileatime' => fileatime($file, $fp)
        ,'filemtime' => filemtime($file, $fp)
    );
    fclose($fp); // release the shared file pointer before returning
    return $times;
}
Somehow this still feels 'wrong'. There's this extra parameter that is only used in some very specific conditions.
So basically the question is: what is best practice in situations like these?
Edit:
I'm aware that this is a typical situation where OOP comes into play. But first off: not everything needs to be a class. I always use an object oriented approach, but I also always have some functions in the global space.
Let's say we're talking about a legacy system here (with these 'non-oop' parts) and there are lots of dependencies on the fileXtime functions.
tdammer's answer is good for the specific example I gave, but does it extend to the broader problem set? Can a solution be defined such that it is applicable to most other problems in this domain?
Use classes, Luke.
I'd rewrite the fileXtime functions to accept either a filename or a file handle as their only parameter. Languages that can overload functions (like C++, C# etc) can use this feature; in PHP, you'd have to check for the type of the argument at run time.
Passing both a filename and a file handle would be redundant, and ambiguous calls could be made:
$fp = fopen('foo', 'r');
$times = file_times('bar', $fp);
Of course, if you want to go OOP, you'd just wrap them all in a FileInfo class, and store a (lazy-loaded?) private file handle there.
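A rough sketch of that FileInfo idea, written in JavaScript rather than PHP. The class layout and the injected `stat` function are assumptions made for illustration; in Node, `fs.statSync` could be passed in, since it returns an object with `ctime`, `atime` and `mtime`. The expensive lookup happens once, lazily, and every getter shares the result.

```javascript
// Sketch of the FileInfo idea: one lazily performed stat call feeds all getters.
// `stat` is an injected function (for example fs.statSync in Node), so the
// class does not depend on how the file information is obtained.
class FileInfo {
  constructor(path, stat) {
    this.path = path;
    this.stat = stat;
    this.cachedStats = null; // filled on first access
  }

  // Private by convention: performs the lookup at most once.
  stats() {
    if (this.cachedStats === null) {
      this.cachedStats = this.stat(this.path);
    }
    return this.cachedStats;
  }

  get ctime() { return this.stats().ctime; }
  get atime() { return this.stats().atime; }
  get mtime() { return this.stats().mtime; }

  // Equivalent of the filetimes() grouping function from the question.
  times() {
    return { filectime: this.ctime, fileatime: this.atime, filemtime: this.mtime };
  }
}
```

Usage would look like `new FileInfo('foo', require('fs').statSync).times()`. Callers that need only one value use the matching getter, and no extra parameter leaks into the individual functions.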