How does linq-to-sql generate sql for collection pseudoqueries? - linq-to-sql

My understanding is that the LinqToSql pseudolanguage describes a set using a syntax very similar to SQL and this will allow you to efficiently update a property on a collection of objects:
from b in BugsCollection where b.status = 'closed' set b.status = 'open'
This would update the underlying database using just one SQL statement.
Normally an ORM needs to retrieve all of the rows as separate objects, update attributes on each of them and save them individually to the database (at least that's my understanding).
So, how does linq-to-sql avoid having to do this when other ORMs are not able to avoid it?

The syntax shown in your question is incorrect. LINQ is not intended to have side-effects; it is a query language. The proper way to accomplish what you're looking for is
var x = from b in dataContext.BugsCollection where b.status == "closed" select b;
foreach (var y in x)
    y.status = "open";
dataContext.SubmitChanges();
Because of deferred execution, the L2S engine doesn't actually talk to the database until it has to: the SELECT runs when the foreach enumerates x, and nothing is written until SubmitChanges() is called. Note, however, that SubmitChanges() sends one UPDATE statement per modified row, not a single set-based UPDATE. To update every matching row in one statement you have to send the SQL yourself, for example through DataContext.ExecuteCommand().
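Deferred execution itself is easy to observe without a database. The following is a minimal sketch using LINQ to Objects rather than LINQ to SQL; the Bug class and the DeferredExecutionDemo helper are hypothetical stand-ins, not part of any answer above. It only illustrates that defining a query does not run it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a bug entity; not a real LINQ to SQL class.
public class Bug
{
    public int Id { get; set; }
    public string Status { get; set; }
}

public static class DeferredExecutionDemo
{
    // Returns how many bugs the query yields when it is finally enumerated.
    public static int CountClosedAfterLateChange()
    {
        var bugs = new List<Bug>
        {
            new Bug { Id = 1, Status = "closed" },
            new Bug { Id = 2, Status = "open" },
        };

        // Defining the query does not execute it.
        IEnumerable<Bug> closed = bugs.Where(b => b.Status == "closed");

        // This change is made after the query was defined...
        bugs[1].Status = "closed";

        // ...but it is still picked up, because the filter only runs here.
        return closed.Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountClosedAfterLateChange()); // prints 2
    }
}
```

With LINQ to SQL the same principle applies to the SELECT: it is sent when the query is enumerated, not when it is declared.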

LINQ to SQL uses expression trees to convert your query syntax to actual SQL. It then executes that SQL against the database (rather than pulling all of the data, executing against the in-memory data, and then writing the changes back to the database).
For example, the following Query Syntax:
var records = from r in Records
              where r.Property == value
              select r;
Gets translated first to lambda syntax:
Records.Where(r => r.Property == value);
And finally to SQL (via expression trees):
SELECT Property, Property2, Property3 FROM Record WHERE Property = @value
...granted, the example doesn't update anything...but the process would be the same for an update query as opposed to a simple select.
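To make the expression tree step more concrete, here is a small sketch that inspects a predicate as data instead of executing it. The Record class and the Describe helper are assumptions made up for this illustration; a real provider such as LINQ to SQL walks far more node types than this.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Record
{
    public int Property { get; set; }
}

public static class ExpressionTreeDemo
{
    // Describes a simple "member == constant" predicate by walking its tree.
    // It assumes exactly that shape and would throw on anything else.
    public static string Describe(Expression<Func<Record, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        var column = (MemberExpression)comparison.Left;
        var constant = (ConstantExpression)comparison.Right;
        return column.Member.Name + " " + comparison.NodeType + " " + constant.Value;
    }

    public static void Main()
    {
        // Because the parameter type is Expression<...>, the compiler emits a
        // data structure describing the lambda rather than compiled code.
        Console.WriteLine(Describe(r => r.Property == 42)); // prints "Property Equal 42"
    }
}
```

A query provider does the same kind of walk and emits a column name, an operator and a parameter instead of a description string.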

Related

CI active record style sql queries

I am new to CodeIgniter and like its Active Record feature. Are there any useful steps, tips, or guidance on how to convert my previously written plain SQL queries to CI style? For example, this is one of my previously written queries:
SELECT *
FROM hs_albums
WHERE id NOT IN (SELECT album_id
FROM hs_delete_albums
WHERE user_id = 72
AND del_type = 1)
AND ( created = 72
OR club_id IN (SELECT cbs.id
FROM hs_clubs cbs
INNER JOIN hs_club_permissions cbp
ON cbs.id = cbp.club_id
WHERE cbp.user_id = 72
AND cbp.status = 2)
OR group_id IN (SELECT gps.id
FROM hs_groups gps
INNER JOIN hs_group_permissions grp
ON gps.id = grp.group_id
WHERE grp.user_id = 72
AND grp.status = 2)
OR comp_id IN (SELECT cmp.id
FROM hs_companies cmp
INNER JOIN hs_comp_permissions comp
ON cmp.id = comp.comp_id
WHERE comp.user_id = 72
AND comp.status = 2) )
The short answer is: You don't.
CodeIgniter's Active Record implementation is basically a layer on top of SQL that makes writing queries easier by:
Automatically escaping values
Automatically generating the appropriate query syntax for the database, so that the application can be more easily ported between databases (for instance, if you didn't use Active Record to write a query, and then wanted to move from MySQL to PostgreSQL, then you might well need to rewrite the query to make it work with PostgreSQL)
Providing a syntax for queries in PHP directly, thus avoiding the context switching between PHP and SQL.
However, it can't do everything SQL can do, and while I would always try to use ActiveRecord where possible, there comes a point where you're better off forgetting about using it and just using $this->db->query() to write your query directly. In this case, as mamdouh alramadan has said, CodeIgniter doesn't support subqueries so you can't replicate this query using ActiveRecord anyway.
The thing to remember is that ActiveRecord is a tool, not a requirement. If you're using CodeIgniter and aren't using an ORM instead, you should use it for the reasons mentioned above. However, once it starts getting in the way, you should consider whether it would be better practice to write your query manually instead.

Generic SQL for update / insert

I'm writing a DB layer which talks to MS SQL Server, MySQL & Oracle. I need an operation which can update an existing row if it contains certain data, otherwise insert a new row; All in one SQL operation.
Essentially I need to save over existing data if it exists, or add it if it doesn't
Conceptually this is the same as upsert except it only needs to work on a single table. I'm trying to make sure I don't need to delete then insert as this has a performance impact.
Is there generic SQL to do this or do I need vendor specific solutions?
Thanks.
You need vendor specific SQL as MySQL (unlike MS and Oracle) doesn't support MERGE
http://en.wikipedia.org/wiki/Merge_(SQL)
I suspect that sooner rather than later, you're going to need a vendor specific implementation of your DB layer - SQL portability is pretty much a myth as soon as you do anything even slightly advanced.
I am pretty sure this is going to be vendor specific. For SQL Server, you can accomplish this using the MERGE statement.
If you are using SQL Server 2008, use the MERGE statement. But keep in mind that if the insert part involves some condition, it cannot be used, in which case you need to write your own way of accomplishing this. And in your case you have to anyway, since you are also targeting MySQL, which does not have a MERGE statement.
Why are you not using an ORM layer (like Entity Framework) for this purpose?
Just some pseudocode (in C#):
public int SaveTask(tblTaskActivity task, bool isInsert)
{
    int result = 0;
    using (var tmsEntities = new TMSEntities())
    {
        if (isInsert) // insert a new row
        {
            tmsEntities.AddTotblTaskActivities(task);
            result = tmsEntities.SaveChanges();
        }
        else // update the existing row
        {
            var taskActivity = tmsEntities.tblTaskActivities.FirstOrDefault(i => i.TaskID == task.TaskID);
            if (taskActivity != null) // FirstOrDefault returns null when no row matches
            {
                taskActivity.Priority = task.Priority;
                taskActivity.ActualTime = task.ActualTime;
                result = tmsEntities.SaveChanges();
            }
        }
    }
    return result;
}
In MySQL you have something similar to merge:
insert ... on duplicate key update ...
MySQL Reference - Insert on duplicate key update

Two LINQ data contexts issue

I'm getting this error when using LINQ2SQL:
The query contains references to items defined on a different data context.
Here's the code:
var instances = (from i in context.List
                 join j in context.CatsList on i.ListID equals j.ListID
                 join c in context.Cats on j.CatID equals c.CatID
                 where c.SID == Current.SID
                 orderby i.Title
                 select i).Distinct();
The problem, as far as I can ascertain, is that the Current object is actually a LINQ2SQL object returned from a property executing a different LINQ statement.
So, therefore, LINQ2SQL doesn't like executing a query on the database where the query has to be built from one LINQ statement including another statement's result.
My problem with that is that (I'll try to summarise the issue here) the Current object is retrieved using the same context as the query above and ultimately the Current.SID should simply resolve to an int, so what is the compiler's problem with executing it?
In short, why is it not possible to execute a LINQ query using a previous query's returned object as an argument?
This is a solution to the issue, rather than a direct answer of your final question, but you can probably get by with:
var sid = Current.SID;
var instances = (from i in context.List
                 join j in context.CatsList on i.ListID equals j.ListID
                 join c in context.Cats on j.CatID equals c.CatID
                 where c.SID == sid
                 orderby i.Title
                 select i).Distinct();

LinqToSql - Parallel - DataContext and Parallel

In .NET 4 and multicore environment, does the linq to sql datacontext object take advantage of the new parallels if we use DataLoadOptions.LoadWith?
EDIT
I know linq to sql does not parallelize ordinary queries. What I want to know is when we specify DataLoadOption.LoadWith, does it use parallelization to perform the match between each entity and its sub entities?
Example:
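using (MyDataContext context = new MyDataContext())
{
    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Product>(p => p.Category);
    context.LoadOptions = options; // the options have no effect unless assigned to the context
    // Materialize the results before the context is disposed.
    return context.Products.Where(p => p.SomeCondition).ToList();
}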
generates the following sql:
Select Id,Name from Categories
Select Id,Name, CategoryId from Products where p.SomeCondition
when all the products are created, will we have a
categories.ToArray();
Parallel.ForEach(products, p =>
{
    p.Category = categories.FirstOrDefault(c => c.Id == p.CategoryId);
});
or
categories.ToArray();
foreach (Product product in products)
{
    product.Category = categories.FirstOrDefault(c => c.Id == product.CategoryId);
}
?
No, LINQ to SQL does not. There is little to parallelize on the .NET side. All LINQ to SQL does is translate expression trees to SQL queries. SQL Server will execute those SQL statements, and is able to do this in parallel.
Don't forget that while you can do something like this with your LINQ to SQL LINQ query, it isn't a good idea:
// BAD CODE!!! Don't parallelize a LINQ to SQL query
var q =
    from customer in db.Customers.AsParallel()
    where customer.Id == 5
    select customer;
While this yields the correct results, you won't get the performance improvement you are hoping for. PLINQ isn't able to handle IQueryable objects, so it treats an IQueryable as a plain IEnumerable (and simply iterates it). It will process the db.Customers collection in parallel and use multiple threads to filter the customers. While this sounds okay, it means the complete collection of customers is retrieved from the database! Without the AsParallel construct, LINQ to SQL would be able to optimize this query by adding WHERE id = @ID to the SQL. SQL Server would then be able to use indexes (and possibly multiple threads) to fetch the values.
While there is some room for optimization inside the LINQ to SQL engine when it comes to matching entities to their sub entities, there doesn't seem to be any such optimization in the framework currently (or at least, I wasn't able to find any using Reflector).
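The switch from IQueryable to in-memory processing can be checked without a database. This is a rough sketch that uses an in-memory list exposed through AsQueryable() as a stand-in for db.Customers; the Customer class and the AsParallelDemo helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity standing in for a LINQ to SQL Customer.
public class Customer
{
    public int Id { get; set; }
}

public static class AsParallelDemo
{
    private static IQueryable<Customer> Source()
    {
        return new List<Customer> { new Customer { Id = 5 } }.AsQueryable();
    }

    // Without AsParallel the result is still an IQueryable, so a provider
    // could translate the filter instead of running it in memory.
    public static bool PlainQueryIsQueryable()
    {
        object q = Source().Where(c => c.Id == 5);
        return q is IQueryable<Customer>;
    }

    // AsParallel accepts the source as a plain IEnumerable and returns a
    // ParallelQuery, so the filter can no longer reach the provider.
    public static bool ParallelQueryIsQueryable()
    {
        object q = Source().AsParallel().Where(c => c.Id == 5);
        return q is IQueryable<Customer>;
    }

    public static void Main()
    {
        Console.WriteLine(PlainQueryIsQueryable());    // True
        Console.WriteLine(ParallelQueryIsQueryable()); // False
    }
}
```

Once the query stops being an IQueryable, the where clause is ordinary in-memory filtering over every element the source yields.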

Insert/Select with Linq-To-SQL

Is there a way to do an insert/select with Linq that translates to this sql:
INSERT INTO TableA (...)
SELECT ...
FROM TableB
WHERE ...
Yes, @bzlm covered it first, but if you prefer something a bit more verbose:
// dc = DataContext, assumes TableA contains items of type A
var toInsert = from b in TableB
               where ...
               select new A
               {
                   ...
               };
TableA.InsertAllOnSubmit(toInsert);
dc.SubmitChanges();
I kind of prefer this from a review/maintenance point of view as I think it's a bit more obvious what's going on in the select.
In response to the observation by @JfBeaulac:
Please note that this will not generate the SQL shown. So far as I'm aware it's not actually possible to generate it directly using LINQ (to SQL); you'd have to bypass LINQ and go straight to the database. Functionally it should achieve the same result in that it will perform the select and will then insert the data, but it will round-trip the data from the server to the client and back, so it may not be optimal for large volumes of data.
context
    .TableA
    .InsertAllOnSubmit(
        context
            .TableB
            .Where( ... )
            .Select(b => new A { ... })
    );