Dynamically searching using LINQ and dynamically created predicates - linq-to-sql

I have some dynamically built predicates with the following signatures, which are passed as parameters into a function:
Expression<Func<TblTableR, bool>> TableRPredicate,
Expression<Func<TblTableN, bool>> suspectNamesPredicate,
Expression<Func<TblTableS, bool>> TableSPredicate,
Expression<Func<TblTableI, bool>> suspectTableIPredicate,
I am trying to query using the following:
var registries = (from r in db.TblTableR.Where(TableRPredicate)
join s in db.TblTableS.Where(TableSPredicate)
on r.TblTableRID equals s.TblTableSTableRID into ss
from suspects in ss.DefaultIfEmpty()
join si in db.TblTableI.Where(suspectTableIPredicate)
on suspects.TblTableSIndexCardID equals si.TblTableIID into sisi
from suspectTableI in sisi.DefaultIfEmpty()
join sn in db.TblTableN.Where(suspectNamesPredicate)
on suspectTableI.TblTableIID equals sn.TblTableNIndexCardID into snsn
from suspectNames in snsn.DefaultIfEmpty()
select r.TblTableRID).Distinct();
This has the result of putting any generated "where" clause into the JOIN statement eg:
left outer join tblTableI on tblTableITableRID = tblTableRID
AND (expression created by predicate)
What is actually happening is that the final SQL that is created is incorrect. It is creating the following type of sql
select * from table1 left outer join table2 on field1 = field2
AND field3 = 'CRITERIA'
It is this AND clause on the end that is the problem - ending up returning too many rows. Essentially I would like to get the where clause into the statement and not have it stick the extra condition into the join.
Something like this:
select * from table1 left outer join table2 on field1 = field2
WHERE field3 = 'CRITERIA'
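To make the difference concrete, here is a minimal in-memory sketch (LINQ to Objects, not LINQ to SQL). The table1/table2 arrays and their Field1/Field2/Field3 properties are hypothetical stand-ins for the tables above. Filtering the join source behaves like the extra AND in the ON clause, while filtering after DefaultIfEmpty() behaves like the WHERE clause:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for table1 and table2.
var table1 = new[] { new { Field1 = 1 }, new { Field1 = 2 }, new { Field1 = 3 } };
var table2 = new[]
{
    new { Field2 = 1, Field3 = "CRITERIA" },
    new { Field2 = 2, Field3 = "OTHER" },
};

// Filter applied to the join source: like "ON field1 = field2 AND field3 = 'CRITERIA'".
// Every table1 row survives; rows without a match get a null right-hand side.
var filteredInJoin =
    (from a in table1
     join b in table2.Where(t => t.Field3 == "CRITERIA")
         on a.Field1 equals b.Field2 into matches
     from m in matches.DefaultIfEmpty()
     select new { a.Field1, Field3 = m?.Field3 }).ToList();

// Filter applied after the outer join: like "WHERE field3 = 'CRITERIA'".
// Rows whose right-hand side is null or does not match are dropped.
var filteredAfterJoin =
    (from a in table1
     join b in table2 on a.Field1 equals b.Field2 into matches
     from m in matches.DefaultIfEmpty()
     where m != null && m.Field3 == "CRITERIA"
     select new { a.Field1, m.Field3 }).ToList();

Console.WriteLine($"Filter in join source: {filteredInJoin.Count} rows");  // 3 rows
Console.WriteLine($"Filter after the join: {filteredAfterJoin.Count} rows"); // 1 row
```

The second form is the shape I am after, but with the filter supplied as one of the dynamically built predicates.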
I have tried adding a Where clause in as follows:
...
...
...
select r.TblTableRID).Where(TableRPredicate).Distinct();
but that will not compile because of the generic parameters on each predicate.
If I modify my LINQ query to only select from one table and use a predicate, the WHERE clause is generated correctly.
Any ideas?

(edited post clarification)
Step 1; change the final select to select all three entities into an anonymous type; for me (testing on Northwind), that is:
select new {emp, cust, order};
Step 2; apply your filters to this using the extension method I've added below; for me this filtering looks like:
var qry2 = qry.Where(x => x.emp, employeeFilter)
.Where(x => x.cust, custFilter)
.Where(x => x.order, orderFilter);
Step 3; now select the entity/entities you actually want from this filtered query:
var data = qry2.Select(x => x.order)
And here's the extension method:
static IQueryable<T> Where<T,TValue>(
this IQueryable<T> source,
Expression<Func<T, TValue>> selector,
Expression<Func<TValue, bool>> predicate)
{
var row = Expression.Parameter(typeof (T), "row");
var member = Expression.Invoke(selector, row);
var lambda = Expression.Lambda<Func<T, bool>>(
Expression.Invoke(predicate, member), row);
return source.Where(lambda);
}

Related

Linq to Sql - Count, sub-query and subtraction

I am trying to convert the following T-SQL statement into Linq to Sql but am having trouble with the subtraction from the count. The final select will be a single row and single column (int)
I have done the SQL in two ways (sub-query and by JOIN/GROUP) which both return the same result, although I think the former might be the 'easier' option...
SQL 1 using a sub-query...
SELECT e.Places - ( SELECT COUNT(*) FROM [Event Participants] ep WHERE ep.E__ID = x AND ep.EP_STAT IN ('B','C')) AS AvailablePlaces
From Events e
WHERE e.E__ID = x
SQL 2 using GROUP BY and JOIN...
SELECT e.Places - COUNT(ep.E__ID) AS AvailablePlaces
FROM Events e
JOIN [Event Participants] ep ON e.E__ID = ep.E__ID
WHERE e.E__ID = x AND ep.EP_STAT IN ('B','C')
GROUP BY e.Places
Something like
var array = new string[] { "B", "C" };
var result = (from e in Event where e.E__ID == x
let count = (from ep in Event_Participants
where ep.E__ID == e.E__ID &&
array.Contains(ep.EP_Stat)
select ep).Count()
select e.Places - count
)
.Single();
Depending on your model, it might be possible to use navigation properties in the subquery.

Dapper batch queries instead of a single query executed many times

I'm trying to optimize some queries, and I have this crazy one. The basic idea is I get a bunch of rooms which has some corresponding meetings. I currently run a query to get all the rooms, then foreach room I need to get the meetings, where I do a query for each room. This opens up for a lot of database connections (i.e. 1000 rooms each having to open a connection to pull the meetings), and I'd like to do it as a batch instead. I am using dapper to map my queries to models and I'm trying to use the list parameters described here
SELECT
mm.id,
mm.organizer_name as Organizer,
mm.subject as Subject,
mm.start_time as StartTime,
mm.end_time as EndTime,
(mm.deleted_at IS NOT NULL) as WasCancelled,
(am.interactive = 0 AND am.cancelled_at IS NOT NULL) as WasNoShow,
c.name as name
FROM master_meeting mm
LEFT JOIN master_meeting__exchange mme ON mme.id=mm.id
LEFT JOIN master_meeting__forwarded_exchange mmfe ON mmfe.id=mm.id
LEFT JOIN meeting_instance__exchange mie ON mie.meeting_id=mm.id
LEFT JOIN meeting_instance__forwarded_exchange mife ON mife.meeting_id=mm.id
LEFT JOIN appointment_meta__exchange ame ON mie.item_id=ame.item_id
LEFT JOIN appointment_meta__exchange ame2 ON mife.item_id=ame2.item_id
LEFT JOIN appointment_meta am ON am.id=ame.id
LEFT JOIN appointment_meta am2 ON am2.id=ame2.id
LEFT JOIN calendar c on mie.calendar_id=c.id
WHERE mie.calendar_id = @Id OR mife.calendar_id=@Id
AND mm.start_time BETWEEN @StartTime AND @EndTime
Without going into details of the crazy long join sequence, I currently have to do this query, a lot. It has been written up initially as:
List<Result> resultSet = new List<Result>();
foreach(int id in idList){
resultSet.AddRange(
_queryHandler.Handle(
new MeetingQuery(id, "FixedStartTime", "FixedEndTime")
)
);
}
Which in turn calls this a bunch of times and runs the query:
_connection.Query<Meeting>(sql,
new {
Id = query.id,
StartTime = query.StartTime,
EndTime = query.EndTime
}
);
This obviously requires quite a few database connections, and I'd like to avoid this by having dapper doing multiple queries, but I get the following error if I try to add the parameters as a list which looks like this:
class Parameters {
int Id;
string StartTime;
string EndTime;
}
List<Parameters> parameters = new List<Parameters>();
foreach(int id in idList)
parameters.Add(new Parameters(id, "SameStartTime", "SameEndTime"));
Then I would use the list of parameters as this:
_connection.Query<Meeting>(sql,parameters);
The error I get is:
dapper Additional information: An enumerable sequence of parameters (arrays, lists, etc) is not allowed in this context
Firstly, it's possible to reuse a single connection for multiple queries, so you could retrieve all of your data with multiple Dapper "Query" calls using the same connection.
Something like the following (which isn't the exact same query as you showed since I was testing this on my own computer with a local database; it should be easy enough to see how it could be altered to work with your query, though) -
private static IEnumerable<Record> UnbatchedRetrieval(IEnumerable<Parameters> parameters)
{
var allResults = new List<Record>();
using (var conn = GetConnection())
{
foreach (var parameter in parameters)
{
allResults.AddRange(
conn.Query<Record>(
"SELECT Id, Title FROM Posts WHERE Id = @id",
parameter
)
);
}
}
return allResults;
}
public class Parameters
{
public int Id { get; set; }
}
However, if it really is the number of queries that you want to reduce through batching then there isn't anything in Dapper that makes it very easy to do since each parameter must be uniquely named, which won't be the case if you provide multiple instances of a type as the "parameters" value (since there will be "n" Id values that are all called "Id", for example).
You could do something a bit hacky to produce a single query string that will return results from multiple parameter sets, such as the following -
private static IEnumerable<Record> BatchedRetrieval(IEnumerable<Parameters> parameters)
{
using (var conn = GetConnection())
{
var select = "SELECT Id, Title FROM Posts";
var where = "Id = {0}";
var sqlParameters = new DynamicParameters();
var combinedWheres =
"(" +
string.Join(
") OR (",
parameters.Select((parameter, index) =>
{
sqlParameters.Add("id" + index, parameter.Id);
return string.Format(where, "@id" + index);
})
) +
")";
return conn.Query<Record>(
select + " WHERE " + combinedWheres,
sqlParameters
);
}
}
public class Parameters
{
public int Id { get; set; }
}
.. but this feels a bit dirty. It might be an option to explore, though, if you are absolutely sure that performing those queries one-by-one is a performance bottleneck.
Another thing to consider - when you need the data for 1000 different ids, are the start and end times always the same for each of the 1000 queries? If so, then you could possibly change your query to the following:
private static IEnumerable<Record> EfficientBatchedRetrieval(
IEnumerable<int> ids,
DateTime startTime,
DateTime endTime)
{
using (var conn = GetConnection())
{
return conn.Query<Record>(
@"SELECT
mm.id,
mm.organizer_name as Organizer,
mm.subject as Subject,
mm.start_time as StartTime,
mm.end_time as EndTime,
(mm.deleted_at IS NOT NULL) as WasCancelled,
(am.interactive = 0 AND am.cancelled_at IS NOT NULL) as WasNoShow,
c.name as name
FROM master_meeting mm
LEFT JOIN master_meeting__exchange mme ON mme.id=mm.id
LEFT JOIN master_meeting__forwarded_exchange mmfe ON mmfe.id=mm.id
LEFT JOIN meeting_instance__exchange mie ON mie.meeting_id=mm.id
LEFT JOIN meeting_instance__forwarded_exchange mife ON mife.meeting_id=mm.id
LEFT JOIN appointment_meta__exchange ame ON mie.item_id=ame.item_id
LEFT JOIN appointment_meta__exchange ame2 ON mife.item_id=ame2.item_id
LEFT JOIN appointment_meta am ON am.id=ame.id
LEFT JOIN appointment_meta am2 ON am2.id=ame2.id
LEFT JOIN calendar c on mie.calendar_id=c.id
WHERE mie.calendar_id IN @Ids OR mife.calendar_id IN @Ids
AND mm.start_time BETWEEN @StartTime AND @EndTime",
new { Ids = ids, StartTime = startTime, EndTime = endTime }
);
}
}
There may be a problem with this if you call it with large numbers of ids, though, due to the way that Dapper converts the IN clause - as described in https://stackoverflow.com/a/19938414/3813189 (where someone warns against using it with large sets of values).
If that approach fails then it might be possible to do something similar to the temporary table bulk load suggested here: https://stackoverflow.com/a/9947259/3813189, where you get all of the keys that you want data for into a temporary table and then perform a query that joins on to that table for the keys (and then deletes it again after you have the data).

Convert sql query with a join on a subselect to a linq statement

I am trying to convert the following sql query to LINQ statement
SELECT t.*
FROM (
SELECT Unique_Id, MAX(Version) mversion
FROM test
GROUP BY Unique_Id
) m INNER JOIN
test t ON m.Unique_Id = t.Unique_Id AND m.mversion = t.Version
LINQ statement
var testalt = (from altt in CS.test
group altt by altt.Unique_Id into g
join bp in CS.alerts on g.FirstOrDefault().Unique_Id equals bp.Unique_Id
select new ABCBE
{
ABCName= bp.Name,
number = bp.Number,
Unique_Id = g.Key,
Version = g.Max(x=>x.Version)
});
I am getting an error of where clause. Please help
This is not a straightforward conversion, but you can accomplish the same thing using LINQ method syntax. The first query is only built as an expression tree, not executed. You then join that grouping against CS.alerts, which combines the expression tree from the CS.test query with the one for CS.alerts into a single query.
The expression tree is evaluated to build the query and execute said query upon enumeration. Enumeration in this case is the ToList() call but anything that gets a result from the enumeration will execute the query.
var query1 = CS.test.GroupBy(x => x.Unique_Id);
var joinResult = CS.alerts.Join(query1,
alert => new { ID = alert.Unique_Id, Version = alert.Version },
test => new { ID = test.Key, Version = test.Max(y => y.Version) },
(alert, test) => new ABCBE {
ABCName = alert.Name,
number = alert.Number,
Unique_Id = test.Key,
Version = test.Max(y => y.Version)
}).ToList();
Because query1 is still an IQueryable and you are using CS.alerts (which I'm guessing CS is your data context) it should join and build the query to execute upon the ToList() enumeration.

LINQ query help needed for Intersect

LINQ gurus, I am looking for help to write a query...
I have a table with Person records, and it has a nullable ParentID column, so it is kind of self-referencing, where each record might have a Parent.
I am looking for unprocessed rows whose parent rows were processed.
This SQL works fine:
SELECT *
FROM Person
where IsProcessed = 0 and
ParentId in
(
select Id from Person
where IsProcessed = 1
)
I tried a number of LINQ queries, but they failed. Now, I'm trying:
var qParent =
from parent in db.Person
where
parent.IsProcessed == true
select parent.ID;
var qChildren = from child in db.Person
where
child.IsProcessed == false
&& child.ParentId.HasValue
select child.ParentId.Value;
var q2 = qChildren.Intersect(qParent);
This yields SQL with a DISTINCT clause, for some reason, and I am baffled why DISTINCT is generated.
My main question is how to write LINQ for the SQL statement above?
Thanks in advance.
Intersect is a set operation - it is meant to return a set of distinct elements from the intersection. It seems reasonable to me that it would use DISTINCT in the SQL. There could be multiple children with the same parent, for example - Intersect should only return that ID once.
Is there any reason you don't want to use a join here?
var query = from parent in db.Person
where parent.IsProcessed
join child in db.Person.Where(child => !child.IsProcessed)
on parent.ID equals child.ParentId.Value
select child;
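To illustrate the set semantics of Intersect described above, here is a small in-memory sketch (LINQ to Objects). The two arrays are made-up sample data, not taken from the question:

```csharp
using System;
using System.Linq;

// Made-up sample data: ParentId values of unprocessed children
// (two children share parent 1) and the IDs of processed parents.
int[] childParentIds = { 1, 1, 2, 5 };
int[] processedParentIds = { 1, 2, 3 };

// Intersect is a set operation, so each common ID appears only once.
var intersected = childParentIds.Intersect(processedParentIds).ToList();

// A Contains-style filter keeps one entry per child, duplicates included.
var filtered = childParentIds.Where(id => processedParentIds.Contains(id)).ToList();

Console.WriteLine(string.Join(", ", intersected)); // 1, 2
Console.WriteLine(string.Join(", ", filtered));    // 1, 1, 2
```

That is why Intersect produces DISTINCT in the generated SQL, whereas a join or an IN-style filter returns one row per child.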
The query can be translated literally into :
var parentIds = db.Person.Where(x => x.IsProcessed)
.Select(x => x.Id)
.ToList();
var result = db.Person.Where(x => !x.IsProcessed && x.ParentId.HasValue && parentIds.Contains(x.ParentId.Value))
.ToList();

What's the Linq to SQL equivalent to TOP or LIMIT/OFFSET?

How do I do this
Select top 10 Foo from MyTable
in Linq to SQL?
Use the Take method:
var foo = (from t in MyTable
select t.Foo).Take(10);
In VB LINQ has a take expression:
Dim foo = From t in MyTable _
Take 10 _
Select t.Foo
From the documentation:
Take<TSource> enumerates source and yields elements until count elements have been yielded or source contains no more elements. If count exceeds the number of elements in source, all elements of source are returned.
In VB:
from m in MyTable
take 10
select m.Foo
This assumes that MyTable implements IQueryable. You may have to access that through a DataContext or some other provider.
It also assumes that Foo is a column in MyTable that gets mapped to a property name.
See http://blogs.msdn.com/vbteam/archive/2008/01/08/converting-sql-to-linq-part-7-union-top-subqueries-bill-horst.aspx for more detail.
Use the Take(int n) method:
var q = query.Take(10);
The OP actually mentioned offset as well, so, for example, if you'd like to skip the first 30 items and get the next 30 (items 31 to 60), you would do:
var foo = (from t in MyTable
select t.Foo).Skip(30).Take(30);
Use the "Skip" method for offset.
Use the "Take" method for limit.
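As a rough sketch of how Skip and Take combine into paging, the helper below is a hypothetical convenience function (Page, pageIndex and pageSize are names I made up, not part of LINQ). It is shown against an in-memory IQueryable, but the same calls apply to a LINQ to SQL table:

```csharp
using System;
using System.Linq;

// Hypothetical helper: pageIndex is zero-based.
// Skip is the offset, Take is the limit.
static IQueryable<T> Page<T>(IQueryable<T> source, int pageIndex, int pageSize) =>
    source.Skip(pageIndex * pageSize).Take(pageSize);

// In-memory stand-in for a table; ordering first keeps the pages deterministic.
var numbers = Enumerable.Range(1, 100).AsQueryable().OrderBy(n => n);

// Second page of 30 items: skips 30, takes the next 30.
var secondPage = Page(numbers, 1, 30).ToList();

Console.WriteLine($"{secondPage.First()}..{secondPage.Last()} ({secondPage.Count} items)"); // 31..60 (30 items)
```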
@Janei: my first comment here is about your sample ;)
I think that if you write it like this, you take 4 rows first and then apply the sort only to those 4:
var dados = from d in dc.tbl_News.Take(4)
orderby d.idNews descending
select new
{
d.idNews,
d.titleNews,
d.textNews,
d.dateNews,
d.imgNewsThumb
};
That is different from sorting the whole tbl_News by idNews descending and then taking 4:
var dados = (from d in dc.tbl_News
orderby d.idNews descending
select new
{
d.idNews,
d.titleNews,
d.textNews,
d.dateNews,
d.imgNewsThumb
}).Take(4);
Isn't it? The results may be different.
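A quick in-memory check of that difference, using made-up ids rather than the real tbl_News data:

```csharp
using System;
using System.Linq;

// Made-up ids in arbitrary storage order.
int[] ids = { 3, 9, 1, 7, 5, 8 };

// Take first, then sort: only the first four stored values are sorted.
var takeThenSort = ids.Take(4).OrderByDescending(id => id).ToList();

// Sort first, then take: the four largest values overall.
var sortThenTake = ids.OrderByDescending(id => id).Take(4).ToList();

Console.WriteLine(string.Join(", ", takeThenSort)); // 9, 7, 3, 1
Console.WriteLine(string.Join(", ", sortThenTake)); // 9, 8, 7, 5
```

So if the goal is "the latest 4 news items", the Take has to come after the orderby.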
This works well in C#
var q = from m in MyTable.Take(10)
select m.Foo;
Whether the take happens on the client or in the db depends on where you apply the take operator. If you apply it before you enumerate the query (i.e. before you use it in a foreach or convert it to a collection) the take will result in the "top n" SQL operator being sent to the db. You can see this if you run SQL profiler. If you apply the take after enumerating the query it will happen on the client, as LINQ will have had to retrieve the data from the database for you to enumerate through it
I do like this:
var dados = from d in dc.tbl_News.Take(4)
orderby d.idNews descending
select new
{
d.idNews,
d.titleNews,
d.textNews,
d.dateNews,
d.imgNewsThumb
};
You would use the Take(N) method.
Taking rows from the database without sorting them first is effectively a random take, so order before you take:
Array oList = ((from m in dc.Reviews
join n in dc.Users on m.authorID equals n.userID
orderby m.createdDate descending
where m.foodID == _id
select new
{
authorID = m.authorID,
createdDate = m.createdDate,
review = m.review1,
author = n.username,
profileImgUrl = n.profileImgUrl
}).Take(2)).ToArray();
I had to use the Take(n) method and then convert the result to a list. It worked like a charm:
var listTest = (from x in table1
join y in table2
on x.field1 equals y.field1
orderby x.id descending
select new tempList()
{
field1 = y.field1,
active = x.active
}).Take(10).ToList();
This way it worked for me:
var noticias = from n in db.Noticias.Take(6)
where n.Atv == 1
orderby n.DatHorLan descending
select n;