How to translate an advanced SQL statement with LEFT OUTER JOIN to LINQ? - linq-to-sql

I have an SQL statement similar to the one shown in this example. For the following table
CREATE TABLE [docs] (
[id] int NOT NULL,
[rev] int NOT NULL,
[content] varchar(200) NOT NULL,
PRIMARY KEY ([id],[rev])
) ;
and the following data
INSERT INTO [docs] ([id], [rev], [content]) VALUES
(1, 1, 'The earth is flat'),
(2, 1, 'One hundred angels can dance on the head of a pin'),
(1, 2, 'The earth is flat and rests on a bull''s horn'),
(1, 3, 'The earth is like a ball.');
the SQL statement
SELECT d1.*
FROM docs AS d1
LEFT OUTER JOIN docs AS d2
ON (d1.id = d2.id AND d1.rev < d2.rev)
WHERE d2.id is null
ORDER BY id;
shows only rows with maximum rev value for each id:
id rev content
1 3 The earth is like a ball.
2 1 One hundred angels can dance on the head of a pin
My question:
How can I translate this statement to LINQ-to-SQL? As I see it, the difficulty lies in the AND and the < in the ON clause.

In general, it is best to translate a SQL LEFT JOIN ... WHERE ... IS NULL into a NOT EXISTS, which in LINQ is a negated Any:
var ans = from d1 in docs
join d2 in docs on d1.id equals d2.id into d2j
where !d2j.Any(d2 => d1.rev < d2.rev)
orderby d1.id
select d1;
But, of course, you can translate it into an explicit LINQ null test:
using (MyDataContext context = new MyDataContext()) {
var query =
from d1 in context.docs
join d2 in context.docs on d1.id equals d2.id into d2j
from d2 in d2j.Where(d2_2 => d1.rev < d2_2.rev).DefaultIfEmpty()
where d2 == null
select d1;
return query.ToArray();
}
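To see that the anti-join and the NOT EXISTS form really select the same rows, here is a minimal sketch in Python using an in-memory SQLite database loaded with the question's sample data. SQLite stands in for SQL Server here, which is an assumption made purely for illustration; the query text is otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (
        id INT NOT NULL,
        rev INT NOT NULL,
        content VARCHAR(200) NOT NULL,
        PRIMARY KEY (id, rev)
    );
    INSERT INTO docs (id, rev, content) VALUES
        (1, 1, 'The earth is flat'),
        (2, 1, 'One hundred angels can dance on the head of a pin'),
        (1, 2, 'The earth is flat and rests on a bull''s horn'),
        (1, 3, 'The earth is like a ball.');
""")

# Original form: keep d1 only when no d2 with a higher rev exists for the same id.
anti_join = """
    SELECT d1.id, d1.rev, d1.content
    FROM docs AS d1
    LEFT OUTER JOIN docs AS d2 ON d1.id = d2.id AND d1.rev < d2.rev
    WHERE d2.id IS NULL
    ORDER BY d1.id
"""

# Equivalent form, which is what the LINQ !Any(...) version expresses.
not_exists = """
    SELECT d1.id, d1.rev, d1.content
    FROM docs AS d1
    WHERE NOT EXISTS (
        SELECT 1 FROM docs AS d2 WHERE d2.id = d1.id AND d1.rev < d2.rev
    )
    ORDER BY d1.id
"""

anti_join_rows = conn.execute(anti_join).fetchall()
not_exists_rows = conn.execute(not_exists).fetchall()
print(anti_join_rows)
print(anti_join_rows == not_exists_rows)
```

Both queries return the highest rev per id, so either shape can serve as the mental model when writing the LINQ query.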

I tried to apply @NetMage's recipe but I got stuck at the < condition:
using (MyDataContext context = new MyDataContext())
{
var query =
from d1 in context.docs
join d2 in context.docs on d1.id equals d2.id into jrv
from x in jrv.Where(x => /* ??? */).DefaultIfEmpty()
where x equals null
select x;
return query.ToArray();
}
The lambda expression in Where should be a comparison between d1.rev and d2.rev. How can I do that?

Related

Trying to join on set returning function

I'm trying to join two tables on IDs extracted from a JSON array stored in one of the tables. I found some topics on lateral joins, but I can't work out how to apply them to my case.
Or maybe there's other way to do it?
create table jsontable (response jsonb);
insert into jsontable values ('{"SCS":[{"customerId": 100, "referenceId": 215}, {"customerId": 120, "referenceId":544}, {"customerId": 400, "referenceId": 177}]}');
create table message (msg_id integer, status integer, content text);
insert into message values
(544, 1, 'Test'), (134, 1, 'Test2'), (177, 0, 'Test3'), (215, 1, 'Test4');
SELECT m.*
FROM jsontable t
JOIN message m ON m.msg_id = (jsonb_array_elements(t.response -> 'SCS')->>'referenceId')::int
and m.status = 1
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=8b3890efd34199f2356b6abca2f811c2
Of course it throws an ERROR: set-returning functions are not allowed in JOIN conditions
It looks like what you want to join against are the individual objects in the array, not the whole row. So use
SELECT m.*, obj
FROM jsontable t, jsonb_array_elements(t.response -> 'SCS') obj
JOIN message m ON m.msg_id = (obj->>'referenceId')::int AND m.status = 1;
or (a bit more readable imo)
SELECT m.*, obj
FROM jsontable t,
LATERAL jsonb_array_elements(t.response -> 'SCS') obj
JOIN message m ON m.msg_id = (obj->>'referenceId')::int
WHERE m.status = 1;
(updated fiddle)
(jsonb_array_elements(...)->>'referenceId')::int returns a set of integers, which cannot be compared for equality with the single integer m.msg_id.
Try this instead:
SELECT m.*
FROM jsontable t
JOIN message m ON m.msg_id IN (SELECT (jsonb_array_elements(t.response -> 'SCS')->>'referenceId')::int)
and m.status = 1
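What the lateral join in the answers above computes can be modelled in two steps: first expand every array element into its own row, then join those rows against message. The following plain-Python sketch uses the question's sample data; it is only an illustration of the semantics, not PostgreSQL code, and the variable names are made up for this example.

```python
import json

# One "row" of jsontable: the parsed jsonb value from the question.
jsontable = [json.loads(
    '{"SCS":[{"customerId": 100, "referenceId": 215},'
    ' {"customerId": 120, "referenceId":544},'
    ' {"customerId": 400, "referenceId": 177}]}'
)]

# message rows as (msg_id, status, content)
message = [(544, 1, "Test"), (134, 1, "Test2"), (177, 0, "Test3"), (215, 1, "Test4")]

# Step 1: the set-returning function in FROM yields one row per array element.
objs = [obj for response in jsontable for obj in response["SCS"]]

# Step 2: an ordinary join against those rows, with the status filter applied.
joined = [
    (m, obj)
    for obj in objs
    for m in message
    if m[0] == int(obj["referenceId"]) and m[1] == 1
]

matched_ids = [m[0] for m, _ in joined]
print(matched_ids)
```

Message 177 is referenced in the array but has status 0, so it is filtered out, and message 134 is never referenced at all.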

Combined MySQL query help - 2 tables

I have a table rosters and a table members. They are setup like this:
rosters table
id team member
1 1 1
2 1 2
3 1 3
members table
id name last_active
1 Dan 1454815000
2 Ewan 1454817500
3 Cate 1454818369
I need to fetch all rows in rosters where team=1. Then, using the member column of those rows, I need to fetch the largest last_active value from members whose id is in that list.
This is how I would do it in PHP, but I'm sure there's a way to just use a more efficient query.
$rosterList = $db->query('SELECT * FROM rosters WHERE team=1');
$lastActive = 0;
foreach($rosterList as $roster) {
$activity = $db->query('SELECT last_active FROM members WHERE id='.$roster['team']);
if ( $activity > $lastActive )
$lastActive = $activity;
}
if ( $lastActive > time()-60 )
echo 'team is currently online';
It would be nice if it could just return one result with the latest last_active column but if it returns all matches in the members table that would be fine too.
By using an ORDER BY (descending) on the last_active column and then limiting the result to just 1 row, you get access to the whole member row.
sqlfiddle demo
MySQL 5.6 Schema Setup:
CREATE TABLE members
(`id` int, `name` varchar(4), `last_active` int)
;
INSERT INTO members
(`id`, `name`, `last_active`)
VALUES
(1, 'Dan', 1454815000),
(2, 'Ewan', 1454817500),
(3, 'Cate', 1454818369)
;
CREATE TABLE rosters
(`id` int, `team` int, `member` int)
;
INSERT INTO rosters
(`id`, `team`, `member`)
VALUES
(1, 1, 1),
(2, 1, 2),
(3, 1, 3)
;
Query 1:
select
m.*
from members m
join rosters r on m.id = r.member
where r.team = 1
order by m.last_active DESC
limit 1
Results:
| id | name | last_active |
|----|------|-------------|
| 3 | Cate | 1454818369 |
You can use following solution:
// Note: the mysql_* functions used here were removed in PHP 7; mysqli and PDO offer equivalents.
$result_array = mysql_query("SELECT * FROM rosters as r INNER JOIN members as m on r.member=m.id AND r.team = '1' ORDER BY last_active DESC LIMIT 1");
$lastActive = 0;
if($result_array)
{
// LIMIT 1 returns at most one row, so this loop runs at most once
while($row = mysql_fetch_assoc($result_array))
{
$lastActive = $row['last_active'];
}
}
if ( !empty($lastActive) && $lastActive > time()-60 )
echo 'team is currently online';
I'm a little confused why you are retrieving rows from members where the id value matches the team value from rosters. Seems like you would want to match on the member column.
You can use a join operation and an aggregate function. To get the largest value of last_active for all members of a given team, using the member column, something like this:
SELECT MAX(m.last_active) AS last_active
FROM members m
JOIN rosters r
ON r.member = m.id
WHERE r.team = 1
To do the equivalent of the original example, using the team column (and again, I don't understand why you would do this, because it doesn't look right):
SELECT MAX(m.last_active) AS last_active
FROM members m
JOIN rosters r
ON r.team = m.id
WHERE r.team = 1
The MAX() aggregate function in the SELECT list, with no GROUP BY clause, causes all of the rows returned to be "collapsed" into a single row. The largest value of last_active from the rows that satisfy the predicates will be returned.
You can see how the join operation works by eliminating the MAX() aggregate...
SELECT m.last_active
, m.id AS member_id
, m.name AS member_name
, r.member
, r.team
, r.id AS roster_id
FROM members m
JOIN rosters r
ON r.member = m.id
WHERE r.team = 1
ORDER BY m.last_active DESC
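As a quick check of the MAX() approach, here is a small Python sketch that loads the sample data into an in-memory SQLite database (standing in for MySQL, purely for illustration) and wraps the query in two helper functions. The names team_last_active and is_team_online are hypothetical, and the current time is passed in explicitly so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INT, name VARCHAR(4), last_active INT);
    INSERT INTO members (id, name, last_active) VALUES
        (1, 'Dan', 1454815000),
        (2, 'Ewan', 1454817500),
        (3, 'Cate', 1454818369);
    CREATE TABLE rosters (id INT, team INT, member INT);
    INSERT INTO rosters (id, team, member) VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3);
""")


def team_last_active(conn, team):
    """Return the largest last_active among the team's members, or None."""
    row = conn.execute(
        """
        SELECT MAX(m.last_active)
        FROM members m
        JOIN rosters r ON r.member = m.id
        WHERE r.team = ?
        """,
        (team,),
    ).fetchone()
    # MAX() without GROUP BY always yields one row; it holds NULL if no rows matched.
    return row[0]


def is_team_online(conn, team, now, window=60):
    """True if any member of the team was active within the last `window` seconds."""
    last = team_last_active(conn, team)
    return last is not None and last > now - window


print(team_last_active(conn, 1))
print(is_team_online(conn, 1, now=1454818400))
```

A single round trip replaces the per-member loop from the question, and the None check covers teams that have no roster rows.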

How to get column name which contains certain string

I have a MySQL table with the columns and values below
d1 d2 d3 d4 d5
10 12 9 2 6
I couldn't figure out how to get the column name for a matching value: the query should return d1 if the matched value is 10. I simply want the column name returned for the matched value.
This is the order-listing query. The problem with it is that it returns all pending orders for all departments, whereas it would work if I added AND stage = 'd1'.
I just don't know how to get 'd1', 'd2' or 'd3' from the layout shown above.
SELECT `phoenix_so`.`id`, `stage`, `so_number`, `service` AS service_id, `cid`, `uid`, `pec`, `customer_name`, `phoenix_so_service`.`name` AS service, `location`, `phoenix_so_priority`.`name` AS pri, `phoenix_so_priority`.`css_class`, `phoenix_so_type`.`name` as type, `phoenix_so_type`.`css_class` AS type_css
FROM (`phoenix_so`)
LEFT JOIN `phoenix_so_service` ON `phoenix_so`.`service` = `phoenix_so_service`.`id`
LEFT JOIN `phoenix_so_priority` ON `phoenix_so`.`priority` = `phoenix_so_priority`.`id`
LEFT JOIN `phoenix_so_type` ON `phoenix_so`.`type` = `phoenix_so_type`.`id`
LEFT JOIN `phoenix_so_roadmap` ON `phoenix_so`.`service` = `phoenix_so_roadmap`.`service_id`
WHERE `inv_access_cdate` = "0000-00-00"

feeding result of one query into another

I tried to simplify my question to a basic example I wrote down below, the actual problem is much more complex so the below queries might not make much sense but the basic concepts are the same (data from one query as argument to another).
Query 1:
SELECT Ping.ID as PingID, Base.ID as BaseID FROM
(SELECT l.ID, mg.DateTime from list l
JOIN mygroup mg ON mg.ID = l.MyGroup
WHERE l.Type = "ping"
ORDER BY l.ID DESC
) Ping
INNER JOIN
(SELECT l.ID, mg.DateTime from list l
JOIN mygroup mg ON mg.ID = l.MyGroup
WHERE l.Type = "Base"
ORDER BY l.ID DESC
) Base
ON Base.DateTime < Ping.DateTime
GROUP BY Ping.ID
ORDER BY Ping.ID DESC;
+--------+--------+
| PingID | BaseID |
+--------+--------+
| 11 | 10 |
| 9 | 8 |
| 7 | 6 |
| 5 | 3 |
| 4 | 3 |
+--------+--------+
// In Query 2 below, I need to replace 11 with the PingID and 10 with the BaseID from the result above, and then show the outcome as a third column in that result (0 if Query 2 returns no rows, 1 if it does)
Query 2:
SELECT * FROM
(SELECT sl.Data FROM list l
JOIN sublist sl ON sl.ParentID = l.ID
WHERE l.Type = "ping" AND l.ID = 11) Ping
INNER JOIN
(SELECT sl.Data FROM list l
JOIN sublist sl ON sl.ParentID = l.ID
WHERE l.Type = "base" AND l.ID = 10) Base
ON Base.Data < Ping.Data;
How can I do this? Again I'm not sure what kind of advice I will receive but please understand that the Query 2 is in reality over 200 lines and I basically can't touch it so I don't have so much flexibility as I'd like and ideally I'd like to get this working all in SQL without having to script this.
CREATE DATABASE lookback;
use lookback;
CREATE TABLE mygroup (
ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
DateTime DateTime
) ENGINE=InnoDB;
CREATE TABLE list (
ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
Type VARCHAR(255),
MyGroup BIGINT NOT NULL,
Data INT NOT NULL
) ENGINE=InnoDB;
CREATE TABLE sublist (
ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
ParentID BIGINT NOT NULL,
Data INT NOT NULL
) ENGINE=InnoDB;
INSERT INTO mygroup (DateTime) VALUES ("2012-03-09 22:33:19"), ("2012-03-09 22:34:19"), ("2012-03-09 22:35:19"), ("2012-03-09 22:36:19"), ("2012-03-09 22:37:19"), ("2012-03-09 22:38:19"), ("2012-03-09 22:39:19"), ("2012-03-09 22:40:19"), ("2012-03-09 22:41:19"), ("2012-03-09 22:42:19"), ("2012-03-09 22:43:19");
INSERT INTO list (Type, MyGroup, Data) VALUES ("ping", 1, 4), ("base", 2, 2), ("base", 3, 4), ("ping", 4, 7), ("ping", 5, 8), ("base", 6, 7), ("ping", 7, 8), ("base", 8, 3), ("ping", 9, 10), ("base", 10, 2), ("ping", 11, 3);
INSERT INTO sublist (ParentID, Data) VALUES (1, 2), (2, 3), (3, 6), (4, 8), (5, 4), (6, 5), (7, 1), (8, 9), (9, 11), (10, 4), (11, 6);
The simplest way of dealing with this is temporary tables, described here and here. If you create an empty table to store your results (let's call it tbl_temp1) you can do this:
INSERT INTO tbl_temp1 (PingID, BaseID)
SELECT Ping.ID as PingID, Base.ID as BaseID
FROM ...
Then you can query it however you like:
SELECT PingID, BaseID from tbl_temp1 ...
Edited to add:
From the docs for CREATE TEMPORARY TABLE:
You can use the TEMPORARY keyword when creating a table. A TEMPORARY
table is visible only to the current connection, and is dropped
automatically when the connection is closed. This means that two
different connections can use the same temporary table name without
conflicting with each other or with an existing non-TEMPORARY table of
the same name. (The existing table is hidden until the temporary table
is dropped.)
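The temporary-table approach can be sketched end to end as follows. This Python example uses an in-memory SQLite database instead of MySQL (an assumption for illustration; both support CREATE TEMPORARY TABLE and INSERT ... SELECT), loads the question's mygroup and list data, and replaces the loose GROUP BY of Query 1 with an explicit MAX(), which is a substitution made here and not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mygroup (ID INTEGER PRIMARY KEY, DateTime TEXT);
    CREATE TABLE list (ID INTEGER PRIMARY KEY, Type TEXT,
                       MyGroup INTEGER NOT NULL, Data INTEGER NOT NULL);
""")

# Eleven groups, one minute apart, as in the question.
conn.executemany(
    "INSERT INTO mygroup (DateTime) VALUES (?)",
    [("2012-03-09 22:%02d:19" % minute,) for minute in range(33, 44)],
)
list_rows = [
    ("ping", 1, 4), ("base", 2, 2), ("base", 3, 4), ("ping", 4, 7),
    ("ping", 5, 8), ("base", 6, 7), ("ping", 7, 8), ("base", 8, 3),
    ("ping", 9, 10), ("base", 10, 2), ("ping", 11, 3),
]
conn.executemany("INSERT INTO list (Type, MyGroup, Data) VALUES (?, ?, ?)", list_rows)

# Store the result of the first query in a temporary table ...
conn.executescript("""
    CREATE TEMPORARY TABLE tbl_temp1 (PingID INTEGER, BaseID INTEGER);
    INSERT INTO tbl_temp1 (PingID, BaseID)
    SELECT lp.ID, MAX(lb.ID)
    FROM list lp
    JOIN mygroup mgp ON mgp.ID = lp.MyGroup
    JOIN list lb ON lb.Type = 'base'
    JOIN mygroup mgb ON mgb.ID = lb.MyGroup AND mgb.DateTime < mgp.DateTime
    WHERE lp.Type = 'ping'
    GROUP BY lp.ID;
""")

# ... and query it afterwards like any other table.
pairs = conn.execute(
    "SELECT PingID, BaseID FROM tbl_temp1 ORDER BY PingID DESC"
).fetchall()
print(pairs)
```

Ping 1 has no earlier base row, so the inner join leaves it out, matching the five rows shown in the question.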
If this were a more flattened query, then there would be a straightforward answer.
It is certainly possible to use a derived table as the input to outer queries. A simple example would be:
select
data1,
(select data3 from howdy1 where howdy1.data1 = greetings.data1) data3_derived
from
(select data1 from hello1 where hello1.data2 < 4) as greetings;
where the derived table greetings is used in the inline query. (SQL Fiddle for this simplistic example: http://sqlfiddle.com/#!3/49425/2 )
Following this logic would lead us to assume that you could cast your first query as a derived table of query1 and then recast query2 into the select statement.
For that I constructed the following:
select query1.pingId, query1.baseId,
(SELECT ping.Data pingData FROM
(SELECT sl.Data FROM list l
JOIN sublist sl ON sl.ParentID = l.ID
WHERE l.Type = "ping" AND l.ID = query1.pingId
) Ping
INNER JOIN
(SELECT sl.Data FROM list l
JOIN sublist sl ON sl.ParentID = l.ID
WHERE l.Type = "base" AND l.ID = query1.baseId
) Base
ON Base.Data < Ping.Data)
from
(SELECT Ping.ID as PingID, Base.ID as BaseID FROM
(SELECT l.ID, mg.DateTime from list l
JOIN mygroup mg ON mg.ID = l.MyGroup
WHERE l.Type = "ping"
ORDER BY l.ID DESC
) Ping
INNER JOIN
(SELECT l.ID, mg.DateTime from list l
JOIN mygroup mg ON mg.ID = l.MyGroup
WHERE l.Type = "Base"
ORDER BY l.ID DESC
) Base
ON Base.DateTime < Ping.DateTime
GROUP BY Ping.ID
) query1
order by pingId desc;
where I have inserted query2 into a select clause from query1 and inserted query1.pingId and query1.baseId in place of 11 and 10, respectively. If 11 and 10 are left in place, this query works (but obviously only generates the same data for each row).
But when this is executed, I'm given an error: Unknown column 'query1.pingId'. Obviously, query1 cannot be seen inside the nested derived tables.
Since, in general, this type of query is possible, when the nesting is only 1 level deep (as per my greeting example at the top), there must be logical restrictions as to why this level of nesting isn't possible. (Time to pull out the database theory book...)
If I were faced with this, I'd rewrite and flatten the queries to get the real data that I wanted. And eliminate a couple things including that really nasty group by that is used in query1 to get the max baseId for a given pingId.
You say that's not possible, due to external constraints. So, this is, ultimately, a non-answer answer. Not very useful, but maybe it'll be worth something.
(SQL Fiddle for all this: http://sqlfiddle.com/#!2/bac74/35 )
If you cannot modify query 2 then there is nothing we can suggest. Here is a combination of your two queries with a reduced level of nesting. I suspect this would be slow with a large dataset -
SELECT tmp1.PingID, tmp1.BaseID, IF(slb.Data, 1, 0) AS third_col
FROM (
SELECT lp.ID AS PingID, MAX(lb.ID) AS BaseID
FROM MyGroup mgp
INNER JOIN MyGroup mgb
ON mgb.DateTime < mgp.DateTime
INNER JOIN list lp
ON mgp.ID = lp.MyGroup
AND lp.Type = 'ping'
INNER JOIN list lb
ON mgb.ID = lb.MyGroup
AND lb.Type = 'base'
GROUP BY lp.ID DESC
) AS tmp1
LEFT JOIN sublist slp
ON tmp1.PingID = slp.ParentID
LEFT JOIN sublist slb
ON tmp1.BaseID = slb.ParentID
AND slb.Data < slp.Data;
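For anyone who wants to try the flattened query without a MySQL server, here is a hedged Python sketch that runs an adapted version against an in-memory SQLite database with the question's sample data. Two adaptations are assumptions made for SQLite: IF(...) is written as a CASE expression, and the ordering moves from GROUP BY ... DESC to a final ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mygroup (ID INTEGER PRIMARY KEY, DateTime TEXT);
    CREATE TABLE list (ID INTEGER PRIMARY KEY, Type TEXT, MyGroup INTEGER, Data INTEGER);
    CREATE TABLE sublist (ID INTEGER PRIMARY KEY, ParentID INTEGER, Data INTEGER);
""")
conn.executemany(
    "INSERT INTO mygroup (DateTime) VALUES (?)",
    [("2012-03-09 22:%02d:19" % minute,) for minute in range(33, 44)],
)
conn.executemany(
    "INSERT INTO list (Type, MyGroup, Data) VALUES (?, ?, ?)",
    [("ping", 1, 4), ("base", 2, 2), ("base", 3, 4), ("ping", 4, 7),
     ("ping", 5, 8), ("base", 6, 7), ("ping", 7, 8), ("base", 8, 3),
     ("ping", 9, 10), ("base", 10, 2), ("ping", 11, 3)],
)
conn.executemany(
    "INSERT INTO sublist (ParentID, Data) VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 6), (4, 8), (5, 4), (6, 5),
     (7, 1), (8, 9), (9, 11), (10, 4), (11, 6)],
)

rows = conn.execute("""
    SELECT tmp1.PingID, tmp1.BaseID,
           -- 1 when a base sublist row with smaller Data exists, else 0
           CASE WHEN slb.ID IS NOT NULL THEN 1 ELSE 0 END AS third_col
    FROM (
        SELECT lp.ID AS PingID, MAX(lb.ID) AS BaseID
        FROM mygroup mgp
        JOIN mygroup mgb ON mgb.DateTime < mgp.DateTime
        JOIN list lp ON mgp.ID = lp.MyGroup AND lp.Type = 'ping'
        JOIN list lb ON mgb.ID = lb.MyGroup AND lb.Type = 'base'
        GROUP BY lp.ID
    ) AS tmp1
    LEFT JOIN sublist slp ON tmp1.PingID = slp.ParentID
    LEFT JOIN sublist slb ON tmp1.BaseID = slb.ParentID AND slb.Data < slp.Data
    ORDER BY tmp1.PingID DESC
""").fetchall()

for row in rows:
    print(row)
```

Testing slb.ID for NULL rather than the truthiness of slb.Data avoids reporting 0 for a matching row whose Data happens to be 0.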

LINQ To SQL Grouping

Can some kind soul please lend me the Linq To SQL query for the following T-SQL Query
SELECT e.EmployeeName, MIN(a.SignInTime), MAX(a.SignOutTime)
FROM Employee e
LEFT OUTER JOIN Attendance a ON e.Id = a.EmployeeId AND CAST (CONVERT (varchar, a.SignInTime, 106) AS DATETIME) = '28 APR 2009'
GROUP BY e.EmployeeName
The database schema I have is as follows
Employee: {Id: int identity PK, EmployeeName: varchar(200) NOT NULL}
Attendance: {Id: int identity PK, EmployeeId: int FK(Employee(Id)) NOT NULL, SignInTime: DateTime, SignOutTime: DateTime}
NOTE: The Convert gimmick is only used to chop off the time portion of SignInTime for comparison
This is definitely a challenge. It's ugly, but it does the trick. It returns exactly what you are looking for. In the case of employees without attendance, you get back the name with null times.
var selectedDate = new DateTime(2009,4,28);
var query = from e in db.Employees
join a in db.Attendances on e.Id equals a.EmployeeId into Temp
from t in Temp.DefaultIfEmpty()
where t.SignInTime == null || (t.SignInTime >= selectedDate && t.SignInTime < selectedDate.AddDays(1))
group t by e.EmployeeName into grouped
select new
{
Employee = grouped.Key,
FirstSignInTime = grouped.Min(a => a.SignInTime),
LastSignOutTime = grouped.Max(a => a.SignOutTime)
};
This is the SQL that is emitted by that expression:
DECLARE @p0 DateTime = '2009-04-27 00:00:00.000'
DECLARE @p1 DateTime = '2009-04-28 00:00:00.000'
SELECT MIN([t1].[SignInTime]) AS [FirstSignInTime], MAX([t1].[SignOutTime]) AS [LastSignOutTime], [t0].[EmployeeName] AS [Employee]
FROM [Employee] AS [t0]
LEFT OUTER JOIN [Attendance] AS [t1] ON [t0].[Id] = [t1].[EmployeeId]
WHERE ([t1].[SignInTime] IS NULL) OR (([t1].[SignInTime] >= @p0) AND ([t1].[SignInTime] < @p1))
GROUP BY [t0].[EmployeeName]
This is very close to your original SQL. I added where "t.SignInTime == null" so that it also returns employees without attendance, but you can remove that if that's not what you want.
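One difference from the original T-SQL is worth checking: there the date test sits in the ON clause, while the emitted SQL applies it in WHERE after the join. A small Python sketch with an in-memory SQLite database shows how the two placements can differ. The three employees and their attendance rows are invented for this illustration, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, EmployeeName TEXT NOT NULL);
    CREATE TABLE Attendance (Id INTEGER PRIMARY KEY, EmployeeId INTEGER NOT NULL,
                             SignInTime TEXT, SignOutTime TEXT);
    -- Hypothetical data: Ann attended on the 28th, Bob only on the 27th, Cy never.
    INSERT INTO Employee (Id, EmployeeName) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Attendance (EmployeeId, SignInTime, SignOutTime) VALUES
        (1, '2009-04-28 09:00:00', '2009-04-28 17:00:00'),
        (2, '2009-04-27 09:00:00', '2009-04-27 17:00:00');
""")
day = ("2009-04-28 00:00:00", "2009-04-29 00:00:00")

# Date range tested inside the ON clause, as in the original T-SQL.
filter_in_on = conn.execute("""
    SELECT e.EmployeeName, MIN(a.SignInTime), MAX(a.SignOutTime)
    FROM Employee e
    LEFT OUTER JOIN Attendance a
        ON e.Id = a.EmployeeId AND a.SignInTime >= ? AND a.SignInTime < ?
    GROUP BY e.EmployeeName
    ORDER BY e.EmployeeName
""", day).fetchall()

# Date range tested in WHERE after the join, as in the emitted SQL.
filter_in_where = conn.execute("""
    SELECT e.EmployeeName, MIN(a.SignInTime), MAX(a.SignOutTime)
    FROM Employee e
    LEFT OUTER JOIN Attendance a ON e.Id = a.EmployeeId
    WHERE a.SignInTime IS NULL OR (a.SignInTime >= ? AND a.SignInTime < ?)
    GROUP BY e.EmployeeName
    ORDER BY e.EmployeeName
""", day).fetchall()

print([name for name, _, _ in filter_in_on])
print([name for name, _, _ in filter_in_where])
```

With this data, Bob (attendance only on another day) appears with null times when the filter is in ON, but disappears when it is in WHERE. Whether that matters depends on which employees the report is supposed to list.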
This should work:
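var selectedDate = new DateTime(2009, 4, 28);
var query =
from e in db.Employee
// filter attendance to the selected day before joining, so the outer join keeps every employee
join a in db.Attendance.Where(x => x.SignInTime >= selectedDate && x.SignInTime < selectedDate.AddDays(1))
on e.Id equals a.EmployeeId into a_join
from a in a_join.DefaultIfEmpty()
group new {e, a} by new {
e.EmployeeName
} into g
select new {
g.Key.EmployeeName,
// cast to DateTime? so employees with no matching attendance yield null
Column1 = g.Min(p => (DateTime?)p.a.SignInTime),
Column2 = g.Max(p => (DateTime?)p.a.SignOutTime)
};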