MYSQL join not selecting correct row - mysql

I have two tables:
1) outreach
id  profile_id  url
---------------------------------
1   2           www.test.com
2   3           www.google.com
3   4           www.example.com
2) outreach_links
id  outreach_id  end_date             status
----------------------------------------------
1   1            2016-12-28 00:00:00  Approved
2   1            2016-12-16 00:00:00  Approved
3   1            NULL                 Pending
4   1            2016-12-11 00:00:00  Approved
I have this SQL query with a LEFT JOIN and conditions. It works fine, except that I want to select the whole row that has the MAX end_date and meets the 3 conditions; in this case, the first row, with end_date = 2016-12-28 00:00:00.
select o.*,ol.*,MAX(ol.end_date) as max_date, SUM(ol.status = "Approved" and (ol.end_date > Now() or end_date is null)) as cond1, SUM(ol.status = "Pending") as cond2,
SUM(ol.status = "Approved" and (ol.end_date < Now() and ol.end_date is not null)) as cond3
FROM outreach o
LEFT JOIN outreach_links ol on ol.outreach_id = o.id
WHERE o.profile_id=2
GROUP BY o.id
HAVING (cond1 = 0 and cond2 = 0) or (cond1 = 0 and (cond2 = 1 and cond3 >=1))
ORDER BY ol.end_date desc
But this is the output of the query (it is picking the Pending row for some reason):
+"id": "3"
+"profile_id": "2"
+"url": "www.test.com"
+"outreach_id": "1"
+"end_date": null
+"status": "Pending"
+"max_date": "2016-12-28 00:00:00"
+"cond1": "0"
+"cond2": "1"
+"cond3": "3"
I want to get this instead
+"id": "1"
+"profile_id": "2"
+"url": "www.test.com"
+"outreach_id": "1"
+"end_date": 2016-12-28 00:00:00
+"status": "Approved"
+"max_date": "2016-12-28 00:00:00"
+"cond1": "0"
+"cond2": "1"
+"cond3": "3"
That is, the first row, with the MAX end_date. How can I do that while keeping this same query?
Thanks

See SQL Select only rows with Max Value on a Column for how to get the row with the max end_date for each outreach_id. Then join with that row to get the latest status.
SELECT o.*, ol1.max_date, ol2.status, SUM(ol.status = "Approved" and (ol.end_date > Now() or end_date is null)) as cond1, SUM(ol.status = "Pending") as cond2,
SUM(ol.status = "Approved" and (ol.end_date < Now() and ol.end_date is not null)) as cond3
FROM outreach o
LEFT JOIN outreach_links AS ol ON ol.outreach_id = o.id
LEFT JOIN (SELECT outreach_id, MAX(end_date) AS max_date
FROM outreach_links
GROUP BY outreach_id) AS ol1 ON ol1.outreach_id = o.id
LEFT JOIN outreach_links ol2 on ol2.outreach_id = o.id AND ol2.end_date = ol1.max_date
WHERE o.profile_id=2
GROUP BY o.id
HAVING (cond1 = 0 and cond2 = 0) or (cond1 = 0 and (cond2 = 1 and cond3 >=1))
ORDER BY ol.end_date desc
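The "row with the max value per group" step can be tried in isolation. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL and using only the outreach_links sample rows from the question; the alias m and the variable latest_rows are illustrative names, not part of the original query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE outreach_links (
    id INTEGER PRIMARY KEY,
    outreach_id INTEGER,
    end_date TEXT,
    status TEXT
);
INSERT INTO outreach_links VALUES
    (1, 1, '2016-12-28 00:00:00', 'Approved'),
    (2, 1, '2016-12-16 00:00:00', 'Approved'),
    (3, 1, NULL, 'Pending'),
    (4, 1, '2016-12-11 00:00:00', 'Approved');
""")

# Derived table m holds the max end_date per outreach_id (MAX ignores NULLs);
# joining back on both columns returns the whole row that carries that date.
latest_rows = conn.execute("""
SELECT ol.id, ol.outreach_id, ol.end_date, ol.status
FROM outreach_links AS ol
JOIN (SELECT outreach_id, MAX(end_date) AS max_date
      FROM outreach_links
      GROUP BY outreach_id) AS m
  ON m.outreach_id = ol.outreach_id
 AND ol.end_date = m.max_date
""").fetchall()

print(latest_rows)
```

Because the join compares end_date with the per-group maximum, the Pending row with a NULL end_date can never match, which is why the Approved row from 2016-12-28 is the one returned.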

Related

Get COUNT/SUM of records based on another column in SQL

How do I get the count/sum of rows (COUNT() or SUM()) based on another column (Type: Weekly or Yearly)? I have two tables:
Stores:
Id  Name     Type
1   Store 1  Weekly
2   Store 2  Yearly
3   Store 3  Weekly
4   Store 4  Weekly
Orders:
Id  StoreId  OrderDate   Qty
1   1        2022-01-31  2
2   1        2022-12-31  5*
3   2        2022-01-28  30*
4   2        2022-06-30  50*
5   2        2022-12-31  70*
6   3        2022-06-15  8
7   3        2022-12-27  9*
8   3        2022-12-31  3*
a) If I pass the date range (by weekly,2022-12-26 ~ 2023-01-01), the expected result should look like this:
Id  Name     Count of orders  Total Qty
1   Store 1  1                5
2   Store 2  3                150 (sum by the year when the store's type equals "Yearly": 30+50+70)
3   Store 3  2                12 (sum by the selected week: 9+3)
4   Store 4  0                0
If the Store type is Yearly then all orders will be summed up based on StoreId & year of OrderDate, if Weekly then based on StoreId & selected OrderDate.
b) I tried using CASE in the SELECT statement, but had no luck. Here is part of my code:
SELECT s.Id,
s.Name,
COUNT(o.Id) AS 'Count of orders',
sum(o.Qty) AS 'Total Qty'
FROM Stores AS s
LEFT JOIN Orders AS o
ON o.StoreId = s.id
AND (OrderDate >= '2022-12-26' AND OrderDate <= '2023-01-01')
GROUP BY s.Id, OrderDate
ORDER BY OrderDate DESC
You could use conditional aggregation as the following:
SELECT s.Id,
s.Name,
COUNT(CASE
WHEN s.Type = 'Yearly' THEN
o.Id
ELSE
CASE
WHEN OrderDate >= '2022-12-26' AND OrderDate <= '2023-01-01' THEN
o.Id
END
END) As 'Count of orders',
SUM(CASE
WHEN s.Type = 'Yearly' THEN
o.Qty
ELSE
CASE
WHEN OrderDate >= '2022-12-26' AND OrderDate <= '2023-01-01' THEN
o.Qty
ELSE
0
END
END) AS 'Total Qty'
FROM Stores AS s
LEFT JOIN Orders AS o
ON o.StoreId = s.id
GROUP BY s.Id, s.Name
ORDER BY MAX(OrderDate) DESC
See demo.
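To check the conditional aggregation idea against the sample data, here is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL. Unlike the query above, it also limits Yearly stores to the calendar year of the range's start date, which is an assumption drawn from the question's wording; the parameter names sd, ed, ys and ye are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Stores (Id INTEGER PRIMARY KEY, Name TEXT, Type TEXT);
CREATE TABLE Orders (Id INTEGER PRIMARY KEY, StoreId INTEGER, OrderDate TEXT, Qty INTEGER);
INSERT INTO Stores VALUES
    (1, 'Store 1', 'Weekly'), (2, 'Store 2', 'Yearly'),
    (3, 'Store 3', 'Weekly'), (4, 'Store 4', 'Weekly');
INSERT INTO Orders VALUES
    (1, 1, '2022-01-31', 2),  (2, 1, '2022-12-31', 5),
    (3, 2, '2022-01-28', 30), (4, 2, '2022-06-30', 50),
    (5, 2, '2022-12-31', 70), (6, 3, '2022-06-15', 8),
    (7, 3, '2022-12-27', 9),  (8, 3, '2022-12-31', 3);
""")

start_date, end_date = "2022-12-26", "2023-01-01"
# Assumption: "Yearly" means the calendar year of the range's start date.
params = {
    "sd": start_date,
    "ed": end_date,
    "ys": start_date[:4] + "-01-01",
    "ye": start_date[:4] + "-12-31",
}

# The CASE yields NULL for rows outside the relevant window, so COUNT skips
# them and SUM ignores them; COALESCE turns an all-NULL sum into 0.
rows = conn.execute("""
SELECT s.Id, s.Name,
       COUNT(CASE
               WHEN s.Type = 'Yearly' AND o.OrderDate BETWEEN :ys AND :ye THEN o.Id
               WHEN s.Type = 'Weekly' AND o.OrderDate BETWEEN :sd AND :ed THEN o.Id
             END) AS order_count,
       COALESCE(SUM(CASE
               WHEN s.Type = 'Yearly' AND o.OrderDate BETWEEN :ys AND :ye THEN o.Qty
               WHEN s.Type = 'Weekly' AND o.OrderDate BETWEEN :sd AND :ed THEN o.Qty
             END), 0) AS total_qty
FROM Stores AS s
LEFT JOIN Orders AS o ON o.StoreId = s.Id
GROUP BY s.Id, s.Name
ORDER BY s.Id
""", params).fetchall()

for row in rows:
    print(row)
```

Keeping the date test inside the CASE rather than in a WHERE clause is what lets Store 4, which has no orders at all, stay in the result with zero counts.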
You can do it this way.
Please note that TYPE is a keyword in MySQL.
SELECT s.id,
s.name,
s.type,
COUNT(s.name) AS total_count,
SUM(o.qty) AS total_qty
FROM stores s
LEFT JOIN orders o
ON s.id = o.storeid
WHERE (o.orderdate >= '2022-12-26' AND o.orderDate <= '2023-01-01'
AND s.type = 'Weekly')
OR s.type = 'Yearly'
GROUP BY s.id, s.name, s.type
From the description, calculate count(Orders.Id) and sum(Orders.Qty)
Stores.Type = 'Weekly': Orders.OrderDate between @start_date and @end_date
Stores.Type = 'Yearly': Orders.OrderDate in the year of @start_date (...all orders will be summed up based on StoreId & year of OrderDate.)
Thus, the first step is to use a WHERE clause to filter Orders and then aggregate to the Store.Id level. The second step is to left join from the Stores table to the result of the first step, so that stores without sales in the specified date range are still reported.
set @start_date = '2022-12-26', @end_date = '2023-01-01';
with cte_store_sales as (
select s.Id,
count(o.Id) as order_count,
sum(o.Qty) as total_qty
from stores s
left join orders o
on s.Id = o.StoreId
where (s.type = 'Weekly' and o.OrderDate between @start_date and @end_date)
or (s.type = 'Yearly' and o.OrderDate between makedate(year(@start_date),1)
and date_sub(date_add(makedate(year(@start_date),1), interval 1 year), interval 1 day))
group by s.Id)
select s.Id,
s.Name,
coalesce(ss.order_count, 0) as "Count of Orders",
coalesce(ss.total_qty, 0) as "Total Qty"
from stores s
left
join cte_store_sales ss
on s.Id = ss.Id
order by s.Id;
Output:
Id|Name |Count of Orders|Total Qty|
--+-------+---------------+---------+
1|Store 1| 1| 5|
2|Store 2| 3| 150| <-- Store sales in year 2022
3|Store 3| 2| 12|
4|Store 4| 0| 0| <-- Report stores without sales
First of all, we extract the raw data matching the orderdate condition, which can be used for aggregation later. Note that I treat the date range as inclusive, so for 2022-12-26 ~ 2023-01-01 the years covered are 2022 and 2023 if the type is yearly.
select s.id id, name,
(case when type='weekly' and orderdate between '2022-12-26' and '2023-01-01' then qty
when type='yearly' and year(orderdate) between year('2022-12-26') and year('2023-01-01') then qty
end) as qt
from Stores s
left join Orders o
on s.id=o.storeid;
-- result set:
# id, name, qt
1, Store 1, 5
2, Store 2, 30
2, Store 2, 50
2, Store 2, 70
3, Store 3,
3, Store 3, 9
3, Store 3, 3
4, Store 4,
The rest is to do the summarisation job using the derived table. Note: Since the column name is not in the group by list, but it's actually unique for a specific storeid, we can use the any_value function to bypass the restriction which might be enforced due to the SQL_MODE system variable.
select id, any_value(name) as 'Name', count(qt) as 'Count of orders', ifnull(sum(qt),0) as 'Total Qty'
from
(select s.id id, name,
(case when type='weekly' and orderdate between '2022-12-26' and '2023-01-01' then qty
when type='yearly' and year(orderdate) between year('2022-12-26') and year('2023-01-01') then qty
end) as qt
from Stores s
left join Orders o
on s.id=o.storeid) tb
group by id
order by id
;
-- result set:
# id, Name, Count of orders, Total Qty
1, Store 1, 1, 5
2, Store 2, 3, 150
3, Store 3, 2, 12
4, Store 4, 0, 0

MYSQL Query Select with condition & Join

I have these 3 tables (with this structure):
outreach
id url profile_id
------------------------------------------
40 www.google.com 2
41 www.yahoo.com 3
42 www.test.com 1
outreach_links
id outreach_id end_date status
-----------------------------------------------
1 41 2016-01-12 Pending
2 40 2016-03-12 Pending
3 40 2016-02-12 Approved
comments
id outreach_id name
----------------------------
1 40
2 40
3 40
and I have this Query:
select o.*,
SUM(if(ol.status = "Approved" and (ol.end_date > now() or end_date is null), 1, 0)) as cond1,
SUM(if(ol.status = "Pending" and (ol.end_date != now() or end_date is null), 1, 0)) as cond2,
SUM(if(ol.status = "Pending" and (ol.end_date < now()), 1, 0)) as cond3
from outreach o
left join outreach_links ol on ol.outreach_id = o.id
where o.profile_id=2
group by o.id
having (cond1 = 0 and cond2 = 0) or (cond1 = 0 and (cond2 = 1 and cond3 >=1)) order by ol.end_date desc
I am trying to fix this query and make it also select the following:
1) ol.*, but ONLY for the row with MAX(end_date), and
2) COUNT(comments.id), to count all comments for that particular row.
Is that possible?
Right now, here is the output:
+"id": "40"
+"profile_id": "2"
+"url": "http://www.google.com"
+"created_at": "2016-12-05 21:55:10"
+"updated_at": "2016-12-05 22:49:56"
+"cond1": "0"
+"cond2": "0"
+"cond3": "5"
I want to add
+"max_date": get me max of end_date and the whole row of the row highlighted
+"Count(comments)": get me all the comments count for this one which is 3
Thanks
Are you trying to get the latest update date? The following query should give you the latest updated date.
However, I do not understand what you are trying to get for cond1, cond2, cond3, and what should be populated as created_date, and updated_date? Can you please give definitions for these fields?
SELECT o.*, ol.*, COUNT(c.id)
FROM outreach o
LEFT JOIN outreach_links ol ON ol.outreach_id = o.id
LEFT JOIN comments c ON c.outreach_id = o.id
WHERE ol.id = (SELECT ol2.id
FROM outreach_links ol2
WHERE ol2.outreach_id = ol.outreach_id
ORDER BY ol2.end_date DESC, ol2.id DESC
LIMIT 1)
OR ol.id IS NULL
GROUP BY o.id, ol.id

How to split SQL query results into columns based on two WHERE conditions and two calculated COUNT fields?

I have the following (simplified) database schema:
Persons:
[Id] [Name]
-------------------
1 'Peter'
2 'John'
3 'Anna'
Items:
[Id] [ItemName] [ItemStatus]
-------------------
10 'Cake' 1
20 'Dog' 2
ItemDocuments:
[Id] [ItemId] [DocumentName] [Date]
-------------------
101 10 'CakeDocument1' '2016-01-01 00:00:00'
201 20 'DogDocument1' '2016-02-02 00:00:00'
301 10 'CakeDocument2' '2016-03-03 00:00:00'
401 20 'DogDocument2' '2016-04-04 00:00:00'
DocumentProcessors:
[PersonId] [DocumentId]
-------------------
1 101
1 201
2 301
I have also set up an SQL fiddle to play with: http://www.sqlfiddle.com/#!3/e6082
The relation logic is the following: every Person can work on zero or infinite number of ItemDocuments (many-to-many); each ItemDocument belongs to exactly one Item (one-to-many). Item has status 1 - Active, 2 - Closed
What I need is a report that fulfills the following requirements:
for each person in Persons table, display count of Items that have ItemDocuments related to this person
the counts should be split in two columns by ItemStatus
the query should be filterable by two optional date periods (using two BETWEEN conditions on ItemDocuments.Date field) and the Item counts should also be split into two periods
if a Person does not have any ItemDocuments assigned, it still should be shown in the results with all count values set to 0
if a Person has more than one ItemDocument for an Item, the Item still should be counted only once
Essentially, here is how the results should look if I set both periods to NULL (to read all the data):
[PersonName] [Active Items for period 1] [Closed Items for period 1] [Active Items for period 2] [Closed Items for period 2]
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
'Peter' 1 1 1 1
'John' 1 0 1 0
'Anna' 0 0 0 0
While I can create an SQL query for each requirement separately, I have trouble understanding how to combine all of them into one.
For example, I can split ItemStatus counts in two columns using
COUNT(CASE WHEN t.ItemStatus = 1 THEN 1 ELSE NULL END) AS Active,
COUNT(CASE WHEN t.ItemStatus = 2 THEN 1 ELSE NULL END) AS Closed
and I can filter by two periods (with max/min date constants from MS SQL server specification to avoid NULLs for optional period dates) using
between coalesce(@start1, '1753-01-01') and coalesce(@end1, '9999-12-31')
between coalesce(@start2, '1753-01-01') and coalesce(@end2, '9999-12-31')
But how do I combine all of this, also considering the JOINs between the tables?
Is there any technique, join or MS SQL Server specific approach to do this in an efficient way?
My first attempt seems to work as required, but it looks ugly, with the same subquery duplicated multiple times:
DECLARE @start1 DATETIME, @start2 DATETIME, @end1 DATETIME, @end2 DATETIME
-- SET @start2 = '2017-01-01'
SELECT
    p.Name,
    (SELECT COUNT(1)
     FROM Items i
     WHERE i.ItemStatus = 1 AND EXISTS(
         SELECT 1
         FROM DocumentProcessors AS dcp
         INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
         WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
         AND idc.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31')
     )
    ) AS Active1,
    (SELECT COUNT(*)
     FROM Items i
     WHERE i.ItemStatus = 2 AND EXISTS(
         SELECT 1
         FROM DocumentProcessors AS dcp
         INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
         WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
         AND idc.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31')
     )
    ) AS Closed1,
    (SELECT COUNT(1)
     FROM Items i
     WHERE i.ItemStatus = 1 AND EXISTS(
         SELECT 1
         FROM DocumentProcessors AS dcp
         INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
         WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
         AND idc.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31')
     )
    ) AS Active2,
    (SELECT COUNT(*)
     FROM Items i
     WHERE i.ItemStatus = 2 AND EXISTS(
         SELECT 1
         FROM DocumentProcessors AS dcp
         INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
         WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
         AND idc.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31')
     )
    ) AS Closed2
FROM Persons p
FROM Persons p
I'm not absolutely sure if I really got what you want, but you might try this
WITH AllData AS
(
SELECT p.Id AS PersonId
,p.Name AS Person
,id.Date AS DocDate
,id.DocumentName AS DocName
,i.ItemName AS ItemName
,i.ItemStatus AS ItemStatus
,CASE WHEN id.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31') THEN 1 ELSE 0 END AS InPeriod1
,CASE WHEN id.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31') THEN 1 ELSE 0 END AS InPeriod2
FROM Persons AS p
LEFT JOIN DocumentProcessors AS dp ON p.Id=dp.PersonId
LEFT JOIN ItemDocuments AS id ON dp.DocumentId=id.Id
LEFT JOIN Items AS i ON id.ItemId=i.Id
)
SELECT PersonID
,Person
,COUNT(CASE WHEN ItemStatus = 1 AND InPeriod1 = 1 THEN 1 ELSE NULL END) AS ActiveIn1
,COUNT(CASE WHEN ItemStatus = 2 AND InPeriod1 = 1 THEN 1 ELSE NULL END) AS ClosedIn1
,COUNT(CASE WHEN ItemStatus = 1 AND InPeriod2 = 1 THEN 1 ELSE NULL END) AS ActiveIn2
,COUNT(CASE WHEN ItemStatus = 2 AND InPeriod2 = 1 THEN 1 ELSE NULL END) AS ClosedIn2
FROM AllData
GROUP BY PersonID,Person
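One requirement in the question is that an Item is counted only once per person even when several of its documents match. A possible way to cover that is COUNT(DISTINCT ...) over the item id instead of a plain COUNT. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; the helpers count_col and report are hypothetical names introduced for illustration, and the optional period bounds are passed as None when unset.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Items (Id INTEGER PRIMARY KEY, ItemName TEXT, ItemStatus INTEGER);
CREATE TABLE ItemDocuments (Id INTEGER PRIMARY KEY, ItemId INTEGER, DocumentName TEXT, Date TEXT);
CREATE TABLE DocumentProcessors (PersonId INTEGER, DocumentId INTEGER);
INSERT INTO Persons VALUES (1, 'Peter'), (2, 'John'), (3, 'Anna');
INSERT INTO Items VALUES (10, 'Cake', 1), (20, 'Dog', 2);
INSERT INTO ItemDocuments VALUES
    (101, 10, 'CakeDocument1', '2016-01-01 00:00:00'),
    (201, 20, 'DogDocument1',  '2016-02-02 00:00:00'),
    (301, 10, 'CakeDocument2', '2016-03-03 00:00:00'),
    (401, 20, 'DogDocument2',  '2016-04-04 00:00:00');
INSERT INTO DocumentProcessors VALUES (1, 101), (1, 201), (2, 301);
""")


def count_col(status, period):
    """Build one COUNT(DISTINCT ...) column for a status and a period (hypothetical helper)."""
    return (
        f"COUNT(DISTINCT CASE WHEN i.ItemStatus = {status} "
        f"AND d.Date BETWEEN COALESCE(:start{period}, '1753-01-01') "
        f"AND COALESCE(:end{period}, '9999-12-31') THEN i.Id END)"
    )


def report(start1=None, end1=None, start2=None, end2=None):
    """Return (Name, Active1, Closed1, Active2, Closed2) for every person."""
    query = f"""
    SELECT p.Name,
           {count_col(1, 1)}, {count_col(2, 1)},
           {count_col(1, 2)}, {count_col(2, 2)}
    FROM Persons AS p
    LEFT JOIN DocumentProcessors AS dp ON dp.PersonId = p.Id
    LEFT JOIN ItemDocuments AS d ON d.Id = dp.DocumentId
    LEFT JOIN Items AS i ON i.Id = d.ItemId
    GROUP BY p.Id, p.Name
    ORDER BY p.Id
    """
    params = {"start1": start1, "end1": end1, "start2": start2, "end2": end2}
    return conn.execute(query, params).fetchall()


for row in report():
    print(row)
```

The LEFT JOIN chain keeps persons without documents in the result, and the CASE returns NULL for non-matching rows, so their distinct counts come out as 0.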

Why MySQL full outer join returns nulls?

Hi
I have the following data:
s_id,date,p_id,amount_sold
1, '2015-10-01', 1, 10
2, '2015-10-01', 2, 12
7, '2015-10-01', 1, 11
3, '2015-10-02', 1, 11
4, '2015-10-02', 2, 10
5, '2015-10-15', 1, 22
6, '2015-10-16', 2, 20
8, '2015-10-22', 3, 444
and I want my query to output something like this (A = sum of amount_sold for p_id=1 for that date, B = sum of amount_sold for p_id=2 for that date):
date,A,B,Difference
'2015-10-01',21,12,9
'2015-10-02',11,10,1
'2015-10-15',22,0,22
'2015-10-16',0,20,-20
I tried this query, but the result it returns contains NULLs and the output is wrong:
SELECT A.p_id,A.date,sum(A.amount_sold) A,B.Bs, (sum(A.amount_sold) - B.Bs) as difference FROM sales as A
LEFT JOIN (
SELECT SUM( amount_sold ) Bs,p_id,s_id, DATE
FROM sales
WHERE p_id =2
group by date
) as B ON A.s_id = B.s_id
where A.p_id=1 or B.p_id=2
group by A.date, A.p_id
UNION
SELECT A.p_id,A.date,sum(A.amount_sold) A,B.Bs, (sum(A.amount_sold) - B.Bs) as difference FROM sales as A
RIGHT JOIN (
SELECT SUM( amount_sold ) Bs,p_id,s_id, DATE
FROM sales
WHERE p_id =2
group by date
) as B ON A.s_id = B.s_id
where B.p_id=2
group by A.date, A.p_id
It returned:
p_id date A Bs difference
1 2015-10-01 21 NULL NULL
2 2015-10-01 12 12 0
1 2015-10-02 11 NULL NULL
2 2015-10-02 10 10 0
1 2015-10-15 22 NULL NULL
2 2015-10-16 20 20 0
What am I doing wrong here, and what is the correct way of doing it? Any help would be appreciated.
A full join isn't needed. You can use conditional aggregation instead:
select
date,
sum(case when p_id = 1 then amount_sold else 0 end) a,
sum(case when p_id = 2 then amount_sold else 0 end) b,
sum(case when p_id = 1 then amount_sold else 0 end)
- sum(case when p_id = 2 then amount_sold else 0 end) difference
from sales
where p_id in (1,2)
group by date
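The query above can be checked against the sample rows from the question. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; an ORDER BY is added only to make the printed order deterministic, and the difference is computed in a single SUM for brevity.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (s_id INTEGER PRIMARY KEY, date TEXT, p_id INTEGER, amount_sold INTEGER);
INSERT INTO sales VALUES
    (1, '2015-10-01', 1, 10),
    (2, '2015-10-01', 2, 12),
    (7, '2015-10-01', 1, 11),
    (3, '2015-10-02', 1, 11),
    (4, '2015-10-02', 2, 10),
    (5, '2015-10-15', 1, 22),
    (6, '2015-10-16', 2, 20),
    (8, '2015-10-22', 3, 444);
""")

# One pass over the table: each SUM only adds the rows for its own product,
# so a date with sales for just one product still yields 0 for the other.
rows = conn.execute("""
SELECT date,
       SUM(CASE WHEN p_id = 1 THEN amount_sold ELSE 0 END) AS a,
       SUM(CASE WHEN p_id = 2 THEN amount_sold ELSE 0 END) AS b,
       SUM(CASE WHEN p_id = 1 THEN amount_sold
                WHEN p_id = 2 THEN -amount_sold
                ELSE 0 END) AS difference
FROM sales
WHERE p_id IN (1, 2)
GROUP BY date
ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

Since every row is read once and grouped by date, no join is involved, so the NULLs produced by unmatched join rows in the original attempt cannot appear.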

Complex querying on table with multiple userids

I have a table like this:
score
id week status
1 1 0
2 1 1
3 1 0
4 1 0
1 2 0
2 2 1
3 2 0
4 2 0
1 3 1
2 3 1
3 3 1
4 3 0
I want to get the ids of all people who have a status of zero for every week except week 3, where the status is 1. Something like this:
Result:
id w1.status w2.status w3.status
1 0 0 1
3 0 0 1
I have this query, but it is terribly inefficient on larger datasets.
SELECT w1.id, w1.status, w2.status, w3.status
FROM
(SELECT s.id, s.status
FROM score s
WHERE s.week = 1) w1
LEFT JOIN
(SELECT s.id, s.status
FROM score s
WHERE s.week = 2) w2 ON w1.id=w2.id
LEFT JOIN
(SELECT s.id, s.status
FROM score s
WHERE s.week = 3) w3 ON w1.id=w3.id
WHERE w1.status=0 AND w2.status=0 AND w3.status=1
I am looking for a more efficient way to calculate the above.
select id
from score
where week in (1, 2, 3)
group by id
having sum(
case
when week in (1, 2) and status = 0 then 1
when week = 3 and status = 1 then 1
else 0
end
) = 3
Or more generically...
select id
from score
group by id
having
sum(case when status = 0 then 1 else 0 end) = count(*) - 1
and min(case when status = 1 then week else null end) = max(week)
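The first HAVING query returns only the ids. If the per-week status columns from the question are also wanted, one option is to pivot them with MAX(CASE ...) in the same grouped query. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for the original database; the column aliases w1_status, w2_status and w3_status are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the original database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE score (id INTEGER, week INTEGER, status INTEGER);
INSERT INTO score VALUES
    (1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0),
    (1, 2, 0), (2, 2, 1), (3, 2, 0), (4, 2, 0),
    (1, 3, 1), (2, 3, 1), (3, 3, 1), (4, 3, 0);
""")

# A single scan of score: MAX(CASE ...) pivots each week's status into its own
# column, and the HAVING sum counts how many of the three weeks match the
# wanted pattern (0, 0, 1), keeping only ids where all three match.
rows = conn.execute("""
SELECT id,
       MAX(CASE WHEN week = 1 THEN status END) AS w1_status,
       MAX(CASE WHEN week = 2 THEN status END) AS w2_status,
       MAX(CASE WHEN week = 3 THEN status END) AS w3_status
FROM score
WHERE week IN (1, 2, 3)
GROUP BY id
HAVING SUM(CASE
             WHEN week IN (1, 2) AND status = 0 THEN 1
             WHEN week = 3 AND status = 1 THEN 1
             ELSE 0
           END) = 3
ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

This assumes one row per id and week, as in the sample data; with duplicates the sum could reach 3 for the wrong reason.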
You can do it using NOT EXISTS:
select
t1.id,
'0' as `w1_status` ,
'0' as `w2_status`,
'1' as `w3_status`
from score t1
where
t1.week = 3
and t1.status = 1
and not exists(
select 1 from score t2
where t1.id = t2.id and t1.week <> t2.week and t2.status = 1
);
For better performance you can add an index on the table:
alter table score add index week_status_idx (week,status);
If the number of weeks is static (1-3), group_concat may be used as a hack.
Concept:
SELECT
id,
group_concat(status ORDER BY week) AS totalStatus
FROM
score
WHERE
week IN (1, 2, 3)
GROUP BY
id
HAVING
totalStatus = '0,0,1' /* w1=0, w2=0, w3=1 */
(Written on the go. Not tested)
SELECT p1.id, p1.status, p2.status, p3.status
FROM score p1
JOIN score p2 ON p1.id = p2.id
JOIN score p3 ON p2.id = p3.id
WHERE p1.week = 1
AND p1.status = 0
AND p2.week = 2
AND p2.status = 0
AND p3.week = 3
AND p3.status = 1
Try this, should work