SQL - Two SUMS on the same column need to be mutually exclusive - mysql

I am trying to write a SQL query in MySQL Workbench that will return to me the sums of records I moved to a particular status considering only the latest timestamp for a particular record. I also need to do this without a sub query (or nested select).
Given the table below, I want to know that the user with id 1 moved two records to the status with id 2. If the same record was moved to several different status ids, only the latest status for that record should be counted.
Table
user_id  acted_on_record_id  moved_to_status_id  timestamp
1        1234                2                   2022-01-01 19:39:37
1        1234                3                   2022-01-02 19:39:37
1        1234                2                   2022-01-03 19:39:37
1        5678                2                   2022-01-03 19:39:37
Here is the query I have so far:
SELECT t1.user_id, t1.acted_on_record_id,
SUM(DISTINCT IF(t1.moved_to_status_id = 3, 1, 0)) AS pending,
SUM(DISTINCT IF(t1.moved_to_status_id = 2, 1, 0)) AS open,
MAX(t1.timestamp) as timestamp
FROM table1 t1
GROUP BY t1.user_id, t1.acted_on_record_id
This is the result I want:
user_id  acted_on_record_id  pending  open  timestamp
1        1234                0        1     2022-01-03 19:39:37
1        5678                0        1     2022-01-03 19:39:37
However, my query gives me this result:
user_id  acted_on_record_id  pending  open  timestamp
1        1234                1        1     2022-01-03 19:39:37
1        5678                0        1     2022-01-03 19:39:37
It shows a 1 in both pending and 1 in open columns because the SUM IF aggregates are not mutually exclusive or distinct on the acted_on_record_id. Is there a way to have these two aggregates know about each other and only sum the one with the greater timestamp without using a sub query (nested select)?

I eventually figured it out by expanding on a solution here: Retrieving the last record in each group - MySQL
I used a LEFT JOIN to compare the table against itself. The join condition pairs each row with any later row for the same user and record, and WHERE t2.user_id IS NULL keeps only the rows that have no later row, i.e. the latest row per record. This query returned in 1.095 seconds, whereas my prior solution (not posted) using a subquery returned in 15.268 seconds.
SELECT t1.user_id, t1.acted_on_record_id,
SUM(IF(t1.moved_to_status_id = 3, 1, 0)) AS pending,
SUM(IF(t1.moved_to_status_id = 2, 1, 0)) AS open,
MAX(t1.timestamp) AS timestamp
-- assumes table1 has an increasing id column (not shown in the sample table above)
FROM table1 t1 LEFT JOIN table1 t2
ON (t1.acted_on_record_id = t2.acted_on_record_id AND t1.user_id = t2.user_id AND t1.id < t2.id)
WHERE t2.user_id IS NULL
GROUP BY t1.user_id, t1.acted_on_record_id, t1.moved_to_status_id
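To check the self-join idea against the sample data from the question, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL. The `id` column, the function name and the `pending_count` / `open_count` aliases are assumptions made for this example, and `CASE WHEN` replaces MySQL's `IF()` so the query also runs on SQLite.

```python
import sqlite3

def latest_status_counts():
    """Count only the latest status per (user, record) with a LEFT JOIN anti-join."""
    conn = sqlite3.connect(":memory:")
    # id is a hypothetical increasing key; the question's table does not show it.
    conn.execute(
        """CREATE TABLE table1 (
               id INTEGER PRIMARY KEY,
               user_id INTEGER,
               acted_on_record_id INTEGER,
               moved_to_status_id INTEGER,
               timestamp TEXT)"""
    )
    rows = [
        (1, 1234, 2, "2022-01-01 19:39:37"),
        (1, 1234, 3, "2022-01-02 19:39:37"),
        (1, 1234, 2, "2022-01-03 19:39:37"),
        (1, 5678, 2, "2022-01-03 19:39:37"),
    ]
    conn.executemany(
        "INSERT INTO table1 (user_id, acted_on_record_id, moved_to_status_id, timestamp) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    query = """
        SELECT t1.user_id, t1.acted_on_record_id,
               SUM(CASE WHEN t1.moved_to_status_id = 3 THEN 1 ELSE 0 END) AS pending_count,
               SUM(CASE WHEN t1.moved_to_status_id = 2 THEN 1 ELSE 0 END) AS open_count,
               MAX(t1.timestamp) AS latest_timestamp
        FROM table1 t1
        LEFT JOIN table1 t2
               ON t1.acted_on_record_id = t2.acted_on_record_id
              AND t1.user_id = t2.user_id
              AND t1.id < t2.id          -- t2 is any later row for the same record
        WHERE t2.user_id IS NULL         -- keep rows that have no later row
        GROUP BY t1.user_id, t1.acted_on_record_id
        ORDER BY t1.acted_on_record_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in latest_status_counts():
        print(row)
```

Because only one row per record survives the anti-join, each record contributes to exactly one of the two sums.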

Related

SQL Query to filter rows using a sub query which queries the same table

I am trying to build a query which filters rows based on other rows as in the following example
TableA
Id status type date
300 approved ACTIVE 11/12/2015 10:00:00
300 approved ACTIVE 11/12/2015 10:10:00
300 approved INACTIVE 11/12/2015 11:00:00
200 approved ACTIVE 11/12/2015 11:10:00
200 approved INACTIVE 11/12/2015 11:10:00
100 approved ACTIVE 11/12/2015 11:10:00
From the above table I am trying to return the Ids that have an equal number of ACTIVE and INACTIVE rows. For example, Id 300 has two rows with type ACTIVE and one row with type INACTIVE, so it should be excluded from my results, whereas Id 200 has one ACTIVE and one INACTIVE row and should be included.
In the above table, status and type can have other values too, but I only care about the ones listed above and want to ignore the others.
So, for TableA final result of the query would be
Id
200
I tried to run the query below, but it didn't give me the results I expected
SELECT Id
FROM TableA oa
WHERE oa.type in('ACTIVE','INACTIVE')
and oa.status='APPROVED'
and not EXISTS(SELECT
x.id
,COUNT(*)
FROM (SELECT DISTINCT
id
,c.type
FROM TableA c
WHERE
AND c.type IN ('ACTIVE','INACTIVE')
AND c.status = 'APPROVED'
AND c.Id= ta.Id
)x
GROUP BY x.Id
HAVING COUNT(*) = 1)
Can the above query be corrected to get the required results?
You just need to use SUM and HAVING:
SELECT Id
FROM TableA
WHERE
status = 'approved'
AND type In ('ACTIVE', 'INACTIVE')
GROUP BY Id
HAVING
SUM(CASE WHEN type = 'ACTIVE' THEN 1 ELSE 0 END) =
SUM(CASE WHEN type = 'INACTIVE' THEN 1 ELSE 0 END)
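The SUM/HAVING answer can be tried out on the sample rows with a short Python sketch using the standard sqlite3 module (SQLite standing in for the asker's database). The function name and the ISO-style date strings are assumptions made for this example.

```python
import sqlite3

def ids_with_balanced_types():
    """Return Ids whose approved rows have as many ACTIVE as INACTIVE entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableA (Id INTEGER, status TEXT, type TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO TableA VALUES (?, ?, ?, ?)",
        [
            (300, "approved", "ACTIVE", "2015-12-11 10:00:00"),
            (300, "approved", "ACTIVE", "2015-12-11 10:10:00"),
            (300, "approved", "INACTIVE", "2015-12-11 11:00:00"),
            (200, "approved", "ACTIVE", "2015-12-11 11:10:00"),
            (200, "approved", "INACTIVE", "2015-12-11 11:10:00"),
            (100, "approved", "ACTIVE", "2015-12-11 11:10:00"),
        ],
    )
    query = """
        SELECT Id
        FROM TableA
        WHERE status = 'approved'
          AND type IN ('ACTIVE', 'INACTIVE')
        GROUP BY Id
        HAVING SUM(CASE WHEN type = 'ACTIVE' THEN 1 ELSE 0 END) =
               SUM(CASE WHEN type = 'INACTIVE' THEN 1 ELSE 0 END)
        ORDER BY Id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

if __name__ == "__main__":
    print(ids_with_balanced_types())
```

Id 100 drops out because its counts are 1 and 0, and Id 300 because they are 2 and 1.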

MySQL Query to find users still inside room

Below is my database table, where I will have Check In and Check Out entry records for attending the conference room.
id registration_id roomno day type
1 101 1 2 In
2 103 1 2 In
3 101 1 2 Out
4 105 1 2 In
5 103 1 2 Out
6 101 1 2 In
7 103 1 2 In
8 101 1 2 Out
9 105 1 2 In
10 103 1 2 Out
Now I want to select the users who are still attending the conference, i.e. those whose last record has type = In. There can be multiple In/Out entries for each user during a day.
Please let me know the quickest possible MySQL query.
Thanks
Answer which I ended up using:
select * from `registrations_inouts` t
group by t.registration_id
having max(id) = max(case when type = 'In' then id end)
order by rand() limit 1;
Here is one method using not exists, which keeps an In row only if the same user has no later row:
select *
from t
where t.type = 'In' and
not exists (select 1
from t t2
where t2.registration_id = t.registration_id and t2.id > t.id
);
Another method uses conditional aggregation:
select t.registration_id
from t
group by t.registration_id
having max(id) = max(case when type = 'In' then id end);
Note: both of these assume that the ids are assigned sequentially in time, so larger ids are later in time.
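As a quick check of the "last record is In" logic on the sample data, here is a sketch in Python with the standard sqlite3 module (SQLite standing in for MySQL). The table name `t` follows the answer, the function names are made up for this example, and the NOT EXISTS form shown here looks for any later row of the same user, which is one way to express "last record" under the sequential-id assumption.

```python
import sqlite3

ROWS = [
    (1, 101, 1, 2, "In"), (2, 103, 1, 2, "In"), (3, 101, 1, 2, "Out"),
    (4, 105, 1, 2, "In"), (5, 103, 1, 2, "Out"), (6, 101, 1, 2, "In"),
    (7, 103, 1, 2, "In"), (8, 101, 1, 2, "Out"), (9, 105, 1, 2, "In"),
    (10, 103, 1, 2, "Out"),
]

def _connect():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, registration_id INTEGER, "
        "roomno INTEGER, day INTEGER, type TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn

def still_inside_by_aggregation():
    """Users whose highest id overall is also their highest 'In' id."""
    conn = _connect()
    query = """
        SELECT registration_id
        FROM t
        GROUP BY registration_id
        HAVING MAX(id) = MAX(CASE WHEN type = 'In' THEN id END)
        ORDER BY registration_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

def still_inside_by_not_exists():
    """'In' rows that have no later row for the same user."""
    conn = _connect()
    query = """
        SELECT t.id, t.registration_id
        FROM t
        WHERE t.type = 'In'
          AND NOT EXISTS (SELECT 1 FROM t t2
                          WHERE t2.registration_id = t.registration_id
                            AND t2.id > t.id)
        ORDER BY t.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    print(still_inside_by_aggregation())
    print(still_inside_by_not_exists())
```

On this data both forms point at registration 105, whose last entry (id 9) is an In.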

MySQL - add column to table and insert "tag" if order is from new customer

I have simple table:
Order_ID Client_ID Date Order_Status
1 1 01/01/2015 3
2 2 05/01/2015 3
3 1 06/01/2015 3
4 2 10/01/2015 3
5 1 12/01/2015 4
6 1 05/02/2015 3
I want to identify orders from new customers, i.e. orders placed in the same month in which that customer made their first order with Order_Status = 3.
So the output table should look like this:
Order_ID Client_ID Date Order_Status Order_from_new_customer
1 1 01/01/2015 3 yes
2 2 05/01/2015 3 yes
3 1 06/01/2015 3 yes
4 2 10/01/2015 3 yes
5 1 12/01/2015 4 NULL
6 1 05/02/2015 3 no
I wasn't able to successfully figure out the query. Thanks a lot for any help.
Join with a subquery that gets the date of the first order by each customer.
SELECT o.*, IF(MONTH(o.date) = MONTH(f.date) AND YEAR(o.date) = YEAR(f.date),
'yes', 'no') AS order_from_new_customer
FROM orders AS o
JOIN (SELECT Client_ID, MIN(date) AS date
FROM orders
WHERE Order_Status = 3
GROUP BY Client_ID) AS f
ON o.Client_ID = f.Client_ID
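Here is a runnable sketch of the join-with-first-order idea, written in Python with the standard sqlite3 module (SQLite standing in for MySQL). It is a variant rather than the exact query above: `strftime('%Y-%m', ...)` replaces `MONTH()`/`YEAR()`, a `CASE` branch returns NULL for orders whose status is not 3 (as in the desired output), and the dates are assumed to be day/month/year and are stored in ISO format. The function name and the `first_date` alias are made up for this example.

```python
import sqlite3

def flag_new_customer_orders():
    """Flag each order as 'yes'/'no'/None relative to the client's first status-3 month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (Order_ID INTEGER, Client_ID INTEGER, "
        "Date TEXT, Order_Status INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, 1, "2015-01-01", 3),
            (2, 2, "2015-01-05", 3),
            (3, 1, "2015-01-06", 3),
            (4, 2, "2015-01-10", 3),
            (5, 1, "2015-01-12", 4),
            (6, 1, "2015-02-05", 3),
        ],
    )
    query = """
        SELECT o.Order_ID,
               CASE
                   WHEN o.Order_Status <> 3 THEN NULL
                   WHEN strftime('%Y-%m', o.Date) = strftime('%Y-%m', f.first_date)
                        THEN 'yes'
                   ELSE 'no'
               END AS order_from_new_customer
        FROM orders AS o
        JOIN (SELECT Client_ID, MIN(Date) AS first_date   -- first status-3 order per client
              FROM orders
              WHERE Order_Status = 3
              GROUP BY Client_ID) AS f
          ON o.Client_ID = f.Client_ID
        ORDER BY o.Order_ID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in flag_new_customer_orders():
        print(row)
```

Comparing year and month together avoids matching the same month of a different year.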
Use a CASE statement along with a SELF JOIN like below
select t1.*,
case when t1.Order_Status = 3 and MONTH(t1.`date`) = 1 then 'yes'
when t1.Order_Status = 3 and MONTH(t1.`date`) <> 1 then 'no'
else null end as Order_from_new_customer
from order_table t1 join order_table t2
on t1.Order_ID < t2.Order_ID
and t1.Client_ID = t2.Client_ID;
If your order table gets big, the solutions from Rahul and Barmar will tend to get slow.
I would hope your shop will get many orders and you will run into performance trouble ;-). So I would suggest marking the very first order of a new customer with a tinyint column, and when you have the comfort of a tinyint, you could code it like:
0 : unknown
1 : very first order
2 : order in first month
3 : order in "grown-up" mode.
The very first order you can probably mark easily: everyone loves a bright new customer enough to store this event somehow during the first ordering. The other orders you can identify in a background job / cronjob by their "0" for unknown, or you can mark your old customers and store the "3" on their orders.
The result-set can be achieved without any table-join or subquery:
select
if(Order_Status<>3,null,if(@first_date:=if(@prev_client_id!=Client_ID,month(date),@first_date)=month(date),"yes","no")) as Order_from_new_customer
,Order_ID,Client_ID,date,Order_Status,@prev_client_id:=client_id
from
t1,
(select @prev_client_id:="",@first_date:="")t
order by Client_ID ,date
One extra column added for computation and order by clause is used.
Verify result at http://sqlfiddle.com/#!9/83c29f/24

Incorrect GROUP by function due to CONCAT

I am using SUM() to count lines with multiple conditions in the following manner
SELECT date(time),
SUM(CASE WHEN crit1 = 1 AND crit2 NOT LIKE CONCAT('%',table1.id,'%') THEN 1 ELSE 0 END)
FROM table1, table2
GROUP BY date(time)
However what I noticed is that # of items that are counted is a multiple of items in the table1. So if there are 100 items that meet crit1 and crit2 criteria, and there are 5 items in table1, it would give me 500 items summed.
I added and removed items from table1 and it proportionally affected the SUM clause, to verify this.
How can it be summed without double counting for every case in CONCAT, or maybe a better way of counting all together?
Data structure:
table1        table2
id   name     time                 crit1  crit2
123  A        2013-05-15 05:00:00  1      456
234  B        2013-05-15 05:00:00  2      789
345  C        2013-05-15 05:00:00  1      678
Note: IDs are unique
Desired output:
2013-05-15 2
I believe you want this:
SELECT date(time), COUNT(crit2)
FROM table2
WHERE
table2.crit1 = 1
AND
table2.crit2 NOT IN (SELECT id FROM table1)
GROUP BY date(time)
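To see both the multiplication effect and the fix on the sample data, here is a sketch in Python with the standard sqlite3 module (SQLite standing in for MySQL). String concatenation uses `||` instead of `CONCAT()`, and the function names are made up for this example. Note that `NOT IN` tests exact equality while the original `NOT LIKE` tests for a substring, so the two conditions agree only when ids never appear inside longer crit2 values.

```python
import sqlite3

def _connect():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE table2 (time TEXT, crit1 INTEGER, crit2 INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                     [(123, "A"), (234, "B"), (345, "C")])
    conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
        ("2013-05-15 05:00:00", 1, 456),
        ("2013-05-15 05:00:00", 2, 789),
        ("2013-05-15 05:00:00", 1, 678),
    ])
    return conn

def count_with_cross_join():
    """The original shape: every table2 row is paired with every table1 row."""
    conn = _connect()
    query = """
        SELECT date(time),
               SUM(CASE WHEN crit1 = 1
                         AND crit2 NOT LIKE '%' || table1.id || '%'
                        THEN 1 ELSE 0 END)
        FROM table1, table2
        GROUP BY date(time)
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

def count_with_not_in():
    """The suggested shape: each table2 row is counted at most once."""
    conn = _connect()
    query = """
        SELECT date(time), COUNT(crit2)
        FROM table2
        WHERE crit1 = 1
          AND crit2 NOT IN (SELECT id FROM table1)
        GROUP BY date(time)
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    print(count_with_cross_join())
    print(count_with_not_in())
```

With three rows in table1, the cross join counts each of the two matching table2 rows three times, which is the multiple the question describes.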

MySQL order a table by groups

I have a table with the fields id, group, left, level and createdAt.
Every row belongs to a group. The row with level = 0 is the group "leader".
I want to sort the table by the leaders' date, and within each group sort the rows by left. For example, in this table:
Id - Group - Left - Level - CreatedAt
1 1 1 0 00:10
2 1 2 1 00:20
3 2 1 0 00:00
4 1 3 1 00:30
5 2 2 1 00:40
The order should be:
Id - Group - Left - Level - CreatedAt
3 2 1 0 00:00
5 2 2 1 00:40
1 1 1 0 00:10
2 1 2 1 00:20
4 1 3 1 00:30
Because row 3 is the earliest group leader, it should come first, followed by the rest of its group ordered by left. After that comes row 1, the next group leader, followed by its group ordered by left.
And so on.
I hope I explained it clear enough.
Thanks!
Essentially, you need to join your table with the leader's time:
SELECT my_table.*
FROM my_table NATURAL JOIN (
SELECT my_table.Group, MIN(my_table.CreatedAt) AS LeaderTime
FROM my_table
WHERE my_table.Level = 0
GROUP BY my_table.Group
) t
ORDER BY t.LeaderTime, my_table.Left
See it on sqlfiddle.
If you can guarantee that there is exactly one leader for every group, then you can avoid the grouping operation. (A UNIQUE constraint on (Group, Level) would not be a way to enforce this here, because your example contains two records in Group = 1 with Level = 1.)
SELECT my_table.*
FROM my_table JOIN my_table AS leader
ON leader.Group = my_table.Group AND leader.Level = 0
ORDER BY leader.CreatedAt, my_table.Left
In order to sort on the group leader's createdAt, you can join each row to the row describing the group leader, then do the sorting:
SELECT t1.*
FROM my_table t1
INNER JOIN my_table t2
ON t1.Group = t2.Group AND t2.Level = 0
ORDER BY t2.CreatedAt, t1.Left;
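The leader self-join can be verified on the sample rows with a small Python sketch using the standard sqlite3 module (SQLite standing in for MySQL). `Group` and `Left` are quoted because they are keywords, the times are kept as plain strings, and the function name is made up for this example.

```python
import sqlite3

def ids_in_leader_order():
    """Order rows by their group leader's CreatedAt, then by Left within the group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE my_table (Id INTEGER, "Group" INTEGER, "Left" INTEGER, '
        "Level INTEGER, CreatedAt TEXT)"
    )
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 1, 0, "00:10"),
            (2, 1, 2, 1, "00:20"),
            (3, 2, 1, 0, "00:00"),
            (4, 1, 3, 1, "00:30"),
            (5, 2, 2, 1, "00:40"),
        ],
    )
    query = """
        SELECT t1.Id
        FROM my_table t1
        INNER JOIN my_table t2                       -- t2 is the leader row of t1's group
                ON t1."Group" = t2."Group" AND t2.Level = 0
        ORDER BY t2.CreatedAt, t1."Left"
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

if __name__ == "__main__":
    print(ids_in_leader_order())
```

This relies on each group having exactly one row with Level = 0; with several leaders per group the join would return duplicate rows.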