I need to write a query and I can't figure it out. I have an actions table (user_id, action, created_at), and I need to retrieve all users who performed the same actions as current_user, in exactly the same order.
Example:
current_user delete 2022/03/19 13:40
current_user add_post 2022/03/19 13:45
current_user write_comment 2022/03/22 13:48
Query result:
user_5 delete 2021/03/15 14:50
user_5 add_post 2021/05/15 13:50
user_5 write_comment 2022/06/06 14:30
user_6 delete 2021/03/15 14:50
user_6 add_post 2021/05/15 13:50
user_6 write_comment 2022/06/06 14:30
(all users with the same actions)
You don't stipulate that the match has to be in groups of exactly 3, so the query further down identifies matching action sequences of 1, 2, 3 or even more than 3 actions. First, some sample data:
CREATE TABLE actions(
user_id VARCHAR(50) NOT NULL
,action VARCHAR(50) NOT NULL
,created_at TIMESTAMP NOT NULL -- TIMESTAMP, not DATE: the time of day decides the order
);
INSERT INTO actions(user_id,action,created_at)
VALUES
('current_user','delete','2022/03/19 13:40')
, ('current_user','add_post','2022/03/19 13:45')
, ('current_user','write_comment','2022/03/22 13:48')
, ('other_user','delete','2022/02/19 13:40')
, ('other_user','add_post','2022/02/19 13:45')
, ('other_user','write_comment','2022/02/22 13:48')
, ('diff_user','delete','2022/02/19 12:40')
, ('diff_user','add_post','2022/02/19 12:42')
, ('diff_user','other_action','2022/02/19 12:45')
, ('diff_user','write_comment','2022/02/22 12:48')
;
It appears (from a comment) that lag() is available to you, which means window functions are supported, so I suggest using row_number() to line up the current user's action sequence against every other user's, as follows:
SELECT ou.*
FROM (
SELECT *
, row_number() OVER (
PARTITION BY user_id ORDER BY created_at
) AS rn
FROM actions
WHERE user_id <> 'current_user'
) AS ou
INNER JOIN (
SELECT *
, row_number() OVER (
ORDER BY created_at
) AS rn
FROM actions
WHERE user_id = 'current_user'
) AS cu ON ou.rn = cu.rn
AND ou.action = cu.action
result
+------------+---------------+---------------------+----+
| user_id | action | created_at | rn |
+------------+---------------+---------------------+----+
| diff_user | delete | 2022-02-19 12:40:00 | 1 |
| diff_user | add_post | 2022-02-19 12:42:00 | 2 |
| other_user | delete | 2022-02-19 13:40:00 | 1 |
| other_user | add_post | 2022-02-19 13:45:00 | 2 |
| other_user | write_comment | 2022-02-22 13:48:00 | 3 |
+------------+---------------+---------------------+----+
Now, if you really do want to limit this to a sequence match of exactly 3, you could add count(*) over (partition by user_id) and then keep only the rows where that count is 3:
SELECT *
FROM (
SELECT ou.*
, count(*) OVER (PARTITION BY ou.user_id) AS cn
FROM (
SELECT *
, row_number() OVER (
PARTITION BY user_id ORDER BY created_at
) AS rn
FROM actions
WHERE user_id <> 'current_user'
) AS ou
INNER JOIN (
SELECT *
, row_number() OVER (
ORDER BY created_at
) AS rn
FROM actions
WHERE user_id = 'current_user'
) AS cu ON ou.rn = cu.rn
AND ou.action = cu.action
) d
WHERE cn = 3
result
+------------+---------------+---------------------+----+----+
| user_id | action | created_at | rn | cn |
+------------+---------------+---------------------+----+----+
| other_user | delete | 2022-02-19 13:40:00 | 1 | 3 |
| other_user | add_post | 2022-02-19 13:45:00 | 2 | 3 |
| other_user | write_comment | 2022-02-22 13:48:00 | 3 | 3 |
+------------+---------------+---------------------+----+----+
For reference: db<>fiddle (NB: using Postgres, as MySQL 8 wasn't available at the time).
BTW: you probably also need to introduce the concept of a "session" into this logic, but that is left for you to consider.
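If hard-coding 3 is a concern, the count can be compared with the length of the current user's own sequence instead. Below is a minimal sketch of that idea, run through Python's sqlite3 module as a stand-in for MySQL 8 (this assumes a bundled SQLite of 3.25 or newer, which is when window functions arrived). The helper name matching_users and the CTE names cu and ou are made up for illustration.

```python
import sqlite3

# Sample rows from the answer above, stored as ISO text so they sort chronologically.
ROWS = [
    ("current_user", "delete", "2022-03-19 13:40"),
    ("current_user", "add_post", "2022-03-19 13:45"),
    ("current_user", "write_comment", "2022-03-22 13:48"),
    ("other_user", "delete", "2022-02-19 13:40"),
    ("other_user", "add_post", "2022-02-19 13:45"),
    ("other_user", "write_comment", "2022-02-22 13:48"),
    ("diff_user", "delete", "2022-02-19 12:40"),
    ("diff_user", "add_post", "2022-02-19 12:42"),
    ("diff_user", "other_action", "2022-02-19 12:45"),
    ("diff_user", "write_comment", "2022-02-22 12:48"),
]

MATCH_SQL = """
WITH cu AS (          -- the reference user's actions, numbered in time order
    SELECT action, row_number() OVER (ORDER BY created_at) AS rn
    FROM actions
    WHERE user_id = :uid
), ou AS (            -- everybody else's actions, numbered per user
    SELECT user_id, action,
           row_number() OVER (PARTITION BY user_id ORDER BY created_at) AS rn
    FROM actions
    WHERE user_id <> :uid
)
SELECT ou.user_id
FROM ou
JOIN cu ON ou.rn = cu.rn AND ou.action = cu.action
GROUP BY ou.user_id
HAVING count(*) = (SELECT count(*) FROM cu)   -- every position matched
ORDER BY ou.user_id
"""


def matching_users(conn, user_id):
    """Hypothetical helper: users whose first actions repeat user_id's whole sequence."""
    return [row[0] for row in conn.execute(MATCH_SQL, {"uid": user_id})]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actions (user_id TEXT NOT NULL, action TEXT NOT NULL, created_at TEXT NOT NULL)"
)
conn.executemany("INSERT INTO actions VALUES (?, ?, ?)", ROWS)
print(matching_users(conn, "current_user"))
```

Note that this is still a prefix match: a user who performs the same three actions first and then does something else would also be returned, which is one reason the "session" question matters.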
I am writing a query that uses the window function row_number() to find duplicates, and I am trying to delete those duplicates.
To do this I wrote a CTE containing the window function and tried to delete the duplicate rows from it. However, I get an error saying the target of the delete is not updatable.
select * from housingdata;
.
.
.
with rownumcte as (
select * ,row_number() over (partition by ParcelID, PropertyAddress,
SalePrice,saledate,LegalReference order by UniqueID) as rownum
from housingdata)
delete
from rownumcte
where rownum>1;
If I use select instead of delete, I get the duplicates as output, 104 rows in total.
Yes, CTEs are very good for many things, but not for your purpose.
Use an INNER JOIN instead.
CREATE TABLE housingdata (
UniqueID int
, ParcelID int
, PropertyAddress varchar(50)
, SalePrice DECIMAL(10,2)
, saledate Date
, LegalReference int
);
INSERT INTO housingdata VALUES (1,1,'test',1.1, NOW(), 1),(2,1,'test',1.1, NOW(), 1);
delete hd
FROM housingdata hd INNER JOIN
(
select UniqueID ,row_number() over (partition by ParcelID, PropertyAddress,
SalePrice,saledate,LegalReference order by UniqueID) as rownum
from housingdata) t1 ON hd.UniqueID = t1.UniqueID
WHERE t1.rownum>1;
SELECT * FROM housingdata
UniqueID | ParcelID | PropertyAddress | SalePrice | saledate | LegalReference
-------: | -------: | :-------------- | --------: | :--------- | -------------:
1 | 1 | test | 1.10 | 2022-02-25 | 1
db<>fiddle here
UPDATE
You could also have used the CTE as a joined table:
with rownumcte as (
select UniqueID ,row_number() over (partition by ParcelID, PropertyAddress,
SalePrice,saledate,LegalReference order by UniqueID) as rownum
from housingdata)
delete hd
from housingdata hd INNER JOIN rownumcte r ON hd.UniqueID = r.UniqueID
where rownum>1;
SELECT * FROM housingdata
UniqueID | ParcelID | PropertyAddress | SalePrice | saledate | LegalReference
-------: | -------: | :-------------- | --------: | :--------- | -------------:
1 | 1 | test | 1.10 | 2022-02-25 | 1
db<>fiddle here
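For anyone who wants to try the same row_number() idea outside MySQL, here is a rough sketch using Python's sqlite3 module. SQLite is only a stand-in here, and since it has no DELETE ... JOIN, the numbered rows are fed to the delete through an IN subquery instead (window functions need SQLite 3.25 or newer). The third sample row and the names DEDUPE_SQL and numbered are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE housingdata (
           UniqueID INTEGER PRIMARY KEY, ParcelID INTEGER, PropertyAddress TEXT,
           SalePrice REAL, saledate TEXT, LegalReference INTEGER)"""
)
conn.executemany(
    "INSERT INTO housingdata VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "test", 1.1, "2022-02-25", 1),
        (2, 1, "test", 1.1, "2022-02-25", 1),   # duplicate of row 1
        (3, 2, "other", 2.5, "2022-02-25", 7),  # hypothetical row without a duplicate
    ],
)

# Number the rows inside each group of identical values; everything numbered
# above 1 is a duplicate and gets deleted, so the lowest UniqueID survives.
DEDUPE_SQL = """
DELETE FROM housingdata
WHERE UniqueID IN (
    SELECT UniqueID
    FROM (
        SELECT UniqueID,
               row_number() OVER (
                   PARTITION BY ParcelID, PropertyAddress, SalePrice, saledate, LegalReference
                   ORDER BY UniqueID
               ) AS rownum
        FROM housingdata
    ) AS numbered
    WHERE rownum > 1
)
"""

deleted = conn.execute(DEDUPE_SQL).rowcount
remaining = [r[0] for r in conn.execute("SELECT UniqueID FROM housingdata ORDER BY UniqueID")]
print(deleted, remaining)
```

Running the select part on its own first is a cheap way to check which rows would be removed before committing to the delete.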
I need to select the last message of each conversation for the user with a given id.
If the last message in a conversation was sent to the given id, the result has to be that last message from the sender.
Here is the test case without creationDate using messageID:
+-----------+------------+----------+------+
| messageID | fromUserID | toUserID | text |
+-----------+------------+----------+------+
| 1 | 1 | 2 | 'aa' |
| 2 | 1 | 3 | 'ab' |
| 3 | 2 | 1 | 'ac' |
| 4 | 2 | 1 | 'ad' |
| 5 | 3 | 2 | 'ae' |
+-----------+------------+----------+------+
The result for userID=1 has to be the messages with text 'ab' and 'ad'.
For now I have this query, which returns the last message each user sent to each other user, but for my test case it does not remove the message with id=1 (the result should contain only id=2 and id=4).
SELECT
UM.messageID,
UM.fromUserID, UM.toUserID,
UM.text, UM.flags, UM.creationDate
FROM UserMessage AS UM
INNER JOIN
(
SELECT
MAX(messageID) AS maxMessageID
FROM UserMessage
GROUP BY fromUserID, toUserID
) IUM
ON UM.messageID = IUM.maxMessageID
WHERE UM.fromUserID = 1 OR UM.toUserID = 1
ORDER BY UM.messageID DESC
A simple method is:
select um.*
from usermessage um
where 1 in (um.fromuserid, um.touserid) -- only conversations of the given user
and um.messageid = (select max(um2.messageid)
from usermessage um2
where (um2.fromuserid, um2.touserid) in ( (um.fromuserid, um.touserid), (um.touserid, um.fromuserid) )
);
Or, in MySQL 8+:
select um.*
from (select um.*,
row_number() over (partition by least(um.fromuserid, um.touserid), greatest(um.fromuserid, um.touserid) order by um.messageid desc) as seqnum
from usermessage um
where 1 in (um.fromuserid, um.touserid) -- only conversations of the given user
) um
where seqnum = 1;
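To check the window-function version against the test case from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL 8. It assumes SQLite 3.25 or newer, and it uses SQLite's two-argument min() and max() where MySQL has least() and greatest(). The helper name last_messages and the alias ranked are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserMessage (messageID INTEGER PRIMARY KEY, fromUserID INTEGER, toUserID INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO UserMessage VALUES (?, ?, ?, ?)",
    [(1, 1, 2, "aa"), (2, 1, 3, "ab"), (3, 2, 1, "ac"), (4, 2, 1, "ad"), (5, 3, 2, "ae")],
)

# A conversation is the unordered pair of user ids, so both directions land in
# the same partition; the newest message in each partition gets seqnum = 1.
LAST_MESSAGES_SQL = """
SELECT messageID, text
FROM (
    SELECT um.*,
           row_number() OVER (
               PARTITION BY min(fromUserID, toUserID), max(fromUserID, toUserID)
               ORDER BY messageID DESC
           ) AS seqnum
    FROM UserMessage AS um
    WHERE :uid IN (fromUserID, toUserID)
) AS ranked
WHERE seqnum = 1
ORDER BY messageID
"""


def last_messages(conn, user_id):
    """Hypothetical helper: (messageID, text) of the newest message per conversation."""
    return conn.execute(LAST_MESSAGES_SQL, {"uid": user_id}).fetchall()


print(last_messages(conn, 1))
```

Filtering on the user inside the derived table keeps conversations between other users (such as message 5 here) out of the ranking altogether.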
I'm currently in the process of converting data from one structure to another, and in the process I have to take a status id from the first entry in the group and apply it to the last entry in that same group. I am able to target and update the last item in the group just fine when using a hard-coded value, but I'm hitting a wall when trying to use the status_id from the first entry. Here is an example of the data structure.
-----------------------------------------------------------
| id | ticket_id | status_id | new_status_id | created_at |
-----------------------------------------------------------
| 1 | 10 | NULL | 3 | 2018-06-20 |
| 2 | 10 | 1 | 1 | 2018-06-22 |
| 3 | 10 | 1 | 1 | 2018-06-23 |
| 4 | 10 | 1 | 1 | 2018-06-26 |
-----------------------------------------------------------
So the idea would be to take the new_status_id of ID 1 and apply it to the same field for ID 4.
Here is the query that works when using a hard-coded value
UPDATE Communications_History as ch
JOIN
(
SELECT communication_id, MAX(created_at) max_time, new_status_id
FROM Communications_History
GROUP BY communication_id
) ch2
ON ch.communication_id = ch2.communication_id AND ch.created_at = ch2.max_time
SET ch.new_status_id = 3
But when I use the following query, I get "Unknown column ch.communication_id in where clause":
UPDATE Communications_History as ch
JOIN
(
SELECT communication_id, MAX(created_at) max_time, new_status_id
FROM Communications_History
GROUP BY communication_id
) ch2
ON ch.communication_id = ch2.communication_id AND ch.created_at = ch2.max_time
SET ch.new_status_id = (
SELECT nsi FROM
(
SELECT new_status_id FROM Communications_History WHERE communication_id = ch.communication_id AND status_id IS NULL
) as ch3
)
Thanks!
So I just figured it out by reading the value straight from the joined subquery. It turns out the original "solution" only worked when there was one ticket's worth of history in the table; once all the data was imported, it no longer worked. However, this tweak did seem to fix the issue.
UPDATE Communications_History as ch
JOIN
(
SELECT communication_id, MAX(created_at) max_time, new_status_id
FROM Communications_History
GROUP BY communication_id
) ch2
ON ch.communication_id = ch2.communication_id AND ch.created_at = ch2.max_time
SET ch.new_status_id = ch2.new_status_id;
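One caveat with the query above: ch2 selects new_status_id without aggregating it, so MySQL rejects it when ONLY_FULL_GROUP_BY is enabled and otherwise does not guarantee which row in the group the value comes from. Below is a rough sketch of a variant that picks the first and last rows explicitly, run through Python's sqlite3 module. This is SQLite-specific as written: MySQL refuses a subquery on the table being updated (error 1093), so there the same two lookups would have to be joined in as derived tables. "First" is assumed to mean the earliest created_at, and the second ticket is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Communications_History (
           id INTEGER PRIMARY KEY, communication_id INTEGER, status_id INTEGER,
           new_status_id INTEGER, created_at TEXT)"""
)
conn.executemany(
    "INSERT INTO Communications_History VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, None, 3, "2018-06-20"),
        (2, 10, 1, 1, "2018-06-22"),
        (3, 10, 1, 1, "2018-06-23"),
        (4, 10, 1, 1, "2018-06-26"),
        (5, 11, None, 7, "2018-07-01"),  # hypothetical second ticket
        (6, 11, 2, 2, "2018-07-03"),
    ],
)

# WHERE picks the newest row of each group; SET copies the value from the
# oldest row of the same group. Ties on created_at are broken by id.
COPY_FIRST_TO_LAST_SQL = """
UPDATE Communications_History
SET new_status_id = (
    SELECT f.new_status_id
    FROM Communications_History AS f
    WHERE f.communication_id = Communications_History.communication_id
    ORDER BY f.created_at, f.id
    LIMIT 1
)
WHERE id = (
    SELECT l.id
    FROM Communications_History AS l
    WHERE l.communication_id = Communications_History.communication_id
    ORDER BY l.created_at DESC, l.id DESC
    LIMIT 1
)
"""

conn.execute(COPY_FIRST_TO_LAST_SQL)
result = conn.execute(
    "SELECT id, new_status_id FROM Communications_History ORDER BY id"
).fetchall()
print(result)
```

With two tickets in the table, only the last row of each ticket changes, and each one takes the value from its own ticket's first row.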
If I have a table of cases:
CASE_NUMBER | CASE_ID | STATUS | SUBJECT |
----------------------------------------------------------------
3108 | 123456 | Closed_Billable | Something Interesting
3109 | 325124 | Closed_Billable | Broken printer
3110 | 432432 | Open_Assigned | Email not working
And a table of calls:
PARENT_ID | STATUS | DUR(H) | DUR(M) | SUBJECT
---------------------------------------------------------------
123456 | Held | 1 | 30 | Initial discussion
123456 | Cancelled | 0 | 0 | Walk user through
123456 | Held | 0 | 45 | Remote debug session
325124 | Held | 1 | 0 | Consultation
325124 | Held | 1 | 15 | Needs assessment
432432 | Held | 1 | 30 | Support call
And a table of meetings:
PARENT_ID | STATUS | DUR(H) | DUR(M) | SUBJECT
-------------------------------------------------------
123456 | Held | 3 | 15 | On-site work
325124 | Held | 2 | 0 | Un-jam printer
432432 | Held | 1 | 0 | Reconnect network
How do I do a select with these parameters (this is not working code, obviously):
SELECT cases.case_number, cases.subject, calls.subject, meetings.subject
WHERE cases.status="Closed_Billable" AND (calls.status="Held" OR meetings.status="Held")
LEFT JOIN cases
ON cases.case_id = calls.parent_id
LEFT JOIN cases
ON cases.case_id = meetings.parent_id
and end up with a "faked" nested table like:
CASE_NUMBER | CASE SUBJECT | # CALLS | # MEETINGS | CALL SUBJECT | MEETING SUBJECT | DURATION (H) | DURATION (M) | TOTAL
-----------------------------------------------------------------------------------------------------------------------------------------
3108 | Something Interesting | 2 | 1 | | | | | 5.5H
| | | | Initial Discussion | | 1 | 30 |
| | | | Remote Debug Session | | 0 | 45 |
| | | | | On-site work | 3 | 15 |
3109 | Broken printer | 2 | 1 | | | | | 4.25H
| | | | Consultation | | 1 | 0 |
| | | | Needs assessment | | 1 | 15 |
| | | | | Un-jam printer | 2 | 0 |
I've tried joins and subqueries the best I can figure out, but I get repeated entries - for example, each Meeting in a Case will show say 3 times, once for each Call in that case.
I'm stumped! Obviously there's other fields I'm pulling here, and doing COUNTs of Calls and Meetings, and SUMs of their durations, but I'd be happy just to show a table/sub-table like this.
Is it even possible?
Thanks,
David.
Assembling a query result in the exact format you want is somewhat of a pain. It can be done, but presentation work like that is best left to the application.
That said, this will do what you want:
select case when case_id > floor(case_id) then ''
else case_number
end case_number,
coalesce(q1.c, '') calls,
coalesce(q2.c, '') meetings,
coalesce(calls.subject, '') `call subject`,
coalesce(meetings.subject, '') `meeting subject`,
case when calls.subject is not null then calls.dhour
when meetings.subject is not null then meetings.dhour
else ''
end dhour,
case when calls.subject is not null then calls.dmin
when meetings.subject is not null then meetings.dmin
else ''
end dmin,
coalesce(q3.total, '') total
from
(
select case_number, case_id
from cases where status = 'Closed_Billable'
union select case_number, concat(case_id, '.1')
from cases where status = 'Closed_Billable'
union select case_number, concat(case_id, '.2')
from cases where status = 'Closed_Billable'
) main
left join
(select parent_id, count(*) c
from calls
where status != 'Cancelled'
group by parent_id ) q1
on q1.parent_id = case_id
left join
(select parent_id, count(*) c
from meetings
group by parent_id) q2
on q2.parent_id = case_id
left join
(select parent_id, sum(dhour + m) total
from
(select parent_id, dhour, dmin / 60 m
from calls
where status != 'Cancelled'
union all
select parent_id, dhour, dmin / 60 m
from meetings
) qq
group by parent_id
) q3
on q3.parent_id = case_id
left join calls
on concat(calls.parent_id, '.1') = main.case_id
left join meetings
on concat(meetings.parent_id, '.2') = main.case_id
order by case_id asc
Note: I've renamed your duration fields because I dislike the parentheses in them.
We have to mangle the case_id a little inside the query to get your blank rows and fields; that is what makes the query cumbersome.
There's a demo here: http://sqlfiddle.com/#!9/d59d4/21
Edited code to work with the different schema in the fiddle from the comments:
select case when case_id > floor(case_id) then ''
else case_number
end case_number,
coalesce(q1.c, '') calls,
coalesce(q2.c, '') meetings,
coalesce(calls.name, '') `call subject`,
coalesce(meetings.name, '') `meeting subject`,
case when calls.name is not null then calls.duration_hours
when meetings.name is not null then meetings.duration_hours
else ''
end duration_hours,
case when calls.name is not null then calls.duration_minutes
when meetings.name is not null then meetings.duration_minutes
else ''
end duration_minutes,
coalesce(q3.total, '') total
from
(
select case_number, id as case_id
from cases where status = 'Closed_Billable'
union select case_number, concat(id, '.1') as case_id
from cases where status = 'Closed_Billable'
union select case_number, concat(id, '.2') as case_id
from cases where status = 'Closed_Billable'
) main
left join
(select parent_id, count(*) c
from calls
where status != 'Cancelled'
group by parent_id ) q1
on q1.parent_id = case_id
left join
(select parent_id, count(*) c
from meetings
group by parent_id) q2
on q2.parent_id = case_id
left join
(select parent_id, sum(duration_hours + m) total
from
(select parent_id, duration_hours, duration_minutes / 60 m
from calls
where status != 'Cancelled'
union all
select parent_id, duration_hours, duration_minutes / 60 m
from meetings
) qq
group by parent_id
) q3
on q3.parent_id = case_id
left join calls
on concat(calls.parent_id, '.1') = main.case_id
left join meetings
on concat(meetings.parent_id, '.2') = main.case_id
order by case_id asc
You can't really get final results like that without some seriously ugly "wrapper" queries, of this sort:
SET @prevCaseNum := 'blahdyblahnowaythisshouldmatchanything';
SET @prevCaseSub := 'seeabovetonotmatchanything';
SELECT IF(@prevCaseNum = CASE_NUMBER, '', CASE_NUMBER) AS CASE_NUMBER
, IF(@prevCaseNum = CASE_NUMBER AND @prevCaseSub = CASE_SUBJECT, '', CASE_SUBJECT) AS CASE_SUBJECT
, etc.....
, @prevCaseNum := CASE_NUMBER AS prevCaseNum
, @prevCaseSub := CASE_SUBJECT AS prevCaseSub
, etc....
FROM ( [the real query] ORDER BY CASE_NUMBER, etc....) AS trq
;
And then wrap all that with another select to strip the prevCase fields.
And even this still won't give you the blanks you want on the "upper right".
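The repeated entries described in the question come from joining calls and meetings to cases in the same query, so every call row gets paired with every meeting row. If the layout is left to the application, the per-case numbers can be fetched without that fan-out by aggregating each child table before joining. Here is a minimal sketch of that, using Python's sqlite3 module as a stand-in for MySQL, with the duration columns renamed to dhour and dmin as in the first answer. The aliases ca, m, n_calls, n_meetings and total_hours are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cases (case_number INTEGER, case_id INTEGER, status TEXT, subject TEXT);
    CREATE TABLE calls (parent_id INTEGER, status TEXT, dhour INTEGER, dmin INTEGER, subject TEXT);
    CREATE TABLE meetings (parent_id INTEGER, status TEXT, dhour INTEGER, dmin INTEGER, subject TEXT);
    INSERT INTO cases VALUES
        (3108, 123456, 'Closed_Billable', 'Something Interesting'),
        (3109, 325124, 'Closed_Billable', 'Broken printer'),
        (3110, 432432, 'Open_Assigned', 'Email not working');
    INSERT INTO calls VALUES
        (123456, 'Held', 1, 30, 'Initial discussion'),
        (123456, 'Cancelled', 0, 0, 'Walk user through'),
        (123456, 'Held', 0, 45, 'Remote debug session'),
        (325124, 'Held', 1, 0, 'Consultation'),
        (325124, 'Held', 1, 15, 'Needs assessment'),
        (432432, 'Held', 1, 30, 'Support call');
    INSERT INTO meetings VALUES
        (123456, 'Held', 3, 15, 'On-site work'),
        (325124, 'Held', 2, 0, 'Un-jam printer'),
        (432432, 'Held', 1, 0, 'Reconnect network');
    """
)

# Each child table is reduced to one row per case before it is joined,
# so a case with 2 calls and 1 meeting still yields exactly one row.
SUMMARY_SQL = """
SELECT c.case_number, c.subject,
       coalesce(ca.n, 0) AS n_calls,
       coalesce(m.n, 0) AS n_meetings,
       coalesce(ca.hours, 0) + coalesce(m.hours, 0) AS total_hours
FROM cases AS c
LEFT JOIN (
    SELECT parent_id, count(*) AS n, sum(dhour + dmin / 60.0) AS hours
    FROM calls WHERE status = 'Held' GROUP BY parent_id
) AS ca ON ca.parent_id = c.case_id
LEFT JOIN (
    SELECT parent_id, count(*) AS n, sum(dhour + dmin / 60.0) AS hours
    FROM meetings WHERE status = 'Held' GROUP BY parent_id
) AS m ON m.parent_id = c.case_id
WHERE c.status = 'Closed_Billable'
ORDER BY c.case_number
"""

summary = conn.execute(SUMMARY_SQL).fetchall()
for row in summary:
    print(row)
```

The detail lines (call and meeting subjects) can then be fetched with two plain queries keyed on parent_id and interleaved under each case header in application code.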
I have two tables: contacts and client_profiles. A contact has many client_profiles, where client_profiles has foreign key contact_id:
contacts:
mysql> SELECT id,first_name, last_name FROM contacts;
+----+-------------+-----------+
| id | first_name | last_name |
+----+-------------+-----------+
| 10 | THERESA | CAMPBELL |
| 11 | donato | vig |
| 12 | fdgfdgf | gfdgfd |
| 13 | some random | contact |
+----+-------------+-----------+
4 rows in set (0.00 sec)
client_profiles:
mysql> SELECT id, contact_id, created_at FROM client_profiles;
+----+------------+---------------------+
| id | contact_id | created_at |
+----+------------+---------------------+
| 6 | 10 | 2014-10-09 17:17:43 |
| 7 | 10 | 2014-10-10 11:38:01 |
| 8 | 10 | 2014-10-10 12:20:41 |
| 9 | 10 | 2014-10-10 12:24:19 |
| 11 | 12 | 2014-10-10 12:35:32 |
+----+------------+---------------------+
I want to get the latest client_profiles row for each contact, which means there should be two results. I want to use subqueries to achieve this. This is the query I came up with:
SELECT `client_profiles`.*
FROM `client_profiles`
INNER JOIN `contacts`
ON `contacts`.`id` = `client_profiles`.`contact_id`
WHERE (client_profiles.id =
(SELECT `client_profiles`.`id` FROM `client_profiles` ORDER BY created_at desc LIMIT 1))
However, this is only returning one result. It should return client_profiles with id 9 and 11.
What is wrong with my subquery?
It looks like the subquery in your WHERE clause is not tied to the outer row, so it always returns the single newest profile in the whole table and the outer query can only ever match that one row.
Keeping everything in the WHERE clause, but correlating the subquery on the contact, looks like this:
SELECT `cp`.*
FROM `client_profiles` AS `cp`
JOIN `contacts`
ON `contacts`.`id` = `cp`.`contact_id`
WHERE `cp`.`id` = (
SELECT `cp2`.`id`
FROM `client_profiles` AS `cp2`
WHERE `cp2`.`contact_id` = `cp`.`contact_id` -- newest profile of this contact only
ORDER BY `cp2`.`created_at` DESC
LIMIT 1
)
Tell me what you think.
Should be something like maybe:
SELECT *
FROM `client_profiles`
INNER JOIN `contacts`
ON `contacts`.`id` = `client_profiles`.`contact_id`
GROUP BY `client_profiles`.`contact_id`
ORDER BY created_at desc;
http://sqlfiddle.com/#!2/a3f21b/9
You need to prequery the client_profiles table, grouped by contact, using MAX(created_at). From that, join back to contacts to get the person, then join again to client_profiles on the same contact ID, also matching the max date from the prequery:
SELECT
c.id,
c.first_name,
c.last_name,
IDByMaxDate.maxCreate,
cp.id as clientProfileID
from
( select contact_id,
MAX( created_at ) maxCreate
from
client_profiles
group by
contact_id ) IDByMaxDate
JOIN contacts c
ON IDByMaxDate.contact_id = c.id
JOIN client_profiles cp
ON IDByMaxDate.contact_id = cp.contact_id
AND IDByMaxDate.maxCreate = cp.created_at
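To see the prequery approach against the sample data from the question, here is a small sketch run through Python's sqlite3 module as a stand-in for MySQL. The query is the one above, with the derived table renamed to latest and an ORDER BY added so the output is stable; both of those are my additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE client_profiles (id INTEGER PRIMARY KEY, contact_id INTEGER, created_at TEXT);
    INSERT INTO contacts VALUES
        (10, 'THERESA', 'CAMPBELL'), (11, 'donato', 'vig'),
        (12, 'fdgfdgf', 'gfdgfd'), (13, 'some random', 'contact');
    INSERT INTO client_profiles VALUES
        (6, 10, '2014-10-09 17:17:43'), (7, 10, '2014-10-10 11:38:01'),
        (8, 10, '2014-10-10 12:20:41'), (9, 10, '2014-10-10 12:24:19'),
        (11, 12, '2014-10-10 12:35:32');
    """
)

# Step 1 (derived table): newest created_at per contact.
# Step 2: join back to contacts and to the profile row carrying that timestamp.
LATEST_SQL = """
SELECT c.id, c.first_name, c.last_name, latest.maxCreate, cp.id AS clientProfileID
FROM (
    SELECT contact_id, MAX(created_at) AS maxCreate
    FROM client_profiles
    GROUP BY contact_id
) AS latest
JOIN contacts AS c
  ON latest.contact_id = c.id
JOIN client_profiles AS cp
  ON latest.contact_id = cp.contact_id
 AND latest.maxCreate = cp.created_at
ORDER BY c.id
"""

latest_profiles = conn.execute(LATEST_SQL).fetchall()
for row in latest_profiles:
    print(row)
```

Contacts without any profile (ids 11 and 13 here) drop out because of the inner joins, and if two profiles of one contact shared the exact same created_at, both would be returned.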