SQL to get conversations from two tables - mysql

I have two tables, one that stores incoming messages and one that stores outgoing messages. I would like a conversation view of the messages, so that all incoming and outgoing messages from/to the same user id are grouped together and the conversations are ordered by their most recent message (in or out).
Outgoing
----------
user_id
time
message
Incoming
----------
user_id
time
message
What I would like is to display the results such as
-> User A 9:10 pm Nice ...
<- User A 8:45 pm Our special is pepperoni!
-> User A 8:00 pm What's your special dish?
<- User B 9:00 pm We open at 5
-> User B 6:56 pm Hello What time to you open?
<- User C 8:43 pm Thanks!
-> User C 4:00 pm Loved the pizza today!!
Any idea how to write a query to do this?
EDIT
If user B then texts back in, the result should be:
-> User B 9:15 pm Ok great!
<- User B 9:00 pm We open at 5
-> User B 6:56 pm Hello What time to you open?
-> User A 9:10 pm Nice ...
<- User A 8:45 pm Our special is pepperoni!
-> User A 8:00 pm What's your special dish?
<- User C 8:43 pm Thanks!
-> User C 4:00 pm Loved the pizza today!!

You need to UNION the two tables and sort (ORDER BY) accordingly:
SELECT
'<-' AS direction, user_id, time, message
FROM
Outgoing
UNION ALL
SELECT
'->', user_id, time, message
FROM
Incoming
ORDER BY
user_id ASC,
time DESC ;
After the additional explanations for the complex ordering:
SELECT
CASE WHEN m.d = 1 THEN '<-' ELSE '->' END AS direction,
m.user_id, m.time, m.message
FROM
( SELECT
u.user_id,
GREATEST( COALESCE(mo.time, mi.time),
COALESCE(mi.time, mo.time) ) AS maxtime
FROM
( SELECT user_id FROM Outgoing
UNION
SELECT user_id FROM Incoming
) AS u
LEFT JOIN
( SELECT user_id, MAX(time) AS time FROM Outgoing GROUP BY user_id
) AS mo
ON mo.user_id = u.user_id
LEFT JOIN
( SELECT user_id, MAX(time) AS time FROM Incoming GROUP BY user_id
) AS mi
ON mi.user_id = u.user_id
) AS b
JOIN
( SELECT 1 AS d, user_id, time, message FROM Outgoing
UNION ALL
SELECT 2 AS d, user_id, time, message FROM Incoming
) AS m
ON m.user_id = b.user_id
ORDER BY
b.maxtime DESC,   -- conversation with the newest message first
m.user_id ASC,
m.time DESC ;
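To sanity-check the conversation ordering, here is a minimal sketch that runs an equivalent query against the question's sample data. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained), so it finds each user's latest time with MAX over a UNION ALL instead of GREATEST. Storing the times as 24-hour strings is also just an illustration choice.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Outgoing (user_id TEXT, time TEXT, message TEXT);
CREATE TABLE Incoming (user_id TEXT, time TEXT, message TEXT);
INSERT INTO Incoming VALUES
  ('A', '20:00', 'What''s your special dish?'),
  ('A', '21:10', 'Nice ...'),
  ('B', '18:56', 'Hello What time to you open?'),
  ('B', '21:15', 'Ok great!'),
  ('C', '16:00', 'Loved the pizza today!!');
INSERT INTO Outgoing VALUES
  ('A', '20:45', 'Our special is pepperoni!'),
  ('B', '21:00', 'We open at 5'),
  ('C', '20:43', 'Thanks!');
""")

QUERY = """
SELECT m.direction, m.user_id, m.time, m.message
FROM (
    SELECT '<-' AS direction, user_id, time, message FROM Outgoing
    UNION ALL
    SELECT '->', user_id, time, message FROM Incoming
) AS m
JOIN (
    -- latest message time per user, across both tables
    SELECT user_id, MAX(time) AS maxtime
    FROM (
        SELECT user_id, time FROM Outgoing
        UNION ALL
        SELECT user_id, time FROM Incoming
    ) AS all_msgs
    GROUP BY user_id
) AS b ON b.user_id = m.user_id
ORDER BY b.maxtime DESC,  -- most recently active conversation first
         m.user_id ASC,   -- keeps conversations apart if two share a maxtime
         m.time DESC      -- newest message first inside a conversation
"""

rows = conn.execute(QUERY).fetchall()
for direction, user_id, time, message in rows:
    print(direction, "User", user_id, time, message)
```

With this data the conversations come out in the order B, A, C, which matches the EDIT in the question.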

Something like this should get the results by user and time. You would need to handle the display at the application level to show messages per user:
select * from (
select '->' as direction, o.* from outgoing o
union all
select '<-' as direction, i.* from incoming i
) M
order by user_id asc, time desc
Sample Output:
| DIRECTION | USER_ID | TIME | MESSAGE |
----------------------------------------------------------------------------------------
| -> | 1 | November, 29 2012 21:10:00+0000 | Nice ... |
| <- | 1 | November, 29 2012 20:45:00+0000 | Our special is pepperoni! |
| -> | 1 | November, 29 2012 20:00:00+0000 | What's your special dish? |
| <- | 2 | November, 29 2012 21:00:00+0000 | We open at 5 |
| -> | 2 | November, 29 2012 18:56:00+0000 | Hello What time to you open? |
| <- | 3 | November, 29 2012 20:43:00+0000 | Thanks! |
| -> | 3 | November, 29 2012 16:00:00+0000 | Loved the pizza today!! |
Demo: http://www.sqlfiddle.com/#!2/602c1/11

Personally, I am not a big fan of how you have your tables structured. What's an incoming message to one user is an outgoing message to another, meaning you need to duplicate every message in the system in each table.
I would probably just have a single messages table with a to and from field. If you had a single table like this:
message_id (primary key)
from_user_id (indexed)
to_user_id (indexed)
message
time (indexed)
Your query would be simple:
SELECT *
FROM messages
WHERE from_user_id = ? OR to_user_id = ?
ORDER BY time DESC
Note that this doesn't give you an easy query for display in the manner you are showing (you would need some post-query data manipulation). But it does give you the most efficient lookup query and saves you from storing every message twice.
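As a rough illustration of the single-table idea, the sketch below derives the direction arrow from from_user_id with a CASE expression, which covers part of the display work. It runs on Python's sqlite3 as a stand-in for MySQL; the sample rows, the `me` variable and the `other_user` alias are hypothetical and not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
  message_id   INTEGER PRIMARY KEY,
  from_user_id INTEGER,
  to_user_id   INTEGER,
  message      TEXT,
  time         TEXT
);
INSERT INTO messages (from_user_id, to_user_id, message, time) VALUES
  (2, 1, 'What''s your special dish?', '20:00'),
  (1, 2, 'Our special is pepperoni!',  '20:45'),
  (2, 1, 'Nice ...',                   '21:10'),
  (3, 4, 'Unrelated message',          '21:30');
""")

me = 1  # hypothetical id of the user whose messages are listed

rows = conn.execute("""
SELECT CASE WHEN from_user_id = :me THEN '<-' ELSE '->' END AS direction,
       CASE WHEN from_user_id = :me THEN to_user_id
            ELSE from_user_id END AS other_user,
       time, message
FROM messages
WHERE from_user_id = :me OR to_user_id = :me
ORDER BY time DESC
""", {"me": me}).fetchall()

for row in rows:
    print(row)
```

Grouping the rows into conversations by `other_user` would still be left to the application, as the answer says.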
If you need to stick to the concept of grouped conversations (or even to extend to multi-party messages), then perhaps you could look at having a conversations table and modify your schema to be something like this:
conversations (many-to-many join table)
conversation_id (indexed)
user_id (indexed)
(compound primary key across both fields)
messages
message_id (primary key)
conversation_id (indexed)
sending_user_id
message
time (indexed)
With a query like this
SELECT m.sending_user_id, m.message, m.time
FROM conversations AS c
INNER JOIN messages AS m ON c.conversation_id = m.conversation_id
WHERE c.user_id = ?
ORDER BY c.conversation_id, m.time DESC
In the query result, if sending_user_id equals the id of the current user, it is an outgoing message; otherwise it is a message from one of the other conversation participants.

Why separate tables? You could put those in the same table and add a column of type bit with 1 and 0 representing incoming and outgoing. Then your query is as simple as this (INOUT is a reserved word in MySQL, so the column name is quoted with backticks):
select user_id, time, message, `inout` from message order by user_id, time
To me the direction is telling you something about the message; either way it's still a message.
If you still have to do it the other way, then you'll have to do a union, but expect poorer performance. The best performance tweak you can give your query is up-front table design.
with message as (
select user_id, time, message, 'incoming' as direction from incoming
union all
select user_id, time, message, 'outgoing' as direction from outgoing
) select * from message order by user_id, time
or something like that (the WITH clause needs MySQL 8.0 or later).
Also, you should be wary of ordering by a time field. From experience you will find you get unexpected results if two messages come in with the same time. This is especially likely as your example is only granular to the minute, rather than second or microsecond. A better way is to have a numeric PK which is auto-assigned in ascending order. That way if the times are not unique you still have a way of determining order.
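A small sketch of that tie-breaker, using Python's sqlite3 as a stand-in for MySQL (where the column would be declared AUTO_INCREMENT). The table layout and rows are made up for illustration: three messages share the same minute, and adding the id to the ORDER BY makes their order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (
  id      INTEGER PRIMARY KEY AUTOINCREMENT,  -- assigned in ascending order
  user_id INTEGER,
  time    TEXT,
  message TEXT
);
INSERT INTO message (user_id, time, message) VALUES
  (1, '21:10', 'first'),
  (1, '21:10', 'second'),
  (1, '21:10', 'third');
""")

# time alone cannot separate these rows; id breaks the tie.
rows = conn.execute("""
SELECT id, message
FROM message
ORDER BY user_id, time, id
""").fetchall()

print(rows)
```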

Related

MySQL Max of a Date not returning the correct tuple

I have a table "messages", that stores messages sent to people over time, regarding some items.
The structure of the messages table is:
message_id
user_id
date_sent
created_at
For each user, I can have multiple tuples in the table.
Some of these messages are already sent, and some are not sent yet.
I'm trying to get the last created message for each user.
I'm using max(created_at) and a GROUP BY user_id, but the returned message_id is not the one from the tuple that has the max(created_at).
Table data:
message_id | user_id | date_sent | created_at
----------------------------------------------
1 1 2021-07-01 2021-07-01
2 1 2021-07-02 2021-07-02
3 2 2021-07-01 2021-07-01
4 3 2021-07-04 2021-07-04
5 1 2021-07-22 2021-07-22
6 1 NULL 2021-07-23
7 2 NULL 2021-07-29
8 1 NULL 2021-07-29
9 3 2021-07-29 2021-07-29
My Select:
select * from messages ma right join
( SELECT max(mb.created_at), message_id
FROM `messages` mb WHERE mb.created_at <= '2021-07-24'
group by user_id)
mc on ma.message_id=mc.message_id
the result is
message_id | user_id | date_sent | created_at
----------------------------------------------
5 1 2021-07-22 2021-07-23
3 2 2021-07-01 2021-07-01
4 3 2021-07-04 2021-07-04
I don't know why but for user 1, the message_id returned is not the one associated with the tuple that has the max(created_at).
I was expecting this (for each user_id, the tuple with the max(created_at)):
message_id | user_id | date_sent | created_at
----------------------------------------------
6 1 NULL 2021-07-23
3 2 2021-07-01 2021-07-01
4 3 2021-07-04 2021-07-04
Any idea? Any help?
thank you.
You're stumbling over MySQL's notorious nonstandard extension to GROUP BY. It gives you the illusion you can do things you can't. For example,
SELECT max(created_at), message_id
FROM messages
GROUP BY user_id
actually means
SELECT max(created_at), ANY_VALUE(message_id)
FROM messages
GROUP BY user_id
where ANY_VALUE() means MySQL can choose any message_id it finds most convenient from among that user's messages. That's not what you want.
To solve your problem, you first need a subquery that finds the latest created_at date for each user_id.
SELECT user_id, MAX(created_at) created_at
FROM messages
WHERE created_at <= '2021-07-24'
GROUP BY user_id
Then, you need to find the message for the particular user_id created on that date. Join to the subquery for that.
SELECT a.*
FROM messages a
JOIN (
SELECT user_id, MAX(created_at) created_at
FROM messages
WHERE created_at <= '2021-07-24'
GROUP BY user_id
) b ON a.user_id = b.user_id AND a.created_at = b.created_at
See how that JOIN works? It pulls out the rows matching the latest date for each user.
There's a possible optimization. If
your message_id is an autoincrementing primary key, and
you never UPDATE your created_at columns, but only set them to the current date when you INSERT the rows,
then the most recent message for each user_id is also the message with the largest message_id. In that case you can use this query instead.
SELECT a.*
FROM messages a
JOIN (
SELECT user_id, MAX(message_id) message_id
FROM messages
WHERE created_at <= '2021-07-24'
GROUP BY user_id
) b ON a.message_id=b.message_id
Due to the way primary key indexes work, this can be faster.
You want an ordinary JOIN rather than a RIGHT or LEFT JOIN here: the ordinary JOIN only returns rows that match the ON condition.
Pro tip: almost nobody actually uses RIGHT JOIN. When you want that kind of JOIN, use LEFT JOIN. You don't want that kind of join to solve this problem.
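To see the join-to-subquery approach on the question's own data, here is a minimal sketch. It runs on Python's sqlite3 as a stand-in for MySQL (an assumption for self-containment), and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
  message_id INTEGER PRIMARY KEY,
  user_id    INTEGER,
  date_sent  TEXT,
  created_at TEXT
);
INSERT INTO messages VALUES
  (1, 1, '2021-07-01', '2021-07-01'),
  (2, 1, '2021-07-02', '2021-07-02'),
  (3, 2, '2021-07-01', '2021-07-01'),
  (4, 3, '2021-07-04', '2021-07-04'),
  (5, 1, '2021-07-22', '2021-07-22'),
  (6, 1, NULL,         '2021-07-23'),
  (7, 2, NULL,         '2021-07-29'),
  (8, 1, NULL,         '2021-07-29'),
  (9, 3, '2021-07-29', '2021-07-29');
""")

rows = conn.execute("""
SELECT a.message_id, a.user_id, a.date_sent, a.created_at
FROM messages a
JOIN (
    -- latest created_at per user, up to the cutoff date
    SELECT user_id, MAX(created_at) AS created_at
    FROM messages
    WHERE created_at <= '2021-07-24'
    GROUP BY user_id
) b ON a.user_id = b.user_id AND a.created_at = b.created_at
ORDER BY a.user_id
""").fetchall()

for row in rows:
    print(row)
```

This returns messages 6, 3 and 4, the rows the question expected. If a user had two messages with the same latest created_at, both would be returned.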

MySQL time spent at work by specific user

I have a MySQL table like this:
+-----+----------+------------+--------------+-------------+
| id | user_id | added_on | status_enter | status_exit |
+-----+----------+------------+--------------+-------------+
Is it possible to count the time if the data is in other rows?
12:16:16 - 10:44:1
User Date Enter Exit
----------- -------------------- ------ ------
John 2021-06-25 10:44:15 1 0
John 2021-06-25 12:16:16 0 1
Not tested, but this SHOULD get what you are looking for. The outer query only looks at rows where a person clocked IN. The third column is a correlated subquery that finds, for the same user, the earliest check-out row whose ID is greater than the check-in's ID. It can therefore be NULL if the person is still clocked in. I would put an index on this table on (enter, user, exit, id) to help optimize the query.
select
tc.id,
tc.user,
tc.date,
( select min( tc2.date )
from TimeClockTable tc2
where tc.User = tc2.User
and tc.id < tc2.id
and tc2.enter = 0
and tc2.exit = 1 ) EndTime,
( select min( tc2.id )
from TimeClockTable tc2
where tc.User = tc2.User
and tc.id < tc2.id
and tc2.enter = 0
and tc2.exit = 1 ) EndTimeID
from
TimeClockTable tc
where
tc.enter = 1
FEEDBACK
If the date/time stamp is always going to be sequential with the ID as it is added, ie: ID #1234 on July 5 at 10:00am will ALWAYS be before #1235 on July 5 at 10:01am (you would never have an ID 1235 or higher that was BEFORE the date/time of ID #1234), then the above modification to the query should work for you. You are already getting the lowest date/time for the given user in comparison to the first, then calling it a second time to get the minimum ID would correlate to the same end time.
There you go. In MySQL, TIMESTAMPDIFF returns the difference between two datetimes in the requested unit:
SELECT T.user_id AS User,
CAST(T.added_on AS DATE) AS Date,
TIMESTAMPDIFF(
HOUR,
MIN(T.added_on),
MAX(T.added_on)
) AS TotalWorkTime
FROM WorkTable AS T
GROUP BY T.user_id,
CAST(T.added_on AS DATE)
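As a rough sketch of pairing each clock-in row with the next clock-out row, the example below uses the question's column names and its two sample rows. It runs on Python's sqlite3 as a stand-in for MySQL, and the duration is computed in Python; in MySQL the same difference could be taken with TIMESTAMPDIFF(SECOND, entered, exited). The WorkTable name follows the answer above, and the `entered`/`exited` aliases are hypothetical.

```python
import sqlite3
from datetime import datetime

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorkTable (
  id           INTEGER PRIMARY KEY,
  user_id      TEXT,
  added_on     TEXT,
  status_enter INTEGER,
  status_exit  INTEGER
);
INSERT INTO WorkTable (user_id, added_on, status_enter, status_exit) VALUES
  ('John', '2021-06-25 10:44:15', 1, 0),
  ('John', '2021-06-25 12:16:16', 0, 1);
""")

# For every clock-in row, find the earliest later clock-out of the same user.
shifts = conn.execute("""
SELECT e.user_id,
       e.added_on AS entered,
       (SELECT MIN(x.added_on)
          FROM WorkTable x
         WHERE x.user_id = e.user_id
           AND x.id > e.id
           AND x.status_exit = 1) AS exited
FROM WorkTable e
WHERE e.status_enter = 1
ORDER BY e.id
""").fetchall()

FMT = "%Y-%m-%d %H:%M:%S"
durations = [
    (user, datetime.strptime(exited, FMT) - datetime.strptime(entered, FMT))
    for user, entered, exited in shifts
    if exited is not None  # still clocked in: no exit row yet
]

for user, worked in durations:
    print(user, worked)  # prints "John 1:32:01"
```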

MySQL Query - data not showing as expected, problem in code

I am trying to query a database for the number of individuals who did not arrive for their booking on a given date. However, the results given are not as expected.
From manual checking, the result for 3rd May 2021 should be 3. I have a feeling that the customer IDs are being added together and the sum is displayed, rather than just the count of individual customer IDs.
select
count(c.CUSTOMER_ID) AS 'No Shows',
date(checkins.POSTDATE) as date
from
customers c, checkins
where
checkins.postdate >= date_sub(curdate(), interval 7 day)
and
(
c.archived = 0
and (
(
(
(
(
(
c.GUID in (
select
sb1.customer_guid
from
schedule_bookings sb1
join schedule_events se1 on sb1.course_guid = se1.course_guid
and sb1.OFFERING_ID in (
'2915911', '3022748', '3020740', '2915949',
'2914398', '2916147', '3022701',
'3020699', '2916185', '2915168',
'2916711', '3022403', '3020455',
'2916785', '2916478', '2915508',
'3022538', '3020582', '2915994',
'2914547', '2916069', '3022648',
'3020658', '2916107', '2915290',
'2928786', '2914729', '3022854',
'3020812', '2914694', '2914659',
'3041801', '2920756', '2920834',
'2920795', '2916223', '3022788',
'3020783', '2916239', '2915013'
)
and sb1.CANCELLED in ('0')
)
)
or (
c.GUID in (
select
sp.customer_guid
from
schedule_participants sp
join schedule_bookings sb2 on sp.BOOKING_ID = sb2.BOOKING_ID
join schedule_events se2 on sb2.course_guid = se2.course_guid
and sb2.OFFERING_ID in (
'2915911', '3022748', '3020740', '2915949',
'2914398', '2916147', '3022701',
'3020699', '2916185', '2915168',
'2916711', '3022403', '3020455',
'2916785', '2916478', '2915508',
'3022538', '3020582', '2915994',
'2914547', '2916069', '3022648',
'3020658', '2916107', '2915290',
'2928786', '2914729', '3022854',
'3020812', '2914694', '2914659',
'3041801', '2920756', '2920834',
'2920795', '2916223', '3022788',
'3020783', '2916239', '2915013'
)
and sb2.CANCELLED in ('0')
)
)
)
)
)
and (
(
(
not (
(
(
select
count(CHECKIN_ID)
from
checkins
where
checkins.CUSTOMER_ID = c.CUSTOMER_ID
) between 1
and 9999
)
)
)
)
)
)
)
and not c.customer_id in (1008, 283429, 2507795)
)
group by date(checkins.POSTDATE)
Here are the results:
+----------+------------+
| No Shows | date |
+----------+------------+
| 30627 | 2021-04-27 |
| 37638 | 2021-04-28 |
| 34071 | 2021-04-29 |
| 33579 | 2021-04-30 |
| 29274 | 2021-05-01 |
| 30135 | 2021-05-02 |
| 48339 | 2021-05-03 |
| 8979 | 2021-05-04 |
+----------+------------+
8 rows in set (8.71 sec)
As you can see, the count is nowhere near as intended.
The query parameters are:
Customer is a participant/bookee on the listed specific offerings (offering_id)
Customer's 'Check-in' count was not between 1 and 9999.
Display these results by count per date.
Can anyone see why this query would be not displaying the results as intended?
Kind Regards
Tom
Let's try to reverse this out some. You are dealing with a very finite set of offering IDs. How about starting with the finite list of offerings you are concerned with and joining on from that? Additionally, there does not appear to be any need for the join to the schedule events table. If something is booked, it's booked. You are never getting any additional context from the event itself.
So, let's start with a very simplified union. You look at the bookings table for the possible customer IDs, then at the actual participants for those same bookings. My GUESS is that not every person doing the actual booking is a participant and, likewise, not every participant is the booking party.
None of this has to do with the actual final customer, archive status or even the events for the booking. We are just getting people, period. Once you have the people and dates, then get the counts.
select
date(CI.POSTDATE) as date,
count( JustCustomers.customer_guid ) AS 'No Shows'
from
(
select
sb1.customer_guid
from
schedule_bookings sb1
where
sb1.CANCELLED = 0
-- if "ID" are numeric, dont use quotes to imply character
and sb1.OFFERING_ID in
( 2915911, 3022748, 3020740, 2915949,
2914398, 2916147, 3022701, 3020699,
2916185, 2915168, 2916711, 3022403,
3020455, 2916785, 2916478, 2915508,
3022538, 3020582, 2915994, 2914547,
2916069, 3022648, 3020658, 2916107,
2915290, 2928786, 2914729, 3022854,
3020812, 2914694, 2914659, 3041801,
2920756, 2920834, 2920795, 2916223,
3022788, 3020783, 2916239, 2915013
)
UNION
select
sp.customer_guid
from
schedule_bookings sb2
JOIN schedule_participants sp
on sb2.BOOKING_ID = sp.BOOKING_ID
where
sb2.CANCELLED = 0
and sb2.OFFERING_ID in
( 2915911, 3022748, 3020740, 2915949,
2914398, 2916147, 3022701, 3020699,
2916185, 2915168, 2916711, 3022403,
3020455, 2916785, 2916478, 2915508,
3022538, 3020582, 2915994, 2914547,
2916069, 3022648, 3020658, 2916107,
2915290, 2928786, 2914729, 3022854,
3020812, 2914694, 2914659, 3041801,
2920756, 2920834, 2920795, 2916223,
3022788, 3020783, 2916239, 2915013
)
) JustCustomers
JOIN customers c
on JustCustomers.customer_guid = c.GUID
AND c.archived = 0
AND NOT c.customer_id IN (1008, 283429, 2507795)
JOIN checkins CI
on c.CUSTOMER_ID = CI.CUSTOMER_ID
AND CI.postdate >= date_sub(curdate(), interval 7 day)
group by
date(CI.POSTDATE)
The strange thing I notice though is that you are looking for "No shows", but explicitly looking for those people who DID check in. Now, if you are looking for all people who WERE SUPPOSED to be at a given event, then you are probably looking for where the customer DID NOT check in. If that is the intended case, there would be no check-in date to be associated. If that is the case, I would expect a date in some table such as the EVENT Date... such as going on a cruise, the event is when the cruise is, regardless of who makes it to the ship.
If I am way off, I would suggest you edit your existing post, provide additional detail / clarification.
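If the intent really is to count people who were booked but never checked in, one common pattern is an anti-join: LEFT JOIN the check-ins and keep only the rows where no match was found. The sketch below is only an illustration under stated assumptions: the simplified bookings table with an event_date column is hypothetical (the real schema's event date column is unknown), the sample rows are invented, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables; not the asker's real schema.
CREATE TABLE bookings (customer_id INTEGER, event_date TEXT);
CREATE TABLE checkins (customer_id INTEGER, postdate TEXT);
INSERT INTO bookings VALUES
  (1, '2021-05-03'), (2, '2021-05-03'), (3, '2021-05-03'), (4, '2021-05-03'),
  (1, '2021-05-04'), (5, '2021-05-04');
INSERT INTO checkins VALUES
  (1, '2021-05-03 09:58:00'),
  (5, '2021-05-04 10:02:00');
""")

rows = conn.execute("""
SELECT b.event_date,
       COUNT(DISTINCT b.customer_id) AS no_shows
FROM bookings b
LEFT JOIN checkins ci
       ON ci.customer_id = b.customer_id
      AND DATE(ci.postdate) = b.event_date
WHERE ci.customer_id IS NULL   -- no check-in found for that booking date
GROUP BY b.event_date
ORDER BY b.event_date
""").fetchall()

for event_date, no_shows in rows:
    print(event_date, no_shows)
```

The date comes from the booking side, so a no-show still has a date to group by even though there is no check-in row.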

how to select data from multiple table with variable condition | MySQL

I have two tables in the database: one stores basic client info (name, location, phone number) and the other stores client-related transactions (date_sub, profile_sub, isPaid, date_exp, client_id). I also have an HTML table that shows the client's basic info and, if available, their transactions. My problem is that I can't write a query that selects the client info from internetClient and from internetclientDetails at the same time, because my query only returns a row when the client has transactions in the details table. The fields of the two tables are as follows:
internetClient
--------------------------------------------------------
id full_name location phone_number
-------------------------------------------------------
4 Joe Amine beirut 03776132
5 Mariam zoue beirut 03556133
and
internetclientdetails
--------------------------------------------------------------------------
incdid icid date_sub date_exp isPaid sub_price
----------------------------------------------------------------------------
6 4 2018-01-01 2018-01-30 0 2000
7 5 2017-01-01 2017-01-30 0 1000
8 4 2018-03-01 2018-03-30 1 50000
9 5 2018-05-01 2019-05-30 1 90000
// incdid > internetClientDetailsId
// icid> internetClientId
if client have trans in orderdetails, the query should return value like that:
client_id full_name date_sub date_exp isPaid sub_price
-------------------------------------------------------------------------------------
4 Joe Amine 2018-03-01 2018-03-30 1 50000
5 Mariam zoue 2018-05-01 2019-05-30 1 90000
else if the client has no id in internetOrederDetails
--------------------------------------------------------
icid full_name location phone_number
-------------------------------------------------------
4 Joe Amine beirut 03776132
5 Mariam zoue beirut 0355613
Thanks in advance
Try a left join. It returns all records from internetClient together with any related records from internetclientdetails.
Select internetClient.id, internetClient.full_name
, internetClient.location, internetClient.phone_number
, internetclientdetails.incdid, internetclientdetails.icid
, internetclientdetails.date_sub, internetclientdetails.date_exp
, internetclientdetails.isPaid, internetclientdetails.sub_price
from internetClient
left join internetclientdetails
on internetClient.id=internetclientdetails.icid group by internetclientdetails.icid order by internetclientdetails.incdid desc
If you want the details of paid subscriptions only, you can try the following:
Select internetClient.id, internetClient.full_name
, internetClient.location, internetClient.phone_number
, internetclientdetails.icid, internetclientdetails.incdid
, internetclientdetails.date_sub, internetclientdetails.date_exp
, internetclientdetails.isPaid, internetclientdetails.sub_price
from internetClient
left join internetclientdetails
on internetClient.id=internetclientdetails.icid
and internetclientdetails.isPaid=1 group by internetclientdetails.icid
order by internetclientdetails.incdid desc
SUMMARY
We generate a dataset containing just the ICID and max(date_sub) (alias: mICD). We join this to InternetClientDetails (ICDi) to obtain just the max-date record per client. Then we left join the result (ICD) to InternetClient (IC), ensuring we keep all IC records and only show the related max detail record.
The approach below should work in most MySQL versions. It does not use an analytic (window) function, which could replace the derived table for finding the max date, provided your MySQL version supports it.
FINAL ANSWER:
SELECT IC.id
, IC.full_name
, IC.location
, IC.phone_number
, ICD.icid
, ICD.incdid
, ICD.date_sub
, ICD.date_exp
, ICD.isPaid
, ICD.sub_price
FROM internetClient IC
LEFT JOIN (SELECT ICDi.*
FROM internetclientdetails ICDi
INNER JOIN (SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
ON ICDi.ICID = mICD.ICID
AND ICDi.Date_Sub = mICD.MaxDateSub
) ICD
on IC.id=ICD.icid
ORDER BY ICD.incdid desc
BREAKDOWN / EXPLANATION
The query below gives us the max(date_sub) for each ICID in the client details. We need this so that we can filter out all the records that are not the max date per client ID.
(SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
Using that set we join to the details on the Client_ID's and the max date to eliminate all but the most recent detail for each client. We do this because we need the other detail attributes. This could be done using a join or exists. I prefer the join approach as it seems more explicit to me.
(SELECT ICDi.*
FROM internetclientdetails ICDi
INNER JOIN (SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
ON ICDi.ICID = mICD.ICID
AND ICDi.Date_Sub = mICD.MaxDateSub
) ICD
Finally the full query joins the client to the detail keeping client even if there is no detail using a left join.
COMPONENTS:
You wanted all records from InternetClient (FROM internetClient IC)
You wanted related records from InternetClientDetail (LEFT JOIN InternetClientDetail ICD) while retaining the records from InternetClient.
You ONLY wanted the most current record from InternetClientDetail (INNER JOIN InternetClientDetail mICD as a derived table getting ICID and max(date))
Total record count should equal the total record count in InternetClient, which means every table join must be 1:1 optional (one-to-one, with the detail side optional).
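The final query can be tried end to end with the question's sample rows. The sketch below runs it on Python's sqlite3 as a stand-in for MySQL; the third client (id 6, with no detail rows) is a hypothetical addition to show that the LEFT JOIN keeps clients without transactions, and the ORDER BY is changed to IC.id only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE internetClient (
  id INTEGER PRIMARY KEY, full_name TEXT, location TEXT, phone_number TEXT
);
CREATE TABLE internetclientdetails (
  incdid INTEGER PRIMARY KEY, icid INTEGER,
  date_sub TEXT, date_exp TEXT, isPaid INTEGER, sub_price INTEGER
);
INSERT INTO internetClient VALUES
  (4, 'Joe Amine',   'beirut', '03776132'),
  (5, 'Mariam zoue', 'beirut', '03556133'),
  (6, 'No Details',  'beirut', '00000000');  -- hypothetical client, no detail rows
INSERT INTO internetclientdetails VALUES
  (6, 4, '2018-01-01', '2018-01-30', 0, 2000),
  (7, 5, '2017-01-01', '2017-01-30', 0, 1000),
  (8, 4, '2018-03-01', '2018-03-30', 1, 50000),
  (9, 5, '2018-05-01', '2019-05-30', 1, 90000);
""")

rows = conn.execute("""
SELECT IC.id, IC.full_name, ICD.incdid, ICD.date_sub, ICD.isPaid, ICD.sub_price
FROM internetClient IC
LEFT JOIN (
    SELECT ICDi.*
    FROM internetclientdetails ICDi
    INNER JOIN (
        -- most recent date_sub per client
        SELECT MAX(date_sub) AS MaxDateSub, icid
        FROM internetclientdetails
        GROUP BY icid
    ) mICD ON ICDi.icid = mICD.icid
          AND ICDi.date_sub = mICD.MaxDateSub
) ICD ON IC.id = ICD.icid
ORDER BY IC.id
""").fetchall()

for row in rows:
    print(row)
```

Each client appears exactly once: clients 4 and 5 with their latest detail row, and the client without details with NULLs in the detail columns.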

Prior row depends on value of current row

The following query shows login information based on the userId. If a user closes their browser without logging off or the session expires, the logoff_date will remain null as in the example below.
userId logon_date logoff_date
1 2012-01-01 10:00:00 2012-01-01 12:00:00
1 2012-01-01 09:00:00 NULL
Because there is a newer logon_date of 2012-01-01 10:00:00, I know that the user must have killed the session for the logon_date of 2012-01-01 09:00:00.
Here is my query:
SELECT userId, logon_date, logoff_date
FROM user_logon
WHERE user_id = 2
What I would like is to count only active sessions. In order to do this, I need to skip the rows where the logoff_date is missing if there is a newer row with the same userId.
How about this:
SELECT l.*
FROM user_logon l
INNER JOIN (SELECT MAX(logon_date) mdate,
user_id
FROM user_logon
GROUP BY user_id) x ON x.user_id = l.user_id
AND l.logon_date = x.mdate
WHERE l.logoff_date IS NULL
?
PS: This query assumes that each user has at most one record for any given logon_date.
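A minimal sketch of that query on sample data, using Python's sqlite3 as a stand-in for MySQL. User 1's rows come from the question; user 2 is a hypothetical addition with a single open session, so the result is not empty. The column is named user_id here, following the answer's query.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_logon (user_id INTEGER, logon_date TEXT, logoff_date TEXT);
INSERT INTO user_logon VALUES
  (1, '2012-01-01 10:00:00', '2012-01-01 12:00:00'),
  (1, '2012-01-01 09:00:00', NULL),   -- abandoned session, superseded by 10:00
  (2, '2012-01-02 08:00:00', NULL);   -- hypothetical user with an open session
""")

rows = conn.execute("""
SELECT l.user_id, l.logon_date, l.logoff_date
FROM user_logon l
INNER JOIN (
    -- newest logon per user
    SELECT MAX(logon_date) AS mdate, user_id
    FROM user_logon
    GROUP BY user_id
) x ON x.user_id = l.user_id
   AND l.logon_date = x.mdate
WHERE l.logoff_date IS NULL   -- newest logon has no logoff: still active
""").fetchall()

print(rows)
```

User 1's 09:00 row is skipped because a newer logon exists, and the newer one has a logoff, so only user 2 counts as active.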