SQL select with dynamic count of "BETWEEN" conditions based on joined table - mysql

I want to add a messenger to my pet project, but I am having difficulty writing the database queries. I use MySQL for this service, with Hibernate as the ORM. Almost all queries were written in HQL, but in principle I can use native queries.
The messenger can contain group conversations. In addition to writing messages, a user can enter a conversation, leave it, and clear their personal message history. A user sees all messages sent while they were in the conversation, but they can also clear the history and then see only messages sent after the last clearing.
Below is the simplified structure of the two tables important for this task.
Message table:
ID | text       | timestamp
1  | first_msg  | 1609459200
2  | second_msg | 1609545600
Member_event table:
id | user_id | type | timestamp
1  | 1       | 1    | 1609459100
2  | 1       | 3    | 1609459300
3  | 1       | 2    | 1609459400
4  | 1       | 1    | 1609545500
where type is:
1 - user entered the chat,
2 - user left the chat,
3 - user cleared their own message history in the chat.
Is it possible to read all chat messages available to the user with one query?
I have no idea how to check the conditions dynamically: the message's timestamp must fall within any "entered-left" cycle, or after the last "entered" event if the user has not left, BUT only after the last history clearing, if one exists.

I think you could proceed with these steps:
Take the union of both tables and consider the records in timestamp order.
Use window functions to determine whether the most recent type 1 or type 2 event was a 1. A running sum works here: type 1 adds one, type 2 subtracts one, and type 3 leaves it unchanged. With another window function you can determine whether a type 3 event still follows. Multiplying the two results gives 1 when the row belongs to an interval that must be collected, and 0 otherwise.
Filter the previous result to keep only the message records, and only those where the calculation gave 1.
Here is the query:
with unified as (
    select id, text, timestamp, null as type
    from message
    union
    select id, null, timestamp, type
    from member_event
    where user_id = 1
),
validated as (
    select unified.*,
           sum(case type when 1 then 1 when 2 then -1 else 0 end)
               over (order by timestamp
                     rows unbounded preceding) *
           min(case type when 3 then 0 else 1 end)
               over (order by timestamp
                     rows between current row and unbounded following) as valid
    from unified
    order by timestamp
)
select id, text, timestamp
from validated
where type is null and valid = 1
order by timestamp
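To check the window-function approach against the sample data from the question, here is a minimal sketch that runs the same idea in an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) stands in for MySQL 8 here, which is an assumption made purely for the demo; the table and column names come from the question, and the user id is passed as a parameter.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8 (assumption: SQLite >= 3.25 for window functions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (id INTEGER, text TEXT, timestamp INTEGER);
    CREATE TABLE member_event (id INTEGER, user_id INTEGER, type INTEGER, timestamp INTEGER);
    INSERT INTO message VALUES
        (1, 'first_msg', 1609459200),
        (2, 'second_msg', 1609545600);
    INSERT INTO member_event VALUES
        (1, 1, 1, 1609459100),  -- entered
        (2, 1, 3, 1609459300),  -- cleared history
        (3, 1, 2, 1609459400),  -- left
        (4, 1, 1, 1609545500);  -- entered again
""")

QUERY = """
WITH unified AS (
    SELECT id, text, timestamp, NULL AS type FROM message
    UNION ALL
    SELECT id, NULL, timestamp, type FROM member_event WHERE user_id = ?
),
validated AS (
    SELECT unified.*,
           -- running membership counter: 1 while the user is in the chat
           SUM(CASE type WHEN 1 THEN 1 WHEN 2 THEN -1 ELSE 0 END)
               OVER (ORDER BY timestamp
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           -- becomes 0 if any history clearing still follows this row
         * MIN(CASE type WHEN 3 THEN 0 ELSE 1 END)
               OVER (ORDER BY timestamp
                     ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS valid
    FROM unified
)
SELECT id, text, timestamp
FROM validated
WHERE type IS NULL AND valid = 1
ORDER BY timestamp
"""

visible = conn.execute(QUERY, (1,)).fetchall()
print(visible)  # [(2, 'second_msg', 1609545600)]
```

Only second_msg comes back: first_msg was sent while the user was in the chat, but the history clearing at 1609459300 follows it.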

I do not see how you could match the Member_event table to the Message table without an additional foreign key. Are you trying to determine which messages are available to the user via the timestamp?
If so, try this:
SELECT * FROM MESSAGE_TABLE m
WHERE m.TIMESTAMP BETWEEN
    (SELECT TIMESTAMP FROM MEMBER_EVENT_TABLE WHERE type = 1 ORDER BY TIMESTAMP DESC LIMIT 1)
    AND (SELECT TIMESTAMP FROM MEMBER_EVENT_TABLE WHERE type != 1 ORDER BY TIMESTAMP DESC LIMIT 1)
This should at least show the latest messages between the last join and the last clear/leave. Note that BETWEEN expects the lower bound first, so it returns rows only when that last clear/leave happened after the last join.

Related

SELECT last row grouped by account and assigned number

I have a table "Log"
My game server inserts a record into this table when someone logs in to the server, then inserts a second record when they log out.
What I want to do is create a query to count the number of people logged in.
The main data that gets inserted into the "Log" table:
When they Login:
[Type] = 0
[Player1] = Their account ID
[Value2] = a random number which matches the logout row when they logout
[Value3] = 0
When they Logout:
[Type] = 1
[Player1] = Their account ID
[Value2] = a random number which matches the login row when they logout
[Value3] = some random number
Is there a way to look at the last row for each "Player1" and check whether "Type" = 0, which means that account is logged in, and then echo the result?
The result I'm looking for would pull the last record of every account and count them.
Note: every time an account logs in and out, it inserts 2 records, so if 1 account logs in 20 times there would be 40 records in "Log".
One way to do it is to count all rows with type 0 for which there doesn't exist any type 1 row with the same player and a later date:
select count(*) as number_of_logged_in
from log l
where Type = 0 -- 0 meaning log on event
-- and [Value3] = 0 -- maybe this should be included
and not exists (
select 1 from log
where Player1 = l.Player1
and type = 1 -- 1 meaning log out event
and date > l.date
-- and [Value2] = l.[Value2] -- maybe this should be included
);
I found your problem statement a bit confusing: you say you want to count the number of people that are logged in, but then you say you want to count the last row of each [Player1] where [Type] is 1, which seems to be the opposite. It's also not clear to me why the random number would be important: if the last recorded type for a user is 0, then they should be considered logged in, shouldn't they?
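As a small check of the NOT EXISTS approach, the sketch below runs it against a few hypothetical rows in an in-memory SQLite database. The integer date column and the player ids are made up for illustration; only the Type and Player1 semantics come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (type INTEGER, player1 INTEGER, date INTEGER);
    -- hypothetical data:
    -- player 10: in, out, in  -> still logged in
    -- player 20: in, out      -> logged out
    -- player 30: in           -> still logged in
    INSERT INTO log VALUES
        (0, 10, 1), (1, 10, 2), (0, 10, 3),
        (0, 20, 1), (1, 20, 4),
        (0, 30, 5);
""")

# Count login rows that have no later logout row for the same player.
logged_in = conn.execute("""
    SELECT COUNT(*)
    FROM log AS l
    WHERE l.type = 0
      AND NOT EXISTS (
          SELECT 1
          FROM log AS o
          WHERE o.player1 = l.player1
            AND o.type = 1
            AND o.date > l.date
      )
""").fetchone()[0]
print(logged_in)  # 2
```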
I am assuming you want a list of all the logged-in players, so you can try using ROW_NUMBER() to get what you want:
;WITH CTE AS (
    SELECT
        Player1 AS LoggedInPlayer,
        [Type],
        -- number each player's rows from newest to oldest, so row 1 is the latest event
        ROW_NUMBER() OVER (PARTITION BY Player1 ORDER BY datecolumn DESC) AS LoggedValue
    FROM
        yourtable
)
SELECT
    *
FROM
    CTE
WHERE
    LoggedValue = 1
    AND [Type] = 0 -- the latest event is a login
If you know that all logins and logouts are stored in Log without gaps, you can simply count them; if a player has more logins than logouts, that player is currently logged in.
SELECT logins.player1, logins.cnt - COALESCE(logouts.cnt, 0) AS open_logins
FROM
    (select player1, count(*) as cnt from Log where type = 0 group by player1) as logins
LEFT OUTER JOIN
    (select player1, count(*) as cnt from Log where type = 1 group by player1) as logouts
ON (logins.player1 = logouts.player1)
WHERE logins.cnt > logouts.cnt or logouts.player1 is null
You need the left outer join in case a player logged in once and never logged out. Sorry if you encounter syntax issues: I wrote this without testing, and I usually work on a Teradata system with its SQL dialect. But as the SQL given here is plain ANSI, it should work on any database.
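The counting idea can be tried out on a handful of hypothetical rows. The sketch below uses an in-memory SQLite database and computes the difference as logins minus logouts, treating a missing logout count as 0; the date column and player ids are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (type INTEGER, player1 INTEGER, date INTEGER);
    -- hypothetical data: players 10 and 30 have one login without a logout
    INSERT INTO log VALUES
        (0, 10, 1), (1, 10, 2), (0, 10, 3),
        (0, 20, 1), (1, 20, 4),
        (0, 30, 5);
""")

still_logged_in = conn.execute("""
    SELECT logins.player1,
           logins.cnt - COALESCE(logouts.cnt, 0) AS open_logins
    FROM (SELECT player1, COUNT(*) AS cnt FROM log WHERE type = 0 GROUP BY player1) AS logins
    LEFT OUTER JOIN
         (SELECT player1, COUNT(*) AS cnt FROM log WHERE type = 1 GROUP BY player1) AS logouts
      ON logins.player1 = logouts.player1
    WHERE logouts.player1 IS NULL OR logins.cnt > logouts.cnt
    ORDER BY logins.player1
""").fetchall()
print(still_logged_in)  # [(10, 1), (30, 1)]
```

Player 30 never logged out, so it only appears because of the left outer join.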

Count number of action and show last date of action

I am querying an audit database to try and find out how many actions each user has completed and when their last action was.
The query I am using is :
SELECT user_id,
count(id) as actions,
datetime
from auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id
ORDER BY `auditing`.`datetime` DESC
This correctly shows me the total number of items, but it does not show the correct last date: the date it shows is quite random, i.e. not from the top or bottom of the list but from somewhere in the middle. I checked this for a number of the entries produced, and they are all wrong and do not reflect the latest action.
How can I get it to show me the last (most recent) event in the above query?
Example:
user_id | actions | datetime
1 | 10 | 2014-07-04 16:10:14
2 | 55 | 2014-07-05 11:15:08
3 | 8 | 2014-07-04 22:19:43
Thanks
You should only SELECT columns that are part of your GROUP BY clause or are a result of an aggregate function. You can and probably should configure your database server so that it would complain about your query. It would say something like:
ERROR 1055 (42000): 'datetime' isn't in GROUP BY
The reason is that you don't tell the database server which datetime value you want (the earliest, the average, the latest?). So in order to get the last value, try this query:
SELECT user_id, count(id) as actions, max(datetime)
FROM auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id
ORDER BY user_id
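To see the aggregate version in action, here is a minimal sketch on hypothetical audit rows in an in-memory SQLite database. It assumes the datetime values are stored as ISO-formatted text, so that MAX() and the >= comparison sort them chronologically; the last_action alias and the sample rows are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auditing (id INTEGER PRIMARY KEY, user_id INTEGER, datetime TEXT);
    -- hypothetical rows; one per user falls before the 2014-03-01 cutoff
    INSERT INTO auditing (user_id, datetime) VALUES
        (1, '2014-07-01 09:00:00'),
        (1, '2014-07-04 16:10:14'),
        (1, '2014-02-01 08:00:00'),
        (2, '2014-07-05 11:15:08'),
        (2, '2014-02-15 10:00:00');
""")

# MAX(datetime) states explicitly which value of the group is wanted.
rows = conn.execute("""
    SELECT user_id, COUNT(id) AS actions, MAX(datetime) AS last_action
    FROM auditing
    WHERE datetime >= '2014-03-01 00:00:00'
    GROUP BY user_id
    ORDER BY last_action DESC
""").fetchall()
print(rows)  # [(2, 1, '2014-07-05 11:15:08'), (1, 2, '2014-07-04 16:10:14')]
```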
You can try this:
SELECT user_id, COUNT(id) AS actions, MAX(datetime)
FROM auditing
WHERE datetime>='2014-03-01 00:00:00'
GROUP BY user_id

Multiple distinct counts with where

I am having an issue creating the most efficient query for multiple distinct counts of a column with different where clauses. My MySQL table looks like this:
id client_id result timestamp
---------------------------------------------------
1 1234566 escalated 2014-01-02 00:00:00
2 1233344 approved 2014-02-03 00:00:00
3 1234566 escalated 2014-01-02 01:00:00
What I am trying to achieve is to build the following data in the return:
Total number of unique client IDs processed from the beginning of time.
Total number of unique client IDs processed escalated from the beginning of time.
Total number of unique client IDs processed approved from the beginning of time.
Count of unique client IDs approved within specified timeframe using between statement on timestamp.
Count of unique client IDs escalated within specified timeframe using between statement on timestamp.
I have thought about running multiple selects, but I think that would be a waste of resources; if this could be done with a single query, that would be the best way to handle it. Unfortunately my experience is lacking in this area. Ideally the result would simply contain an alias and the count for each item.
Any help would be appreciated.
You want conditional aggregation, something like:
select count(distinct client_id) as NumClients,
       count(distinct case when result = 'approved' then client_id end) as NumApproved,
       count(distinct case when result = 'escalated' then client_id end) as NumEscalated,
       count(distinct case when result = 'approved' and timestamp between @Time1 and @Time2
                           then client_id end) as NumApprovedInRange,
       count(distinct case when result = 'escalated' and timestamp between @Time1 and @Time2
                           then client_id end) as NumEscalatedInRange
from `table` t;
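Here is a minimal sketch of conditional aggregation on the three sample rows from the question, run in an in-memory SQLite database. The table name results, the January 2014 time frame, and the :t1/:t2 parameter names are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (id INTEGER, client_id INTEGER, result TEXT, timestamp TEXT);
    INSERT INTO results VALUES
        (1, 1234566, 'escalated', '2014-01-02 00:00:00'),
        (2, 1233344, 'approved',  '2014-02-03 00:00:00'),
        (3, 1234566, 'escalated', '2014-01-02 01:00:00');
""")

# CASE yields NULL for non-matching rows, and COUNT(DISTINCT ...) ignores NULLs.
counts = conn.execute("""
    SELECT COUNT(DISTINCT client_id),
           COUNT(DISTINCT CASE WHEN result = 'approved' THEN client_id END),
           COUNT(DISTINCT CASE WHEN result = 'escalated' THEN client_id END),
           COUNT(DISTINCT CASE WHEN result = 'approved'
                                AND timestamp BETWEEN :t1 AND :t2 THEN client_id END),
           COUNT(DISTINCT CASE WHEN result = 'escalated'
                                AND timestamp BETWEEN :t1 AND :t2 THEN client_id END)
    FROM results
""", {"t1": "2014-01-01 00:00:00", "t2": "2014-01-31 23:59:59"}).fetchone()
print(counts)  # (2, 1, 1, 0, 1)
```

The two escalated rows belong to the same client, so they count once; the single approval falls in February and is therefore outside the assumed time frame.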

Assistance with complex MySQL query (using LIMIT ?)

I wonder if anyone could help with a MySQL query I am trying to write to return relevant results.
I have a big table of change log data, and I want to retrieve a number of record 'groups'. For example, in this case a group would be where two or more records are entered with the same timestamp.
Here is a sample table.
==============================================
ID DATA TIMESTAMP
==============================================
1 Some text 1379000000
2 Something 1379011111
3 More data 1379011111
3 Interesting data 1379022222
3 Fascinating text 1379033333
If I wanted the first two grouped sets, I could use LIMIT 0,2 but this would miss the third record. The ideal query would return three rows (as two rows have the same timestamp).
==============================================
ID DATA TIMESTAMP
==============================================
1 Some text 1379000000
2 Something 1379011111
3 More data 1379011111
Currently I've been using PHP to process the entire table, which mostly works, but for a table of 1000+ records, this is not very efficient on memory usage!
Many thanks in advance for any help you can give...
Get the timestamps for the filtering using a join. For instance, the following would make sure that the second timestamp is in a completed group:
select t.*
from t join
(select timestamp
from t
order by timestamp
limit 2
) tt
on t.timestamp = tt.timestamp;
The following would get the first three groups, no matter what their size:
select t.*
from t join
(select distinct timestamp
from t
order by timestamp
limit 3
) tt
on t.timestamp = tt.timestamp;
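The DISTINCT variant can be checked against the sample table from the question. The sketch below uses an in-memory SQLite database and asks for the first two timestamp groups; the IDs are copied as they appear in the question, and passing the group count as a parameter is an assumption of the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, data TEXT, timestamp INTEGER);
    INSERT INTO t VALUES
        (1, 'Some text',        1379000000),
        (2, 'Something',        1379011111),
        (3, 'More data',        1379011111),
        (3, 'Interesting data', 1379022222),
        (3, 'Fascinating text', 1379033333);
""")

# Pick the first N distinct timestamps, then join back to get every row in those groups.
first_groups = conn.execute("""
    SELECT t.id, t.data, t.timestamp
    FROM t
    JOIN (SELECT DISTINCT timestamp
          FROM t
          ORDER BY timestamp
          LIMIT ?) AS tt
      ON t.timestamp = tt.timestamp
    ORDER BY t.timestamp, t.id
""", (2,)).fetchall()
print(first_groups)
```

With a limit of 2 groups this returns three rows, because the second timestamp is shared by two records.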

How can I find the correct prior status row in this table with a SQL query?

Imagine a workflow for data entry. Some forms come in, they are typed into a system, reviewed, and hopefully approved. However, they can be rejected by a manager and will have to be entered again.
So, an ideal workflow would go like this:
received > entered > approved
But this COULD happen:
received > entered > rejected > entered > rejected > approved
At each stage, we record who updated the form to its current status - who entered it, who rejected it, or who approved it. So the forms status table looks like this:
form_id status updated_by updated_at
1 received Bob (timestamp)
1 entered Bob (timestamp)
1 approved Susan (timestamp)
2 received Bob (timestamp)
2 entered Bob (timestamp)
2 rejected Susan (timestamp)
2 entered Carla (timestamp)
2 rejected Susan (timestamp)
2 entered Sam (timestamp)
2 approved Susan (timestamp)
Here's what I'm trying to do: write a rejection report. I want a row for each rejection, and joined to that row, I want to see who did the work that got rejected.
As a human, I can see that, for a given status row with status 'rejected', the row that will tell me who did the faulty work will be the one that
shares the same form_id and
has a prior timestamp closest to the rejection.
But I'm having trouble telling MySQL that.
Can anybody see how to construct this query?
A subselect ended up working for me.
SELECT
`s1`.`form_id`,
(
SELECT
`s2`.`updated_by`
FROM
statuses s2
WHERE
`s2`.`form_id` = `s1`.`form_id`
AND
`s2`.`updated_at` < `s1`.`updated_at`
ORDER BY
`s2`.`updated_at` DESC
LIMIT 1
) AS 'made_rejected_change'
FROM
statuses s1
WHERE
`s1`.`status` = 'rejected'
Another solution that uses subselect (this time not a correlated subquery):
SELECT
    w1.*,
    w2.updated_by AS entered_by
FROM (
    SELECT
        wr.form_id,
        wr.updated_at AS rejected_at,
        wr.updated_by AS rejected_by,
        MAX(we.updated_at) AS entered_at
    FROM workflow wr
    INNER JOIN workflow we ON we.status = 'entered'
        AND wr.form_id = we.form_id
        AND wr.updated_at > we.updated_at
    WHERE wr.status = 'rejected'
    GROUP BY
        wr.form_id,
        wr.updated_at,
        wr.updated_by
) w1
INNER JOIN workflow w2 ON w1.form_id = w2.form_id
    AND w1.entered_at = w2.updated_at
The subselect lists all the rejecters and the immediately preceding entered timestamps. Then the table is joined once again to extract the names corresponding to the entered_at timestamps.
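A minimal sketch of this grouped-join approach on the sample rows from the question, run in an in-memory SQLite database. The question only shows "(timestamp)" placeholders, so the small integers used for updated_at are stand-ins invented for the demo, and the workflow table name follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workflow (form_id INTEGER, status TEXT, updated_by TEXT, updated_at INTEGER);
    -- updated_at values are made-up stand-ins for the real timestamps
    INSERT INTO workflow VALUES
        (1, 'received', 'Bob',   1), (1, 'entered',  'Bob',   2), (1, 'approved', 'Susan', 3),
        (2, 'received', 'Bob',   1), (2, 'entered',  'Bob',   2), (2, 'rejected', 'Susan', 3),
        (2, 'entered',  'Carla', 4), (2, 'rejected', 'Susan', 5),
        (2, 'entered',  'Sam',   6), (2, 'approved', 'Susan', 7);
""")

report = conn.execute("""
    SELECT w1.form_id, w1.rejected_by, w1.rejected_at, w2.updated_by AS entered_by
    FROM (
        -- for each rejection, the latest 'entered' timestamp before it
        SELECT wr.form_id,
               wr.updated_at AS rejected_at,
               wr.updated_by AS rejected_by,
               MAX(we.updated_at) AS entered_at
        FROM workflow AS wr
        JOIN workflow AS we
          ON we.status = 'entered'
         AND we.form_id = wr.form_id
         AND we.updated_at < wr.updated_at
        WHERE wr.status = 'rejected'
        GROUP BY wr.form_id, wr.updated_at, wr.updated_by
    ) AS w1
    -- join back to find who made that entry
    JOIN workflow AS w2
      ON w2.form_id = w1.form_id
     AND w2.updated_at = w1.entered_at
    ORDER BY w1.rejected_at
""").fetchall()
print(report)  # [(2, 'Susan', 3, 'Bob'), (2, 'Susan', 5, 'Carla')]
```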
You want to get the rejected timestamp and then find the entry that appeared right before it based on the timestamp. I'm assuming that timestamp actually holds a date/time and isn't a SQL Server timestamp field (which is something completely different).
declare @rejectedTimestamp datetime
select @rejectedTimestamp = timestamp
from table
where status = 'rejected'
select top 1 *
from table
where timestamp < @rejectedTimestamp
order by timestamp desc