SELECT last row grouped by account and assigned number - sql-server-2008

I have a table "Log".
My game server inserts a record into this table when someone logs in to the server, then inserts a second record when they log out.
What I want to do is create a query to count the number of people currently logged in.
The main data that gets inserted into the table "Log":
When they log in:
[Type] = 0
[Player1] = Their account ID
[Value2] = a random number which matches the one in the corresponding logout row
[Value3] = 0
When they log out:
[Type] = 1
[Player1] = Their account ID
[Value2] = the same random number as in the matching login row
[Value3] = some random number
Is there a way I can take the last row for each "Player1" account and check whether "Type" = 0, which means that account is logged in, then echo the result?
The result I'm looking for would pull the last record of every account and count them.
Note: every time an account logs in and out, 2 records are inserted, so if 1 account logs in 20 times there would be 40 records in "Log".

One way to do it is to count all rows with type 0 for which there doesn't exist any type 1 row with the same player and a later date:
select count(*) as number_of_logged_in
from log l
where Type = 0 -- 0 meaning log on event
-- and [Value3] = 0 -- maybe this should be included
and not exists (
select 1 from log
where Player1 = l.Player1
and type = 1 -- 1 meaning log out event
and date > l.date
-- and [Value2] = l.[Value2] -- maybe this should be included
);
I found your problem statement a bit confusing: you say you want to count the number of people that are logged in, but then you say you want to count the last row of each [Player1] where [Type] is 1, which seems to be the opposite. It's also not clear to me why the random number would be important: if the last recorded type for a user is 0, then they should be considered logged in, right?
Sample SQL Fiddle with some demo data
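To try the NOT EXISTS idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The timestamp column name (logged_at) and the sample rows are assumptions made for illustration; the question never shows the real column name or data.

```python
import sqlite3

# Hypothetical schema: "logged_at" stands in for the unnamed date column,
# and the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (player1 INTEGER, type INTEGER, value2 INTEGER, logged_at INTEGER)"
)
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        (1, 0, 111, 10),  # player 1 logs in
        (1, 1, 111, 20),  # player 1 logs out
        (1, 0, 222, 30),  # player 1 logs in again and stays online
        (2, 0, 333, 15),  # player 2 logs in
        (2, 1, 333, 25),  # player 2 logs out
        (3, 0, 444, 40),  # player 3 logs in and stays online
    ],
)

# Count login rows that have no later logout row for the same player.
logged_in = conn.execute(
    """
    SELECT COUNT(*)
    FROM log AS l
    WHERE l.type = 0
      AND NOT EXISTS (
          SELECT 1
          FROM log AS o
          WHERE o.player1 = l.player1
            AND o.type = 1
            AND o.logged_at > l.logged_at
      )
    """
).fetchone()[0]

print(logged_in)
```

With this sample data, players 1 and 3 each have a login that is not followed by a logout, so they are the ones counted.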

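I am assuming you want a list of all the logged-in players, so you can try using ROW_NUMBER() to pick each player's most recent row and keep it only if it is a login:
;WITH CTE AS (
    SELECT
        Player1 AS LoggedInPlayer,
        [Type] AS LastType,
        -- number rows newest-first, so 1 marks each player's latest event
        ROW_NUMBER() OVER (PARTITION BY Player1 ORDER BY datecolumn DESC) AS LoggedValue
    FROM
        yourtable
)
SELECT
    *
FROM
    CTE
WHERE
    LoggedValue = 1
    AND LastType = 0 -- the latest event is a login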

If you know that all logins and logouts are stored in Log without gaps, you can simply count them; if there is a difference, you know that the player is currently logged in.
SELECT logins.player1, logins.cnt - COALESCE(logouts.cnt, 0) AS open_sessions
FROM
(select player1, count(*) as cnt from Log where type = 0 group by player1) as logins
LEFT OUTER JOIN
(select player1, count(*) as cnt from Log where type = 1 group by player1) as logouts
ON (logins.player1 = logouts.player1)
WHERE logins.cnt > logouts.cnt or logouts.player1 is null
You need the left outer join in case a player logged in once and never logged out. Sorry if you encounter syntax issues; I wrote this without testing, and I usually work on a Teradata system with its SQL dialect. But as the SQL given here is plain ANSI, it should work on any database.
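The counting approach can be checked quickly with Python's built-in sqlite3 module. This is only a sketch on invented sample rows; it subtracts logouts from logins and uses COALESCE so that a player with no logout rows still gets a number rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (player1 INTEGER, type INTEGER)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?)",
    [
        (1, 0), (1, 1), (1, 0),  # player 1: two logins, one logout -> online
        (2, 0), (2, 1),          # player 2: balanced -> offline
        (3, 0),                  # player 3: never logged out -> online
    ],
)

# Logins minus logouts per player; the LEFT JOIN keeps players without logouts.
rows = conn.execute(
    """
    SELECT logins.player1,
           logins.cnt - COALESCE(logouts.cnt, 0) AS open_sessions
    FROM (SELECT player1, COUNT(*) AS cnt FROM log WHERE type = 0 GROUP BY player1) AS logins
    LEFT OUTER JOIN
         (SELECT player1, COUNT(*) AS cnt FROM log WHERE type = 1 GROUP BY player1) AS logouts
      ON logins.player1 = logouts.player1
    WHERE logins.cnt > COALESCE(logouts.cnt, 0)
    ORDER BY logins.player1
    """
).fetchall()

print(rows)
```

Note that this relies on the "no gaps" assumption stated above: a single missing logout row would leave a player counted as online indefinitely.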

Related

SQL select with dynamic count of "BETWEEN" conditions based on joined table

I want to add a messenger to my pet project, but I am having difficulty writing database queries. I use MySQL for this service with Hibernate as ORM. Almost all queries were written in HQL, but in principle I can use native queries.
The messenger can contain group conversations. In addition to writing messages, a user can enter a conversation, leave it, and clear his personal message history. A user sees all messages sent while he was in a conversation, but he can also clear the history and then see only messages after the last clearing.
Below I described the simplified structure of two tables important for this task.
Message table:
ID text timestamp
1 first_msg 1609459200
2 second_msg 1609545600
Member_event table:
id user_id type timestamp
1 1 1 1609459100
2 1 3 1609459300
3 1 2 1609459400
4 1 1 1609545500
where type is:
1 - user entered the chat,
2 - user left the chat,
3 - user cleared his own history of messages in the chat
Is it possible to read all chat messages available to the user with one request?
I have no idea how to check the conditions dynamically: WHERE the message's timestamp falls within any "entered-left" cycle, or after the last "entered" if the user has not left, BUT only after the last history clearing, if one exists.
I think you could proceed with these steps:
Take the union of both tables and consider the records in order of timestamp.
Use window functions to determine whether the most recent type 1 or 2 event was a 1. We can use a running sum where type 1 adds one and type 2 subtracts one (and type 3 does nothing to it). With another window function you can determine whether a type 3 event still follows. Combining these two pieces of information gives a 1 when the row belongs to an interval that must be collected, and a 0 when it does not.
Filter the previous result to just get the message records, and only those where the calculation was 1.
Here is the query:
with unified as (
select id, text, timestamp, null as type
from message
union
select id, null, timestamp, type
from member_event
where user_id = 1),
validated as (
select unified.*,
sum(case type when 1 then 1 when 2 then -1 else 0 end)
over (order by timestamp
rows unbounded preceding) *
min(case type when 3 then 0 else 1 end)
over (order by timestamp
rows between current row and unbounded following) valid
from unified
order by timestamp)
select id, text, timestamp
from validated
where type is null and valid = 1
order by timestamp
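The same logic can be traced outside the database. The following is a plain-Python sketch (not the SQL itself) that walks the merged timeline with a running enter/leave counter and drops everything up to the last clearing. It uses the sample rows from the question; the function name visible_messages is made up for illustration, and ties between equal timestamps are not handled.

```python
# Sample rows from the question, restricted to user_id = 1.
messages = [(1, "first_msg", 1609459200), (2, "second_msg", 1609545600)]
# (id, type, timestamp): 1 = entered, 2 = left, 3 = cleared history
member_events = [
    (1, 1, 1609459100),
    (2, 3, 1609459300),
    (3, 2, 1609459400),
    (4, 1, 1609545500),
]


def visible_messages(messages, member_events):
    # Merge both tables into one timeline; message rows carry type None.
    timeline = [(ts, None, (mid, text, ts)) for mid, text, ts in messages]
    timeline += [(ts, etype, None) for _id, etype, ts in member_events]
    timeline.sort(key=lambda row: row[0])

    # Timestamp of the last clearing, or None if the user never cleared.
    last_clear = max((ts for ts, etype, _ in timeline if etype == 3), default=None)

    inside = 0  # running sum: +1 on enter, -1 on leave
    result = []
    for ts, etype, msg in timeline:
        if etype == 1:
            inside += 1
        elif etype == 2:
            inside -= 1
        elif etype is None:
            # Keep a message only while the user is in the chat and
            # no clearing happens at or after it.
            if inside == 1 and (last_clear is None or ts > last_clear):
                result.append(msg)
    return result


print(visible_messages(messages, member_events))
```

For the sample data only second_msg survives: first_msg was sent while the user was in the chat, but the history was cleared afterwards.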
I do not see how you could match the Member_event table to the Message table without an additional foreign key. Are you trying to determine the messages available to the user via timestamp?
If so, try this:
SELECT * FROM MESSAGE_TABLE m
WHERE m.TIMESTAMP BETWEEN
(SELECT TIMESTAMP FROM MEMBER_EVENT_TABLE WHERE type = 1 ORDER BY TIMESTAMP DESC LIMIT 1)
AND (SELECT TIMESTAMP FROM MEMBER_EVENT_TABLE WHERE type != 1 ORDER BY TIMESTAMP DESC LIMIT 1)
This should at least show the last messages between join and clear/leave. (MySQL has no TOP, so the subqueries use ORDER BY ... LIMIT 1.)

I need help to figure out syntax for a view in an existing PostgreSQL database that extracts (unnests) content in a JSON array field

I am working on a JSON array field named session_durations in an existing PostgreSQL 11.8 database view. Each array element gives the sessionId and the duration (the amount of time a program user spent in that session). There are 12 possible sessions; a "session" here refers to an online lesson in an eHealth treatment program.
This JSON field (session_durations) is populated as the user accesses the session. If a user never accesses a session, then no data appears in the JSON field for that session (see my examples), so some sessions can be skipped over entirely.
I'd like to use SQL code to unpack this field in order to separate its components. Here are 2 example records:
Record 1: [{"sessionId":"7","duration":1886400},{"sessionId":"8","duration":1710000},{"sessionId":"9","duration":706800}]
Record 2: [{"sessionId":"1","duration":879600},{"sessionId":"2","duration":975600},{"sessionId":"3","duration":9600}]
I'd like to use my View to save duration data (e.g., "duration":879600) from each possible session into 12 new columns for each user session (e.g., "sessionId":"1") named the following:
• S1_duration
• S2_duration
• S3_duration
• S4_duration
• S5_duration
• S6_duration...
• S12_duration
All help would be greatly appreciated!!
Table:
CREATE TABLE users (
id int4 PRIMARY KEY,
session_durations json
);
----some rows of data:
13 [{"sessionId":"1","duration":12699},{"sessionId":"7","duration":1423041},{"sessionId":"8","duration":7598502},{"sessionId":"10","duration":1531229}]
14 [{"sessionId":"1","duration":55812},{"sessionId":"7","duration":2905}]
161 [{"sessionId":"7","duration":1125600},{"sessionId":"8","duration":460800}]
12 [{"sessionId":"1","duration":1520988},{"sessionId":"2","duration":94565},{"sessionId":"6","duration":35468}]
Your solutions worked perfectly! I chose to use the second solution (grouping syntax) for my project. Thanks for your patience -- and the online demo examples!
This should do:
WITH usd AS (
SELECT
us.id AS user_id,
(sd->>'sessionId')::int AS session_id,
(sd->>'duration')::int AS duration
FROM users us,
LATERAL json_array_elements(us.session_durations) AS sd
)
SELECT
users.id AS user_id,
(SELECT duration FROM usd WHERE user_id = users.id AND session_id = 1) AS "S1_duration",
(SELECT duration FROM usd WHERE user_id = users.id AND session_id = 2) AS "S2_duration",
(SELECT duration FROM usd WHERE user_id = users.id AND session_id = 3) AS "S3_duration",
…
(SELECT duration FROM usd WHERE user_id = users.id AND session_id = 12) AS "S12_duration"
FROM users;
(online demo)
Alternatively, using grouping and some filtered aggregate (dealing better with potential duplicates):
SELECT
user_id,
MAX(duration) FILTER (WHERE session_id = 1) AS "S1_duration",
MAX(duration) FILTER (WHERE session_id = 2) AS "S2_duration",
MAX(duration) FILTER (WHERE session_id = 3) AS "S3_duration",
…
MAX(duration) FILTER (WHERE session_id = 12) AS "S12_duration"
FROM (
SELECT
us.id AS user_id,
(sd->>'sessionId')::int AS session_id,
(sd->>'duration')::int AS duration
FROM users us,
LATERAL json_array_elements(us.session_durations) AS sd
) AS usd
GROUP BY user_id;
(online demo)
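For readers who want to check the expected shape of the result outside PostgreSQL, the pivot can be mimicked in a few lines of Python. This is only an illustration of what the view should return, not a replacement for the SQL; the helper name pivot_durations is made up, and the two input rows are copied from the sample data in the question.

```python
import json

# Two of the sample rows from the question, keyed by users.id.
rows = {
    13: '[{"sessionId":"1","duration":12699},{"sessionId":"7","duration":1423041},'
        '{"sessionId":"8","duration":7598502},{"sessionId":"10","duration":1531229}]',
    14: '[{"sessionId":"1","duration":55812},{"sessionId":"7","duration":2905}]',
}


def pivot_durations(session_durations_json, session_count=12):
    # Start with one column per possible session; missing sessions stay None (NULL).
    columns = {f"S{n}_duration": None for n in range(1, session_count + 1)}
    for item in json.loads(session_durations_json):
        key = f"S{int(item['sessionId'])}_duration"
        if key in columns:
            previous = columns[key]
            # Like MAX(...) FILTER (...): on duplicates keep the largest duration.
            columns[key] = (
                item["duration"] if previous is None else max(previous, item["duration"])
            )
    return columns


pivoted = {user_id: pivot_durations(raw) for user_id, raw in rows.items()}
print(pivoted[13]["S7_duration"], pivoted[14]["S2_duration"])
```

Sessions a user never opened come out as None, which corresponds to the NULL columns the view produces.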

Hive Query that returns distinct value that each User has

I have a mysql table-
User Value
A 1
A 12
A 3
B 4
B 3
B 1
C 1
C 1
C 8
D 34
D 1
E 1
F 1
G 56
G 1
H 1
H 3
C 3
F 3
E 3
G 3
I need to run a query which returns the 2nd distinct value that every user has.
That is, if two or more values are accessed by every user, rank them by number of occurrences and pick the 2nd one.
So as above 1 & 3 is being accessed by each User. Occurrence of 1 is
more than 3 , so 2nd distinct will be 3
So I thought I would first get all distinct users.
create table temp AS Select distinct user from table;
Then I will have an outer query:
Select value from table where value in (...)
Programmatically, I could iterate through the values each user has, like with a Map, but I couldn't write that as a Hive query.
This will return the second most frequent value from your list that spans all users. There isn't one of these values in the table, which I expect is a typo in the data. In real data you will likely have multiple ties that you need to figure out how to handle.
Select value as second_distinct from
(select value, rank() over (order by occurrences desc) as rank
from
(SELECT value, unique_users, max(count_users) as count_users, count(value) as occurrences
from
(select value, size(collect_set(user) over (partition by value))
as count_users from my_table
) t
left outer join
(select count(distinct user) as unique_users from my_table
) t2 on (1=1)
where unique_users=count_users
group by value, unique_users
) t3
) t4
where rank = 2;
This works. It returns NULL because there is only one value that every user has visited (the value 1). Value 3 is not a solution because not every user has seen that value in your data. I expect you intended that 3 should be returned, but again it doesn't span all the users (user D did not see value 3).
Not sure how @invoketheshell's answer was marked correct; it doesn't run and it needs 6 MR jobs. This will get you there in 4 and is less code.
Query:
select value
from (
select value, value_count, rank() over (order by value_count desc) rank
from (
select value, count(value) value_count
from (
select value, num_users, max(num_users) over () max_users
from (
select value
, size(collect_set(user) over (partition by value)) num_users
from db.table ) x ) y
where num_users = max_users
group by value ) z ) f
where rank = 2
Output:
3
EDIT: Let me clarify my solution as there seems to be some confusion. The OP's example says
"So as above 1 & 3 is being accessed by each User ... "
As my comment below the question suggests, in the example given, user D never accesses value 3. I assumed that this was a typo, added that row to the dataset, and then added another 1 as well so that there are more 1's than 3's. So my code correctly returns 3, which was the desired output. If you run this script on the actual dataset it will also produce the correct output, which is nothing, because there isn't a "2nd distinct". The only time it could produce an incorrect value is if no single number was accessed by all users, which illustrates the point I was trying to make to @invoketheshell: if there is no single number that every user has accessed, running a query with 6 map-reduce jobs is an absurd way to find that out. Since we are using Hive, I believe it is fair to assume that if this were a "real-world" problem, it would most likely be executed on at least hundreds of TBs of data (probably more). In the interest of preserving time and resources, it would behoove an individual to at least check that one number had been accessed by all users before running a massive query whose analysis hinges on that assumption being true.
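The disagreement above is easy to reproduce on the sample data. The sketch below is plain Python rather than Hive, with a made-up helper name (second_common_value): it keeps only values seen by every user, ranks them by occurrence count, and returns the second one. Unlike rank() in the queries, it does not treat ties specially.

```python
from collections import Counter

# (user, value) rows exactly as listed in the question.
pairs = [
    ("A", 1), ("A", 12), ("A", 3), ("B", 4), ("B", 3), ("B", 1),
    ("C", 1), ("C", 1), ("C", 8), ("D", 34), ("D", 1), ("E", 1),
    ("F", 1), ("G", 56), ("G", 1), ("H", 1), ("H", 3), ("C", 3),
    ("F", 3), ("E", 3), ("G", 3),
]


def second_common_value(pairs):
    all_users = {user for user, _ in pairs}

    # Which users have seen each value (the collect_set step).
    users_by_value = {}
    for user, value in pairs:
        users_by_value.setdefault(value, set()).add(user)

    # Keep only values seen by every user, most frequent first.
    occurrences = Counter(value for _, value in pairs)
    shared = [v for v, users in users_by_value.items() if users == all_users]
    shared.sort(key=lambda v: occurrences[v], reverse=True)

    return shared[1] if len(shared) > 1 else None


print(second_common_value(pairs))               # data as posted
print(second_common_value(pairs + [("D", 3)]))  # with the presumed missing row
```

On the data as posted only the value 1 is shared by all users, so there is no second value; once user D also has a 3, the result is 3.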

Get entry with max value in MySQL

I've got a MySQL database with lots of entries of highscores for a game. I would like to get the "personal best" entry with the max value of score.
I found a solution that I thought worked, until I got more names in my database; then it returned completely different results.
My code so far:
SELECT name, score, date, version, mode, custom
FROM highscore
WHERE score =
(SELECT MAX(score)
FROM highscore
WHERE name = 'jonte' && gamename = 'game1')
For a lot of values, this actually returns the correct value as such:
JONTE 240 2014-04-28 02:52:33 1 0 2053
It worked fine with a few hundred entries, some with different names. But when I added new entries and swapped name to 'gabbes', for the new names I instead get a list of multiple entries. I don't see the logic here as the entries in the database seem quite identical with some differences in data.
JONTE 176 2014-04-28 11:03:46 1 0 63
GABBES 176 2014-04-28 11:09:12 1 0 3087
The above has two entries, but sometimes it may return 10-20 entries in a row.
Any help?
If you want the high score for each person (i.e. personal best) you can do this...
SELECT name, max(score)
FROM highscore
WHERE gamename = 'game1'
GROUP BY name
Alternatively, you can do this...
SELECT name, score, date, version, mode, custom
FROM highscore h1
WHERE score =
(SELECT MAX(score)
FROM highscore h2
WHERE name = h1.name && gamename = 'game1')
NOTE: In your SQL, your subclause is missing the name = h1.name predicate.
Note however, that this second option will give multiple rows for the same person if they recorded the same high score multiple times.
The multiple entries are returned because multiple entries have the same high score. You can add LIMIT 1 to get only a single entry. You can choose which entry to return with the ORDER BY clause.
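The difference between the original query and the correlated version can be reproduced with Python's built-in sqlite3 module. This is a sketch on a cut-down table: the first three rows echo the output shown in the question, the fourth row is invented, and AND is used in place of MySQL's && so the SQL runs in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE highscore (name TEXT, gamename TEXT, score INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO highscore VALUES (?, ?, ?, ?)",
    [
        ("jonte", "game1", 240, "2014-04-28 02:52:33"),
        ("jonte", "game1", 176, "2014-04-28 11:03:46"),
        ("gabbes", "game1", 176, "2014-04-28 11:09:12"),
        ("gabbes", "game1", 90, "2014-04-27 09:00:00"),  # invented extra row
    ],
)

# Original shape: the outer query has no name filter, so any row whose score
# happens to equal gabbes' best score is returned.
uncorrelated = conn.execute(
    """
    SELECT name, score
    FROM highscore
    WHERE score = (SELECT MAX(score) FROM highscore
                   WHERE name = 'gabbes' AND gamename = 'game1')
    ORDER BY name
    """
).fetchall()

# Correlated shape: each row is compared with the best score of its own player.
correlated = conn.execute(
    """
    SELECT h1.name, h1.score
    FROM highscore AS h1
    WHERE h1.gamename = 'game1'
      AND h1.score = (SELECT MAX(h2.score) FROM highscore AS h2
                      WHERE h2.name = h1.name AND h2.gamename = 'game1')
    ORDER BY h1.name
    """
).fetchall()

print(uncorrelated)
print(correlated)
```

The first result contains jonte's 176 row alongside gabbes' row, which is the kind of extra entry described in the question; the second returns one personal best per player as long as no player has tied top scores.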

How can I find the correct prior status row in this table with a SQL query?

Imagine a workflow for data entry. Some forms come in, they are typed into a system, reviewed, and hopefully approved. However, they can be rejected by a manager and will have to be entered again.
So, an ideal workflow would go like this:
received > entered > approved
But this COULD happen:
received > entered > rejected > entered > rejected > approved
At each stage, we record who updated the form to its current status - who entered it, who rejected it, or who approved it. So the forms status table looks like this:
form_id status updated_by updated_at
1 received Bob (timestamp)
1 entered Bob (timestamp)
1 approved Susan (timestamp)
2 received Bob (timestamp)
2 entered Bob (timestamp)
2 rejected Susan (timestamp)
2 entered Carla (timestamp)
2 rejected Susan (timestamp)
2 entered Sam (timestamp)
2 approved Susan (timestamp)
Here's what I'm trying to do: write a rejection report. I want a row for each rejection, and joined to that row, I want to see who did the work that got rejected.
As a human, I can see that, for a given status row with status 'rejected', the row that will tell me who did the faulty work will be the one that
shares the same form_id and
has a prior timestamp closest to the rejection.
But I'm having trouble telling MySQL that.
Can anybody see how to construct this query?
A subselect ended up working for me.
SELECT
`s1`.`form_id`,
(
SELECT
`s2`.`updated_by`
FROM
statuses s2
WHERE
`s2`.`form_id` = `s1`.`form_id`
AND
`s2`.`updated_at` < `s1`.`updated_at`
ORDER BY
`s2`.`updated_at` DESC
LIMIT 1
) AS 'made_rejected_change'
FROM
statuses s1
WHERE
`s1`.`status` = 'rejected'
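To see what this correlated subselect returns, here is a small sketch using Python's built-in sqlite3 module on the table from the question. The integer timestamps are invented stand-ins for the "(timestamp)" placeholders, and the rejected_by and entered_by column aliases are chosen here for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statuses (form_id INTEGER, status TEXT, updated_by TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?, ?, ?)",
    [
        (1, "received", "Bob", 1), (1, "entered", "Bob", 2), (1, "approved", "Susan", 3),
        (2, "received", "Bob", 1), (2, "entered", "Bob", 2), (2, "rejected", "Susan", 3),
        (2, "entered", "Carla", 4), (2, "rejected", "Susan", 5),
        (2, "entered", "Sam", 6), (2, "approved", "Susan", 7),
    ],
)

# For every rejection, look up who made the latest earlier update on the same form.
report = conn.execute(
    """
    SELECT s1.form_id,
           s1.updated_by AS rejected_by,
           (SELECT s2.updated_by
            FROM statuses AS s2
            WHERE s2.form_id = s1.form_id
              AND s2.updated_at < s1.updated_at
            ORDER BY s2.updated_at DESC
            LIMIT 1) AS entered_by
    FROM statuses AS s1
    WHERE s1.status = 'rejected'
    ORDER BY s1.updated_at
    """
).fetchall()

print(report)
```

Each rejection on form 2 is paired with the person who entered the data just before it (Bob, then Carla). The subquery takes the immediately preceding row whatever its status, which works here because every rejection directly follows an 'entered' row.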
Another solution that uses subselect (this time not a correlated subquery):
SELECT
w1.*,
w2.updated_by AS entered_by
FROM (
SELECT
wr.form_id,
wr.updated_at AS rejected_at,
wr.updated_by AS rejected_by,
MAX(we.updated_at) AS entered_at
FROM workflow wr
INNER JOIN workflow we ON we.status = 'entered'
AND wr.form_id = we.form_id
AND wr.updated_at > we.updated_at
WHERE wr.status = 'rejected'
GROUP BY
wr.form_id,
wr.updated_at,
wr.updated_by
) w1
INNER JOIN workflow w2 ON w1.form_id = w2.form_id
AND w1.entered_at = w2.updated_at
The subselect lists all the rejecters and the immediately preceding entered timestamps. Then the table is joined once again to extract the names corresponding to the entered_at timestamps.
You want to get the rejected timestamp and then figure out the entry that appeared right before it based on the timestamp. I'm assuming that timestamp actually holds a date/time and isn't a SQL Server timestamp field (completely different).
declare @rejectedTimestamp datetime
select @rejectedTimestamp = timestamp
from table
where status = 'rejected'
select top 1 *
from table
where timestamp < @rejectedTimestamp
order by timestamp desc