Select count of dates older than x days

Select count of dates older than x days - mysql

My example table:
+----------+---------------------+
| username | time                |
+----------+---------------------+
| john     | 2013-02-04 17:39:43 |
| john     | 2013-02-03 00:21:31 |
| peter    | 2013-02-02 15:04:53 |
| grace    | 2013-02-02 03:57:43 |
| peter    | 2013-02-03 15:36:15 |
+----------+---------------------+
This table registers activities from users. I need to count the number of users whose last activity date was more than 30 days ago.
I had developed this query:
SELECT
    username,
    MAX(time),
    DATEDIFF(NOW(), MAX(time)) AS diff
FROM tracking
GROUP BY username
HAVING diff > 30
This returns the list of users whose last activity was more than 30 days ago, along with the date of that last activity.
But I need the count of this list, not the list itself. Is there any way I can count the list?
NOTES:
I can only rely on SQL statements, I can't use PHP or ASP or anything else.
I can't use STORED PROCEDURES.
I don't need performance, as this statement will only be run once in a while.

Here is a relatively simple way:
SELECT count(distinct username)
     - count(distinct case when DATEDIFF(NOW(), time) <= 30 then username end) as numusers
FROM tracking
This takes the total number of users and subtracts the count of the ones with activity in the last 30 days.
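To sanity-check the subtraction logic on the sample rows, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite, the helper name count_inactive_users, and the explicit now argument are assumptions made for illustration only: SQLite has no DATEDIFF() or NOW(), so a difference of julianday() values on the date parts stands in for them.

```python
import sqlite3

# Sample rows from the question's tracking table.
SAMPLE = [
    ("john", "2013-02-04 17:39:43"),
    ("john", "2013-02-03 00:21:31"),
    ("peter", "2013-02-02 15:04:53"),
    ("grace", "2013-02-02 03:57:43"),
    ("peter", "2013-02-03 15:36:15"),
]


def count_inactive_users(rows, now, days=30):
    """Count users with no activity in the `days` days before `now` (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE tracking (username TEXT, "time" TEXT)')
    conn.executemany("INSERT INTO tracking VALUES (?, ?)", rows)
    # All distinct users minus the distinct users seen inside the window.
    # julianday(date(...)) differences mimic DATEDIFF, which compares date parts only.
    (inactive,) = conn.execute(
        """
        SELECT COUNT(DISTINCT username)
             - COUNT(DISTINCT CASE
                       WHEN julianday(date(:now)) - julianday(date("time")) <= :days
                       THEN username END)
        FROM tracking
        """,
        {"now": now, "days": days},
    ).fetchone()
    conn.close()
    return inactive


print(count_inactive_users(SAMPLE, "2013-03-05 12:00:00"))
```

With the reference time set to 2013-03-05, john and peter each have a row within 30 days, so only grace is counted. Passing the reference time in as a parameter is only there to make the sample data reproducible; in MySQL the original NOW() call does that job.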

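Just wrap your query in a subquery, like this? Note that MySQL requires every derived table to have an alias (here t).
SELECT COUNT(*) AS Num
FROM (
    SELECT
        username,
        MAX(time),
        DATEDIFF(NOW(), MAX(time)) AS diff
    FROM tracking
    GROUP BY username
    HAVING diff > 30
) AS t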

Related

Query to get the count of logins by a user within a set time interval from previous login

I want to get a count of how many times a user logs in within, let's say, 5 hours from the previous login.
So something like new_login - old_login < 5 hours.
The login table would have user_id and time_accessed.
This query is to get the count of user logins within a day. I can't figure out how to compare the different times within the same column within the same statement:
SELECT user_id, date(time_accessed), count(user_id) AS login_within_5_hour_period
FROM login
GROUP BY user_id, date(time_accessed)
ORDER BY time_accessed;
Sample input
+---------+---------------------+
| user_id | time_accessed       |
+---------+---------------------+
|       1 | 2020-02-19 09:00:00 |
|       1 | 2020-02-19 12:00:00 |
|       1 | 2020-02-19 13:00:00 |
|       1 | 2020-02-19 19:00:00 |
+---------+---------------------+
Sample output
+---------+---------------------+----------------------------+
| user_id | date(time_accessed) | login_within_5_hour_period |
+---------+---------------------+----------------------------+
|       1 | 2020-02-19          |                          3 |
|       1 | 2020-02-19          |                          1 |
+---------+---------------------+----------------------------+

To compare different times within the same column, you need to join the table with itself.
The following query finds, for each login, the number of earlier logins by the same user within the preceding 5 hours, excluding the current login. To include the current login in the count, change l1.time_accessed > l2.time_accessed to l1.time_accessed >= l2.time_accessed. (The table is called logins here; adjust the name to match yours.)
SELECT l1.user_id, l1.time_accessed, COUNT(l2.user_id) AS login_within_5_hour_period
FROM logins l1
LEFT JOIN logins l2
       ON l1.user_id = l2.user_id
      AND l1.time_accessed > l2.time_accessed
      AND TIME_TO_SEC(TIMEDIFF(l1.time_accessed, l2.time_accessed)) / 3600 <= 5
GROUP BY l1.user_id, l1.time_accessed;
This second query returns a single row, showing the number of logins by the user within the 5 hours before the specified login time.
SELECT l1.user_id, l1.time_accessed, COUNT(l2.user_id) AS login_within_5_hour_period
FROM logins l1
LEFT JOIN logins l2
       ON l1.user_id = l2.user_id
      AND l1.time_accessed > l2.time_accessed
      AND TIME_TO_SEC(TIMEDIFF(l1.time_accessed, l2.time_accessed)) / 3600 <= 5
WHERE l1.time_accessed = '2020-02-19 19:00:00'
GROUP BY l1.user_id, l1.time_accessed;
Working example: https://www.db-fiddle.com/f/g7jDYqoKn38iQTFuPjej9m/1
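The self-join can also be tried locally on the question's sample rows. This is a rough sketch under stated assumptions: an in-memory SQLite database driven from Python stands in for MySQL, the helper name earlier_logins_within is made up for illustration, and a difference of strftime('%s', ...) values (whole seconds) replaces TIME_TO_SEC(TIMEDIFF(...)).

```python
import sqlite3

# Sample rows from the question.
SAMPLE_LOGINS = [
    (1, "2020-02-19 09:00:00"),
    (1, "2020-02-19 12:00:00"),
    (1, "2020-02-19 13:00:00"),
    (1, "2020-02-19 19:00:00"),
]


def earlier_logins_within(rows, hours=5):
    """For each login, count the same user's earlier logins within `hours` hours."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logins (user_id INTEGER, time_accessed TEXT)")
    conn.executemany("INSERT INTO logins VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT l1.user_id, l1.time_accessed, COUNT(l2.user_id) AS n
        FROM logins l1
        LEFT JOIN logins l2
               ON l1.user_id = l2.user_id
              AND l1.time_accessed > l2.time_accessed
              AND CAST(strftime('%s', l1.time_accessed) AS INTEGER)
                - CAST(strftime('%s', l2.time_accessed) AS INTEGER) <= :secs
        GROUP BY l1.user_id, l1.time_accessed
        ORDER BY l1.user_id, l1.time_accessed
        """,
        {"secs": hours * 3600},
    ).fetchall()
    conn.close()
    return result


for row in earlier_logins_within(SAMPLE_LOGINS):
    print(row)
```

Because the join is a LEFT JOIN and COUNT(l2.user_id) ignores NULLs, a login with no earlier login inside the window still appears, with a count of 0. On the sample data the 13:00 login gets a count of 2 (09:00 and 12:00), while the 19:00 login gets 0 because the previous login was 6 hours earlier.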

SQL query that gives me the percentage of users that fail to run a game per day

I'd really appreciate help on a SQL query I've been struggling to write.
Background:
Every time a user plays a game, a record gets created in the table game_runs, along with their user_id and run_date (a MySQL timestamp).
When the user successfully plays the game, they get a score greater than 0.
If the game failed to run (e.g. maybe it crashed), the score is 0.
The table looks something like this:
id | run_date            | user_id | score
-------------------------------------------
1  | 2020-02-02 00:20:00 | 10      | 0
2  | 2020-02-02 01:50:10 | 10      | 40
3  | 2020-02-02 03:40:20 | 11      | 80
4  | 2020-02-03 03:20:14 | 20      | 80
5  | 2020-02-03 12:20:14 | 21      | 0
6  | 2020-02-04 06:20:42 | 50      | 0
7  | 2020-02-04 11:15:00 | 50      | 0
8  | 2020-02-04 12:10:46 | 51      | 70
9  | 2020-02-05 00:15:00 | 60      | 0
10 | 2020-02-05 01:10:40 | 61      | 0
I would like to find out what percent of users fail to run the game per day.
In the above example, here's what I'm hoping I can generate:
date | percent_users_who_failed_to_run_the_game
-------------------------------------------------------------
2020-02-02 | 0
2020-02-03 | 0.5
2020-02-04 | 0.5
2020-02-05 | 1
Notice how on 2020-02-02, the percent of users who failed to run the game is 0% (i.e. everyone succeeded at least once). This is because on 2020-02-02, there were three runs:
id=1: user_id 10 failed to run it initially (score=0)
id=2: user_id 10 succeeds the second time (score=40)
id=3: user_id 11 succeeds
Since both users were successful that day, the percent of users that failed was 0%.
I'd love any input on how to get started. I am using MySQL v8+, so I have access to window functions if necessary (my research tells me that they may help, but I have been unable to write a query that does this).
I think the right logic would be something along the lines of finding out the % of users that have a MAX(score) = 0 but unsure how to write the query.
I hope that wasn't too unclear - I really appreciate you reading thus far, and any pointers will be so helpful.
Thank you!

I think you need to do this in two steps. The first step is to get the maximum score per user per day:
SELECT CAST(Run_Date AS DATE) AS RunDate,
User_ID,
MAX(Score) AS Score
FROM YourTable
GROUP BY CAST(Run_Date AS DATE), User_ID;
Then you can put this in a subquery and calculate your percentage:
SELECT RunDate,
COUNT(CASE WHEN Score = 0 THEN 1 END) / SUM(1.0) AS Failed_Percent
FROM ( SELECT CAST(Run_Date AS DATE) AS RunDate,
User_ID,
MAX(Score) AS Score
FROM YourTable
GROUP BY CAST(Run_Date AS DATE), User_ID
) AS t
GROUP BY RunDate;
You can also achieve this without a subquery using COUNT(DISTINCT):
SELECT CAST(Run_Date AS DATE) AS RunDate,
1 - (1.0 * COUNT(DISTINCT CASE WHEN Score > 0 THEN User_ID END)
/ COUNT(DISTINCT User_id)) AS Failed_Percent
FROM YourTable
GROUP BY CAST(Run_Date AS DATE);
This really applies the reverse logic, but the result is the same. The relevant parts are:
COUNT(DISTINCT CASE WHEN Score > 0 THEN User_ID END)
This gets the number of distinct users who ran the game successfully on a given date. Then
COUNT(DISTINCT User_id)
gives the total number of users who logged a record on that date. The former divided by the latter gives the proportion of successful users, so we subtract this from 1 to get the proportion who failed. I have multiplied one of the counts by 1.0 to implicitly convert it to a decimal and avoid integer division.
I would expect the first query to be more efficient, but I could be wrong.
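As a quick check of the two-step approach against the question's data, the following sketch runs the same shape of query on an in-memory SQLite database from Python. SQLite, the helper name failed_ratio_per_day, and the aliases run_day, best and per_user are assumptions for illustration; date(run_date) plays the role of CAST(Run_Date AS DATE).

```python
import sqlite3

# Rows from the question's game_runs table: (id, run_date, user_id, score).
RUNS = [
    (1, "2020-02-02 00:20:00", 10, 0),
    (2, "2020-02-02 01:50:10", 10, 40),
    (3, "2020-02-02 03:40:20", 11, 80),
    (4, "2020-02-03 03:20:14", 20, 80),
    (5, "2020-02-03 12:20:14", 21, 0),
    (6, "2020-02-04 06:20:42", 50, 0),
    (7, "2020-02-04 11:15:00", 50, 0),
    (8, "2020-02-04 12:10:46", 51, 70),
    (9, "2020-02-05 00:15:00", 60, 0),
    (10, "2020-02-05 01:10:40", 61, 0),
]


def failed_ratio_per_day(rows):
    """Per day, the share of users whose best score that day was 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE game_runs (id INTEGER, run_date TEXT, user_id INTEGER, score INTEGER)"
    )
    conn.executemany("INSERT INTO game_runs VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT run_day,
               -- 1.0 / 0.0 keep the division in floating point
               SUM(CASE WHEN best = 0 THEN 1.0 ELSE 0.0 END) / COUNT(*) AS failed_ratio
        FROM (
            -- Step 1: best score per user per day
            SELECT date(run_date) AS run_day, user_id, MAX(score) AS best
            FROM game_runs
            GROUP BY date(run_date), user_id
        ) AS per_user
        GROUP BY run_day
        ORDER BY run_day
        """
    ).fetchall()
    conn.close()
    return result


for day, ratio in failed_ratio_per_day(RUNS):
    print(day, ratio)
```

On the sample rows this yields 0, 0.5, 0.5 and 1 for the four days, matching the table the asker hoped to generate. User 10 counts as successful on 2020-02-02 because the inner MAX(score) collapses their failed and successful runs into one row per day.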

You can do this without a subquery:
select date(run_date) as dte,
       1 - count(distinct case when score > 0 then user_id end) / count(distinct user_id) as failed_ratio
from t
group by dte;
The ratio is the proportion of users who successfully ran the game each day. 1 - <this proportion> is the proportion who were unsuccessful.

MySQL: Count of Maximum Occurrences of Distinct Value

I'm having a problem with an aggregate function in MySQL.
As an example I have this table layout. It gets filled with data every x minutes.
Company | Employee | Room | Temperature
---------------------------------------
A | Mike | 301 | 20
A | Mike | 301 | 30
A | Mike | 301 | 30
A | Mike | 402 | 40
A | Lisa | 402 | 10
Now in my query I'm grouping Company + Employee into one result and I'm looking for the count of the maximum occurrences of the Room value while still aggregating other values like temperature.
SELECT
Company,
Employee,
??? as Room,
AVG(Temperature) as Temperature
FROM
example_table
GROUP BY
Company,
Employee
In this example the room 301 appears 3 times for Mike which should output 3 in the aggregate function. Since the data is on a set interval it is basically the maximum length of a stay in a room for this employee. I'm looking for a result like this
Company | Employee | Room | Temperature
---------------------------------------
A | Mike | 3 | 30
A | Lisa | 1 | 10
I feel like I'm missing something, but so far I have found nothing that works in a query. I can GROUP_CONCAT the rooms and solve this with 2 lines of code in PHP, but the actual data is gigabytes, which I don't want to send to a script. Performance of the database query doesn't matter. MySQL 8 is available.
edit: I've changed the example to make sure COUNT(DISTINCT Room) doesn't accidentally give the correct result, because it's not what I'm looking for.

SELECT Company, Employee
     , MAX(roomOccurrence) AS Room
     , SUM(roomTemp * roomOccurrence) / SUM(roomOccurrence) AS Temperature
FROM ( SELECT Company, Employee, Room
            , COUNT(*) AS roomOccurrence, AVG(Temperature) AS roomTemp
       FROM example_table
       GROUP BY Company, Employee, Room
     ) AS subQ
GROUP BY Company, Employee
;
Note that the outer temperature calculation weights each per-room average from the inner query by the number of rows for that room, so the result matches a plain AVG(Temperature) over all of the employee's rows.
Alternatively, you could SUM the temps in the subquery and then divide the SUM of those SUMs by the SUM of the room COUNTs; the result should be the same either way. I would expect at most minor performance differences, and I'm not sure either way would be consistently faster.
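The weighting idea can be checked on the question's sample rows. The sketch below assumes an in-memory SQLite database driven from Python as a stand-in for MySQL, and the helper name longest_stay_and_avg_temp is invented for illustration. It divides the sum of (per-room average × per-room row count) by the total row count, which reproduces the plain average over all rows.

```python
import sqlite3

# Sample rows from the question: (Company, Employee, Room, Temperature).
READINGS = [
    ("A", "Mike", 301, 20),
    ("A", "Mike", 301, 30),
    ("A", "Mike", 301, 30),
    ("A", "Mike", 402, 40),
    ("A", "Lisa", 402, 10),
]


def longest_stay_and_avg_temp(rows):
    """Per employee: row count of the most frequent room, and the average temperature."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example_table (Company TEXT, Employee TEXT, Room INTEGER, Temperature REAL)"
    )
    conn.executemany("INSERT INTO example_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Company, Employee,
               MAX(roomOccurrence) AS Room,
               -- weight each room's average by how many rows produced it
               SUM(roomTemp * roomOccurrence) / SUM(roomOccurrence) AS Temperature
        FROM (
            SELECT Company, Employee, Room,
                   COUNT(*) AS roomOccurrence, AVG(Temperature) AS roomTemp
            FROM example_table
            GROUP BY Company, Employee, Room
        ) AS subQ
        GROUP BY Company, Employee
        ORDER BY Employee
        """
    ).fetchall()
    conn.close()
    return result


for row in longest_stay_and_avg_temp(READINGS):
    print(row)
```

For Mike the inner query produces two rows, room 301 with 3 readings and room 402 with 1, so the outer MAX gives 3 and the weighted temperature comes out at 30 (up to floating-point rounding), the same as averaging 20, 30, 30 and 40 directly.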

Average time difference between rows in database

Using MySQL, I have a table that keeps track of user visits:
USER_ID | TIMESTAMP
--------+----------------------
1 | 2014-08-11 14:37:36
2 | 2014-08-11 12:37:36
3 | 2014-08-07 16:37:36
1 | 2014-07-14 15:34:36
1 | 2014-07-09 14:37:36
2 | 2014-07-03 14:37:36
3 | 2014-05-23 15:37:36
3 | 2014-05-13 12:37:36
The time of day is not important; I'm more concerned with the answer to "how many days between entries".
How do I go about figuring out the average number of days between entries using SQL queries?
For example, the output should look like something like:
(output is just a sample, not reflection of the data table above)
USER_ID | AVG TIME (days)
--------+----------------------
1 | 2
2 | 3
3 | 1

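Before version 8.0, MySQL had no direct "get something from a previous row" capability (8.0 added window functions such as LAG()). The easiest workaround on older versions is to use user variables to store the "previous" values:
SET @last = NULL, @last_user = NULL;
SELECT user_id, AVG(diff) AS avg_days
FROM (
    SELECT user_id,
           -- NULL on each user's first row, so AVG ignores it
           IF(@last_user = user_id, DATEDIFF(timestamp, @last), NULL) AS diff,
           @last := timestamp,
           @last_user := user_id
    FROM yourtable
    ORDER BY user_id, timestamp ASC
) AS foo
GROUP BY user_id;
The inner query does the "difference from previous row" calculation in days, and the outer query does the averaging. Note that MySQL does not guarantee the evaluation order of expressions that assign user variables, so treat this as a workaround rather than a robust solution.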
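On MySQL 8.0 and later, the window function LAG() can express the same "previous row" step without variables. The sketch below illustrates that idea on the question's sample rows, using an in-memory SQLite database from Python as a stand-in (SQLite 3.25 or newer is needed for window functions). The table name visits, the column name ts, and the helper avg_days_between_visits are hypothetical, and julianday(date(...)) differences stand in for MySQL's DATEDIFF().

```python
import sqlite3

# Sample rows from the question: (user_id, timestamp).
VISITS = [
    (1, "2014-08-11 14:37:36"),
    (2, "2014-08-11 12:37:36"),
    (3, "2014-08-07 16:37:36"),
    (1, "2014-07-14 15:34:36"),
    (1, "2014-07-09 14:37:36"),
    (2, "2014-07-03 14:37:36"),
    (3, "2014-05-23 15:37:36"),
    (3, "2014-05-13 12:37:36"),
]


def avg_days_between_visits(rows):
    """Average number of days between consecutive visits, per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (user_id INTEGER, ts TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT user_id, AVG(gap_days) AS avg_days
        FROM (
            -- LAG() fetches the same user's previous visit; it is NULL on the
            -- first visit, so that row drops out of the AVG.
            SELECT user_id,
                   julianday(date(ts))
                   - julianday(date(LAG(ts) OVER (PARTITION BY user_id ORDER BY ts)))
                     AS gap_days
            FROM visits
        ) AS gaps
        GROUP BY user_id
        ORDER BY user_id
        """
    ).fetchall()
    conn.close()
    return result


for user_id, avg_days in avg_days_between_visits(VISITS):
    print(user_id, avg_days)
```

PARTITION BY user_id keeps each user's gaps separate, which is the part the variable-based workaround has to handle by hand. For user 1 the gaps are 5 and 28 days, giving an average of 16.5.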

In mysql: how can I select the most recently added row when selecting by MAX if two values are equal (application is a games high score table)

I am trying to construct a highscore table from entries in a table with the layout
id(int) | username(varchar) | score(int) | modified (timestamp)
selecting the highest scores per day for each user is working well using the following:
SELECT id, username, MAX( score ) AS hiscore
FROM entries WHERE DATE( modified ) = CURDATE( )
Where I am stuck is that in some cases players may achieve the same score multiple times in the same day, in which case I need to make sure that the earliest one is always selected, because when two scores match, whoever reached that score first wins.
if my table contains the following:
id | username | score | modified
________|___________________|____________|_____________________
1 | userA | 22 | 2014-01-22 08:00:14
2 | userB | 22 | 2014-01-22 12:26:06
3 | userA | 22 | 2014-01-22 16:13:22
4 | userB | 15 | 2014-01-22 18:49:01
The returned winning table in this case should be:
id | username | score | modified
________|___________________|____________|_____________________
1 | userA | 22 | 2014-01-22 08:00:14
2 | userB | 22 | 2014-01-22 12:26:06
I tried to achieve this by adding ORDER BY modified DESC to the query, but it always returns the later score. I tried ORDER BY modified ASC as well, but I got the same result.

This is the classic greatest-n-per-group problem, which has been answered frequently on Stack Overflow. Here's a solution for your case:
SELECT e.*
FROM entries e
JOIN (
SELECT DATE(modified) AS modified_date, MAX(score) AS score
FROM entries
GROUP BY modified_date
) t ON DATE(e.modified) = t.modified_date AND e.score = t.score
WHERE DATE(e.modified) = CURDATE()
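The expected output in the question has one row per user, with ties on score broken by the earliest modified time. One way to sketch that variant of greatest-n-per-group is an anti-join: keep a row only if the same user has no row that day with a higher score, or with the same score at an earlier time. This is an illustration under stated assumptions, not the answer's query: an in-memory SQLite database from Python stands in for MySQL, the helper name top_scores_for_day is invented, and the day is passed as a parameter instead of using CURDATE().

```python
import sqlite3

# Sample rows from the question: (id, username, score, modified).
ENTRIES = [
    (1, "userA", 22, "2014-01-22 08:00:14"),
    (2, "userB", 22, "2014-01-22 12:26:06"),
    (3, "userA", 22, "2014-01-22 16:13:22"),
    (4, "userB", 15, "2014-01-22 18:49:01"),
]


def top_scores_for_day(rows, day):
    """Each user's best score on `day`, keeping the earliest row when scores tie."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (id INTEGER, username TEXT, score INTEGER, modified TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT e.id, e.username, e.score, e.modified
        FROM entries e
        WHERE date(e.modified) = :day
          AND NOT EXISTS (
              -- a "better" row: higher score, or same score reached earlier
              SELECT 1
              FROM entries o
              WHERE o.username = e.username
                AND date(o.modified) = :day
                AND (o.score > e.score
                     OR (o.score = e.score AND o.modified < e.modified))
          )
        ORDER BY e.score DESC, e.modified
        """,
        {"day": day},
    ).fetchall()
    conn.close()
    return result


for row in top_scores_for_day(ENTRIES, "2014-01-22"):
    print(row)
```

On the sample data this keeps rows 1 and 2, matching the winning table in the question: userA's later 22 (row 3) is beaten by the earlier 22, and userB's 15 is beaten by the 22. The final ORDER BY also ranks equal scores by who reached them first.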

I think this would work for you and is the simplest way:
SELECT username, MAX(score), MIN(modified)
FROM entries
GROUP BY username
This returns this in your case:
"userB";22;"2014-01-22 12:26:06"
"userA";22;"2014-01-22 08:00:14"
However, if what you actually want is the most recent row (in which case your example output would be wrong), you need this:
SELECT username, MAX(score), MAX(modified)
FROM entries
GROUP BY username
Which returns:
"userB";22;"2014-01-22 18:49:01"
"userA";22;"2014-01-22 16:13:22"
