SQL subquery in SELECT clause - mysql

I'm trying to find admin activity within the last 30 days.
The accounts table stores the user data (username, password, etc.)
At the end of each day, if a user had logged in, it will create a new entry in the player_history table with their updated data. This is so we can track progress over time.
accounts table:
id username admin
1 Michael 4
2 Steve 3
3 Louise 3
4 Joe 0
5 Amy 1
player_history table:
id user_id created_at playtime
0 1 2021-04-03 10
1 2 2021-04-04 10
2 3 2021-04-05 15
3 4 2021-04-10 20
4 5 2021-04-11 20
5 1 2021-05-12 40
6 2 2021-05-13 55
7 3 2021-05-17 65
8 4 2021-05-19 75
9 5 2021-05-23 30
10 1 2021-06-01 60
11 2 2021-06-02 65
12 3 2021-06-02 67
13 4 2021-06-03 90
The following query
SELECT a.`username`, SEC_TO_TIME((MAX(h.`playtime`) - MIN(h.`playtime`))*60) as 'time' FROM `player_history` h, `accounts` a WHERE h.`created_at` > '2021-05-06' AND h.`user_id` = a.`id` AND a.`admin` > 0 GROUP BY h.`user_id`
Outputs this table:
Note that this is just admin activity, so Joe is not included in this data.
from 2021-05-06 to present (yy-mm-dd):
username time
Michael 00:20:00
Steve 00:10:00
Louise 00:02:00
Amy 00:00:00
As you can see from this data, Amy's time is shown as 0 although she has played for 10 minutes in the last month. This is because she only has 1 entry starting from 2021-05-06, so there is nothing to compare it to: MAX and MIN are the same value, and 30 - 30 = 0.
Another flaw is that it doesn't include all activity in the last month; it only subtracts the lowest value in the period from the highest.
So I tried fixing this by comparing the highest value after 2021-05-06 to the user's most recent entry before that date. I modified the query a bit:
SELECT a.`Username`, SEC_TO_TIME((MAX(h.`playtime`) - (SELECT MAX(`playtime`) FROM `player_history` WHERE a.`id` = `user_id` AND `created_at` < '2021-05-06'))*60) as 'Time' FROM `player_history` h, `accounts` a WHERE h.`created_at` >= '2021-05-06' AND h.`user_id` = a.`id` AND a.`admin` > 0 GROUP BY h.`user_id`
So now it will output:
username time
Michael 00:50:00
Steve 00:50:00
Louise 00:52:00
Amy 00:10:00
But I feel like this whole query is quite inefficient. Is there a better way to do this?

I think you want lag():
-- playtime is stored in minutes, so multiply by 60 before SEC_TO_TIME()
SELECT a.username,
       SEC_TO_TIME(SUM(h.playtime - COALESCE(h.prev_playtime, 0)) * 60) as time
FROM accounts a JOIN
     (SELECT h.*,
             -- previous snapshot for the same user, or NULL for the first one
             LAG(h.playtime) OVER (PARTITION BY h.user_id ORDER BY h.created_at) as prev_playtime
      FROM player_history h
     ) h
     ON h.user_id = a.id
WHERE h.created_at > '2021-05-06' AND
      a.admin > 0
GROUP BY a.username;
In addition to the LAG() logic, note the other changes to the query:
The use of proper, explicit, standard, readable JOIN syntax.
The use of consistent columns for the SELECT and GROUP BY.
The removal of single quotes around the column alias.
The removal of backticks; they just clutter the query, making it harder to write and to read.
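To sanity-check the LAG() idea against the sample rows from the question, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption for the sake of a runnable example): it has no SEC_TO_TIME(), so the sketch returns plain minutes, and it needs SQLite 3.25 or later for window functions. The inner alias ph and the name minutes_by_admin are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT, admin INTEGER);
CREATE TABLE player_history (id INTEGER PRIMARY KEY, user_id INTEGER,
                             created_at TEXT, playtime INTEGER);
INSERT INTO accounts VALUES
  (1,'Michael',4),(2,'Steve',3),(3,'Louise',3),(4,'Joe',0),(5,'Amy',1);
INSERT INTO player_history VALUES
  (0,1,'2021-04-03',10),(1,2,'2021-04-04',10),(2,3,'2021-04-05',15),
  (3,4,'2021-04-10',20),(4,5,'2021-04-11',20),(5,1,'2021-05-12',40),
  (6,2,'2021-05-13',55),(7,3,'2021-05-17',65),(8,4,'2021-05-19',75),
  (9,5,'2021-05-23',30),(10,1,'2021-06-01',60),(11,2,'2021-06-02',65),
  (12,3,'2021-06-02',67),(13,4,'2021-06-03',90);
""")

# LAG() is computed over the whole history first, so the first row inside
# the date window is still compared with the last row before the window.
rows = conn.execute("""
SELECT a.username,
       SUM(h.playtime - COALESCE(h.prev_playtime, 0)) AS minutes
FROM accounts a
JOIN (SELECT ph.*,
             LAG(ph.playtime) OVER (PARTITION BY ph.user_id
                                    ORDER BY ph.created_at) AS prev_playtime
      FROM player_history ph) h
  ON h.user_id = a.id
WHERE h.created_at > '2021-05-06' AND a.admin > 0
GROUP BY a.username
""").fetchall()

minutes_by_admin = dict(rows)
print(minutes_by_admin)
```

On these rows Michael comes out at 50 minutes, Louise at 52 and Amy at 10. Steve comes out at 55 (65 minus his 10 from 2021-04-04), which differs from the 00:50:00 listed in the question's second output.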

Related

Want to find the amount generated by a single event in one month followed by several events

I want to find the revenue generated by a particular event.
For example, say event 1 was pitched on 8/1/2020 to customers A and B, event 2 on 8/15/2020 to customers B and C, and event 3 on 8/30/2020. To find the revenue generated by event 1, we need to check whether customers A and B were pitched again in that month. If so, only transactions dated before the customer's next pitch count. In this example, customer A was pitched again on 8/30/2020 and customer B on 8/15/2020, so for event 1 we consider customer A's transactions from 8/1/2020 through 8/29/2020 and customer B's from 8/1/2020 through 8/14/2020.
Event Table:
EventID CID Date
123 1 01-12-2020
123 2 01-12-2020
123 3 01-12-2020
345 2 05-12-2020
345 4 05-12-2020
456 1 07-12-2020
456 4 07-12-2020
567 1 08-12-2020
Transaction Table:
UID Tran_Date Amount
1 03-12-2020 10
1 04-12-2020 20
1 07-12-2020 30
1 09-12-2020 40
2 03-12-2020 10
2 07-12-2020 30
2 07-12-2020 40
2 09-12-2020 30
3 07-12-2020 30
3 07-12-2020 40
3 09-12-2020 30
Output Table:
EventID CID Sum
123 1 30
456 1 30
567 1 40
123 2 10
345 2 100
123 3 100
One option uses window function lead() to get the date of the "next" event, then brings the transactions with a join, and finally aggregates:
select
e.eventid,
e.cid,
coalesce(sum(t.amount), 0) total_amount
from (
select e.*, lead(date) over(partition by cid order by date) lead_date
from events e
) e
left join transactions t
on t.uid = e.cid
and t.tran_date >= e.date
and (t.tran_date < e.lead_date or e.lead_date is null)
group by e.eventid, e.cid
Note that window functions are available in MySQL 8.0 and later only. In earlier versions, you can emulate lead() with a correlated subquery.
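A minimal sketch of the lead() query running on the sample data, using SQLite through Python's sqlite3 module as a stand-in for MySQL 8.0 (an assumption made only so the example is runnable; it needs SQLite 3.25 or later). The dates are assumed to be dd-mm-yyyy and are stored as ISO strings so they sort correctly, and the column is named event_date here instead of date. The inner alias ev and the totals dictionary are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (eventid INTEGER, cid INTEGER, event_date TEXT);
CREATE TABLE transactions (uid INTEGER, tran_date TEXT, amount INTEGER);
INSERT INTO events VALUES
  (123,1,'2020-12-01'),(123,2,'2020-12-01'),(123,3,'2020-12-01'),
  (345,2,'2020-12-05'),(345,4,'2020-12-05'),
  (456,1,'2020-12-07'),(456,4,'2020-12-07'),(567,1,'2020-12-08');
INSERT INTO transactions VALUES
  (1,'2020-12-03',10),(1,'2020-12-04',20),(1,'2020-12-07',30),(1,'2020-12-09',40),
  (2,'2020-12-03',10),(2,'2020-12-07',30),(2,'2020-12-07',40),(2,'2020-12-09',30),
  (3,'2020-12-07',30),(3,'2020-12-07',40),(3,'2020-12-09',30);
""")

# Each event owns the customer's transactions from its own date up to
# (but not including) the date of that customer's next event.
totals = {
    (eventid, cid): total
    for eventid, cid, total in conn.execute("""
        SELECT e.eventid, e.cid, COALESCE(SUM(t.amount), 0) AS total_amount
        FROM (SELECT ev.*,
                     LEAD(ev.event_date) OVER (PARTITION BY ev.cid
                                               ORDER BY ev.event_date) AS lead_date
              FROM events ev) e
        LEFT JOIN transactions t
          ON t.uid = e.cid
         AND t.tran_date >= e.event_date
         AND (t.tran_date < e.lead_date OR e.lead_date IS NULL)
        GROUP BY e.eventid, e.cid
    """)
}
print(totals)
```

Because of the left join, customer 4 (who has no transactions) still appears, with a total of 0 for events 345 and 456. Switch to an inner join if those rows should be dropped.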

Select weekly average of user usage, only for some users (mysql)

I have 2 tables, and I want to show a weekly TOTAL average of data usage for users who started using the application 10 weeks ago. (in that week)
Table 1 is called "users"
user_id user_name user_date
1 a 2020-05-01
2 b 2020-05-03
3 c 2020-06-01
4 d 2020-06-06
5 e 2020-06-09
Table 2 is called "data_tbl"
data_id user_id date_used data_used
1 1 2020-05-09 7
2 1 2020-05-09 12
3 2 2020-05-12 100
4 2 2020-05-20 177
5 1 2020-05-21 78
6 2 2020-05-29 33
7 1 2020-06-01 44
8 2 2020-06-01 123
9 1 2020-06-03 62
Consider that 10 weeks ago is between 2020-05-01 and 2020-05-08
So the 2 users we are interested in in that case are user_id 1 and 2 (a and b)
We consider first week from 05-01 to 05-08
Second week from 2020-05-08 to 2020-05-15
Third week from 2020-05-15 to 2020-05-22
Fourth week from 2020-05-22 to 2020-05-29 and so on
For week 1 we would have average usage = 0
For week 2 we would have average usage (7+12+100)/3=39
For week 3 we would have average usage (177+78)/2=127
For week 4 we would have average usage 33
For week 5 we would have average usage (44+123+62)/3=76
I really don't know how to start, if I should do a join, or a select in select with average.
I tested something like: (but no success)
SELECT AVG(data_used),
FROM data_tbl
LEFT JOIN users ON data_tbl.user_id=users.user_id
WHERE users.user_date>= "2020-05-01" AND users.user_date<="2020-05-08"
GROUP BY date
ORDER BY date;
You can achieve this easily with the YEARWEEK() function.
However, what you want to achieve is not totally clear to me, because the results you want don't really match your data.
Example:
SELECT YEARWEEK(SYSDATE()) AS Actual_Week,
YEARWEEK(user_date) User_Date_Week,
YEARWEEK(SYSDATE()) - YEARWEEK(user_date) AS diff_weeks ,
u.*
FROM users u
Returns
Actual_Week User_Date_Week diff_weeks user_id user_name user_date
202029 202017 12 1 a 2020-05-01
202029 202018 11 2 b 2020-05-03
202029 202022 7 3 c 2020-06-01
202029 202022 7 4 d 2020-06-06
202029 202023 6 5 e 2020-06-09
So you can see that user 1 started 12 weeks ago and user 2 started 11 weeks ago. You assumed they started 10 weeks ago, which is incorrect. The same goes for date_used in data_tbl.
So I'll just put you on the right path, it should then be easy to adapt following your needs...
Do something like this
SELECT YEARWEEK(d.date_used), AVG(data_used)
FROM users u
INNER JOIN data_tbl d ON u.user_id = d.user_id
WHERE (YEARWEEK(SYSDATE()) - YEARWEEK(u.user_date)) BETWEEN 11 AND 12
GROUP BY YEARWEEK(d.date_used)
Returns
YEARWEEK(d.date_used) AVG(data_used)
202018 9.5
202019 100
202020 127.5
202021 33
202022 76.3333
You can see that the numbers you expect are there, along with some others. This result seems correct to me; the results in your question were wrong.
Notice that to get the results for user 1 and 2, I specified
WHERE (YEARWEEK(SYSDATE()) - YEARWEEK(u.user_date)) BETWEEN 11 AND 12
If you want the user of 10 weeks ago, just do
WHERE (YEARWEEK(SYSDATE()) - YEARWEEK(u.user_date)) = 10
And to conclude :
you might want to change the mode of YEARWEEK(), if the weeks should start on Monday, Sunday, or other options. Modes are well described here, with plenty of examples
If you also want the weeks without data in your results (so always 0), you have to use a Calendar table. There are plenty of examples on SO.
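If what you actually want is fixed 7-day buckets counted from the cohort's start date (as the question describes) rather than calendar weeks, the arithmetic can be sketched in plain Python as below. This is only an illustration under stated assumptions: each bucket is half-open, [start, start + 7 days), so 2020-05-29 falls into week 5 here even though the question counts it in week 4, and empty weeks are filled with 0, which is the job a calendar table would do in SQL. The names cohort_start, buckets and weekly_avg are made up for this example.

```python
from collections import defaultdict
from datetime import date

users = {1: date(2020, 5, 1), 2: date(2020, 5, 3), 3: date(2020, 6, 1),
         4: date(2020, 6, 6), 5: date(2020, 6, 9)}
usage = [  # (user_id, date_used, data_used)
    (1, date(2020, 5, 9), 7), (1, date(2020, 5, 9), 12),
    (2, date(2020, 5, 12), 100), (2, date(2020, 5, 20), 177),
    (1, date(2020, 5, 21), 78), (2, date(2020, 5, 29), 33),
    (1, date(2020, 6, 1), 44), (2, date(2020, 6, 1), 123),
    (1, date(2020, 6, 3), 62),
]

cohort_start = date(2020, 5, 1)  # assumed start of "10 weeks ago"

# Users who signed up during the first 7 days after cohort_start.
cohort = {uid for uid, joined in users.items()
          if 0 <= (joined - cohort_start).days < 7}

# Week 1 covers days 0..6 after cohort_start, week 2 covers days 7..13, etc.
buckets = defaultdict(list)
for uid, used_on, amount in usage:
    if uid in cohort:
        week = (used_on - cohort_start).days // 7 + 1
        buckets[week].append(amount)

# Fill weeks without any usage with 0 instead of skipping them.
weekly_avg = {}
for week in range(1, max(buckets) + 1):
    values = buckets.get(week, [])
    weekly_avg[week] = sum(values) / len(values) if values else 0

print(weekly_avg)
```

If 2020-05-29 should count towards week 4, change the boundary rule instead of the data, for example by treating each bucket as (start, start + 7 days].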

Mysql select result in one currency

I have to create reports in a single currency. I need to do this with a MySQL query, without processing in PHP, but I'm unable to figure it out.
There is a currency exchange rate table as follows (the exchange rate of each currency is expressed in LKR). It gets one new record per currency every month.
exchange_rates
id currency_id start_date exchange_rate
1 5 2017-01-2 155
2 4 2017-01-3 25
3 6 2017-01-3 53
4 5 2017-02-1 156
5 4 2017-02-1 24
6 6 2017-02-1 54
There is a project table as follows
pro_id name value currency_id status_id owner_id date
1 studio1 500 5 1 44 2017-01-20
2 lotus 120 5 1 42 2017-01-21
3 auro 300 4 2 45 2017-01-21
4 studio2 400 6 1 44 2017-01-22
5 holland 450 4 3 46 2017-02-05
6 studio3 120 4 3 47 2017-02-06
7 studio4 400 6 3 48 2017-02-06
How can I generate reports in one currency (DKK, although the exchange rates are in LKR), such as totals by status, monthly totals, totals by owner, etc.?
We have to consider the project's currency id, the currency to convert to, and that month's exchange rates for those currencies to get the converted value for each project row.
I hope my scenario is clear. Your help is much appreciated.
I don't need every report. I just want SQL that converts the values in the project table using the exchange rates table, or a status-wise report as follows:
status_id value_in_one_currency
1 xxxx
2 xxxx
3 xxxx
Try this:
SELECT A.status_id, A.`value` * B.exchange_rate `value_in_one_currency`
FROM project A JOIN exchange_rates B
ON A.currency_id=B.currency_id
AND DATE_FORMAT(A.`date`,'%m-%Y')=DATE_FORMAT(B.`start_date`,'%m-%Y');
See MySQL Join Made Easy for some insight.
This is what I finalized:
I took currency_id=5 as the target currency to convert to
SELECT A.*,C.exchange_rate AS DKK,D.exchange_rate AS LKR, (order_value * D.exchange_rate /C.exchange_rate ) AS `converted_value`
FROM projects A
LEFT JOIN exchange_rates C ON (DATE_FORMAT(C.start_date,'%Y-%m')=DATE_FORMAT(A.`date`,'%Y-%m') AND C.currency_id=5)
LEFT JOIN exchange_rates D ON DATE_FORMAT(D.start_date,'%Y-%m')=DATE_FORMAT(A.`date`,'%Y-%m') AND D.currency_id=A.currency_id
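The conversion boils down to value * (rate of the project's currency) / (rate of the target currency), with both rates taken from the project's month. Here is a minimal sketch of the status-wise report on the question's sample data, using SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability only, so strftime() replaces DATE_FORMAT()). The start dates are normalised to ISO format, currency_id 5 is assumed to be the target currency, and the aliases src and tgt are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exchange_rates (id INTEGER, currency_id INTEGER,
                             start_date TEXT, exchange_rate REAL);
CREATE TABLE project (pro_id INTEGER, name TEXT, value REAL, currency_id INTEGER,
                      status_id INTEGER, owner_id INTEGER, date TEXT);
INSERT INTO exchange_rates VALUES
  (1,5,'2017-01-02',155),(2,4,'2017-01-03',25),(3,6,'2017-01-03',53),
  (4,5,'2017-02-01',156),(5,4,'2017-02-01',24),(6,6,'2017-02-01',54);
INSERT INTO project VALUES
  (1,'studio1',500,5,1,44,'2017-01-20'),(2,'lotus',120,5,1,42,'2017-01-21'),
  (3,'auro',300,4,2,45,'2017-01-21'),(4,'studio2',400,6,1,44,'2017-01-22'),
  (5,'holland',450,4,3,46,'2017-02-05'),(6,'studio3',120,4,3,47,'2017-02-06'),
  (7,'studio4',400,6,3,48,'2017-02-06');
""")

# src: rate of the project's own currency for the project's month.
# tgt: rate of the target currency (assumed currency_id 5) for the same month.
by_status = dict(conn.execute("""
SELECT p.status_id,
       SUM(p.value * src.exchange_rate / tgt.exchange_rate) AS value_in_one_currency
FROM project p
JOIN exchange_rates src
  ON src.currency_id = p.currency_id
 AND strftime('%Y-%m', src.start_date) = strftime('%Y-%m', p.date)
JOIN exchange_rates tgt
  ON tgt.currency_id = 5
 AND strftime('%Y-%m', tgt.start_date) = strftime('%Y-%m', p.date)
GROUP BY p.status_id
""").fetchall())

print(by_status)
```

The inner joins silently drop a project if either rate is missing for its month. Keep the LEFT JOINs from the query above if such projects should still show up (with a NULL converted value).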

mysql group by day and count then filter only the highest value for each day

I'm stuck on this query. I need to do a group by date, card_id and only show the highest hits. I have this data:
date card_name card_id hits
29/02/2016 Paul Stanley 1345 12
29/02/2016 Phil Anselmo 1347 16
25/02/2016 Dave Mustaine 1349 10
25/02/2016 Ozzy 1351 17
23/02/2016 Jhonny Cash 1353 13
23/02/2016 Elvis 1355 15
20/02/2016 James Hethfield 1357 9
20/02/2016 Max Cavalera 1359 12
My query at the moment
SELECT DATE(card.create_date) `day`, `name`,card_model_id, count(1) hits
FROM card
Join card_model ON card.card_model_id = card_model.id
WHERE DATE(card.create_date) >= DATE(DATE_SUB(NOW(), INTERVAL 1 MONTH)) AND card_model.preview = 0
GROUP BY `day`, card_model_id
;
I want to group by date and card_id, and keep only the row with the highest hits, showing one row per date. It is as if I ran max(hits) with a group by, but that doesn't work.
Like:
date card_name card_id hits
29/02/2016 Phil Anselmo 1347 16
25/02/2016 Ozzy 1351 17
23/02/2016 Elvis 1355 15
20/02/2016 Max Cavalera 1359 12
Any light on that will be appreciated. Thanks for reading.
Here is one way to do this. Based on your sample data (not the query):
select s.*
from sample s
where s.hits = (select max(s2.hits)
from sample s2
where date(s2.date) = date(s.date)
);
Your attempted query seems to have no relationship to the sample data, so it is unclear how to incorporate those tables (the attempted query has different columns and two tables).
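For reference, a minimal sketch of the correlated-subquery approach on the sample data, run in SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability only). The table name sample comes from the query above; the dates are kept as the dd/mm/yyyy strings from the question and compared as plain text, since only equality matters here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (date TEXT, card_name TEXT, card_id INTEGER, hits INTEGER);
INSERT INTO sample VALUES
  ('29/02/2016','Paul Stanley',1345,12),('29/02/2016','Phil Anselmo',1347,16),
  ('25/02/2016','Dave Mustaine',1349,10),('25/02/2016','Ozzy',1351,17),
  ('23/02/2016','Jhonny Cash',1353,13),('23/02/2016','Elvis',1355,15),
  ('20/02/2016','James Hethfield',1357,9),('20/02/2016','Max Cavalera',1359,12);
""")

# Keep a row only if its hits equal the maximum hits for the same day.
top_per_day = conn.execute("""
SELECT s.date, s.card_name, s.card_id, s.hits
FROM sample s
WHERE s.hits = (SELECT MAX(s2.hits)
                FROM sample s2
                WHERE s2.date = s.date)
ORDER BY s.card_id
""").fetchall()

for row in top_per_day:
    print(row)
```

Note that if two cards tie for the highest hits on the same day, this returns both rows for that day.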

MySQL Select Last n Rows For List of ID'S

Fixture Table
uid home_uid away_uid winner date season_division_uid
1 26 6 6 2013-07-30 18
2 8 21 8 2013-06-30 18
3 6 8 8 2013-06-29 18
4 21 26 21 2013-05-20 18
5 6 26 6 2013-04-19 18
This table contains hundreds of rows.
Currently I have a query to select all the teams in a division, i.e.
SELECT team_uid
FROM Season_Division_Team
WHERE season_division_uid='18'
which lists the rows of team uid's i.e. [6,26,8,21,26].
Now for each of the unique team ids, I would like to return the last 3 winner values, ordered by the date column, that they were involved in (they could be an away_uid or home_uid).
So the returned value example would be:
team_id winner date
6 6 2013-07-30
6 8 2013-06-29
6 26 2013-04-19
26 6 2013-07-30
26 21 2013-05-20
26 6 2013-04-19
Any ideas? Thank you
I'm not sure how to get it directly. A query like
select * from Season_division_Team where
`date` >= (select min(`date`) from
(select `date` from season_division_team order by `date` desc limit 3) t)
and (home_uid = 6 or away_uid = 6)
That's not going to be a good query, but it's the only way I can think of currently.
It's hard to get the 3rd largest value from SQL Example
The subquery is trying to get the date where the last win occurred, and then getting all dates after that where the team played.
EDIT:
SELECT * FROM Season_Division_Team WHERE winner = 6 ORDER BY `date` DESC LIMIT 3
that sounds more like your latter comment
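A different technique from the queries above, in case it helps: on MySQL 8.0 or later, ROW_NUMBER() can rank each team's fixtures by date and keep the latest 3 per team. Below is a minimal sketch on the sample rows, run in SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability only; it needs SQLite 3.25 or later). The table name fixture and the CTE names played and ranked are made up for this example, and the teams are taken from the fixtures themselves rather than from Season_Division_Team.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fixture (uid INTEGER, home_uid INTEGER, away_uid INTEGER,
                      winner INTEGER, date TEXT, season_division_uid INTEGER);
INSERT INTO fixture VALUES
  (1,26,6,6,'2013-07-30',18),(2,8,21,8,'2013-06-30',18),
  (3,6,8,8,'2013-06-29',18),(4,21,26,21,'2013-05-20',18),
  (5,6,26,6,'2013-04-19',18);
""")

# played: one row per (team, fixture), whether the team was home or away.
# ranked: numbers each team's fixtures from most recent (1) to oldest.
last_three = conn.execute("""
WITH played AS (
    SELECT home_uid AS team_id, winner, date
    FROM fixture WHERE season_division_uid = 18
    UNION ALL
    SELECT away_uid AS team_id, winner, date
    FROM fixture WHERE season_division_uid = 18
),
ranked AS (
    SELECT team_id, winner, date,
           ROW_NUMBER() OVER (PARTITION BY team_id ORDER BY date DESC) AS rn
    FROM played
)
SELECT team_id, winner, date
FROM ranked
WHERE rn <= 3
ORDER BY team_id, date DESC
""").fetchall()

for row in last_three:
    print(row)
```

With the sample rows, team 6's third row shows winner 6 for 2013-04-19, because that is what the fixture table contains (the expected output in the question lists 26 there).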