MySQL join with AVG calculation - mysql

Is this possible?
I have a table which stores articles.
I have another table which stores users' ratings of the articles; each row is a rating which references the article.
I want to get the average rating score for articles from the past 30 days:
SELECT `ex`.`article_id` , `ex`.`date` , `ex`.`title` , `r`.AVG(`rating`)
FROM `exclusive` `ex`
JOIN `ratings` `r` ON `ex`.`article_id` = `r`.`article_id`
WHERE `ex`.`date` > NOW( ) - INTERVAL 30 DAY
As you can see, I'm trying to reference the 'rating' with the AVG function, which is causing the issue. I think the rating needs to be calculated before the select is made, so I'm beginning to doubt whether it's possible.

You have to indicate how the data should be grouped, to indicate which groups to use for the average calculation, e.g.:
SELECT `ex`.`article_id` , `ex`.`title` , AVG(r.rating)
FROM `exclusive` `ex`
JOIN `ratings` `r` ON `ex`.`article_id` = `r`.`article_id`
WHERE `ex`.`date` > NOW( ) - INTERVAL 30 DAY
GROUP BY ex.article_id
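If you would rather compute the average before the join, as the question hints at, one option is to aggregate `ratings` in a derived table first. This is only a sketch using the table and column names from the question; the alias avg_rating is made up for illustration.

```sql
-- Aggregate ratings per article first, then join the result to the articles.
-- avg_rating is a hypothetical alias, not a column from the question.
SELECT ex.article_id, ex.`date`, ex.title, r.avg_rating
FROM exclusive ex
JOIN (
    SELECT article_id, AVG(rating) AS avg_rating
    FROM ratings
    GROUP BY article_id
) r ON r.article_id = ex.article_id
WHERE ex.`date` > NOW() - INTERVAL 30 DAY;
```

Because the outer query no longer aggregates, it needs no GROUP BY of its own. As with the plain JOIN, articles without any rating are left out.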

Keep in mind that in order to average the rating you should aggregate over its dimensions, mainly the date (here, the last 30 days as you request), and to do so you should avoid grouping by each article title and each date. Try this:
SELECT `ex`.`article_id` ,
-- note: ELSE 0 makes articles older than 30 days average to 0; ELSE NULL would leave them as NULL instead
AVG( CASE WHEN `ex`.`date` > NOW( ) - INTERVAL 30 DAY THEN `r`.`rating` ELSE 0 END ) AS 30_day_rating_average
FROM `exclusive` `ex` JOIN `ratings` `r` ON `ex`.`article_id` = `r`.`article_id`
GROUP BY 1
You can also get a column with article name, instead of just article id.

Your syntax is a little off:
Change
..., r.AVG(rating)
To
..., AVG(r.rating)
And add a group by clause at the end of your query:
...
GROUP BY 1, 2, 3
It should look like:
SELECT `ex`.`article_id` , `ex`.`date` , `ex`.`title` , AVG(`r`.`rating`)
FROM `exclusive` `ex`
JOIN `ratings` `r` ON `ex`.`article_id` = `r`.`article_id`
WHERE `ex`.`date` > NOW( ) - INTERVAL 30 DAY
GROUP BY 1, 2, 3

Related

MySQL - Display all dates including zero data nested join

I'm trying to display all dates in a month, and also in the reservation detail, I only have check_in_date and check_out_date, so I have to create left join inside a left join, below is my script
SELECT
*
FROM
(
SELECT
@dt:= DATE_ADD( @dt, interval 1 day ) myDate
FROM
(
SELECT
@dt := '2020-01-31'
) vars, tb_dummy
LIMIT 29
) JustDates
LEFT JOIN
(
SELECT
DATE_FORMAT(d.myDate2,'%Y-%m-%d') AS `myDate2`,
COALESCE(count(rdt.reservation_detail_id), 0) AS `RNS`,
FORMAT(SUM(rdt.subtotal_amount/COALESCE(DATEDIFF(DATE(DATE(rdt.check_out_date)), DATE(rdt.check_in_date)), 0)), 2) AS `REVENUE`,
FORMAT(SUM(rdt.subtotal_amount/COALESCE(DATEDIFF(DATE(DATE(rdt.check_out_date)), DATE(rdt.check_in_date)), 0))/COALESCE(count(rdt.reservation_detail_id), 0), 2) AS `AVGREV`
FROM
(
SELECT
@dt:= DATE_ADD( @dt, interval 1 day ) myDate2
FROM
(
SELECT
@dt := '2020-01-31'
) vars2, tb_dummy
LIMIT 29
) d
LEFT JOIN
tb_reservation_detail rdt
ON d.myDate2 BETWEEN DATE(rdt.check_in_date) AND DATE(DATE(rdt.check_out_date) - INTERVAL 1 DAY)
INNER JOIN
tb_reservation R
ON rdt.reservation_id = R.reservation_id
WHERE
rdt.reservation_status_id <> 3
AND
R.property_id = 57
GROUP BY d.myDate2
ORDER BY d.myDate2 ASC
) Resv
ON
JustDates.myDate = Resv.myDate2
ORDER BY
JustDates.myDate ASC
When I run it, it only returns dates from the left table, like this: Left join result
But when I change
SELECT
*
FROM
(
SELECT
@dt:= DATE_ADD( @dt, interval 1 day ) myDate
FROM
(
SELECT
@dt := '2020-01-31'
) vars, tb_dummy
LIMIT 29
) JustDates
**LEFT JOIN**
(
to
SELECT
*
FROM
(
SELECT
@dt:= DATE_ADD( @dt, interval 1 day ) myDate
FROM
(
SELECT
@dt := '2020-01-31'
) vars, tb_dummy
LIMIT 29
) JustDates
**RIGHT JOIN**
(
it returns data from the right table like this: Right join result
What is wrong with my code?
Welcome to Stack Overflow. I think your problem is that you don't quite understand the difference between RIGHT JOIN and LEFT JOIN. Check out this Stack Overflow post that goes over the differences.
As for displaying all of the dates in a month, here's a link to an answer I posted that I believe does what you want. In my answer I provide an example query containing a derived table you can select from and then LEFT JOIN your tables to, so that it shows every day in the month regardless of whether your tables hold data for a given day.
Hope this helps.
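As a rough illustration of the "derived table of dates plus LEFT JOIN" idea (not necessarily the query from the linked answer), the sketch below assumes MySQL 8.0 or later for the recursive CTE and reuses the table and column names from the question. The filters sit in the ON clauses rather than in WHERE, so days without any matching reservation still appear, with a count of 0.

```sql
-- Assumes MySQL 8.0+ (recursive CTEs). Generates 2020-02-01 .. 2020-02-29.
WITH RECURSIVE dates (myDate) AS (
    SELECT DATE('2020-02-01')
    UNION ALL
    SELECT myDate + INTERVAL 1 DAY
    FROM dates
    WHERE myDate < '2020-02-29'
)
SELECT
    d.myDate,
    COUNT(rdt.reservation_detail_id) AS RNS
FROM dates d
LEFT JOIN (
    -- Join detail rows to their reservation first, filtering by property here
    tb_reservation_detail rdt
    INNER JOIN tb_reservation R
        ON R.reservation_id = rdt.reservation_id
       AND R.property_id = 57
)
    ON (d.myDate BETWEEN DATE(rdt.check_in_date)
                     AND (DATE(rdt.check_out_date) - INTERVAL 1 DAY))
   AND rdt.reservation_status_id <> 3
GROUP BY d.myDate
ORDER BY d.myDate;
```

A condition on rdt or R placed in WHERE would be evaluated after the outer join and would discard the NULL-extended rows, which is why the filters are kept inside the join conditions here.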

mysql group by day with count multi types of records

I have a table with the columns id | type | publishedon.
type may be 1, 2, 3 or 4 (an int value).
I want to select the posts for every day.
Right now I'm using:
SELECT FROM_UNIXTIME( `publishedon` , "%Y-%m-%d" ) AS `day` , count( id ) AS listings,
TYPE FROM posts
WHERE (
FROM_UNIXTIME( publishedon ) >= SUBDATE( NOW( ) , 30 )
)
GROUP BY `day`
the result
day listings
2013-09-02 17
2013-09-05 105
I want to make the listings field more detailed, like:
day type_1 type_2 type_3 type_4
2013-09-02 10 4 6 3
2013-09-05 6 4 1 3
You simply need to put all your type values:
SELECT
FROM_UNIXTIME( `publishedon` , "%Y-%m-%d" ) AS `day`,
count(id) AS listings,
(SELECT COUNT(id) FROM `posts` WHERE `type`=1 AND FROM_UNIXTIME(`publishedon`, "%Y-%m-%d")=`day`) AS `type_1`,
(SELECT COUNT(id) FROM `posts` WHERE `type`=2 AND FROM_UNIXTIME(`publishedon`, "%Y-%m-%d")=`day`) AS `type_2`,
(SELECT COUNT(id) FROM `posts` WHERE `type`=3 AND FROM_UNIXTIME(`publishedon`, "%Y-%m-%d")=`day`) AS `type_3`,
(SELECT COUNT(id) FROM `posts` WHERE `type`=4 AND FROM_UNIXTIME(`publishedon`, "%Y-%m-%d")=`day`) AS `type_4`
FROM
`posts`
WHERE
FROM_UNIXTIME(`publishedon`) >= SUBDATE(NOW(), 30)
GROUP BY
`day`
In practice, though, that will be slow, since the conditions apply functions to the column. If it is only a matter of formatting, it's better to do this:
SELECT
FROM_UNIXTIME(`publishedon`, "%Y-%m-%d" ) AS `day`,
`type`,
count( id ) AS listings
FROM
`posts`
WHERE
-- the date bound is better computed in the application and compared
-- against the raw column, since this form cannot use an index either:
FROM_UNIXTIME(`publishedon`) >= SUBDATE(NOW(), 30)
GROUP BY
`day`,
`type`
and then create desired formatting inside application.
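If you do want the pivoted layout straight from MySQL, a middle ground is conditional aggregation, which reads the table once instead of running one subquery per type. A sketch, assuming publishedon holds a Unix timestamp (as the FROM_UNIXTIME calls suggest) and that the four type values are fixed:

```sql
SELECT
    FROM_UNIXTIME(`publishedon`, '%Y-%m-%d') AS `day`,
    COUNT(id)       AS listings,
    -- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matches
    SUM(`type` = 1) AS type_1,
    SUM(`type` = 2) AS type_2,
    SUM(`type` = 3) AS type_3,
    SUM(`type` = 4) AS type_4
FROM `posts`
-- Comparing the raw column leaves an index on publishedon usable
WHERE `publishedon` >= UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY)
GROUP BY `day`;
```

Days with no posts at all still produce no row; only the per-type breakdown changes.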

mysql get sum of hours / minutes / seconds

I have a table with userid and timestamp columns; each time a user opens a page, a new row is inserted.
I am trying to get the total number of hours / minutes / days / weeks that appear in a 1 month interval for multiple users.
I have tried a bunch of different queries, but each has ended up terribly inefficient.
Ideally I'd like to end up with something like:
userid | minutes | hours | days | weeks
1 10080 168 7 1
2 1440 24 1 0
Hopefully someone can shed some light on how to do this.
Below is a query that I tried:
SELECT
w.time AS `week`,
d.time AS `day`,
h.time AS `hour`,
m.time AS `minutes`
FROM (
SELECT
SUM( t.time ) AS `time`
FROM (
SELECT
COUNT( DISTINCT WEEK( `timestamp` ) ) AS `time`
FROM table
WHERE
userid = "1"
AND
`timestamp` > DATE_SUB( NOW( ) , INTERVAL 1 MONTH )
GROUP BY MONTH( `timestamp` )
) t
) w,
(
SELECT
SUM( t.time ) AS `time`
FROM (
SELECT
COUNT( DISTINCT DAY( `timestamp` ) ) AS `time`
FROM table
WHERE
userid = "52"
AND
`timestamp` > DATE_SUB( NOW( ) , INTERVAL 1 MONTH )
GROUP BY MONTH( `timestamp` )
) t
) d,
(
SELECT
SUM( t.timestamp ) AS `time`
FROM (
SELECT
COUNT( DISTINCT HOUR( `timestamp` ) ) AS `time`
FROM table
WHERE
userid = "1"
AND
`timestamp` > DATE_SUB( NOW( ) , INTERVAL 1 MONTH )
GROUP BY DAY( `timestamp` )
) t
) h,
(
SELECT
SUM( t.timestamp ) AS `time`
FROM (
SELECT
COUNT( DISTINCT MINUTE( `timestamp` ) ) AS `time`
FROM table
WHERE
userid = "1"
AND
`timestamp` > DATE_SUB( NOW( ) , INTERVAL 1 MONTH )
GROUP BY HOUR( `timestamp` )
) t
) m
It seems awfully excessive for this task, maybe someone has something better?
It's not clear to me what you want to "total".
If you want to determine whether a user had a "hit" (or whatever transaction it is you are storing in the table) at any given minute within the month, and then you want to count the number of "minute periods" within a month in which a user had a hit:
SELECT t.userid
, COUNT(DISTINCT DATE_FORMAT(t.timestamp,'%Y-%m-%d %H:%i')) AS minutes
, COUNT(DISTINCT DATE_FORMAT(t.timestamp,'%Y-%m-%d %H' )) AS hours
, COUNT(DISTINCT DATE_FORMAT(t.timestamp,'%Y-%m-%d' )) AS days
, COUNT(DISTINCT DATE_FORMAT(t.timestamp,'%X-%V' )) AS weeks
FROM mytable t
WHERE t.timestamp >= '2012-06-01'
AND t.timestamp < '2012-07-01'
GROUP BY t.userid
What this is doing is taking each timestamp, and putting it into a "bucket", by chopping off the seconds, chopping off the minutes, chopping off the time, etc.
Basically, we're taking a timestamp (e.g. '2012-07-25 23:15:30') and assigning it to
minute '2012-07-25 23:15'
hour '2012-07-25 23'
day '2012-07-25'
A timestamp of '2012-07-25 23:25:00' would get assigned to
minute '2012-07-25 23:25'
hour '2012-07-25 23'
day '2012-07-25'
Then we go through and count the number of distinct buckets we assigned a timestamp to. If that's all the hits for this user in the month, the query would return a 2 for minutes, and a 1 for all other period counts.
For a user with a single hit within the month, all the counts for that user will be a 1.
For a user that has all their "hits" within exactly the same minute, the query will again return a 1 for all the counts.
(For a user with no "hits" within a month, no row will be returned. You'd need to join another row source to get a list of users, if you wanted to return zero counts.)
For a user with a "hit" every second within a single day, this query will return counts like that shown for userid 2 in your example.
This result set gives you a kind of an indication of a user's activity for a month... how many "minute periods" within a month the user was active.
The largest value that could be returned for "days" would be the number of days in the month. The largest possible value to be returned for "hours" would be 24 times the number of days in the month. The largest possible value returned for "minutes" would be 1440 times the number of days in the month.
But again, it's not entirely clear to me what result set you want to return. But this seems like a much more reasonable result set than the one from the previously "selected" answer.
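For the zero-count case mentioned above, one possible shape is to drive the query from a list of users and LEFT JOIN the hits to it. The `users` table and its `userid` column are assumptions for illustration; they do not appear in the question.

```sql
-- users(userid) is a hypothetical table listing every user.
SELECT u.userid
     , COUNT(DISTINCT DATE_FORMAT(t.timestamp, '%Y-%m-%d %H:%i')) AS minutes
     , COUNT(DISTINCT DATE_FORMAT(t.timestamp, '%Y-%m-%d %H'))    AS hours
     , COUNT(DISTINCT DATE_FORMAT(t.timestamp, '%Y-%m-%d'))       AS days
     , COUNT(DISTINCT DATE_FORMAT(t.timestamp, '%X-%V'))          AS weeks
FROM users u
LEFT JOIN mytable t
       ON t.userid = u.userid
      -- The date range belongs in ON; in WHERE it would drop users with no hits
      AND t.timestamp >= '2012-06-01'
      AND t.timestamp <  '2012-07-01'
GROUP BY u.userid;
```

COUNT(DISTINCT ...) ignores NULLs, so a user with no hits in the range gets 0 in every column.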
SELECT userid, SUM(MINUTE(timestamp)) AS minutes, SUM(MINUTE(timestamp))/60 AS hours, SUM(MINUTE(timestamp))/(60*24) AS days, SUM(MINUTE(timestamp))/(60*24*7) AS weeks
FROM Table
GROUP BY userid
If necessary, use ROUND(SUM(MINUTE(timestamp)), 0) to get integer numbers.

Group by count()

I'm trying to make the following query work:
SELECT
DATE_FORMAT( date, '%Y %m' ) AS `Month`,
COUNT( schedule_id ) AS `Shifts`,
COUNT(user_id) AS `Users`
FROM
schedule
GROUP BY
`Month`, `Shifts`
It should give a frequency table stating how many users work a certain amount of shifts, per month (e.g. in Dec. there were 10 users working 20 shifts, 12 users working 15 shifts etc).
MySQL can't group on a COUNT() though, so the query breaks. How can I make this work?
Try this:
SELECT
`Month`, `Shifts`, COUNT(`User`) `Users`
FROM (
SELECT -- select nr of shifts per user
DATE_FORMAT( date, '%Y %m' ) AS `Month`,
user_id AS `User`,
COUNT( schedule_id ) AS `Shifts`
FROM
schedule
GROUP BY
`Month`, `User`
) s
GROUP BY `Month`, `Shifts`
Inner query returns month, user and shifts count. In outer query you can group by shifts.
Use a subquery to get the counts per some identifier (column id in the example), then join it with the original query:
SELECT ... FROM schedule sh JOIN ( SELECT id, COUNT( schedule_id ) AS Shifts FROM schedule GROUP BY id ) AS cnt ON cnt.id = sh.id GROUP BY ..., cnt.Shifts
SELECT
y
, m
, Shifts
, COUNT(*) AS Users
FROM
( SELECT
YEAR(date) AS y
, MONTH(date) AS m
, user_id
, COUNT(*) AS Shifts
FROM
schedule
GROUP BY
YEAR(date), MONTH(date), user_id
) AS grp
GROUP BY
y
, m
, Shifts

Mysql nested query optimization

I have a table that logs various transactions for a CMS. It logs the username, action, and time. I have written the following query to tell me how many transactions each user made in the past two days, but it is so slow that it's faster for me to send a bunch of separate queries at this point. Am I missing a fundamental rule for writing nested queries?
SELECT DISTINCT
`username`
, ( SELECT COUNT(*)
FROM `ActivityLog`
WHERE `username`=`top`.`username`
AND `time` > CURRENT_TIMESTAMP - INTERVAL 2 DAY
) as `count`
FROM `ActivityLog` as `top`
WHERE 1;
You could use:
SELECT username
, COUNT(*) AS count
FROM ActivityLog
WHERE time > CURRENT_TIMESTAMP - INTERVAL 2 DAY
GROUP BY username
An index on (username, time) would be helpful regarding speed.
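A sketch of creating such an index and checking whether it is picked up. The index name is made up for illustration, and whether the optimizer actually chooses it depends on your data and MySQL version.

```sql
-- idx_activitylog_user_time is a hypothetical name; pick your own.
CREATE INDEX idx_activitylog_user_time
    ON ActivityLog (`username`, `time`);

-- Inspect the plan to see which index, if any, the query uses.
EXPLAIN
SELECT username
     , COUNT(*) AS `count`
FROM ActivityLog
WHERE `time` > CURRENT_TIMESTAMP - INTERVAL 2 DAY
GROUP BY username;
```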
If you want users with 0 transactions (in the last 2 days) included, use this:
SELECT DISTINCT
act.username
, COALESCE(grp.cnt, 0) AS cnt
FROM ActivityLog act
LEFT JOIN
( SELECT username
, COUNT(*) AS cnt
FROM ActivityLog
WHERE time > CURRENT_TIMESTAMP - INTERVAL 2 DAY
GROUP BY username
) AS grp
ON grp.username = act.username
or, if you have a users table:
SELECT
u.username
, COALESCE(grp.cnt, 0) AS cnt
FROM users u
LEFT JOIN
( SELECT username
, COUNT(*) AS cnt
FROM ActivityLog
WHERE time > CURRENT_TIMESTAMP - INTERVAL 2 DAY
GROUP BY username
) AS grp
ON grp.username = u.username
Another way, similar to yours, would be:
SELECT username
, SUM(IF(time > CURRENT_TIMESTAMP - INTERVAL 2 DAY, 1, 0))
AS count
FROM ActivityLog
GROUP BY username
or even this (because true=1 and false=0 for MySQL):
SELECT username
, SUM(time > CURRENT_TIMESTAMP - INTERVAL 2 DAY)
AS count
FROM ActivityLog
GROUP BY username
No need for nesting...
SELECT `username`, COUNT(`username`) as `count` FROM `ActivityLog` WHERE `time` > CURRENT_TIMESTAMP - INTERVAL 2 DAY GROUP BY `username`
Also don't forget to add an INDEX on time if you want to make it even faster