MySQL MAX() and GROUP BY error: only_full_group_by - mysql

I have a question about a MySQL query that has been logging errors since I upgraded to MySQL 5.7.
The error is the "only_full_group_by" one, which is well covered on Stack Overflow.
Many answers advise against disabling this option and recommend fixing the SQL query instead.
The query I'm using returns the minimum and maximum values of a counter per hour:
SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
DATE_FORMAT(date_time, '%H:%i') AS dt
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY YEAR(date_time), MONTH(date_time), DAY(date_time), HOUR(date_time)
As I understand the error message, one of the items in the SELECT clause is missing from the GROUP BY clause. But however I reorder, remove, or add items, I can't get the result I got before the upgrade to MySQL 5.7.
I also tried wrapping the main query in a subquery, but I still can't recreate the results.
What am I missing?

MySQL isn't able to determine the functional dependence ... between the expressions in the GROUP BY clause, and the expressions in the SELECT list.
The non-aggregate expression in the SELECT list, DATE_FORMAT(date_time, '%H:%i'), includes a minutes component. The GROUP BY clause is going to collapse the rows into groups by just hour. So the value of the minutes is indeterminate... we know it's going to come from some row in the group, but there's no guarantee which one.
(The question's reference to ONLY_FULL_GROUP_BY seems to indicate that we've got some understanding of indeterminate values...)
The fix that needs the fewest changes would be to wrap that expression in a MIN or MAX function.
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, MIN(DATE_FORMAT(t.date_time,'%H:%i')) AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY YEAR(t.date_time)
, MONTH(t.date_time)
, DAY(t.date_time)
, HOUR(t.date_time)
ORDER
BY YEAR(t.date_time)
, MONTH(t.date_time)
, DAY(t.date_time)
, HOUR(t.date_time)
If we want rows returned in a particular order, we should include an ORDER BY clause, and not rely on MySQL-specific extension or behavior of GROUP BY (which may disappear in future releases.)
It's a bit odd to be doing a GROUP BY year, month, day and not including those values in the SELECT list. (It's not invalid to do that, just kind of strange.) The conditions in the WHERE clause guarantee that date_time spans no more than 24 hours.
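If the particular minute shown in dt genuinely does not matter, MySQL 5.7 also provides ANY_VALUE(), which tells the server to accept an arbitrary value from each group rather than reject the query. A minimal sketch, assuming the same table1(date_time, counter) columns as the question; unlike MIN or MAX, the value returned is not guaranteed to be the earliest or latest one in the hour:

```sql
-- Sketch only: ANY_VALUE() suppresses the ONLY_FULL_GROUP_BY rejection,
-- but which row's minutes end up in `dt` is left to the server.
SELECT MAX(t.counter) AS `max`
     , MIN(t.counter) AS `min`
     , ANY_VALUE(DATE_FORMAT(t.date_time,'%H:%i')) AS `dt`
  FROM table1 t
 WHERE t.date_time >= NOW() - INTERVAL 1 DAY
 GROUP
    BY YEAR(t.date_time)
     , MONTH(t.date_time)
     , DAY(t.date_time)
     , HOUR(t.date_time)
 ORDER
    BY YEAR(t.date_time)
     , MONTH(t.date_time)
     , DAY(t.date_time)
     , HOUR(t.date_time);
```

Wrapping in MIN or MAX is usually the better choice when a predictable label is wanted.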
My preference would be to do the GROUP BY on the same expression as the non-aggregate in the SELECT list. If I ever needed more than 24 hours, I'd include the date component:
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY
ORDER
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY
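--or--
If we always know it's just one day's worth of date_time, and we only want to return the hour, then the SELECT list can format just the hour. The GROUP BY below uses that same expression, plus the date-and-hour expression so that the ORDER BY can sort the groups chronologically.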
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, DATE_FORMAT(t.date_time,'%H:00') AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY DATE_FORMAT(t.date_time,'%H:00')
, DATE_FORMAT(t.date_time,'%Y-%m-%d %H')
ORDER
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H')
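To check whether a rewritten query passes the ONLY_FULL_GROUP_BY check before deploying it, the mode can be inspected and enabled for a single session. This is a rough sketch that assumes a scratch session where replacing the whole mode list is acceptable, and the same table1 columns as above:

```sql
-- Show the SQL modes active in the current session.
SELECT @@SESSION.sql_mode;

-- Assumption: a throwaway session, so overwriting the other modes is harmless.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- Accepted under the check: the non-aggregate SELECT expression is identical
-- to the GROUP BY expression, so its value is determined for every group.
SELECT MAX(t.counter) AS `max`
     , MIN(t.counter) AS `min`
     , DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') AS `dt`
  FROM table1 t
 WHERE t.date_time >= NOW() - INTERVAL 1 DAY
 GROUP
    BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00')
 ORDER
    BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00');
```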

SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
YEAR(date_time) AS g_year,
MONTH(date_time)AS g_month,
DAY(date_time) AS g_day,
HOUR(date_time) AS g_hour
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY g_year, g_month, g_day, g_hour
Or you can get rid of redundant data if you always do it for 1 day:
SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
DAY(date_time) AS g_day,
HOUR(date_time) AS g_hour
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY g_day, g_hour

Related

Avg function not returning proper value

I expect this query to give me the average number of daily active users to date, grouped by month (from October to December). But the result is approximately 164K when it should be 128K. Why is AVG not working? The average should be the SUM of the values divided by the number of days in the month so far.
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1
Ok guys thanks for your help. The problem was that on the subquery I was pulling the info by month and not by day. So I should pull the info by day there and group by month in the outer query. This finally worked:
SELECT EXTRACT(year_month FROM sq.day_month) AS month_year, AVG(number)
FROM (SELECT date(date) AS day_month,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY 1
) sq
GROUP BY month_year
Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error (although it works in older versions of MySQL).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.
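Putting these fixes together with the per-day grouping the asker describes, a query that averages daily active users per month might look like the sketch below. It assumes the table1(date, id_user) columns from the question; the aliases day_date and avg_daily_users are made up for illustration. Days with no rows at all do not appear in the inner result, so they are not counted in the divisor.

```sql
-- Inner query: one row per calendar day with its distinct-user count.
-- Outer query: average of those daily counts per month.
SELECT CONCAT(MONTHNAME(MIN(sq.day_date)), '-', YEAR(MIN(sq.day_date))) AS month_year
     , AVG(sq.number) AS avg_daily_users
  FROM ( SELECT DATE(`date`) AS day_date
              , COUNT(DISTINCT id_user) AS number
           FROM table1
          WHERE `date` >= '2020-10-01'
            AND `date` <  '2021-01-01'
          GROUP BY DATE(`date`)
       ) sq
 GROUP BY EXTRACT(YEAR_MONTH FROM sq.day_date)
 ORDER BY EXTRACT(YEAR_MONTH FROM sq.day_date);
```

The label is built from MIN(sq.day_date) so that every selected expression is either an aggregate or the grouping expression, which keeps the query valid under ONLY_FULL_GROUP_BY.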

SQL date not relative

I have a table in which I store, every 15 minutes, the result of a cron job: nothing more than a timestamp, a population number, and an id.
I am trying to query it as follows.
SELECT ROUND(AVG(`population`),0) AS population, DATE(`time`) AS date
FROM `swg_servertracker`
WHERE `time` >= DATE(NOW()) - INTERVAL 7 DAY
GROUP BY DATE(`time`)
DESC
LIMIT 7
It creates a daily average and grabs the last 7 entries. Sadly, they were not in the right order, so I flipped the sort to ascending. My problem is that when I reverse it (ASC), it skips today and goes back an extra day (today is the 3rd of October, which is left out when I use ascending order).
I also tried setting the WHERE clause to NOW() minus an interval of 168 hours (which is also 7 days, but relative to the current time), which made no difference. It still skips today and just goes back 7 days from yesterday.
SELECT ROUND(AVG(`population`),0) AS population, DATE(`time`) AS date
FROM `swg_servertracker`
WHERE `time` >= NOW() - INTERVAL 168 HOUR
GROUP BY DATE(`time`)
ASC
LIMIT 7
So is there a way I can take today in account as well?
You are selecting 8 records instead of 7. If you want only the 7 latest records, use the "greater than" sign instead of the "greater than or equal" sign.
SELECT ROUND(AVG(`population`),0) AS population, DATE(`time`) AS date
FROM `swg_servertracker`
WHERE `time` > NOW() - INTERVAL 7 DAY
GROUP BY DATE(`time`)
ASC
LIMIT 7
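Note that a strict comparison against NOW() can still touch eight calendar dates, because the partial day seven days back is included. One way around that is to bound the range on whole days so that only seven dates can match. A sketch, assuming the swg_servertracker columns from the question and using an explicit ORDER BY instead of the GROUP BY ... ASC shorthand (which MySQL 8.0 no longer accepts):

```sql
-- CURDATE() - INTERVAL 6 DAY is midnight six days ago, so the range covers
-- exactly seven calendar dates: today plus the six before it.
SELECT DATE(`time`) AS date
     , ROUND(AVG(`population`), 0) AS population
  FROM `swg_servertracker`
 WHERE `time` >= CURDATE() - INTERVAL 6 DAY
 GROUP BY DATE(`time`)
 ORDER BY DATE(`time`) ASC;
```

With the range limited this way, no LIMIT is needed and today is always part of the result when it has data.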
You can get the result set in a derived table and sort the results again.
Note that in MySQL, aliases defined in the SELECT clause can be used in the GROUP BY, HAVING, and ORDER BY clauses. So I have aliased DATE(time) to avoid repeating the expression in GROUP BY and ORDER BY.
You can do this instead:
SELECT dt.population,
dt.date1 AS date
FROM (
SELECT ROUND(AVG(`population`),0) AS population,
DATE(`time`) AS date1
FROM `swg_servertracker`
WHERE `time` >= DATE(NOW()) - INTERVAL 7 DAY
GROUP BY date1
ORDER BY date1 DESC
LIMIT 7
) AS dt
ORDER BY dt.date1 ASC
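The alias rule mentioned above also covers HAVING. As an illustration only, the sketch below filters on an aliased aggregate; the avg_population alias and the threshold of 100 are hypothetical and not part of the question:

```sql
-- date1 and avg_population are SELECT-list aliases reused in
-- GROUP BY, HAVING, and ORDER BY (a MySQL extension to standard SQL).
SELECT DATE(`time`) AS date1
     , ROUND(AVG(`population`), 0) AS avg_population
  FROM `swg_servertracker`
 WHERE `time` >= CURDATE() - INTERVAL 7 DAY
 GROUP BY date1
HAVING avg_population > 100   -- hypothetical threshold
 ORDER BY date1 ASC;
```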

Count number of entries in time interval 1 that appear in time interval 2 - SQL

I am new here and tried to look up the answer to my question but couldn't find anything on it. I am currently learning how to work with SQL queries and am wondering how I can count the number of unique values that appear in two time intervals.
I have two columns: one is the timestamp and the other is a customer id. What I want to do is check, for example, the number of customers that appear in time interval A, let's say January 2014 - February 2014. I then want to see how many of these also appear in another time interval that I specify, for example February 2014 - April 2014. If the total sample were 2 people who both bought something in January, while only one of them bought something else before the end of April, the count would be 1.
I am a total beginner and tried the query below, but it obviously won't return what I want: each entry has only one timestamp, so a single row cannot fall in both intervals.
SELECT
count(customer_id)
FROM db.table
WHERE time >= date('2014-01-01 00:00:00')
AND time < date('2014-02-01 00:00:00')
AND time >= date('2014-02-01 00:00:00')
AND time < date('2014-05-01 00:00:00')
;
Try this.
select count(distinct t.customer_id)
from db.table t
inner join db.table t1 on t1.customer_id = t.customer_id
and t1.time >= '2014-01-01 00:00:00' and t1.time < '2014-02-01 00:00:00'
where t.time >= '2014-02-01 00:00:00' and t.time < '2014-05-01 00:00:00'
Here's one method of doing this with conditional grouping in an inner-select.
Select Case
When GroupBy = 1 Then 'January - February 2014'
When GroupBy = 2 Then 'February - April 2014'
End As Period,
Count(Customer_Id) As Total
From
(
SELECT Customer_Id,
Case
When Time Between '2014-01-01' And '2014-02-01' Then 1
When Time Between '2014-02-01' And '2014-04-01' Then 2
Else -1
End As GroupBy
From db.Table
) D
Where GroupBy <> -1
Group By GroupBy
Edit: Sorry, misread the question. This will count the customers that appear in both time ranges:
Select Count(Distinct t1.Customer_Id)
From db.Table t1
Where Exists
(
Select Customer_Id
From db.Table t2
Where t1.customer_id = t2.customer_id
And t2.Time >= '2014-02-01' And t2.Time < '2014-05-01'
)
And t1.Time >= '2014-01-01' And t1.Time < '2014-02-01'
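Another way to express "appears in both intervals" is to group by customer once and test each interval in the HAVING clause. A sketch under the question's assumptions (a db.table with customer_id and time columns, January 2014 as the first interval and February through April 2014 as the second); the alias names are made up for illustration:

```sql
-- Keep only customers with at least one row in each interval,
-- then count how many customers are left.
SELECT COUNT(*) AS returning_customers
  FROM ( SELECT customer_id
           FROM db.table
          WHERE time >= '2014-01-01' AND time < '2014-05-01'
          GROUP BY customer_id
         HAVING SUM(CASE WHEN time >= '2014-01-01' AND time < '2014-02-01'
                         THEN 1 ELSE 0 END) > 0
            AND SUM(CASE WHEN time >= '2014-02-01' AND time < '2014-05-01'
                         THEN 1 ELSE 0 END) > 0
       ) both_periods;
```

The half-open ranges (>= start, < next start) keep a timestamp at exactly midnight on 2014-02-01 from landing in both intervals, and the table is read only once.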

Extract number of clicks month-wise

I am using MySQL. Here is my schema:
bannerstatclick(idBannerStats: integer, Time: Timestamp, idCampaignBanner :char(36))
I am trying to write a query to select the total number of clicks per month, using COUNT on idCampaignBanner.
The query below does not work; it fails with the error "Invalid use of group function".
I also tried using a HAVING clause, but that did not work either.
SELECT count(idCampaignBanner) AS TotalClicks ,max(`Time`) AS maxdate,(min(`Time`) + INTERVAL 30 DAY)as monthly
FROM newradium.BannerStatsClick
WHERE Time BETWEEN max(`Time`) AND ( max(`Time`)- INTERVAL 30 DAY)
Something like this should work (you need group by clause if you do aggregation)
select count(idCampaignBanner), MONTH(`Time`) as m
from newradium.BannerStatsClick
group by m
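One caveat with grouping on MONTH() alone is that the same month from different years lands in one bucket. If the table spans more than a year, a year-and-month key avoids that. A minimal sketch, assuming the BannerStatsClick columns from the question; the click_month alias is illustrative:

```sql
-- '%Y-%m' yields keys such as '2013-07', which also sort chronologically.
SELECT DATE_FORMAT(`Time`, '%Y-%m') AS click_month
     , COUNT(idCampaignBanner) AS TotalClicks
  FROM newradium.BannerStatsClick
 GROUP BY click_month
 ORDER BY click_month;
```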
SELECT
count(idCampaignBanner) AS TotalClicks
, max(`Time`) AS maxdate
, (min(`Time`) + INTERVAL 30 DAY)as monthly
FROM newradium.BannerStatsClick
WHERE Time <= (Select max(`Time`) FROM newradium.BannerStatsClick)
And Time >= (Select max(`Time`) - INTERVAL 30 DAY FROM newradium.BannerStatsClick)
Technically you could drop the "Time <= (Select max(Time) FROM newradium.BannerStatsClick)" condition, since it doesn't affect the selection. I left it in in case you need a different range in the future.

Average posts per hour on MySQL?

I have a number of posts saved in an InnoDB table on MySQL. The table has the columns "id", "date", "user", "content". I wanted to make some statistics graphs, so I ended up using the following query to get the number of posts per hour for yesterday:
SELECT HOUR(FROM_UNIXTIME(`date`)) AS `hour`, COUNT(date) from fb_posts
WHERE DATE(FROM_UNIXTIME(`date`)) = CURDATE() - INTERVAL 1 DAY GROUP BY hour
This outputs the following data:
I can edit this query to get any day I want. But what I want now is the AVERAGE of each hour of every day, so that if on Day 1 at 00 hours I have 20 posts and on Day 2 at 00 hours I have 40, I want the output to be "30". I'd like to be able to pick date periods as well if it's possible.
Thanks in advance!
You can use a sub-query to group the data by day/hour, then take the average by hour across the sub-query.
Here's an example to give you the average count by hour for the past 7 days:
select the_hour,avg(the_count)
from
(
select date(from_unixtime(`date`)) as the_day,
hour(from_unixtime(`date`)) as the_hour,
count(*) as the_count
from fb_posts
where `date` >= unix_timestamp(current_date() - interval 7 day)
and `date` < unix_timestamp(current_date())
group by the_day,the_hour
) s
group by the_hour
Aggregate the information by date and hour, and then take the average by hour:
select hour, avg(numposts)
from (SELECT date(FROM_UNIXTIME(`date`)) as day, HOUR(FROM_UNIXTIME(`date`)) AS `hour`,
count(*) as numposts
from fb_posts
WHERE DATE(FROM_UNIXTIME(`date`)) between <date1> and <date2>
GROUP BY date(FROM_UNIXTIME(`date`)), hour
) d
group by hour
order by 1
By the way, I prefer including the explicit ORDER BY, since most databases do not order the results of a GROUP BY. MySQL before 8.0 happens to be one database that does.
SELECT
HOUR(FROM_UNIXTIME(`date`)) AS `hour`
, COUNT(`id`) / COUNT(DISTINCT TO_DAYS(FROM_UNIXTIME(`date`))) AS avgHourlyPostCount
FROM fb_posts
WHERE `date` > UNIX_TIMESTAMP('2012-01-01') -- your optional date criteria
GROUP BY hour
This gives you a count of all the posts, divided by the number of days, by hour.
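Dividing by the number of distinct days seen in each hour ignores days on which that hour had no posts at all, which pushes quiet hours upward. If those empty days should count as zero, the divisor can be the length of the period instead. A sketch, assuming `date` holds a Unix timestamp as in the question; the January 2012 range is only an example:

```sql
-- DATEDIFF gives the number of days in the half-open range,
-- so every hour is averaged over the same number of days.
SELECT HOUR(FROM_UNIXTIME(`date`)) AS `hour`
     , COUNT(*) / DATEDIFF('2012-02-01', '2012-01-01') AS avgHourlyPostCount
  FROM fb_posts
 WHERE `date` >= UNIX_TIMESTAMP('2012-01-01')
   AND `date` <  UNIX_TIMESTAMP('2012-02-01')
 GROUP BY `hour`
 ORDER BY `hour`;
```

Hours with no posts anywhere in the period still produce no row; filling those in would need a separate list of hours to join against.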