Select rows from MySQL, grouping with MAX and MIN - mysql

I have the following table called 'ArchiveTable':
+---------+-----------------+---------+-----------------+--------------+------------------+
| maxtemp | maxtemptime | mintemp | mintemptime | minwindchill | minwindchilltime |
+---------+-----------------+---------+-----------------+--------------+------------------+
| 27.9 | 3/17/2015 16:55 | 25.8 | 3/17/2015 19:00 | 25.8 | 3/17/2015 19:00 |
+---------+-----------------+---------+-----------------+--------------+------------------+
| 25.7 | 3/17/2015 19:05 | 19.3 | 3/18/2015 9:05 | 19.3 | 3/18/2015 9:05 |
+---------+-----------------+---------+-----------------+--------------+------------------+
| 23.1 | 3/18/2015 19:05 | 18.7 | 3/19/2015 6:30 | 18.7 | 3/19/2015 6:30 |
+---------+-----------------+---------+-----------------+--------------+------------------+
I have to select the maximum value of 'maxtemp' and its corresponding 'maxtemptime' date, minimum value of 'mintemp' and its corresponding date, and minimum value of 'minwindchill' and its corresponding date.
I know how to obtain the max and min values with the MAX() and MIN() functions, but I cannot associate these values with the corresponding dates.

If you can accept the values on separate rows, then you could do something like this:
(select a.* from archivetable a order by maxtemp desc limit 1) union
(select a.* from archivetable a order by mintemp limit 1) union
. . .
Otherwise, you can do something like this:
select atmint.mintemp, atmint.mintemptime,
atmaxt.maxtemp, atmaxt.maxtemptime,
atminwc.minwindchill, atminwc.minwindchilltime
from (select min(mintemp) as mintemp, max(maxtemp) as maxtemp, min(minwindchill) as minwindchill
from archivetable
) a join
archivetable atmint
on atmint.mintemp = a.mintemp join
archivetable atmaxt
on atmaxt.maxtemp = a.maxtemp join
archivetable atminwc
on atminwc.minwindchill = a.minwindchill
limit 1;
The limit 1 is because multiple rows might have the same values. If so, you can arbitrarily choose one of them, based on how your question is phrased.
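As a rough, self-contained illustration of pairing each extreme with its own timestamp, the sketch below loads the three sample rows into an in-memory SQLite database. SQLite is an assumption made only so the example runs anywhere (the question is about MySQL), and the aliases hi, lo and wc are hypothetical. Each derived table sorts by one column and keeps a single row, so ties are resolved arbitrarily, just as with the limit 1 above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archivetable ("
    " maxtemp REAL, maxtemptime TEXT,"
    " mintemp REAL, mintemptime TEXT,"
    " minwindchill REAL, minwindchilltime TEXT)"
)
rows = [
    (27.9, "3/17/2015 16:55", 25.8, "3/17/2015 19:00", 25.8, "3/17/2015 19:00"),
    (25.7, "3/17/2015 19:05", 19.3, "3/18/2015 9:05", 19.3, "3/18/2015 9:05"),
    (23.1, "3/18/2015 19:05", 18.7, "3/19/2015 6:30", 18.7, "3/19/2015 6:30"),
]
conn.executemany("INSERT INTO archivetable VALUES (?, ?, ?, ?, ?, ?)", rows)

# One single-row derived table per extreme, combined into one result row.
query = """
SELECT hi.maxtemp, hi.maxtemptime,
       lo.mintemp, lo.mintemptime,
       wc.minwindchill, wc.minwindchilltime
FROM (SELECT maxtemp, maxtemptime FROM archivetable
      ORDER BY maxtemp DESC LIMIT 1) AS hi
CROSS JOIN (SELECT mintemp, mintemptime FROM archivetable
      ORDER BY mintemp ASC LIMIT 1) AS lo
CROSS JOIN (SELECT minwindchill, minwindchilltime FROM archivetable
      ORDER BY minwindchill ASC LIMIT 1) AS wc
"""
result = conn.execute(query).fetchone()
print(result)
```

Because every derived table returns exactly one row, the cross joins cannot multiply rows, so no outer limit is needed in this variant.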

See MySQL Handling of GROUP BY.
If I understood you correctly, you should do something like this:
SELECT field1, field2, fieldN, COUNT(field1) AS alias FROM table
GROUP BY field1
HAVING maxtemp = MAX(maxtemp); -- I think this is not correct
Although I'm not 100% sure about that solution, you could try this as well:
SELECT field1, field2, fieldN, COUNT(field1) AS alias FROM table
GROUP BY field1
HAVING maxtemp = (SELECT MAX(maxtemp) FROM table);

Related

Earliest time of daily maximum values

I have a table that logs weather data variables by datetime like this:
+------------------+------+-----
| LogDateTime | Temp | ...
+------------------+------+-----
| 2020-01-01 00:00 | 20.1 | ...
| 2020-01-01 00:05 | 20.1 | ...
| 2020-01-01 00:10 | 19.9 | ...
| 2020-01-01 00:15 | 19.8 | ...
+------------------+------+-----
From that table I want to return the earliest time of the maximum temperature for each day like this (just the time portion of the datetime value):
+------------+---------+---------+
| LogDate | LogTime | MaxTemp |
+------------+---------+---------+
| 2020-01-01 | 14:00 | 24.5 |
| 2020-01-02 | 15:12 | 23.2 |
| 2020-01-03 | 10:12 | 25.1 |
| 2020-01-04 | 12:14 | 28.8 |
+------------+---------+---------+
The query I have so far is below, but it returns the earliest temperature for each day instead of the earliest occurrence of the maximum temperature for each day:
SELECT TIME(a.LogDateTime), a.Temp
FROM Monthly a
INNER JOIN (
SELECT TIME(LogDateTime), LogDateTime, MAX(Temp) Temp
FROM Monthly
GROUP BY LogDateTime
) b ON a.LogDateTime = b.LogDateTime AND a.Temp= b.Temp
GROUP BY DATE(a.LogDateTime)
I then want to use that query to update a table with one row per day that summarises the minimum and maximum values. The update would look something like this, but it should set the time of the maximum rather than the maximum temperature itself:
UPDATE Dayfile AS d
JOIN (
SELECT DATE(LogDateTime) AS date, MAX(Temp) AS Temps
FROM Monthly
GROUP BY date
) AS m ON DATE(d.LogDate) = m.date
SET d.MaxTemp = m.Temps
Your version of MariaDB supports window functions, so use ROW_NUMBER():
select LogDateTime, Temp
from (
select *,
row_number() over (partition by date(LogDateTime) order by Temp desc, LogDateTime) rn
from Monthly
) t
where t.rn = 1
See a simplified demo.
Use it to update Dayfile like this:
update Dayfile d
inner join (
select LogDateTime, Temp
from (
select *,
row_number() over (partition by date(LogDateTime) order by Temp desc, LogDateTime) rn
from Monthly
) t
where t.rn = 1
) m on date(d.LogDate) = date(m.LogDateTime)
set d.MaxTemp = m.Temp
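To make the ordering inside ROW_NUMBER() concrete, here is a minimal pure-Python sketch of the same selection rule: within each day, the highest temperature wins, and on a tie the earliest timestamp wins. The readings list and the function name earliest_daily_max are hypothetical and are not taken from the question.

```python
from datetime import datetime

# Hypothetical sample readings (LogDateTime, Temp); not from the question.
readings = [
    ("2020-01-01 13:55", 24.1),
    ("2020-01-01 14:00", 24.5),
    ("2020-01-01 14:05", 24.5),  # same maximum, but later, so it must lose
    ("2020-01-02 15:12", 23.2),
    ("2020-01-02 15:17", 23.0),
]

def earliest_daily_max(rows):
    """Return {date: (time, temp)} for the first occurrence of each day's maximum."""
    best = {}
    for stamp, temp in rows:
        dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        day = dt.date()
        # Mirrors ORDER BY Temp DESC, LogDateTime: smaller key ranks first.
        key = (-temp, dt)
        if day not in best or key < best[day][0]:
            best[day] = (key, dt.time(), temp)
    return {
        day.isoformat(): (t.strftime("%H:%M"), temp)
        for day, (_, t, temp) in best.items()
    }

daily = earliest_daily_max(readings)
print(daily)
```

The row that ends up with rn = 1 in the SQL corresponds to the entry kept per day here.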

MySQL select last row each day

Trying to select last row each day.
This is my (simplified, more records in actual table) table:
+-----+-----------------------+------+
| id | datetime | temp |
+-----+-----------------------+------+
| 9 | 2017-06-05 23:55:00 | 9.5 |
| 8 | 2017-06-05 23:50:00 | 9.6 |
| 7 | 2017-06-05 23:45:00 | 9.3 |
| 6 | 2017-06-04 23:55:00 | 9.4 |
| 5 | 2017-06-04 23:50:00 | 9.2 |
| 4 | 2017-06-05 23:45:00 | 9.1 |
| 3 | 2017-06-03 23:55:00 | 9.8 |
| 2 | 2017-06-03 23:50:00 | 9.7 |
| 1 | 2017-06-03 23:45:00 | 9.6 |
+-----+-----------------------+------+
I want to select row with id = 9, id = 6 and id = 3.
I have tried this query:
SELECT MAX(datetime) Stamp
, temp
FROM weatherdata
GROUP
BY YEAR(DateTime)
, MONTH(DateTime)
, DAY(DateTime)
order
by datetime desc
limit 10;
But datetime and temp do not match.
Kind Regards
Here's one way, which gets the MAX datetime per day in a subquery and then uses it in the outer query to get the other fields:
SELECT *
FROM test
WHERE `datetime` IN (
SELECT MAX(`datetime`)
FROM test
GROUP BY DATE(`datetime`)
);
Here's the SQL Fiddle.
If your rows are always inserted and never updated, and if id is an autoincrementing primary key, then
SELECT w.*
FROM weatherdata w
JOIN ( SELECT MAX(id) id
FROM weatherdata
GROUP BY DATE(datetime)
) last ON w.id = last.id
will get you what you want. Why? The inner query returns the largest (meaning most recent) id value for each date in weatherdata. This can be very fast indeed, especially if you put an index on the datetime column.
But it's possible the conditions for this to work don't hold. If your datetime column sometimes gets updated to change the date, it's possible that larger id values don't always imply larger datetime values.
In that case you need something like this.
SELECT w.*
FROM weatherdata w
JOIN ( SELECT MAX(datetime) datetime
FROM weatherdata
GROUP BY DATE(datetime)
) last ON w.datetime = last.datetime
Your query doesn't work because it misuses the nasty nonstandard extension to MySQL GROUP BY. Read this: https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
It should, properly, use the ANY_VALUE() function to highlight the unpredictability of the results. It should read:
SELECT MAX(datetime) Stamp, ANY_VALUE(temp) temp
which means you aren't guaranteed the right row's temp value. Rather, it can return the temp value from any row in each day's grouping.
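The join on MAX(datetime) per day can be tried out with a small sketch like the one below. It uses Python's sqlite3 module with an in-memory database purely as a stand-in for MySQL (an assumption for portability), and the alias names latest and last_dt are hypothetical. The rows are the ones from the question, copied as posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weatherdata (id INTEGER PRIMARY KEY, datetime TEXT, temp REAL)"
)
conn.executemany(
    "INSERT INTO weatherdata VALUES (?, ?, ?)",
    [
        (9, "2017-06-05 23:55:00", 9.5),
        (8, "2017-06-05 23:50:00", 9.6),
        (7, "2017-06-05 23:45:00", 9.3),
        (6, "2017-06-04 23:55:00", 9.4),
        (5, "2017-06-04 23:50:00", 9.2),
        (4, "2017-06-05 23:45:00", 9.1),
        (3, "2017-06-03 23:55:00", 9.8),
        (2, "2017-06-03 23:50:00", 9.7),
        (1, "2017-06-03 23:45:00", 9.6),
    ],
)

# The derived table finds the latest timestamp per calendar day;
# joining back on it retrieves the whole matching row, temp included.
query = """
SELECT w.id, w.datetime, w.temp
FROM weatherdata AS w
JOIN (SELECT MAX(datetime) AS last_dt
      FROM weatherdata
      GROUP BY date(datetime)) AS latest
  ON w.datetime = latest.last_dt
ORDER BY w.datetime DESC
"""
last_rows = conn.execute(query).fetchall()
print(last_rows)
```

Since the whole row comes from the joined table, temp always belongs to the same row as the reported datetime.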

Optimizing SQL Query for max value with various conditions from a single MySQL table

I have the following SQL query
SELECT *
FROM `sensor_data` AS `sd1`
WHERE (sd1.timestamp BETWEEN '2017-05-13 00:00:00'
AND '2017-05-14 00:00:00')
AND (`id` =
(
SELECT `id`
FROM `sensor_data` AS `sd2`
WHERE sd1.mid = sd2.mid
AND sd1.sid = sd2.sid
ORDER BY `value` DESC, `id` DESC
LIMIT 1)
)
Background:
I've checked the validity of the query by changing LIMIT 1 to LIMIT 0, and the query works without any problem. However, with LIMIT 1 the query doesn't complete; it just shows as loading until I shut down and restart.
Breaking the Query down:
I have broken down the query with the date boundary as follows:
SELECT *
FROM `sensor_data` AS `sd1`
WHERE (sd1.timestamp BETWEEN '2017-05-13 00:00:00'
AND '2017-05-14 00:00:00')
This takes about 0.24 seconds to return the query with 8200 rows each having 5 columns.
Question:
I suspect the second half of my query is not correct or not well optimized.
The tables are as follows:
Current Table:
+------+-------+-------+-----+-----------------------+
| id | mid | sid | v | timestamp |
+------+-------+-------+-----+-----------------------+
| 51 | 10 | 1 | 40 | 2015-05-13 11:56:01 |
| 52 | 10 | 2 | 39 | 2015-05-13 11:56:25 |
| 53 | 10 | 2 | 40 | 2015-05-13 11:56:42 |
| 54 | 10 | 2 | 40 | 2015-05-13 11:56:45 |
| 55 | 10 | 2 | 40 | 2015-05-13 11:57:01 |
| 56 | 11 | 1 | 50 | 2015-05-13 11:57:52 |
| 57 | 11 | 2 | 18 | 2015-05-13 11:58:41 |
| 58 | 11 | 2 | 19 | 2015-05-13 11:58:59 |
| 59 | 11 | 3 | 58 | 2015-05-13 11:59:01 |
| 60 | 11 | 3 | 65 | 2015-05-13 11:59:29 |
+------+-------+-------+-----+-----------------------+
Q: How would I get the MAX(v) for each sid for each mid?
NB#1: In the example above, rows 53, 54 and 55 all have the same value (40), but I would like to retrieve the row with the most recent timestamp, which is row 55.
Expected Output:
+------+-------+-------+-----+-----------------------+
| id | mid | sid | v | timestamp |
+------+-------+-------+-----+-----------------------+
| 51 | 10 | 1 | 40 | 2015-05-13 11:56:01 |
| 55 | 10 | 2 | 40 | 2015-05-13 11:57:01 |
| 56 | 11 | 1 | 50 | 2015-05-13 11:57:52 |
| 58 | 11 | 2 | 19 | 2015-05-13 11:58:59 |
| 60 | 11 | 3 | 65 | 2015-05-13 11:59:29 |
+------+-------+-------+-----+-----------------------+
NB#2:
Since this table has over 110 million entries, it is critical to have date boundaries, which limit the query to ~8000 entries over a 24 hour period.
The query can be written as follows:
SELECT t1.id, t1.mid, t1.sid, t1.v, t1.ts
FROM yourtable t1
INNER JOIN (
SELECT mid, sid, MAX(v) as v
FROM yourtable
WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
GROUP BY mid, sid
) t2
ON t1.mid = t2.mid
AND t1.sid = t2.sid
AND t1.v = t2.v
INNER JOIN (
SELECT mid, sid, v, MAX(ts) as ts
FROM yourtable
WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
GROUP BY mid, sid, v
) t3
ON t1.mid = t3.mid
AND t1.sid = t3.sid
AND t1.v = t3.v
AND t1.ts = t3.ts;
Edit and Explanation:
The first sub-query (first INNER JOIN) fetches MAX(v) per (mid, sid) combination. The second sub-query is to identify MAX(ts) for every (mid, sid, v). At this point, the two queries do not influence each other's results. It is also important to note that ts date range selection is done in the two sub-queries independently such that the final query has fewer rows to examine and no additional WHERE filters to apply.
Effectively, this translates into getting MAX(v) per (mid, sid) combination initially (first sub-query); and if there is more than one record with the same value MAX(v) for a given (mid, sid) combo, then the excess records get eliminated by the selection of MAX(ts) for every (mid, sid, v) combination obtained by the second sub-query. We then simply associate the output of the two queries by the two INNER JOIN conditions to get to the id of the desired records.
Demo
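The selection rule described above (highest v per (mid, sid), ties broken by the latest timestamp) can be checked against the sample table with a short pure-Python sketch. The function name groupwise_max is hypothetical; the rows are the ones listed in the question.

```python
# Sample rows from the question: (id, mid, sid, v, timestamp).
rows = [
    (51, 10, 1, 40, "2015-05-13 11:56:01"),
    (52, 10, 2, 39, "2015-05-13 11:56:25"),
    (53, 10, 2, 40, "2015-05-13 11:56:42"),
    (54, 10, 2, 40, "2015-05-13 11:56:45"),
    (55, 10, 2, 40, "2015-05-13 11:57:01"),
    (56, 11, 1, 50, "2015-05-13 11:57:52"),
    (57, 11, 2, 18, "2015-05-13 11:58:41"),
    (58, 11, 2, 19, "2015-05-13 11:58:59"),
    (59, 11, 3, 58, "2015-05-13 11:59:01"),
    (60, 11, 3, 65, "2015-05-13 11:59:29"),
]

def groupwise_max(rows):
    """Keep one row per (mid, sid): highest v, then most recent timestamp."""
    best = {}
    for row in rows:
        _id, mid, sid, v, ts = row
        group = (mid, sid)
        # Tuple comparison: v decides first, the timestamp only breaks ties.
        # ISO-formatted timestamps compare correctly as plain strings.
        if group not in best or (v, ts) > (best[group][3], best[group][4]):
            best[group] = row
    return [best[group] for group in sorted(best)]

winners = groupwise_max(rows)
print([row[0] for row in winners])  # ids 51, 55, 56, 58, 60
```

This matches the expected output table in the question, including row 55 winning over rows 53 and 54.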
select * from sensor_data s1 where s1.v in (select max(v) from sensor_data s2 group by s2.mid)
union
select * from sensor_data s1 where s1.v in (select max(v) from sensor_data s2 group by s2.sid);
IN ( SELECT ... ) does not optimize well. It is even worse because of being correlated.
What you are looking for is a groupwise-max.
Please provide SHOW CREATE TABLE; we need to know at least what the PRIMARY KEY is.
Suggested code
You will need:
With the WHERE: INDEX(timestamp, mid, sid, v, id)
Without the WHERE: INDEX(mid, sid, v, timestamp, id)
Code:
SELECT id, mid, sid, v, timestamp
FROM ( SELECT @prev_mid := 99999, -- some value not in table
@prev_sid := 99999,
@n := 0 ) AS init
JOIN (
SELECT @n := if(mid != @prev_mid OR
sid != @prev_sid,
1, @n + 1) AS n,
@prev_mid := mid,
@prev_sid := sid,
id, mid, sid, v, timestamp
FROM sensor_data
WHERE timestamp >= '2017-05-13'
AND timestamp < '2017-05-13' + INTERVAL 1 DAY
ORDER BY mid DESC, sid DESC, v DESC, timestamp DESC
) AS x
WHERE n = 1
ORDER BY mid, sid; -- optional
Notes:
The index is 'composite' and 'covering'.
This should make one pass over the index, thereby providing 'good' performance.
The final ORDER BY is optional; the results may be in reverse order.
All the DESC in the inner ORDER BY must be in place to work correctly (unless you are using MySQL 8.0).
Note how the WHERE avoids including both midnights? And avoids manually computing leap-days, year-ends, etc?
With the WHERE (and associated INDEX), there will be filtering, but a 'sort' will still be needed.
Without the WHERE (and the other INDEX), sort will not be needed.
You can test the performance of any competing formulations via this trick, even if you do not have enough rows (yet) to get reliable timings:
FLUSH STATUS;
SELECT ...
SHOW SESSION STATUS LIKE 'Handler%';
This can also be used to compare different versions of MySQL and MariaDB -- I have seen 3 significantly different performance characteristics in a related groupwise-max test.

How to query avg for every past 7 days in sql, MySQL?

Say I have a dataset of :
|dateid | value |
|20150101 | 1 |
|20150102 | 2 |
|20150103 | 3.1 |
|20150104 | 4.3 |
|20150105 | 3.1 |
|20150106 | 1 |
|20150107 | 1 |
|20150108 | 1 |
|.... | |
|.... | ... |
|20151001 | 10.3|
I want to query the average of every past 7 days based on a date range.
Say the range is dateid from 20150707 to 20150730. When I select the row for 20150707, I also need the average value between 20150701 and 20150707 ((1+2+3.1+4.3+1+1+1+1)/7) as well as the value for 20150707 (1), like:
select dateid, value , avg(value) as avg_past_7 from mytable where dateid between 20150707 and 20150730 GROUP BY every past_7days.
And when the records are less than 7 rows to count, the avg remains null.
That means if I only have records from 20150707-20150730 in the table, the past_7_day avg for 20150707/8/9/10/11/12 remains null.
Correlated sub-select:
select dateid, value, (select avg(value) from mytable t2
where t2.dateid between (DATE_SUB(date(t1.dateid),INTERVAL 6 day)+0)
and t1.dateid) as avg_past_7
from mytable t1
where dateid between 20150101 and 20150201 order by dateid;
Use Date_SUB With Interval of 7 Days
I solved the problem with:
select t1.dateid, t1.value, if(count(1)>=7,avg(t2.value),null)
from mytable t1 , mytable t2
where t2.dateid between DATE_SUB(date(t1.dateid),INTERVAL 6 day)+0 and t1.dateid and
t1.dateid between 20150105 and 20150201
group by t1.dateid ,t1.value
order by dateid;
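The "average over the past 7 days, or null when fewer than 7 rows exist" rule can be sketched in plain Python as below. This is only an illustration of the rule, not a replacement for the SQL; the helper names to_date and trailing_averages are hypothetical, and the rows are the first eight from the question's dataset.

```python
from datetime import datetime, timedelta

# First eight sample rows from the question: (dateid, value).
rows = [
    (20150101, 1), (20150102, 2), (20150103, 3.1), (20150104, 4.3),
    (20150105, 3.1), (20150106, 1), (20150107, 1), (20150108, 1),
]

def to_date(dateid):
    """Convert an integer such as 20150101 into a date object."""
    return datetime.strptime(str(dateid), "%Y%m%d").date()

def trailing_averages(rows, window=7):
    """Map each dateid to the mean of its last `window` days, or None if incomplete."""
    by_day = {to_date(dateid): value for dateid, value in rows}
    result = {}
    for dateid, _ in rows:
        day = to_date(dateid)
        # The window covers the current day and the six days before it.
        days = (day - timedelta(days=k) for k in range(window))
        values = [by_day[d] for d in days if d in by_day]
        result[dateid] = sum(values) / window if len(values) == window else None
    return result

averages = trailing_averages(rows)
print(averages[20150106], averages[20150107])
```

With this data the first six days have no average because fewer than seven rows are available, which is the behaviour the count(1)>=7 check produces in the query above.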

Update with SUM and LIMIT, rolling SUM

I have 2 tables, SVISE and OVERW
Inside OVERW I have some scores with person ids and the date of that score.
E.g
p_id degrees mo_date
5 10.2 2013-10-09
5 9.85 2013-03-10
8 14.75 2013-04-25
8 11.00 2013-02-22
5 5.45 2013-08-11
5 6.2 2013-06-10
SVISE.ofh field must be updated with the sum of the last three records
(for a specific person, ordered by date descending), so for person with id 5, the sum would result from the rows
5 10.2 2013-10-09
5 5.45 2013-08-11
5 6.2 2013-06-10
sum=21.85.
Desired final result on SVISE, based on the values above:
HID OFH START
5 21.85 October, 16 2013 ##(10.2 + 5.45 + 6.2)
5 21.5 September, 07 2013 ##(5.45 + 6.2 + 9.85)
5 0 March, 05 2013 ##(no rows)
8 25.75 October, 14 2013 ##(14.75 + 11)
3 0 October, 14 2013 ##(no rows)
5 0 March, 05 2012 ##(no rows)
OFH was 0 initially.
I can get the total sum for a specific person, but I can't use limit to get the last 3 rows. It gets ignored.
This is the query I use to retrieve the sum of all degrees per person for a given date:
UPDATE SVISE SV
SET
SV.ofh=(SELECT sum(degrees) FROM OVERW WHERE p_id =SV.hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start))
I cannot just use limit with sum:
UPDATE SVISE SV
SET
SV.ofh=(SELECT sum(degrees) FROM OVERW WHERE p_id =SV.hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start)
ORDER BY mo_date DESC
LIMIT 3)
This does not work.
I have tried with multi-table updates and with nested queries to achieve this.
Every scenario has known limitations that block me from accomplishing the desired result.
Nested queries can't see the parent table: Unknown column 'SV.hid' in 'where clause'
Multi-table updates can't be used with LIMIT: Incorrect usage of UPDATE and LIMIT
Any solution will do. There is no need to do it in a single query. If anyone wants to try even with an intermediate table.
An SQL fiddle is also available.
Thanks in advance for your help.
--Update--
Here is the solution from Akash: http://sqlfiddle.com/#!2/4cf1a/1
This should work,
UPDATED to have a join on svice
UPDATE
svice SV
JOIN (
SELECT
hid,
start,
sum(degrees) as degrees
FROM
(
SELECT
*,
IF(@prev_row <> unix_timestamp(start)+P_ID, @row_number:=0,NULL),
@prev_row:=unix_timestamp(start)+P_ID,
@row_number:=@row_number+1 as row_number
FROM
(
SELECT
mo_date,
p_id,
hid,
start,
degrees
FROM
OVERW
JOIN svice sv ON ( p_id = hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start) )
ORDER BY
hid,
start,
mo_date desc
) sub_query1
JOIN ( select @row_number:=0, @prev_row:=0 ) sub_query2
) sub_query
where
row_number <= 3
GROUP BY
hid,
start
) sub_query ON ( sub_query.hid = sv.hid AND sub_query.start = sv.start )
SET
SV.ofh = sub_query.degrees
Note: Check this with your updated data, the test data provided could not yield the results you expected due to the date conditions
Try
UPDATE svice SV
JOIN (SELECT SUM(degrees) sumdeg, p_id
FROM (SELECT DISTINCT degrees, p_id
FROM OVERW, svice
WHERE OVERW.p_id IN (SELECT svice.hid FROM svice)
AND date(mo_date) < date(svice.start)
AND year(mo_date) = year(svice.start)
ORDER BY mo_date DESC) deg
GROUP BY p_id) bbc
ON bbc.p_id = SV.hid
SET
SV.ofh = bbc.sumdeg WHERE p_id = SV.hid
http://sqlfiddle.com/#!2/95b42/42
Getting closer, now it "only" needs a limit in GROUP BY.
Two assumptions:
You can figure out how to turn this into an update, and
A PK exists on (p_id, mo_date)
Then you can do this -
SELECT p_id
, SUM(degrees) ttl
FROM
( SELECT x.*
FROM overw x
JOIN overw y
ON y.p_id = x.p_id
AND y.mo_date >= x.mo_date
GROUP
BY x.p_id
, x.mo_date HAVING COUNT(*) <= 3
) a
GROUP
BY p_id;
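As a cross-check for the expected values in the question, the following pure-Python sketch computes the sum of the three most recent scores per SVISE row, using the same conditions as the original subquery (earlier than start, same year as start). The function name last_three_sum is hypothetical, and Decimal is used only to avoid floating-point noise in the comparison.

```python
from datetime import date
from decimal import Decimal

# OVERW rows from the question: (p_id, degrees, mo_date).
overw = [
    (5, Decimal("10.2"), date(2013, 10, 9)),
    (5, Decimal("9.85"), date(2013, 3, 10)),
    (8, Decimal("14.75"), date(2013, 4, 25)),
    (8, Decimal("11.00"), date(2013, 2, 22)),
    (5, Decimal("5.45"), date(2013, 8, 11)),
    (5, Decimal("6.2"), date(2013, 6, 10)),
]

# SVISE rows from the question: (hid, start).
svise = [
    (5, date(2013, 10, 16)),
    (5, date(2013, 9, 7)),
    (5, date(2013, 3, 5)),
    (8, date(2013, 10, 14)),
    (3, date(2013, 10, 14)),
    (5, date(2012, 3, 5)),
]

def last_three_sum(hid, start):
    """Sum the three latest scores for hid that precede start within start's year."""
    eligible = sorted(
        (mo_date, degrees)
        for p_id, degrees, mo_date in overw
        if p_id == hid and mo_date < start and mo_date.year == start.year
    )
    # sorted() is ascending by date, so the last three entries are the most recent.
    return sum((degrees for _, degrees in eligible[-3:]), Decimal("0"))

ofh = [last_three_sum(hid, start) for hid, start in svise]
print(ofh)
```

The results line up with the desired OFH column in the question: 21.85, 21.5, 0, 25.75, 0 and 0.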
Maybe I'm slow, but let's ignore svice for now.
Can you show the correct result and the working for each row below...
+------+---------+------------+--------+
| p_id | degrees | mo_date | result |
+------+---------+------------+--------+
| 5 | 6.20 | 2013-06-10 | ? |
| 5 | 5.45 | 2013-08-11 | ? |
| 5 | 10.20 | 2013-10-09 | 21.85 | <- = 10.2+5.45+6.2 = 21.85
| 8 | 14.75 | 2013-04-25 | ? |
| 5 | 9.85 | 2013-03-10 | ? |
| 8 | 11.00 | 2013-02-22 | ? |
+------+---------+------------+--------+