I have 3 very large tables, with values logged every minute;
below is an extract of each table.
I would like to get hourly averages over a period of 1 day from these tables and join them with respect to time. Please note that there is a gap of a couple of seconds between the log times for ph and temperature.
Table PH (extract only, this table is very large with more than 130,000 values)
ID time Ph
72176 2013-04-06 03:29:34 7.58
72177 2013-04-06 03:30:34 7.58
72178 2013-04-06 03:31:34 7.54
72179 2013-04-06 03:32:34 7.58
72180 2013-04-06 03:33:34 7.58
72181 2013-04-06 03:34:34 7.58
72182 2013-04-06 03:35:34 7.54
72183 2013-04-06 03:36:34 7.58
72184 2013-04-06 03:37:34 7.54
72185 2013-04-06 03:38:34 7.58
72186 2013-04-06 03:39:34 7.58
Table temperature1 (extract only, this table is very large with more than 130,000 values)
ID time temperature
133312 2013-04-06 03:29:36 25.37
133313 2013-04-06 03:30:36 25.37
133314 2013-04-06 03:31:36 25.37
133315 2013-04-06 03:32:36 25.31
133316 2013-04-06 03:33:36 25.31
133317 2013-04-06 03:34:36 25.31
133318 2013-04-06 03:35:36 25.37
133319 2013-04-06 03:36:36 25.31
133320 2013-04-06 03:37:36 25.31
133321 2013-04-06 03:38:36 25.31
133322 2013-04-06 03:39:36 25.37
Table solids (extract only, this table is very large with more than 130,000 values)
ID time solids
123791 2013-04-06 03:29:49 140
123792 2013-04-06 03:30:49 140
123793 2013-04-06 03:31:49 143
123794 2013-04-06 03:32:49 140
123795 2013-04-06 03:33:49 140
123796 2013-04-06 03:34:49 140
123797 2013-04-06 03:35:49 140
123798 2013-04-06 03:36:49 143
123799 2013-04-06 03:37:49 140
123800 2013-04-06 03:38:49 140
123801 2013-04-06 03:39:49 140
I am currently getting hourly averages using the query below
SELECT DATE_FORMAT(x.time,'%Y-%m-%d %H:00:00')
, avg(x.solids) avg_solids
FROM solids x where time >= NOW() - INTERVAL 1 DAY
GROUP
BY DATE_FORMAT(x.time,'%Y-%m-%d %H:00:00');
How can I efficiently join (with respect to time) the results of the query above for each of the 3 sensors, so that they are displayed in 1 table?
===============================
The query below gets the hourly values, but I am not sure how to tweak it to get averages per hour:
SELECT DATE_FORMAT(timeTable.minuteTime, '%Y-%m-%d %k:%i') time,
(oT2.temperature) temperature,
(T2.temperature) temp,
(S2.solids) solids,
(P2.Ph) Ph
FROM
(
SELECT minuteTime.minuteTime minuteTime,
( SELECT MAX(time) FROM outside_temperature WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) otempTime,
( SELECT MAX(time) FROM temperature1 WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) tempTime,
( SELECT MAX(time) FROM Ph WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) phTime,
( SELECT MAX(time) FROM solids WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) solidsTime
FROM
(
SELECT DATE(time) + INTERVAL (HOUR(time) DIV 1 *1 ) HOUR minuteTime
FROM Ph
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 1 *1) HOUR
FROM solids
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 1 *1) HOUR
FROM outside_temperature
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 1 *1) HOUR
FROM temperature1
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
GROUP BY 1
) minuteTime
) timeTable
LEFT JOIN outside_temperature oT2 ON oT2.time = timeTable.otempTime
LEFT JOIN temperature1 T2 ON T2.time = timeTable.tempTime
LEFT JOIN solids S2 ON S2.time = timeTable.solidsTime
LEFT JOIN Ph P2 ON P2.time = timeTable.phTime
GROUP BY DATE_FORMAT(timeTable.minuteTime, '%Y-%m-%d %k:%i')
ORDER BY minuteTime ASC
HOUR() seems like it would be a useful function for this, since you are only looking at a single day. Perhaps something like this would work for you:
SELECT * FROM
(SELECT HOUR(time) hour, avg(ph) AS avg_ph
FROM ph
WHERE time >= NOW() - INTERVAL 1 DAY
GROUP BY hour) p
JOIN
(SELECT HOUR(time) hour, avg(temperature) AS avg_temp
FROM temperature1
WHERE time >= NOW() - INTERVAL 1 DAY
GROUP BY hour) t ON t.hour = p.hour
JOIN
(SELECT HOUR(time) hour, avg(solids) AS avg_solids
FROM solids
WHERE time >= NOW() - INTERVAL 1 DAY
GROUP BY hour) s ON s.hour = p.hour;
Since it uses inner joins, I'm assuming that there will always be at least one record in each table for every hour, which seems like a reasonable assumption.
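One thing to watch with HOUR() alone: it only returns 0 to 23, so a rolling 24-hour window can put rows from two different calendar days into the same bucket. A minimal sketch of an alternative, keyed on the full date and hour and using LEFT JOINs so a sensor with no rows in some hour does not drop that hour. It runs on SQLite through Python only so it can be tried anywhere; strftime() stands in for MySQL's DATE_FORMAT(), the sample rows are made up, and the `time >= NOW() - INTERVAL 1 DAY` filter is left out because the data is static:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ph (time TEXT, ph REAL);
    CREATE TABLE temperature1 (time TEXT, temperature REAL);
    CREATE TABLE solids (time TEXT, solids REAL);
    -- illustrative rows only, not the real tables
    INSERT INTO ph VALUES
        ('2013-04-06 03:29:34', 7.58), ('2013-04-06 03:30:34', 7.54),
        ('2013-04-06 04:01:34', 7.50);
    INSERT INTO temperature1 VALUES
        ('2013-04-06 03:29:36', 25.37), ('2013-04-06 03:30:36', 25.31),
        ('2013-04-06 04:01:36', 25.25);
    -- solids has no rows in the 04:00 hour, to exercise the LEFT JOIN
    INSERT INTO solids VALUES
        ('2013-04-06 03:29:49', 140), ('2013-04-06 03:30:49', 143);
""")

# Bucket key: full date plus hour (MySQL: DATE_FORMAT(time, '%Y-%m-%d %H:00:00'))
HOUR = "strftime('%Y-%m-%d %H:00:00', time)"

query = f"""
    SELECT h.hour, p.avg_ph, t.avg_temp, s.avg_solids
    FROM (SELECT {HOUR} AS hour FROM ph
          UNION SELECT {HOUR} FROM temperature1
          UNION SELECT {HOUR} FROM solids) h
    LEFT JOIN (SELECT {HOUR} AS hour, AVG(ph) AS avg_ph
               FROM ph GROUP BY hour) p ON p.hour = h.hour
    LEFT JOIN (SELECT {HOUR} AS hour, AVG(temperature) AS avg_temp
               FROM temperature1 GROUP BY hour) t ON t.hour = h.hour
    LEFT JOIN (SELECT {HOUR} AS hour, AVG(solids) AS avg_solids
               FROM solids GROUP BY hour) s ON s.hour = h.hour
    ORDER BY h.hour
"""
hourly_rows = conn.execute(query).fetchall()
for row in hourly_rows:
    print(row)
```

Each table is aggregated on its own before joining, so the few seconds of offset between the sensors never matter: only the hour bucket is compared.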
I'm looking for a SQL query that returns all of the free time intervals within a given range, for a table with two datetime columns (DATE_FROM, DATE_TILL). As a requirement: the existing entries never overlap, and the only acceptable distance between two consecutive intervals is 1 second.
I have found a solution, but it doesn't meet all my requirements, especially the one where I want to supply a start and end datetime and have the missing intervals calculated against them.
Here is my datatable:
ROW_ID LOCATION_ID DATE_FROM DATE_TILL
1 193 2019-02-01 00:00:00 2019-12-31 23:59:59
2 193 2020-02-01 00:00:00 2020-12-31 23:59:59
3 193 2021-01-01 00:00:00 2021-12-31 23:59:59
4 193 2022-01-01 00:00:00 2022-12-31 23:59:59
5 204 2020-01-01 00:00:00 2021-12-31 23:59:59
And this is my SQL query, which comes from another solution on this platform, with some changes I made for my requirements.
SELECT DATE_ADD(DATE_TILL,INTERVAL 1 SECOND) AS GAP_FROM, DATE_SUB(DATE_FROM,INTERVAL 1 SECOND) AS GAP_TILL
FROM
(
SELECT DISTINCT DATE_FROM, ROW_NUMBER() OVER (ORDER BY DATE_FROM) RN
FROM overlappingtable T1
WHERE
LOCATION_ID = 193 AND
NOT EXISTS (
SELECT *
FROM overlappingtable T2
WHERE T1.DATE_FROM > T2.DATE_FROM AND T1.DATE_FROM < T2.DATE_TILL
)
) T1
JOIN (
SELECT DISTINCT DATE_TILL, ROW_NUMBER() OVER (ORDER BY DATE_TILL) RN
FROM overlappingtable T1
WHERE
LOCATION_ID = 193 AND
NOT EXISTS (
SELECT *
FROM overlappingtable T2
WHERE T1.DATE_TILL > T2.DATE_FROM AND T1.DATE_TILL < T2.DATE_TILL
)
) T2
ON T1.RN - 1 = T2.RN
WHERE
DATE_ADD(DATE_TILL,INTERVAL 1 SECOND) < DATE_FROM
This query gives me this result:
GAP_FROM GAP_TILL
2020-01-01 00:00:00 2020-01-31 23:59:59
Which is great: this is the free interval that I have to deliver between entries whose ranges don't overlap.
But I want to add two parameters to this query for the main range of these entries: one for the start date and the other for the end date. For this example:
startdate = '2019-01-01 00:00:00'
enddate = '9999-12-31 23:59:59'
For LOCATION_ID = 193 I am missing the gap between the startdate ('2019-01-01 00:00:00') and the DATE_FROM of the first entry ('2019-02-01 00:00:00').
The result that I would like to get for LOCATION_ID = 193 should look like this:
GAP_FROM GAP_TILL
2019-01-01 00:00:00 2019-01-31 23:59:59
2020-01-01 00:00:00 2020-01-31 23:59:59
2023-01-01 00:00:00 9999-12-31 23:59:59
I'm really new to SQL. I can understand this query, but I can't develop it further to take these main ranges into account and return the missing gaps.
Thanks in advance
For clarity, I would recommend finding the initial gaps, the middle ones, and the ending ones in separate CTEs, as shown below in the b, m, and e CTEs. Then a simple UNION ALL can combine all of them:
with
p (loc_id, start_date, end_date) as (
select 193, '2019-01-01 00:00:00', '9999-12-31 23:59:59'
),
r as (
select location_id, date_from,
date_add(date_till, interval 1 second) as date_till,
lead(date_from) over(partition by location_id order by date_from) as next_from
from overlappingtable t
cross join p
where t.location_id = p.loc_id
),
b as (
select p.start_date as gap_from, r.date_from as gap_till
from (select * from r order by date_from limit 1) r
cross join p
where p.start_date < r.date_from
),
m as (
select date_till, next_from
from r
where date_till < next_from
),
e as (
select r.date_till, p.end_date
from (select * from r order by date_till desc limit 1) r
cross join p
where r.date_till < p.end_date
)
select * from b
union all select * from m
union all select * from e
order by gap_from
Result:
gap_from gap_till
-------------------- -------------------
2019-01-01 00:00:00 2019-02-01 00:00:00
2020-01-01 00:00:00 2020-02-01 00:00:00
2023-01-01 00:00:00 9999-12-31 23:59:59
See running example at DB Fiddle.
The initial CTE p includes the parameters of the query (loc_id, start_date, end_date) and is added for clarity.
You could join to a sub-query with the start & end datetimes.
Then compare to the previous & next datetimes per location_id.
The previous or next datetimes can be found via the LAG & LEAD functions.
WITH CTE_UNDERLAPS AS
(
SELECT t.*
, LAG(DATE_TILL) OVER (PARTITION BY LOCATION_ID ORDER BY DATE_FROM, DATE_TILL) AS PREV_DATE_TILL
, LEAD(DATE_FROM) OVER (PARTITION BY LOCATION_ID ORDER BY DATE_FROM, DATE_TILL) AS NEXT_DATE_FROM
, l.*
FROM overlappingtable t
JOIN (
SELECT
CAST('2019-01-01 00:00:00' AS DATETIME) AS START_DATETIME
, CAST('9999-12-31 23:59:59' AS DATETIME) AS END_DATETIME
) l ON DATE_FROM >= START_DATETIME AND DATE_TILL <= END_DATETIME
)
SELECT LOCATION_ID
, COALESCE(DATE_ADD(PREV_DATE_TILL,INTERVAL 1 SECOND), START_DATETIME) AS DATE_FROM
, DATE_SUB(DATE_FROM,INTERVAL 1 SECOND) AS DATE_TILL
FROM CTE_UNDERLAPS
WHERE COALESCE(DATE_ADD(PREV_DATE_TILL,INTERVAL 1 SECOND), START_DATETIME) < DATE_FROM
UNION
SELECT LOCATION_ID
, DATE_ADD(DATE_TILL,INTERVAL 1 SECOND) AS DATE_FROM
, COALESCE(DATE_SUB(NEXT_DATE_FROM,INTERVAL 1 SECOND), END_DATETIME) AS DATE_TILL
FROM CTE_UNDERLAPS
WHERE DATE_ADD(DATE_TILL,INTERVAL 1 SECOND) < COALESCE(NEXT_DATE_FROM, END_DATETIME)
ORDER BY LOCATION_ID, DATE_FROM, DATE_TILL
LOCATION_ID DATE_FROM DATE_TILL
193 2019-01-01 00:00:00 2019-01-31 23:59:59
193 2020-01-01 00:00:00 2020-01-31 23:59:59
193 2023-01-01 00:00:00 9999-12-31 23:59:59
204 2019-01-01 00:00:00 2019-12-31 23:59:59
204 2022-01-01 00:00:00 9999-12-31 23:59:59
Demo on db<>fiddle here
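If it helps to see the LAG/LEAD idea outside SQL, here is a minimal procedural sketch of the same walk: sort the intervals, remember the last covered second, and emit a gap whenever the next DATE_FROM is more than 1 second later. The function name `find_gaps` and the helper `ts` are made up for illustration, and the sketch assumes (as the question states) that the intervals don't overlap and that they lie inside the given range:

```python
from datetime import datetime, timedelta

ONE_SECOND = timedelta(seconds=1)


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def find_gaps(intervals, range_start, range_end):
    """Return inclusive (gap_from, gap_till) pairs not covered by any interval."""
    gaps = []
    # Last second known to be covered; starts just before the range.
    covered_till = range_start - ONE_SECOND
    for date_from, date_till in sorted(intervals):
        # More than one second between coverage and the next start: a gap.
        if date_from - covered_till > ONE_SECOND:
            gaps.append((covered_till + ONE_SECOND, date_from - ONE_SECOND))
        covered_till = max(covered_till, date_till)
    # Trailing gap up to the end of the range, if any.
    if covered_till < range_end:
        gaps.append((covered_till + ONE_SECOND, range_end))
    return gaps


# Rows for LOCATION_ID = 193 from the question
rows_193 = [
    (ts("2019-02-01 00:00:00"), ts("2019-12-31 23:59:59")),
    (ts("2020-02-01 00:00:00"), ts("2020-12-31 23:59:59")),
    (ts("2021-01-01 00:00:00"), ts("2021-12-31 23:59:59")),
    (ts("2022-01-01 00:00:00"), ts("2022-12-31 23:59:59")),
]
gaps_193 = find_gaps(rows_193, ts("2019-01-01 00:00:00"), ts("9999-12-31 23:59:59"))
for gap_from, gap_till in gaps_193:
    print(gap_from, gap_till)
```

The leading and trailing gaps fall out of the same loop because the range start and end are treated as the boundaries of coverage, which is what the COALESCE on the LAG/LEAD values does in the SQL.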
I have a MySQL table with 2 fields: id_type and created_at
There are several rows with the same id_type and different timestamps. Eg:
3 - 2015-06-10 12:01:20
1 - 2015-03-21 04:14:10
1 - 2015-03-17 04:14:10
0 - 2015-05-06 21:43:00
3 - 2015-05-13 19:34:32
3 - 2015-07-18 03:47:55
I need to select id_type if the newest corresponding created_at is older than 30 days (Or in other words, any id_type that was last recorded more than 30 days ago)
Expected result:
1
0
I've tried:
SELECT id_type FROM table WHERE MAX(created_at) < DATE_SUB(NOW(), INTERVAL 30 day)
Which has given me the error:
invalid use of group function
How should I build it properly?
Try this,
SELECT id_type FROM table_b WHERE created_at IN (Select MAX(created_at) from table_b where created_at < DATE_SUB(NOW(), INTERVAL 30 day));
You can use this :
select id_type from `table` where `created_at` >= DATE_SUB(CURDATE(), INTERVAL 3 MONTH)
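The "invalid use of group function" error comes from using an aggregate (MAX) in WHERE, which is evaluated per row before any grouping happens. A condition on an aggregate normally goes in HAVING after a GROUP BY. A minimal sketch of that, run on SQLite through Python so it can be tried as-is; the table name `events` and the pinned "now" value are made up, and in MySQL the condition would read `HAVING MAX(created_at) < NOW() - INTERVAL 30 DAY`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id_type INTEGER, created_at TEXT);
    INSERT INTO events VALUES
        (3, '2015-06-10 12:01:20'), (1, '2015-03-21 04:14:10'),
        (1, '2015-03-17 04:14:10'), (0, '2015-05-06 21:43:00'),
        (3, '2015-05-13 19:34:32'), (3, '2015-07-18 03:47:55');
""")

# "now" is pinned to a hypothetical date so the result is reproducible
now = "2015-07-20 00:00:00"

stale_types = [row[0] for row in conn.execute(
    """
    SELECT id_type
    FROM events
    GROUP BY id_type
    HAVING MAX(created_at) < datetime(?, '-30 days')
    ORDER BY id_type
    """,
    (now,),
)]
print(stale_types)  # [0, 1]
```

id_type 3 is left out because its newest row (2015-07-18) is inside the 30-day window, even though it also has older rows.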
I have a database with statistics for a number of websites, and I'm currently having an issue with a rather complex query that I have no idea how to write (or whether it's even possible).
I have 2 tables: websites and visits. The former is a list of all websites and their properties, while the latter is a list of each user's visit to a specific website.
The program I'm making is supposed to fetch websites that need to be "scanned". The interval between each scan for each site depends on the websites total number of visits for the last 30 days. Here is a table with the intended scan-interval:
The tables have the following structure:
Websites
Visits
What I want is a query that returns the websites that are either at or past their individual update deadline (can be seen from the last_scanned column).
Is this easily doable in a single query?
Here's something you can try:
SELECT main.*
FROM (
SELECT
w.web_id,
w.url,
w.last_scanned,
(SELECT COUNT(*)
FROM visits v
WHERE v.web_id = w.web_id
AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
) AS visit_count,
TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
FROM websites w
) main
WHERE
(CASE
WHEN visit_count >= 0 AND visit_count <= 10 AND hrs_since_update >= 4320 THEN 1
WHEN visit_count >= 11 AND visit_count <= 100 AND hrs_since_update >= 2160 THEN 1
WHEN visit_count >= 101 AND visit_count <= 500 AND hrs_since_update >= 1080 THEN 1
WHEN visit_count >= 501 AND visit_count <= 1000 AND hrs_since_update >= 720 THEN 1
WHEN visit_count >= 1001 AND visit_count <= 2000 AND hrs_since_update >= 360 THEN 1
WHEN visit_count >= 2001 AND visit_count <= 5000 AND hrs_since_update >= 168 THEN 1
WHEN visit_count >= 5001 AND visit_count <= 10000 AND hrs_since_update >= 72 THEN 1
WHEN visit_count >= 10001 AND hrs_since_update >= 24 THEN 1
ELSE 0
END) = 1;
Here's the fiddle demo: http://sqlfiddle.com/#!9/1f671/1
First, I would make a subquery to get the visit counts from the visits table for each distinct web_id. Then, LEFT OUTER JOIN the websites table to this subquery. You can then query the result for each possible condition in your visits-to-update-frequency table, like so:
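SELECT websites.* FROM websites
LEFT OUTER JOIN (
-- only count visits from the last 30 days; added_on is assumed to be the visit timestamp column
SELECT visits.web_id, COUNT(*) AS visits_count FROM visits
WHERE visits.added_on >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY visits.web_id
) v ON v.web_id = websites.web_id
WHERE
-- COALESCE keeps websites with no visits at all (NULL count after the LEFT JOIN)
(COALESCE(v.visits_count, 0) <= 10 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 4320 HOUR)) OR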
(v.visits_count BETWEEN 11 AND 100 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 2160 HOUR)) OR
(v.visits_count BETWEEN 101 AND 500 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 1080 HOUR)) OR
(v.visits_count BETWEEN 501 AND 1000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 720 HOUR)) OR
(v.visits_count BETWEEN 1001 AND 2000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 360 HOUR)) OR
(v.visits_count BETWEEN 2001 AND 5000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 168 HOUR)) OR
(v.visits_count BETWEEN 5001 AND 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 72 HOUR)) OR
(v.visits_count > 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 24 HOUR));
Just an improvement on @morgb's query, using a table for the visit count ranges:
SQL FIDDLE DEMO
create table visitCount (
`min` bigint(20),
`max` bigint(20),
`frequency` bigint(20)
);
SELECT main.*
FROM (
SELECT
w.web_id,
w.url,
w.last_scanned,
(SELECT COUNT(*)
FROM visits v
WHERE v.web_id = w.web_id
AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
) AS visit_count,
TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
FROM websites w
) main inner join
visitCount v on visit_count between v.min and v.max
WHERE
main.hrs_since_update >= v.frequency
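To make the range-table idea concrete, here is a minimal sketch with the lookup table filled in. The thresholds are copied from the CASE expression in the earlier answer; the large upper bound for the last range, the `site_stats` table (standing in for the `main` subquery, with counts already computed) and its rows are assumptions for illustration. It runs on SQLite through Python so it can be tried directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visitCount ("min" INTEGER, "max" INTEGER, frequency INTEGER);
    -- visits in last 30 days -> hours allowed between scans
    INSERT INTO visitCount VALUES
        (0, 10, 4320), (11, 100, 2160), (101, 500, 1080), (501, 1000, 720),
        (1001, 2000, 360), (2001, 5000, 168), (5001, 10000, 72),
        (10001, 9223372036854775807, 24);  -- open-ended range, sentinel max

    -- hypothetical precomputed stats, standing in for the "main" subquery
    CREATE TABLE site_stats (web_id INTEGER, visit_count INTEGER,
                             hrs_since_update INTEGER);
    INSERT INTO site_stats VALUES
        (1, 5, 5000),    -- low traffic, scanned long ago: due
        (2, 50, 100),    -- medium traffic, scanned recently: not due
        (3, 20000, 24);  -- high traffic, exactly at its deadline: due
""")

due_sites = [row[0] for row in conn.execute("""
    SELECT s.web_id
    FROM site_stats s
    JOIN visitCount v ON s.visit_count BETWEEN v."min" AND v."max"
    WHERE s.hrs_since_update >= v.frequency
    ORDER BY s.web_id
""")]
print(due_sites)  # [1, 3]
```

Keeping the ranges in a table means the scan schedule can be changed with an UPDATE instead of editing the query.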
The following query grabs the temperature, outside temperature, ph and total suspended solids from my aquarium database and displays them in 1 table, shown below. Each of the values is stored in its own table and is logged every minute.
I would like to change the query below to be able to get the average value for each hour instead of displaying the value stored in the database at x hour:00:00.
Please help
SELECT DATE_FORMAT(timeTable.minuteTime, '%Y-%m-%d %k:%i') time,
oT2.temperature temperature,
T2.temperature temp,
S2.solids solids,
P2.Ph Ph
FROM
(
SELECT minuteTime.minuteTime minuteTime,
( SELECT MAX(time) FROM outside_temperature WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) otempTime,
( SELECT MAX(time) FROM temperature1 WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) tempTime,
( SELECT MAX(time) FROM Ph WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) phTime,
( SELECT MAX(time) FROM solids WHERE time <= minuteTime.minuteTime AND time >= NOW() - INTERVAL 1 DAY) solidsTime
FROM
(
SELECT DATE(time) + INTERVAL (HOUR(time) DIV 2 * 2) HOUR minuteTime
FROM Ph
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 2 * 2) HOUR
FROM solids
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 2 * 2) HOUR
FROM outside_temperature
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
UNION SELECT DATE(time) + INTERVAL (HOUR(time) DIV 2 * 2) HOUR
FROM temperature1
WHERE time >= NOW() - INTERVAL 1 DAY AND time <= NOW()
GROUP BY 1
) minuteTime
) timeTable
LEFT JOIN outside_temperature oT2 ON oT2.time = timeTable.otempTime
LEFT JOIN temperature1 T2 ON T2.time = timeTable.tempTime
LEFT JOIN solids S2 ON S2.time = timeTable.solidsTime
LEFT JOIN Ph P2 ON P2.time = timeTable.phTime
ORDER BY minuteTime ASC
This is the result of the query
2013-04-03 22:00 27.12 26.06 139 7.54
2013-04-04 0:00 27.06 26 142 7.54
2013-04-04 2:00 26.94 26 142 7.5
2013-04-04 4:00 26.87 25.94 142 7.5
2013-04-04 6:00 26.75 25.87 141 7.58
2013-04-04 8:00 26.87 25.87 141 7.58
2013-04-04 10:00 26.87 25.87 141 7.58
2013-04-04 12:00 26.87 25.87 141 7.58
2013-04-04 14:00 26.69 25.87 144 7.54
2013-04-04 16:00 26.56 25.81 144 7.58
2013-04-04 18:00 26.5 25.75 144 7.61
2013-04-04 20:00 26.81 25.75 144 7.43
Use:
SELECT DATE_FORMAT(timeTable.minuteTime, '%Y-%m-%d %k:%i') time,
AVG(oT2.temperature) temperature,
AVG(T2.temperature) temp,
AVG(S2.solids) solids,
AVG(P2.Ph) Ph
/* ... the rest of your query (FROM, subqueries and LEFT JOINs) stays unchanged ... */
/* and at the end, before the ORDER BY, add: */
GROUP BY DATE_FORMAT(timeTable.minuteTime, '%Y-%m-%d %k:%i')
I'd like to merge the results of the following three select statements horizontally. I tried using joins, but I have no idea how to proceed since it involves COUNT and GROUP BY too.
SELECT DATE(created_at) as date,COUNT(*) as countd1 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 1 DAY) GROUP BY DATE(created_at)
SELECT DATE(created_at) as date,COUNT(*) as countd2 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 2 DAY) GROUP BY DATE(created_at)
SELECT DATE(created_at) as date,COUNT(*) as countd3 FROM b_users WHERE last_loggedin_at < DATE_ADD(created_at,INTERVAL 3 DAY) GROUP BY DATE(created_at)
The individual results would be
date countd1
2011-12-01 100
2011-12-02 120
2011-12-03 130
date countd2
2011-12-01 200
2011-12-02 220
2011-12-03 230
date countd3
2011-12-01 300
2011-12-02 320
2011-12-03 330
But I'd like to merge them so that I'll get the following result
date countd1 countd2 countd3
2011-12-01 100 200 300
2011-12-02 120 220 320
2011-12-03 130 230 330
How do I do this?
Is it possible to do something like the query below?
SELECT a, COUNT(b where condition), COUNT(c where condition) FROM table GROUP BY a
Update
biziclop provided a great workaround:
SELECT DATE(created_at) AS date,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 1 DAY )) AS countd1,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 2 DAY )) AS countd2,
SUM(last_loggedin_at < DATE_ADD( created_at,INTERVAL 3 DAY )) AS countd3
FROM b_users GROUP BY DATE(created_at)
Solved, thank you! :)
In MySQL the results of comparisons are 1 if true, 0 if false, so you could SUM() them:
SELECT
DATE(created_at) AS date,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 1 DAY )) AS countd1,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 2 DAY )) AS countd2,
SUM( last_loggedin_at < DATE_ADD( created_at,INTERVAL 3 DAY )) AS countd3
FROM b_users
GROUP BY DATE(created_at)
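A small runnable sketch of the same trick, in case you want to check the counts on toy data. It uses SQLite through Python, where comparisons also evaluate to 1 or 0; date() and datetime(..., '+N days') stand in for MySQL's DATE() and DATE_ADD(), and the four sample users are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b_users (created_at TEXT, last_loggedin_at TEXT);
    INSERT INTO b_users VALUES
        ('2011-12-01 08:00:00', '2011-12-01 20:00:00'),  -- within 1 day
        ('2011-12-01 09:00:00', '2011-12-02 21:00:00'),  -- within 2 days
        ('2011-12-01 10:00:00', '2011-12-04 09:00:00'),  -- within 3 days
        ('2011-12-02 08:00:00', '2011-12-02 09:00:00');  -- within 1 day
""")

# Each comparison yields 1 or 0 per row, so SUM() counts the matching rows.
counts = conn.execute("""
    SELECT date(created_at) AS date,
           SUM(last_loggedin_at < datetime(created_at, '+1 day'))  AS countd1,
           SUM(last_loggedin_at < datetime(created_at, '+2 days')) AS countd2,
           SUM(last_loggedin_at < datetime(created_at, '+3 days')) AS countd3
    FROM b_users
    GROUP BY date(created_at)
    ORDER BY 1
""").fetchall()
print(counts)  # [('2011-12-01', 1, 2, 3), ('2011-12-02', 1, 1, 1)]
```

Because all three counts come from one pass over b_users with one GROUP BY, there is nothing to join afterwards.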