How to select array of rows with max param from each range? - mysql

I have sqlite table like so:
CREATE TABLE "table" (
`id` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`param` REAL NOT NULL,
`date` INTEGER NOT NULL
);
INSERT INTO "table" (param, date) VALUES (123.3, 1427824800);
INSERT INTO "table" (param, date) VALUES (122.3, 1427825800);
INSERT INTO "table" (param, date) VALUES (125.0, 1427652000);
INSERT INTO "table" (param, date) VALUES (123.9, 1427652900);
|id| param | date |
|==|=======|============|
| 1| 123.3 | 1427824800 |
| 2| 122.3 | 1427825800 |
| 3| 125 | 1427652000 |
| 4| 123.9 | 1427652900 |
I want to get the row with the max date for each day (each day being a BETWEEN startDay AND endDay range). I need to understand how to group at least by day, but if there is a way to group by a custom period (week, month), that would be great.
SELECT id, param, MAX(date) FROM table WHERE date BETWEEN 1427652000 AND 1427824799
SELECT id, param, MAX(date) FROM table WHERE date BETWEEN 1427824800 AND 1427911199
In reality I have much more data, covering nearly a year and 1000+ rows. Making 365 queries is not an option, I think, but I don't know how to optimize it.
UPD
After all, I think it should be possible to get a result like this with a single query:
|id| param | date |
|==|=======|============|
| 2| 122.3 | 1427825800 |
| 4| 123.9 | 1427652900 |

Here's what you want. You'll use GROUP BY and you'll use the DATE() function to extract the day from each item. UNIX_TIMESTAMP() and FROM_UNIXTIME() are helpful for flipping back and forth between the TIMESTAMP and DATETIME representations. This is necessary because the date arithmetic stuff works on DATETIME values. CURDATE() means today.
SELECT DATE(FROM_UNIXTIME(`date`)) AS day,
MAX(`date`) AS latest_timestamp_in_day
FROM `table`
WHERE `date` >= UNIX_TIMESTAMP(CURDATE() - INTERVAL 366 DAY)
AND `date` < UNIX_TIMESTAMP(CURDATE())
GROUP BY DATE(FROM_UNIXTIME(`date`))
ORDER BY DATE(FROM_UNIXTIME(`date`)) DESC
This works because
DATE(FROM_UNIXTIME(`date`))
takes each timestamp and returns the DATE of the calendar day in which it falls, so every timestamp from the same day lands in the same group.
The first WHERE clause picks up dates on or after a year and a day ago. The second one excludes today's dates; presumably the MAX operation doesn't make much sense on a not-yet-completed day.
It isn't clear from your question whether you also want to display the param value associated with the last timestamp in each calendar day. If you do, that's a little harder. You first need to get the latest timestamp in each day, then you need to pull out the detail record. That requires a join operation. You'll treat the above query as a subquery, and join it to your table. Like this.
SELECT summary.day, detail.`date`, detail.param
FROM `table` AS detail
JOIN (
SELECT DATE(FROM_UNIXTIME(`date`)) AS day,
MAX(`date`) AS latest_timestamp_in_day
FROM `table`
WHERE `date` >= UNIX_TIMESTAMP(CURDATE() - INTERVAL 366 DAY)
AND `date` < UNIX_TIMESTAMP(CURDATE())
GROUP BY DATE(FROM_UNIXTIME(`date`))
) AS summary ON detail.`date` = summary.latest_timestamp_in_day
ORDER BY summary.day DESC
Careful, though. The DATETIME arithmetic is done in the local time zone. This can lead to bizarre results on the days when local time changes from daylight saving time to standard time and back.
Notice that your column named date is in backticks. It's the same as a keyword in MySQL's query language, so the backticks help disambiguate.
Here's a more detailed exposition of this business of grouping by date. http://www.plumislandmedia.net/mysql/sql-reporting-time-intervals/
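Since the question mentions an sqlite table, here is a minimal sketch of the same join-on-subquery idea run through Python's built-in sqlite3 module. It is not the MySQL query above: SQLite's strftime(..., 'unixepoch') stands in for DATE(FROM_UNIXTIME(...)), days are computed in UTC rather than local time, and the helper name latest_per_period is made up for illustration. Passing a different format string is one possible way to group by month or week instead of by day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "table" ('
    "id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT, "
    "param REAL NOT NULL, "
    '"date" INTEGER NOT NULL)'
)
conn.executemany(
    'INSERT INTO "table" (param, "date") VALUES (?, ?)',
    [(123.3, 1427824800), (122.3, 1427825800),
     (125.0, 1427652000), (123.9, 1427652900)],
)


def latest_per_period(conn, period_format):
    """Hypothetical helper: newest row of each period, periods taken in UTC."""
    query = '''
        SELECT detail.id, detail.param, detail."date"
        FROM "table" AS detail
        JOIN (
            -- one row per period, carrying the newest timestamp in it
            SELECT strftime(?, "date", 'unixepoch') AS period,
                   MAX("date") AS latest
            FROM "table"
            GROUP BY period
        ) AS summary ON detail."date" = summary.latest
        ORDER BY summary.period DESC
    '''
    return conn.execute(query, (period_format,)).fetchall()


per_day = latest_per_period(conn, "%Y-%m-%d")   # group by calendar day
per_month = latest_per_period(conn, "%Y-%m")    # group by calendar month
print(per_day)
print(per_month)
```

With the four sample rows this returns ids 2 and 4 for the daily grouping, which matches the result the question asks for. A format such as "%Y-%W" would be the equivalent assumption for weekly grouping.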

You can use a row generator to produce one row for each day of the year and then join it with your table (T below). The values in the union subqueries don't matter, only the number of rows they produce (10 x 10 x 4 = 400):
SELECT S.r0 AS start_day, MAX(T.`date`) AS latest_timestamp_in_day
FROM (
SELECT @row AS r0, @row := @row + (24*60*60) AS r1 FROM
(select 0 union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9) st1,
(select 0 union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9) st2,
(select 0 union all select 1 union all select 2 union all select 3) st3,
(SELECT @row := 1420070400) st00
) S
LEFT OUTER JOIN T
ON T.`date` >= S.r0 AND T.`date` < S.r1
WHERE S.r1 <= 1451520000
GROUP BY S.r0
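As a rough illustration of the same calendar-then-join idea, here is a sketch using Python's sqlite3 module, where a recursive CTE plays the role of the row generator. The bucket boundaries (start and end) are assumptions chosen to cover the question's sample data in UTC, and the CTE name days is made up. Days with no rows show up with NULL (None in Python) instead of disappearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "table" (id INTEGER PRIMARY KEY, param REAL, "date" INTEGER)')
conn.executemany(
    'INSERT INTO "table" (param, "date") VALUES (?, ?)',
    [(123.3, 1427824800), (122.3, 1427825800),
     (125.0, 1427652000), (123.9, 1427652900)],
)

query = '''
    WITH RECURSIVE days(r0, r1) AS (
        -- first bucket: [start, start + 1 day)
        SELECT :start, :start + 86400
        UNION ALL
        -- each following bucket begins where the previous one ended
        SELECT r1, r1 + 86400 FROM days WHERE r1 < :end
    )
    SELECT days.r0 AS start_day, MAX(t."date") AS latest_timestamp_in_day
    FROM days
    LEFT JOIN "table" AS t ON t."date" >= days.r0 AND t."date" < days.r1
    GROUP BY days.r0
    ORDER BY days.r0
'''
# 2015-03-29 00:00:00 UTC up to (not including) 2015-04-01 00:00:00 UTC
buckets = conn.execute(query, {"start": 1427587200, "end": 1427846400}).fetchall()
print(buckets)
```

The half-open comparison (>= r0 AND < r1) keeps a timestamp that falls exactly on midnight from being counted in two adjacent days.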

Related

print amount of records based on two dates

I'm trying to generate a number of records based on two dates in MySQL. Example table:
|id| data       | datetime_1          | datetime_2          |
|==|============|=====================|=====================|
| 1| row data 1 | 2021-06-28 10:00:00 | 2021-07-02 10:00:00 |
and I want the result to look like this:
|id| data       | datetime            |
|==|============|=====================|
| 1| row data 1 | 2021-06-28 10:00:00 |
| 2| row data 1 | 2021-06-29 10:00:00 |
| 3| row data 1 | 2021-06-30 10:00:00 |
| 4| row data 1 | 2021-07-01 10:00:00 |
| 5| row data 1 | 2021-07-02 10:00:00 |
Is that possible?
You would typically handle this requirement via a calendar table, which is a table containing a sequence of dates which you expect to need in your query. Here is an example, where I have used an inline union subquery in place of a formal calendar table:
WITH dates AS (
SELECT '2021-06-28 10:00:00' AS dt UNION ALL
SELECT '2021-06-29 10:00:00' UNION ALL
SELECT '2021-06-30 10:00:00' UNION ALL
SELECT '2021-07-01 10:00:00' UNION ALL
SELECT '2021-07-02 10:00:00'
)
SELECT
ROW_NUMBER() OVER (ORDER BY t2.dt) AS id,
t1.data,
t2.dt AS datetime
FROM yourTable t1
INNER JOIN dates t2
ON t2.dt BETWEEN t1.datetime_1 AND t1.datetime_2
ORDER BY
t2.dt;
WITH RECURSIVE date_ranges AS (
SELECT datetime_1 AS dt1, datetime_2 dt2 FROM mytable
UNION ALL
SELECT dt1+INTERVAL 1 DAY, dt2 FROM date_ranges WHERE dt1 < dt2)
SELECT ROW_NUMBER() OVER (ORDER BY dt1) AS id,
b.data, dt1 as datetime FROM date_ranges a JOIN
mytable b ON a.dt1 BETWEEN datetime_1 AND b.datetime_2;
You can use this method to generate a date range based on your existing data. The final query might need other functions like MAX() etc. depending on your data but with the current sample that you've provided, this should suffice.
Fiddle with additional scenario
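For anyone who wants to try the recursive expansion without a MySQL server, here is a small sketch using Python's sqlite3 module. It is a variation rather than the query above: the end of each range is carried along inside the recursion instead of being re-joined to the table, SQLite's datetime(dt, '+1 day') replaces dt1 + INTERVAL 1 DAY, and the ids are numbered in Python instead of with ROW_NUMBER(). The CTE name expanded and its column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, data TEXT, "
    "datetime_1 TEXT, datetime_2 TEXT)"
)
conn.execute(
    "INSERT INTO mytable VALUES "
    "(1, 'row data 1', '2021-06-28 10:00:00', '2021-07-02 10:00:00')"
)

query = '''
    WITH RECURSIVE expanded(data, dt, last_dt) AS (
        -- anchor: the first day of every range
        SELECT data, datetime_1, datetime_2 FROM mytable
        UNION ALL
        -- step: add one day until the end of the range is reached
        SELECT data, datetime(dt, '+1 day'), last_dt
        FROM expanded
        WHERE datetime(dt, '+1 day') <= last_dt
    )
    SELECT data, dt FROM expanded ORDER BY dt
'''
# number the generated rows 1..n, like the id column in the wanted result
rows = [(i, data, dt) for i, (data, dt) in enumerate(conn.execute(query), start=1)]
for row in rows:
    print(row)
```

Carrying last_dt through the recursion means each source row expands only within its own range, which may matter once the table holds more than one row.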

Why does using the year 20,000 in MySQL have a different result

Using the exact same table, when I run the following two queries on MySQL (version 5.6.37) I get very different results, which makes no sense to me.
SELECT COUNT(*) FROM salesTransactions WHERE date<'2000-01-01'
I get 159 results (as expected). However when I then run the same report but increase the year to 20,000 I get a completely different result:
SELECT COUNT(*) FROM salesTransactions WHERE date<'20000-01-01'
I get the result 6.
How is that possible? If I change the date to 30,000 I get the expected result count. I played around and the year 20100 has 8 results, which is another number.
The table is:
ID: integer
Date: date
Name: varchar(32)
How can this be?
The DATE type is used for values with a date part but no time part. MySQL retrieves and displays DATE values in 'YYYY-MM-DD' format. The supported range is '1000-01-01' to '9999-12-31'. The DATETIME type is used for values that contain both date and time parts. MySQL retrieves and displays DATETIME values in 'YYYY-MM-DD HH:MM:SS' format. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. The TIMESTAMP data type is used for values that contain both date and time parts. TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC. (https://dev.mysql.com/doc/refman/5.7/en/datetime.html)
Interestingly you would expect the implicit date conversion to return null and so the query would return nothing. So I ran a little test.
Select count(*) from awsalesorderheader
union all
select count(*) from awsalesorderheader where orderdate < '9999-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '20049-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '20050-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '20070-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '10000-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '40000-01-01'
union all
select count(*) from awsalesorderheader where orderdate < '30000-01-01'
union all
select count(*) from awsalesorderheader where orderdate < null
;
+----------+
| count(*) |
+----------+
| 31466 |
| 31466 |
| 0 |
| 1379 |
| 17514 |
| 0 |
| 31466 |
| 31466 |
| 0 |
+----------+
9 rows in set (0.26 sec)
Notice there is a cutoff at 20049-01-01 after which I start to get counts > 0. My mind jumps to the (probably wrong) conclusion that y2k hasn't gone away.

Adding blank rows to display of result set returned by MySQL query

I am storing hourly results in a MySQL database table which take the form:
ResultId,CreatedDateTime,Keyword,Frequency,PositiveResult,NegativeResult
349,2015-07-17 00:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 01:00:00,Homer Simpson,3.0,4.0,-2.0
349,2015-07-17 01:00:00,Homer Simpson,1.0,1.0,-1.0
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,8.0,3.0,-2.0
349,2015-07-17 05:00:00,Homer Simpson,1.0,0.0,0.0
Where there might be several results for a given hour, but none for certain hours.
If I want to produce averages of the hourly results, I can do something like this:
SELECT ItemCreatedDateTime AS 'Created on',
KeywordText AS 'Keyword', ROUND(AVG(KeywordFrequency), 2) AS 'Average frequency',
ROUND(AVG(PositiveResult), 2) AS 'Average positive result',
ROUND(AVG(NegativeResult), 2) AS 'Average negative result'
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime BETWEEN '2015-07-13 00:00:00' AND '2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime
ORDER BY KeywordText, CreatedDateTime
However, the results only include the hours where data exists, e.g.:
349,2015-07-17 01:00:00,Homer Simpson,2.0,2.5,-1.5
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,4.5,1.5,-1.0
But I need to show blank rows for the missing hours, e.g.
349,2015-07-17 01:00:00,Homer Simpson,2.0,2.5,-1.5
349,2015-07-17 02:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 03:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,4.5,1.5,-1.0
Short of inserting blanks into the results before they are presented, I am uncertain of how to proceed: can I use MySQL to include the blank rows at all?
SQL in general has no knowledge about the data, so you have to add that yourself. In this case you will have to account for the unused hours somehow. This can be done by inserting empty rows or, a bit differently, by counting the hours and adjusting your average accordingly.
Counting the hours and adjusting the average:
Count all hours with data (A)
Calculate the number of hours in the period (B)
Calculate the avg as you already did, multiply by A divide by B
Example code to get the hours:
SELECT COUNT(*) AS number_of_records_with_data,
(TO_SECONDS('2015-07-19 23:59:00')-TO_SECONDS('2015-07-13 00:00:00'))/3600
AS number_of_hours_in_interval
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime
BETWEEN '2015-07-13 00:00:00' AND '2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime;
And just integrate it with the rest of your query.
You can't use MySQL for that. You'll have to do this with whatever you're using later to process the results. Iterate over the range of hours/dates you're interested in and, for those where MySQL returned some data, use that data. For the rest, just add null/zero values.
Small update after some discussions with my Stack Overflow colleagues:
Instead of "you can't" I should have written "you shouldn't": as other users have shown, there are ways to do this. But I still believe that for different tasks we should use tools that were created with such tasks in mind. And by that I mean that while it's probably possible to tow a car with an F-16, it's still better to just call a tow truck ;) That's what tow trucks are made for.
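The application-side approach could look roughly like the sketch below, in Python. The function name fill_missing_hours and the dictionary layout are assumptions made for illustration; the three averaged rows are the ones from the question.

```python
from datetime import datetime, timedelta


def fill_missing_hours(averages, start, end):
    """Return one (hour, values) pair per hour in [start, end].

    `averages` maps an on-the-hour datetime to a tuple of
    (frequency, positive, negative); hours missing from it get zeros.
    """
    filled = []
    current = start
    while current <= end:
        filled.append((current, averages.get(current, (0.0, 0.0, 0.0))))
        current += timedelta(hours=1)
    return filled


# averaged rows as returned by the query in the question
averages = {
    datetime(2015, 7, 17, 1): (2.0, 2.5, -1.5),
    datetime(2015, 7, 17, 4): (1.0, 1.0, 0.0),
    datetime(2015, 7, 17, 5): (4.5, 1.5, -1.0),
}

filled = fill_missing_hours(
    averages, datetime(2015, 7, 17, 1), datetime(2015, 7, 17, 5)
)
for hour, values in filled:
    print(hour, *values)
```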
Although you already have accepted an answer I want to demonstrate how you can generate a datetime series in the query and use that to solve your problem.
This query uses a combination of cross joins together with basic arithmetic and date functions to generate a series of all hours between 2015-07-16 00:00:00 AND 2015-07-18 23:59:00.
Generating this type of data on the fly isn't the best option though; if you already had a table with the numbers 0-31 then all the union queries would be unnecessary.
See this SQL Fiddle to see how it could look using a small number table.
Sample SQL Fiddle with a demo of the query below
select
c.createddate as "Created on",
c.Keyword,
coalesce(ROUND(AVG(KeywordFrequency), 2),0.0) AS 'Average frequency',
coalesce(ROUND(AVG(PositiveResult), 2),0.0) AS 'Average positive result',
coalesce(ROUND(AVG(NegativeResult), 2),0.0) AS 'Average negative result'
from (
select
q.createddate + interval d day + interval t hour as createddate,
d.KeywordText AS 'Keyword'
from (
select distinct h10*10+h1 d from (
select 0 as h10
union all select 1 union all select 2 union all select 3
) d10 cross join (
select 0 as h1
union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) d1
) days cross join (
select distinct t10*10 + t1 t from (
select 0 as t10 union all select 1 union all select 2
) h10 cross join (
select 0 as t1
union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) h1
) hours
cross join
-- use the following line to set the start date for the series
(select '2015-07-16 00:00:00' createddate) q
-- or use the line below to use the dates in the table
-- (select distinct cast(CreatedDateTime as date) CreatedDate from results) q
cross join (select distinct KeywordText from results) d
) c
left join results r on r.CreatedDateTime = c.createddate AND ResultsNo = 349 and r.KeywordText = c.Keyword
where c.createddate BETWEEN '2015-07-16 00:00:00' AND '2015-07-18 23:59:00'
GROUP BY c.createddate, Keyword
ORDER BY c.createddate, Keyword;
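To show the generate-a-series-then-LEFT-JOIN pattern in a form that runs without a MySQL server, here is a sketch using Python's sqlite3 module. It is not the query above: a recursive CTE (named hours here, a made-up name) replaces the cross-joined digit tables, the column names follow the sample data in the question (ResultId, Frequency, and so on), and the series is limited to the six hours covered by that sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (ResultId INTEGER, CreatedDateTime TEXT, Keyword TEXT, "
    "Frequency REAL, PositiveResult REAL, NegativeResult REAL)"
)
conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?, ?, ?, ?)",
    [
        (349, "2015-07-17 00:00:00", "Homer Simpson", 0.0, 0.0, 0.0),
        (349, "2015-07-17 01:00:00", "Homer Simpson", 3.0, 4.0, -2.0),
        (349, "2015-07-17 01:00:00", "Homer Simpson", 1.0, 1.0, -1.0),
        (349, "2015-07-17 04:00:00", "Homer Simpson", 1.0, 1.0, 0.0),
        (349, "2015-07-17 05:00:00", "Homer Simpson", 8.0, 3.0, -2.0),
        (349, "2015-07-17 05:00:00", "Homer Simpson", 1.0, 0.0, 0.0),
    ],
)

query = '''
    WITH RECURSIVE hours(h) AS (
        SELECT '2015-07-17 00:00:00'
        UNION ALL
        SELECT datetime(h, '+1 hour') FROM hours WHERE h < '2015-07-17 05:00:00'
    )
    SELECT hours.h,
           COALESCE(ROUND(AVG(r.Frequency), 2), 0.0),
           COALESCE(ROUND(AVG(r.PositiveResult), 2), 0.0),
           COALESCE(ROUND(AVG(r.NegativeResult), 2), 0.0)
    FROM hours
    -- hours without data still produce a row; their averages are NULL
    LEFT JOIN Results AS r
           ON r.CreatedDateTime = hours.h AND r.ResultId = 349
    GROUP BY hours.h
    ORDER BY hours.h
'''
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The filter on ResultId sits in the ON clause rather than in WHERE, so it does not throw away the generated hours that have no matching result.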
I came up with an idea: append rows of null values to the end of your MySQL query's result.
Just run this query (in the limit add any number of empty rows you want), and ignore the last column:
SELECT ItemCreatedDateTime AS 'Created on',
KeywordText AS 'Keyword',
ROUND(AVG(KeywordFrequency), 2) AS 'Average frequency',
ROUND(AVG(PositiveResult), 2) AS 'Average positive result',
ROUND(AVG(NegativeResult), 2) AS 'Average negative result',
null
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime BETWEEN '2015-07-13 00:00:00' AND
'2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime
UNION
SELECT * FROM (
SELECT null a, null b, null c, null d, null e,
(@cnt := @cnt + 1) f
FROM (SELECT null FROM Results LIMIT 23) empty1
LEFT JOIN (SELECT * FROM Results LIMIT 23) empty2 ON FALSE
JOIN (SELECT @cnt := 0) empty3
) empty
ORDER BY KeywordText, CreatedDateTime

Get Total From a certain timestamp

I have a column of timestamps and I would like a result that shows
the number of entries added on a certain date (added_on_this_date)
and the total number of entries since the beginning (total_since_beginning).
My table:
added
==========
1392040040
1392050040
1392060040
1392070040
1392080040
1392090040
1392100040
1392110040
1392120040
1392130040
1392140040
1392150040
1392160040
1392170040
1392180040
1392190040
1392200040
The result should look like:
date | added_on_this_date | total_since_beginning
=========================================================
2014-02-10 | 4 | 4
2014-02-11 | 9 | 13
2014-02-12 | 4 | 17
I'm using this query which gives me the wrong result
SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date,
(SELECT COUNT(*) FROM mytable t2 WHERE t2.added <= t.added) AS total_since_beginning
FROM mytable t WHERE 1=1 GROUP BY date
I've created a fiddle for better understanding: http://sqlfiddle.com/#!2/a72a9/1
You're mixing timestamps and yyyy-mm-dd dates.
Because you group by a yyyy-mm-dd value, you can't be sure which timestamp from each group the subquery will compare against.
You could do
SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date,
(SELECT COUNT(*) FROM mytable t2 WHERE FROM_UNIXTIME(t2.added, '%Y-%m-%d') <= FROM_UNIXTIME(t.added, '%Y-%m-%d')) AS total_since_beginning
FROM mytable t GROUP BY date
This is probably more efficient to do with variables than with a subquery:
select date, added_on_this_date,
@cumsum := @cumsum + added_on_this_date as total_since_beginning
from (SELECT FROM_UNIXTIME(added, '%Y-%m-%d') AS date,
count(*) AS added_on_this_date
FROM mytable t
WHERE 1=1
GROUP BY date
) d cross join
(select @cumsum := 0) const
order by date;
EDIT (in response to comment):
The above query has a significant performance advantage because it aggregates the data once and that is basically all the effort the query needs to do. Your original formulation with a correlated subquery can be optimized using an appropriate index. Unfortunately, once the condition in the correlated subquery uses a function on both tables, then MySQL will not be able to take advantage of an index (in general).
Because the query is aggregating by date anyway, this should perform much better.
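The aggregate-once-then-accumulate idea can be sketched outside MySQL as well. The example below uses Python's sqlite3 module for the per-day counts and itertools.accumulate for the running total, so it does not depend on user variables; SQLite's date(added, 'unixepoch') stands in for FROM_UNIXTIME and works in UTC, which is an assumption about the time zone.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (added INTEGER NOT NULL)")
# the 17 sample timestamps from the question: 1392040040, 1392050040, ...
conn.executemany(
    "INSERT INTO mytable (added) VALUES (?)",
    [(1392040040 + 10000 * i,) for i in range(17)],
)

# step 1: aggregate once, one row per calendar day (UTC)
daily = conn.execute('''
    SELECT date(added, 'unixepoch') AS day, COUNT(*) AS added_on_this_date
    FROM mytable
    GROUP BY day
    ORDER BY day
''').fetchall()

# step 2: running total over the already aggregated rows
totals = accumulate(count for _, count in daily)
report = [(day, count, total) for (day, count), total in zip(daily, totals)]
for line in report:
    print(line)
```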

MySQL query to get daily differential values

I want to write a MySQL query to get daily differential values from a table that looks like this:
Date | VALUE
--------------------------------
"2011-01-14 19:30" | 5
"2011-01-15 13:30" | 6
"2011-01-15 23:50" | 9
"2011-01-16 9:30" | 10
"2011-01-16 18:30" | 15
I have made two subqueries. The first one is to get the last daily value, because I want to compute the difference values from this data:
SELECT r.Date, r.VALUE
FROM table AS r
JOIN (
SELECT DISTINCT max(t.Date) AS Date
FROM table AS t
WHERE t.Date < CURDATE()
GROUP BY DATE(t.Date)
) AS x USING (Date)
The second one is made to get the differential values from the result of the first one (I show it with "table" name):
SELECT Date, VALUE - IFNULL(
(SELECT MAX( VALUE )
FROM table
WHERE Date < t1.table) , 0) AS diff
FROM table AS t1
ORDER BY Date
At first, I tried to save the result of the first query in a temporary table, but it's not possible to use temporary tables with the second query. If I put the first query inside the FROM of the second one, in parentheses with an alias, the server complains that the table alias doesn't exist. How can I get something like this:
Date | VALUE
---------------------------
"2011-01-15 00:00" | 4
"2011-01-16 00:00" | 6
Try this query -
SELECT
t1.dt AS date,
t1.value - t2.value AS value
FROM
(SELECT DATE(date) dt, MAX(value) value FROM `table` GROUP BY dt) t1
JOIN
(SELECT DATE(date) dt, MAX(value) value FROM `table` GROUP BY dt) t2
ON t1.dt = t2.dt + INTERVAL 1 DAY
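Here is a small runnable sketch of that self-join, using Python's sqlite3 module instead of MySQL. The table name readings and the CTE name daily_max are made up, date(prev.dt, '+1 day') replaces t2.dt + INTERVAL 1 DAY, and the sample times are zero-padded ('09:30' rather than '9:30') because SQLite's date() expects that format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE readings ("Date" TEXT NOT NULL, "VALUE" INTEGER NOT NULL)')
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2011-01-14 19:30", 5),
        ("2011-01-15 13:30", 6),
        ("2011-01-15 23:50", 9),
        ("2011-01-16 09:30", 10),
        ("2011-01-16 18:30", 15),
    ],
)

query = '''
    WITH daily_max AS (
        -- one row per calendar day with the highest value seen that day
        SELECT date("Date") AS dt, MAX("VALUE") AS max_value
        FROM readings
        GROUP BY dt
    )
    SELECT cur.dt, cur.max_value - prev.max_value AS diff
    FROM daily_max AS cur
    -- pair each day with the day immediately before it
    JOIN daily_max AS prev ON cur.dt = date(prev.dt, '+1 day')
    ORDER BY cur.dt
'''
diffs = conn.execute(query).fetchall()
print(diffs)
```

Like the query above, this takes the daily maximum rather than the last reading of the day, which gives the same answer only while the values never decrease within a day. A day whose predecessor has no data is dropped by the inner join.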