I'm trying to fetch records for the current day in half-hour intervals, with the count of records for each interval.
My output comes out as expected, except that when an interval has no records (say 7:00 - 7:30), I don't get a row for that interval with a zero count.
My query is as follows:
SELECT time_format( FROM_UNIXTIME(ROUND(UNIX_TIMESTAMP(start_time)/(30* 60)) * (30*60)) , '%H:%i')
thirtyHourInterval , COUNT(bot_id) AS Count FROM bot_activity
WHERE start_time BETWEEN CONCAT(CURDATE(), ' 00:00:00') AND CONCAT(CURDATE(), ' 23:59:59')
GROUP BY ROUND(UNIX_TIMESTAMP(start_time)/(30* 60))
For reference of my output :
We need a source for that 7:30 row; a row source for all the time values.
If we have a clock table that contains all of the time values we want to return, such that we can write a query that returns that first column, the thirty minute interval values we want to return,
as an example:
SELECT c.hhmm AS thirty_minute_interval
FROM clock c
WHERE c.hhmm ...
ORDER BY c.hhmm
then we can outer join that to the results of the query with the missing rows:
SELECT c.hhmm AS _thirty_minute_interval
, IFNULL(r._cnt_bot,0) AS _cnt_bot
FROM clock c
LEFT
JOIN ( -- query with missing rows
SELECT time_format(...) AS thirtyMinuteInterval
, COUNT(...) AS _cnt_bot
FROM bot_activity
WHERE
GROUP BY time_format(...)
) r
ON r.thirtyMinuteInterval = c.hhmm
WHERE c.hhmm ...
ORDER BY c.hhmm
The point is that the SELECT will not generate "missing" rows from a source where they don't exist; we need a source for them. We don't necessarily have to have a separate clock table, we could have an inline view generate the rows. But we do need to be able to SELECT those value from a source.
(Note that bot_id in the original query is indeterminate; the value will be from some row in the collapsed set of rows, but there is no guarantee which one. If we add ONLY_FULL_GROUP_BY to sql_mode, the query will throw an error, as most other relational databases do when non-aggregate expressions in the SELECT list don't appear in the GROUP BY and aren't functionally dependent on the GROUP BY.)
EDIT
In place of a clock table, we can use an inline view. For small sets, we could do something like this.
SELECT c.tmi
FROM ( -- thirty minute interval
SELECT CONVERT(0,TIME) + INTERVAL h.h+r.h HOUR + INTERVAL m.mm MINUTE AS tmi
FROM ( SELECT 0 AS h UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11
) h
CROSS JOIN ( SELECT 0 AS h UNION ALL SELECT 12 ) r
CROSS JOIN ( SELECT 0 AS mm UNION ALL SELECT 30 ) m
ORDER BY tmi
) c
ORDER
BY c.tmi
(Inline view c is a standin for a clock table, returns time values on thirty minute boundaries.)
That's kind of ugly. We can see where if we had a rowsource of just integer values, we could make this much simpler. But if we pick that apart, we can see how to extend the same pattern to generate fifteen minute intervals, or shorten it to generate two hour intervals.
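To make the clock-table pattern concrete, here is a minimal sketch in Python using the standard-library sqlite3 module rather than MySQL. The table layout, the sample rows, and the names slot and cnt_bot are assumptions made for illustration. The bucketing uses SQLite's strftime and rounds down to the half hour, whereas the query in the question rounds to the nearest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clock (hhmm TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE bot_activity (bot_id INTEGER, start_time TEXT)")

# Fill the clock table with the 48 half-hour boundaries of a day.
conn.executemany(
    "INSERT INTO clock (hhmm) VALUES (?)",
    [(f"{h:02d}:{m:02d}",) for h in range(24) for m in (0, 30)],
)

# Hypothetical sample activity: nothing falls in the 07:00 slot.
conn.executemany(
    "INSERT INTO bot_activity (bot_id, start_time) VALUES (?, ?)",
    [
        (1, "2020-01-01 06:40:00"),
        (2, "2020-01-01 06:55:00"),
        (3, "2020-01-01 07:35:00"),
    ],
)

rows = conn.execute(
    """
    SELECT c.hhmm, COALESCE(r.cnt_bot, 0) AS cnt_bot
      FROM clock c
      LEFT JOIN (
            -- the query with the missing rows
            SELECT s.slot, COUNT(s.bot_id) AS cnt_bot
              FROM (
                    SELECT bot_id,
                           strftime('%H', start_time) || ':' ||
                           CASE WHEN CAST(strftime('%M', start_time) AS INTEGER) < 30
                                THEN '00' ELSE '30' END AS slot
                      FROM bot_activity
                   ) s
             GROUP BY s.slot
           ) r
        ON r.slot = c.hhmm
     ORDER BY c.hhmm
    """
).fetchall()

counts = dict(rows)  # maps 'HH:MM' to the count for that slot
```

Every slot from the clock table appears in the result; slots without activity, such as 07:00 here, come back with a count of zero because of the LEFT JOIN and COALESCE.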
I need an Amazon Redshift SQL query to count how many times a particular day of the month falls between two dates.
Date format: YYYY-MM-DD
For example: start date = 2019-06-14, end date = 2019-10-09, day = the 2nd of every month.
I want to count how many times the 2nd falls between 2019-06-14 and 2019-10-09.
The result for the above example should be 4, since the 2nd falls four times between 2019-06-14 and 2019-10-09.
I tried Redshift's DATEDIFF and MONTHS_BETWEEN functions, but could not work out the logic, since I don't understand what the math should be.
It seems to me that you want to select from a calendar table. That's how you can solve your problem. You'll notice that the query looks a little hacky because Redshift does not support any functions to generate sequences, which leaves you with creating sequence tables yourself (see seq_10 and seq_1000). Once you have a sequence, you can easily create a calendar with all the information you need (e.g. day_of_month).
That's the query answering your question:
WITH seq_10 as (
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1
), seq_1000 as (
select
row_number() over () - 1 as n
from
seq_10 a cross join
seq_10 b cross join
seq_10 c
), calendar as (
select '2018-01-01'::date + n as date,
extract(day from date) as day_of_month,
extract(dow from date) as day_of_week
from seq_1000
)
select count(*) from calendar
where day_of_month = 2
and date between '2019-06-14' and '2019-10-09'
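As a cross-check of the calendar idea outside the database, here is a minimal Python sketch that walks every date in the range and counts those whose day of the month matches. The function name count_day_of_month is made up for this example.

```python
from datetime import date, timedelta


def count_day_of_month(start: date, end: date, day_of_month: int) -> int:
    """Count the dates in [start, end] whose day of the month equals day_of_month."""
    total_days = (end - start).days + 1
    return sum(
        1
        for n in range(total_days)
        if (start + timedelta(days=n)).day == day_of_month
    )


result = count_day_of_month(date(2019, 6, 14), date(2019, 10, 9), 2)
print(result)  # 4 (July 2, August 2, September 2, October 2)
```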
I am storing hourly results in a MySQL database table which take the form:
ResultId,CreatedDateTime,Keyword,Frequency,PositiveResult,NegativeResult
349,2015-07-17 00:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 01:00:00,Homer Simpson,3.0,4.0,-2.0
349,2015-07-17 01:00:00,Homer Simpson,1.0,1.0,-1.0
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,8.0,3.0,-2.0
349,2015-07-17 05:00:00,Homer Simpson,1.0,0.0,0.0
Where there might be several results for a given hour, but none for certain hours.
If I want to produce averages of the hourly results, I can do something like this:
SELECT ItemCreatedDateTime AS 'Created on',
KeywordText AS 'Keyword', ROUND(AVG(KeywordFrequency), 2) AS 'Average frequency',
ROUND(AVG(PositiveResult), 2) AS 'Average positive result',
ROUND(AVG(NegativeResult), 2) AS 'Average negative result'
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime BETWEEN '2015-07-13 00:00:00' AND '2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime
ORDER BY KeywordText, CreatedDateTime
However, the results only include the hours where data exists, e.g.:
349,2015-07-17 01:00:00,Homer Simpson,2.0,2.5,-1.5
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,4.5,1.5,-1.0
But I need to show blank rows for the missing hours, e.g.
349,2015-07-17 01:00:00,Homer Simpson,2.0,2.5,-1.5
349,2015-07-17 02:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 03:00:00,Homer Simpson,0.0,0.0,0.0
349,2015-07-17 04:00:00,Homer Simpson,1.0,1.0,0.0
349,2015-07-17 05:00:00,Homer Simpson,4.5,1.5,-1.0
Short of inserting blanks into the results before they are presented, I am uncertain of how to proceed: can I use MySQL to include the blank rows at all?
SQL in general has no knowledge of data that isn't there, so you have to add that yourself. In this case you will have to insert the unused hours somehow. This can be done by inserting empty rows or, a bit differently, by counting the hours and adjusting your average accordingly.
Counting the hours and adjusting the average:
Count all hours with data (A)
Calculate the number of hours in the period (B)
Calculate the avg as you already did, multiply by A divide by B
Example code to get the hours:
SELECT COUNT(*) AS number_of_records_with_data,
(TO_SECONDS('2015-07-19 23:59:00')-TO_SECONDS('2015-07-13 00:00:00'))/3600
AS number_of_hours_in_interval
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime
BETWEEN '2015-07-13 00:00:00' AND '2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime;
And just integrate it with the rest of your query.
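The arithmetic of the counting approach can be sketched in Python. The (hour, frequency) pairs are the sample rows from the question; the six-hour period (00:00 through 05:59) is an assumption chosen to keep the example small.

```python
from collections import defaultdict

# (hour, frequency) pairs for 2015-07-17, from the sample rows in the question.
records = [(0, 0.0), (1, 3.0), (1, 1.0), (4, 1.0), (5, 8.0), (5, 1.0)]

# Average per hour, as the GROUP BY query would produce.
by_hour = defaultdict(list)
for hour, freq in records:
    by_hour[hour].append(freq)
hourly_avg = {hour: sum(values) / len(values) for hour, values in by_hour.items()}

hours_with_data = len(hourly_avg)  # A
hours_in_period = 6                # B (assumed period: 00:00 through 05:59)

# Average over the hours that have data, then scale by A / B so that
# the missing hours count as zero.
avg_over_data_hours = sum(hourly_avg.values()) / hours_with_data
adjusted_avg = avg_over_data_hours * hours_with_data / hours_in_period
```

With this data the average over the four hours that have rows is 1.875, and the adjusted average over all six hours is 1.25, the same value you would get by summing the hourly averages and dividing by six.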
You can't use MySQL for that. You'll have to do this with whatever you're using later to process the results. Iterate over the range of hours/dates you're interested in and, for those where MySQL returned some data, use that data. For the rest, just add null/zero values.
Small update after some discussions with my Stack Overflow colleagues:
Instead of "you can't" I should have written "you shouldn't": as other users have shown, there are ways to do this. But I still believe that for different tasks we should use the tools that were created with those tasks in mind. By that I mean that while it's probably possible to tow a car with an F-16, it's still better to just call a tow truck ;) That's what tow trucks are made for.
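A minimal sketch of that application-side loop in Python, assuming the query results have already been loaded into a dictionary keyed by hour. The function name fill_missing_hours and the tuple layout (frequency, positive, negative) are made up for illustration.

```python
from datetime import datetime, timedelta


def fill_missing_hours(rows, start, end, empty=(0.0, 0.0, 0.0)):
    """Return one (hour, values) pair per hour in [start, end].

    Hours that are missing from `rows` get the `empty` placeholder.
    """
    filled = []
    current = start
    while current <= end:
        filled.append((current, rows.get(current, empty)))
        current += timedelta(hours=1)
    return filled


# Averages as returned by the query in the question, keyed by hour.
query_rows = {
    datetime(2015, 7, 17, 1): (2.0, 2.5, -1.5),
    datetime(2015, 7, 17, 4): (1.0, 1.0, 0.0),
    datetime(2015, 7, 17, 5): (4.5, 1.5, -1.0),
}

filled = fill_missing_hours(
    query_rows, datetime(2015, 7, 17, 1), datetime(2015, 7, 17, 5)
)
```

Here filled has five entries, with 02:00 and 03:00 carrying the zero placeholder.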
Although you already have accepted an answer I want to demonstrate how you can generate a datetime series in the query and use that to solve your problem.
This query uses a combination of cross joins together with basic arithmetic and date functions to generate a series of all hours between 2015-07-16 00:00:00 AND 2015-07-18 23:59:00.
Generating this type of data on the fly isn't the best option though; if you already had a table with the numbers 0-31 then all the union queries would be unnecessary.
See this SQL Fiddle to see how it could look using a small number table.
Sample SQL Fiddle with a demo of the query below
select
c.createddate as "Created on",
c.Keyword,
coalesce(ROUND(AVG(KeywordFrequency), 2),0.0) AS 'Average frequency',
coalesce(ROUND(AVG(PositiveResult), 2),0.0) AS 'Average positive result',
coalesce(ROUND(AVG(NegativeResult), 2),0.0) AS 'Average negative result'
from (
select
q.createddate + interval d day + interval t hour as createddate,
d.KeywordText AS 'Keyword'
from (
select distinct h10*10+h1 d from (
select 0 as h10
union all select 1 union all select 2 union all select 3
) d10 cross join (
select 0 as h1
union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) d1
) days cross join (
select distinct t10*10 + t1 t from (
select 0 as t10 union all select 1 union all select 2
) h10 cross join (
select 0 as t1
union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) h1
) hours
cross join
-- use the following line to set the start date for the series
(select '2015-07-16 00:00:00' createddate) q
-- or use the line below to use the dates in the table
-- (select distinct cast(CreatedDateTime as date) CreatedDate from results) q
cross join (select distinct KeywordText from results) d
) c
left join results r on r.CreatedDateTime = c.createddate AND ResultsNo = 349 and r.KeywordText = c.Keyword
where c.createddate BETWEEN '2015-07-16 00:00:00' AND '2015-07-18 23:59:00'
GROUP BY c.createddate, Keyword
ORDER BY c.createddate, Keyword;
I came up with an idea: append rows of null values to the end of your MySQL query.
Just run this query (set the LIMIT to the number of empty rows you want), and ignore the last column:
SELECT ItemCreatedDateTime AS 'Created on',
KeywordText AS 'Keyword',
ROUND(AVG(KeywordFrequency), 2) AS 'Average frequency',
ROUND(AVG(PositiveResult), 2) AS 'Average positive result',
ROUND(AVG(NegativeResult), 2) AS 'Average negative result',
null
FROM Results
WHERE ResultsNo = 349 AND CreatedDateTime BETWEEN '2015-07-13 00:00:00' AND
'2015-07-19 23:59:00'
GROUP BY KeywordText, CreatedDateTime
UNION
SELECT * FROM (
SELECT null a, null b, null c, null d, null e,
(@cnt := @cnt + 1) f
FROM (SELECT null FROM Results LIMIT 23) empty1
LEFT JOIN (SELECT * FROM Results LIMIT 23) empty2 ON FALSE
JOIN (SELECT @cnt := 0) empty3
) empty
ORDER BY KeywordText, CreatedDateTime
I want to get all records which are not "older" than 20 days. If there are no records within 20 days, I want all records from the most recent day. I'm doing this:
SELECT COUNT(DISTINCT t.id) FROM t
WHERE
(DATEDIFF(NOW(), t.created) <= 20
OR
(date(t.created) >= (SELECT max(date(created)) FROM t)));
This works so far, but it is awfully slow. created is a datetime, so the slowness might be due to the conversion to a date... Any ideas how to speed this up?
SELECT COUNT(*) FROM (
SELECT * FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT * FROM (SELECT * FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
I used the between clause just in case there are dates in the future in the table. These will be excluded. Also, if you just need the count, you can simplify the select to
SELECT COUNT(*) FROM (
SELECT id FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT id FROM (SELECT id FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
otherwise, in the first select version you can leave out the outer select if you want all the data of the chosen records. The UNION will make sure that duplicates will be excluded (in other cases I always use UNION ALL since it is faster).
I need to display the total of 'orders' for each year and month. But for some months there is no data, but I DO want to display that month (with a total value of zero). I could make a helpertable 'months' with 12 records for each year, but is there maybe a way to get a range of months, without introducing a new table?
Something like:
SELECT [all year-month combinations between january 2000 and march 2011]
FROM DUAL AS years_months
Does anybody have an idea how to do this? Can you use SELECT with some kind of formula, to 'create' data on the fly?!
UPDATE:
Found this myself:
generate days from date range
The accepted answer in this question is kind of what I'm looking for. Maybe not the easiest method, but it does what I want: fill a select with data, based on a formula....
To 'create' a table on the fly with all months of the last 10 years:
SELECT CONCAT(MONTHNAME(datetime), ' ' , YEAR(datetime)) AS YearMonth,
MONTH(datetime) AS Month,
YEAR(datetime) AS Year
FROM (
select (curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) MONTH) as datetime
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
LIMIT 120
) AS t
ORDER BY datetime ASC
I must admit, this is VERY exotic, but it DOES work...
I can use this select to join it with my 'orders'-table and get the totals for each month, even when there is no data in a certain month.
But using a 'numbers' or 'calendar' table is probably the best option, so I'm going to use that.
If at all possible, try to stay away from generating data on the fly. It makes very simple queries ridiculously complex, but above all: it confuses the optimizer to no end.
If you need a series of integers, use a static table of integers. If you need a series of dates, months or whatever, use a calendar table. Unless you are dealing with some truly extraordinary requirements, a static table is the way to go.
I gave an example on how to create a table of numbers and a minimal calendar table(only dates) in this answer.
If you have those tables in place, it becomes easy to solve your query.
Aggregate the order data to MONTH.
Right join to the table of months (or distinct MONTH from the table of dates)
You could try something like this
select * from
(select 2000 as year union
select 2001 as year union
select 2009
) as years,
(select 1 as month union
select 2 as month union
select 3 as month union
select 4 as month union
select 5 as month union
select 6 as month union
select 7 as month union
select 8 as month union
select 9 as month
)
AS months
WHERE year between 2001 AND 2008 OR (year=2000 and month>0) OR (year = 2009 AND month < 4)
ORDER by year,month
You could just fill in the missing months after you've done your query in your application logic.
You should most definitely do this in your application rather than the DB layer. Simply create an array of dates for the time range, and merge the actual data with the empty dates you pre-created. See this answer to a similar question
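A minimal Python sketch of that merge, assuming the order totals come back from the database keyed by (year, month). The sample totals and the helper name month_range are hypothetical.

```python
def month_range(start_year, start_month, end_year, end_month):
    """Yield (year, month) pairs from the start month to the end month, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Hypothetical totals, as a GROUP BY year, month query might return them.
order_totals = {(2000, 1): 12, (2000, 3): 7, (2011, 2): 5}

# Every month from January 2000 through March 2011, with zero where no data exists.
report = [(ym, order_totals.get(ym, 0)) for ym in month_range(2000, 1, 2011, 3)]
```

The report contains one entry per month in the range, including the months for which the query returned nothing.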
I use the following query to generate months in a given interval. In my case it generates a list of months starting from May 2013 until now.
SELECT date_format(@dt:= DATE_ADD( @dt, INTERVAL 1 MONTH),'%M %Y') date_string,
@dt as date_full
FROM (SELECT @dt := DATE_SUB(CAST(DATE_FORMAT('2013-05-01' ,'%Y-%m-01') AS DATE),
INTERVAL 1 MONTH) ) vars,
your_tables
WHERE @dt<NOW()
The catch is that it must be joined with a table containing enough rows to supply the number of months you expect. E.g. if you need to generate all months in a particular year, you will need a table with at least 12 rows.
For me it is fairly straightforward. I joined it with my configuration table, which has around 370 rows, so it can generate the months in a year, or the days in a year if I need them. Switching from months to days is easy, as I only need to change the interval from MONTH to DAY.
If you're using PostgreSQL, you can combine both date_trunc and generate_series to do some very fun grouping and series generation.
For example, you could use this to generate a table of all dates in the last year:
SELECT current_date - s.a as date
FROM generate_series(0,365,1) as s(a);
Then, you could use date_trunc to grab the months and group by that date_trunc'ed field:
SELECT date(date_trunc('month', series.date)) as month, COUNT(*) as days
FROM (SELECT current_date - s.a as date
FROM generate_series(0,365,1) as s(a)) series
GROUP BY month;
Create a table (e.g. tblMonths) that includes all 12 months and use a LEFT JOIN (or RIGHT JOIN) on it and your partial source data.
Check out the reference and this tutorial for how this works.
I would do something like this:
SELECT MONTH(Orders.DateOrdered) AS OrderMonth, COUNT(Orders.OrderID) AS OrderCount
FROM Orders
WHERE YEAR(Orders.DateOrdered) > 2000
GROUP BY MONTH(Orders.DateOrdered)
This will give you the number of orders grouped by each month.
Then in your application simply assign a zero to the months for which no data was returned.
I hope this helps.
Querying static data in MySQL.
You can select static data from a hardcoded list, as if it were a table, with this query (the VALUES ROW(...) syntax requires MySQL 8.0.19 or later):
SELECT *
FROM (
values row('Hamza','23'), row('Ali', '24')
) t1 (name, age);
Data:
values date
14 1.1.2010
20 1.1.2010
10 2.1.2010
7 4.1.2010
...
A sample query for January 2010 should return 31 rows, one for every day, with the values for each day added up. Right now I could do this with 31 queries, but I would like it to work with one. Is it possible?
results:
1. 34
2. 10
3. 0
4. 7
...
This is actually surprisingly difficult to do in SQL. One way to do it is to have a long select statement with UNION ALLs to generate the numbers from 1 to 31. This demonstrates the principle but I stopped at 4 for clarity:
SELECT MonthDate.Date, COALESCE(SUM(`values`), 0) AS Total
FROM (
SELECT 1 AS Date UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
-- ... (5 through 27 omitted for brevity)
SELECT 28 UNION ALL
SELECT 29 UNION ALL
SELECT 30 UNION ALL
SELECT 31) AS MonthDate
LEFT JOIN Table1 AS T1
ON MonthDate.Date = DAY(T1.Date)
AND MONTH(T1.Date) = 1 AND YEAR(T1.Date) = 2010
WHERE MonthDate.Date <= DAY(LAST_DAY('2010-01-01'))
GROUP BY MonthDate.Date
It might be better to use a table to store these values and join with it instead.
Result:
1, 34
2, 10
3, 0
4, 7
Given that for some dates you have no data, you'll need to fill in the gaps. One approach to this is to have a calendar table prefilled with all dates you need, and join against that.
If you want the results to show day numbers as you show in your question, you could prepopulate these in your calendar too as labels.
You would join your data table date field to the date field of the calendar table, group by that field, and sum values. You might want to specify limits for the range of dates covered.
So you might have:
CREATE TABLE Calendar (
label varchar(10),
cal_date date,
primary key ( cal_date )
)
Query:
SELECT
c.label,
COALESCE( SUM( d.`values` ), 0 ) AS total
FROM
Calendar c
LEFT JOIN
Data_table d
ON d.date_field = c.cal_date
WHERE
c.cal_date BETWEEN '2010-01-01' AND '2010-01-31'
GROUP BY
c.cal_date, c.label
ORDER BY
c.cal_date
Update:
I see you have datetimes rather than dates. You could just use the MySQL DATE() function in the join, but that would probably not be optimal. Another approach would be to have start and end times in the Calendar table defining a 'time bucket' for each day.
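A runnable sketch of the calendar-table approach, using Python's standard-library sqlite3 module instead of MySQL. The table and column names are assumptions, and the value column is named value rather than values to avoid quoting a reserved word. The join is a LEFT JOIN from the calendar so that days without data still produce a row.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (label INTEGER, cal_date TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE data_table (value INTEGER, date_field TEXT)")

# One calendar row per day of January 2010; the label is the day number.
first = date(2010, 1, 1)
conn.executemany(
    "INSERT INTO calendar (label, cal_date) VALUES (?, ?)",
    [(n + 1, (first + timedelta(days=n)).isoformat()) for n in range(31)],
)

# The sample data from the question (dates given there as day.month.year).
conn.executemany(
    "INSERT INTO data_table (value, date_field) VALUES (?, ?)",
    [(14, "2010-01-01"), (20, "2010-01-01"), (10, "2010-01-02"), (7, "2010-01-04")],
)

totals = conn.execute(
    """
    SELECT c.label, COALESCE(SUM(d.value), 0) AS total
      FROM calendar c
      LEFT JOIN data_table d ON d.date_field = c.cal_date
     WHERE c.cal_date BETWEEN '2010-01-01' AND '2010-01-31'
     GROUP BY c.cal_date, c.label
     ORDER BY c.cal_date
    """
).fetchall()

print(totals[:4])  # [(1, 34), (2, 10), (3, 0), (4, 7)]
```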
This works for me... It's a modification of a query I found on another site. The "INTERVAL 1 MONTH" clause ensures I get the current month's data, including zeros for days that have no hits. Change this to "INTERVAL 2 MONTH" to get last month's data, etc.
I have a table called "payload" with a column "timestamp". I'm then joining the timestamp column onto the dynamically generated dates, casting it so that the dates match in the ON clause.
SELECT `calendarday`,COUNT(P.`timestamp`) AS `cnt` FROM
(SELECT @tmpdate := DATE_ADD(@tmpdate, INTERVAL 1 DAY) `calendarday`
FROM (SELECT @tmpdate :=
LAST_DAY(DATE_SUB(CURDATE(),INTERVAL 1 MONTH)))
AS `dynamic`, `payload`) AS `calendar`
LEFT JOIN `payload` P ON DATE(P.`timestamp`) = `calendarday`
GROUP BY `calendarday`
To dynamically get the dates within a date range using SQL you can do this (example in mysql):
Create a table to hold the numbers 0 through 9.
CREATE TABLE ints ( i tinyint(4) );
insert into ints (i)
values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Run a query like so:
select ((curdate() - interval 2 year) + interval (t.i * 100 + u.i * 10 + v.i) day) AS Date
from
ints t
join ints u
join ints v
having Date between '2015-01-01' and '2015-05-01'
order by t.i, u.i, v.i
This will generate all dates between Jan 1, 2015 and May 1, 2015.
Output
2015-01-01
2015-01-02
2015-01-03
2015-01-04
2015-01-05
2015-01-06
...
2015-05-01
The query joins the table ints 3 times and gets an incrementing number (0 through 999). It then adds this number as a day interval starting from a certain date, in this case a date 2 years ago. Any date range from 2 years ago and 1,000 days ahead can be obtained with the example above.
To generate a query that generates dates for more than 1,000 days simply join the ints table once more to allow for up to 10,000 days of range, and so forth.
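The digit-combination idea behind the ints joins can be sketched in Python, with itertools.product standing in for the cross joins. The function name dates_between and the fixed base date (used in place of curdate() - interval 2 year so that the result is reproducible) are assumptions for illustration.

```python
from datetime import date, timedelta
from itertools import product


def dates_between(base, start, end, digits=3):
    """Generate base + n days for n in 0 .. 10**digits - 1, keeping dates in [start, end]."""
    result = []
    # Each combination is one row of the cross join of `digits` copies of 0..9.
    for combo in product(range(10), repeat=digits):
        # Combine the digits into one offset, e.g. (1, 2, 3) -> 123.
        offset = 0
        for digit in combo:
            offset = offset * 10 + digit
        day = base + timedelta(days=offset)
        if start <= day <= end:
            result.append(day)
    return result


# A fixed base date stands in for curdate() - interval 2 year.
days = dates_between(date(2014, 1, 1), date(2015, 1, 1), date(2015, 5, 1))
```

Raising digits to 4 corresponds to joining the ints table once more, which extends the reachable range to 10,000 days from the base date.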
If I'm understanding the rather vague question correctly, you want to know the number of records for each date within a month. If that's true, here's how you can do it:
SELECT date_column, COUNT(value_column) FROM `table` WHERE date_column LIKE '2010-01-%' GROUP BY date_column