MySQL: group by date RANGE? - mysql

OK I have this query that groups 2 columns together quite nicely:
SELECT search_query_keyword, search_query_date, COUNT(1) as count
FROM search_queries
WHERE search_query_date >= '.$from.' AND search_query_date <= '.$to.'
GROUP BY search_query_keyword, search_query_date
ORDER BY count DESC
LIMIT 10
But what if I want to group by a date RANGE instead of just a date? Is there a way to do that?
Thanks!
EDIT: OK these answers are pretty complicated and I think what I want can be achieved a lot more easily, so let me re-explain. I want to select keywords over a time period, ">= 20090601 AND <= 20090604" for example. But instead of getting repeated keywords I would rather get each keyword once along with how many times it occurred. So for example instead of this:
keyword: foo
keyword: foo
keyword: foo
keyword: bar
keyword: bar
I would get:
keyword: foo, count: 3
keyword: bar, count: 2

I'm not exactly sure about the date range grouping -- you'd have to define the date ranges that you want, and then maybe you could UNION those queries:
SELECT
'Range 1' AS 'date_range',
search_query_keyword
FROM search_queries
WHERE search_query_date >= '.$fromRange1.' AND search_query_date <= '.$toRange1.'
UNION
SELECT
'Range 2' AS 'date_range',
search_query_keyword
FROM search_queries
WHERE search_query_date >= '.$fromRange2.' AND search_query_date <= '.$toRange2.'
GROUP BY 1,2
Or if you wanted to put them within a grouping of how many days old like "30 days, 60 days, etc" you could do this:
SELECT
FLOOR(DATEDIFF(NOW(), search_query_date) / 30) AS date_group,
search_query_keyword
FROM search_queries
GROUP BY date_group, search_query_keyword
EDIT: Based on the further information you provided, this query should produce what you want:
SELECT
search_query_keyword,
COUNT(search_query_keyword) AS keyword_count
FROM search_queries
WHERE search_query_date >= '.$from.' AND search_query_date <= '.$to.'
GROUP BY search_query_keyword

You could group on a CASE statement or on the result of a function. For instance:
SELECT search_query_keyword, QUARTER(search_query_date), COUNT(1) as count
FROM search_queries
WHERE search_query_date >= '.$from.' AND search_query_date <= '.$to.'
GROUP BY search_query_keyword, QUARTER(search_query_date)
ORDER BY count DESC
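The CASE option mentioned above could look something like the following. This is only a sketch: the bucket labels, the week boundaries, and the literal dates are made up for illustration, and it assumes search_query_date is a DATE column.

```sql
-- Group keywords into hand-defined date ranges with a CASE expression.
-- The labels and boundaries below are illustrative assumptions.
SELECT search_query_keyword,
       CASE
         WHEN search_query_date BETWEEN '2009-06-01' AND '2009-06-07' THEN 'week 1'
         WHEN search_query_date BETWEEN '2009-06-08' AND '2009-06-14' THEN 'week 2'
         ELSE 'later'
       END AS date_range,
       COUNT(*) AS keyword_count
FROM search_queries
WHERE search_query_date >= '2009-06-01' AND search_query_date <= '2009-06-30'
GROUP BY search_query_keyword, date_range   -- MySQL allows the alias here
ORDER BY keyword_count DESC;
```

Each keyword then appears once per range it occurs in, with its count for that range.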

Look into the different DATE-based functions and build on those, such as
select YEAR(your_date) + MONTH(your_date) as ByYrMonth
but the result in the above case would need to be converted to a character string; otherwise a year of 2009 + January (month 1) = 2010 would be falsely grouped with 2008 + February (month 2) = 2010, etc. Your string should end up as something like:
...
200811
200812
200901
200902
200903
...
If you wanted calendar quarters, you would take the INTEGER part of (month - 1) divided by 3, so...
Jan (-1) = 0 / 3 = 0
Feb (-1) = 1 / 3 = 0
Mar (-1) = 2 / 3 = 0
Apr (-1) = 3 / 3 = 1
May (-1) = 4 / 3 = 1
June (-1)= 5 / 3 = 1 ... etc...
Yes, a previous example explicitly references the QUARTER() function, which handles this more nicely, but if you are also grouping by age, such as 30, 60, 90 days, you could apply similar math to the above and divide the age in days by 30 for your groups.
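To make the arithmetic concrete, here is a small self-contained sketch that runs against three literal dates rather than the real table (the dates and column aliases are made up for illustration). It shows the colliding summed key, the year-month string built with DATE_FORMAT(), and the manual quarter index next to QUARTER(). Quarters are three months long, so the integer division is by 3.

```sql
SELECT d.dt,
       YEAR(d.dt) + MONTH(d.dt)    AS summed_key,    -- collides across years
       DATE_FORMAT(d.dt, '%Y%m')   AS by_yr_month,   -- unique per year and month
       (MONTH(d.dt) - 1) DIV 3     AS quarter_index, -- 0-based quarter
       QUARTER(d.dt)               AS quarter_fn     -- 1-based, built in
FROM (
  SELECT DATE('2008-02-15') AS dt
  UNION ALL SELECT DATE('2009-01-15')
  UNION ALL SELECT DATE('2009-05-15')
) AS d;
-- 2008-02-15 | 2010 | 200802 | 0 | 1
-- 2009-01-15 | 2010 | 200901 | 0 | 1
-- 2009-05-15 | 2014 | 200905 | 1 | 2
```

The first two rows share summed_key 2010 but get distinct by_yr_month strings, which is why the string form is the safer grouping key.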

Related

MySQL ORDER BY FIELD for months

I have a table called months - this contains all 12 months of the calendar, the IDs correspond to the month number.
I will be running a query to retrieve 2 or 3 sequential months from this table, e.g
April & May
June, July, August
December & January
However I want to ensure that whenever December and January are retrieved, it retrieves them in that order, and not January - December. Here is what I have tried:
SELECT * FROM `months`
WHERE start_date BETWEEN <date1> AND <date2>
ORDER BY
FIELD(id, 12, 1)
This works for December & January, but now when I try to retrieve January & February it does those in the wrong order, i.e "February - January" - I'm guessing because we specified 1 in the ORDER BY as the last value.
Anybody know the correct way to achieve this? As I mentioned this should also work for 3 months, so for example "November, December, January" and "December, January, February" should all be retrieved in that order.
If you want December first, but the other months in order, then:
order by (id = 12) desc, id
MySQL treats booleans as numbers, with "1" for true and "0" for false. The desc puts the 12s first.
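A quick way to see the boolean sort key in action, using an inline list of ids instead of the real months table (the derived table m is just a stand-in for illustration):

```sql
SELECT m.id,
       (m.id = 12) AS is_december   -- 1 for December, 0 otherwise
FROM (
  SELECT 1 AS id
  UNION ALL SELECT 2
  UNION ALL SELECT 12
) AS m
ORDER BY (m.id = 12) DESC, m.id;
-- id order: 12, 1, 2
```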
EDIT:
To handle the more general case, you can use window functions. Assuming the numbers are consecutive, then the issue is trickier. This will work for 2 and 3-month spans:
order by (case when min(id) over () > 1 then id end),
(case when id > 6 then 1 else 2 end),
id
I'm reluctant to think about a more general solution based only on months. After all, you can just use:
order by start_date
Or, if you have an aggregation query:
order by min(start_date)
to solve the real problem.
This is not properly a "MySQL solution":
with cte (id, month) AS (
select id, month from months
union all
select id, month from months
)
, cte1 (id, month, r) as (select id, month, row_number() over() as r from cte )
select * from cte1
where id in (12, 1)
and r >= 12 order by r limit 2 ;
-- SQL Server (T-SQL) syntax
DECLARE
@monthfrom int = 12,
@monthto int = 1;
with months as (select 1 m
union all
select m+1 from months where m<12)
select m
from months
where m in (@monthfrom,@monthto)
order by
case when @monthfrom>@monthto
then
m%12
else
m
end
result:
12
1
Basically in MySQL this can be done the same way:
set @from = 12;
set @to = 1;
with recursive months(m) as (
select 1 m
union all
select m+1 from months where m<12)
select *
from months
where m in (@from,@to)
order by case when @from>@to then m%12 else m end;

How to select records but exclude if one type is outside a subquery?

We have multiple invStatus values (1-10) and want to exclude only one status type (1), BUT only those rows of that type that are older than X days. So all records will show EXCEPT those whose invStatus = 1 and that are older than X days. Records with invStatus = 1 that are younger than X days will be included in the recordset.
Do I select all records generically, then in a subquery filter those of status = 1 that are older than X days?
The query below uses NOT IN in an attempt to select those records to exclude but it is not working and also seems to be inefficient as it takes a couple seconds to execute.
SELECT
tblinventory.invId,
tblinventory.invTitle,
tblinventory.invStatus,
tblhouseinfo.Address,
tblhouseinfo.City,
tblhouseinfo.`State`,
tblhouseinfo.Zip,
tblhouseinfo.Update_date,
CURRENT_DATE() - INTERVAL 10 DAY AS dateEx
FROM
tblinventory
LEFT OUTER JOIN tblhouseinfo ON tblinventory.invId = tblhouseinfo.addInfoID
WHERE
invReleased = 0
AND invStatus NOT IN (SELECT invId from tblhouseinfo WHERE invStatus = 1
AND tblhouseinfo.Update_date < CURRENT_DATE() - INTERVAL 10 DAY )
ORDER BY
`tblhouseinfo`.`Update_date` DESC
I could filter the results with PHP on the page level but this also seems less than efficient and would prefer to perform this task using the best practices.
UPDATE:
There are a total of 155 rows.
All tblhouseinfo.Update_date (timestamp) values are "2017-09-06 10:53:17" (Sept 6th) except three that I changed for testing to "2017-07-06 10:53:17" (July 6th).
Using the suggested condition:
AND NOT (invStatus = 1 AND tblhouseinfo.Update_date > CURRENT_DATE() - INTERVAL 10 DAY )
60 records are excluded not the expected 3.
"2017-08-28" is the current result from CURRENT_DATE() - INTERVAL 10 DAY which should be within the 10 day range to select "2017-09-06 10:53:17" and only exclude the three records that are "2017-07-06 10:53:17"
FINAL WORKING SOLUTION/Query:
SELECT
tblinventory.invId,
tblinventory.invTitle,
tblinventory.invStatus,
tblhouseinfo.Address,
tblhouseinfo.City,
tblhouseinfo.`State`,
tblhouseinfo.Zip,
tblhouseinfo.Update_date,
CURRENT_DATE() - INTERVAL 10 DAY AS dateEx
FROM
tblinventory
LEFT OUTER JOIN tblhouseinfo ON tblinventory.invId = tblhouseinfo.addInfoID
WHERE
invReleased = 0
AND NOT (invStatus = 1 AND tblhouseinfo.Update_date < CURRENT_DATE() - INTERVAL 10 DAY )
ORDER BY
`tblhouseinfo`.`Update_date` DESC
SELECT
tblinventory.invId,
tblinventory.invTitle,
tblinventory.invStatus,
tblhouseinfo.Address,
tblhouseinfo.City,
tblhouseinfo.`State`,
tblhouseinfo.Zip,
tblhouseinfo.Update_date,
CURRENT_DATE() - INTERVAL 10 DAY AS dateEx
FROM
tblinventory
LEFT OUTER JOIN tblhouseinfo ON tblinventory.invId = tblhouseinfo.addInfoID
WHERE
invReleased = 0
AND NOT (invStatus = 1 AND tblhouseinfo.Update_date < CURRENT_DATE() - INTERVAL 10 DAY )
ORDER BY
`tblhouseinfo`.`Update_date` DESC
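To see how the NOT (...) condition treats each kind of row, here is a self-contained sketch over four made-up rows (the derived table t and the alias kept are illustrative, not part of the real schema). It also shows one thing to watch with the LEFT OUTER JOIN: if a status-1 row has no matching tblhouseinfo row, Update_date is NULL, the condition evaluates to NULL, and WHERE drops the row.

```sql
SELECT t.invStatus,
       t.Update_date,
       NOT (t.invStatus = 1
            AND t.Update_date < CURRENT_DATE() - INTERVAL 10 DAY) AS kept
FROM (
  SELECT 1 AS invStatus, CURRENT_DATE() - INTERVAL 30 DAY AS Update_date
  UNION ALL SELECT 1, CURRENT_DATE() - INTERVAL 2 DAY
  UNION ALL SELECT 2, CURRENT_DATE() - INTERVAL 30 DAY
  UNION ALL SELECT 1, NULL
) AS t;
-- status 1, 30 days old -> kept = 0    (excluded)
-- status 1,  2 days old -> kept = 1
-- status 2, 30 days old -> kept = 1
-- status 1, no date     -> kept = NULL (also excluded by WHERE)
```

If status-1 rows without a tblhouseinfo match should still be shown, one option is to add OR tblhouseinfo.Update_date IS NULL to the WHERE clause.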
You don't need to select invId from the other table if you know you never want invStatus 1. But you can also add an AND condition for the number of days.
I always use timestamps (in UNIX) for recording data entry / modification.
AND (timestamp >= beginTimestamp AND timeStamp <= endTimestamp)

Aggregating table data in MySQL, is there an easier way to do this?

I'm trying to write a query that aggregates data from a table.
Essentially I have a long list of devices that have been inventoried and eventually installed over the last couple of years.
I want to find the average amount of time between when the device was received and when it was installed, and then have that data sorted by the month the device was installed. BUT in each month's row, I also want to include the data from the previous months.
So essentially what I want to see is: (sorry for terrible formatting)
MonthInstalled    | TimeToInstall | Total#Devices
------------------+---------------+---------------------------
Jan               | 10 Days       | 5
Feb(=Jan+Feb)     | 15 Days       | 18 (5 in Jan + 13 in Feb)
Mar(=Jan+Feb+Mar) | 13 Days       | 25 (5 + 13 + 7)
...
The query I currently have written looks like this:
INSERT INTO DevicesInstall
SELECT ROUND(AVG(DATEDIFF(dvc.dt_install , dvc.dt_receive)), 1) AS 'Install',
COUNT(dvc.dvc_model) AS 'Total Devices',
MAX(dvc.dt_install) AS 'Date',
loc.loc_campus AS 'Campus'
FROM dvc_info dvc, location loc
WHERE dvc.dvc_loc_bin = loc.loc_bin
AND dvc.dt_install < '20160201'
;
Although this is functional, I have to run it manually for each month, so it is not scalable. Is there a way to condense this at all?
We can return the dates using an inline view (derived table), and then join to the dvc_info table, so we can get the "cumulative" results.
To get the results for:
Jan
Jan+Feb
Jan+Feb+Mar
We need to return three copies of the rows for Jan, and two copies of the rows for Feb, and then collapse those rows into the appropriate groups.
The loc_campus is being included in the SELECT list... not clear why that is needed. If we want results "by campus", then we need to include that expression in the GROUP BY clause. Otherwise, the value returned for that non-aggregate is indeterminate... we will get a value for some row "in the group", but it could be any row.
Something like this:
SELECT d.dt AS `before_date`
, loc.loc_campus AS `Campus`
, ROUND(AVG(DATEDIFF(dvc.dt_install,dvc.dt_receive)),1) AS `Install`
, COUNT(dvc.dvc_model) AS `Total Devices`
, MAX(dvc.dt_install) AS `latest_dt_install`
FROM ( SELECT '2016-01-01' + INTERVAL 1 MONTH AS dt
UNION ALL SELECT '2016-01-01' + INTERVAL 2 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 3 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 4 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 5 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 6 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 7 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 8 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 9 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 10 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 11 MONTH
UNION ALL SELECT '2016-01-01' + INTERVAL 12 MONTH
) d
CROSS
JOIN location loc
LEFT
JOIN dvc_info dvc
ON dvc.dvc_loc_bin = loc.loc_bin
AND dvc.dt_install < d.dt
GROUP
BY d.dt
, loc.loc_campus
ORDER
BY d.dt
, loc.loc_campus
Note that the value returned for d.dt will be the "up until" date. We're going to get '2016-02-01' returned for the January results. If we want to return a January date instead, we can use an expression in the SELECT list...
SELECT DATE_FORMAT(d.dt + INTERVAL -1 MONTH,'%Y-%m') AS `month`
Lots of options on query alternatives.
But it looks like the "big hump" is that to get cumulative results, we need to return multiple copies of the dvc_info rows, so the rows can be collapsed into each "grouping".
I recommend working on just the SELECT first. And get that tested working, before monkeying around to turn it into an INSERT ... SELECT.
FOLLOWUP
We can use any query as an inline view (derived table d) that returns a set of dates we want.
e.g.
FROM ( SELECT DATE_FORMAT(m.dt_install,'%Y-%m-01') + INTERVAL 1 MONTH AS dt
FROM dvc_info m
WHERE m.dt_install >= '2016-01-01'
GROUP BY DATE_FORMAT(m.dt_install,'%Y-%m-01') + INTERVAL 1 MONTH
) d
Note that with this approach, if there are no dt_install values in February, we won't get back a row for February. Using the static UNION ALL SELECT approach allows us to get back "zero" counts, i.e. to return rows for months where there isn't a dt_install in that month. (But that's the answer to a different question... how do I get back a "zero" count for February when there aren't any rows for February?)
Alternatively, if we have a calendar table e.g. cal that contains a list of the dates we want, we could just reference the table in place of the inline view, or the inline view query could get rows from that.
FROM ( SELECT cal.dt
FROM cal cal
WHERE cal.dt >= '2016-01-01'
AND cal.dt <= NOW()
AND DATE_FORMAT(cal.dt,'%d') = '01'
) d
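As a different technique from the join-based approach above, running totals can also be computed with window functions, which need MySQL 8.0 or later. The following is only a sketch under several assumptions: it ignores loc_campus, it assumes dt_install and dt_receive are both non-NULL for the rows that matter, and the aliases (install_month, total_days, device_count, and so on) are made up for illustration. It aggregates once per month and then accumulates the monthly sums, so the cumulative average is the running total of days divided by the running device count.

```sql
SELECT m.install_month,
       ROUND(SUM(m.total_days) OVER w / SUM(m.device_count) OVER w, 1)
         AS avg_days_to_date,
       SUM(m.device_count) OVER w AS devices_to_date
FROM (
  -- one row per install month
  SELECT DATE_FORMAT(dvc.dt_install, '%Y-%m') AS install_month,
         SUM(DATEDIFF(dvc.dt_install, dvc.dt_receive)) AS total_days,
         COUNT(*) AS device_count
  FROM dvc_info dvc
  WHERE dvc.dt_install IS NOT NULL
    AND dvc.dt_receive IS NOT NULL
  GROUP BY DATE_FORMAT(dvc.dt_install, '%Y-%m')
) AS m
WINDOW w AS (ORDER BY m.install_month)  -- default frame: start through current row
ORDER BY m.install_month;
```

Like the GROUP BY variant of the inline view, this returns no row for a month without installs.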

using <= or >= in timestamp field in mysql

I would like to select all records before 2014-03-22 date:
where date < 2014-03-22 // what I need
but below code doesn't see 2013 year's records :
SELECT * FROM `tractions` WHERE YEAR(date) <= 2014 AND MONTH(date) <= 3 and DAY(date) <= 22 and succ = 1
Is there anything wrong with:
SELECT * FROM tractions
WHERE date < '2014-03-22' -- place the date, correctly formatted, in quotes
Since this comparison doesn't apply any functions to the column, it will also allow MySQL to use any indexes set up on the date column.
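The reason the YEAR/MONTH/DAY version loses 2013 rows is that each part is tested on its own, so a date like 2013-12-25 fails MONTH(date) <= 3 even though it is earlier overall. A self-contained sketch over four made-up dates (the derived table d and the aliases are illustrative):

```sql
SELECT d.dt,
       (YEAR(d.dt) <= 2014 AND MONTH(d.dt) <= 3 AND DAY(d.dt) <= 22)
         AS part_by_part,
       (d.dt < '2014-03-22') AS whole_date
FROM (
  SELECT DATE('2013-12-25') AS dt
  UNION ALL SELECT DATE('2014-01-30')
  UNION ALL SELECT DATE('2014-03-10')
  UNION ALL SELECT DATE('2014-03-22')
) AS d;
-- 2013-12-25 | part_by_part 0 | whole_date 1   (month 12 > 3)
-- 2014-01-30 | part_by_part 0 | whole_date 1   (day 30 > 22)
-- 2014-03-10 | part_by_part 1 | whole_date 1
-- 2014-03-22 | part_by_part 1 | whole_date 0   (<= versus <)
```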

Mysql Count with DAYOFWEEK AND BETWEEN not working together

I am trying to count a value where I get the count from the weekday and from the first 3 months of the year.
SELECT
SUM(count_LP) AS down1
FROM
`splittest`
WHERE DAYOFWEEK(DATE) = 2
AND (
dato BETWEEN 2014-01-01
AND 2014-04-01
)
It stops working when I add the BETWEEN.
Any ideas?
To specify dates you have to quote them like '2014-01-01'
SELECT
SUM(count_LP) AS down1
FROM
`splittest`
WHERE DAYOFWEEK(DATE) = 2
AND (
dato BETWEEN '2014-01-01'
AND '2014-04-01'
)
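Without the quotes, MySQL reads 2014-01-01 as integer subtraction, so the original condition was effectively BETWEEN 2012 AND 2009, which can never be true because the lower bound is larger than the upper one. A short check (the column aliases and the sample date are made up for illustration):

```sql
SELECT 2014-01-01 AS unquoted_lower,   -- 2014 - 1 - 1 = 2012
       2014-04-01 AS unquoted_upper,   -- 2014 - 4 - 1 = 2009
       DATE('2014-02-03') BETWEEN '2014-01-01' AND '2014-04-01' AS in_range;
-- 2012 | 2009 | 1
```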