I have a MySQL table with these columns:
Date | Customer_ID
How can I turn it into:
Customer_ID | Count_Visits_Past_Week | Count_Visits_Past_Month | Count_Visits_Past_90days | Count_Total
Note: Count_Total = sum of the other three counts
Thanks
The first step is to determine the demarcation points for the specified date ranges.
There are several questions to answer here. Do you want to compare just the DATE ('yyyy-mm-dd') and disregard any time component?
By "past week", do you mean within the last seven days, so far since the previous Sunday, or the last full week, from Sunday through Saturday?
For "past month", does that mean the previous whole month, from the first through the end of the month? Or does it mean that if the query is run on the 20th of the month, we want dates since the 20th of the previous month up until today? Or yesterday?
Once we know the points in time that begin and end each specified period, relative to today's date, we can build expressions that evaluate to those dates.
For example, "past week" could be represented as the most recent seven day period:
DATE(NOW())-INTERVAL 1 WEEK -thru- DATE(NOW())
And "past month" can be represented as the same "day of month" (e.g. 17th) of the immediately preceding month up until today:
DATE(NOW())-INTERVAL 1 MONTH -thru- DATE(NOW())
That's really the first step, to determine the begin and end dates of each specified period.
Once we have that, we can move on to building a query that gets a "count" of rows with a date column that falls within each period.
The "trick" is to use conditional tests in expressions in the SELECT list of the query. We'll use those conditional tests to return a 1 if the row is to be included in the "count", and return 0 or NULL if the row should be excluded.
I prefer to use the SUM() aggregate function to get the "count", but it's also possible to use the COUNT() aggregate. (If we use COUNT(), we need an expression that returns NULL when the row is to be excluded. I prefer to return a 1 or 0; I think it makes debugging easier.)
Here's an outline of what a "count" query would look like.
SELECT t.Customer_Id
, SUM(IF( <some_condition> ,1,0)) AS Count_something
, SUM(IF( <other_condition> ,1,0)) AS Count_something_else
FROM mytable t
GROUP BY t.Customer_Id
When <some_condition> is true, we return a 1, otherwise we return 0.
To test the conditional expressions, it's often easiest to skip the aggregation and just return the individual rows.
That way, we can see which individual rows are going to be included in each "count".
For example:
SELECT t.Customer_ID
, t.date
, IF(t.date BETWEEN DATE(NOW())-INTERVAL 1 WEEK AND DATE(NOW()),1,0)
AS visit_past_week
, IF(t.date BETWEEN DATE(NOW())-INTERVAL 1 MONTH AND DATE(NOW()),1,0)
AS visit_past_month
FROM mytable t
ORDER BY t.date, t.Customer_Id
That query doesn't return the "count", it just returns the results of the expressions, which can be useful in testing. And of course we want to test the expressions that return the beginning and ending date of each period:
SELECT DATE(NOW()) - INTERVAL 1 WEEK AS past_week_begin
, DATE(NOW()) AS past_week_end
With this approach, the same row can be included in multiple "counts" with one query and one pass through the table.
Note that the expressions inside the aggregates in the query below take advantage of a convenient shorthand: in MySQL, an expression evaluated as a boolean returns 1 if TRUE, 0 if FALSE, or NULL if the result is unknown.
To use the COUNT aggregate, we need to ensure that the expression being aggregated returns a non-NULL value when the row is to be "counted", and NULL when the row is to be excluded from the count. So we use the convenient NULLIF function to return NULL if the value returned by the expression is zero.
SELECT t.Customer_ID
, COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 1 WEEK AND DATE(NOW()),0))
AS Count_Visits_Past_Week
, COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 1 MONTH AND DATE(NOW()),0))
AS Count_Visits_Past_Month
, COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 90 DAY AND DATE(NOW()),0))
AS Count_Visits_Past_90days
, COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 1 WEEK AND DATE(NOW()),0))
+ COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 1 MONTH AND DATE(NOW()),0))
+ COUNT(NULLIF( t.date BETWEEN DATE(NOW())-INTERVAL 90 DAY AND DATE(NOW()),0))
AS Count_Total
FROM mytable t
GROUP BY t.Customer_Id
NOTE: NULLIF(a,b) is a convenient shorthand for IF a = b THEN return NULL ELSE return a
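A quick way to check how NULLIF and COUNT interact is to run them on literals. This is only a sketch; the inline derived table x and its column v are made up for illustration.

```sql
-- NULLIF(a, b) returns NULL when a = b, otherwise it returns a.
SELECT NULLIF(0, 0) AS zero_becomes_null   -- NULL
     , NULLIF(1, 0) AS one_is_kept;        -- 1

-- COUNT(expr) skips NULLs, so only the two rows where v = 1 are counted.
SELECT COUNT(NULLIF(x.v, 0)) AS counted    -- 2
  FROM ( SELECT 1 AS v
         UNION ALL SELECT 0
         UNION ALL SELECT 1
       ) x;
```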
Returning the Count_Total is a bit odd, since it's got the potential to "count" the same row multiple times... but the value it returns should match the total of the individual counts.
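For comparison, here is a minimal sketch of the SUM() form of the same counts, run against a few sample rows. The column types and the sample data are assumptions made for illustration; the periods are the "most recent seven days / one month / 90 days" interpretation used above.

```sql
-- Hypothetical definition and sample rows, for illustration only.
CREATE TABLE mytable
( Customer_ID INT  NOT NULL
, `date`      DATE NOT NULL
);

INSERT INTO mytable (Customer_ID, `date`) VALUES
  (1, CURDATE() - INTERVAL 2 DAY)     -- inside all three periods
, (1, CURDATE() - INTERVAL 20 DAY)    -- inside "past month" and "past 90 days"
, (1, CURDATE() - INTERVAL 60 DAY)    -- inside "past 90 days" only
, (2, CURDATE() - INTERVAL 200 DAY);  -- outside every period

-- Each BETWEEN test evaluates to 1 or 0, and SUM() adds those up per customer.
SELECT t.Customer_ID
     , SUM(t.`date` BETWEEN CURDATE() - INTERVAL 1 WEEK  AND CURDATE()) AS Count_Visits_Past_Week
     , SUM(t.`date` BETWEEN CURDATE() - INTERVAL 1 MONTH AND CURDATE()) AS Count_Visits_Past_Month
     , SUM(t.`date` BETWEEN CURDATE() - INTERVAL 90 DAY  AND CURDATE()) AS Count_Visits_Past_90days
  FROM mytable t
 GROUP BY t.Customer_ID
 ORDER BY t.Customer_ID;
-- Customer 1: 1, 2, 3
-- Customer 2: 0, 0, 0
```

Because the test returns 0 rather than NULL for excluded rows, a customer with no recent visits still shows 0 in every column instead of NULL.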
I think this will give you what you want.
select customer_id,
sum(case when splitter = 'week' then num_visits else 0 end) as visits_this_week,
sum(case when splitter = 'month' then num_visits else 0 end) as visits_this_month,
sum(case when splitter = '90days' then num_visits else 0 end) as visits_last_90days,
sum(num_visits) as total
from (select customer_id, 'week' as splitter, count(*) as num_visits
from tbl
where extract(week from date) = extract(week from sysdate())
and extract(year from date) = extract(year from sysdate())
group by customer_id
union all
select customer_id, 'month' as splitter, count(*) as num_visits
from tbl
where extract(month from date) = extract(month from sysdate())
and extract(year from date) = extract(year from sysdate())
group by customer_id
union all
select customer_id, '90days' as splitter, count(*) as num_visits
from tbl
where date between date_sub(sysdate(), interval 90 day) and
sysdate()
group by customer_id) x
group by customer_id
sql fiddle example: http://sqlfiddle.com/#!2/a762c/12/0
I have a question about a MySQL query that has been logging errors since I upgraded to MySQL 5.7.
The error is the "only_full_group_by" one, which is well covered on Stack Overflow.
Many answers say not to disable this option but to improve the SQL query instead.
The query that I'm using is returning the minimum and maximum values of a counter per hour.
SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
DATE_FORMAT(date_time, '%H:%i') AS dt
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY YEAR(date_time), MONTH(date_time), DAY(date_time), HOUR(date_time)
As I understand from the error message, one of the items from the SELECT clause is missing in the GROUP BY clause. But however I re-sort, remove, or add items, I don't get the result I got before the upgrade to MySQL 5.7.
I tried wrapping the main query in a subquery to improve it, but somehow I can't recreate the results.
What is it I'm missing?
MySQL isn't able to determine the functional dependence ... between the expressions in the GROUP BY clause, and the expressions in the SELECT list.
The non-aggregate expression in the SELECT list (DATE_FORMAT(date_time, '%H:%i')) includes a minutes component. The GROUP BY clause is going to collapse the rows into groups by just the hour. So the value of the minutes is indeterminate... we know it's going to come from some row in the group, but there's no guarantee which one.
(The question reference to ONLY_FULL_GROUP_BY seems to indicate that we've got some understanding of indeterminate values...)
The easiest fix (the one with the fewest changes) would be to wrap that expression in a MIN or MAX function.
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, MIN(DATE_FORMAT(t.date_time,'%H:%i')) AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY YEAR(t.date_time)
, MONTH(t.date_time)
, DAY(t.date_time)
, HOUR(t.date_time)
ORDER
BY YEAR(t.date_time)
, MONTH(t.date_time)
, DAY(t.date_time)
, HOUR(t.date_time)
If we want rows returned in a particular order, we should include an ORDER BY clause, and not rely on a MySQL-specific extension or behavior of GROUP BY (which may disappear in future releases).
It's a bit odd to be doing a GROUP BY year, month, day and not including those values in the SELECT list. (It's not invalid to do that, just kind of strange.) The conditions in the WHERE clause guarantee that date_time spans no more than 24 hours.
My preference would be to do the GROUP BY on the same expression as the non-aggregate in the SELECT list. If I ever needed more than 24 hours, I'd include the date component:
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY
ORDER
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') + INTERVAL 0 DAY
--or--
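If we always know it's just one day's worth of date_time, and we only want to return the hour, then we can return just the hour in the SELECT list. The query below groups by that same expression, plus the date-and-hour expression used in the ORDER BY so that the rows still sort chronologically: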
SELECT MAX(t.counter) AS `max`
, MIN(t.counter) AS `min`
, DATE_FORMAT(t.date_time,'%H:00') AS `dt`
FROM table1 t
WHERE t.date_time >= NOW() - INTERVAL 1 DAY
GROUP
BY DATE_FORMAT(t.date_time,'%H:00')
, DATE_FORMAT(t.date_time,'%Y-%m-%d %H')
ORDER
BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H')
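The grouping queries above can be tried on known data with a small fixture like the one below. This is a sketch only: the column types and sample rows are assumptions, and the WHERE clause on NOW() is left out so that the fixed timestamps are not filtered away.

```sql
-- Hypothetical definition of table1; only the two columns used above.
CREATE TABLE table1
( date_time DATETIME NOT NULL
, counter   INT      NOT NULL
);

INSERT INTO table1 (date_time, counter) VALUES
  ('2024-01-01 10:05:00', 5)
, ('2024-01-01 10:40:00', 9)
, ('2024-01-01 11:15:00', 7);

-- The non-aggregate in the SELECT list is the same expression as in GROUP BY,
-- so the query is accepted with ONLY_FULL_GROUP_BY enabled.
SELECT MAX(t.counter) AS `max`
     , MIN(t.counter) AS `min`
     , DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00') AS `dt`
  FROM table1 t
 GROUP BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00')
 ORDER BY DATE_FORMAT(t.date_time,'%Y-%m-%d %H:00');
-- Two rows: (9, 5, '2024-01-01 10:00') and (7, 7, '2024-01-01 11:00')
```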
SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
YEAR(date_time) AS g_year,
MONTH(date_time)AS g_month,
DAY(date_time) AS g_day,
HOUR(date_time) AS g_hour
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY g_year, g_month, g_day, g_hour
Or you can get rid of redundant data if you always do it for 1 day:
SELECT MAX( counter ) AS max,
MIN( counter ) AS min,
DAY(date_time) AS g_day,
HOUR(date_time) AS g_hour
FROM table1
WHERE date_time >= NOW() - INTERVAL 1 DAY
GROUP BY g_day, g_hour
I have a database that has two columns - result and time
I'm trying to get a count of how many rows exist for each result in a particular month. There are only two possible values for result: success and failure.
I've managed to get a count of how many rows there are in each month, but I can't get the individual count of how many success and how many failure there were in each month.
Here is what I have:
SELECT result, MONTH(time) MONTH, COUNT(*) COUNT
FROM mytable
WHERE YEAR(time)=2017
GROUP BY MONTH(time);
I'm looking for a result that provides me with something like there were 12 successes and 8 failures in a particular month.
Any help would be appreciated.
Use conditional aggregation:
SELECT MONTH(time) MONTH,
sum(result = 'success') as success_count,
sum(result = 'failure') as failure_count
FROM mytable
WHERE YEAR(time) = 2017
GROUP BY MONTH(time);
I would use the following query:
SELECT
DATE_FORMAT(time, '%m %Y') AS month_year,
SUM(CASE WHEN result = 'success' THEN 1 ELSE 0 END) AS success_count,
SUM(CASE WHEN result = 'failure' THEN 1 ELSE 0 END) AS failure_count
FROM mytable
WHERE YEAR(time) = 2017
GROUP BY
DATE_FORMAT(time, '%m %Y')
Note that you should be aggregating by time period alone, and not by the result, which instead is part of the sum in the CASE expression.
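Conditional aggregation by month can be checked on a few sample rows, as in the sketch below. The column types and the data are assumptions for illustration. The range predicate on time is a swapped-in alternative to YEAR(time) = 2017; it selects the same rows but leaves the column unwrapped, which lets MySQL use an index on time if one exists.

```sql
-- Hypothetical definition and sample rows, for illustration only.
CREATE TABLE mytable
( result VARCHAR(10) NOT NULL
, `time` DATETIME    NOT NULL
);

INSERT INTO mytable (result, `time`) VALUES
  ('success', '2017-01-05 09:00:00')
, ('success', '2017-01-20 14:30:00')
, ('failure', '2017-01-21 08:15:00')
, ('failure', '2017-02-02 11:00:00');

-- Group by the month only; the result value is tested inside each SUM().
SELECT MONTH(t.`time`)             AS month_num
     , SUM(t.result = 'success')   AS success_count
     , SUM(t.result = 'failure')   AS failure_count
  FROM mytable t
 WHERE t.`time` >= '2017-01-01'
   AND t.`time` <  '2018-01-01'
 GROUP BY MONTH(t.`time`)
 ORDER BY MONTH(t.`time`);
-- Month 1: 2 successes, 1 failure
-- Month 2: 0 successes, 1 failure
```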
I have a database with records of date-time and a measurement value.
I've been writing two separate queries: one to return the total count of all daily records between certain times of day for the previous month, and the same query but counting only the records where the measurement value is below a threshold. I then manually divide the threshold count by the total count for each day to get a % uptime or SLA.
So I have two questions:
1) Can I combine these two queries into one query? (I found the answer to #1; see below.)
2) Can I go ahead and do the math in the queries, so what I get returned is just a listing of each day, the count above, the count below, and the % above or below threshold...
Sample data and query are listed below.
TableA
hostname, date_time, value
Sample Query to return days from previous month, excluding weekend days.
SELECT
count(*),
DATE(date_time),
SUM(
CASE WHEN rssi_val < 100
THEN 1
ELSE 0
END
)
FROM TableA
WHERE hostname = 'hostA'
AND DATE(date_time) BETWEEN '2013-07-01' AND '2013-07-31'
AND TIME(date_time) BETWEEN '06:00:00' AND '18:00:00'
AND DAYOFWEEK(date_time) NOT IN (1, 7)
GROUP BY DATE(date_time);
So now I just want to know how to add a 4th column that gives the percent uptime/downtime.
Have you tried this?
select count(*),
DATE(date_time),
SUM(CASE WHEN rssi_val<100 THEN 1 ELSE 0 END),
SUM(CASE WHEN rssi_val<100 THEN 1 ELSE 0 END)/count(*) as percentage
from TableA
where hostname='hostA'
and DATE(date_time) between '2013-07-01' and '2013-07-31'
and TIME(date_time) between '06:00:00' and '18:00:00'
and DAYOFWEEK(date_time) NOT IN (1,7)
group by DATE(date_time);
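To express the ratio as a percentage rather than a fraction, the same expression can be multiplied by 100 and rounded. The sketch below assumes the column is named rssi_val (as in the queries, not the column list in the question) and uses made-up sample rows; the column aliases are hypothetical.

```sql
-- Hypothetical definition and sample rows, for illustration only.
CREATE TABLE TableA
( hostname  VARCHAR(64) NOT NULL
, date_time DATETIME    NOT NULL
, rssi_val  INT         NOT NULL
);

INSERT INTO TableA (hostname, date_time, rssi_val) VALUES
  ('hostA', '2013-07-01 08:00:00',  80)
, ('hostA', '2013-07-01 09:00:00', 120)
, ('hostA', '2013-07-01 10:00:00',  90)
, ('hostA', '2013-07-01 11:00:00',  95)
, ('hostA', '2013-07-02 08:00:00', 150);

-- (rssi_val < 100) evaluates to 1 or 0, so SUM() counts the rows below threshold.
SELECT DATE(a.date_time)                                   AS day_date
     , COUNT(*)                                            AS total_count
     , SUM(a.rssi_val < 100)                               AS below_count
     , ROUND(100 * SUM(a.rssi_val < 100) / COUNT(*), 1)    AS pct_below
  FROM TableA a
 WHERE a.hostname = 'hostA'
   AND DATE(a.date_time) BETWEEN '2013-07-01' AND '2013-07-31'
   AND TIME(a.date_time) BETWEEN '06:00:00' AND '18:00:00'
   AND DAYOFWEEK(a.date_time) NOT IN (1, 7)
 GROUP BY DATE(a.date_time)
 ORDER BY DATE(a.date_time);
-- 2013-07-01: total 4, below 3, pct_below 75.0
-- 2013-07-02: total 1, below 0, pct_below 0.0
```

The percentage above threshold is then simply 100 minus pct_below.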
SELECT
EXTRACT( DAY FROM date) AS day,
EXTRACT( MONTH FROM date) AS month, who, sum ( wsd ) AS total FROM weekly
WHERE season = '08'
AND status = 1
AND who = 'NAME SURNAME'
GROUP BY day, month
day | month | who | total
12 | 8 | NAME SURNAME | 18
I am getting totals of the "wsd" column for every person in the "who" column per day. As expected, if a person has no records in the table for a given day, that name does not appear in the results.
But I want to see those records too, with the date and name and "0" in the total column.
How can I do it with MySQL only?
For this case, you can move the conditional logic from the where to the aggregation -- assuming that you have data for someone on every day:
SELECT EXTRACT( DAY FROM date) AS day, EXTRACT( MONTH FROM date) AS month, who,
sum(case when status = 1 AND who = 'NAME SURNAME' then wsd else 0 end) AS total
FROM weekly
WHERE season = '08'
GROUP BY day, month;
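The effect of moving the conditions into the aggregate can be seen on a couple of sample rows. This is a sketch under assumptions: the column types, the second person, and the dates are all made up for illustration.

```sql
-- Hypothetical definition and sample rows, for illustration only.
CREATE TABLE weekly
( `date` DATE        NOT NULL
, who    VARCHAR(50) NOT NULL
, wsd    INT         NOT NULL
, season CHAR(2)     NOT NULL
, status TINYINT     NOT NULL
);

INSERT INTO weekly (`date`, who, wsd, season, status) VALUES
  ('2008-08-12', 'NAME SURNAME', 18, '08', 1)
, ('2008-08-13', 'OTHER PERSON',  5, '08', 1);

-- Rows for other people still form a group for their day, but add 0 to the total.
SELECT DAY(w.`date`)   AS day_num
     , MONTH(w.`date`) AS month_num
     , SUM(CASE WHEN w.status = 1 AND w.who = 'NAME SURNAME'
                THEN w.wsd ELSE 0 END) AS total
  FROM weekly w
 WHERE w.season = '08'
 GROUP BY DAY(w.`date`), MONTH(w.`date`)
 ORDER BY MONTH(w.`date`), DAY(w.`date`);
-- day 12, month 8: total 18
-- day 13, month 8: total 0
```

A day on which nobody has a row still produces no group at all, which is the assumption stated above.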
Did you try COALESCE or IFNULL to substitute a default value (in your case 0)?
I have a simple table with 4 columns - ID, Date, Category, Value.
I have 5 distinct categories that have certain values daily. I would like to select value column at different points in time and display result along with the appropriate category.
This is the code that I'm using:
select
Category,
case when date=DATE_SUB(CURDATE(),INTERVAL 1 DAY) then value else 0 end as Today,
case when date=DATE_SUB(CURDATE(),INTERVAL 1 MONTH) then value else 0 end as "Month Ago",
case when date=DATE_SUB(CURDATE(),INTERVAL 1 Year) then value else 0 end as "Year Ago"
from table
group by category
It's not working. I'm using mysql database but will run the query in SSRS through an ODBC connection.
The problem with your query is that, as written, the case statements need to be embedded in aggregation functions:
select Category,
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 DAY) then value end) as Today,
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 MONTH) then value end) as "Month Ago",
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 Year) then value end) as "Year Ago"
from table
group by category
I chose "avg" since this seems reasonable if there are multiple values and the "value" column is numeric. You might prefer min() or max() to get other values.
Also, I removed the "else 0" clause, so you will see NULL rather than 0 when there is no value.
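Here is a minimal sketch of that query on sample data. The table name daily_values, the column types, and the rows are assumptions for illustration (the question's table is literally called "table", which would need quoting). As in the question, the "Today" column actually picks up yesterday's date.

```sql
-- Hypothetical table and sample rows, for illustration only.
CREATE TABLE daily_values
( id       INT AUTO_INCREMENT PRIMARY KEY
, `date`   DATE          NOT NULL
, category VARCHAR(20)   NOT NULL
, value    DECIMAL(10,2) NOT NULL
);

INSERT INTO daily_values (`date`, category, value) VALUES
  (CURDATE() - INTERVAL 1 DAY,   'A', 10.00)
, (CURDATE() - INTERVAL 1 MONTH, 'A', 20.00)
, (CURDATE() - INTERVAL 1 DAY,   'B', 30.00);

-- A CASE without ELSE yields NULL, which AVG() ignores.
SELECT d.category
     , AVG(CASE WHEN d.`date` = CURDATE() - INTERVAL 1 DAY   THEN d.value END) AS `Today`
     , AVG(CASE WHEN d.`date` = CURDATE() - INTERVAL 1 MONTH THEN d.value END) AS `Month Ago`
     , AVG(CASE WHEN d.`date` = CURDATE() - INTERVAL 1 YEAR  THEN d.value END) AS `Year Ago`
  FROM daily_values d
 GROUP BY d.category
 ORDER BY d.category;
-- A: Today averages 10, Month Ago averages 20, Year Ago is NULL
-- B: Today averages 30, Month Ago and Year Ago are NULL
```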
This type of query is best done with three separate queries:
SELECT 'Today' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 DAY)
UNION ALL
SELECT 'Month Ago' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 MONTH)
UNION ALL
SELECT 'Year Ago' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 YEAR)
Try something like this:
SELECT
t1.Category, t1.Value AS Today, t2.Value AS Month_Ago, t3.Value AS Year_Ago
FROM YourTable t1
LEFT OUTER JOIN YourTable t2 ON t1.Category=t2.Category
AND t2.Date=DATE_SUB(CURDATE(),INTERVAL 1 Month)
LEFT OUTER JOIN YourTable t3 ON t1.Category=t3.Category
AND t3.Date=DATE_SUB(CURDATE(),INTERVAL 1 Year)
WHERE t1.Date=DATE_SUB(CURDATE(),INTERVAL 1 DAY)
This assumes that you have only one row per interval. If you have multiple rows per interval, you need to decide which value you want to show for that interval (min, max, etc.) and then aggregate those rows. If that is the case, the OP should provide some sample data and the expected query output so that testing is possible.