I need to query data with count and sum by multiple date ranges and I am looking for a faster query than what I am doing now.
I have a transaction table with a date and amount. I need to present a table with a count of transactions and total amount by date ranges of today, yesterday, this week, last week, this month, last month. Currently I am doing sub queries, is there a better way?
select
(select count(date) from transactions where date between ({{today}})) as count_today,
(select sum(amount) from transactions where date between ({{today}})) as amount_today,
(select count(date) from transactions where date between ({{yesterday}})) as count_yesterday,
(select sum(amount) from transactions where date between ({{yesterday}})) as amount_yesterday,
(select count(date) from transactions where date between ({{thisweek}})) as count_thisweek,
(select sum(amount) from transactions where date between ({{thisweek}})) as amount_thisweek,
etc...
Is there a better way?
Although you already have an accepted solution, here is another that will probably simplify your query even further. It uses MySQL user variables, so you don't have to type or calculate the dates by hand.
Instead of declaring the variables up front, you can set them inline in a select statement, then use them as if they were columns in another table. Since that select produces a single row, there is no Cartesian result. First the query, then I'll describe the computations in it.
select
sum( if( t.date >= @today AND t.date < @tomorrow, 1, 0 )) as TodayCnt,
sum( if( t.date >= @today AND t.date < @tomorrow, amount, 0 )) as TodayAmt,
sum( if( t.date >= @yesterday AND t.date < @today, 1, 0 )) as YesterdayCnt,
sum( if( t.date >= @yesterday AND t.date < @today, amount, 0 )) as YesterdayAmt,
sum( if( t.date >= @FirstOfWeek AND t.date < @EndOfWeek, 1, 0 )) as WeekCnt,
sum( if( t.date >= @FirstOfWeek AND t.date < @EndOfWeek, amount, 0 )) as WeekAmt
from
transactions t,
( select @today := curdate(),
@yesterday := date_add( @today, interval -1 day ),
@tomorrow := date_add( @today, interval 1 day ),
@FirstOfWeek := date_add( @today, interval +1 - dayofweek( @today) day ),
@EndOfWeek := date_add( @FirstOfWeek, interval 7 day ),
@minDate := least( @yesterday, @FirstOfWeek ) ) sqlvars
where
t.date >= @minDate
AND t.date < @EndOfWeek
Now, the dates. Since the @variables are assigned in sequence, you can think of the subquery as an inline program that sets them. Because it is a pre-query, it runs first and the variables are available for the rest of the query. To start, I am working with whatever curdate() returns, which is the date portion only, without a time. From that, subtract 1 day (add -1) to get the beginning of yesterday, and add 1 day to get tomorrow. The first of the week is the current date + 1 - the day of the week (you will see this worked through shortly). Add 7 days to the first of the week to get the end of the week. Finally, take whichever date is the LEAST of yesterday (which COULD fall at the end of the prior week) and the beginning of the week.
Now look at today for example... Feb 23rd.
Sun Mon Tue Wed Thu Fri Sat Sun
 21  22  23  24  25  26  27  28
Today = 23
Yesterday = 22
Tomorrow = 24
First of week = 23 + 1 - 3 (Tuesday is the 3rd day of the week) = 21st
End of Week = 21st + 7 days = 28th.
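To sanity-check the date arithmetic outside the database, here is a minimal Python sketch of the same computations. It assumes the example date is Feb 23, 2016 (the year is my assumption; it is one where the 21st falls on a Sunday), and the helper names mysql_dayofweek and date_ranges are hypothetical, not part of MySQL.

```python
from datetime import date, timedelta


def mysql_dayofweek(d):
    """Mimic MySQL DAYOFWEEK(): 1 = Sunday ... 7 = Saturday."""
    return d.isoweekday() % 7 + 1


def date_ranges(today):
    """Recompute the boundaries that the inline variables hold in the query."""
    yesterday = today - timedelta(days=1)
    tomorrow = today + timedelta(days=1)
    # today + 1 - dayofweek(today) walks back to the Sunday of this week
    first_of_week = today + timedelta(days=1 - mysql_dayofweek(today))
    end_of_week = first_of_week + timedelta(days=7)
    return {
        "today": today,
        "yesterday": yesterday,
        "tomorrow": tomorrow,
        "first_of_week": first_of_week,
        "end_of_week": end_of_week,
        # earliest date the WHERE clause needs to look at
        "min_date": min(yesterday, first_of_week),
    }


# Assumed example date: Tuesday, Feb 23, 2016
ranges = date_ranges(date(2016, 2, 23))
for name, value in ranges.items():
    print(name, value)
```

Passing a different date to date_ranges plays the same role as overriding the today variable in the query.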
Why cut the dates off at midnight and strip the times? To simplify the SUM() conditions to >= AND <. If I tested for date = today and your transactions were time-stamped, you would have to extract the date portion of each one to qualify it. With this approach, the "Today" count and amount cover any date >= Feb 23 at 12:00 am (midnight) AND < Feb 24 at 12:00 am. That includes everything on Feb 23 up to 11:59:59 pm, hence LESS than Feb 24 (tomorrow).
Similarly, yesterday covers everything UP TO but not including "today", and the same goes for the week range.
Finally, the WHERE clause restricts the scan to rows on or after the earliest date needed, so the query does not have to run through the entire transactions table.
Lastly, if you ever wanted the counts and totals for a prior week / period, whatever, you could just extrapolate and change
@today := '2015-01-24'
and the computations will be AS IF the query was run ON THAT DATE.
Similarly, if you wanted MONTHLY totals, you could compute the first of the month and the first of the following month and use them the same way.
Hope you find this flexible solution useful.
Yes, you can use aggregate functions on conditional expressions, like so:
SELECT SUM(IF(date between ({{today}}), 1, 0)) AS count_today
, SUM(IF(date between ({{today}}), amount, 0)) AS amount_today
, ...
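For anyone who wants to try the single-pass idea without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. SQLite has no IF(), so CASE WHEN stands in for it; the sample rows and the fixed dates are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("2016-02-22 09:30:00", 10.0),  # yesterday
        ("2016-02-23 08:00:00", 5.0),   # today
        ("2016-02-23 17:45:00", 7.5),   # today
        ("2016-01-05 12:00:00", 99.0),  # outside both ranges
    ],
)

# One scan of the table; each SUM only counts rows inside its own range.
row = conn.execute(
    """
    SELECT
        SUM(CASE WHEN date >= :today AND date < :tomorrow THEN 1 ELSE 0 END),
        SUM(CASE WHEN date >= :today AND date < :tomorrow THEN amount ELSE 0 END),
        SUM(CASE WHEN date >= :yesterday AND date < :today THEN 1 ELSE 0 END),
        SUM(CASE WHEN date >= :yesterday AND date < :today THEN amount ELSE 0 END)
    FROM transactions
    WHERE date >= :yesterday AND date < :tomorrow
    """,
    {"yesterday": "2016-02-22", "today": "2016-02-23", "tomorrow": "2016-02-24"},
).fetchone()

count_today, amount_today, count_yesterday, amount_yesterday = row
print(count_today, amount_today, count_yesterday, amount_yesterday)
```

The half-open comparisons (>= start, < next start) are used here instead of BETWEEN so that time-stamped rows are never counted in two ranges.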
I have data with start date and end date (Say 20th Feb 2018 to 20th Feb 2020), I want to find out the total days in every year inside this range.
For example:
2018 - x days
2019 - 365 days
2020 - y days, etc.
Is there a way I can do in SQL without hardcoding year values?
I tried hardcoding the values and it worked well. But I want a solution without hardcoding year values
I'm not familiar enough with MySql to know if this will port, however here is a tested and confirmed SQL Server solution.
The fiddle link is here for your use.
Given a start date of 02/20/2018 and an end date of 02/20/2020, the result set is as follows:
Year | periodStart | periodEnd | DaysInPeriod
2018 | 2018-02-20 | 2018-12-31 | 315
2019 | 2019-01-01 | 2019-12-31 | 365
2020 | 2020-01-01 | 2020-02-20 | 51
Declare @StartDate date = '2018-02-20', @EndDate date = '2020-02-20';
WITH x AS (SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n)),
Years AS (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Year
FROM x ones, x tens, x hundreds, x thousands)
SELECT Years.Year,
CASE
WHEN Year(@StartDate) = Years.year THEN @StartDate
ELSE DATEFROMPARTS(years.year, 01, 01)
END AS periodStart,
CASE
WHEN Year(@EndDate) = Years.year THEN @EndDate
ELSE DATEFROMPARTS(years.year, 12, 31)
END AS periodEnd,
DATEDIFF(day,
CASE
WHEN Year(@StartDate) = Years.year THEN @StartDate
ELSE DATEFROMPARTS(years.year, 01, 01)
END,
CASE
WHEN Year(@EndDate) = Years.year THEN @EndDate
ELSE DATEFROMPARTS(years.year, 12, 31)
END
) + 1 AS DaysInPeriod
FROM Years
WHERE Years.Year >= Year(@StartDate)
AND Years.Year <= Year(@EndDate)
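As a cross-check of the per-year clipping logic, here is a minimal Python sketch that does the same thing: clip the range to each calendar year and count the days inclusively. The function name days_per_year is hypothetical, and this is only an illustration of the idea, not a port of the query.

```python
from datetime import date


def days_per_year(start, end):
    """Return {year: inclusive day count} for every year touched by the range."""
    result = {}
    for year in range(start.year, end.year + 1):
        # Clip the overall range to the current calendar year
        period_start = max(start, date(year, 1, 1))
        period_end = min(end, date(year, 12, 31))
        # +1 because both the first and the last day are counted
        result[year] = (period_end - period_start).days + 1
    return result


counts = days_per_year(date(2018, 2, 20), date(2020, 2, 20))
print(counts)  # {2018: 315, 2019: 365, 2020: 51}
```

No year values are hardcoded; the loop bounds come from the start and end dates themselves.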
Use WITH RECURSIVE to create a range of dates; then we can easily count the number of days for each year using DATEDIFF:
WITH RECURSIVE dates AS
(
SELECT min(start_date) as start_date, DATE_FORMAT(min(start_date),'%Y-12-31') as last_day FROM mytable
UNION ALL
SELECT DATE_FORMAT(start_date + INTERVAL 1 YEAR,'%Y-01-01'),
DATE_FORMAT(start_date + INTERVAL 1 YEAR,'%Y-12-31')
FROM dates
WHERE DATE_FORMAT(start_date + INTERVAL 1 YEAR,'%Y-01-01') <= (SELECT MAX(end_date) FROM mytable)
),
cte2 as (
SELECT d.start_date as start_day, if(YEAR(d.start_date) = YEAR(m.end_date), m.end_date, d.last_day) as last_day
FROM dates d, mytable m
)
select *, DATEDIFF(last_day, start_day)+1 as total_days
from cte2;
You are looking for the DATEDIFF function.
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_datediff
DATEDIFF() returns expr1 − expr2 expressed as a value in days from one date to the other. expr1 and expr2 are date or date-and-time expressions.
You are free to specify e.g. "2019-01-01" or "2020-01-01" as input arguments to DATEDIFF.
You may find it convenient to store several January 1st dates in a calendar reporting table, if you want SELECT to loop over several years and report on number of days in each year.
I have a table like this:
I need to count how many messages were delivered per msisdn in each of the last 8 weeks, counting back from a given date. Here is what I came up with:
SELECT count(*) as ukupan_broj, SUM(IF (sent_messages.delivered = 1,1,0 )) as broj_dostavljenih,
count(*) - SUM(IF (sent_messages.delivered = 1,1,0 )) as non_billed,
SUM(IF (sent_messages.delivered = 1,1,0 )) / count(*) as ratio,
`sent_messages`.`msisdn`,
MONTH(`sent_messages`.`datetime`) AS MONTH, WEEK(`sent_messages`.`datetime`) AS WEEK,
DATE_FORMAT(`sent_messages`.`datetime`, '%Y-%m-%d') AS DATE
FROM `sent_messages`
INNER JOIN `received_messages` on `received_messages`.`uniqueid`=`sent_messages`.`originalID`
and `received_messages`.`msisdn`=`sent_messages`.`msisdn`
WHERE `sent_messages`.`datetime` >= '2016-12-12'
AND `sent_messages`.`originalID` = `received_messages`.`uniqueid`
AND `sent_messages`.`datetime` <= '2017-12-30'
AND `sent_messages`.`datetime` >= `received_messages`.`datetime`
AND `sent_messages`.`datetime` <= ( `received_messages`.`datetime` + INTERVAL 2 HOUR )
AND `sent_messages`.`type` = 'PAID'
GROUP BY WEEK
ORDER BY DATE ASC
Because I'm grouping by WEEK, my result shows the sums of delivered, undelivered, etc. for everyone, not per msisdn. Here is what the result looks like:
And when I add msisdn in GROUP BY clause I don't get the result the way I need it.
And I need it like this:
Please help me to write optimized query to fetch these results for each msisdn per last 8 weeks, because I'm stuck.
WEEK(...) has a problem near the first of the year. Instead, you could use TO_DAYS:
WHERE datetime > CURDATE() - INTERVAL 8 WEEK -- for the last 8 weeks
GROUP BY FLOOR(TO_DAYS(datetime) / 7) -- group by week
That is quite simple, but there is a bug in it: it only works if today is the last day of a "week", and if TO_DAYS(datetime) % 7 happens to land on the desired day of the week.
WHERE datetime > CURDATE() - INTERVAL 9 WEEK -- the last 8 weeks plus the partial weeks at either end
GROUP BY FLOOR((TO_DAYS(datetime) - 3) / 7) -- group by week
This is a first cut at fixing the bugs. The 9-week interval will include the current partial week and the partial week 8 weeks ago. The "- 3" (or whatever number works) will align your "week" to start on Monday or Sunday or whatever.
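The week-bucketing idea, integer-dividing a day number by 7 after shifting it by an alignment offset, can be tried out in Python. This sketch assumes that MySQL's TO_DAYS() equals Python's date.toordinal() plus 365 (checked against the documented value TO_DAYS('2007-10-07') = 733321); the names to_days and week_bucket are hypothetical.

```python
from datetime import date, timedelta


def to_days(d):
    """Approximate MySQL TO_DAYS(): days since year 0 (assumed offset of 365)."""
    return d.toordinal() + 365


def week_bucket(d, offset):
    """Bucket number shared by all dates in the same 7-day 'week'."""
    return (to_days(d) - offset) // 7


# With offset 1 the buckets start on Sunday (for this day numbering).
sunday = date(2016, 2, 21)
for i in range(8):
    day = sunday + timedelta(days=i)
    print(day, day.strftime("%a"), week_bucket(day, offset=1))
```

Trying a few offsets like this is a quick way to find "whatever number works" for the week start you want before putting it into the GROUP BY.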
SUM(IF (sent_messages.delivered = 1,1,0 )) can be shortened to SUM(delivered = 1) or even SUM(delivered) if that column only has 0 or 1 values.
I have a table ORDERS that stores orders with their status and order date. I would like to find all orders with a specified status that were made between yesterday 3pm and today 4pm. The query will run at different times (10am, 3pm, 5pm... regardless).
For example: if I run the query today (13.05.2014), I would like to get all orders made from 2014-05-12 15:00:00 until 2014-05-13 16:00:00.
The date is stored in format: YYYY-MM-DD HH:MM:SS
What I got is:
select *
from orders
where status = 'new'
and (
(
date_add(created_at, INTERVAL 1 day) = CURRENT_DATE()
and hour(created_at) >= 15
) /*1*/
or (
date(created_at) = CURRENT_DATE()
and hour(created_at) <= 16
) /*2*/
)
And I get only orders made today - like only the 2nd condition was taken into account.
I prefer not to use created >= '2014-05-12 16:00:00' (I will not use this query, someone else will).
When you add an interval of 1 day to the date/time, you still keep the time component. Use date() for the first condition:
where status = 'new' and
((date(date_add(created_at, INTERVAL 1 day)) = CURRENT_DATE() and
hour(created_at) >= 15
) /*1*/ or
(date(created_at) = CURRENT_DATE() and
hour(created_at) <= 16
) /*2*/
)
An alternative method is:
where status = 'new' and
(created_at >= date_add(CURRENT_DATE(), interval 15-24 hour) and
created_at <= date_add(CURRENT_DATE(), interval 16 hour)
)
The advantage of this approach is that all functions are applied to CURRENT_DATE() rather than to the column. This allows MySQL to take advantage of an index on created_at.
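To see what the two bounds work out to, here is a small Python sketch of the same arithmetic: start from midnight today, go back 9 hours (15 - 24) for the lower bound and forward 16 hours for the upper bound. The function name order_window is hypothetical.

```python
from datetime import date, datetime, time, timedelta


def order_window(today):
    """Return (yesterday 15:00, today 16:00) for the given calendar day."""
    midnight = datetime.combine(today, time.min)
    start = midnight + timedelta(hours=15 - 24)  # 9 hours before midnight
    end = midnight + timedelta(hours=16)
    return start, end


start, end = order_window(date(2014, 5, 13))
print(start, end)
```

Because the bounds depend only on the current date, the window is the same whether the query runs at 10am, 3pm or 5pm.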
I have a MySQL database containing discounts. A simplified version looks like this:
id | start (UNIX timestamp) | end (UNIX timestamp)
45 | 1384693200 | 1398992400
68 | 1386018000 | 1386277200
263 | 1388530800 | 1391209200
A discount can last a few days, a few months, or even a few years. I'm looking for a way to select a unique list of months where (future) discounts are valid.
If there is:
a discount which starts in November 2013 and ends in April 2014
a discount which starts in December 2013 and ends in the same month
a discount which starts in January 2014 and ends one month later
a discount which starts in June 2014 and ends the same month
The output should be:
- December (2013)
- January (2014)
- February (2014)
- March (2014)
- April (2014)
- June (2014)
November 2013 is not shown because it is in the past. May 2014 is not shown because there is no discount in that month.
Can somebody help?
Thanks in advance!
Create a table containing a sequence of numbers from 0 up to the largest number of months you could ever require, and join this table to your table.
This is an example of how to get a list of years and months separately for each id:
SELECT id,
year( start + interval x month ) year,
month( start + interval x month ) month
FROM
numbers n
JOIN
(
SELECT id,
from_unixtime( start ) start,
from_unixtime( end ) end
FROM Table1
) q
ON n.x <= period_diff( date_format( q.end, '%Y%m' ),date_format( q.start, '%Y%m' ))
ORDER BY id, year, month ;
Demo --> http://www.sqlfiddle.com/#!9/d7cfc/4
If you want to combine years+months for all id, skip id column and use GROUP BY
SELECT year( start + interval x month ) year,
month( start + interval x month ) month
FROM
numbers n
JOIN
(
SELECT id,
from_unixtime( start ) start,
from_unixtime( end ) end
FROM Table1
) q
ON n.x <= period_diff( date_format( q.end, '%Y%m' ),date_format( q.start, '%Y%m' ))
GROUP BY year, month
ORDER BY year, month ;
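If you want to skip past years and months, add a filter on the combined year and month (for example year * 100 + month >= the current year * 100 + the current month). Comparing year and month separately would wrongly drop the early months of later years. Also add WHERE end >= current-unix-time in the subquery, so that discounts which have already ended are filtered out.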
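The logic of expanding each discount into months, dropping past months and removing duplicates can be sketched in Python as follows. The discount dates below are made up to match the four cases in the question, "today" is assumed to be 2013-12-01, and the function names are hypothetical.

```python
from datetime import date


def months_covered(start, end):
    """All (year, month) pairs from start's month through end's month."""
    # Same idea as PERIOD_DIFF on the two 'YYYYMM' values
    count = (end.year - start.year) * 12 + (end.month - start.month)
    months = []
    for x in range(count + 1):
        total = start.year * 12 + (start.month - 1) + x
        months.append((total // 12, total % 12 + 1))
    return months


def future_discount_months(discounts, today):
    """Unique, sorted months from the current month on with a valid discount."""
    current = (today.year, today.month)
    found = set()
    for start, end in discounts:
        # Tuples compare year first, then month, so 2014-01 > 2013-12
        found.update(m for m in months_covered(start, end) if m >= current)
    return sorted(found)


discounts = [
    (date(2013, 11, 17), date(2014, 4, 30)),
    (date(2013, 12, 2), date(2013, 12, 5)),
    (date(2014, 1, 1), date(2014, 2, 1)),
    (date(2014, 6, 1), date(2014, 6, 30)),
]
months = future_discount_months(discounts, today=date(2013, 12, 1))
print(months)
```

Comparing (year, month) as a pair is the same trick as comparing year * 100 + month in SQL.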
I have a large number of records with a transaction datetime field going back several years. I would like to do a comparative analysis between the same timespan this year and last. How can I group by week over a 3 month range?
I'm running into problems using the YEARWEEK and WEEK functions because the day of the week that 2012 starts on differs from the day 2011 starts on.
Given that I have records with datetimes everyday from Jan 1st to the current day, and records with the same datetimes from the prior year, how can I group by week so the output is sums with dates like: 01/01/2011, 01/08/2011, 01/15/2011, etc., and 01/01/2012, 01/08/2012, 01/15/2012, etc.?
My query so far is as follows:
SELECT
DATE_FORMAT(A.transaction_date, '%Y-%m-%d') as date,
ROUND(sum(A.quantity), 3) AS quantity,
ROUND(sum(A.total_amount), 3) AS amount,
A.product_code,
D.fuel_type_code,
D.fuel_type_name,
C.customer_code,
C.customer_name
FROM
cl_transactions AS A
INNER JOIN
card AS B ON A.card_number=B.card_number
INNER JOIN
customer AS C ON B.customer_code=C.customer_code
INNER JOIN
fuel_type AS D ON A.fuel_type=D.fuel_type_code
WHERE
((A.transaction_date >= DATE_FORMAT(NOW() - INTERVAL 3 MONTH, '%Y-%m-01')) OR (A.transaction_date - INTERVAL 1 YEAR >= DATE_FORMAT(NOW() - INTERVAL 15 MONTH, '%Y-%m-01') AND A.transaction_date <= NOW() - INTERVAL 1 YEAR))
GROUP BY
A.transaction_date, fuel_type_code;
I would essentially like something that achieves the following pseudo-query:
GROUP BY
STARTING FROM THE OLDEST DATE (A.transaction_date + INTERVAL 6 DAY)
I started with an inner query using sqlvariables to build out from/to ranges for this year and last year of each respective start of year/month/day (ex: 2012-01-01 and 2011-01-01 respectively). From that, I'm also pre-formatting the date for final output so you have ONE master date basis for display reflecting that of whatever the "this year" week would be.
From that, I do a join to the transaction table where the transaction date is BETWEEN the respective start of the current week and the start of the next week. Since date/time stamps include hours and minutes, 2012-01-01 by itself is implied as 12:00:00 am (midnight) of that day, and BETWEEN will go UP TO 12:00:00 am 7 days later. That date then becomes the start date of the following week.
So, by joining on the date being within EITHER last year's or this year's time period, it's the same group qualification. The field selection does a ROUND( SUM( IF() )) for last year and this year respectively. If the incoming transaction date is LESS than the current year's week start, then it must be a record from the prior year; otherwise it's for the current year. So, respectively, add the value itself or zero as it applies.
So now, you have the group by. The week that it qualified for was already prepared from the inner query via "ThisYearWeekOf" formatted column, regardless of the otherwise computed "YEARWEEK()" or "WEEK()". The date ranges took care of that qualification for us.
Finally, I added the fuel-type as a join and included that as the group by. You have to group by all non-aggregate columns for proper SQL, although MySQL lets you get by by just grabbing the first entry for the given group if it is NOT so specified in group by.
To close, I DID include the information for the customer as you didn't have it in the group by and did not appear to be applicable... it would just arbitrarily grab one. However, I've added it to the group by, so now your records will show at the per customer level, per product and fuel type, how much sales and quantity between this year and last.
SELECT
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.Quantity, 0 )), 3) as LastYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.total_amount, 0 )), 3) as LastYrAmt,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.Quantity )), 3) as ThisYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.total_amount )), 3) as ThisYrAmt
FROM
( SELECT
DATE_FORMAT(@ThisYearDate, '%Y-%m-%d') as ThisYearWeekOf,
@LastYearDate as LastYrWeekStart,
@ThisYearDate as ThisYrWeekStart,
@LastYearDate := date_add( @LastYearDate, interval 7 day ) LastYrStartOfNextWeek,
@ThisYearDate := date_add( @ThisYearDate, interval 7 day ) ThisYrStartOfNextWeek
FROM
(select @ThisYearDate := '2012-01-01',
@LastYearDate := '2011-01-01' ) sqlvars,
cl_transactions justForLimit
HAVING
ThisYrWeekStart < '2012-04-01'
LIMIT 15 ) JustWeekRange
JOIN cl_transactions AS CTrans
ON CTrans.transaction_date BETWEEN
JustWeekRange.LastYrWeekStart AND JustWeekRange.LastYrStartOfNextWeek
OR CTrans.transaction_date BETWEEN
JustWeekRange.ThisYrWeekStart AND JustWeekRange.ThisYrStartOfNextWeek
JOIN fuel_type FT
ON CTrans.fuel_type = FT.fuel_type_code
JOIN card
ON CTrans.card_number = card.card_number
JOIN customer AS C
ON card.customer_code = C.customer_code
GROUP BY
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name
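To make the paired week ranges easier to picture, here is a rough Python sketch of what the inner query builds: one row per week, holding the matching 7-day range in each year, all labelled by this year's week start. The names paired_weeks and week_of are hypothetical. One deliberate difference: the sketch uses half-open ranges (start <= date < next start) rather than BETWEEN, so a transaction at exactly midnight on a week boundary lands in only one week.

```python
from datetime import date, timedelta


def paired_weeks(this_year_start, last_year_start, cutoff):
    """Build 7-day ranges for both years until this year's start reaches cutoff."""
    step = timedelta(days=7)
    weeks = []
    this_start, last_start = this_year_start, last_year_start
    while this_start < cutoff:
        weeks.append({
            "week_of": this_start.isoformat(),  # label used for grouping
            "last_yr": (last_start, last_start + step),
            "this_yr": (this_start, this_start + step),
        })
        this_start += step
        last_start += step
    return weeks


def week_of(weeks, d):
    """Return the grouping label for a transaction date, or None if outside."""
    for w in weeks:
        for lo, hi in (w["last_yr"], w["this_yr"]):
            if lo <= d < hi:
                return w["week_of"]
    return None


weeks = paired_weeks(date(2012, 1, 1), date(2011, 1, 1), date(2012, 4, 1))
print(len(weeks), weeks[0]["week_of"], weeks[-1]["week_of"])
print(week_of(weeks, date(2011, 1, 9)), week_of(weeks, date(2012, 1, 9)))
```

Both sample dates in the last line map to the same label, which is what lets last year's and this year's totals sit side by side in one output row.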