SQL: Count/ sum columns within a specific date range - mysql

I have a table that roughly looks like this. There are thousands of rows.
booking_date checkin_date ...some other columns .... booking_value
22-mar-2016 29-mar-2016 ........................... $150
01-apr-2016 17-may-2016 ........................... $500
09-apr-2016 09-apr-2016 ........................... $222
17-apr-2016 23-apr-2016 ........................... $75
19-apr-2016 31-july-2016 ........................... $690
03-May-2016 07-May-2016 ............................. $301
.
.
.
.
I am trying to calculate the number of bookings per day and the value of bookings per day in April 2016. The second part is to calculate for how many bookings the booking_date and checkin_date were the same.
I am very new to SQL. I can formulate the logic in paper, but can't seem to figure out how to proceed with the code.

I recommend the following query:
SELECT CAST(booking_date AS DATE), COUNT(*) as Number_of_Booking,
SUM(CAST(booking_date AS DATE) = CAST(checkin_date AS DATE)) as count_with_same_date,
SUM(booking_value) as booking_value
FROM t
WHERE booking_date >= '2016-04-01' AND
booking_date < '2016-05-01'
GROUP BY CAST(booking_date AS DATE);
In particular, note the filtering on the dates. The direct comparisons allow MySQL to use an index.
The calculation of the number on the same date uses a nice feature of MySQL where boolean values are treated as numbers in a numeric context.
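
As a rough, runnable illustration of the same idea, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also evaluates a comparison to 0 or 1, so SUM over it counts matches). The table name t, the sample rows, and the assumption that dates are stored as ISO 'YYYY-MM-DD' strings are made up for the example.

```python
import sqlite3

# Hypothetical sample rows mirroring the question's table, stored as ISO dates.
ROWS = [
    ("2016-03-22", "2016-03-29", 150),
    ("2016-04-01", "2016-05-17", 500),
    ("2016-04-09", "2016-04-09", 222),
    ("2016-04-17", "2016-04-23", 75),
    ("2016-04-19", "2016-07-31", 690),
    ("2016-05-03", "2016-05-07", 301),
]


def april_summary(rows):
    """Return (day, bookings, same_day_bookings, total_value) per April 2016 day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (booking_date TEXT, checkin_date TEXT, booking_value INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT booking_date,
               COUNT(*) AS number_of_bookings,
               -- the comparison yields 1 or 0, so SUM counts the matches
               SUM(booking_date = checkin_date) AS count_with_same_date,
               SUM(booking_value) AS booking_value
        FROM t
        WHERE booking_date >= '2016-04-01' AND booking_date < '2016-05-01'
        GROUP BY booking_date
        ORDER BY booking_date
        """
    ).fetchall()
    conn.close()
    return result


summary = april_summary(ROWS)
```

The half-open range (>= first day, < first day of the next month) is the part that carries over unchanged to MySQL.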

You can try the code below:
SELECT CAST(booking_date AS DATE),
COUNT(*) Number_of_Booking,
COUNT(
CASE
WHEN CAST(booking_date AS DATE)
= CAST(checkin_date AS DATE) THEN 1
ELSE NULL
END
) count_with_same_date,
SUM(booking_value) booking_value -- Booking value has to be Number field
FROM your_table
WHERE YEAR(booking_date ) = 2016
AND MONTH(booking_date ) = 4
GROUP BY CAST(booking_date AS DATE)

For the first question you can try
Select booking_date
,count(*) as Number_of_bookings
,Sum(booking_value) as value
From table_name
Where booking_date >= '2016-04-01' and booking_date < '2016-05-01'
group by booking_date;
Or you can use month() and year() function in filter.
For the second question try,
Select booking_date
,checkin_date
,count(*)
from table_name
where booking_date=checkin_date
group by booking_date, checkin_date

Related

Avg function not returning proper value

I expect this query to give me the average number of daily active users to date, grouped by month (from October to December). But the result is approximately 164K when it should be 128K. Why is AVG not working? The average should be the SUM of the daily values divided by the number of days in the month up to today.
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1
Ok guys thanks for your help. The problem was that on the subquery I was pulling the info by month and not by day. So I should pull the info by day there and group by month in the outer query. This finally worked:
SELECT EXTRACT(year_month FROM sq.day_month) AS month_year, AVG(number)
FROM (SELECT date(date) AS day_month,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY 1
) sq
GROUP BY 1
Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error (although it works in older versions of MySQL).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.
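
For anyone who wants to see the "count per day first, then average per month" shape in isolation, here is a small sketch using Python's sqlite3 module in place of MySQL. The rows are invented, and strftime('%Y-%m', ...) stands in for EXTRACT(year_month FROM ...), which SQLite does not have.

```python
import sqlite3

# Hypothetical activity log: (date, id_user). Duplicate rows are intentional.
EVENTS = [
    ("2020-10-01", 1), ("2020-10-01", 2), ("2020-10-01", 2),
    ("2020-10-02", 1),
    ("2020-11-05", 1), ("2020-11-05", 2), ("2020-11-05", 3),
    ("2020-11-06", 3),
]


def avg_daily_users_by_month(events):
    """Average the per-day distinct user counts within each month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (date TEXT, id_user INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", events)
    rows = conn.execute(
        """
        SELECT strftime('%Y-%m', sq.day) AS month_year, AVG(sq.number) AS avg_users
        FROM (SELECT date AS day, COUNT(DISTINCT id_user) AS number
              FROM table1
              WHERE date >= '2020-10-01' AND date < '2021-01-01'
              GROUP BY date          -- inner query: one row per day
             ) sq
        GROUP BY month_year          -- outer query: one row per month
        ORDER BY month_year
        """
    ).fetchall()
    conn.close()
    return rows


monthly = avg_daily_users_by_month(EVENTS)
```

Note that this averages over days that have at least one row. If days with no activity should count as zero, the inner query would need a calendar of all days to join against.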

mySQL query that is a bit tricky

Hi there, I want to design this query in MySQL.
Statement: For all the customers that transacted during 2017, what % made another transaction within 30 days?
Can you tell me how such a query can be designed?
This is the picture of the table to perform this query on:
Table name is: transactions
Just use lead() (a window function, available in MySQL 8.0+) to get the next date. Then aggregate at the customer level to determine if any transaction in the time period has another within 30 days for that customer.
Finally, aggregate again:
select avg(case when mindiff < 30 then 1.0 else 0 end) as within_30days
from (select customerid, min(datediff(next_date, date)) as mindiff
from (select t.*, lead(date) over (partition by customerid order by date) as next_date
from transactions t
) t
where date >= '2017-01-01' and date < '2018-01-01'
group by customerid
) c
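
To check the logic on a few rows, here is a sketch that runs the same three-level query through Python's sqlite3 module (it needs an SQLite build with window functions, 3.25 or later). The sample transactions are invented, and julianday() subtraction replaces MySQL's datediff().

```python
import sqlite3

# Hypothetical transactions: (customerid, date).
TRANSACTIONS = [
    (1, "2017-01-05"), (1, "2017-01-20"),  # next transaction 15 days later
    (2, "2017-03-01"), (2, "2017-06-01"),  # next transaction 92 days later
    (3, "2017-12-20"), (3, "2018-01-02"),  # follow-up lands in 2018, 13 days later
    (4, "2017-07-07"),                     # single transaction, no follow-up
    (5, "2016-05-01"),                     # no 2017 transaction, not counted
]


def share_with_repeat_within_30_days(transactions):
    """Fraction of 2017 customers with some follow-up transaction under 30 days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customerid INTEGER, date TEXT)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", transactions)
    (share,) = conn.execute(
        """
        SELECT AVG(CASE WHEN mindiff < 30 THEN 1.0 ELSE 0 END)
        FROM (SELECT customerid,
                     MIN(julianday(next_date) - julianday(date)) AS mindiff
              FROM (SELECT t.*,
                           LEAD(date) OVER (PARTITION BY customerid
                                            ORDER BY date) AS next_date
                    FROM transactions t
                   ) t
              WHERE date >= '2017-01-01' AND date < '2018-01-01'
              GROUP BY customerid
             ) c
        """
    ).fetchone()
    conn.close()
    return share


share = share_with_repeat_within_30_days(TRANSACTIONS)
```

Because lead() is computed before the 2017 filter is applied, a December 2017 transaction can still see its January 2018 follow-up, as with customer 3 here.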

Determining whether date in one table falls between dates in another table

I have two columns of dates, the first column is the date a purchase order was received to be inspected, the second is the date that purchase order was accepted or rejected. What I would like is a graph with dates on the X-axis, and then the number of purchase orders in the queue on that day on the Y-axis.
Some purchase orders are completed that day, so they would still be counted, but they might not get addressed for days or weeks, so they would be counted on all those days until they were addressed.
Sample data below:
Row ID Date In Date Out
1 9/1/18 9/1/18
2 9/1/18 9/1/18
3 9/1/18 9/2/18
4 9/1/18 9/3/18
1 9/2/18 9/2/18
2 9/2/18 9/4/18
So, it would be 4 for 9/1/18, 4 for 9/2/18, 2 for 9/3/18, and 1 for 9/4/18.
I asked a similar question for Excel, and had success with that. However, we would like the data to be generated from within our ERP system (as opposed to copying the data to Excel manually), and I thought this might be possible using SQL and SSRS.
The formula for Excel was =COUNTIFS($A$1:$A$1000,"<="&C1,Sheet1!$B$1:$B$1000,">="&C1), so A and B were the Date In and Date Out columns, respectively, and then C was a column of all the days for the year.
I am not sure how I would write a query like this, so I've started by making it similar to the Excel solution. I generated a list of dates, and I am trying to compare a date in that list to my dataset of purchase order dates, but I'm not sure how to do that, since it's comparing one (row by row) to many.
You need to create a table of dates and then join it to your orders based on the date being between the Date In and Date Out so that they are counted in multiple days.
DECLARE @START_DATE AS DATE = '2018-09-01'
DECLARE @END_DATE AS DATE = '2018-09-30'
IF OBJECT_ID('tempdb..#DATES') IS NOT NULL DROP TABLE #DATES;
IF OBJECT_ID('tempdb..#ORDERS') IS NOT NULL DROP TABLE #ORDERS;
SELECT 1 AS ROW_ID, CAST('2018-09-01' AS DATE) AS DATE_IN, CAST('2018-09-01' AS DATE) AS DATE_OUT
INTO #ORDERS
UNION
SELECT 2 AS ROW_ID, CAST('2018-09-01' AS DATE) AS DATE_IN, CAST('2018-09-01' AS DATE) AS DATE_OUT
UNION
SELECT 3 AS ROW_ID, CAST('2018-09-01' AS DATE) AS DATE_IN, CAST('2018-09-02' AS DATE) AS DATE_OUT
UNION
SELECT 4 AS ROW_ID, CAST('2018-09-01' AS DATE) AS DATE_IN, CAST('2018-09-03' AS DATE) AS DATE_OUT
UNION
SELECT 1 AS ROW_ID, CAST('2018-09-02' AS DATE) AS DATE_IN, CAST('2018-09-02' AS DATE) AS DATE_OUT
UNION
SELECT 2 AS ROW_ID, CAST('2018-09-02' AS DATE) AS DATE_IN, CAST('2018-09-04' AS DATE) AS DATE_OUT
;WITH GETDATES AS (SELECT @START_DATE AS DATE1
UNION ALL
SELECT DATEADD(DAY, 1, DATE1)
FROM GETDATES
WHERE DATE1 < @END_DATE
)
SELECT *
INTO #DATES
FROM GETDATES;
SELECT DATE1, COUNT(O.ROW_ID) AS ORDERS
FROM #DATES D
LEFT JOIN #ORDERS O ON D.DATE1 BETWEEN O.DATE_IN AND O.DATE_OUT
GROUP BY D.DATE1
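
If you want to try the date-table join without a SQL Server instance, the sketch below reproduces it with Python's sqlite3 module. It is only an approximation of the T-SQL above: the dates come from a recursive CTE rather than a temp table, date(x, '+1 day') replaces DATEADD, and the table and function names are made up.

```python
import sqlite3

# The sample rows from the question, as (row_id, date_in, date_out).
ORDERS = [
    (1, "2018-09-01", "2018-09-01"),
    (2, "2018-09-01", "2018-09-01"),
    (3, "2018-09-01", "2018-09-02"),
    (4, "2018-09-01", "2018-09-03"),
    (1, "2018-09-02", "2018-09-02"),
    (2, "2018-09-02", "2018-09-04"),
]


def queue_per_day(orders, start, end):
    """Count orders sitting in the queue on each day from start to end."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (row_id INTEGER, date_in TEXT, date_out TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        """
        WITH RECURSIVE dates(date1) AS (
            SELECT :start
            UNION ALL
            SELECT date(date1, '+1 day') FROM dates WHERE date1 < :end
        )
        SELECT d.date1, COUNT(o.row_id) AS orders
        FROM dates d
        LEFT JOIN orders o ON d.date1 BETWEEN o.date_in AND o.date_out
        GROUP BY d.date1
        ORDER BY d.date1
        """,
        {"start": start, "end": end},
    ).fetchall()
    conn.close()
    return rows


queue = queue_per_day(ORDERS, "2018-09-01", "2018-09-05")
```

COUNT(o.row_id) rather than COUNT(*) is what makes days with an empty queue come back as 0, since the LEFT JOIN leaves row_id NULL on those days.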
Try comparing using "ReportItems!" property. Try to write expression something like:
iif(ReportItems!DateColumnX = somevalue, "Then Do something", "Else Do Something else")
Edits:
Try something like this:
Fields!ValueFromCurrentDataset(CompareLogic like(<,>,<> etc)) LookUp( Fields!ValueFromDataset2, "DataSet2")

Get percentage of total when using GROUP BY in SQL query

I have a SQL query that I'm using to return the number of training sessions recorded by a client on each day of the week (during the last year).
SELECT COUNT(*) total_sessions
, DAYNAME(log_date) day_name
FROM programmes_results
WHERE log_date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
AND log_date <= CURDATE()
AND client_id = 7171
GROUP
BY day_name
ORDER
BY FIELD(day_name, 'MONDAY', 'TUESDAY', 'WEDNESDAY', 'THURSDAY', 'FRIDAY', 'SATURDAY', 'SUNDAY')
I would like to then plot a table showing these values as a percentage of the total, as opposed to as a 'count' for each day. However I'm at a bit of a loss as to how to do that without another query (which I'd like to avoid).
Any thoughts?
Use a derived table
select day_name, total_sessions,
total_sessions / sum(total_sessions) over () * 100 percentage -- window function, needs MySQL 8.0+
from (
query from your question goes here
) temp
You can add the number of trainings per day in your client application to get the total count. This way you definitely avoid having a 2nd query to get the total.
Use the with rollup modifier in the query to get the total returned in the last row:
...GROUP BY day_name WITH ROLLUP ORDER BY ...
Use a subquery to return the overall count within each row
SELECT ..., t.total_count
...FROM programmes_results INNER JOIN (SELECT COUNT(*) as total_count FROM programmes_results WHERE <same where criteria>) as t -- no join condition
...
This will have the largest performance impact on the database, however, it enables you to have the total number in each row.
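
Another way to get the total onto every row in a single statement is a window function, SUM(...) OVER (), which MySQL supports from 8.0. The sketch below shows the idea with Python's sqlite3 module standing in for MySQL. The rows are invented, and because SQLite has no DAYNAME(), the example groups by strftime('%w', ...), which returns the weekday number as text ('0' for Sunday, '1' for Monday, and so on).

```python
import sqlite3

# Hypothetical rows: (client_id, log_date). 2023-01-02 is a Monday.
RESULTS = [
    (7171, "2023-01-02"), (7171, "2023-01-09"), (7171, "2023-01-16"),  # Mondays
    (7171, "2023-01-03"),                                              # Tuesday
    (1, "2023-01-04"),                                                 # other client
]


def sessions_share_by_weekday(results, client_id):
    """Return (weekday_number, sessions, percentage_of_total) for one client."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE programmes_results (client_id INTEGER, log_date TEXT)")
    conn.executemany("INSERT INTO programmes_results VALUES (?, ?)", results)
    rows = conn.execute(
        """
        SELECT day_num,
               total_sessions,
               -- SUM() OVER () totals every row of the derived table
               100.0 * total_sessions / SUM(total_sessions) OVER () AS percentage
        FROM (SELECT strftime('%w', log_date) AS day_num,
                     COUNT(*) AS total_sessions
              FROM programmes_results
              WHERE client_id = ?
              GROUP BY day_num
             ) temp
        ORDER BY day_num
        """,
        (client_id,),
    ).fetchall()
    conn.close()
    return rows


shares = sessions_share_by_weekday(RESULTS, 7171)
```

The date filter from the question is left out here to keep the sample small. It would go in the inner WHERE clause next to the client_id condition.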

How to add value for every date in MySQL query

I have following sql-query:
SELECT DATE(time), ROUND(AVG(out_temp),2)
FROM data_table
WHERE id= 1 AND time BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY DATE(time)
ORDER BY time ASC
This returns:
date avg_temp
2012-08-18 11.41
2012-08-19 5.90
2012-08-28 11.22
2012-08-29 10.07
Everything works well so far... but I would like to add missing dates with constant value like this:
date avg_temp
2012-08-18 11.41
2012-08-19 5.90
2012-08-20 <value>
... ...
2012-08-27 <value>
2012-08-28 11.22
2012-08-29 10.07
How should I modify my query? Could somebody help me with this problem? I read some posts about creating a separate calendar table with prefilled date values, but I still didn't get it to work.
If your data table actually has data on every date (for at least one id), you can do this:
SELECT dates.thedate, coalesce(ROUND(AVG(dt.out_temp),2), <value>)
FROM (select distinct date(time) as thedate
from data_table
) dates left outer join
data_table dt
on date(dt.time) = dates.thedate AND dt.id = 1
WHERE dates.thedate BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY dates.thedate
ORDER BY dates.thedate ASC
What you need is a driving table to generate the dates that you need. You can then left join to this table, to get the summaries you want. The COALESCE function lets you put in your default value.
Create a table with all dates you need, and then do a LEFT JOIN. E.g.
CREATE TABLE calendar ( day DATE PRIMARY KEY );
Then insert into the table, probably with a loop in your programming language (pseudocode):
for day in day_range( start_date, end_date ):
query( "INSERT INTO calendar VALUES ( '" + day + "' );" );
And then do a LEFT JOIN:
SELECT calendar.day, ROUND(AVG(out_temp),2)
FROM calendar LEFT JOIN data_table ON DATE(data_table.time) = calendar.day AND data_table.id = 1
WHERE calendar.day BETWEEN '2012-08-18' AND '2012-08-30'
GROUP BY calendar.day
ORDER BY calendar.day ASC
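
Here is a runnable sketch of the whole calendar-table approach, using Python's sqlite3 module instead of MySQL so it can be tried without a server. The sample readings, the fill_calendar helper, and the default of 0.0 for missing days are all assumptions made for the example. The calendar has to be on the left side of the LEFT JOIN, and the id filter has to sit in the ON clause, otherwise the empty days are filtered out again.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical readings: (id, time, out_temp).
READINGS = [
    (1, "2012-08-18 10:00:00", 10.0),
    (1, "2012-08-18 14:00:00", 12.0),
    (1, "2012-08-20 09:00:00", 6.0),
    (2, "2012-08-19 09:00:00", 99.0),  # different id, must not be averaged in
]


def fill_calendar(conn, start, end):
    """Insert one row per day from start to end (inclusive) into calendar."""
    day = start
    while day <= end:
        conn.execute("INSERT INTO calendar VALUES (?)", (day.isoformat(),))
        day += timedelta(days=1)


def daily_avg_with_default(readings, start, end, default):
    """Average out_temp per calendar day for id 1, using default on empty days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE data_table (id INTEGER, time TEXT, out_temp REAL)")
    conn.executemany("INSERT INTO data_table VALUES (?, ?, ?)", readings)
    fill_calendar(conn, start, end)
    rows = conn.execute(
        """
        SELECT c.day, COALESCE(ROUND(AVG(d.out_temp), 2), ?) AS avg_temp
        FROM calendar c
        LEFT JOIN data_table d ON date(d.time) = c.day AND d.id = 1
        GROUP BY c.day
        ORDER BY c.day
        """,
        (default,),
    ).fetchall()
    conn.close()
    return rows


filled = daily_avg_with_default(READINGS, date(2012, 8, 18), date(2012, 8, 20), 0.0)
```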