I have a MySQL table.
Now we want to run some calculations on it, like this:
count enrollments per date across all courses
count enrollments per date where course id = 2 and date > start_date AND date < end_date
Expected output where we calculate all courses enrolled
Expected output where we calculate all courses enrolled where course id = 2
Expected output where course_id = 2 and the date range is 2022-11-13 to 2022-11-15
The query I have right now:
SELECT COUNT(*), DATE(registered_on)
FROM courses_enrolled
WHERE course_id = 1
GROUP BY DATE(registered_on), course_id
ORDER BY registered_on desc;
You need to use some kind of calendar table approach here:
SELECT d.dt AS date, COUNT(ce.id) AS cnt
FROM (
SELECT '2022-11-12' AS dt UNION ALL
SELECT '2022-11-13' UNION ALL
SELECT '2022-11-14' UNION ALL
SELECT '2022-11-15'
) d
LEFT JOIN courses_enrolled ce
ON DATE(ce.registered_on) = d.dt AND
ce.course_id = 2
GROUP BY d.dt
ORDER BY d.dt;
The calendar table ensures that all dates you want in the output appear. In practice, you may replace the subquery in d with a bona-fide table containing all dates of interest. The left join ensures that no dates are dropped which have no matching courses on that day.
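To make the effect of the left join concrete, here is a minimal sketch that runs the same shape of query from Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), the column names come from the queries above, and the sample rows and the helper name counts_per_day are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses_enrolled ("
    "id INTEGER PRIMARY KEY, course_id INTEGER, registered_on TEXT)"
)
# Made-up sample rows: nothing is registered on 2022-11-14.
conn.executemany(
    "INSERT INTO courses_enrolled (course_id, registered_on) VALUES (?, ?)",
    [
        (1, "2022-11-12 09:00:00"),
        (2, "2022-11-13 10:00:00"),
        (2, "2022-11-13 18:30:00"),
        (2, "2022-11-15 08:15:00"),
    ],
)


def counts_per_day(conn, days, course_id):
    """Count enrollments per day for one course, keeping days with no rows."""
    # Inline calendar: one SELECT per requested day, glued with UNION ALL.
    calendar = " UNION ALL ".join("SELECT ? AS dt" for _ in days)
    sql = f"""
        SELECT d.dt, COUNT(ce.id) AS cnt
        FROM ({calendar}) AS d
        LEFT JOIN courses_enrolled AS ce
               ON DATE(ce.registered_on) = d.dt
              AND ce.course_id = ?
        GROUP BY d.dt
        ORDER BY d.dt
    """
    return conn.execute(sql, [*days, course_id]).fetchall()


days = ["2022-11-12", "2022-11-13", "2022-11-14", "2022-11-15"]
rows = counts_per_day(conn, days, 2)
for day, cnt in rows:
    print(day, cnt)
```

Because the course filter sits in the ON clause rather than in WHERE, days such as 2022-11-12 and 2022-11-14 still come back with a count of 0 instead of disappearing.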
If you are using MySQL 8 you can use a recursive CTE to create your date range.
For all enrolled courses for given date range -
WITH RECURSIVE calendar (date) AS (
SELECT '2022-11-13' # start date
UNION ALL
SELECT date + INTERVAL 1 DAY FROM calendar
WHERE date + INTERVAL 1 DAY <= '2022-11-15' # end date
)
SELECT COUNT(ce.id) count_all, c.date
FROM calendar c
LEFT JOIN courses_enrolled ce
ON ce.registered_on BETWEEN c.date AND (c.date + INTERVAL 1 DAY - INTERVAL 1 SECOND)
GROUP BY c.date
ORDER BY c.date DESC;
Note the use of BETWEEN start AND end of day in the join criteria. For a small dataset this offers negligible benefit but on a large dataset it would allow for use of an index on registered_on, which could offer significantly improved performance.
Or for just the selected course -
WITH RECURSIVE calendar (date) AS (
SELECT '2022-11-13' # start date
UNION ALL
SELECT date + INTERVAL 1 DAY FROM calendar
WHERE date + INTERVAL 1 DAY <= '2022-11-15' # end date
)
SELECT COUNT(ce.id) count_selected_course, c.date
FROM calendar c
LEFT JOIN courses_enrolled ce
ON ce.registered_on BETWEEN c.date AND (c.date + INTERVAL 1 DAY - INTERVAL 1 SECOND)
AND ce.course_id = 2
GROUP BY c.date
ORDER BY c.date DESC;
Or counting both at the same time -
WITH RECURSIVE calendar (date) AS (
SELECT '2022-11-13' # start date
UNION ALL
SELECT date + INTERVAL 1 DAY FROM calendar
WHERE date + INTERVAL 1 DAY <= '2022-11-15' # end date
)
SELECT COUNT(ce.id) count_all, COUNT(IF(ce.course_id = 2, ce.id, NULL)) count_selected_course, c.date
FROM calendar c
LEFT JOIN courses_enrolled ce
ON ce.registered_on BETWEEN c.date AND (c.date + INTERVAL 1 DAY - INTERVAL 1 SECOND)
GROUP BY c.date
ORDER BY c.date DESC;
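The combined query can be tried out with a small script. This is only a sketch: it uses Python's sqlite3 module instead of MySQL 8, so date(day, '+1 day') replaces the INTERVAL arithmetic, CASE replaces IF(), and a half-open range (>= start of day, < start of next day) replaces BETWEEN. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses_enrolled ("
    "id INTEGER PRIMARY KEY, course_id INTEGER, registered_on TEXT)"
)
conn.executemany(
    "INSERT INTO courses_enrolled (course_id, registered_on) VALUES (?, ?)",
    [
        (1, "2022-11-13 09:00:00"),
        (2, "2022-11-13 23:59:59"),
        (2, "2022-11-15 08:15:00"),
        (3, "2022-11-15 17:40:00"),
        (2, "2022-11-16 00:00:00"),  # falls outside the requested range
    ],
)

SQL = """
WITH RECURSIVE calendar(day) AS (
    SELECT :start
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar WHERE day < :end
)
SELECT c.day,
       COUNT(ce.id) AS count_all,
       -- CASE yields NULL for other courses, and COUNT skips NULLs
       COUNT(CASE WHEN ce.course_id = :course THEN ce.id END)
           AS count_selected_course
FROM calendar AS c
LEFT JOIN courses_enrolled AS ce
       ON ce.registered_on >= c.day
      AND ce.registered_on < date(c.day, '+1 day')
GROUP BY c.day
ORDER BY c.day DESC
"""

both_counts = conn.execute(
    SQL, {"start": "2022-11-13", "end": "2022-11-15", "course": 2}
).fetchall()
for row in both_counts:
    print(row)
```

With this sample data, 2022-11-14 is reported with both counts at 0, and the row stamped at midnight on 2022-11-16 is not attributed to 2022-11-15.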
Related
My task is to get the total commission in the last 5 months. This is my code. I am using mysql.
SELECT CONCAT(a.first_name, " ", a.last_name) AS sales_reps,
YEAR(c.order_date),
ROUND(SUM((d.quantity_ordered*d.price_each)*.01), 2) AS commission_last_6mos
FROM employees a
LEFT JOIN customers b ON b.sales_rep_employee_no=a.employee_no
LEFT JOIN orders c on b.customer_no = c.customer_no
LEFT JOIN order_details d ON c.order_no = d.order_no
WHERE job_title='Sales Rep'AND c.order_date >= CURDATE()- INTERVAL 5 MONTH
GROUP BY CONCAT(a.first_name, " ", a.last_name)
ORDER BY commission_last_6mos DESC
LIMIT 1;
I have also tried NOW(). Neither version returns any results.
It looks to me like the table containing job_title is not specified. If it is in the table employees, then you should have a.job_title, for example. For the time range, try:
AND c.order_date >= DATE_SUB(now(), INTERVAL 6 MONTH)
For more information about the DATE_SUB function, check out https://www.w3schools.com/sql/func_mysql_date_sub.asp
I'm writing a query that takes an event value and returns the number of records for each day between two given dates, returning 0 for any day that has no records.
I've written a query which does this for the past week.
Current Query:
select d.day, count(e.event) as count
from (
select 0 day union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6
) d
left join event e
on e.timestamp >= current_date - interval d.day day
and e.timestamp < current_date - interval (d.day - 1) day
and e.event = ?
group by d.day
The problem is that this only returns results for a fixed number of days. I want to be able to give it two dates (a start and an end date) and get the record count for each day, without knowing in advance how many days lie in between.
You could use/create a bona-fide calendar table. Something like this:
SELECT
d.day,
COUNT(e.timestamp) AS cnt
FROM
(
SELECT '2020-01-01' AS day UNION ALL
SELECT '2020-01-02' UNION ALL
...
SELECT '2020-12-31'
) d
LEFT JOIN event e
ON e.timestamp >= d.day AND e.timestamp < DATE_ADD(d.day, INTERVAL 1 DAY)
WHERE
d.day BETWEEN <start_date> AND <end_date>
GROUP BY
d.day;
I have covered only the calendar year 2020, but you may extend to cover whatever range you want.
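One way to avoid typing out every day by hand is to fill a real calendar table once and filter it by the requested range. The sketch below does that with Python's sqlite3 module standing in for MySQL (so date(c.day, '+1 day') plays the role of DATE_ADD). The event table layout follows the question; the calendar table, the events_per_day helper and the sample rows are assumptions for illustration.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (day TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE event (event TEXT, timestamp TEXT)")

# Fill the calendar once: one row per day of 2020 (a leap year, 366 days).
first = date(2020, 1, 1)
conn.executemany(
    "INSERT INTO calendar (day) VALUES (?)",
    [((first + timedelta(days=i)).isoformat(),) for i in range(366)],
)

# Made-up events.
conn.executemany(
    "INSERT INTO event (event, timestamp) VALUES (?, ?)",
    [
        ("login", "2020-03-01 08:00:00"),
        ("login", "2020-03-01 21:45:00"),
        ("logout", "2020-03-02 09:00:00"),
        ("login", "2020-03-03 12:00:00"),
    ],
)


def events_per_day(conn, event, start, end):
    """Per-day counts of one event type between start and end, inclusive."""
    sql = """
        SELECT c.day, COUNT(e.timestamp) AS cnt
        FROM calendar AS c
        LEFT JOIN event AS e
               ON e.timestamp >= c.day
              AND e.timestamp < date(c.day, '+1 day')
              AND e.event = ?
        WHERE c.day BETWEEN ? AND ?
        GROUP BY c.day
        ORDER BY c.day
    """
    return conn.execute(sql, (event, start, end)).fetchall()


login_counts = events_per_day(conn, "login", "2020-02-28", "2020-03-03")
for day, cnt in login_counts:
    print(day, cnt)
```

The range can be any length, since the number of output rows is driven by the calendar rows that fall between the two dates rather than by a fixed list of offsets.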
The following query returns the visitors and pageviews of last 7 days. However, if there are no results (let's say it is a fresh account), nothing is returned.
How to edit this in order to return 0 in days that there are no entries?
SELECT Date(timestamp) AS day,
Count(DISTINCT hash) AS visitors,
Count(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND timestamp >= Subdate(Curdate(), 7)
GROUP BY day
Assuming that you always have at least one record in the table for each of the last 7 days (regardless of the company_id), then you can use conditional aggregation as follows:
select
date(timestamp) as day,
count(distinct case when company_id = 1 then hash end) as visitors,
sum(company_id = 1) as pageviews
from behaviour
where timestamp >= curdate() - interval 7 day
group by day
Note that I changed your query to use standard date arithmetic, which I find easier to understand than date functions.
Otherwise, you would need to move the condition on the date from the where clause to the aggregate functions:
select
date(timestamp) as day,
count(distinct case when timestamp >= curdate() - interval 7 day and company_id = 1 then hash end) as visitors,
sum(timestamp >= curdate() - interval 7 day and company_id = 1) as pageviews
from behaviour
group by day
If your table is big, this can be expensive so I would not recommend that.
Alternatively, you can generate a derived table of dates and left join it with your original query:
select
curdate() - interval x.n day as day,
count(distinct b.hash) visitors,
count(b.hash) page_views
from (
select 1 n union all select 2 union all select 3 union all select 4
union all select 5 union all select 6 union all select 7
) x
left join behaviour b
on b.company_id = 1
and b.timestamp >= curdate() - interval x.n day
and b.timestamp < curdate() - interval (x.n - 1) day
group by x.n
Use a query that returns all the dates from today minus 7 days to today and left join the table behaviour:
SELECT t.timestamp AS day,
Count(DISTINCT b.hash) AS visitors,
Count(b.timestamp) AS pageviews
FROM (
SELECT Subdate(Curdate(), 7) timestamp UNION ALL SELECT Subdate(Curdate(), 6) UNION ALL
SELECT Subdate(Curdate(), 5) UNION ALL SELECT Subdate(Curdate(), 4) UNION ALL SELECT Subdate(Curdate(), 3) UNION ALL
SELECT Subdate(Curdate(), 2) UNION ALL SELECT Subdate(Curdate(), 1) UNION ALL SELECT Curdate()
) t LEFT JOIN behaviour b
ON Date(b.timestamp) = t.timestamp AND b.company_id = 1
GROUP BY day
Use IFNULL:
IFNULL(expr1, 0)
From the documentation:
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.
You can use the following trick:
First, write a query that returns one dummy row: SELECT 1;
Next, use a LEFT JOIN with an always-true condition to attach the summary row(s). The join returns the summary values when data exists and NULLs otherwise.
Finally, select only the columns you need from the joined queries and convert the NULLs to zeros using the IFNULL function.
SELECT
IFNULL(b.day,0) AS DAY,
IFNULL(b.visitors,0) AS visitors,
IFNULL(b.pageviews,0) AS pageviews
FROM (
SELECT 1
) a
LEFT JOIN (
SELECT DATE(TIMESTAMP) AS DAY,
COUNT(DISTINCT HASH) AS visitors,
COUNT(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND TIMESTAMP >= SUBDATE(CURDATE(), 7)
GROUP BY DAY
) b ON 1 = 1;
I'm trying to find an answer to the following query:
A customer wants a single room for three consecutive nights. Find the first available date in December 2016.
As per the question, this should be the right answer. But I don't know how to solve it.
+-----+------------+
| id | MIN(i) |
+-----+------------+
| 201 | 2016-12-11 |
+-----+------------+
The link is from question number 14 here.
This is the ER diagram of the database:
I apologize that I'm a bit rusty with this kind of query and I can't guarantee that I got all of the syntax correct, but I think that something like the following might work:
SELECT id, DATE_ADD(end_date, INTERVAL 1 DAY) AS date
FROM (
SELECT r.id, STR_TO_DATE('2016-12-01', '%Y-%m-%d') AS start_of_month, b.booking_date AS start_date, DATE_ADD(b.booking_date, INTERVAL (b.nights - 1) DAY) AS end_date
FROM room r
LEFT JOIN booking b ON r.id = b.room_no
ORDER BY r.id, b.booking_date
) AS room_bookings
WHERE DATEDIFF(room_bookings.start_date, room_bookings.start_of_month) >= 3
OR DATEDIFF((
SELECT b2.booking_date FROM booking b2
WHERE b2.room_no = room_bookings.id AND b2.booking_date > room_bookings.start_date
ORDER BY b2.booking_date LIMIT 1), room_bookings.end_date
) >= 3
In fact, now that I type that all out, you might be able to tweak the WHERE of the main query so that you don't even need the room_bookings subselect. Hopefully this helps and isn't too far off the mark.
This seems very hard to do without a calendar table -- because an appropriate room might have no booking at all during the month. Without any booking, there is no record in the month to start with.
select r.id, dte
from room r cross join
(select date('2016-12-01') as dte union all
select date('2016-12-02') as dte union all
. . .
select date('2016-12-31') as dte
) d
where not exists (select 1 from booking b where b.room_no = r.id and b.booking_date = d.dte) and
not exists (select 1 from booking b where b.room_no = r.id and b.booking_date = d.dte + interval 1 day) and
not exists (select 1 from booking b where b.room_no = r.id and b.booking_date = d.dte + interval 2 day)
order by d.dte
limit 1;
This assumes that booking_date is the start of the stay. You need to provide the logic for a "single room".
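As a rough illustration of the calendar approach when stays last several nights, the sketch below treats a booking as occupying the nights from booking_date up to (but not including) booking_date + nights, and looks for the first December day on which a single room has no overlapping booking for three nights. It runs on Python's sqlite3 module rather than MySQL. The room_type column on room, the half-open occupancy rule and all sample rows are assumptions, not taken from the exercise database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- room_type is a hypothetical column used to pick out single rooms
    CREATE TABLE room (id INTEGER PRIMARY KEY, room_type TEXT);
    CREATE TABLE booking (room_no INTEGER, booking_date TEXT, nights INTEGER);

    INSERT INTO room VALUES (101, 'single'), (201, 'single'), (301, 'double');
    INSERT INTO booking VALUES
        (101, '2016-11-29', 5),   -- occupied through the night of Dec 3
        (101, '2016-12-05', 10),  -- occupied through the night of Dec 14
        (201, '2016-12-01', 3),   -- occupied through the night of Dec 3
        (201, '2016-12-06', 2);   -- occupied through the night of Dec 7
""")

SQL = """
WITH RECURSIVE december(day) AS (
    SELECT '2016-12-01'
    UNION ALL
    SELECT date(day, '+1 day') FROM december WHERE day < '2016-12-31'
)
SELECT r.id, d.day
FROM room AS r
CROSS JOIN december AS d
WHERE r.room_type = 'single'
  AND NOT EXISTS (
        SELECT 1
        FROM booking AS b
        WHERE b.room_no = r.id
          -- the booking starts before the requested stay ends ...
          AND b.booking_date < date(d.day, '+3 days')
          -- ... and ends after the requested stay starts
          AND date(b.booking_date, '+' || b.nights || ' days') > d.day
  )
ORDER BY d.day, r.id
LIMIT 1
"""

first_free = conn.execute(SQL).fetchone()
print(first_free)
```

A single overlap test replaces the three per-night NOT EXISTS checks, and a room with no bookings at all still qualifies because the candidate dates come from the calendar rather than from booking.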
select distinct top 1 alll.i,alll.room_no,
case
when (select count(*) from booking where room_no = alll.room_no and booking_date between dateadd(day,1,alll.i) and dateadd(day,3,alll.i)) > 0 then 'Y'
else 'N'
end as av3
from
(select c.i,b.room_no,b.booking_date
from calendar c cross join booking b
where month(c.i) = 12 and year(c.i) = 2016 and b.room_type_requested = 'single'
) as alll
join
(
select distinct c.i, b.room_no
from calendar c join booking b
on c.i between b.booking_date and DATEADD(day,b.nights-1,b.booking_date)
where month(c.i) = 12 and year(c.i) = 2016 and b.room_type_requested = 'single'
) as booked
on alll.i = booked.i
and alll.room_no <> booked.room_no
order by 1
This works. It is a little complicated but basically first checks all the rooms that are booked and then does a comparison between rooms not booked on each day of the month till the next 3 days.
My solution splits the problem into two parts, which end up as two queries combined together. It may not be the most efficient, but the solution is correct.
1) Of the single rooms, look at the last check-out date and see which one becomes vacant first (i.e. no more bookings for the rest of the month)
2) Check between current reservations to see if there's a 3-day gap between them
3) Combine the two results and take the minimum
WITH subquery AS( -- existing single-bed bookings in Dec
SELECT room_no, booking_date,
DATE_ADD(booking_date, INTERVAL (nights-1) DAY) AS last_night
FROM booking
WHERE room_type_requested='single' AND
DATE_ADD(booking_date, INTERVAL (nights-1) DAY)>='2016-12-1' AND
booking_date <='2016-12-31'
ORDER BY room_no, last_night)
SELECT room_no, MIN(first_avail) AS first_avail -- 3) join the 2 together
FROM(
-- 1) check the last date the room is booked in December (available after)
SELECT room_no, MIN(first_avail) AS first_avail
FROM(
SELECT room_no, DATE_ADD(MAX(last_night), INTERVAL 1 DAY) AS first_avail
FROM subquery q3
GROUP BY 1
ORDER BY 2) AS t2
UNION
-- 2) check if any 3-day exist in between reservations
SELECT room_no, DATE_ADD(MIN(end2), INTERVAL 1 DAY) AS first_avail
FROM(
SELECT q1.booking_date AS beg1, q1.room_no, q1.last_night AS end1,
q2.booking_date AS beg2, q2.last_night AS end2
FROM subquery q1
JOIN subquery q2
ON q1.room_no = q2.room_no AND q2.booking_date > q1.last_night
GROUP BY 2,1
ORDER BY 2,1) AS t
WHERE DATEDIFF(beg2, end1) > 3) AS inner_t
This works conceptually, as the first available date should always be the end of the previous booking.
SELECT MIN(DATE_ADD(a.booking_date, INTERVAL nights DAY)) AS i
FROM booking AS a
WHERE DATE_ADD(a.booking_date, INTERVAL nights DAY)
>= '2016-12-01'
AND room_type_requested = 'single'
AND NOT EXISTS
(SELECT 1 FROM booking AS b
WHERE b.booking_date BETWEEN
DATE_ADD(a.booking_date, INTERVAL nights DAY)
AND DATE_ADD(a.booking_date, INTERVAL nights+2 DAY)
AND a.room_no = b.room_no)
This seems like an easy task but my basic sql knowledge is failing me as I'm still learning.
Basically, I'm trying to combine:
SELECT DATE(created) DATE, COUNT(DISTINCT created) newpost FROM surveys
WHERE created >= Last_day(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND created < last_day(CURRENT_DATE) + INTERVAL 1 DAY GROUP BY DATE(created);
and
SELECT DATE(TIMESTAMP) DATE,subs FROM trafficstats
WHERE TIMESTAMP >= LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND TIMESTAMP < LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY;
into one query that will return data, grouped by date, into two additional columns - newposts and subs.
I've tried using UNION, which doesn't seem to be giving me the output I want. It combined the data into one column (newpost), and also didn't group by date.
I'm still fairly new to writing MySQL queries, and I've tried searching for answers to no avail. Hoping to seek the knowledge of those smarter than me here.
You could use JOIN
select t1.DATE, t1.newpost, t2.subs
from (
SELECT DATE(created) DATE, COUNT(DISTINCT created) newpost
FROM surveys
WHERE created >= Last_day(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND created < last_day(CURRENT_DATE) + INTERVAL 1 DAY
GROUP BY DATE(created)
) t1
left join (
SELECT DATE(TIMESTAMP) DATE, subs
FROM trafficstats
WHERE TIMESTAMP >= LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND TIMESTAMP < LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY
) t2 on t1.DATE = t2.DATE
I guess you want one row per distinct date, with two different count values shown.
This kind of query is slightly trickier than it seems at first glance, because the two summary queries might have different sets of dates.
So you need to start with a subquery that yields all possible dates of interest. You then need to LEFT JOIN each summary query to it. You must use LEFT JOIN instead of the ordinary inner JOIN, because LEFT JOIN doesn't suppress rows from the left side of the join when they don't match any rows from the right side.
Here goes:
All your dates. Notice the UNION operation is a setwise (duplicate-removing) union operation.
SELECT DISTINCT DATE(created) DATE FROM surveys
WHERE created >= Last_day(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND created < last_day(CURRENT_DATE) + INTERVAL 1 DAY
UNION
SELECT DISTINCT DATE(TIMESTAMP) DATE FROM trafficstats
WHERE TIMESTAMP >= LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND TIMESTAMP < LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY
Then you need your two summary subqueries. The first one is this. Notice that I changed COUNT(DISTINCT created) to COUNT(*) because I don't understand the logic behind the DISTINCT there. Can you have more than one row for a single post; do you tell them apart by timestamp? If you have a row for each post you should COUNT(*).
SELECT DATE(created) DATE, COUNT(*) newposts
FROM surveys
GROUP BY DATE(created)
The second summary is this. Again, I counted rows.
SELECT DATE(TIMESTAMP) DATE, COUNT(*) subs
FROM trafficstats
GROUP BY DATE(TIMESTAMP)
Finally, join those three subqueries like so. You get the dates from the first subquery, and the summary-by-date information from the second two subqueries.
SELECT dates.DATE, posts.newposts, subs.subs
FROM ( /* date subquery */ ) dates
LEFT JOIN ( /* posts subquery */ ) posts ON dates.DATE = posts.DATE
LEFT JOIN ( /* subs subquery */ ) subs ON dates.DATE = subs.DATE
ORDER BY dates.DATE
Putting it all together:
SELECT dates.DATE, posts.newposts, subs.subs
FROM (
SELECT DISTINCT DATE(created) DATE FROM surveys
WHERE created >= Last_day(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND created < last_day(CURRENT_DATE) + INTERVAL 1 DAY
UNION
SELECT DATE(TIMESTAMP) DATE FROM trafficstats
WHERE TIMESTAMP >= LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY - INTERVAL 1 MONTH
AND TIMESTAMP < LAST_DAY(CURRENT_DATE) + INTERVAL 1 DAY
) dates
LEFT JOIN (
SELECT DATE(created) DATE, COUNT(*) newposts
FROM surveys
GROUP BY DATE(created)
) posts ON dates.DATE = posts.DATE
LEFT JOIN (
SELECT DATE(TIMESTAMP) DATE, COUNT(*) subs
FROM trafficstats
GROUP BY DATE(TIMESTAMP)
) subs ON dates.DATE = subs.DATE
ORDER BY dates.DATE
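For anyone who wants to see the three-subquery pattern run end to end, here is a small sketch using Python's sqlite3 module in place of MySQL. The date range is passed in as parameters instead of being computed with LAST_DAY(), the second summary counts rows as in the answer above, and the table contents are invented; the alias names day and traffic are also my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (created TEXT)")
conn.execute("CREATE TABLE trafficstats (timestamp TEXT)")
conn.executemany(
    "INSERT INTO surveys VALUES (?)",
    [("2023-05-01 10:00:00",), ("2023-05-01 11:00:00",), ("2023-05-03 09:00:00",)],
)
conn.executemany(
    "INSERT INTO trafficstats VALUES (?)",
    [("2023-05-02 00:05:00",), ("2023-05-03 00:05:00",), ("2023-05-03 12:00:00",)],
)

SQL = """
SELECT dates.day, posts.newposts, traffic.subs
FROM (
    -- every date that appears in either table within the range
    SELECT DATE(created) AS day FROM surveys
    WHERE created >= :start AND created < :stop
    UNION
    SELECT DATE(timestamp) FROM trafficstats
    WHERE timestamp >= :start AND timestamp < :stop
) AS dates
LEFT JOIN (
    SELECT DATE(created) AS day, COUNT(*) AS newposts
    FROM surveys
    GROUP BY DATE(created)
) AS posts ON posts.day = dates.day
LEFT JOIN (
    SELECT DATE(timestamp) AS day, COUNT(*) AS subs
    FROM trafficstats
    GROUP BY DATE(timestamp)
) AS traffic ON traffic.day = dates.day
ORDER BY dates.day
"""

combined = conn.execute(SQL, {"start": "2023-05-01", "stop": "2023-06-01"}).fetchall()
for row in combined:
    print(row)
```

Dates present in only one of the two tables come back with NULL (None in Python) in the other column; wrapping the columns in COALESCE(..., 0) would turn those into zeros if that is preferred.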