I have a booking system that has multiple simultaneous bookings with a count number for each. I need to get the minimum and maximum for a specified date-range (for a day, in this case). I found some good code here, which works great in the test. But in my implementation, it fails in this particular instance:
It does not count bookings that start prior to the query-range and end within the query-range.
How do I fix this?
Here is an example:
This booking exists with these properties:
listings (an ID that multiple bookings can have, but only one in this case): 2f23f23f
date_start: 2016-01-15 08:00:00
date_end: 2016-01-17 08:00:00
state: active
count: 1
Result:
min_count: 0
max_count: 0
It should return:
min_count: 0
max_count: 1
If we query the very same, but with date range 2016-01-16 00:00:00 - 2016-01-16 23:59:59, it returns the correct answer:
min_count: 1
max_count: 1
Here is the MySQL query:
SELECT
MAX(simultaneous) AS max_count,
MIN(simultaneous) AS min_count
FROM (
SELECT IFNULL(SUM(
(
CASE WHEN (
listings = '2f23f23f'
AND
(state = 'active')
)
THEN count
ELSE 0
END
)
),0) AS simultaneous
FROM bookings RIGHT JOIN (
SELECT date_start AS boundary
FROM bookings
WHERE date_start BETWEEN '2016-01-17 00:00:00' AND '2016-01-17 23:59:59'
UNION
SELECT date_end
FROM bookings
WHERE date_end BETWEEN '2016-01-17 00:00:00' AND '2016-01-17 23:59:59'
UNION
SELECT MAX(boundary)
FROM (
SELECT MAX(date_start) AS boundary
FROM bookings
WHERE date_start <= '2016-01-17 00:00:00'
UNION ALL
SELECT MAX(date_end)
FROM bookings
WHERE date_end <= '2016-01-17 23:59:59'
) t
) t ON date_start <= boundary AND boundary < date_end
LEFT OUTER JOIN cart ON cart_bookings = id
GROUP BY boundary
) t
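Whew, ok, here's the answer. The original query wasn't complete: the set of boundaries also has to include the start and end of the time range being requested. Otherwise a booking that starts before the range and ends inside it is only sampled at its own end time, when it no longer counts as active. Add these two branches to the boundary UNION: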
UNION
SELECT :date_start
UNION
SELECT :date_end
Complete code:
SELECT
MAX(simultaneous) AS max_count,
MIN(simultaneous) AS min_count
FROM (
SELECT IFNULL(SUM(
(
CASE WHEN (
listings = '2f23f23f'
AND
(state = 'active')
)
THEN count
ELSE 0
END
)
),0) AS simultaneous
FROM bookings RIGHT JOIN (
SELECT date_start AS boundary
FROM bookings
WHERE date_start BETWEEN '2016-01-17 00:00:00' AND '2016-01-17 23:59:59'
UNION
SELECT date_end
FROM bookings
WHERE date_end BETWEEN '2016-01-17 00:00:00' AND '2016-01-17 23:59:59'
UNION
SELECT MAX(boundary)
FROM (
SELECT MAX(date_start) AS boundary
FROM bookings
WHERE date_start <= '2016-01-17 00:00:00'
UNION ALL
SELECT MAX(date_end)
FROM bookings
WHERE date_end <= '2016-01-17 23:59:59'
) t
UNION
SELECT :date_start
UNION
SELECT :date_end
) t ON date_start <= boundary AND boundary < date_end
LEFT OUTER JOIN cart ON cart_bookings = id
GROUP BY boundary
) t
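To see why the two extra boundaries matter, here is a minimal sketch that runs a simplified version of the query against the example booking. It is an illustration under assumptions, not the production query: it uses Python's built-in sqlite3 instead of MySQL, drops the cart join, the listings filter and the MAX(boundary) branch, and uses a LEFT JOIN from the boundary set instead of the RIGHT JOIN. The function name min_max_simultaneous and the include_range_endpoints flag are made up for this example.

```python
import sqlite3

def min_max_simultaneous(conn, range_start, range_end, include_range_endpoints):
    """Return (min_count, max_count) of active bookings sampled at each boundary."""
    # Optional extra boundaries: the start and end of the queried range itself.
    endpoints = (
        "UNION SELECT :range_start UNION SELECT :range_end"
        if include_range_endpoints
        else ""
    )
    sql = f"""
        SELECT MIN(simultaneous), MAX(simultaneous)
        FROM (
            SELECT IFNULL(SUM(b."count"), 0) AS simultaneous
            FROM (
                SELECT date_start AS boundary FROM bookings
                WHERE date_start BETWEEN :range_start AND :range_end
                UNION
                SELECT date_end FROM bookings
                WHERE date_end BETWEEN :range_start AND :range_end
                {endpoints}
            ) AS t
            LEFT JOIN bookings AS b
                   ON b.state = 'active'
                  AND b.date_start <= t.boundary
                  AND t.boundary < b.date_end
            GROUP BY t.boundary
        ) AS per_boundary
    """
    params = {"range_start": range_start, "range_end": range_end}
    return conn.execute(sql, params).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE bookings (listings TEXT, date_start TEXT, date_end TEXT, '
    'state TEXT, "count" INTEGER)'
)
# The booking from the question: starts before the queried day, ends inside it.
conn.execute(
    "INSERT INTO bookings VALUES "
    "('2f23f23f', '2016-01-15 08:00:00', '2016-01-17 08:00:00', 'active', 1)"
)

day = ("2016-01-17 00:00:00", "2016-01-17 23:59:59")
without_endpoints = min_max_simultaneous(conn, *day, include_range_endpoints=False)
with_endpoints = min_max_simultaneous(conn, *day, include_range_endpoints=True)
print(without_endpoints, with_endpoints)
```

Without the range endpoints, the only boundary is the booking's own end time, where the booking is already excluded by boundary < date_end. With them, the range start is sampled too, and the booking is still active at that moment.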
I have a table "products" and a table "links". Every product can have multiple links, and each link can have many sales, clicks and impressions, but doesn't necessarily have all of them. I want to get a list of the links of a certain product that match some criteria, with the data grouped per day, campaign and link banner size.
The following query works correctly, but it's much slower than it could be. The problem is that the subqueries fetch the data for all link IDs, and the result is only filtered at the end. The overall query would be much faster if the subqueries included something like
where link_id IN (...)
but I only know the link IDs from the main query, not beforehand. If I try to add
where link_id = l.id
it's an unknown column, because the subquery doesn't have access to the main query's results.
How can I make the subqueries look up data only for the link IDs that the main query found? I could split it into two completely separate queries, but is this possible in one query?
select p.id, p.name, l.id, l.banner_size,
coalesce(sum(case when t1.col = 'sales' then ct else 0 end), 0) as total_sales,
coalesce(sum(case when t1.col = 'clicks' then ct else 0 end), 0) as total_clicks,
coalesce(sum(case when t1.col = 'impressions' then ct else 0 end), 0) as total_impressions,
t1.dt
from links l
inner join products p
on p.id = l.product_id
left join
(
select count(1) as ct, link_id, date(clicked) dt, 'sales' as col
from sales
where clicked >= '2020-01-01 00:00:00' and clicked <= '2020-01-31 00:00:00'
group by date(clicked), link_id
union all
select count(1) as ct, link_id, date(created) dt, 'clicks'
from clicks_source1
where created >= '2020-01-01 00:00:00' and created <= '2020-01-31 00:00:00'
group by date(created), link_id
union all
select count(1) as ct, link_id, date(time) dt, 'clicks'
from clicks_source2
where time >= '2020-01-01 00:00:00' and time <= '2020-01-31 00:00:00'
group by date(time), link_id
union all
select count(1) as ct, link_id, date(created) dt, 'impressions'
from impression_source1
where created > '2020-01-01 00:00:00' and created <= '2020-01-31 00:00:00'
group by date(created), link_id
union all
select count(1) as ct, link_id, date(time) dt, 'impressions'
from impression_source2
where time > '2020-01-01 00:00:00' and time <= '2020-01-31 00:00:00'
group by date(time), link_id
) t1 on t1.link_id = l.id
where l.agent_id = 300
and p.id = 3454
and l.banner_size = 48
and p.company NOT IN (13592, 28189)
group by c.id, l.banner_size, t1.dt
having (total_sales + total_clicks + total_impressions) > 0
order by dt DESC, p.id ASC, l.banner_size ASC
You can just add inner joins with links to all the subqueries:
select count(1) as ct, s.link_id, date(s.clicked) dt, 'sales' as col
from sales s
join links l1 on l1.id = s.link_id
where s.clicked >= '2020-01-01 00:00:00' and s.clicked <= '2020-01-31 00:00:00'
group by date(s.clicked), s.link_id
union all etc.
Then the subqueries only return rows that have a matching link, and the query should run faster.
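A join on links alone only removes rows whose link_id has no link at all, so for the subquery to shrink, the link criteria from the outer WHERE probably need to be repeated inside it. Here is a small sketch of that idea for the sales subquery, using Python's sqlite3 as a stand-in for MySQL. The table contents are invented, and repeating the agent_id, product_id and banner_size filters inside the subquery is my assumption about what is intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, product_id INTEGER,
                        agent_id INTEGER, banner_size INTEGER);
    CREATE TABLE sales (link_id INTEGER, clicked TEXT);
    -- Link 1 matches the outer criteria, link 2 does not (made-up data).
    INSERT INTO links VALUES (1, 3454, 300, 48), (2, 9999, 7, 48);
    INSERT INTO sales VALUES
        (1, '2020-01-05 10:00:00'),
        (1, '2020-01-05 11:00:00'),
        (2, '2020-01-05 12:00:00');
""")

# The sales subquery with the link filter pushed down into it.
filtered_sales = conn.execute("""
    SELECT COUNT(1) AS ct, s.link_id, date(s.clicked) AS dt, 'sales' AS col
    FROM sales s
    JOIN links l1 ON l1.id = s.link_id
    WHERE s.clicked >= '2020-01-01 00:00:00'
      AND s.clicked <= '2020-01-31 00:00:00'
      AND l1.agent_id = 300        -- assumption: repeat the outer link filters
      AND l1.product_id = 3454
      AND l1.banner_size = 48
    GROUP BY date(s.clicked), s.link_id
""").fetchall()
print(filtered_sales)
```

Only the rows for link 1 are aggregated, so the outer join has far less to process. The same pattern would apply to each branch of the UNION ALL.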
I have 2 queries right now for which I am looking to combine into 1 if possible.
I have open tickets stored in the Tickets_Open table and closed tickets in Tickets_Closed. Both tables have "Date_Requested" and "Date_Completed" columns. I need to count the number of tickets requested and completed each day.
My tickets requested count query is the following:
SELECT SUM(Count) AS TotalOpen, Date FROM(
SELECT COUNT(Ticket_Request_Code) AS Count, Date_Requested AS Date
FROM Tickets_Closed
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
UNION
SELECT COUNT(Work_Request_Code) AS Count, Date_Requested AS Date
FROM Tickets_Open
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
) AS t1 GROUP BY Date ORDER BY `t1`.`Date` DESC
My tickets completed count query is the following:
SELECT COUNT(Ticket_Request_Code) AS CountClosed, Date_Completed AS Date
FROM Tickets_Closed
Where Date_Completed >='2018-01-01 00:00:00'
GROUP BY(Date_Completed)
Both queries return the correct result. For open it returns with the column headings Date and TotalOpen. For close it returns with the column headings Date and CountClosed.
Is it possible to return a single result with the column headings Date, TotalOpen and CountClosed?
You can combine these as:
SELECT Date, SUM(isopen) as TotalOpen, SUM(isclose) as CountClosed
FROM ((SELECT Date_Requested as Date, 1 as isopen, 0 as isclose
FROM Tickets_Closed
WHERE Date_Requested >= '2018-01-01'
) UNION ALL
(SELECT Date_Requested, 1 as isopen, 0 as isclose
FROM Tickets_Open
WHERE Date_Requested >= '2018-01-01'
) UNION ALL
(SELECT Date_Completed as Date, 0 as isopen, 1 as isclose
FROM Tickets_Closed
WHERE Date_Completed >= '2018-01-01'
)
) t
GROUP BY Date
ORDER BY Date DESC;
This assumes that Ticket_Request_Code and Work_Request_Code are not NULL. If COUNT() is really being used to check for NULL values, then add the condition to the WHERE clause in each subquery.
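As a quick check of the UNION ALL approach, here is a sketch that runs it on a few invented tickets with Python's sqlite3. The data is made up, the completed date is taken from the Date_Completed column named in the question, and the parentheses around each branch are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tickets_Open (Work_Request_Code TEXT, Date_Requested TEXT);
    CREATE TABLE Tickets_Closed (Ticket_Request_Code TEXT,
                                 Date_Requested TEXT, Date_Completed TEXT);
    INSERT INTO Tickets_Open VALUES ('W1', '2018-01-02');
    INSERT INTO Tickets_Closed VALUES
        ('T1', '2018-01-01', '2018-01-02'),
        ('T2', '2018-01-02', '2018-01-02');
""")

# Each branch emits one row per ticket event, flagged as a request or a completion.
rows = conn.execute("""
    SELECT Date, SUM(isopen) AS TotalOpen, SUM(isclose) AS CountClosed
    FROM (SELECT Date_Requested AS Date, 1 AS isopen, 0 AS isclose
          FROM Tickets_Closed
          WHERE Date_Requested >= '2018-01-01'
          UNION ALL
          SELECT Date_Requested, 1, 0
          FROM Tickets_Open
          WHERE Date_Requested >= '2018-01-01'
          UNION ALL
          SELECT Date_Completed, 0, 1
          FROM Tickets_Closed
          WHERE Date_Completed >= '2018-01-01') t
    GROUP BY Date
    ORDER BY Date DESC
""").fetchall()
print(rows)
```

A day that only has requests still shows up with CountClosed = 0, because every event contributes a row with a 0 in the other column.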
This query uses a FULL OUTER JOIN on the dates and adds the open and closed counts together to give the TotalOpen count. It also handles the NULL values you get on days that appear on only one side of the join, such as a day on which no tickets were closed. Note that MySQL does not support FULL OUTER JOIN, so this form only runs on engines that do.
WITH open AS
(
SELECT COUNT(Work_Request_Code) AS OpenCount, Date_Requested AS Date
FROM Tickets_Open
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
)
, closed AS
(
SELECT COUNT(Ticket_Request_Code) AS ClosedCount, Date_Requested AS Date
FROM Tickets_Closed
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
)
SELECT
COALESCE(c.Date, o.Date) AS Date
, IFNULL(o.OpenCount, 0) + IFNULL(c.ClosedCount, 0) AS TotalOpen
, IFNULL(c.ClosedCount, 0) AS CountClosed
FROM open o
FULL OUTER JOIN closed c ON o.Date = c.Date
Could you help me calculate the percentage of users who made payments?
I've got two tables:
activity
user_id login_time
201 01.01.2017
202 01.01.2017
255 04.01.2017
255 05.01.2017
256 05.01.2017
260 15.03.2017
payments
user_id payment_date
200 01.01.2017
202 01.01.2017
255 05.01.2017
I tried this query, but it calculates the wrong percentage:
SELECT activity.login_time, (select COUNT(distinct payments.user_id)
from payments where payments.payment_date between '2017-01-01' and
'2017-01-05') / COUNT(distinct activity.user_id) * 100
AS percent
FROM payments INNER JOIN activity ON
activity.user_id = payments.user_id and activity.login_time between
'2017-01-01' and '2017-01-05'
GROUP BY activity.login_time;
I need this result:
01.01.2017 100%
02.01.2017 0%
03.01.2017 0%
04.01.2017 0%
05.01.2017 50%
If you want the ratio of users who have made payments to those with activity, just summarize each table individually:
select p.cnt / a.cnt
from (select count(distinct user_id) as cnt from activity a) a cross join
(select count(distinct user_id) as cnt from payments) p;
EDIT:
You need a table with all dates in the range. That is the biggest problem.
Then I would recommend:
SELECT d.dte,
( ( SELECT COUNT(DISTINCT p.user_id)
FROM payments p
WHERE p.payment_date >= d.dte and p.payment_date < d.dte + INTERVAL 1 DAY
) /
NULLIF( (SELECT COUNT(DISTINCT a.user_id)
FROM activity a
WHERE a.login_time >= d.dte and a.login_time < d.dte + INTERVAL 1 DAY
), 0
)
) as ratio
FROM (SELECT date('2017-01-01') dte UNION ALL
SELECT date('2017-01-02') dte UNION ALL
SELECT date('2017-01-03') dte UNION ALL
SELECT date('2017-01-04') dte UNION ALL
SELECT date('2017-01-05') dte
) d;
Notes:
This returns NULL on days where there is no activity. That makes more sense to me than 0.
This uses logic on the dates that works for both dates and date/time values.
The logic for dates can make use of an index, which can be important for this type of query.
I don't recommend using LEFT JOINs. That will multiply the data which can make the query expensive.
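Here is a runnable sketch of the per-day query on the sample data from the question, using Python's sqlite3 rather than MySQL. Several things are my assumptions: the dates are stored as ISO strings (2017-01-01 instead of 01.01.2017), SQLite's date(d.dte, '+1 day') stands in for d.dte + INTERVAL 1 DAY, and the ratio is multiplied by 100.0 to give a percentage and to avoid integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (user_id INTEGER, login_time TEXT);
    CREATE TABLE payments (user_id INTEGER, payment_date TEXT);
    INSERT INTO activity VALUES
        (201, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-04'),
        (255, '2017-01-05'), (256, '2017-01-05'), (260, '2017-03-15');
    INSERT INTO payments VALUES
        (200, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-05');
""")

# One row per calendar day; each count is a correlated subquery on a
# half-open range [day, day + 1), which also works for date/time values.
rows = conn.execute("""
    SELECT d.dte,
           100.0 * (SELECT COUNT(DISTINCT p.user_id)
                    FROM payments p
                    WHERE p.payment_date >= d.dte
                      AND p.payment_date < date(d.dte, '+1 day'))
                 / NULLIF((SELECT COUNT(DISTINCT a.user_id)
                           FROM activity a
                           WHERE a.login_time >= d.dte
                             AND a.login_time < date(d.dte, '+1 day')), 0)
               AS percent
    FROM (SELECT '2017-01-01' AS dte UNION ALL
          SELECT '2017-01-02' UNION ALL
          SELECT '2017-01-03' UNION ALL
          SELECT '2017-01-04' UNION ALL
          SELECT '2017-01-05') d
    ORDER BY d.dte
""").fetchall()
for day, percent in rows:
    print(day, percent)
```

Days without any activity come back as NULL (None in Python) because of the NULLIF, which matches the first note above. Wrap the expression in COALESCE(..., 0) if 0% is preferred for those days.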
First you need a table with all days in the range. Since the range is small you can build an ad hoc derived table using UNION ALL. Then left join the payments and activities. Group by the day and calculate the percentage using the count()s.
SELECT x.day,
concat(CASE count(DISTINCT a.user_id)
WHEN 0 THEN
0
ELSE
count(DISTINCT p.user_id)
/
count(DISTINCT a.user_id)
END
*
100,
'%')
FROM (SELECT cast('2017-01-01' AS date) day
UNION ALL
SELECT cast('2017-01-02' AS date) day
UNION ALL
SELECT cast('2017-01-03' AS date) day
UNION ALL
SELECT cast('2017-01-04' AS date) day
UNION ALL
SELECT cast('2017-01-05' AS date) day) x
LEFT JOIN payments p
ON p.payment_date = x.day
LEFT JOIN activity a
ON a.login_time = x.day
GROUP BY x.day;
I have a query with subqueries for a timeline widget of participants, leads and customers.
With 15k rows in the table but only 2k in this date range (January 1st to January 28th), it takes about 40 seconds!
SELECT created_at as date,
(
SELECT COUNT(id)
FROM participant
WHERE created_at <= date
) as participants,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "lead"
AND created_at <= date
) as leads,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "customer"
AND created_at <= date
) as customer
FROM participant
WHERE created_at >= '2016-01-01 00:00:00'
AND created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
How can I improve the performance?
The table fields are declared as follows:
id => primary_key, INT 10, auto increment
participant_type => ENUM "lead,customer", NULLABLE, utf8_unicode_ci
created_at => TIMESTAMP, default '0000-00-00 00:00:00'
You could try putting the conditions inside the counts (or sums) after cross joining the table with itself:
SELECT a.created_at as date,
SUM(IF(b.created_at <= a.created_at, 1, 0)) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead" AND b.created_at <= a.created_at, b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer" AND b.created_at <= a.created_at, b.id, NULL)) AS customer
FROM participant a
CROSS JOIN participant b
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
Or maybe move the date check into the join:
SELECT a.created_at as date,
COUNT(b.id) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead", b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer", b.id, NULL)) AS customer
FROM participant a
LEFT OUTER JOIN participant b
ON b.created_at <= a.created_at
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
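One thing to watch with the join form: if several rows of a fall on the same day, COUNT(b.id) counts every (a, b) pair, so the participants total can be inflated. A variation that avoids this is to reduce the outer side to one row per day first. The sketch below shows that variation with Python's sqlite3 and invented rows. It is my adaptation, not the query above, and it assumes the running totals should include everything created up to the end of each day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participant (id INTEGER PRIMARY KEY,
                              participant_type TEXT, created_at TEXT);
    INSERT INTO participant VALUES
        (1, 'lead',     '2015-12-31 10:00:00'),
        (2, 'customer', '2016-01-01 09:00:00'),
        (3, 'lead',     '2016-01-01 15:00:00'),
        (4, NULL,       '2016-01-02 08:00:00');
""")

# d holds one row per day in the range; b supplies every participant
# created on or before that day, so each count is a running total.
rows = conn.execute("""
    SELECT d.day,
           COUNT(b.id) AS participants,
           COUNT(CASE WHEN b.participant_type = 'lead' THEN 1 END) AS leads,
           COUNT(CASE WHEN b.participant_type = 'customer' THEN 1 END) AS customers
    FROM (SELECT DISTINCT date(created_at) AS day
          FROM participant
          WHERE created_at >= '2016-01-01 00:00:00'
            AND created_at <= '2016-01-28 23:59:59') d
    LEFT JOIN participant b ON date(b.created_at) <= d.day
    GROUP BY d.day
    ORDER BY d.day
""").fetchall()
print(rows)
```

Because each day appears once on the outer side, every participant is counted at most once per day and no DISTINCT is needed inside the counts.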
I don't fully understand what you want to do with this query, but I may be able to suggest a way to optimize it.
Try this one:
SELECT
participants.day as day,
participants.total_count,
leads.lead_count,
customer.customer_count
FROM
(
SELECT created_at as day, COUNT(id) as total_count
FROM participant
WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as participants
LEFT JOIN
(
SELECT created_at as day, COUNT(DISTINCT id) as lead_count
FROM participant
WHERE participant_type = "lead"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as leads ON (participants.day = leads.day)
LEFT JOIN
(
SELECT created_at as day, COUNT(DISTINCT id) as customer_count
FROM participant
WHERE participant_type = "customer"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as customer ON (participants.day = customer.day)
Add an index on the columns the query filters on. You can then run EXPLAIN on the query to check that it is used.
With the help of EXPLAIN, you can see where you should add indexes to tables so that the statement executes faster by using indexes to find rows.
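To illustrate the workflow, here is a small sketch that creates an index on created_at and inspects the plan. It uses Python's sqlite3, where the statement is EXPLAIN QUERY PLAN and the output format differs from MySQL's EXPLAIN. The index name idx_participant_created_at is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participant (id INTEGER PRIMARY KEY, "
    "participant_type TEXT, created_at TEXT)"
)
# Hypothetical index on the column used in the range filter.
conn.execute(
    "CREATE INDEX idx_participant_created_at ON participant (created_at)"
)

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT COUNT(id)
    FROM participant
    WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
""").fetchall()

# The last column of each plan row is the human-readable description.
details = [row[-1] for row in plan]
uses_index = any("idx_participant_created_at" in detail for detail in details)
for detail in details:
    print(detail)
```

In MySQL the equivalent check would be to look at the key column of the EXPLAIN output and confirm that the new index is listed there.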
This query works but pulls all results. I would like it to return only the rows where totaldue is not 0.00. totaldue is calculated within the query, and I do not know how to exclude the rows where it is 0.00.
SELECT name,
SUM(IF(timeperiod='0',totalinv-paidtotal,0)) AS p0030,
SUM(IF(timeperiod='30',totalinv-paidtotal,0)) AS p3060,
SUM(IF(timeperiod='60',totalinv-paidtotal,0)) AS p6090,
SUM(IF(timeperiod='90',totalinv-paidtotal,0)) AS p9000,
SUM(totalinv)-SUM(paidtotal) AS totaldue
FROM
(
SELECT primary_key, name, timeperiod, totalinv, SUM(paidtotal) as paidtotal
FROM
(
SELECT
a.primary_key,
a_name AS name,
CAST(totalinv AS DECIMAL(10,2)) as totalinv,
CAST(IFNULL(amount,0) AS DECIMAL(10,2)) as paidtotal,
CASE
WHEN invoicedate > DATE_SUB(STR_TO_DATE($today,'%Y%m%d'),INTERVAL 29 DAY) THEN '0'
WHEN invoicedate > DATE_SUB(STR_TO_DATE($today,'%Y%m%d'),INTERVAL 59 DAY) AND invoicedate <= DATE_SUB(STR_TO_DATE($today,'%Y%m%d'),INTERVAL 29 DAY) THEN '30'
WHEN invoicedate > DATE_SUB(STR_TO_DATE($today,'%Y%m%d'),INTERVAL 89 DAY) AND invoicedate <= DATE_SUB(STR_TO_DATE($today,'%Y%m%d'),INTERVAL 59 DAY) THEN '60'
ELSE '90'
END AS timeperiod
FROM $mysql_billing a
LEFT OUTER JOIN $mysql_billing_dates b ON a.primary_key = b.id
WHERE $today >= invoicedate
AND $totaldue!='0.00'
AND void=''
) foo
GROUP BY primary_key, name, timeperiod
) bar
GROUP BY name
ORDER BY name ASC
You are just missing a HAVING at the very end:
....
GROUP BY name
HAVING totaldue != 0
ORDER BY name ASC
That will allow you to select on your calculated/grouped column.
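Here is a minimal sketch of the HAVING filter on a computed column, run with Python's sqlite3. The invoices table and its rows are invented for the example, and amounts are stored as integer cents to keep the comparison with 0 exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (name TEXT, totalinv INTEGER, paidtotal INTEGER);
    INSERT INTO invoices VALUES
        ('Acme', 10000, 10000),   -- fully paid, should be excluded
        ('Bolt',  5000,  2000),
        ('Bolt',  3000,  3000);
""")

# WHERE runs before grouping, so the filter on the aggregate goes in HAVING.
rows = conn.execute("""
    SELECT name, SUM(totalinv) - SUM(paidtotal) AS totaldue
    FROM invoices
    GROUP BY name
    HAVING totaldue != 0
    ORDER BY name ASC
""").fetchall()
print(rows)
```

Acme is dropped because its total due is 0, while Bolt is kept with the remaining balance summed across both invoices.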