Use main query condition on sub query - mysql

I have a table "products" and a table "links". Every product can have multiple links, and each link can have many sales, clicks and impressions, but doesn't necessarily have all of them. I want to get a list of the links of a certain product that match some criteria, with the data grouped per day, campaign and link banner size.
The following query works correctly, but it's much slower than it could be. The problem is that the subqueries fetch the data for all link ids, and the result is only filtered at the end. The overall query would become much faster if the subqueries included something like
where product_id IN (...) but I only know the link_ids from the main query, not before
If I try to add
where link_id = l.id
it's obviously an unknown column, because the subquery doesn't have access to the main query's results.
How can I make the subqueries look up data only for the link_ids that the main query found? I could split it into two completely separate queries, but is this possible in one query?
select p.id, p.name, l.id, l.banner_size,
coalesce(sum(case when t1.col = 'sales' then ct else 0 end), 0) as total_sales,
coalesce(sum(case when t1.col = 'clicks' then ct else 0 end), 0) as total_clicks,
coalesce(sum(case when t1.col = 'impressions' then ct else 0 end), 0) as total_impressions,
t1.dt
from links l
inner join products p
on p.id = l.product_id
left join
(
select count(1) as ct, link_id, date(clicked) dt, 'sales' as col
from sales
where clicked >= '2020-01-01 00:00:00' and clicked <= '2020-01-31 00:00:00'
group by date(clicked), link_id
union all
select count(1) as ct, link_id, date(created) dt, 'clicks'
from clicks_source1
where created >= '2020-01-01 00:00:00' and created <= '2020-01-31 00:00:00'
group by date(created), link_id
union all
select count(1) as ct, link_id, date(time) dt, 'clicks'
from clicks_source2
where time >= '2020-01-01 00:00:00' and time <= '2020-01-31 00:00:00'
group by date(time), link_id
union all
select count(1) as ct, link_id, date(created) dt, 'impressions'
from impression_source1
where created > '2020-01-01 00:00:00' and created <= '2020-01-31 00:00:00'
group by date(created), link_id
union all
select count(1) as ct, link_id, date(time) dt, 'impressions'
from impression_source2
where time > '2020-01-01 00:00:00' and time <= '2020-01-31 00:00:00'
group by date(time), link_id
) t1 on t1.link_id = l.id
where l.agent_id = 300
and p.id = 3454
and l.banner_size = 48
and p.company NOT IN (13592, 28189)
group by c.id, l.banner_size, t1.dt
having (total_sales + total_clicks + total_impressions) > 0
order by dt DESC, p.id ASC, l.banner_size ASC

What you'd like to use is called a lateral join. MySQL only supports these from version 8.0.14 on (LATERAL derived tables); older versions don't feature them.
One solution is to move the subqueries for the counts to the select clause:
select
p.id,
p.name,
coalesce((select count(*) from views v where v.product_id = p.product_id), 0)
as total_views,
coalesce((select count(*) from clicks c where c.product_id = p.product_id), 0)
as total_clicks
from products p
where p.status = 1;
There was no need to combine views and clicks in one subquery. Maybe it's just this that kept the optimizer from finding a better execution plan. You can try the following and check whether it's already much faster than your original query.
select
p.id,
p.name,
coalesce(v.total, 0) as total_views,
coalesce(c.total, 0) as total_clicks
from products p
left join (select product_id, count(*) as total from views group by product_id) v
on v.product_id = p.product_id
left join (select product_id, count(*) as total from clicks group by product_id) c
on c.product_id = p.product_id
where p.status = 1;

You say that your example is much simplified. Maybe you can just apply the condition early by repeating it. E.g.:
select p.id, p.name,
coalesce(sum(case when t1.col = 'views' then ct else 0 end), 0) as total_views,
coalesce(sum(case when t1.col = 'clicks' then ct else 0 end), 0) as total_clicks
from products p
left join
(
select count(1) as ct, product_id, 'views' as col
from views
where product_id in (select product_id from products where status = 1)
group by product_id
union all
select count(1) as ct, product_id, 'clicks' as col
from clicks
where product_id in (select product_id from products where status = 1)
group by product_id
) t1 on t1.product_id = p.product_id
where p.status = 1
group by p.id, p.name;
Or with a WITH clause (available in MySQL 8.0 and later):
with p as (select * from products where status = 1)
select p.id, p.name,
coalesce(sum(case when t1.col = 'views' then ct else 0 end), 0) as total_views,
coalesce(sum(case when t1.col = 'clicks' then ct else 0 end), 0) as total_clicks
from p
left join
(
select count(1) as ct, product_id, 'views' as col
from views
where product_id in (select product_id from p)
group by product_id
union all
select count(1) as ct, product_id, 'clicks' as col
from clicks
where product_id in (select product_id from p)
group by product_id
) t1 on t1.product_id = p.product_id
group by p.id, p.name
;
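Applied to the links schema from the question, repeating the condition could look roughly like the sketch below. It is only an illustration: it runs on SQLite through Python's sqlite3 module instead of MySQL, the table contents are made up, only two of the five UNION ALL branches are shown, and an inner join replaces the left join plus HAVING filter of the original. The LINK_FILTER name is a hypothetical helper, not something from the question.

```python
import sqlite3

# Hypothetical miniature of the question's schema (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, product_id INT, banner_size INT);
    CREATE TABLE sales (link_id INT, clicked TEXT);
    CREATE TABLE clicks_source1 (link_id INT, created TEXT);
    INSERT INTO links VALUES (1, 3454, 48), (2, 3454, 48), (3, 9999, 48);
    INSERT INTO sales VALUES (1, '2020-01-05 10:00:00'), (1, '2020-01-05 11:00:00'),
                             (3, '2020-01-05 12:00:00');
    INSERT INTO clicks_source1 VALUES (1, '2020-01-05 09:00:00'),
                                      (2, '2020-01-06 09:00:00'),
                                      (3, '2020-01-06 10:00:00');
""")

# The same link filter is repeated inside every UNION ALL branch, so each
# branch aggregates only rows that the outer query can actually use.
LINK_FILTER = "SELECT id FROM links WHERE product_id = 3454 AND banner_size = 48"

query = f"""
    SELECT l.id, t1.dt,
           SUM(CASE WHEN t1.col = 'sales'  THEN t1.ct ELSE 0 END) AS total_sales,
           SUM(CASE WHEN t1.col = 'clicks' THEN t1.ct ELSE 0 END) AS total_clicks
    FROM links l
    JOIN (
        SELECT COUNT(*) AS ct, link_id, DATE(clicked) AS dt, 'sales' AS col
        FROM sales
        WHERE link_id IN ({LINK_FILTER})
        GROUP BY link_id, DATE(clicked)
        UNION ALL
        SELECT COUNT(*) AS ct, link_id, DATE(created) AS dt, 'clicks' AS col
        FROM clicks_source1
        WHERE link_id IN ({LINK_FILTER})
        GROUP BY link_id, DATE(created)
    ) t1 ON t1.link_id = l.id
    WHERE l.product_id = 3454 AND l.banner_size = 48
    GROUP BY l.id, t1.dt
    ORDER BY t1.dt DESC, l.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Whether this is actually faster on MySQL depends on the indexes on link_id and on the date columns, so it is worth comparing the EXPLAIN output of both variants.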

Related

Select customers with no invoices after date

I have Client and Invoice tables. They have one-to-many relationship, where Client.id = Invoice.client_id.
Client columns:
id
Invoice columns:
id,
client_id,
invoice_date
Of course the example is simplified to relevant data.
I am trying to select customers who did NOT have invoices after '2010-01-01'.
I can't figure out any working way to do this. Some routes I took look like this (there are many other variations, but there's no point in displaying them here):
SELECT c.id, COUNT(i.invoice_date > "2010-01-01") AS cnt
FROM Client AS c LEFT JOIN Invoice i ON i.client_id = c.id
GROUP BY c.id HAVING cnt = 0
and
SELECT client_id, COUNT(invoice_date > '2010-01-01') as cnt
FROM Invoice
GROUP BY client_id HAVING cnt = 0
You can use a sub-query with NOT EXISTS like this:
SELECT *
FROM Client
WHERE NOT EXISTS (
SELECT 1
FROM Invoice
WHERE Invoice.invoice_date > '2010-01-01' AND Invoice.client_id = Client.id
)
You can also use SUM with CASE or IF:
-- CASE
SELECT c.id, SUM(CASE WHEN i.invoice_date > '2010-01-01' THEN 1 ELSE 0 END) AS cnt
FROM Client AS c LEFT JOIN Invoice i ON i.client_id = c.id
GROUP BY c.id
HAVING cnt = 0
-- IF
SELECT c.id, SUM(IF(i.invoice_date > '2010-01-01', 1, 0)) AS cnt
FROM Client AS c LEFT JOIN Invoice i ON i.client_id = c.id
GROUP BY c.id
HAVING cnt = 0
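You can also use COUNT, but with CASE or IF, so that non-matching rows become NULL and are not counted. Note that these queries read only the Invoice table, so clients without any invoice at all don't show up: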
-- CASE
SELECT client_id, COUNT(CASE WHEN invoice_date > '2010-01-01' THEN 1 ELSE NULL END) as cnt
FROM Invoice
GROUP BY client_id HAVING cnt = 0
-- IF
SELECT client_id, COUNT(IF(invoice_date > '2010-01-01', 1, NULL)) as cnt
FROM Invoice
GROUP BY client_id HAVING cnt = 0
You can also use NOT IN with a sub-query:
SELECT * FROM Client
WHERE id NOT IN (
SELECT client_id
FROM Invoice
WHERE invoice_date > '2010-01-01'
);
Try this:
SELECT Client.* FROM Client LEFT JOIN Invoice ON
Client.id = Invoice.client_id AND Invoice.invoice_date > '2010-01-01'
WHERE Invoice.client_id IS NULL;
Basically, the join only considers the invoices dated after '2010-01-01'. Clients without such an invoice get NULLs for the Invoice columns, and the WHERE clause keeps exactly those clients.
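To see how the variants differ, here is a small sketch on made-up data. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL; the three sample clients and the ids helper are assumptions for illustration only. It also shows why the COUNT attempt from the question fails: COUNT counts every non-NULL value, and a false comparison evaluates to 0, not NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Client  (id INTEGER PRIMARY KEY);
    CREATE TABLE Invoice (id INTEGER PRIMARY KEY, client_id INT, invoice_date TEXT);
    INSERT INTO Client VALUES (1), (2), (3);
    -- client 1: has a recent invoice; client 2: only an old one; client 3: none
    INSERT INTO Invoice VALUES (1, 1, '2011-05-01'), (2, 1, '2009-03-01'),
                               (3, 2, '2009-07-15');
""")

def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

not_exists = ids("""
    SELECT c.id FROM Client c
    WHERE NOT EXISTS (SELECT 1 FROM Invoice i
                      WHERE i.client_id = c.id AND i.invoice_date > '2010-01-01')
""")
left_join = ids("""
    SELECT c.id FROM Client c
    LEFT JOIN Invoice i ON i.client_id = c.id AND i.invoice_date > '2010-01-01'
    WHERE i.client_id IS NULL
""")
# Aggregating over Invoice alone can never return a client without invoices.
invoice_only = ids("""
    SELECT client_id FROM Invoice
    GROUP BY client_id
    HAVING SUM(CASE WHEN invoice_date > '2010-01-01' THEN 1 ELSE 0 END) = 0
""")
# The attempt from the question: old invoices yield 0, which COUNT still counts.
count_attempt = ids("""
    SELECT c.id FROM Client c
    LEFT JOIN Invoice i ON i.client_id = c.id
    GROUP BY c.id
    HAVING COUNT(i.invoice_date > '2010-01-01') = 0
""")
print(not_exists, left_join, invoice_only, count_attempt)
```

On this data NOT EXISTS and the LEFT JOIN ... IS NULL form agree, the Invoice-only aggregate misses the client with no invoices, and the original COUNT attempt only finds that client.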

How to count rows with GROUP BY and add this to a subquery?

I need to count rows grouped by hour and add this as a subquery in a SELECT list, but I get an error on this line: AND DATE(created_at) = T.day_start AND user_id = T.user_id
Here is my query:
SELECT
COUNT(*)
FROM
(
SELECT
HOUR (call_start_at) AS hours,
count(*) AS calls
FROM
calls
WHERE
1
AND user_id = 8
AND call_start_at >= '2016-01-06 00:00:00'
AND call_start_at <= '2016-01-06 23:59:59'
GROUP BY
HOUR (call_start_at)
) AS T1
I tried to add this as a subquery in the SELECT list, but it fails on the marked line that references T.day_start and T.user_id.
Here is my test:
SELECT
T2.name,
T2.calls,
ROUND(calling_time * 100 / working_time, 2) AS percent,
T2.calling_time,
T2.working_time
FROM
(SELECT
T.name,
(SELECT COUNT(*) FROM calls AS C WHERE DATE(C.created_at) = T.day_start AND C.user_id = T.user_id) AS calls,
(SELECT
COUNT(*)
FROM
(SELECT
HOUR(call_start_at) as hours,
count(*) as calls
FROM
calls
WHERE 1
AND DATE(created_at) = T.day_start AND user_id = T.user_id // marked line
GROUP BY
HOUR(call_start_at)) as T3
) as row_count,
(SELECT SUM(call_length) FROM calls AS C WHERE DATE(C.created_at) = T.day_start AND C.user_id = T.user_id) AS calling_time,
SUM(T.working_time) AS working_time
FROM
(SELECT
U.username AS name,
U.id AS user_id,
DATE(UW.start) as day_start,
UW.length AS working_time
FROM
users AS U
LEFT JOIN users_worktime AS UW ON UW.user_id = U.id
WHERE 1
AND U.type = 'agent'
AND UW.start >= '2016-01-06 00:00:00'
AND UW.start <= '2016-01-06 23:59:59'
) AS T
GROUP BY
T.name, T.user_id, T.day_start
) AS T2
You can more simply write the query as:
SELECT COUNT(DISTINCT HOUR(call_start_at)) as num
FROM calls
WHERE 1 AND
user_id = 8 AND
call_start_at >= '2016-01-06 00:00:00' AND
call_start_at <= '2016-01-06 23:59:59'
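This will allow you to use the correlation clause, because the reference to T no longer sits inside a derived table nested in the subquery (before MySQL 8.0.14, a derived table cannot contain outer references).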
Note: This ignores NULL values. I assume that is not a problem (it is easily fixed if it is).
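As a rough sketch of how that looks once it is correlated to an outer row: the example below runs on SQLite through Python's sqlite3 module, not MySQL, so strftime('%H', ...) plays the role of HOUR(...). The sample users, the calls rows and the active_hours alias are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE calls (user_id INT, call_start_at TEXT);
    INSERT INTO users VALUES (8, 'anna'), (9, 'ben');
    INSERT INTO calls VALUES
        (8, '2016-01-06 09:05:00'), (8, '2016-01-06 09:40:00'),
        (8, '2016-01-06 14:10:00'), (9, '2016-01-06 11:00:00');
""")

# The subquery refers to u.id directly, one level deep, so no derived table
# has to see the outer row. COUNT(DISTINCT ...) replaces the inner GROUP BY.
rows = conn.execute("""
    SELECT u.username,
           (SELECT COUNT(DISTINCT strftime('%H', c.call_start_at))
            FROM calls c
            WHERE c.user_id = u.id
              AND DATE(c.call_start_at) = '2016-01-06') AS active_hours
    FROM users u
    ORDER BY u.id
""").fetchall()
print(rows)
```

In the full query from the question, the same shape would go where the row_count subquery is, with T.user_id and T.day_start in place of the literal values.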

MySQL SELECT subquery

I have a calendar and user_result table and I need to join these two queries.
calendar query
SELECT `week`, `date`, `time`, COUNT(*) as count
FROM `calendar`
WHERE `week` = 1
GROUP BY `date`
ORDER BY `date` DESC
and the result is
{"week":"1","date":"2014-08-21","time":"15:30:00","count":"4"}, {"week":"1","date":"2014-08-20","time":"17:30:00","count":"12"}
user_result query
SELECT `date`, SUM(`point`) as score
FROM `user_result`
WHERE `user_id` = 1
AND `date` = '2014-08-20'
and the result is just score 3
My goal is to always show calendar even if the user isn't present in the user_result table, but if he is, SUM his points for that day where calendar.date = user_result.date. Result should be:
{"week":"1","date":"2014-08-21","time":"15:30:00","count":"4","score":"3"}, {"week":"1","date":"2014-08-20","time":"17:30:00","count":"12","score":"0"}
I have tried this query below, but the result is just one row and unexpected count
SELECT c.`week`, c.`date`, c.`time`, COUNT(*) as count, SUM(p.`point`) as score
FROM `calendar` c
INNER JOIN `user_result` p ON c.`date` = p.`date`
WHERE c.`week` = 1
AND p.`user_id` = 1
GROUP BY c.`date`
ORDER BY c.`date` DESC
{"week":"1","date":"2014-08-20","time":"17:30:00","count":"4","score":"9"}
SQL Fiddle
Sorry, I have edited my answer after trying it on your SQL Fiddle. If you want to show all dates from calendar, use LEFT JOIN; if you only want the dates that exist in both calendar and result, use INNER JOIN. Note: in this case INNER JOIN returns 1 row and LEFT JOIN returns 2 rows.
SELECT c.`week`, p.user_id, c.`date`, c.`time`, COUNT(*) as count, p.score
FROM `calendar` c
LEFT JOIN
(
SELECT `date`, SUM(`point`) score, user_id
FROM `result`
group by `date`
) p ON c.`date` = p.`date`
WHERE c.`week` = 1
GROUP BY c.`date`
ORDER BY c.`date` DESC
I put a pre-aggregate query (grouped by date) as a derived table for the one person you were interested in, then did a left join to it. Also, your column names week, date and time are (IMO) poor choices, as they look very close to reserved keywords in MySQL. They are not reserved, but they could be confusing.
SELECT
c.week,
c.date,
c.time,
coalesce( OnePerson.PointEntries, 0 ) as count,
coalesce( OnePerson.totPoints, 0 ) as score
FROM
calendar c
LEFT JOIN ( select
r.week,
r.date,
COUNT(*) as PointEntries,
SUM( r.point ) as totPoints
from
result r
where
r.week = 1
AND r.user_id = 1
group by
r.week,
r.date ) OnePerson
ON c.week = OnePerson.week
AND c.date = OnePerson.date
WHERE
c.week = 1
GROUP BY
c.date
ORDER BY
c.date DESC
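The reason for aggregating before joining is that a direct join multiplies calendar rows by result rows, which inflates both COUNT(*) and the SUM. A minimal sketch of the pre-aggregate-then-LEFT-JOIN idea, on invented data and with SQLite (Python's sqlite3 module) standing in for MySQL; here count stays the number of calendar rows per date, as in the question's first query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (week INT, date TEXT, time TEXT);
    CREATE TABLE user_result (user_id INT, date TEXT, point INT);
    INSERT INTO calendar VALUES
        (1, '2014-08-21', '15:30:00'), (1, '2014-08-21', '15:30:00'),
        (1, '2014-08-20', '17:30:00'), (1, '2014-08-20', '17:30:00'),
        (1, '2014-08-20', '17:30:00');
    INSERT INTO user_result VALUES
        (1, '2014-08-20', 1), (1, '2014-08-20', 2), (2, '2014-08-20', 5);
""")

# The derived table p has at most one row per date, so joining it cannot
# multiply calendar rows; dates without a score come back as NULL -> 0.
rows = conn.execute("""
    SELECT c.week, c.date, MIN(c.time) AS time,
           COUNT(*) AS count,
           COALESCE(MAX(p.score), 0) AS score
    FROM calendar c
    LEFT JOIN (
        SELECT date, SUM(point) AS score
        FROM user_result
        WHERE user_id = 1
        GROUP BY date
    ) p ON p.date = c.date
    WHERE c.week = 1
    GROUP BY c.week, c.date
    ORDER BY c.date DESC
""").fetchall()
print(rows)
```

MIN(c.time) and MAX(p.score) are only there so that every selected column is either grouped or aggregated, which also keeps the query valid under MySQL's ONLY_FULL_GROUP_BY mode.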

MySQL - Complicated SUMs inside Query

This is going to be tough to explain.
I'm looping through my client records from tbl_customers several times a day.
SELECT c.* FROM tbl_customers c
I'm returning simply the customer's: customerid, name, phone, email
Now the weird part.
I want to append 3 more columns, after email: totalpaid, totalowed, totalbalance
BUT, Those column names don't exist anywhere.
Here is how I query each one: (as a single query)
SELECT SUM(total) AS totalpaid
FROM tbl_customers_bills
WHERE customerid = X
AND billtype = 1
SELECT SUM(total) AS totalowed
FROM tbl_customers_bills
WHERE customerid = X
AND billtype = 2
SELECT SUM(total) AS totalbalance
FROM tbl_customers_bills
WHERE customerid = X
AND billtype IN(1,2)
So, the billtype is the column that tells me whether the record is paid or not.
I am at a loss here.
How can I SUM 3 separate queries into the first query's loop?
Just join customers to bills and do the sums. To separate out totalpaid and totalowed you can use SUM(CASE ...) or SUM(IF ...), as wless1's answer demonstrates.
SELECT c.*,
SUM(CASE WHEN billtype = 1 THEN total ELSE 0 END) totalpaid ,
SUM(CASE WHEN billtype = 2 THEN total ELSE 0 END) totalowed ,
SUM(total) AS totalbalance
FROM
tbl_customers c
LEFT JOIN tbl_customers_bills b
ON c.customerid = b.customerid
and billtype in (1,2)
GROUP BY
c.customerid
Because this is MySQL, you only need to group on the PK of customer: the other columns of c are functionally dependent on it.
You could do this with a combination of GROUP BY, SUM, and IF:
SELECT c.customerid, c.name, c.phone, c.email,
SUM(IF(b.billtype = 1, b.total, 0)) AS totalpaid,
SUM(IF(b.billtype = 2, b.total, 0)) AS totalowed,
SUM(IF(b.billtype = 1 OR b.billtype = 2, b.total, 0)) AS totalbalance
FROM tbl_customers c LEFT JOIN tbl_customers_bills b ON b.customerid = c.customerid
GROUP BY c.customerid
See:
http://dev.mysql.com/doc/refman/5.0/en//group-by-functions.html
http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html
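Here is a small runnable sketch of the conditional-aggregation idea on made-up data. It uses SQLite through Python's sqlite3 module rather than MySQL, so it sticks to CASE (SQLite has no IF function) and groups by every non-aggregated column. The COALESCE wrappers are an addition of this sketch: with a LEFT JOIN, a customer without bills would otherwise get NULL from SUM(total).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_customers (customerid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tbl_customers_bills (customerid INT, billtype INT, total INT);
    INSERT INTO tbl_customers VALUES (1, 'Ann'), (2, 'Bob');
    -- billtype 3 is an invented extra type that must not be summed
    INSERT INTO tbl_customers_bills VALUES (1, 1, 100), (1, 2, 40), (1, 3, 7), (1, 1, 10);
""")

# One pass over the bills: each CASE picks the rows that belong to its column.
rows = conn.execute("""
    SELECT c.customerid, c.name,
           COALESCE(SUM(CASE WHEN b.billtype = 1 THEN b.total END), 0) AS totalpaid,
           COALESCE(SUM(CASE WHEN b.billtype = 2 THEN b.total END), 0) AS totalowed,
           COALESCE(SUM(b.total), 0) AS totalbalance
    FROM tbl_customers c
    LEFT JOIN tbl_customers_bills b
           ON b.customerid = c.customerid AND b.billtype IN (1, 2)
    GROUP BY c.customerid, c.name
    ORDER BY c.customerid
""").fetchall()
print(rows)
```

Putting the billtype IN (1, 2) condition into the ON clause instead of WHERE keeps customers without any matching bill in the result.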

Making my mysql query more efficient

What I am trying to do is pull the id, phone_type and os_version columns from the Enswitch_Mobile_Users table.
With that id I then get the enswitch_id from the Enswitch_Users table.
After that I COUNT all the entries from Enswitch_Android_Purchases or Enswitch_Iphone_Purchases whose user column matches the id from enswitch_mobile_users, and get the first and the last entry date.
I managed to make it work with this query:
SELECT p.user AS `Mobile_User_ID`,
e.os_version `Os_Version`,
e.phone_type `Phone_Type`,
eu.enswitch_id `Enswitch_ID`,
Count(1) AS `Buy_Count`,
(SELECT pc.date
FROM
(
SELECT date, user, status
FROM enswitch_android_purchases
UNION
SELECT date, user, status
FROM enswitch_iphone_purchases
) AS pc
WHERE pc.status = 1
AND pc.user = p.user
ORDER BY pc.date ASC
LIMIT 1) AS `First_Purchase`,
(SELECT pc.date
FROM
(
SELECT date, user, status
FROM enswitch_android_purchases
UNION
SELECT date, user, status
FROM enswitch_iphone_purchases
) AS pc
WHERE pc.status = 1
AND pc.user = p.user
ORDER BY pc.date DESC LIMIT 1) AS `Last_Purchase`
FROM
(
SELECT item, date, user, status
FROM enswitch_android_purchases
UNION
SELECT item, date, user, status
FROM enswitch_iphone_purchases
) AS p
LEFT JOIN enswitch_mobile_users e
ON p.user = e.id
LEFT JOIN enswitch_users eu
ON e.user_id = eu.id
WHERE p.`date` >= :from_date
AND p.`date` <= :to_date
AND p.user is not null
AND p.status = 1
GROUP BY `Mobile_User_ID`
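But because of the subqueries in the SELECT list it is really slow, so how can I make it more efficient?
You might be able to use the following, which replaces the two subqueries in the SELECT list with min(p.date) and max(p.date). Note that these are computed over the rows left after the WHERE clause, so they give the first and last purchase within the requested date range, whereas the original subqueries were not restricted by date: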
SELECT p.user AS `Mobile_User_ID`,
e.os_version `Os_Version`,
e.phone_type `Phone_Type`,
eu.enswitch_id `Enswitch_ID`,
Count(1) AS `Buy_Count`,
min(p.date) AS `First_Purchase`,
max(p.date) AS `Last_Purchase`
FROM
(
SELECT item, date, user, status
FROM enswitch_android_purchases
UNION
SELECT item, date, user, status
FROM enswitch_iphone_purchases
) AS p
LEFT JOIN enswitch_mobile_users e
ON p.user = e.id
LEFT JOIN enswitch_users eu
ON e.user_id = eu.id
WHERE p.`date` >= :from_date
AND p.`date` <= :to_date
AND p.user is not null
AND p.status = 1
GROUP BY p.user
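A reduced sketch of the MIN/MAX version, run on SQLite through Python's sqlite3 module with invented rows, and without the two user tables. One deliberate difference, which is an assumption on my part: it uses UNION ALL instead of UNION. UNION removes duplicate rows, which costs an extra deduplication step and would collapse two identical purchases into one; if that collapsing is wanted, keep UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enswitch_android_purchases (item TEXT, date TEXT, user INT, status INT);
    CREATE TABLE enswitch_iphone_purchases  (item TEXT, date TEXT, user INT, status INT);
    INSERT INTO enswitch_android_purchases VALUES
        ('a', '2015-01-10', 1, 1), ('b', '2015-03-02', 1, 1), ('c', '2015-02-01', 2, 0);
    INSERT INTO enswitch_iphone_purchases VALUES
        ('d', '2015-02-14', 1, 1), ('e', '2015-02-20', 2, 1);
""")

# A single grouped pass yields the count and both boundary dates per user,
# so no per-row subqueries over the union are needed.
rows = conn.execute("""
    SELECT p.user, COUNT(*) AS buy_count,
           MIN(p.date) AS first_purchase, MAX(p.date) AS last_purchase
    FROM (
        SELECT item, date, user, status FROM enswitch_android_purchases
        UNION ALL
        SELECT item, date, user, status FROM enswitch_iphone_purchases
    ) AS p
    WHERE p.status = 1
      AND p.date >= :from_date AND p.date <= :to_date
    GROUP BY p.user
    ORDER BY p.user
""", {"from_date": "2015-01-01", "to_date": "2015-12-31"}).fetchall()
print(rows)
```

On MySQL it may help further to move the status and date conditions into both branches of the union, so that each table can use its own indexes before the rows are combined.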