I have a list of clients. Each client can have several activities (0..*). Each activity contains a status `is_completed` which is a Boolean (True/False).
I need to retrieve the list of clients that have all activities completed:
If a client has all of its activities completed, I keep it.
If a client does not have all of its activities completed, I ignore it.
I wrote an SQL query that does the job but I am not convinced that it is optimized:
SELECT DISTINCT cc.client_id
FROM clients_clientactivity AS cc
LEFT JOIN clients_client AS c ON (c.id = cc.client_id)
WHERE c.client_type_id = 2
AND (
SELECT COUNT(cc1.id) FROM clients_clientactivity AS cc1 WHERE cc1.client_id = cc.client_id
) = (
SELECT COUNT(cc2.id) FROM clients_clientactivity AS cc2 WHERE cc2.is_completed = True AND cc2.client_id = cc.client_id
);
How can I improve it?
Thank you for your help.
You could use NOT IN with a subquery that selects the clients having at least one activity that is not completed:
SELECT DISTINCT cc.client_id
FROM clients_clientactivity AS cc
LEFT JOIN clients_client AS c ON (c.id = cc.client_id)
WHERE c.client_type_id = 2
AND cc.client_id NOT IN (
SELECT cc2.client_id
FROM clients_clientactivity AS cc2
WHERE cc2.is_completed != True
)
I would use aggregation and having:
SELECT c.id
FROM clients_clientactivity ca JOIN
clients_client c
ON c.id = ca.client_id
WHERE c.client_type_id = 2
GROUP BY c.id
HAVING COUNT(*) = SUM(ca.is_completed)
Your WHERE clause converts the LEFT JOIN to an INNER JOIN, so I removed the LEFT JOIN.
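To check the aggregation approach on a small data set, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The sample rows and the helper name `clients_with_all_completed` are made up for illustration, and `is_completed` is stored as 0/1.

```python
import sqlite3


def clients_with_all_completed(conn, client_type_id):
    """Return ids of clients of the given type whose activities are all completed."""
    sql = """
        SELECT c.id
        FROM clients_clientactivity AS ca
        JOIN clients_client AS c ON c.id = ca.client_id
        WHERE c.client_type_id = ?
        GROUP BY c.id
        HAVING COUNT(*) = SUM(ca.is_completed)
        ORDER BY c.id
    """
    return [row[0] for row in conn.execute(sql, (client_type_id,))]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients_client (id INTEGER PRIMARY KEY, client_type_id INTEGER);
    CREATE TABLE clients_clientactivity (
        id INTEGER PRIMARY KEY, client_id INTEGER, is_completed INTEGER);
    INSERT INTO clients_client VALUES (1, 2), (2, 2), (3, 2), (4, 1);
    INSERT INTO clients_clientactivity (client_id, is_completed) VALUES
        (1, 1), (1, 1),   -- client 1: every activity completed
        (2, 1), (2, 0),   -- client 2: one activity still open
        (4, 1);           -- client 4: completed, but a different client type
                          -- client 3: no activities at all
""")

result = clients_with_all_completed(conn, 2)
print(result)
```

Note that a client with no activities at all (client 3 here) is not returned, because the inner join drops it. The original query behaves the same way, since it selects from clients_clientactivity.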
Let's simplify even further:
SELECT client_id
FROM clients_clientactivity
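GROUP BY client_id
HAVING MIN(is_completed) = TRUE
(TRUE = 1 and FALSE = 0, so MIN(is_completed) is TRUE only when every activity of the client is completed. The aggregate test has to go in HAVING, after GROUP BY; an aggregate is not allowed in WHERE.)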
Subqueries are often slow. NOT IN ( SELECT ... ) is really bad (unless the optimizer has magically gotten smarter).
You did not explain how client_type_id = 2 fits in, but maybe something like:
SELECT a.client_id
FROM clients_client AS c
JOIN clients_clientactivity AS a ON (c.id = a.client_id)
WHERE c.client_type_id = 2
GROUP BY a.client_id
HAVING MIN(a.is_completed) = TRUE
If performance is a problem, then:
c needs INDEX(client_type_id, id)
a needs INDEX(client_id, is_completed)
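A rough sketch of those two indexes together with the MIN() variant of the query, with the aggregate test placed in HAVING. It uses Python's sqlite3 only as a convenient stand-in: the index names and sample rows are invented, and the query plan printed by SQLite says nothing definite about what MySQL's optimizer would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients_client (id INTEGER PRIMARY KEY, client_type_id INTEGER);
    CREATE TABLE clients_clientactivity (
        id INTEGER PRIMARY KEY, client_id INTEGER, is_completed INTEGER);

    -- The two composite indexes suggested above (index names are made up).
    CREATE INDEX idx_client_type ON clients_client (client_type_id, id);
    CREATE INDEX idx_activity_client
        ON clients_clientactivity (client_id, is_completed);

    INSERT INTO clients_client VALUES (1, 2), (2, 2), (3, 2), (4, 1);
    INSERT INTO clients_clientactivity (client_id, is_completed) VALUES
        (1, 1), (1, 1), (2, 1), (2, 0), (3, 1), (4, 1);
""")

sql = """
    SELECT a.client_id
    FROM clients_client AS c
    JOIN clients_clientactivity AS a ON c.id = a.client_id
    WHERE c.client_type_id = 2
    GROUP BY a.client_id
    HAVING MIN(a.is_completed) = 1
    ORDER BY a.client_id
"""
completed_ids = [row[0] for row in conn.execute(sql)]

# The last column of each EXPLAIN QUERY PLAN row is a human-readable description
# of one step; it shows whether SQLite decided to use the indexes.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

print(completed_ids)
for step in plan:
    print(step)
```

On the real server, running EXPLAIN on the actual query is the way to confirm whether the indexes are used.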
I'm having a slight problem: for a few reasons, I have to save the price of a product in two different tables. Is it possible to merge two columns into one? I know UNION exists, but does it work with LEFT JOINs?
Any pointers are much appreciated.
Best Regards
SELECT
si.id AS shop_item_id,
si.item_price,
s.logo_file_name,
p.cat_id AS category_id,
api.item_price AS api_price,
MAX(c.campaign_desc) AS campaignDesc,
MAX(c.campaign_type_id) AS campaignType,
MAX(c.shop_id) AS campaign_shop_id,
MAX(ct.image_name) AS campaignLogo
FROM
shop_item si
LEFT JOIN
shop s ON
s.id = si.shop_id
LEFT JOIN
product p ON
si.product_id = p.id
LEFT JOIN
campaign_category cc ON
cc.category_id = p.cat_id
LEFT JOIN
campaign c ON
c.id = cc.campaign_id AND
c.shop_id = si.shop_id AND
c.show_in_pricetable = 1 AND
NOW() BETWEEN c.date_from and c.date_to
LEFT JOIN
campaign_type ct ON
c.campaign_type_id = ct.id
LEFT JOIN
shop_api_item api ON
si.rel_feed_api = api.unique_id AND si.shop_id = api.shop_id
WHERE
si.`product_id` = 586 AND
s.`active_shop` = 1
GROUP BY
s.name,
si.id ,
si.item_price
ORDER BY
si.`item_price`,
si.`shop_id`,
c.`campaign_desc` DESC
It looks like you would benefit from the COALESCE() function.
SELECT
si.id AS shop_item_id,
COALESCE(si.item_price, api.item_price) AS coalesced_price,
...
COALESCE() takes multiple arguments, and returns the first argument that is not NULL.
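A small sketch of that behaviour, run through Python's sqlite3 module (COALESCE works the same way there). Only the columns needed for the price are modelled, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shop_item (
        id INTEGER PRIMARY KEY, shop_id INTEGER,
        rel_feed_api TEXT, item_price REAL);
    CREATE TABLE shop_api_item (
        unique_id TEXT, shop_id INTEGER, item_price REAL);

    INSERT INTO shop_item VALUES
        (1, 10, 'a', 19.5),   -- has its own price: that one is used
        (2, 10, 'b', NULL),   -- no own price: falls back to the API price
        (3, 11, 'c', NULL);   -- no price anywhere: result stays NULL
    INSERT INTO shop_api_item VALUES ('a', 10, 18.0), ('b', 10, 25.0);
""")

rows = conn.execute("""
    SELECT si.id AS shop_item_id,
           COALESCE(si.item_price, api.item_price) AS coalesced_price
    FROM shop_item si
    LEFT JOIN shop_api_item api
           ON si.rel_feed_api = api.unique_id AND si.shop_id = api.shop_id
    ORDER BY si.id
""").fetchall()

for row in rows:
    print(row)
```

The argument order sets the priority: swap the two arguments if the API price should win whenever it exists.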
I have a query that selects properties, and I need to join them to get the most recent activity on each one where activity_status = 3 (closed deal). Once I have that, I need to get the bank that closed the deal (banks.is_rewarded = 1).
The problem is that the data is spread over many tables, so when I join to get all the results and then try to limit to the MAX(activity_date), I need to group the results, and then I don't get the correct data from the other columns.
Here is a SQL Fiddle
I can do
Select * from properties
join
(SELECT deal_properties.property_id, activity.deal_id, activity.activity_date, banks.bank_id
FROM deal_properties
JOIN activity on activity.deal_id = deal_properties.deal_id
AND activity.activity_status = 3
JOIN banks ON banks.deal_id = activity.deal_id
AND banks.is_rewarded = 1) a
on a.property_id = properties.property_id;
and that will get me all the closed properties with the rewarded banks, but I can't seem to limit that by the MAX(activity_date).
Option 1
The following gives what you're looking for following your current line of thought:
SELECT LastActivities.property_id, ActivityDetails.bank_id, LastActivities.activity_date
FROM (
SELECT p.property_id, MAX(a.activity_date) AS activity_date
FROM properties p
JOIN deal_properties dp
ON dp.property_id = p.property_id
JOIN activity a
ON a.deal_id = dp.deal_id AND a.activity_status = 3
GROUP BY p.property_id
) LastActivities
JOIN(
SELECT a.activity_date, dp.property_id, b.bank_id
FROM deal_properties dp
JOIN activity a
ON a.deal_id = dp.deal_id AND a.activity_status = 3
JOIN banks b
ON b.deal_id = a.deal_id AND b.is_rewarded = 1
) ActivityDetails
ON ActivityDetails.property_id = LastActivities.property_id
AND ActivityDetails.activity_date = LastActivities.activity_date
Here is the fiddle: HERE
Option 2
Below is another way to get the same results... This should be a bit more efficient as it only has one derived table instead of two.
SELECT p.property_id, b.bank_id, a.activity_date
FROM activity a
JOIN banks b
ON b.deal_id = a.deal_id AND b.is_rewarded = 1
JOIN deal_properties dp
ON dp.deal_id = a.deal_id
JOIN properties p
ON p.property_id = dp.property_id
JOIN(SELECT p.property_id, max(a.activity_date) AS activity_date
FROM activity a
JOIN deal_properties dp
ON dp.deal_id = a.deal_id
JOIN properties p
ON p.property_id = dp.property_id
GROUP BY p.property_id
) latest
ON latest.activity_date = a.activity_date AND latest.property_id = p.property_id
WHERE a.activity_status = 3
Here is the fiddle for option 2: HERE
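Both options rely on the same pattern: compute the latest date per property in a derived table, then join back to the detail rows on property and date. Here is a minimal sketch of that pattern on invented data, using Python's sqlite3. The table layouts are guessed from the queries above, and in this sketch the activity_status = 3 filter is also applied inside the derived table, so that a later activity with another status cannot hide the closed deal; that placement is my assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (property_id INTEGER PRIMARY KEY);
    CREATE TABLE deal_properties (deal_id INTEGER, property_id INTEGER);
    CREATE TABLE activity (
        deal_id INTEGER, activity_date TEXT, activity_status INTEGER);
    CREATE TABLE banks (bank_id INTEGER, deal_id INTEGER, is_rewarded INTEGER);

    INSERT INTO properties VALUES (1), (2);
    INSERT INTO deal_properties VALUES (100, 1), (101, 1), (102, 2);
    INSERT INTO activity VALUES
        (100, '2015-01-10', 3),   -- older closed deal on property 1
        (101, '2015-03-05', 3),   -- latest closed deal on property 1
        (101, '2015-04-01', 1),   -- later activity, but not a closed deal
        (102, '2015-02-01', 3);   -- only closed deal on property 2
    INSERT INTO banks VALUES
        (7, 100, 1), (8, 101, 1), (9, 101, 0), (5, 102, 1);
""")

latest_closed = conn.execute("""
    SELECT dp.property_id, b.bank_id, a.activity_date
    FROM activity a
    JOIN deal_properties dp ON dp.deal_id = a.deal_id
    JOIN banks b ON b.deal_id = a.deal_id AND b.is_rewarded = 1
    JOIN (SELECT dp2.property_id, MAX(a2.activity_date) AS activity_date
          FROM activity a2
          JOIN deal_properties dp2 ON dp2.deal_id = a2.deal_id
          WHERE a2.activity_status = 3
          GROUP BY dp2.property_id) latest
      ON latest.property_id = dp.property_id
     AND latest.activity_date = a.activity_date
    WHERE a.activity_status = 3
    ORDER BY dp.property_id
""").fetchall()

for row in latest_closed:
    print(row)
```

If two closed deals on the same property share the same latest date, this pattern returns both rows; a tie-breaker column would be needed to pick one.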
Looking at your sample, it seems you need:
Select * from properties p
inner join
( SELECT deal_properties.property_id as property_id , max(activity.activity_date) max_date
FROM deal_properties
INNER JOIN activity on activity.deal_id = deal_properties.deal_id
AND activity.activity_status = 3
INNER JOIN banks ON banks.deal_id = activity.deal_id AND banks.is_rewarded = 1
group by property_id
) a on a.property_id = p.property_id;
SELECT *
FROM members
WHERE memberid IN (SELECT follows.followingid
FROM follows
WHERE follows.memberid = '$memberid'
AND follows.followingid NOT IN (SELECT memberid
FROM userblock))
AND memberid NOT IN (SELECT blockmemberid
FROM userblock
WHERE memberid = '$memberid')
The query above is taking nearly 4 seconds to execute in MySQL and I want to know if anyone has any suggestions on how I might improve/optimize it to achieve a faster execution time?
Replace the IN clauses with joins. I think the following captures the logic. Note that each NOT IN turns into a LEFT JOIN with a condition in the WHERE clause that keeps only the rows with no match.
SELECT m.*
FROM members m join
follows f
on m.memberid = f.followingid and
f.memberid = '$memberid' left join
userblock ubf
on f.followingid = ubf.memberid left join
userblock ub
on m.memberid = ub.blockmemberid and
ub.memberid = '$memberid'
where ub.blockmemberid is null and
ubf.memberid is null;
This looks similar, but it has fewer nested queries.
SELECT *
FROM members m
WHERE EXISTS (SELECT f.followingid
FROM follows f
WHERE f.memberid = '$memberid'
AND f.followingid = m.memberid)
AND NOT EXISTS (SELECT u.blockmemberid
FROM userblock u
WHERE (u.memberid = '$memberid'
AND u.blockmemberid = m.memberid)
OR
u.memberid = m.memberid)
This is the logic I reverse-engineered from your code without seeing the tables.
SELECT m.*
FROM members m
INNER JOIN follows f ON f.followingid = m.memberid AND
f.memberid = '$memberid'
LEFT OUTER JOIN userblock ub1 ON f.followingid = ub1.memberid
LEFT OUTER JOIN userblock ub2 ON m.memberid = ub2.blockmemberid AND
ub2.memberid = '$memberid'
WHERE ub1.memberid IS NULL AND ub2.blockmemberid IS NULL
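To convince yourself that the LEFT JOIN ... IS NULL rewrite returns the same rows as the original NOT IN query, here is a small sketch on invented data using Python's sqlite3. The table layouts are guessed from the queries. It also binds the member id as a query parameter (`:me`, a name chosen here) instead of interpolating '$memberid' into the SQL string, which is my suggestion rather than something from the original code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (memberid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE follows (memberid INTEGER, followingid INTEGER);
    CREATE TABLE userblock (memberid INTEGER, blockmemberid INTEGER);

    INSERT INTO members VALUES
        (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan'), (5, 'eve');
    INSERT INTO follows VALUES (1, 2), (1, 3), (1, 4), (1, 5);
    INSERT INTO userblock VALUES
        (1, 3),   -- member 1 has blocked member 3
        (4, 5);   -- member 4 has blocked someone, so 4 is filtered out too
""")

not_in_sql = """
    SELECT memberid
    FROM members
    WHERE memberid IN (SELECT follows.followingid
                       FROM follows
                       WHERE follows.memberid = :me
                         AND follows.followingid NOT IN (SELECT memberid
                                                         FROM userblock))
      AND memberid NOT IN (SELECT blockmemberid
                           FROM userblock
                           WHERE memberid = :me)
    ORDER BY memberid
"""

join_sql = """
    SELECT m.memberid
    FROM members m
    INNER JOIN follows f ON f.followingid = m.memberid AND f.memberid = :me
    LEFT OUTER JOIN userblock ub1 ON f.followingid = ub1.memberid
    LEFT OUTER JOIN userblock ub2 ON m.memberid = ub2.blockmemberid
                                 AND ub2.memberid = :me
    WHERE ub1.memberid IS NULL AND ub2.blockmemberid IS NULL
    ORDER BY m.memberid
"""

params = {"me": 1}
not_in_ids = [row[0] for row in conn.execute(not_in_sql, params)]
join_ids = [row[0] for row in conn.execute(join_sql, params)]

print(not_in_ids, join_ids)
```

One caveat: if follows can contain the same (memberid, followingid) pair more than once, the join version returns the member once per duplicate, so add DISTINCT in that case.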
Situation
I have a database which heavily makes use of joins due to the various situations in which each entity is used. Here is a simplified diagram:
Goal
I would like to be able to get details of all modules and the "name" fields regardless of whether the "fk_chapter_id" within user_has_module is set or not.
In the case where "user_has_module.fk_chapter_id" is null, the system can return details of the module and then null chapter.
In the case where there is a user_has_module record, I would like to get the status.
Issue
Whenever I run my SQL statements, the results are only partially returned. For example, if I have 4 module records in total and the user has an entry in "user_has_module" for two of them, I get those two records in full and then 2 null records for the other modules.
Update based on feedback, almost there
Now the only problem is that I get duplicates. Using some test data:
SELECT DISTINCT
chapter_id,
chapter_name,
module_id,
module_name,
(null ) AS user_module_progress,
(SELECT COUNT(fk_chapter_id) FROM module_has_chapter WHERE fk_module_id = m.module_id) AS chapter_count
FROM
module as m
LEFT JOIN
module_has_chapter as mhc ON m.module_id = mhc.fk_module_id
LEFT JOIN
chapter as c ON mhc.fk_chapter_id = c.chapter_id
group by m.module_id
UNION
SELECT DISTINCT
chapter_id,
chapter_name,
module_id,
module_name,
user_module_progress,
(SELECT COUNT(fk_chapter_id) FROM module_has_chapter WHERE fk_module_id = m.module_id) AS chapter_count
FROM
module as m
LEFT JOIN
user_has_module as uhm ON m.module_id = uhm.fk_module_id
LEFT JOIN
user as u ON uhm.fk_user_id = u.user_id
LEFT JOIN
chapter as c ON uhm.fk_latest_chapter_id = c.chapter_id
WHERE u.user_id = 2
group by m.module_id;
I got there in the end, but I'm not particularly happy about it. This works, but it's a bloody mess... Does anyone have a better solution, please?
SELECT DISTINCT
(null) AS chapter_id,
(null) AS chapter_name,
module_id,
module_name,
(null ) AS user_module_progress,
(SELECT COUNT(fk_chapter_id) FROM module_has_chapter WHERE fk_module_id = m.module_id) AS chapter_count
FROM
module as m
LEFT JOIN
user_has_module as uhm ON m.module_id = uhm.fk_module_id
WHERE
uhm.fk_user_id IS NULL
UNION ALL
SELECT DISTINCT
chapter_id,
chapter_name,
module_id,
module_name,
user_module_progress,
(SELECT COUNT(fk_chapter_id) FROM module_has_chapter WHERE fk_module_id = m.module_id) AS chapter_count
FROM
module as m
LEFT JOIN
user_has_module as uhm ON m.module_id = uhm.fk_module_id
INNER JOIN
user as u ON uhm.fk_user_id = u.user_id
INNER JOIN
chapter as c ON uhm.fk_latest_chapter_id = c.chapter_id
WHERE
u.user_id = 2;
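One possible simplification, offered as a sketch rather than a drop-in replacement: keep a single SELECT and move the user filter from WHERE into the ON clause of the LEFT JOIN, so that modules without a row for that user still come back with NULLs. The column placement (for example, user_module_progress living in user_has_module) is guessed from the queries above, and the data is invented. Unlike the UNION ALL version, this returns every module for the given user, including modules that only other users have started.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE module (module_id INTEGER PRIMARY KEY, module_name TEXT);
    CREATE TABLE chapter (chapter_id INTEGER PRIMARY KEY, chapter_name TEXT);
    CREATE TABLE module_has_chapter (fk_module_id INTEGER, fk_chapter_id INTEGER);
    CREATE TABLE user_has_module (
        fk_user_id INTEGER, fk_module_id INTEGER,
        fk_latest_chapter_id INTEGER, user_module_progress INTEGER);

    INSERT INTO module VALUES (1, 'Module 1'), (2, 'Module 2'), (3, 'Module 3');
    INSERT INTO chapter VALUES (10, 'Ch 10'), (11, 'Ch 11'), (12, 'Ch 12');
    INSERT INTO module_has_chapter VALUES (1, 10), (1, 11), (2, 12);
    INSERT INTO user_has_module VALUES
        (2, 1, 11, 50),    -- user 2 is halfway through module 1
        (3, 2, 12, 100);   -- another user has finished module 2
""")

modules_for_user = conn.execute("""
    SELECT c.chapter_id,
           c.chapter_name,
           m.module_id,
           m.module_name,
           uhm.user_module_progress,
           (SELECT COUNT(*)
            FROM module_has_chapter mhc
            WHERE mhc.fk_module_id = m.module_id) AS chapter_count
    FROM module AS m
    LEFT JOIN user_has_module AS uhm
           ON uhm.fk_module_id = m.module_id
          AND uhm.fk_user_id = :user_id   -- filter here, not in WHERE
    LEFT JOIN chapter AS c
           ON c.chapter_id = uhm.fk_latest_chapter_id
    ORDER BY m.module_id
""", {"user_id": 2}).fetchall()

for row in modules_for_user:
    print(row)
```

Putting the user condition in WHERE instead would discard the rows where uhm is NULL, which is what turns a LEFT JOIN back into an inner join.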
I have a query that uses SUBSTRING() as a criterion:
SELECT p.name p_name,
pa.line1 p_line1,
pa.zip p_zip,
c.name c_name,
ca.line1 c_line1,
ca.zip c_zip
FROM bank b
JOIN import_bundle ib ON ib.bank_id = b.id
JOIN generic_import gi ON gi.import_bundle_id = ib.id
JOIN account_import ai ON ai.generic_import_id = gi.id
JOIN account a ON a.account_import_id = ai.id
JOIN account_address aa ON aa.account_id = a.id
JOIN address ca ON aa.address_id = ca.id
JOIN address pa ON pa.zip = ca.zip OR (pa.zip = ca.zip AND pa.line1 = ca.line1)
JOIN prospect p ON p.address_id = pa.id
JOIN customer c ON a.customer_id = c.id
WHERE b.name = 'M'
AND ib.active = 1
AND gi.active = 1
AND SUBSTRING(p.name, 1, 12) = SUBSTRING(c.name, 1, 12)
LIMIT 100
As you can see, it's just comparing the first 12 characters of p.name and c.name. Unfortunately, adding this condition to the WHERE clause makes my query unbearably slow. Are there any tricks for doing this same comparison, or is my best bet to add another column to each table that contains the first 12 characters of the name? I hope it's not the latter, because that would be a lot of work and I'll ultimately be doing several comparisons like this.
Add the extra columns and set up an update trigger to populate them automatically. Be sure to create indexes on the new columns, of course.
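A minimal sketch of that idea, using Python's sqlite3 as a stand-in. The column name `name_prefix`, the trigger and index names, and the sample rows are all invented, and trigger syntax differs between engines (this is SQLite's, where the function is `substr`), so treat it as an outline to adapt rather than code to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, name_prefix TEXT);
    CREATE TABLE prospect (id INTEGER PRIMARY KEY, name TEXT, name_prefix TEXT);

    -- Index the precomputed prefixes so the comparison can use an index.
    CREATE INDEX idx_customer_prefix ON customer (name_prefix);
    CREATE INDEX idx_prospect_prefix ON prospect (name_prefix);

    -- Keep the prefix in sync whenever a row is inserted or its name changes.
    CREATE TRIGGER customer_prefix_ins AFTER INSERT ON customer
    BEGIN
        UPDATE customer SET name_prefix = substr(NEW.name, 1, 12)
        WHERE id = NEW.id;
    END;
    CREATE TRIGGER customer_prefix_upd AFTER UPDATE OF name ON customer
    BEGIN
        UPDATE customer SET name_prefix = substr(NEW.name, 1, 12)
        WHERE id = NEW.id;
    END;
    CREATE TRIGGER prospect_prefix_ins AFTER INSERT ON prospect
    BEGIN
        UPDATE prospect SET name_prefix = substr(NEW.name, 1, 12)
        WHERE id = NEW.id;
    END;
    CREATE TRIGGER prospect_prefix_upd AFTER UPDATE OF name ON prospect
    BEGIN
        UPDATE prospect SET name_prefix = substr(NEW.name, 1, 12)
        WHERE id = NEW.id;
    END;

    INSERT INTO customer (id, name) VALUES (1, 'Johnathan Smithson');
    INSERT INTO prospect (id, name) VALUES
        (1, 'Johnathan Smith'), (2, 'Jane Doe');
""")

# The join now compares two plain indexed columns instead of calling
# SUBSTRING() on both sides for every candidate pair.
matches = conn.execute("""
    SELECT p.name, c.name
    FROM prospect p
    JOIN customer c ON c.name_prefix = p.name_prefix
    ORDER BY p.id
""").fetchall()

# Changing a name fires the update trigger and refreshes the stored prefix.
conn.execute("UPDATE customer SET name = 'Jane Doe Enterprises' WHERE id = 1")
prefix_after_update = conn.execute(
    "SELECT name_prefix FROM customer WHERE id = 1").fetchone()[0]

print(matches)
print(prefix_after_update)
```

Existing rows need a one-off UPDATE to backfill the new column before the triggers take over.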