MySQL: how to use JOIN instead of IN with a WHERE clause

Can anyone please help me with the query below? It uses an IN clause, which leads to a performance issue. I want to use a JOIN instead, but I am not sure how to do that for such a query.
select *
from user_followings
where followed_id = 'xyz' AND owner_id IN (
select DISTINCT owner_id
from feed_events
where DTYPE = 'PLAYLIST' AND last_updated_timestamp > '20-04-2017' AND (feed_type = 'PLAYED_PLAYLIST' OR feed_type = 'STARTED_LISTENING')
order by last_updated_timestamp DESC);

A join probably is not the best approach. Use exists:
select uf.*
from user_followings uf
where uf.followed_id = 'xyz' and
exists (select 1
from feed_events fe
where uf.owner_id = fe.owner_id and
fe.DTYPE = 'PLAYLIST' and
fe.last_updated_timestamp > '2017-04-20' and
fe.feed_type in ('PLAYED_PLAYLIST', 'STARTED_LISTENING')
);
You want an index on feed_events(owner_id, dtype, last_updated_timestamp, feed_type) and user_followings(followed_id, owner_id).
Other notes:
ORDER BY in such a subquery is useless.
Use standard date formats (YYYY-MM-DD) for constant dates.
Use IN instead of a bunch of ORs. It is easier to read and optimizes better under most circumstances.

I rewrote your query using join:
SELECT *
FROM user_followings
INNER JOIN feed_events ON user_followings.owner_id = feed_events.owner_id
WHERE followed_id = 'xyz'
AND DTYPE = 'PLAYLIST'
AND feed_events.last_updated_timestamp > '2017-04-20'
AND (
feed_type = 'PLAYED_PLAYLIST'
OR feed_type = 'STARTED_LISTENING'
)
ORDER BY last_updated_timestamp DESC
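One practical difference between the two answers above is row multiplicity: EXISTS returns each user_followings row at most once, while a plain INNER JOIN returns it once per matching feed_events row. The sketch below illustrates this under loud assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table contents are invented sample data, not taken from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_followings (followed_id TEXT, owner_id TEXT);
    CREATE TABLE feed_events (owner_id TEXT, DTYPE TEXT, feed_type TEXT,
                              last_updated_timestamp TEXT);
    INSERT INTO user_followings VALUES ('xyz', 'alice'), ('xyz', 'bob');
    -- alice has two qualifying events; bob's only event is too old
    INSERT INTO feed_events VALUES
        ('alice', 'PLAYLIST', 'PLAYED_PLAYLIST',   '2017-04-21'),
        ('alice', 'PLAYLIST', 'STARTED_LISTENING', '2017-04-22'),
        ('bob',   'PLAYLIST', 'PLAYED_PLAYLIST',   '2017-04-01');
""")

# Semi-join: each following row is returned at most once.
exists_rows = conn.execute("""
    SELECT uf.owner_id
    FROM user_followings uf
    WHERE uf.followed_id = 'xyz'
      AND EXISTS (SELECT 1
                  FROM feed_events fe
                  WHERE fe.owner_id = uf.owner_id
                    AND fe.DTYPE = 'PLAYLIST'
                    AND fe.last_updated_timestamp > '2017-04-20'
                    AND fe.feed_type IN ('PLAYED_PLAYLIST', 'STARTED_LISTENING'))
""").fetchall()

# Inner join: one output row per matching feed event.
join_rows = conn.execute("""
    SELECT uf.owner_id
    FROM user_followings uf
    INNER JOIN feed_events fe ON fe.owner_id = uf.owner_id
    WHERE uf.followed_id = 'xyz'
      AND fe.DTYPE = 'PLAYLIST'
      AND fe.last_updated_timestamp > '2017-04-20'
      AND fe.feed_type IN ('PLAYED_PLAYLIST', 'STARTED_LISTENING')
""").fetchall()

print(len(exists_rows), len(join_rows))
```

If the join form is preferred anyway, selecting DISTINCT columns from user_followings only is one way to get back to one row per following.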

Related

Query is ignoring one of the WHERE conditions, any idea why it is happening?

I have two tables, activity and users. I am trying to fetch data by using multiple where clauses.
SELECT SUM(activity.step_points) AS s_points
, `activity`.`user_id`
, `users`.`id`
, `users`.`app_id`
, `users`.`country_id`
FROM `activity`
LEFT JOIN `users` ON `users`.`id` = `activity`.`user_id`
WHERE `users`.`is_active` = 1 AND
`users`.`is_test_account` = 0 AND
`users`.`app_id` = 3 AND
`users`.`country_id` = 1 AND
`users`.`phone` NOT LIKE "%000000%" OR
`users`.`phone` IS NULL AND
`users`.`is_subscribed` = 1 AND
(`users`.`email` NOT LIKE "%#mycompanyname.net" OR
`users`.`email` IS NULL) AND
YEAR(`activity`.`created_at`) = "2021" AND
MONTH(`activity`.`created_at`) = "06"
GROUP BY `activity`.`user_id`
ORDER BY `s_points` DESC LIMIT 100 OFFSET 0
But I think users.country_id = 1 is getting neglected. You can see I want only rows that belong to country id 1. But I am getting country id 2, 3 too.
Why is it happening?
You need to properly use parentheses in the WHERE clause so the OR does not dominate over the ANDs:
SELECT . . .
FROM activity a JOIN
users u
ON u.id = a.user_id
WHERE u.is_active = 1 AND
u.is_test_account = 0 AND
u.app_id = 3 AND
u.country_id = 1 AND
(u.phone NOT LIKE '%000000%' OR u.phone IS NULL) AND
u.is_subscribed = 1 AND
(u.email NOT LIKE '%#mycompanyname.net' OR u.email IS NULL) AND
a.created_at >= '2021-06-01' AND a.created_at < '2021-07-01'
Note the other changes to the code:
You are filtering on both tables, so an outer join is not appropriate. The WHERE clause turns it into an inner join anyway.
Table aliases make the code easier to write and to read.
All the backticks just make the code harder to write and read and are not needed.
SQL's standard delimiter for strings is single quotes. Use them unless you have a good reason for preferring double quotes.
For date comparisons, it can be faster to avoid wrapping the column in functions, hence the range comparison on created_at. This lets the optimizer use an index on that column.
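The root cause is operator precedence: AND binds more tightly than OR, so `a AND b OR c AND d` is evaluated as `(a AND b) OR (c AND d)`. A minimal sketch of the effect follows, assuming Python's built-in sqlite3 as a stand-in for MySQL (the precedence rule is the same in both) and a cut-down users table with invented rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is a reduced, invented version of users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, country_id INTEGER, phone TEXT, is_subscribed INTEGER);
    INSERT INTO users VALUES
        (1, 1, '5551234', 1),   -- country 1, ordinary phone
        (2, 2, NULL,      1),   -- country 2, no phone
        (3, 1, '000000',  1);   -- country 1, placeholder phone
""")

# Without parentheses this is (country AND phone-ok) OR (phone-null AND subscribed),
# so the country filter does not apply to the second branch.
unparenthesized = conn.execute("""
    SELECT id FROM users
    WHERE country_id = 1 AND phone NOT LIKE '%000000%'
       OR phone IS NULL AND is_subscribed = 1
    ORDER BY id
""").fetchall()

# With parentheses the OR is confined to the two phone conditions.
parenthesized = conn.execute("""
    SELECT id FROM users
    WHERE country_id = 1
      AND (phone NOT LIKE '%000000%' OR phone IS NULL)
      AND is_subscribed = 1
    ORDER BY id
""").fetchall()

print(unparenthesized, parenthesized)
```

The user from country 2 leaks into the first result purely because of the missing parentheses, which matches the symptom described in the question.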

Inner query or multiple queries: which would result in better performance for MySQL?

Inner query:
select up.user_id, up.id as utility_pro_id from utility_pro as up
join utility_pro_zip_code as upz ON upz.utility_pro_id = up.id and upz.zip_code_id=1
where up.available_for_survey=1 and up.user_id not in (select bjr.user_id from book_job_request as bjr where
((1583821800000 between bjr.start_time and bjr.end_time) and (1583825400000 between bjr.start_time and bjr.end_time)))
Divided in two queries:
select up.user_id, up.id as utility_pro_id from utility_pro as up
join utility_pro_zip_code as upz ON upz.utility_pro_id = up.id and upz.zip_code_id=1
Select bjr.user_id as userId from book_job_request as bjr where bjr.user_id in :userIds and (:startTime between bjr.start_time and bjr.end_time) and (:endTime between bjr.start_time and bjr.end_time)
Note:
As I understand it, a single query with the inner query will scan all the rows of book_job_request, whereas with multiple queries only the rows for the specified user ids will be checked.
Suggestions for any better option than these two are also appreciated.
I expect that the query is supposed to be more like this:
SELECT up.user_id
, up.id utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz
ON upz.utility_pro_id = up.id
LEFT
JOIN book_job_request bjr
ON bjr.user_id = up.user_id
AND bjr.end_time >= 1583821800000
AND bjr.start_time <= 1583825400000
WHERE up.available_for_survey = 1
AND upz.zip_code_id = 1
AND bjr.user_id IS NULL
For further help with optimisation (i.e. which indexes to provide), we'd need the SHOW CREATE TABLE statements for all relevant tables as well as the EXPLAIN output for the above.
Another possibility:
SELECT up.user_id , up.id utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz ON upz.utility_pro_id = up.id
WHERE up.available_for_survey = 1
AND upz.zip_code_id = 1
AND NOT EXISTS( SELECT 1 FROM book_job_request
WHERE user_id = up.user_id
AND end_time >= 1583821800000
AND start_time <= 1583825400000 )
Recommended indexes (for my NOT EXISTS and for Strawberry's LEFT JOIN):
book_job_request: (user_id, start_time, end_time)
upz: (zip_code_id, utility_pro_id)
up: (available_for_survey, user_id, id)
The column order given is important. And, no, the single-column indexes you currently have are not as good.
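Both suggested rewrites are anti-joins built on the usual interval-overlap test (`end_time >= slot_start AND start_time <= slot_end`), which is broader than the question's original condition: that one only excludes bookings that fully contain the requested slot. The sketch below compares the three forms. It is an illustration under assumptions: Python's built-in sqlite3 stands in for MySQL, the zip-code table is left out for brevity, and the rows and small integer timestamps are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data and timestamps are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE utility_pro (id INTEGER, user_id INTEGER, available_for_survey INTEGER);
    CREATE TABLE book_job_request (user_id INTEGER, start_time INTEGER, end_time INTEGER);
    INSERT INTO utility_pro VALUES (11, 1, 1), (12, 2, 1), (13, 3, 1), (14, 4, 0);
    INSERT INTO book_job_request VALUES
        (1, 150, 250),   -- partially overlaps the requested slot 100..200
        (2, 300, 400);   -- entirely after the requested slot
""")
slot_start, slot_end = 100, 200

# Anti-join written as LEFT JOIN ... IS NULL.
left_join_rows = conn.execute("""
    SELECT up.user_id
    FROM utility_pro up
    LEFT JOIN book_job_request bjr
           ON bjr.user_id = up.user_id
          AND bjr.end_time >= ?
          AND bjr.start_time <= ?
    WHERE up.available_for_survey = 1
      AND bjr.user_id IS NULL
    ORDER BY up.user_id
""", (slot_start, slot_end)).fetchall()

# The same anti-join written as NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT up.user_id
    FROM utility_pro up
    WHERE up.available_for_survey = 1
      AND NOT EXISTS (SELECT 1 FROM book_job_request bjr
                      WHERE bjr.user_id = up.user_id
                        AND bjr.end_time >= ?
                        AND bjr.start_time <= ?)
    ORDER BY up.user_id
""", (slot_start, slot_end)).fetchall()

# The question's condition: both slot endpoints must lie inside a booking.
original_rows = conn.execute("""
    SELECT up.user_id
    FROM utility_pro up
    WHERE up.available_for_survey = 1
      AND up.user_id NOT IN (SELECT bjr.user_id FROM book_job_request bjr
                             WHERE (? BETWEEN bjr.start_time AND bjr.end_time)
                               AND (? BETWEEN bjr.start_time AND bjr.end_time))
    ORDER BY up.user_id
""", (slot_start, slot_end)).fetchall()

print(left_join_rows, not_exists_rows, original_rows)
```

With this data the two anti-join forms agree, while the original condition still reports user 1 as free despite the partially overlapping booking.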

How to optimize this complicated query?

While running the following query on MySQL, it is getting locked:
SELECT event_list.*
FROM event_list
INNER JOIN members
ON members.profilenam=event_list.even_loc
WHERE (even_own IN (SELECT frd_id
FROM network
WHERE mem_id='911'
GROUP BY frd_id)
OR even_own = '911' )
AND event_list.even_active = 'y'
GROUP BY event_list.even_id
ORDER BY event_list.even_stat ASC
The inner query inside the IN constraint returns many frd_id values, and because of that the above query is slow. Please help.
Thanks.
Try this:
SELECT el.*
FROM event_list el
INNER JOIN members m ON m.profilenam = el.even_loc
WHERE el.even_active = 'y' AND
(el.even_own = 911 OR EXISTS (SELECT 1 FROM network n WHERE n.mem_id=911 AND n.frd_id = el.even_own))
GROUP BY el.even_id
ORDER BY el.even_stat ASC
You don't need the GROUP BY on the inner query; it makes the database engine do a lot of unneeded work.
If you put even_own = '911' before the select from network, then when even_own is 911 the engine may not have to run the subquery.
Also, why do you have a GROUP BY on the subquery?
Also run EXPLAIN to find out what is taking the time.
This might work better:
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
JOIN network AS n ON e.even_own = n.frd_id
WHERE n.mem_id = '911'
AND e.even_active = 'y' )
UNION DISTINCT
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
WHERE e.even_own = '911'
AND e.even_active = 'y' )
ORDER BY even_stat ASC
Since I don't know whether the JOINs are one-to-many (or what), I threw in DISTINCT to avoid dups. There may be a better way, or it may be unnecessary (in which case UNION ALL would do).
Notice how I avoid two things that are performance killers:
OR -- turned into UNION
IN (SELECT...) -- turned into JOIN.
I made aliases to cut down on the clutter. I moved the ORDER BY outside the UNION (and added parens to make it work right).
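The OR-to-UNION rewrite can be checked for equivalence on a small data set. The following is a sketch under assumptions: Python's built-in sqlite3 stands in for MySQL, the members join is omitted for brevity, and the rows are invented. SQLite does not accept parenthesized SELECTs inside a compound query, so the UNION is written without them; in both engines the final ORDER BY has to name a result column rather than a table-qualified one.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_list (even_id INTEGER, even_own INTEGER,
                             even_active TEXT, even_stat INTEGER);
    CREATE TABLE network (mem_id INTEGER, frd_id INTEGER);
    INSERT INTO event_list VALUES
        (1, 911, 'y', 3),   -- own event
        (2, 100, 'y', 1),   -- friend's event
        (3, 200, 'n', 2),   -- friend's event, inactive
        (4, 300, 'y', 0);   -- stranger's event
    -- friend 100 is listed twice on purpose, to show why DISTINCT matters
    INSERT INTO network VALUES (911, 100), (911, 100), (911, 200);
""")

or_rows = conn.execute("""
    SELECT el.even_id, el.even_stat
    FROM event_list el
    WHERE el.even_active = 'y'
      AND (el.even_own = 911
           OR EXISTS (SELECT 1 FROM network n
                      WHERE n.mem_id = 911 AND n.frd_id = el.even_own))
    ORDER BY el.even_stat
""").fetchall()

UNION_TEMPLATE = """
    SELECT e.even_id, e.even_stat
    FROM event_list e
    JOIN network n ON e.even_own = n.frd_id
    WHERE n.mem_id = 911 AND e.even_active = 'y'
    {op}
    SELECT e.even_id, e.even_stat
    FROM event_list e
    WHERE e.even_own = 911 AND e.even_active = 'y'
    ORDER BY even_stat
"""
# UNION removes the duplicate produced by the repeated network row; UNION ALL keeps it.
union_rows = conn.execute(UNION_TEMPLATE.format(op="UNION")).fetchall()
union_all_rows = conn.execute(UNION_TEMPLATE.format(op="UNION ALL")).fetchall()

print(or_rows, union_rows, len(union_all_rows))
```

Whether the UNION form is actually faster depends on the indexes and the MySQL version, so comparing EXPLAIN output for both is still worthwhile.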

JOINing Three SQL Tables?

I have a working SQL query, but I need to grab another piece of data from a third table in the query for ease of use, but have been unable to grab it.
Every table is basically tied together by tenant_id
(I apologize for the bad structure, I didn't create the DB)
TABLE: tenant_statements
tenant_id balance property date
TABLE: leases
lease_id tenant_id property unit_number
TABLE: tenants
tenant_id first_name last_name global_comment
My current query:
SELECT *
FROM tenant_statements t
INNER JOIN (
SELECT *
FROM leases
GROUP BY tenant_id
ORDER BY lease_id
)l ON t.tenant_id = l.tenant_id
WHERE t.date = '$date'
AND t.property = '$property'
ORDER BY t.balance DESC
This gives me the appropriate response for joining the two tables: leases and tenant_statements. $date and $property are set via a PHP variable loop and used for presentation.
What I am attempting to do is also grab tenants.global_comment and have it added to each result.
The ideal output would be:
tenant_statements t: t.balance, t.date
leases l: l.property, l.unit_number
tenants x: x.first_name, x.last_name, x.global_comment
All in one query.
Can anyone point me in the right direction? Thank you!
How about something like
SELECT *
FROM tenant_statements t INNER JOIN
(
SELECT *
FROM leases
GROUP BY tenant_id
ORDER BY lease_id
)l ON t.tenant_id = l.tenant_id INNER JOIN
tenants ts ON t.tenant_id = ts.tenant_id
WHERE t.date = '$date'
AND t.property = '$property'
ORDER BY t.balance DESC
Although each join specification joins only two tables, FROM clauses can contain multiple join specifications. This allows many tables to be joined for a single query.
SELECT t.tenant_id,
t.balance,
l.unit_number,
l.property,
x.first_name, x.last_name, x.global_comment
FROM tenant_statements t
INNER JOIN leases l ON l.tenant_id = t.tenant_id
INNER JOIN tenants x ON x.tenant_id = t.tenant_id
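A chained join like the one above can be tried on a toy data set. This is a minimal sketch, assuming Python's built-in sqlite3 in place of MySQL and invented rows. It binds the date and property as query parameters rather than interpolating them into the SQL string, which is a substitution on my part, not something the question does.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenant_statements (tenant_id INTEGER, balance REAL, property TEXT, date TEXT);
    CREATE TABLE leases (lease_id INTEGER, tenant_id INTEGER, property TEXT, unit_number TEXT);
    CREATE TABLE tenants (tenant_id INTEGER, first_name TEXT, last_name TEXT, global_comment TEXT);
    INSERT INTO tenant_statements VALUES (1, 500.0, 'Elm', '2020-01-01'),
                                         (2, 120.0, 'Elm', '2020-01-01');
    -- tenant 1 has two leases, tenant 2 has one
    INSERT INTO leases VALUES (10, 1, 'Elm', '1A'), (11, 1, 'Elm', '1B'), (12, 2, 'Elm', '2A');
    INSERT INTO tenants VALUES (1, 'Ann', 'Lee', 'pays late'), (2, 'Bo', 'Kim', NULL);
""")

rows = conn.execute("""
    SELECT t.balance, l.unit_number, x.first_name, x.global_comment
    FROM tenant_statements t
    INNER JOIN leases  l ON l.tenant_id = t.tenant_id
    INNER JOIN tenants x ON x.tenant_id = t.tenant_id
    WHERE t.date = ? AND t.property = ?
    ORDER BY t.balance DESC, l.lease_id
""", ("2020-01-01", "Elm")).fetchall()

for row in rows:
    print(row)
```

Note that a tenant with several leases appears once per lease in a plain join, which is presumably why the original query groups leases by tenant_id first.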

MySQL Update query with left join and group by

I am trying to create an update query and making little progress in getting the right syntax.
The following query is working:
SELECT t.Index1, t.Index2, COUNT( m.EventType )
FROM Table t
LEFT JOIN MEvents m ON
(m.Index1 = t.Index1 AND
m.Index2 = t.Index2 AND
(m.EventType = 'A' OR m.EventType = 'B')
)
WHERE (t.SpecialEventCount IS NULL)
GROUP BY t.Index1, t.Index2
It creates a list of triplets Index1,Index2,EventCounts.
It only does this for case where t.SpecialEventCount is NULL. The update query I am trying to write should set this SpecialEventCount to that count, i.e. COUNT(m.EventType) in the query above. This number could be 0 or any positive number (hence the left join). Index1 and Index2 together are unique in Table t and they are used to identify events in MEvent.
How do I have to modify the select query to become an update query? I.e. something like
UPDATE Table SET SpecialEventCount=COUNT(m.EventType).....
but I am confused what to put where and have failed with numerous different guesses.
I take it that (Index1, Index2) is a unique key on Table, otherwise I would expect the reference to t.SpecialEventCount to result in an error.
Edited query to use subquery as it didn't work using GROUP BY
UPDATE
Table AS t
LEFT JOIN (
SELECT
Index1,
Index2,
COUNT(EventType) AS NumEvents
FROM
MEvents
WHERE
EventType = 'A' OR EventType = 'B'
GROUP BY
Index1,
Index2
) AS m ON
m.Index1 = t.Index1 AND
m.Index2 = t.Index2
SET
t.SpecialEventCount = COALESCE(m.NumEvents, 0)
WHERE
t.SpecialEventCount IS NULL
A left join against a subquery can make MySQL materialize a large temporary table that has no useful indexes. For updates, try avoiding joins and using correlated subqueries instead:
UPDATE
Table AS t
SET
t.SpecialEventCount = (
SELECT COUNT(m.EventType)
FROM MEvents m
WHERE m.EventType in ('A','B')
AND m.Index1 = t.Index1
AND m.Index2 = t.Index2
)
WHERE
t.SpecialEventCount IS NULL
Do some profiling, but this can be significantly faster in some cases.
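The correlated-subquery form also handles the "no matching events" case naturally, because COUNT over an empty set is 0 rather than NULL. Below is a small sketch of that behaviour, with assumptions stated plainly: Python's built-in sqlite3 stands in for MySQL, the table is called Main here because Table is a reserved word, and the rows are invented. The UPDATE is written without a table alias to stay portable across SQLite versions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "Main" is a hypothetical name for the question's Table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Main (Index1 INTEGER, Index2 INTEGER, SpecialEventCount INTEGER);
    CREATE TABLE MEvents (Index1 INTEGER, Index2 INTEGER, EventType TEXT);
    INSERT INTO Main VALUES (1, 1, NULL), (1, 2, NULL), (2, 1, 7);
    INSERT INTO MEvents VALUES (1, 1, 'A'), (1, 1, 'B'), (1, 1, 'C'), (2, 1, 'A');
""")

# For every row still lacking a count, count its type A/B events.
conn.execute("""
    UPDATE Main
    SET SpecialEventCount = (
        SELECT COUNT(m.EventType)
        FROM MEvents m
        WHERE m.EventType IN ('A', 'B')
          AND m.Index1 = Main.Index1
          AND m.Index2 = Main.Index2
    )
    WHERE SpecialEventCount IS NULL
""")

updated = conn.execute(
    "SELECT Index1, Index2, SpecialEventCount FROM Main ORDER BY Index1, Index2"
).fetchall()
print(updated)
```

The row (1, 2) has no events and ends up with 0, and the row (2, 1) keeps its existing value because the WHERE clause skips it.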
My example:
update card_crowd as cardCrowd
LEFT JOIN
(
select cc.id , count(ccr.crowd_id) as num -- count(1) would report 1 for crowds with no rows in card_crowd_r
from card_crowd cc LEFT JOIN
card_crowd_r ccr on cc.id = ccr.crowd_id
group by cc.id
) as tt
on cardCrowd.id = tt.id
set cardCrowd.join_num = tt.num;