MySQL Query Optimization for the derived & Sub Query Combination queries - mysql

The following sort of the queries are running on the server which uses the derived table and subquery. The constraint is that the subqueries are generated from the multiple modules based on the current situation so cannot really convert it into the join combination.
Please suggest the possible solution to optimize the query
SELECT COUNT(1)
AS total
FROM member tlb_m
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and (tlb_m.hometown_list_id in
(SELECT l2.list_id
FROM ((
SELECT t12.list_id
from list_tree_idx t12
INNER JOIN list_tree_idx t11
ON t12.list_parent_id=t11.list_id
where t11.list_parent_id='205546'
) UNION ALL (
SELECT list_id
from list_tree_idx
where list_parent_id='205546'
) ) as l2
) or tlb_m.hometown_list_id = 205546
)

I would suggest to use a closure table for optimal hierarchical queries.
For example, having a closure table with columns ANCESTOR_ID, CHILD_ID and DEPTH your query will look like this
SELECT COUNT(1) AS total
FROM member AS tlb_m
LEFT JOIN hometown_closure AS c ON c.child_id = tlb_m.hometown_list_id
where tlb_m.active = 1
and tlb_m.rank > 0
and tlb_m.member_id not in (5735,134,241,1055,348,272,476,43,7,804,7548,90,229,346,40895)
and tlb_m.type = 'M'
and c.ancestor_id = 205546

Related

MINUS operator in MySQL query [duplicate]

I am trying to perform a MINUS operation in MySql.I have three tables:
one with service details
one table with states that a service is offered in
another table (based on zipcode and state) shows where this service is not offered.
I am able to get the output for those two select queries separately. But I need a combined statement that gives the output as
'SELECT query_1 - SELECT query_2'.
Service_Details Table
Service_Code(PK) Service Name
Servicing_States Table
Service_Code(FK) State Country PK(Service_Code,State,Country)
Exception Table
Service_Code(FK) Zipcode State PK(Service_Code,Zipcode,State)
MySql does not recognise MINUS and INTERSECT, these are Oracle based operations. In MySql a user can use NOT IN as MINUS (other solutions are also there, but I liked it lot).
Example:
select a.id
from table1 as a
where <condition>
AND a.id NOT IN (select b.id
from table2 as b
where <condition>);
MySQL Does not supports MINUS or EXCEPT,You can use NOT EXISTS, NULL or NOT IN.
Here's my two cents... a complex query just made it work, originally expressed with Minus and translated for MySql
With MINUS:
select distinct oi.`productOfferingId`,f.name
from t_m_prod_action_oitem_fld f
join t_m_prod_action_oitem oi
on f.fld2prod_action_oitem = oi.oid;
minus
select
distinct r.name,f.name
from t_m_prod_action_oitem_fld f
join t_m_prod_action_oitem oi
on f.fld2prod_action_oitem = oi.oid
join t_m_rfs r
on r.name = oi.productOfferingId
join t_m_attr a
on a.attr2rfs = r.oid and f.name = a.name;
With NOT EXISTS
select distinct oi.`productOfferingId`,f.name
from t_m_prod_action_oitem_fld f
join t_m_prod_action_oitem oi
on f.fld2prod_action_oitem = oi.oid
where not exists (
select
r.name,f.name
from t_m_rfs r
join t_m_attr a
on a.attr2rfs = r.oid
where r.name = oi.productOfferingId and f.name = a.name
The tables have to have the same columns, but I think you can achieve what you are looking for with EXCEPT... except that EXCEPT only works in standard SQL! Here's how to do it in MySQL:
SELECT * FROM Servicing_states ss WHERE NOT EXISTS
( SELECT * FROM Exception e WHERE ss.Service_Code = e.Service_Code);
http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
Standard SQL
SELECT * FROM Servicing_States
EXCEPT
SELECT * FROM Exception;
An anti-join pattern is the approach I typically use. That's an outer join, to return all rows from query_1, along with matching rows from query_2, and then filtering out all the rows that had a match... leaving only rows from query_1 that didn't have a match. For example:
SELECT q1.*
FROM ( query_1 ) q1
LEFT
JOIN ( query_2 ) q2
ON q2.id = q1.id
WHERE q2.id IS NULL
To emulate the MINUS set operator, we'd need the join predicate to compare all columns returned by q1 and q2, also matching NULL values.
ON q1.col1 <=> q2.col2
AND q1.col2 <=> q2.col2
AND q1.col3 <=> q2.col3
AND ...
Also, To fully emulate the MINUS operation, we'd also need to remove duplicate rows returned by q1. Adding the DISTINCT keyword would be sufficient to do that.
In case the tables are huge and are similar, one option is to save the PK to new tables. Then compare based only on the PK. In case you know that the first half is identical or so add a where clause to check only after a specific value or date .
create table _temp_old ( id int NOT NULL PRIMARY KEY )
create table _temp_new ( id int NOT NULL PRIMARY KEY )
### will take some time
insert into _temp_old ( id )
select id from _real_table_old
### will take some time
insert into _temp_new ( id )
select id from _real_table_new
### this version should be much faster
select id from _temp_old to where not exists ( select id from _temp_new tn where to.id = tn.id)
### this should be much slower
select id from _real_table_old rto where not exists ( select id from _real_table_new rtn where rto.id = rtn.id )

How efficiently check record exist more than 2 times in table using sub-query?

I have a query like this . I have compound index for CC.key1,CC.key2.
I am executing this in a big database
Select * from CC where
( (
(select count(*) from Service s
where CC.key1=s.sr2 and CC.key2=s.sr1) > 2
AND
CC.key3='new'
)
OR
(
(select count(*) from Service s
where CC.key1=s.sr2 and CC.key2=s.sr1) <= 2
)
)
limit 10000;
I tried to make it as inner join , but its getting slower . How can i optimize this query ?
The trick here is being able to articulate a query for the problem:
SELECT *
FROM CC t1
INNER JOIN
(
SELECT cc.key1, cc.key2
FROM CC cc
LEFT JOIN Service s
ON cc.key1 = s.sr2 AND
cc.key2 = s.sr1
GROUP BY cc.key1, cc.key2
HAVING COUNT(*) <= 2 OR
SUM(CASE WHEN cc.key = 'new' THEN 1 ELSE 0 END) > 2
) t2
ON t1.key1 = t2.key1 AND
t1.key2 = t2.key2
Explanation:
Your original two subqueries would only add to the count if a given record in CC, with a given key1 and key2 value, matched to a corresponding record in the Service table. The strategy behind my inner query is to use GROUP BY to count the number of times that this happens, and use this instead of your subqueries. The first count condition is your bottom subquery, and the second one is the top.
The inner query finds all key1, key2 pairs in CC corresponding to records which should be retained. And recognize that these two columns are the only criteria in your original query for determining whether a record from CC gets retained. Then, this inner query can be inner joined to CC again to get your final result set.
In terms of performance, even this answer could leave something to be desired, but it should be better than a massive correlated subquery, which is what you had.
Basically get the Columns that must not have a duplicate then join them together. Example:
select *
FROM Table_X A
WHERE exists (SELECT 1
FROM Table_X B
WHERE 1=1
and a.SHOULD_BE_UNIQUE = b.SHOULD_BE_UNIQUE
and a.SHOULD_BE_UNIQUE2 = b.SHOULD_BE_UNIQUE2
/* excluded because these columns are null or can be Duplicated*/
--and a.GENERIC_COLUMN = b.GENERIC_COLUMN
--and a.GENERIC_COLUMN2 = b.GENERIC_COLUMN2
--and a.NULL_COLUMN = b.NULL_COLUMN
--and a.NULL_COLUMN2 = b.NULL_COLUMN2
and b.rowid > a.ROWID);
Where SHOULD_BE_UNIQUE and SHOULD_BE_UNIQUE2 are columns that shouldn't be repeated and have unique columns and the GENERIC_COLUMN and NULL_COLUMNS can be ignored so just leave them out of the query.
Been using this approach when we have issues in Duplicate Records.
With the limited information you've given us, this could be a rewrite using 'simplified' logic:
SEELCT *
FROM CC NATURAL JOIN
( SELECT key1, key2, COUNT(*) AS tally
FROM Service
GROUP
BY key1, key2 ) AS t
WHERE key3 = 'new' OR tally <= 2;
Not sure whether it will perform better but might give you some ideas of what to try next?

How to speedup my update query associated with subquery

I have a query which is pretty that contains LEFT JOIN subquery. It takes 20 minutes to load completely.
Here is my query:
UPDATE orders AS o
LEFT JOIN (
SELECT obe_order_master_id, COUNT(id) AS count_files, id, added
FROM customer_instalments
GROUP BY obe_order_master_id
) AS oci ON oci.obe_order_master_id = SUBSTRING(o.order_id, 4)
SET o.final_customer_file_id = oci.id,
o.client_work_delivered = oci.added
WHERE oci.count_files = 1
Is there any way that I can make this query runs faster?
Move Where condition in Temp Table and replace WHERE with HAVING Clause, this will eliminate unnecessary rows from temp table so reduce the filtering and may help to improve performance
UPDATE orders AS o
LEFT JOIN (
SELECT obe_order_master_id, id, added
FROM customer_instalments
GROUP BY obe_order_master_id
HAVING COUNT(id) = 1
) AS oci ON oci.obe_order_master_id = SUBSTRING(o.order_id, 4)
SET o.final_customer_file_id = oci.id,
o.client_work_delivered = oci.added
I would suggest to create separate column for Order_id substring and make index on it. Then use this column in WHERE.

Query for self join in MySQL

I have a query which gets the correct result but it is taking 5.5 sec to get the output.. Is there any other way to write a query for this -
SELECT metricName, metricValue
FROM Table sm
WHERE createdtime = (
SELECT MAX(createdtime)
FROM Table b
WHERE sm.metricName = b.metricName
AND b.sinkName='xx'
)
AND sm.sinkName='xx'
In your code, the subselect has to be run for every result row of the outer query, which should be quite expensive. Instead, you could select your filter data in a separate query and join both accordingly:
SELECT `metricName`, `metricValue` FROM Table sm
INNER JOIN (SELECT max(`createdtime`) AS `maxTime, `metricName` from Table b WHERE b.sinkName='xx' GROUP BY `metricName` ) filter
ON (sm.`createdtime` = filter.`maxTime`) AND ( sm.`metricName` = filter.`metricName`)
WHERE sm.sinkName='xx'

Slow Execution of MySQL Select Query

I have the following query…
SELECT DISTINCT * FROM
vPAS_Posts_Users
WHERE (post_user_id =:id AND post_type != 4)
AND post_updated >:updated
GROUP BY post_post_id
UNION
SELECT DISTINCT vPAS_Posts_Users.* FROM PAS_Follow
JOIN vPAS_Posts_Users ON
( PAS_Follow.folw_followed_user_id = vPAS_Posts_Users.post_user_id )
WHERE (( PAS_Follow.folw_follower_user_id =:id AND PAS_Follow.folw_deleted = 0 )
OR ( post_type = 4 AND post_passed_on_by = PAS_Follow.folw_follower_user_id
AND post_user_id !=:id ))
AND post_updated >:updated
GROUP BY post_post_id ORDER BY post_posted_date DESC LIMIT :limit
Where :id = 7, :updated = 0.0 and :limit=40 for example
My issue is that the query is taking about a minute to return results. Is there anything in this query that I can do to speed up the result?
I am using RDS
********EDIT*********
I was asked to run the query with an EXPLAIN the result is below
********EDIT**********
View Definitition
CREATE ALGORITHM=UNDEFINED DEFINER=`MySQLUSer`#`%` SQL SECURITY DEFINER VIEW `vPAS_Posts_Users`
AS SELECT
`PAS_User`.`user_user_id` AS `user_user_id`,
`PAS_User`.`user_country` AS `user_country`,
`PAS_User`.`user_city` AS `user_city`,
`PAS_User`.`user_company` AS `user_company`,
`PAS_User`.`user_account_type` AS `user_account_type`,
`PAS_User`.`user_account_premium` AS `user_account_premium`,
`PAS_User`.`user_sign_up_date` AS `user_sign_up_date`,
`PAS_User`.`user_first_name` AS `user_first_name`,
`PAS_User`.`user_last_name` AS `user_last_name`,
`PAS_User`.`user_avatar_url` AS `user_avatar_url`,
`PAS_User`.`user_cover_image_url` AS `user_cover_image_url`,
`PAS_User`.`user_bio` AS `user_bio`,
`PAS_User`.`user_telephone` AS `user_telephone`,
`PAS_User`.`user_dob` AS `user_dob`,
`PAS_User`.`user_sector` AS `user_sector`,
`PAS_User`.`user_job_type` AS `user_job_type`,
`PAS_User`.`user_unique` AS `user_unique`,
`PAS_User`.`user_deleted` AS `user_deleted`,
`PAS_User`.`user_updated` AS `user_updated`,
`PAS_Post`.`post_post_id` AS `post_post_id`,
`PAS_Post`.`post_language_id` AS `post_language_id`,
`PAS_Post`.`post_type` AS `post_type`,
`PAS_Post`.`post_promoted` AS `post_promoted`,
`PAS_Post`.`post_user_id` AS `post_user_id`,
`PAS_Post`.`post_posted_date` AS `post_posted_date`,
`PAS_Post`.`post_latitude` AS `post_latitude`,
`PAS_Post`.`post_longitude` AS `post_longitude`,
`PAS_Post`.`post_location_name` AS `post_location_name`,
`PAS_Post`.`post_text` AS `post_text`,
`PAS_Post`.`post_media_url` AS `post_media_url`,
`PAS_Post`.`post_image_height` AS `post_image_height`,
`PAS_Post`.`post_link` AS `post_link`,
`PAS_Post`.`post_link_title` AS `post_link_title`,
`PAS_Post`.`post_unique` AS `post_unique`,
`PAS_Post`.`post_deleted` AS `post_deleted`,
`PAS_Post`.`post_updated` AS `post_updated`,
`PAS_Post`.`post_original_post_id` AS `post_original_post_id`,
`PAS_Post`.`post_original_type` AS `post_original_type`,
`PAS_Post`.`post_passed_on_by` AS `post_passed_on_by`,
`PAS_Post`.`post_passed_on_caption` AS `post_passed_on_caption`,
`PAS_Post`.`post_passed_on_fullname` AS `post_passed_on_fullname`,
`PAS_Post`.`post_passed_on_avatar_url` AS `post_passed_on_avatar_url`
FROM (`PAS_User` join `PAS_Post` on((`PAS_User`.`user_user_id` = `PAS_Post`.`post_user_id`)));
try this query:
SELECT *
FROM
vPAS_Posts_Users
WHERE
post_user_id =:id
AND post_type != 4
AND post_updated > :updated
UNION
SELECT u.*
FROM vPAS_Posts_Users u
JOIN PAS_Follow f ON f.folw_followed_user_id = u.post_user_id
WHERE
u.post_updated > :updated
AND ( (f.folw_follower_user_id = :id AND f.folw_deleted = 0)
OR (u.post_type = 4 AND u.post_passed_on_by = f.folw_follower_user_id AND u.post_user_id != :id)
)
ORDER BY u.post_posted_date DESC;
LIMIT :limit
Other improvements
Indices:
Be sure you have indices on the following columns:
PAS_User.user_user_id
PAS_Post.post_user_id
PAS_Post.post_type
PAS_Post.post_updated
PAS_Follow.folw_followed_user_id
PAS_Follow.folw_deleted
PAS_Post.post_passed_on_by
After that is done, please 1- check the performance again (SQL_NO_CACHE) and 2- extract another explain plan so we can adjust the query.
EXPLAIN Results
Here are the some suggestions for the query and view first of all using the UNION for the two result sets which might makes your query to work slow instead you can use the UNION ALL
Why i am referring you to use UNION ALL
Reason is both UNION ALL and UNION use temporary table for result generation.The difference in execution speed comes from the fact UNION requires internal temporary table with index (to skip duplicate rows) while UNION ALL will create table without such index.This explains the slight performance improvement when using UNION ALL.
UNION on its own will remove any duplicate records so no need to use the DISTINCT clause, try to only one GROUP BY of the whole result set by subqueries this will also minimize the execution time rather then grouping results in each subquery.
Make sure you have added the right indexes on the columns especially the columns used in the WHERE,ORDER BY, GROUP BY, the data types should be appropriate for each column with respect to the nature of data in it like post_posted_date should be datetime,date with an index also.
Here is the rough idea for the query
SELECT q.* FROM (
SELECT * FROM
vPAS_Posts_Users
WHERE (post_user_id =:id AND post_type != 4)
AND post_updated >:updated
UNION ALL
SELECT vPAS_Posts_Users.* FROM PAS_Follow
JOIN vPAS_Posts_Users ON
( PAS_Follow.folw_followed_user_id = vPAS_Posts_Users.post_user_id
AND vPAS_Posts_Users.post_updated >:updated)
WHERE (( PAS_Follow.folw_follower_user_id =:id AND PAS_Follow.folw_deleted = 0 )
OR ( post_type = 4 AND post_passed_on_by = PAS_Follow.folw_follower_user_id
AND post_user_id !=:id ))
) q
GROUP BY q.post_post_id ORDER BY q.post_posted_date DESC LIMIT :limit
References
Difference Between Union vs. Union All – Optimal Performance Comparison
Optimize Mysql Union
MySQL Performance Blog
From your explain I can see that most of your table don't have any key except for the primary one, I would suggest you to add some extra key on the columns you're going to join, for example on: PAS_Follow.folw_followed_user_id and vPAS_Posts_Users.post_user_id, just this will result in a big performance boost.
Bye,
Gnagno