Optimize the query with index - mysql

MySQL query taking 1.6 seconds for 40000 records in table
SELECT aggsm.topicdm_id AS topid,citydm.city_name
FROM AGG_MENTION AS aggsm
JOIN LOCATIONDM AS locdm ON aggsm.locationdm_id = locdm.locationdm_id
JOIN CITY AS citydm ON locdm.city_id = citydm.city_id
JOIN STATE AS statedm ON citydm.state_id = statedm.state_id
JOIN COUNTRY AS cntrydm ON statedm.country_id = cntrydm.country_id
WHERE cntrydm.country_id IN (1,2,3,4)
GROUP BY aggsm.topicdm_id,aggsm.locationdm_id
LIMIT 0,200000
I have 40,000 to 50,000 records in each of the AGG_MENTION, LOCATIONDM, and CITYDM tables, 500 records in STATEDM, and 4 records in the COUNTRY table.
When I run the above query it takes 1.6 seconds. Is there a way to optimize the query, or are there columns I could index to improve performance?
Following is the EXPLAIN output:
1 SIMPLE aggsm index agg_sm_locdm_fk_idx agg_sm_datedm_fk_idx 4 36313 Using index; Using temporary; Using filesort
1 SIMPLE locdm eq_ref PRIMARY,city_id_UNIQUE,locationdm_id_UNIQUE,loc_city_fk_idx PRIMARY 8 opinionleaders.aggsm.locationdm_id 1
1 SIMPLE citydm eq_ref PRIMARY,city_id_UNIQUE,city_state_fk_idx PRIMARY 8 opinionleaders.locdm.city_id 1
1 SIMPLE statedm eq_ref PRIMARY,state_id_UNIQUE,state_country_fk_idx PRIMARY 8 opinionleaders.citydm.state_id 1 Using where
1 SIMPLE cntrydm eq_ref PRIMARY,country_id_UNIQUE PRIMARY 8 opinionleaders.statedm.country_id 1 Using index

I would reverse the query and start with the STATE table, since that is what your criteria are based on. You are not actually using anything from the COUNTRY table except the country ID, and that column also exists in the STATE table, so you can filter on State.Country_ID and remove the COUNTRY table from the join.
Additionally, I would have the following indexes
Table Index
State (Country_ID) as that will be basis of your WHERE criteria.
City (State_ID, City_Name).
Location (City_ID)
Agg_Mention (LocationDM_ID, TopicDM_id).
By having "City_Name" as part of the index, the query doesn't have to go to the actual page data for it. Since the column is part of the index, the engine can read the value directly from the index.
Many times, including the "STRAIGHT_JOIN" keyword helps by making the optimizer join the tables in the order stated, so it doesn't try to use one of the other tables as its primary basis for querying the data. If that doesn't perform well, you can try again without it.
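The index list above could be created roughly as follows. This is a minimal sketch: the index names are hypothetical, and the table and column names are taken from the query in the question (LOCATIONDM for "Location"), so adjust them if your schema differs.

```sql
-- Hypothetical index names; columns follow the list above.
CREATE INDEX idx_state_country   ON STATE       (country_id);
CREATE INDEX idx_city_state_name ON CITY        (state_id, city_name);
CREATE INDEX idx_locdm_city      ON LOCATIONDM  (city_id);
CREATE INDEX idx_aggsm_loc_topic ON AGG_MENTION (locationdm_id, topicdm_id);
```

Running EXPLAIN again after creating them should show whether the optimizer actually picks them up.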
SELECT STRAIGHT_JOIN
aggsm.topicdm_id AS topid,
citydm.city_name
FROM
STATE AS statedm
JOIN CITY AS citydm
ON statedm.state_id = citydm.state_id
JOIN LOCATIONDM AS locdm
ON citydm.city_id = locdm.city_id
JOIN AGG_MENTION AS aggsm
ON locdm.locationdm_id = aggsm.locationdm_id
WHERE
statedm.country_id IN (1,2,3,4)
GROUP BY
aggsm.topicdm_id,
aggsm.locationdm_id
LIMIT 0,200000


Can I optimise this MySQL query?

My SQL is
SELECT authors.*, COUNT(*) FROM authors
INNER JOIN resources_authors ON authors.author_id=resources_authors.author_id
WHERE
resource_id IN
(SELECT resource_id FROM resources_authors WHERE author_id = '1313')
AND authors.author_id != '1313'
GROUP BY authors.author_id
I have indexes on all the fields in the query, but I still get a Using temporary; Using Filesort.
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY authors ALL PRIMARY NULL NULL NULL 16025 Using where; Using temporary; Using filesort
1 PRIMARY resources_authors ref author_id author_id 4 authors.author_id 3 Using where
2 DEPENDENT SUBQUERY resources_authors unique_subquery resource_id,author_id,resource_id_2 resource_id 156 func,const 1 Using index; Using where
How can I improve my query, or table structure, to speed this query up?
There's an SQL Fiddle here, if you'd like to experiment: http://sqlfiddle.com/#!2/96d57/2/0
I would approach it a different way by doing a "PreQuery". Get a list of all authors who have resources in common with the given author, but do NOT include the original author in the final list. Once those authors are determined, get their name/contact info and the total count of common resources, but not the SPECIFIC resources that were in common. That would be a slightly different query.
Now, the query. To help optimize it, I would have two indexes on resources_authors:
one on just the (author_id)
another combination on (resource_id, author_id)
which you already have.
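If you want to double-check that both indexes exist, something like the following should do. It is only a sketch: the index names in the CREATE statements are hypothetical, and the statements are only needed if SHOW INDEX does not already list equivalent indexes.

```sql
-- List the existing indexes on the linking table.
SHOW INDEX FROM resources_authors;

-- Only if they are missing (hypothetical index names):
CREATE INDEX idx_ra_author          ON resources_authors (author_id);
CREATE INDEX idx_ra_resource_author ON resources_authors (resource_id, author_id);
```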
Now to explain the inner query. Run that part on its own first and you can see that the execution plan will utilize the index. The intent here: the query starts with the resource authors but only cares about one specific author (the WHERE clause), which keeps this result set very short. That is IMMEDIATELY joined to the resource authors table again, but ONLY on the same RESOURCE and where the author IS NOT the primary one (from the WHERE clause), giving you only those OTHER authors. By adding a COUNT(), we identify how many common resources each respective author has, grouped by author, returning one entry per author. Finally, take that "PreQuery" result set (all records already prequalified above) and join it to the authors table. Get the details and the count, and done.
SELECT
A.*,
PreQuery.CommonResources
from
( SELECT
ra2.author_id,
COUNT(*) as CommonResources
FROM
resources_authors ra1
JOIN resources_authors ra2
ON ra1.resource_id = ra2.resource_id
AND NOT ra1.author_id = ra2.author_id
WHERE
ra1.author_id = 1313
GROUP BY
ra2.author_id ) PreQuery
JOIN authors A
ON PreQuery.author_id = A.author_id

How can I optimize MySQL query with multiple joins?

Any input on how I can optimize joins in a MySQL query? For example, consider the following query:
SELECT E.name, A.id1, B.id2, C.id3, D.id4, E.string_comment
FROM E
JOIN A ON E.name = A.name AND E.string_comment = A.string_comment
JOIN B ON E.name = B.name AND E.string_comment = B.string_comment
JOIN C ON E.name = C.name AND E.string_comment = C.string_comment
JOIN D ON E.name = D.name AND E.string_comment = D.string_comment
Tables A, B, C and D are temporary tables containing 1096 rows each, and table E (also a temporary table) contains 426 rows. Without any index, MySQL EXPLAIN showed all the rows being searched in all the tables. I then created a FULLTEXT index for name as name_idx and for string_comment as string_idx on all the tables A, B, C, D and E. The EXPLAIN command still gives me the same result, as shown below.
Also, please note that
name and string_comment are of type VARCHAR and idX are of type int(15)
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE A ALL name_idx,string_idx 1096
1 SIMPLE B ALL name_idx,string_idx 1096 Using where
1 SIMPLE C ALL name_idx,string_idx 1096 Using where
1 SIMPLE D ALL name_idx,string_idx 1096 Using where
1 SIMPLE E ALL name_idx,string_idx 426 Using where
Any comments on how can I tune this query?
Thanks.
For each table you should create a composite index on both columns. The syntax varies a bit, but it is something like:
CREATE INDEX comp_E_idx ON E (name, string_comment);
And repeat for all tables.
Separate indexes won't help here, because they are of little use when the engine has to combine them: it finds the name in the index really fast, but then has to iterate over the matching rows to find the comment.
You're asking if the query can be tuned. I would first ask if the data itself can be tuned, i.e. please consider putting the data from A, B, C, D and E into a single table with NULL-able columns. This will reduce the complexity of the search from O(N^5) to O(N).
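Spelled out for all five tables, the composite-index suggestion might look like the sketch below. The index names are hypothetical. Note also that FULLTEXT indexes are only used for MATCH ... AGAINST searches, not for equality joins, which is why the EXPLAIN in the question did not change. If the VARCHAR columns are very long, the index may need prefix lengths to stay within the key length limit.

```sql
-- Hypothetical index names; one regular composite index per table.
CREATE INDEX comp_A_idx ON A (name, string_comment);
CREATE INDEX comp_B_idx ON B (name, string_comment);
CREATE INDEX comp_C_idx ON C (name, string_comment);
CREATE INDEX comp_D_idx ON D (name, string_comment);
CREATE INDEX comp_E_idx ON E (name, string_comment);

-- Re-check the plan: the "type" column should no longer be ALL for every table.
EXPLAIN
SELECT E.name, A.id1, B.id2, C.id3, D.id4, E.string_comment
FROM E
JOIN A ON E.name = A.name AND E.string_comment = A.string_comment
JOIN B ON E.name = B.name AND E.string_comment = B.string_comment
JOIN C ON E.name = C.name AND E.string_comment = C.string_comment
JOIN D ON E.name = D.name AND E.string_comment = D.string_comment;
```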

mysql join optimization for big query

I have a big query like this, and I can't completely rebuild the application because of the customer:
SELECT count(AdvertLog.id) as count,AdvertLog.advert,AdvertLog.ut_fut_tstamp_dmy as day,
AdvertLog.operation,
Advert.allow_clicks,
Advert.slogan as name,
AdvertLog.log,
(User.tx_reality_credit
+-20
-(SELECT COUNT(advert_log.id) FROM advert_log WHERE ut_fut_tstamp_dmy <= day AND operation = 0 AND advert IN (168))
+(SELECT IF(ISNULL(SUM(log)),0,SUM(log)) FROM advert_log WHERE ut_fut_tstamp_dmy <= day AND operation IN (1, 2) AND advert = 40341 )) AS points
FROM `advert_log` AS AdvertLog
LEFT JOIN `tx_reality_advert` Advert ON Advert.uid = AdvertLog.advert
LEFT JOIN `fe_users` AS User ON (User.uid = Advert.user or User.uid = AdvertLog.advert)
WHERE User.uid = 40341 and AdvertLog.id>0
GROUP BY AdvertLog.ut_fut_tstamp_dmy, AdvertLog.advert
ORDER BY AdvertLog.ut_fut_tstamp_dmy_12 DESC,AdvertLog.operation,count DESC,name
LIMIT 0, 15
It takes 1.5s approximately which is too long.
Indexes:
User.uid
AdvertLog.advert
AdvertLog.operation
AdvertLog.advert
AdvertLog.ut_fut_tstamp_dmy
AdvertLog.id
Advert.user
AdvertLog.log
Output of Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY User const PRIMARY PRIMARY 4 const 1 Using temporary; Using filesort
1 PRIMARY AdvertLog range PRIMARY,advert PRIMARY 4 NULL 21427 Using where
1 PRIMARY Advert eq_ref PRIMARY PRIMARY 4 etrend.AdvertLog.advert 1 Using where
3 DEPENDENT SUBQUERY advert_log ref ut_fut_tstamp_dmy,operation,advert advert 5 const 1 Using where
2 DEPENDENT SUBQUERY advert_log index_merge ut_fut_tstamp_dmy,operation,advert advert,operation 5,2 NULL 222 Using intersect(advert,operation); Using where
Can anyone help me? I have tried different things but saw no improvement.
The query is pretty large, and I'd expect this to take a fair bit of time, but you could try adding an index on Advert.uid, if it's not present. Other than that, someone with much better SQL-foo than I will have to answer this.
First, your WHERE clause is based on a specific "User.uid", yet there is an index on advert_log by advert (the user ID). So, first, change the WHERE clause to reflect this...
WHERE
AdvertLog.advert = 40341
Then, change the "LEFT JOIN" to the user table to just a "JOIN".
Finally (without a full rewrite of the query), I would tack on the "STRAIGHT_JOIN" keyword...
select STRAIGHT_JOIN
... rest of query ...
Which tells the optimizer to perform the query in the order / relations explicitly stated.
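Put together, the three changes might look like this. It is a trimmed sketch, not the full query: the select list is cut down to the grouped columns and the count, the "points" subqueries are left out, and the ORDER BY uses the grouped day column instead of ut_fut_tstamp_dmy_12, all of which are simplifications made here for illustration.

```sql
SELECT STRAIGHT_JOIN
    COUNT(AdvertLog.id)         AS count,
    AdvertLog.advert,
    AdvertLog.ut_fut_tstamp_dmy AS day
FROM advert_log AS AdvertLog
LEFT JOIN tx_reality_advert AS Advert
       ON Advert.uid = AdvertLog.advert
-- Plain JOIN instead of LEFT JOIN to the user table.
JOIN fe_users AS User
       ON (User.uid = Advert.user OR User.uid = AdvertLog.advert)
-- Filter on the indexed advert_log column instead of User.uid.
WHERE AdvertLog.advert = 40341
  AND AdvertLog.id > 0
GROUP BY AdvertLog.ut_fut_tstamp_dmy, AdvertLog.advert
ORDER BY day DESC
LIMIT 0, 15;
```

Comparing the EXPLAIN output of this version with the original should show whether advert_log is now read through the advert index rather than a range scan on the primary key.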
Another area to optimize would be to pre-query the "points" (counts and logs based on advert and operation) once and pull the answer from that (as a subquery) instead of running through two queries... but I'd be interested to know what impact the above WHERE, JOIN and STRAIGHT_JOIN changes have.
Additionally, look at the join to the user table: it is based on EITHER AdvertLog.advert (a user ID) or tx_reality_advert.user (another user ID, which does not appear to be the same, since the join between advert_log and tx_reality_advert is based on tx_reality_advert.uid), unless that is an incorrect assumption. This could give erroneous results, as you are testing for MULTIPLE user IDs (the advert user, and whoever else is the "user" from the tx_reality_advert table), which affects which user's credit is applied to the "points" calculation.
To better understand the relationship and context, can you clarify what is IN these tables, from the advert_log to tx_reality_advert perspective, and the Advert vs UID vs User...

Slow query takes .0007s? Why is this in my slowlog?

SELECT vt.vtid, vt.tag, vt.typeid, vt.id, vt.count, tt.type, u.username, vt.date_added, tc.context, tc.contextid
FROM ( vt, tt, u )
LEFT JOIN tc ON ( vt.vtid = tc.vtid AND tc.userid = vt.userid )
WHERE vt.typeid = tt.typeid
AND vt.verified =0
AND vt.userid = u.userid
ORDER BY vt.date_added DESC
LIMIT 1
takes .0007s to complete
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE vt ref typeid,userid,verified verified 1 const 9 Using where; Using filesort
1 SIMPLE tt eq_ref PRIMARY PRIMARY 4 vt.typeid 1
1 SIMPLE tc ref vtid vtid 4 vt.vtid 3
1 SIMPLE u eq_ref PRIMARY PRIMARY 4 vt.userid 1 Using where
How can I change this to not show up in the slow query log?
Just a guess: it's possible that you have set the log-queries-not-using-indexes option. According to the documentation, it may cause queries to be written to the slow query log even if indexes are used.
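To check whether that is what is happening, you could inspect the relevant server variables. A small sketch; the SET GLOBAL statement needs sufficient privileges and only lasts until the server restarts unless the configuration file is changed as well.

```sql
-- Is the "log queries that don't use indexes" option on?
SHOW VARIABLES LIKE 'log_queries_not_using_indexes';

-- Threshold (in seconds) above which a query counts as slow.
SHOW VARIABLES LIKE 'long_query_time';

-- Turn the option off at runtime if it turns out to be the cause.
SET GLOBAL log_queries_not_using_indexes = OFF;
```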
I'm pretty sure that a1ex07 is correct.
However if you want to speed this query up slightly you can change your index on tc from being an index on vtid to being an index on (vtid, userid). Compound keys like that are much faster if you're joining on both keys, and are almost exactly as fast if you're just joining on the first field.
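Swapping the single-column index for the compound one could be done in one statement, roughly like this. The existing index name vtid is taken from the EXPLAIN output above; the new name vtid_userid is hypothetical.

```sql
-- Replace the index on (vtid) with a compound index on (vtid, userid).
ALTER TABLE tc
    DROP INDEX vtid,
    ADD INDEX vtid_userid (vtid, userid);
```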

How to optimize MySQL Views

I have some queries using views, and these run a lot slower than I would expect, given that all relevant tables are indexed (and not that large anyway).
I hope I can explain this:
My main Query looks like this (grossly simplified)
select [stuff] from orders as ord
left join calc_order_status as ors on (ors.order_id = ord.id)
calc_order_status is a view, defined thusly:
create view calc_order_status as
select ord.id AS order_id,
(sum(itm.items * itm.item_price) + ord.delivery_cost) AS total_total
from orders ord
left join order_items itm on itm.order_id = ord.id
group by ord.id
Orders (ord) contain orders, order_items contain the individual items associated with each order and their prices.
All tables are properly indexed, BUT the thing runs slowly, and when I do an EXPLAIN I get
# id select_type table type possible_keys key key_len ref rows Extra
1 1 PRIMARY ord ALL customer_id NULL NULL NULL 1002 Using temporary; Using filesort
2 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1002
3 1 PRIMARY cus eq_ref PRIMARY PRIMARY 4 db135147_2.ord.customer_id 1 Using where
4 2 DERIVED ord ALL NULL NULL NULL NULL 1002 Using temporary; Using filesort
5 2 DERIVED itm ref order_id order_id 4 db135147_2.ord.id 2
My guess is, "derived2" refers to the view. The individual items (itm) seem to work fine, indexed by order_id. The problem seems to be Line # 4, which indicates that the system doesn't use a key for the orders table (ord). But in the MAIN query, the order id is already defined:
left join calc_order_status as ors on (ors.order_id = ord.id)
and ord.id (both in the main query and within the view) refer to the primary key.
I have read somewhere that MySQL simply does not optimize views that well and might not utilize keys under some conditions even when they are available. This seems to be one of those cases.
I would appreciate any suggestions. Is there a way to force MySQL to realize "it's all simpler than you think, just use the primary key and you'll be fine"? Or are views the wrong way to go about this at all?
If it is at all possible to remove those joins, remove them. Replacing them with subqueries will speed it up a lot.
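One way to read that suggestion is to compute the total with a correlated subquery instead of joining the view. This is a sketch under that assumption, using the column names from the view definition in the question; ord.id stands in for the [stuff] columns of the real query.

```sql
SELECT ord.id,
       -- Same expression as the view: item total plus delivery cost.
       (SELECT SUM(itm.items * itm.item_price)
          FROM order_items AS itm
         WHERE itm.order_id = ord.id) + ord.delivery_cost AS total_total
FROM orders AS ord;
```

As with the view, an order without items yields NULL for total_total, because SUM over no rows is NULL. The subquery can use the order_id index on order_items for each order row.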
You could also try running something like this to see if it makes any speed difference at all.
select [stuff] from orders as ord
left join (
select ord.id AS order_id,
(sum(itm.items * itm.item_price) + ord.delivery_cost) AS total_total
from orders ord
left join order_items itm on itm.order_id = ord.id
group by ord.id
) as ors on (ors.order_id = ord.id)
An index is useful for finding a few rows in a big table, but when you query every row, an index just slows things down. So here MySQL probably expects to read the whole orders table, and therefore chooses not to use an index.
You can test whether it would be faster by forcing MySQL to use an index:
from orders as ord force index for join (yourindex)
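In context, the hint might be used like this. A sketch only: it assumes the index you want is the primary key (written as PRIMARY in an index hint), and ord.id and ors.total_total stand in for the [stuff] columns of the real query.

```sql
SELECT ord.id, ors.total_total
FROM orders AS ord FORCE INDEX FOR JOIN (PRIMARY)
LEFT JOIN calc_order_status AS ors
       ON ors.order_id = ord.id;
```

Compare the timing and the EXPLAIN output with and without the hint before keeping it.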