Optimizing primary id relation table

The tag_relation table has only two fields, tag_id and comment_id, and both are indexed (there is no primary key). The table uses the InnoDB engine.
The following query takes a long time to execute. How can I make it faster?
The comment_id, tag_id, status and datetime fields are all indexed. I really have no idea how to optimize it further.
SELECT
text
FROM comment
INNER JOIN tag_relation
ON tag_relation.comment_id=comment.comment_id
WHERE tag_id='1022278'
AND status=1
ORDER BY comment.datetime DESC LIMIT 0,20
The main cause of the slowness is the tag_relation table, which has 1.5 million records. When it had fewer records, the query ran faster.
Query plan:

This is your query:
SELECT c.text
FROM comment c INNER JOIN
tag_relation tr
ON tr.comment_id = c.comment_id
WHERE tr.tag_id = 1022278 AND c.status = 1
ORDER BY c.datetime DESC
LIMIT 0, 20;
First, notice that I removed the single quotes from the value 1022278. If this is really a number, the single quotes can sometimes confuse SQL optimizers. There are two ways to go about optimizing this query, depending on the selectivity of the various conditions. The first is to have the indexes:
tag_relation(tag_id, comment_id)
comment(comment_id, status, datetime, text)
In this pair, the comment index is a covering index for the query, and its most important part is the leading comment_id column.
The second pair is:
comment(status, comment_id, datetime)
tag_relation(comment_id, tag_id)
The basic issue is which table is scanned first for the join. With the second pair of indexes, the query would be processed as if it were written:
SELECT c.text
FROM comment c
WHERE c.status = 1 AND
EXISTS (SELECT 1
FROM tag_relation tr
WHERE tr.comment_id = c.comment_id AND tr.tag_id = 1022278
)
ORDER BY c.datetime DESC
LIMIT 0, 20;
I'm not 100% sure that this avoids the file sort on the result set, but it might work.

If I understand correctly, you have one index on tag_id and another on comment_id. Try creating a composite index like:
create index ... on tag_relation(tag_id, comment_id)
This will make the index with tag_id redundant so it can be dropped.
MySQL can sometimes combine two single-column indexes with an index merge, but even when it does, a composite index is more efficient.
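A minimal sketch of that change, assuming the existing single-column index is simply named tag_id (check the real name with SHOW INDEX first). The name ix_tag_comment is made up for this example.

```sql
-- List the existing indexes to find the name of the single-column tag_id index
SHOW INDEX FROM tag_relation;

-- Composite index covering both the filter (tag_id) and the join column (comment_id);
-- ix_tag_comment is a hypothetical name
CREATE INDEX ix_tag_comment ON tag_relation (tag_id, comment_id);

-- Drop the now-redundant single-column index;
-- replace tag_id with the Key_name reported by SHOW INDEX
DROP INDEX tag_id ON tag_relation;
```

Re-run EXPLAIN on the original query afterwards to confirm that the new index is actually picked.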

I think the problem is the "status" field. Although it is indexed, the index is not being used: EXPLAIN shows "Using where" for that table. You can force the use of the status index, but I'm not sure it will help; that depends on its selectivity, i.e., how many different values "status" can take. Alternatively, the documentation says that if "status" allows NULL then you'll see "Using where". Does it allow NULLs? If so, consider declaring it NOT NULL.
I just noticed that I overlooked the ORDER BY: comment.datetime will need an index.
If you already have one, then try a subquery:
SELECT text
FROM tag_relation
INNER JOIN (SELECT c.comment_id, c.text, c.datetime
FROM comment c
WHERE c.status = 1) comment
ON tag_relation.comment_id = comment.comment_id
WHERE tag_id='1022278'
ORDER BY comment.datetime DESC LIMIT 0,20

Related

About mysql query with inner Join content

I'm a beginner in PHP and I want to ask whether the query and table schema I have set up are right for performance. Note: if you want me to take a different approach, please provide a sample. Thanks.
$digerilanlar = DB::get('
SELECT Count(siparisler.hid) AS siparissayisi,
siparisler.hid, ilanlar.id, ilanlar.seflink, ilanlar.kategori, ilanlar.baslik,
ilanlar.yayin, ilanlar.tutar, ilanlar.sure, ilanlar.onecikan, ilanlar.guncellemetarihi,
uyeler.nick, uyeler.foto, uyeler.online, uyeler.ban FROM ilanlar
inner join uyeler ON uyeler.id=ilanlar.ilansahibi
LEFT JOIN siparisler ON ilanlar.id = siparisler.hid
WHERE ilanlar.kategori= '.$kat->id.' and ilanlar.yayin=1 and uyeler.ban=0
GROUP BY ilanlar.id
ORDER BY guncellemetarihi DESC
LIMIT 0,12');
DATABASE DESIGN
Table engine: MyISAM, MySQL version 5.7.14
TABLE: ilanlar
ilansahibi (int) = index
kategori (int) = index
yayin (int) = index
TABLE: uyeler
ban (int) = index
TABLE: siparisler
hid (int) = index

This will probably require two temp tables and two sorts:
GROUP BY ilanlar.id
ORDER BY guncellemetarihi DESC
Assuming that guncellemetarihi is an update date, the following is not identical but probably gives you what you want, with only one temp table and one sort:
GROUP BY guncellemetarihi, id
ORDER BY guncellemetarihi DESC, id DESC
COUNT(x) checks x for being NOT NULL. If that is not necessary, simply do COUNT(*).
SELECT COUNT(hid), hid
does not make sense. The COUNT implies that there may be multiple hid values, but selecting hid implies that there is only one. (Since I don't understand the objective, I cannot advise which direction to change things.)
This composite INDEX may help:
ilanlar: INDEX(kategori, yayin, ilansahibi, id)
You should switch from ENGINE=MyISAM to ENGINE=InnoDB.
More on making indexes: Index Cookbook
To discuss further, please provide SHOW CREATE TABLE and EXPLAIN SELECT ...
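The suggestions above could be applied roughly as follows. This is only a sketch: the index name ix_kat_yayin_sahip is made up, and it assumes the three tables are named exactly as in the question.

```sql
-- Composite index suggested above; ix_kat_yayin_sahip is a hypothetical name
ALTER TABLE ilanlar ADD INDEX ix_kat_yayin_sahip (kategori, yayin, ilansahibi, id);

-- Convert the tables from MyISAM to InnoDB (each statement rebuilds the table)
ALTER TABLE ilanlar ENGINE=InnoDB;
ALTER TABLE uyeler ENGINE=InnoDB;
ALTER TABLE siparisler ENGINE=InnoDB;

-- Output worth posting when asking for further help
SHOW CREATE TABLE ilanlar;
SHOW CREATE TABLE uyeler;
SHOW CREATE TABLE siparisler;
```

The engine conversion copies each table, so on large tables it is safer to try it on a copy of the data first.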

I have 17,949,366 rows of data in MySQL Workbench and am trying to write a query to find duplicate data

SELECT id, survey_id
From Table1
Where survey_id IN(
select survey_id
from Table1
Group By survey_id
having count(id)>1
)
This is my query, but the data set is big and I guess it is still fetching in MySQL Workbench. Any idea how I can make this process faster?

Sometimes EXISTS performs better because it returns as soon as it finds the row:
SELECT t.id, t.survey_id
From Table1 AS t
WHERE EXISTS(
SELECT 1 FROM Table1
WHERE id <> t.id AND survey_id = t.survey_id
)
I assume id is the primary key in the table.

You can group your data without subqueries:
SELECT survey_id, GROUP_CONCAT(id) AS ids
FROM Table1
GROUP BY survey_id
HAVING COUNT(id) > 1;

Select count(*),column from table group by column having count(column) > 1
You can simply group by directly. No need for sub query.
Try to add index for column
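Applied to the table from the question, that could look like the sketch below. The index name ix_survey_id and the alias copies are made up for the example.

```sql
-- Index on the grouped column so the GROUP BY can read rows in index order;
-- ix_survey_id is a hypothetical name
CREATE INDEX ix_survey_id ON Table1 (survey_id);

-- One row per duplicated survey_id, with the number of rows sharing it
SELECT survey_id, COUNT(*) AS copies
FROM Table1
GROUP BY survey_id
HAVING COUNT(*) > 1;
```

Note that this returns one row per duplicated survey_id rather than every id, so it answers "which values are duplicated", not "which rows".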

Use EXPLAIN to see the query execution plan.
On large sets, we will get better performance when an index can be used to satisfy a GROUP BY, rather than a "Using filesort" operation.
Personally, I'd avoid the IN (subquery) and instead use a join to a derived table. I don't know that this has any impact on performance, or in which versions of MySQL there might be a difference. Just my personal preference to write the query this way:
SELECT t.id
, t.survey_id
FROM ( -- inline view
SELECT s.survey_id
FROM Table1 s
GROUP BY s.survey_id
HAVING COUNT(s.id) > 1
) r
JOIN Table1 t
ON t.survey_id = r.survey_id
We do want an index that has survey_id as the leading column. That allows the GROUP BY to be satisfied from the index, avoiding a potentially expensive "Using filesort" operation. That same index will also be used for the join to the original table.
CREATE INDEX Table1_IX2 ON Table1 (survey_id, id, ...)
NOTE: If this is InnoDB and if id is the cluster key, then including the id column doesn't use any extra space (it does enforce some additional ordering), but more importantly it makes the index a covering index for the outer query (query can be satisfied entirely from the index, without lookups of pages in the underlying table.)
With that index defined, we'd expect the Extra column of the EXPLAIN output to show "Using index" for the outer query, and to omit "Using filesort" for the derived table (inline view).
Again, use EXPLAIN to see the query execution plan.
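For reference, the check is just the same statement prefixed with EXPLAIN. What the Extra column actually shows will depend on your MySQL version and on which indexes exist, so treat this as something to run and compare, not as a guaranteed result.

```sql
-- Show the execution plan for the derived-table version of the duplicate search
EXPLAIN
SELECT t.id
     , t.survey_id
FROM ( -- inline view: survey_id values that occur more than once
       SELECT s.survey_id
       FROM Table1 s
       GROUP BY s.survey_id
       HAVING COUNT(s.id) > 1
     ) r
JOIN Table1 t
  ON t.survey_id = r.survey_id;
```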

the query running too slow

Given is a MySQL table named "user_posts" with the following relevant fields:
user_id
user_status
influencer_status
All three fields are indexed.
My slow query is below, and I have also created a dbfiddle. The output of EXPLAIN is in the dbfiddle:
SELECT
P.user_post_id,
P.user_id_fk,P.post_type,
P.who_can_see_post,
P.post_image_id,P.post_video_id,
U.user_name, U.user_fullname,U.influencer_status
FROM user_posts P FORCE INDEX (ix_user_posts_post_id_post_type)
INNER JOIN users U FORCE INDEX (ix_status_istatus)
ON P.user_id_fk = U.user_id
WHERE
U.user_status='1' AND
U.influencer_status = '1' AND
(P.who_can_see_post IN('everyone','influencer','friends')) AND
(P.post_type IN('image','video'))
AND P.user_post_id > 30
ORDER BY
P.user_post_id DESC
LIMIT 30
The query takes extremely long, around 6-15 seconds. The database is not very busy otherwise and performs well on other queries.
I am obviously wondering why the query is so slow.
Is there a way to tell exactly what is taking mySQL so long? Or is there any change I need to make to make the query run faster?

The definition of your ix_status_istatus key is preventing it being used to optimise the WHERE clause, as it includes user_id which is not used in the WHERE clause. Redefining the index as
ALTER TABLE `users`
ADD PRIMARY KEY (`user_id`),
ADD KEY ix_status_istatus (user_status, influencer_status);
allows it to be used and should speed up your query, changing the search on users to use index instead of temporary and filesort.
Demo on dbfiddle
Update
Further analysis on dbfiddle suggests that it is also better to remove the FORCE INDEX from the P table, as it is not necessary (only the PRIMARY key is required), and to change the JOIN to a STRAIGHT_JOIN, i.e. write the JOIN as:
FROM user_posts P
STRAIGHT_JOIN users U FORCE INDEX (ix_status_istatus)
ON P.user_id_fk = U.user_id
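Put together with the rest of the original query, that would read roughly as below. This is assembled from the fragments above and has not been checked against the real data, so compare the EXPLAIN output before and after.

```sql
SELECT
  P.user_post_id, P.user_id_fk, P.post_type, P.who_can_see_post,
  P.post_image_id, P.post_video_id,
  U.user_name, U.user_fullname, U.influencer_status
FROM user_posts P
-- STRAIGHT_JOIN makes MySQL read user_posts first, then look up users
STRAIGHT_JOIN users U FORCE INDEX (ix_status_istatus)
  ON P.user_id_fk = U.user_id
WHERE U.user_status = '1'
  AND U.influencer_status = '1'
  AND P.who_can_see_post IN ('everyone', 'influencer', 'friends')
  AND P.post_type IN ('image', 'video')
  AND P.user_post_id > 30
ORDER BY P.user_post_id DESC
LIMIT 30;
```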

I think you should limit the result set of the joined table by putting the conditions inside the ON clause.
It's like filtering during the join instead of joining and then filtering.
I've checked the query plan, which shows full use of the indexes.
SELECT
P.user_post_id,
P.user_id_fk,P.post_type,
P.who_can_see_post,
P.post_image_id,
P.post_video_id,
U.user_name,
U.user_fullname,
U.influencer_status
FROM user_posts P
INNER JOIN
users U
FORCE INDEX (
users_user_status_index,
users_influencer_status_index
)
ON
U.user_id = P.user_id_fk AND
U.user_status='1' AND
U.influencer_status='1'
WHERE
P.who_can_see_post IN('everyone','influencer','friends') AND
P.post_type IN('image','video') AND
P.user_post_id > 30
ORDER BY
P.user_post_id DESC
LIMIT 30
Indexes that I created:
ALTER TABLE `users`
ADD PRIMARY KEY (`user_id`),
ADD KEY users_user_status_index(user_status),
ADD KEY users_influencer_status_index(influencer_status);
dbfiddle

I'm not sure if I have the correct indexes or if I can improve the speed of my query in MySQL?

My query has a join, and it looks like it's using two indexes which makes it more complicated. I'm not sure if I can improve on this, but I thought I'd ask.
The query produces a list of records with keywords similar to those of the record being queried.
Here's my query.
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE match_keywords.word IS NOT NULL
AND current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
The EXPLAIN
Word is varchar(40).

You can start by removing the IS NOT NULL test, which is redundant because COUNT on the field already skips NULLs. It also looks like you would want to exclude 25695 from match_keywords; otherwise 25695 itself would surely show up as the "best" match within your 11-row limit.
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
Next, consider how you would do it as a person.
You would start with a padid (25695) and retrieve all the words for that padid.
From that list of words, go back into the table again and, for each matching word,
get its padids (assumed to have no duplicates on padid + word).
Group the padids together and count them.
Order the counts and return the highest 11.
With your list of 3 separate single-column indexes, the first two steps (both involve only 2 columns) will always have to jump from the index back to the data to get the other column. Covering indexes may help here; create two composite indexes to test:
create index ix_keyword_pw on keywords(padid, word);
create index ix_keyword_wp on keywords(word, padid);
With these composite indexes in place, you can remove the single-column indexes on padid and word since they are covered by these two.
Note: You always have to temper SELECT performance against
size of indexes (the more you create the more to store)
insert/update performance (the more indexes, the longer it takes to commit since it has to update the data, then update all indexes)

Try the following. Ensure there is an index on padid and one on word. Then, STRAIGHT_JOIN with the tables in this order should make the optimizer filter on the padid of the CURRENT keyword first and then join to the others. Exclude a join to itself. Also, since the inner join to the matching keywords is on equality, checking the current keyword for NULL is enough: it can never join to a NULL value, so there is no need to test every row of the match_keywords alias for NULL.
SELECT STRAIGHT_JOIN
match_keywords.padid,
COUNT(*) AS matching_words
FROM
keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
and match_keywords.padid <> 25695
WHERE
current_program_keywords.padid = 25695
AND current_program_keywords.word IS NOT NULL
GROUP BY
match_keywords.padid
ORDER BY
matching_words DESC
LIMIT
0, 11

You should index the following fields (check which table each corresponds to):
match_keywords.padid
current_program_keywords.padid
match_keywords.word
current_program_keywords.word
Hope this helps speed things up.

MySQL performance, inner join, how to avoid Using temporary and filesort

I have a table 1 and table 2.
Table 1
PARTNUM - ID_BRAND
partnum is the primary key
id_brand is "indexed"
Table 2
ID_BRAND - BRAND_NAME
id_brand is the primary key
brand_name is "indexed"
Table 1 contains 1 million records and table 2 contains 1,000 records.
I'm trying to optimize a query using EXPLAIN and, after a lot of attempts, I have reached a dead end.
EXPLAIN
SELECT pm.partnum, pb.brand_name
FROM products_main AS pm
LEFT JOIN products_brands AS pb ON pm.id_brand=pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10
The query returns this execution plan:
ID, SELECT_TYPE, TABLE, TYPE, POSSIBLE_KEYS, KEY, KEY_LEN , REF, ROWS, EXTRA
1, SIMPLE, pm, range, PRIMARY, PRIMARY, 1, , 1000000, Using where; Using temporary; Using filesort
1, SIMPLE, pb, ref, PRIMARY, PRIMARY, 4, demo.pm.id_pbrand, 1,
The MySQL query optimizer shows a temporary + filesort in the execution plan.
How can I avoid this?
The "EVIL" is in the ORDER BY pb.brand_name ASC. Ordering by that external field seems to be the bottleneck.

First of all, I question the use of an outer join, seeing as the ORDER BY operates on the right-hand side, and the NULLs injected by the left join are likely to play havoc with it.
Regardless, the simplest approach to speeding up this query would be a covering index on pb.id_brand and pb.brand_name. This will allow the ORDER BY to be evaluated "Using index" along with the join condition. The alternative is to find some way to reduce the size of the intermediate result passed to the ORDER BY.
Still, the combination of outer-join, order-by, and limit, leaves me wondering what exactly you are querying for, and if there might not be a better way of expressing the query itself.

Try replacing the join with a subquery. MySQL's optimizer kind of sucks; subqueries often give better performance than joins.

First, try changing your index on the products_brands table. Delete the existing one on brand_name, and create a new one:
ALTER TABLE products_brands ADD INDEX newIdx (brand_name, id_brand)
Then, the table will already have an "orderedByBrandName" index with the ids you need for the join, and you can try:
EXPLAIN
SELECT pb.brand_name, pm.partnum
FROM products_brands AS pb
LEFT JOIN products_main AS pm ON pb.id_brand = pm.id_brand
LIMIT 0, 10
Note that I also changed the order of the tables in the query, so you start with the small one.

This question is somewhat outdated, but I did find it, and so will other people.
MySQL uses a temporary table if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue.
So you just need to reverse the join order using STRAIGHT_JOIN, to bypass the order chosen by the optimizer:
SELECT STRAIGHT_JOIN pm.partnum, pb.brand_name
FROM products_brands AS pb
RIGHT JOIN products_main AS pm ON pm.id_brand=pb.id_brand
ORDER BY pb.brand_name ASC
LIMIT 0, 10
Also make sure that the max_heap_table_size and tmp_table_size variables are set to a number big enough to store the results:
SET global tmp_table_size=100000000;
SET global max_heap_table_size=100000000;
-- 100 megabytes in this example. These can be set in my.cnf config file, too.
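Before raising the limits, it may be worth checking the current values and how often temporary tables spill to disk. A small sketch using standard MySQL statements; the 100000000 figure is just the example value from above.

```sql
-- Current limits for in-memory temporary tables
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Counters such as Created_tmp_tables and Created_tmp_disk_tables
SHOW GLOBAL STATUS LIKE 'Created_tmp%';

-- SET GLOBAL only changes the default for new connections;
-- to test in the current connection, set the session values instead
SET SESSION tmp_table_size = 100000000;
SET SESSION max_heap_table_size = 100000000;
```

If Created_tmp_disk_tables keeps growing while the query runs, the temporary table is probably still going to disk.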
