Covering index in MySQL for join + where queries

Suppose my query is:
SELECT adstable.adid FROM `adstable`
inner join userstable on
(adstable.adid = userstable.adid)
WHERE adstable.desktopimp > 100
and adstable.mobileimp > 100
and adstable.userbal > 0.02
and adstable.realbal> 0.02
order by adstable.imptotal asc
Which columns should I index, to ensure a 'covering index' for this query?

Can there be more than one "user" per "ad"? Can there be zero? If neither, then
SELECT a.adid
FROM `adstable` AS a
WHERE a.desktopimp > 100
and a.mobileimp > 100
and a.userbal > 0.02
and a.realbal> 0.02
AND EXISTS (
SELECT *
FROM userstable
WHERE adid = a.adid
)
order by a.imptotal asc
Whether or not that is a better formulation, have these indexes:
userstable: INDEX(adid)
adstable: INDEX(imptotal) -- in case it is better to avoid the sort
INDEX(desktopimp) -- in case it is most selective
INDEX(mobileimp)
INDEX(userbal)
INDEX(realbal)
A "covering" index would include all 6 columns of adstable that are used in the query. That is rather bulky, and of dubious use.
Please provide SHOW CREATE TABLE -- it may be instructive to see the datatypes, engine, etc.
When building an optimal index, start with any columns compared with =. If that takes care of all the WHERE, then move on to the ORDER BY columns. You don't have that case, since you are not using =.
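To make the term "covering" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module only because it runs anywhere; SQLite's planner is not MySQL's, the index name idx_ads_cover is made up, and the column types are guesses. In MySQL the corresponding sign is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE adstable (
        adid       INTEGER PRIMARY KEY,
        desktopimp INTEGER,
        mobileimp  INTEGER,
        userbal    REAL,
        realbal    REAL,
        imptotal   INTEGER
    );
    -- Hypothetical index holding every column the query touches.
    -- adid is the rowid here, so SQLite stores it in every index entry
    -- (InnoDB likewise appends the primary key to secondary indexes).
    CREATE INDEX idx_ads_cover
        ON adstable (desktopimp, mobileimp, userbal, realbal, imptotal);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT adid FROM adstable
    WHERE desktopimp > 100 AND mobileimp > 100
      AND userbal > 0.02 AND realbal > 0.02
    ORDER BY imptotal
""").fetchall()

# The last field of each plan row is the human-readable detail text.
plan_text = " | ".join(row[-1] for row in plan)
print(plan_text)
uses_covering_index = "COVERING INDEX" in plan_text
```

Only the leading range column (desktopimp) can bound the index scan; the remaining columns are there so the table itself never has to be read, and the sort on imptotal still has to happen.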

Related

mysql is scanning table despite index

I have the following mysql query that I think should be faster. The database table has 1 million records and the query takes 3.5 seconds.
set @numberofdayssinceexpiration = 1;
set @today = DATE(now());
set @start_position = (@pagenumber-1)* @pagesize;
SELECT *
FROM (SELECT ad.id,
title,
description,
startson,
expireson,
ad.appuserid UserId,
user.email UserName,
ExpiredCount.totalcount
FROM advertisement ad
LEFT JOIN (SELECT servicetypeid,
Count(*) AS TotalCount
FROM advertisement
WHERE Datediff(@today,expireson) =
@numberofdayssinceexpiration
AND sendreminderafterexpiration = 1
GROUP BY servicetypeid) AS ExpiredCount
ON ExpiredCount.servicetypeid = ad.servicetypeid
LEFT JOIN aspnetusers user
ON user.id = ad.appuserid
WHERE Datediff(@today,expireson) = @numberofdayssinceexpiration
AND sendreminderafterexpiration = 1
ORDER BY ad.id) AS expiredAds
LIMIT 20 offset 1;
Here's the execution plan:
Here are the indexes defined on the table:
I wonder what I am doing wrong.
Thanks for any help
First, I would like to point out some problems. Then I will get into your Question.
LIMIT 20 OFFSET 1 gives you 20 rows starting with the second row.
The lack of an ORDER BY in the outer query may lead to an unpredictable ordering. In particular, the Limit and Offset can pick whatever they want. New versions will actually throw away the ORDER BY in the subquery.
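DATEDIFF, being a function, makes that part of the WHERE not 'sargable'. That is, it can't use an INDEX. The usual way (which is sargable) to compare dates is to leave the column bare and move the arithmetic to the constant side. To match your "exactly 1 day ago" test, assuming expireson is of datatype DATE:
WHERE expireson = CURDATE() - INTERVAL 1 DAY
(If expireson is a DATETIME, use a half-open range instead: expireson >= CURDATE() - INTERVAL 1 DAY AND expireson < CURDATE().)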
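For the equality test in the question (DATEDIFF(@today, expireson) = n), the rewrite can be checked with plain date arithmetic. This is only a sketch in Python: datediff() is a hand-written imitation of MySQL's DATEDIFF, and the dates are made up.

```python
from datetime import date, datetime, time, timedelta

def datediff(a: date, b: date) -> int:
    """Imitate MySQL DATEDIFF(a, b): whole days between two dates."""
    return (a - b).days

today = date(2024, 3, 10)  # hypothetical stand-in for CURDATE()
n = 1                      # plays the role of @numberofdayssinceexpiration

# DATE column: the function test is the same as comparing the bare
# column with one precomputed constant.
target = today - timedelta(days=n)
equivalent = all(
    (datediff(today, today - timedelta(days=k)) == n)
    == (today - timedelta(days=k) == target)
    for k in range(-5, 6)
)

# DATETIME column: use a half-open range [lower, upper) instead.
lower = datetime.combine(target, time.min)
upper = lower + timedelta(days=1)
expired_at = datetime(2024, 3, 9, 23, 59, 59)  # made-up sample value
in_range = lower <= expired_at < upper
```

In both forms the column stands alone on one side of the comparison, which is what lets an index on expireson be used.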
Please qualify each column name. With that, I may be able to advise on optimal indexes.
Please provide SHOW CREATE TABLE so that we can see what column(s) are in each index.

MySQL View 20x slower than Select

I have a query that selects ~8000 rows. When I execute the query it takes 0.1 sec.
When I copy the query into a view and execute the view, it takes about 2 seconds. In the first row of the EXPLAIN it selects ~570K rows, and I don't know why.
I don't understand the first row, or why it shows up only in the view's EXPLAIN:
1 PRIMARY ALL NULL NULL NULL NULL
This is the query. (Yes, I know I'm not a MySQL pro and the query is not that efficient, but it works, and 0.1 sec would be OK for me.) Does anyone know why it is so slow in a view?
MariaDB 10.5.9
select
`xxxxxxx`.`auftraege`.`Zustandigkeit` AS `Zustandigkeit`,
`xxxxxxx`.`auftraege`.`cms` AS `cms`,
`xxxxxxx`.`auftraege`.`auftrag_id` AS `auftrag_id`,
`xxxxxxx`.`angebot`.`angebot_id` AS `angebot_id`,
`xxxxxxx`.`kunden`.`kunde_id` AS `kid`,
`xxxxxxx`.`angebot`.`kunde_id` AS `kunde_id`,
`xxxxxxx`.`kunden`.`firma` AS `firma`,
`xxxxxxx`.`auftraege`.`gekuendigt` AS `gekuendigt`,
`xxxxxxx`.`kunden`.`ansprechpartnerVorname` AS `ansprechpartnerVorname`,
`xxxxxxx`.`kunden`.`ansprechpartner` AS `ansprechpartner`,
`xxxxxxx`.`auftraege`.`ampstatus` AS `ampstatus`,
`xxxxxxx`.`auftraege`.`autoMahnungen` AS `autoMahnungen`,
`xxxxxxx`.`kunden`.`mail` AS `mail`,
`xxxxxxx`.`kunden`.`ansprechpartnerAnrede` AS `ansprechpartnerAnrede`,
case
`xxxxxxx`.`kunden`.`ansprechpartnerAnrede`
when
'm'
then
concat('Herr ', ifnull(`xxxxxxx`.`kunden`.`ansprechpartnerVorname`, ''), ifnull(`xxxxxxx`.`kunden`.`ansprechpartner`, ''))
else
concat('Frau ', ifnull(`xxxxxxx`.`kunden`.`ansprechpartnerVorname`, ''), ifnull(`xxxxxxx`.`kunden`.`ansprechpartner`, ''))
end
AS `ansprechpartnerfullName`,
`xxxxxxx`.`kunden`.`website` AS `website`,
`xxxxxxx`.`personal`.`name_betrieb` AS `name_betrieb`,
`xxxxxxx`.`kunden`.`prioritaet` AS `prioritaet`,
`xxxxxxx`.`auftraege`.`infoemail` AS `infoemail`,
`xxxxxxx`.`auftraege`.`keywords` AS `keywords`,
`xxxxxxx`.`auftraege`.`ftp_h` AS `ftp_h`,
`xxxxxxx`.`auftraege`.`ftp_u` AS `ftp_u`,
`xxxxxxx`.`auftraege`.`ftp_pw` AS `ftp_pw`,
`xxxxxxx`.`auftraege`.`lgi_h` AS `lgi_h`,
`xxxxxxx`.`auftraege`.`lgi_u` AS `lgi_u`,
`xxxxxxx`.`auftraege`.`lgi_pw` AS `lgi_pw`,
`xxxxxxx`.`auftraege`.`autoRemind` AS `autoRemind`,
`xxxxxxx`.`kunden`.`telefon` AS `telefon`,
`xxxxxxx`.`kunden`.`mobilfunk` AS `mobilfunk`,
`xxxxxxx`.`auftraege`.`kommentar` AS `kommentar`,
`xxxxxxx`.`auftraege`.`phase` AS `phase`,
`xxxxxxx`.`auftraege`.`datum` AS `datum`,
`xxxxxxx`.`angebot`.`typ` AS `typ`,
case
`xxxxxxx`.`auftraege`.`gekuendigt`
when
'1'
then
'Ja'
else
'Nein'
end
AS `Gekuendigt ? `,
(
select
count(`xxxxxxx`.`status`.`aenderung`)
from
`xxxxxxx`.`status`
where
`xxxxxxx`.`status`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
)
AS `aenderungen`,
`xxxxxxx`.`auftraege`.`vertragStart` AS `vertragStart`,
`xxxxxxx`.`auftraege`.`vertragEnde` AS `vertragEnde`,
case
`xxxxxxx`.`auftraege`.`zahlungsart`
when
'U'
then
'Überweisung'
when
'L'
then
'Lastschrift'
else
'Unbekannt'
end
AS `Zahlungsart`,
`xxxxxxx`.`kunden`.`yyyyy_piwik` AS `yyyyy_piwik`,
(
select
max(`xxxxxxx`.`status`.`datum`) AS `mxDTst`
from
`xxxxxxx`.`status`
where
`xxxxxxx`.`status`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
and `xxxxxxx`.`status`.`typ` = 'SEO'
)
AS `mxDTst`,
(
select
case
`xxxxxxx`.`rechnungen`.`beglichen`
when
'YES'
then
'isOk'
else
'isAffe'
end
AS `neuUwe`
from
(
`xxxxxxx`.`zahlungsplanneu`
join
`xxxxxxx`.`rechnungen`
on(`xxxxxxx`.`zahlungsplanneu`.`rechnungsnummer` = `xxxxxxx`.`rechnungen`.`rechnungsnummer`)
)
where
`xxxxxxx`.`zahlungsplanneu`.`auftrag_id` = `xxxxxxx`.`auftraege`.`auftrag_id`
and `xxxxxxx`.`rechnungen`.`beglichen` <> 'STO' limit 1
)
AS `neuer`,
(
select
group_concat(`xxxxxxx`.`kunden_keywords`.`keyword` separator ',')
from
`xxxxxxx`.`kunden_keywords`
where
`xxxxxxx`.`kunden_keywords`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`
)
AS `keyword`,
(
select
case
count(0)
when
0
then
'Cool'
else
'Uncool'
end
AS `AusfallVor`
from
`xxxxxxx`.`rechnungen`
where
`xxxxxxx`.`rechnungen`.`rechnung_tag` < current_timestamp() - interval 15 day
and `xxxxxxx`.`rechnungen`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`
and `xxxxxxx`.`rechnungen`.`beglichen` = 'NO' limit 1
)
AS `Liquidiert`
from
(
((((`xxxxxxx`.`auftraege`
join
`xxxxxxx`.`angebot`
on(`xxxxxxx`.`auftraege`.`angebot_id` = `xxxxxxx`.`angebot`.`angebot_id`))
join
`xxxxxxx`.`kunden`
on(`xxxxxxx`.`angebot`.`kunde_id` = `xxxxxxx`.`kunden`.`kunde_id`))
left join
`xxxxxxx`.`kunden_keywords`
on(`xxxxxxx`.`angebot`.`kunde_id` = `xxxxxxx`.`kunden_keywords`.`kunde_id`))
join
`xxxxxxx`.`personal`
on(`xxxxxxx`.`kunden`.`bearbeiter` = `xxxxxxx`.`personal`.`personal_id`))
left join
`xxxxxxx`.`status`
on(`xxxxxxx`.`auftraege`.`auftrag_id` = `xxxxxxx`.`status`.`auftrag_id`)
)
group by
`xxxxxxx`.`auftraege`.`auftrag_id`
order by
NULL
UPDATE 1
1. The View Itself (Duration 1.83 sec)
1.1 Create the View: This is the view I created; it contains only the query from above.
1.2 Executing the View: It takes 1.83 sec to execute the view.
1.3 Analyze the View: This is the EXPLAIN of the view.
2. The view with added where clause (Duration 1.86 sec)
2.1 Analyze the View with added where clause: @rick wanted me to add a WHERE clause to the view, if I understood him correctly. This is the EXPLAIN of the view with a WHERE clause added; it takes 1.86 sec.
3. The Query, that is the source of the view (Duration: 0.1 sec)
3.1 Execute the query directly: This is the query that is the source of the view, executed directly on the server. It takes ~0.1 - 0.2 seconds.
3.2 Analyze the direct query: And this is the EXPLAIN of the plain query.
Why is the view so much slower, when it does nothing but encapsulate the query?
Update 2
These are the indexes I have set
ALTER TABLE angebot ADD INDEX angebot_idx_angebot_id (angebot_id);
ALTER TABLE auftraege ADD INDEX auftraege_idx_auftrag_id (auftrag_id);
ALTER TABLE kunden ADD INDEX kunden_idx_kunde_id (kunde_id);
ALTER TABLE kunden_keywords ADD INDEX kunden_keywords_idx_kunde_id (kunde_id);
ALTER TABLE personal ADD INDEX personal_idx_personal_id (personal_id);
ALTER TABLE rechnungen ADD INDEX rechnungen_idx_rechnungsnummer_beglichen (rechnungsnummer,beglichen);
ALTER TABLE rechnungen ADD INDEX rechnungen_idx_beglichen_kunde_id_rechnung (beglichen,kunde_id,rechnung_tag);
ALTER TABLE status ADD INDEX status_idx_auftrag_id (auftrag_id);
ALTER TABLE status ADD INDEX status_idx_typ_auftrag_id_datum (typ,auftrag_id,datum);
ALTER TABLE zahlungsplanneu ADD INDEX zahlungsplanneu_idx_auftrag_id (auftrag_id);
Be consistent between tables. kunde_id, for example, seems to be declared differently between tables. This may be preventing some obvious optimizations. (There are 6 JOINs that say func in EXPLAIN.)
Remove the extra parentheses in JOINs. They may be preventing what the Optimizer is happy to do -- rearrange the tables in a JOIN.
Turn the query inside out. By this, I mean to do the minimum amount of work to do the main JOIN. Collect mostly id(s). Then do the dependent subqueries in an outer select. Something like:
SELECT ... ( SELECT ... ), ...
FROM ( SELECT a1.id
FROM a AS a1
JOIN b ON ..
JOIN c ON .. )
JOIN a AS a2 ON a2.id = a1.id
JOIN d ON ...
The "inside-out" kludge may eliminate the need for the GROUP BY. (Your query is too complex for me to see for sure.) If so, then I call the problem "explode-implode" -- Your query first JOINs, producing a temp table with lots of rows ("explodes"). Then it does a GROUP BY ("implodes").
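A toy illustration of "explode-implode", as a rough sketch only. It uses Python's sqlite3 as a stand-in for MariaDB, borrows the table names auftraege and status from the question, and invents a simplified keywords table keyed by auftrag_id (the real kunden_keywords hangs off kunde_id). The data is made up: 2 orders, each with 3 status rows and 4 keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auftraege (auftrag_id INTEGER PRIMARY KEY);
    CREATE TABLE status (auftrag_id INTEGER, aenderung TEXT);
    CREATE TABLE keywords (auftrag_id INTEGER, keyword TEXT);
""")
conn.executemany("INSERT INTO auftraege VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO status VALUES (?, ?)",
                 [(a, f"s{i}") for a in (1, 2) for i in range(3)])
conn.executemany("INSERT INTO keywords VALUES (?, ?)",
                 [(a, f"k{i}") for a in (1, 2) for i in range(4)])

# "Explode": each status row pairs with each keyword row of the same
# order, so a GROUP BY would have to implode 2 * 3 * 4 rows.
exploded = conn.execute("""
    SELECT COUNT(*) FROM auftraege a
    LEFT JOIN status s ON s.auftrag_id = a.auftrag_id
    LEFT JOIN keywords k ON k.auftrag_id = a.auftrag_id
""").fetchone()[0]

# "Inside out": one row per order; the child tables are reached only
# through scalar subqueries, so no GROUP BY is needed.
rows = conn.execute("""
    SELECT a.auftrag_id,
           (SELECT COUNT(*) FROM status s
             WHERE s.auftrag_id = a.auftrag_id),
           (SELECT group_concat(k.keyword, ',') FROM keywords k
             WHERE k.auftrag_id = a.auftrag_id)
    FROM auftraege a
    ORDER BY a.auftrag_id
""").fetchall()

print(exploded, len(rows))
```

The joined form builds 24 intermediate rows to deliver 2; the subquery form never produces more than 2.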
More
These indexes will probably help:
status: (auftrag_id, typ, datum, aenderung)
rechnungen: (beglichen, kunde_id, rechnung_tag)
rechnungen: (rechnungsnummer, beglichen)
zahlungsplanneu: (auftrag_id, rechnungsnummer)
kunden_keywords: (kunde_id, keyword) -- (unless `kunde_id` is the PK)
(I see from all 3 EXPLAINs that you probably have sufficient indexes on kunden_keywords and status. Show me what indexes you have, so I can see if the existing indexes are as good as my suggestions.) "Using index" == "covering index".
Near the end is this LEFT JOIN, but I did not spot any use for the table; perhaps it can be removed?
left join `kunden_keywords` on(`angebot`.`kunde_id` = `kunden_keywords`.`kunde_id`))

MYSQL OR query problem (scans full table even when using indexes)

I am using EXPLAIN to get performance analysis of my below query:
SELECT `wf_cart_items` . `id`
FROM `wf_cart_items`
WHERE (`wf_cart_items` . `docket_number` = '405-2844' OR
match( `wf_cart_items` . `multi_docket_number` ) against ( '405-2844' )
)
The problem is that EXPLAIN reports 597151 rows to be examined, while each of the two conditions on its own examines only 1 row. How is it possible that when I use OR it is doing a full table scan?
P.S.: I have FULL-TEXT index on multi_docket_number & BTREE index on docket_number
OR is quite tricky for SQL optimizers -- both in the WHERE clause and in ON clauses.
The recommendation is to switch this to union all:
SELECT ci.id
FROM wf_cart_items ci
WHERE ci.docket_number = '405-2844'
UNION ALL
SELECT ci.id
FROM wf_cart_items ci
WHERE MATCH(ci.multi_docket_number) AGAINST ( '405-2844' ) AND
ci.docket_number <> '405-2844';
Based on the naming of your columns, I fear that multi_docket_number actually contains multiple docket numbers. If that is the case, you probably want to fix the data model, but that is another conversation.
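One caveat on the <> guard in the second branch: if docket_number can be NULL (the question does not say), then docket_number <> '405-2844' evaluates to NULL for those rows and they drop out, although the original OR would have returned them. A small sketch with Python's sqlite3 as a stand-in; a plain = on multi_docket_number replaces MATCH ... AGAINST here, and the five rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wf_cart_items (
        id INTEGER PRIMARY KEY,
        docket_number TEXT,
        multi_docket_number TEXT
    );
    INSERT INTO wf_cart_items VALUES
        (1, '405-2844', NULL),
        (2, '111-0000', '405-2844'),
        (3, '405-2844', '405-2844'),
        (4, NULL,       '405-2844'),
        (5, '222-0000', '333-0000');
""")

def ids(sql):
    """Run a query with :d bound to the docket and return sorted ids."""
    return sorted(r[0] for r in conn.execute(sql, {"d": "405-2844"}))

with_or = ids("""
    SELECT id FROM wf_cart_items
    WHERE docket_number = :d OR multi_docket_number = :d""")

# Guard as written in the answer: row 4 (NULL docket_number) is lost.
guard_ne = ids("""
    SELECT id FROM wf_cart_items WHERE docket_number = :d
    UNION ALL
    SELECT id FROM wf_cart_items
    WHERE multi_docket_number = :d AND docket_number <> :d""")

# NULL-aware guard: same rows as the OR, still no duplicates.
guard_null_safe = ids("""
    SELECT id FROM wf_cart_items WHERE docket_number = :d
    UNION ALL
    SELECT id FROM wf_cart_items
    WHERE multi_docket_number = :d
      AND (docket_number <> :d OR docket_number IS NULL)""")

print(with_or, guard_ne, guard_null_safe)
```

In MySQL the same guard can be written with the NULL-safe operator, NOT (ci.docket_number <=> '405-2844'). If docket_number is declared NOT NULL, the query in the answer is already fine as it stands.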

Why doesn't this query use an index for ORDER BY?

SELECT `f`.*
FROM `files_table` `f`
WHERE f.`application_id` IN(6)
AND `f`.`project_id` IN(130418)
AND `f`.`is_last_version` = 1
AND `f`.`temporary` = 0
AND f.deleted_by is null
ORDER BY `f`.`date` DESC
LIMIT 5
When I remove the ORDER BY, query executes in 0.1 seconds. With the ORDER BY it takes 3 seconds.
There is an index on every WHERE column and there is also an index on ORDER BY field (date).
What can I do to make this query faster? Why is ORDER BY slowing it down so much? Table has 3M rows.
Instead of a separate index on each column in the WHERE clause, make sure you have a composite index that covers all of the WHERE columns, e.g.:
create index idx1 on files_table (application_id, project_id,is_last_version,temporary,deleted_by)
Also avoid the IN clause for a single value; use = instead:
SELECT `f`.*
FROM `files_table` `f`
WHERE f.`application_id` = 6
AND `f`.`project_id` = 130418
AND `f`.`is_last_version` = 1
AND `f`.`temporary` = 0
AND f.deleted_by is null
ORDER BY `f`.`date` DESC
LIMIT 5
Adding date (or other selected columns) to the index can let the query be answered from the index alone, without accessing the table data. With SELECT * you probably need several more columns, so the table rows are read anyway, but you can try it and evaluate the performance.
Be careful to place the columns that are not involved in the WHERE to the right of all the columns that are:
create index idx1 on files_table (application_id, project_id,is_last_version,temporary,deleted_by, date)

MySQL queries stuck in "sending data" for 30 seconds after migrating to RDS

This query (along with a few others I think have a related issue) did not take 30 seconds when MySQL was local on the same EC2 instance as the rest of the website. More like milliseconds.
Does anything look off?
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id =
chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id =
chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND
chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1
LEFT JOIN chv_follows ON chv_follows.follow_followed_user_id =
chv_images.image_user_id
LEFT JOIN chv_follows_projects ON
chv_follows_projects.follows_project_project_id =
chv_images.image_project_id LEFT JOIN chv_projects ON
chv_projects.project_id = follows_project_project_id WHERE
chv_follows.follow_user_id='1' OR (follows_project_user_id = 1 AND
chv_projects.project_privacy = "public" AND
chv_projects.project_is_public_upload = 1) GROUP BY chv_images.image_id
ORDER BY chv_images.image_id DESC
LIMIT 0,15
And this is what EXPLAIN shows:
Thank you
Update: This query has the same issue. It does not have a GROUP BY.
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id =
chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id =
chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND
chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1
ORDER BY chv_images.image_id DESC
LIMIT 0,15
That EXPLAIN shows several table-scans (type: ALL), so it's not surprising that it takes over 30 seconds.
Here's your EXPLAIN:
Notice the column rows shows an estimated 14420 rows read from the first table chv_images. It's doing a table-scan of all the rows.
In general, when you do a series of JOINs, you can multiply together all the values in the rows column of the EXPLAIN, and the result is an estimate of how many row-reads MySQL has to do. In this case it's 14420 * 2 * 1 * 1 * 2 * 1 * 916, or 52,834,880 row-reads. That should put into perspective the high cost of doing several table-scans in the same query.
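The arithmetic, spelled out as a few lines of Python. The first list holds the rows figures quoted above; the second is a purely hypothetical "what if" in which the 916-row scan became a single-row index lookup.

```python
import math

# "rows" estimates, one per table, as quoted from the EXPLAIN above.
rows_estimates = [14420, 2, 1, 1, 2, 1, 916]
total_row_reads = math.prod(rows_estimates)
print(f"{total_row_reads:,}")

# Hypothetical: the last table is reached through an index (1 row).
rows_if_indexed = [14420, 2, 1, 1, 2, 1, 1]
reads_if_indexed = math.prod(rows_if_indexed)
print(f"{reads_if_indexed:,}")
```

Because the estimates multiply, removing one large factor shrinks the total far more than trimming any of the small ones.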
You might help avoid those table-scans by creating some indexes on these tables:
ALTER TABLE chv_storages
ADD INDEX (storage_id);
ALTER TABLE chv_categories
ADD INDEX (category_id);
ALTER TABLE chv_likes
ADD INDEX (like_content_id, like_content_type, like_user_id);
Try creating those indexes and then run the EXPLAIN again.
The other tables are already doing lookups by primary key (type: eq_ref) or by secondary key (type: ref) so those are already optimized.
Your EXPLAIN shows your query uses a temporary table and filesort. You should reconsider whether you need the GROUP BY, because that's probably causing the extra work.
Another tip is to avoid using SELECT * because it might be forcing the query to read many extra columns that you don't need. Instead, explicitly name only the columns you need.
Are there any indexes on chv_images?
I propose:
CREATE INDEX idx_image_id ON chv_images (image_id);
(Bill's ideas are good. I'll take the discussion a different way...)
Explode-Implode -- If the LEFT JOINs match no more than 1 row, change, for example,
SELECT
...
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
into
SELECT ...,
( SELECT foo FROM chv_meta WHERE image_id = chv_images.image_id ) AS foo, ...
If that can be done for all the JOINs, you can get rid of GROUP BY. This will avoid the costly "explode-implode" where JOINs lead to more rows, then GROUP BY gets rid of the dups. (I suspect you can't move all the joins in.)
OR -> UNION -- OR is hard to optimize. Your query looks like a good candidate for turning into UNION, then making more indexes that will become useful.
WHERE chv_follows.follow_user_id='1'
OR (follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1
)
Assuming that follows_project_user_id is in chv_images,
( SELECT ...
WHERE chv_follows.follow_user_id='1' )
UNION DISTINCT -- or ALL, if you are sure there won't be dups
( SELECT ...
WHERE follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1 )
Indexes needed:
chv_follows: (follow_user_id)
chv_projects: (project_privacy, project_is_public_upload) -- either order
But this has not yet handled the ORDER BY and LIMIT. The general pattern for such:
( SELECT ... ORDER BY ... LIMIT 15 )
UNION
( SELECT ... ORDER BY ... LIMIT 15 )
ORDER BY ... LIMIT 15
Yes, the ORDER BY and LIMIT are repeated.
That works for page 1. If you want the next 15 rows, see http://mysql.rjweb.org/doc.php/pagination#pagination_and_union
After building those two sub-selects, look at them; I think you will be able to optimize each one, and may need new indexes because the Optimizer will start with a different 'first' table.
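Why repeating ORDER BY ... LIMIT 15 inside each branch is safe: any row in the overall top 15 must also be in the top 15 of the branch it came from. A small sketch with plain Python lists standing in for the two sub-selects; the id values are made up.

```python
def top_n(values, n=15):
    """Stand-in for: ORDER BY value DESC LIMIT n."""
    return sorted(values, reverse=True)[:n]

# Hypothetical image_id lists produced by the two branches of the UNION.
followed_users = list(range(0, 300, 3))
followed_projects = list(range(0, 300, 5))

# Whole UNION DISTINCT first, then a single ORDER BY ... LIMIT 15.
slow = top_n(set(followed_users) | set(followed_projects))

# LIMIT 15 inside each branch, then the same ORDER BY ... LIMIT 15 outside.
fast = top_n(set(top_n(followed_users)) | set(top_n(followed_projects)))

print(slow == fast)
```

Both give the same 15 rows, but the second form lets each branch stop early when an index supplies the ordering. For later pages the inner LIMIT has to be offset + 15 rather than 15, which is the subject of the pagination link above.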