How to optimize the following SELECT query - mysql

We have the following table
id # primary key
device_id_fk
auth # there's an index on it
old_auth # there's an index on it
And the following query.
$select_user = $this->db->prepare("
SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE (dv.auth= :auth OR dv.old_auth= :auth)
LIMIT 1
");
EXPLAIN: I can't reach the main client's server, but here is the EXPLAIN from another client with less data.
Since there are a lot of other UPDATE queries on auth, the updates start getting written to the slow query log and the CPU spikes.
If I remove the index from auth, the SELECT query gets written to the slow query log instead of the UPDATE. Adding an index to device_id_fk makes no difference.
I tried rewriting the query using UNION instead of OR, but I was told the CPU still spiked and the SELECT query was still written to the slow query log.
$select_user = $this->db->prepare("
(SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE dv.auth= :auth)
UNION ALL
(SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE dv.old_auth= :auth)
LIMIT 1
");
Explain
Most often, this is the only query in the slow query log. Is there a more optimal way to write the query? Is there a more optimal way to add indexes? The client is using an old MariaDB version, the equivalent of MySQL 5.5, on a CentOS 6 server running LAMP.
Additional info
The update query that gets logged to the slow query log whenever an index is added to auth is
$update_device_auth = $this->db->prepare("UPDATE wtb_device_auths SET auth= :auth WHERE device_id_fk= :device_id_fk");

Your few indexes should not be slowing down your updates.
You need two indexes to make both your update and select perform well. My best guess is you never had both at the same time.
UPDATE wtb_device_auths SET auth= :auth WHERE device_id_fk= :device_id_fk
You need an index on device_id_fk for this update to perform well. And regardless of its index it should be declared a foreign key.
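A minimal sketch of both, assuming the parent table is wtb_devices with primary key id (adjust names and types to your actual schema):
ALTER TABLE wtb_device_auths
  ADD INDEX idx_device_id_fk (device_id_fk);   -- lets the UPDATE seek by device_id_fk instead of scanning
ALTER TABLE wtb_device_auths
  ADD CONSTRAINT fk_device_auths_device         -- assumes wtb_devices(id) exists and the column types match
  FOREIGN KEY (device_id_fk) REFERENCES wtb_devices(id);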
SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE (dv.auth= :auth OR dv.old_auth= :auth)
LIMIT 1
You need a single combined index on auth, old_auth for this query to perform well.
Separate auth and old_auth indexes should also work well, assuming there aren't too many duplicates. MySQL will merge the results from the indexes, and that merge should be fast... unless a lot of rows match.
If you also search for old_auth alone, add an index on old_auth.
And, as others have pointed out, the select query could return one of several matching devices with a matching auth or old_auth. This is probably bad. If auth and old_auth are intended to identify a device, add a unique constraint.
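For example, a sketch of the combined index (and, only if your data really guarantees one device per auth value, a unique key); the names are illustrative:
ALTER TABLE wtb_device_auths ADD INDEX idx_auth_old_auth (auth, old_auth);
-- optional, only if each auth value belongs to exactly one device:
-- ALTER TABLE wtb_device_auths ADD UNIQUE KEY uq_auth (auth);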
Alternatively, you need to restructure your data. Multiple columns holding the same value is a red flag. It can result in a proliferation of indexes, as you're experiencing, and also limit how many versions you can store. Instead, have just one auth per row and allow each device to have multiple rows.
create table wtb_device_auths (
    id serial primary key,
    device_id bigint not null references wtb_devices(id),
    auth varchar(255) not null,   -- varchar rather than text so it can be indexed without a prefix length
    created_at datetime not null default current_timestamp,
    index(auth)
);
Now you only need to search one column.
select device_id from wtb_device_auths where auth = ?
Now one device can have many wtb_device_auths rows. If you want the current auth for a device, search for the newest one.
select auth
from wtb_device_auths
where device_id = ?
order by created_at desc
limit 1
Since each device will only have a few auths, this is likely to be plenty fast with the device_id index alone; sorting the handful of rows for a device will be fast.
If not, you might need an additional combined index like created_at, device_id. This covers searching and sorting by created_at alone as well as queries searching and sorting by both created_at and device_id.
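For instance (a sketch; which column order is best depends on which queries you need it to serve):
ALTER TABLE wtb_device_auths ADD INDEX idx_created_device (created_at, device_id);
-- for the "newest auth for one device" query above, the reversed order lets MySQL
-- seek on device_id and read the rows already sorted by created_at:
-- ALTER TABLE wtb_device_auths ADD INDEX idx_device_created (device_id, created_at);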

OR usually leads to a slow, full-table scan. This UNION trick, together with appropriate INDEXes is much faster:
( SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE dv.auth= :auth
LIMIT 1 )
UNION ALL
( SELECT device_id_fk
FROM wtb_device_auths AS dv
WHERE dv.old_auth= :auth
LIMIT 1 )
LIMIT 1
And have these "composite" indexes:
INDEX(auth, device_id_fk)
INDEX(old_auth, device_id_fk)
These indexes can replace the existing indexes with the same first column.
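A sketch of that change; the DROP INDEX names are placeholders for whatever your existing single-column indexes are actually called:
ALTER TABLE wtb_device_auths
  ADD INDEX idx_auth_device (auth, device_id_fk),
  ADD INDEX idx_old_auth_device (old_auth, device_id_fk),
  DROP INDEX auth,        -- placeholder: the existing index on (auth)
  DROP INDEX old_auth;    -- placeholder: the existing index on (old_auth)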
Notice that I had 3 LIMITs; you had only 1.
That UNION ALL involves a temp table. You should upgrade to 5.7 (at least); that version optimizes away the temp table.
A LIMIT without an ORDER BY gives a random row; is that OK?
Please provide the entire text of the slowlog entry for this one query -- it has info that might be useful. If "Rows_examined" is more than 2 (or maybe 3), then something strange is going on.

Related

MySQL 8 - Slow select when order by combined with limit

I'm having trouble understanding my options for optimizing this specific query. Looking online, I find various resources, but they're all for queries that don't match mine. From what I can gather, it's very hard to optimize a query when you have an ORDER BY combined with a LIMIT.
My use case is a paginated datatable that displays the latest records first.
The query in question is the following (to fetch 10 latest records):
select
`xyz`.*
from
xyz
where
`xyz`.`fk_campaign_id` = 95870
and `xyz`.`voided` = 0
order by
`registration_id` desc
limit 10 offset 0
and the table DDL:
CREATE TABLE `xyz` (
`registration_id` int NOT NULL AUTO_INCREMENT,
`fk_campaign_id` int DEFAULT NULL,
`fk_customer_id` int DEFAULT NULL,
... other fields ...
`voided` tinyint unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`registration_id`),
.... ~12 other indexes ...
KEY `activityOverview` (`fk_campaign_id`,`voided`,`registration_id` DESC)
) ENGINE=InnoDB AUTO_INCREMENT=280614594 DEFAULT CHARSET=utf8 COLLATE=utf8_danish_ci;
The explain on the query mentioned gives me the following:
"id","select_type","table","partitions","type","possible_keys","key","key_len","ref","rows","filtered","Extra"
1,SIMPLE,db_campaign_registration,,index,"getTop5,winners,findByPage,foreignKeyExistingCheck,limitReachedIp,byCampaign,emailExistingCheck,getAll,getAllDated,activityOverview",PRIMARY,"4",,1626,0.65,Using where; Backward index scan
As you can see, it says it only hits 1,626 rows. But when I execute it, it takes 200+ seconds to run.
I'm doing this to fetch data for a datatable that displays the latest 10 records. I also have pagination that lets one navigate pages (only to the next page, not to the last page or any big jumps).
To help give the full picture, I've put together a dbfiddle: https://dbfiddle.uk/Jc_K68rj - this fiddle does not show the same results as my table, but I suspect that's because of the difference in data size.
The table in question has 120 GB of data and 39,000,000 active records. I already have an index in place that should cover the query and allow it to fetch the data fast. Am I completely missing something here?
Another solution goes something like this:
SELECT b.*
FROM ( SELECT registration_id
FROM xyz
where `xyz`.`fk_campaign_id` = 95870
and `xyz`.`voided` = 0
order by `registration_id` desc
limit 10 offset 0 ) AS a
JOIN xyz AS b USING (registration_id)
order by `registration_id` desc;
Explanation:
The derived table (subquery) will use the 'best' index without any extra prompting -- since it is "covering".
That will deliver 10 ids
Then 10 JOINs to the table to get xyz.*
A derived table is unordered, so the ORDER BY does need repeating.
That's tricking the Optimizer into doing what it should have done anyway.
(Again, I encourage getting rid of any indexes that are prefixes of the optimal 3-column index discussed.)
KEY `activityOverview` (`fk_campaign_id`,`voided`,`registration_id` DESC)
is optimal. (Nearly as good is the same index, but without the DESC).
Let's see the other indexes. I strongly suspect that there is at least one index that is a prefix of that index. Remove it/them. The Optimizer sometimes gets confused and picks the "smaller" index instead of the "better" index.
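For example (hypothetical -- substitute the real name of any index that covers only (fk_campaign_id) or (fk_campaign_id, voided)):
ALTER TABLE xyz DROP INDEX byCampaign;   -- only if byCampaign is such a prefix index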
Here's a technique for seeing whether it manages to read only 10 rows instead of most of the table: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#handler_counts
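That technique boils down to zeroing the session counters, running the query, and checking how many rows the engine actually touched; roughly:
FLUSH STATUS;                             -- reset the session Handler_% counters
SELECT `xyz`.* FROM xyz
WHERE `xyz`.`fk_campaign_id` = 95870 AND `xyz`.`voided` = 0
ORDER BY `registration_id` DESC
LIMIT 10 OFFSET 0;
SHOW SESSION STATUS LIKE 'Handler%';      -- Handler_read_* around 10 means only 10 rows were read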

SQL gets slow on a simple query with ORDER BY

I have a problem with MySQL ORDER BY: it slows down the query and I really don't know why. My query was a little more complex, so I simplified it to a light query with no joins, but it still runs really slowly.
Query:
SELECT
W.`oid`
FROM
`z_web_dok` AS W
WHERE
W.`sent_eRacun` = 1 AND W.`status` IN(8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows and takes about 500 MB of memory. The fields I'm selecting are all indexed as follows:
oid - INT PRIMARY KEY AUTOINCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
I am posting screenshots of the EXPLAIN output first:
Next is the query executed against the database:
And this is the speed after I remove the ORDER BY.
I have also tried sorting by a DATETIME field, which is also indexed, but I get the same slow query as when ordering by the primary key. This started today; it has always been fast and light before.
What can cause something like this?
The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're looking for equality matches on the first three columns, and sorting on the fourth column. The query planner will use this index to satisfy the entire query. It can random-access the index to find the first row matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries. Edit: You asked whether it's better to create specific indexes for each query in your application. The answer is yes.
There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) gets in the way of handling the WHERE..ORDER BY..LIMIT completely efficiently. A possible solution is to treat the IN as an OR, convert that to a UNION, and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or
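Filled in with this question's table and columns, that sketch would look roughly like the following (assuming the covering index described by OJones is in place):
( SELECT W.`oid`
  FROM `z_web_dok` AS W
  WHERE W.`sent_eRacun` = 1 AND W.`status` = 8 AND W.`Drzava` = 'BiH'
  ORDER BY W.`oid` ASC
  LIMIT 10 )
UNION ALL
( SELECT W.`oid`
  FROM `z_web_dok` AS W
  WHERE W.`sent_eRacun` = 1 AND W.`status` = 9 AND W.`Drzava` = 'BiH'
  ORDER BY W.`oid` ASC
  LIMIT 10 )
ORDER BY oid ASC
LIMIT 10;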

SQL: Optimize the query on large table with indexing

For example, I have the following table:
table Product
------------
id
category_id
processed
product_name
This table has indexes on the columns id, category_id, and processed, plus a composite index on (category_id, processed). The statistics on this table are:
select count(*) from Product; -- 50M records
select count(*) from Product where category_id=10; -- 1M records
select count(*) from Product where processed=1; -- 30M records
The simplest query I want to run is (SELECT * is a must):
select * from Product
where category_id=10 and processed=1
order by id ASC LIMIT 100
The above query, without the LIMIT, matches only about 10,000 records.
I want to run the above query multiple times. After each batch I will update the processed field to 0 (so those rows will not appear in the next query). When I test on the real data, the optimizer sometimes tries to use id as the key, which costs a lot of time.
How can I optimize the above query (in general terms)?
P.S.: To avoid confusion, I know that the best index would be (category_id, processed, id), but I cannot change the indexes. My question is only about optimizing the query.
Thanks
For this query:
select *
from Product
where category_id = 10 and processed = 1
order by id asc
limit 100;
The optimal index is on Product(category_id, processed, id). This is a single index with a three-part key, with the keys in this order.
Given that you have INDEX(category_id, processed), there is virtually no advantage in also having just INDEX(category_id). So DROP the latter.
That may have the beneficial side effect of pushing the Optimizer toward the composite INDEX(category_id, processed), which is at least "better" for the query.
Without touching the indexes, you could use a FORCE INDEX mentioning the composite index's name. But I don't recommend it. "It may help today, but hurt tomorrow, after the data changes."
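For illustration only -- the index name idx_cat_proc is hypothetical; use whatever your (category_id, processed) index is actually called:
SELECT *
FROM Product FORCE INDEX (idx_cat_proc)
WHERE category_id = 10 AND processed = 1
ORDER BY id ASC
LIMIT 100;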
Why do you say "But I cannot change the index"? Newer versions of MySQL/MariaDB make ADD/DROP INDEX much faster than older versions. Also, pt-online-schema-change provides a fast way.

MySQL query never finishes when doing an ORDER BY and doing a COUNT

I am trying to calculate the rank of characters in my table. For whatever reason, the query runs forever whenever I run it with an order by clause.
I have a very similar query running in a different server with a different schema but it's essentially doing the same thing and that does finish almost instantly. I am completely lost as to why this query never finishes and takes forever.
I am indexing almost everything in the characters table and still no luck.
KEY `accountid` (`accountid`),
KEY `party` (`party`),
KEY `ranking1` (`level`,`exp`),
KEY `ranking2` (`gm`,`job`),
KEY `idx_characters_gm` (`gm`),
KEY `idx_characters_fame` (`fame`),
KEY `idx_characters_job` (`job`),
KEY `idx_characters_level` (`level`),
KEY `idx_characters_exp` (`exp`),
When I don't include the ORDER BY it runs just fine and finishes instantly. When I do, it runs forever.
There are only 28,000 characters in the DB so it can't be that intensive to compute a rank, especially when the limit's only 1.
SELECT c.name
, 1+(
SELECT COUNT(*)
FROM msd.characters as rankc
WHERE rankc.level > c.level
LIMIT 1
) as jobRank
FROM characters as c
JOIN accounts as a
ON c.accountid = a.id
WHERE c.gm = 0 AND a.banned = 0
ORDER BY c.`level` DESC, c.exp DESC
LIMIT 1
OFFSET 0;
Any help would be greatly appreciated!
EDIT:
Essentially, each character has a unique job and I want to get the job ranking of that character. The default order of the rankings is by level. That's why I'm doing a comparison in my jobRank SELECT.
Here is an example of my desired result (screenshot not included).
The first thing you should do when having query performance problems is run an EXPLAIN on your query (just prefix your SELECT statement with EXPLAIN; see https://dev.mysql.com/doc/refman/5.7/en/using-explain.html for more details). While you might think you have all the appropriate indexes in place, sometimes the decisions the DBMS engine makes will not be obvious. Don't be surprised to find one or more full table scans where you thought indexes would be used (that's probably the most common source of "query never finishes" issues).
Update: based on your EXPLAIN output, the query is performing multiple full table scans in the subquery, which is probably your performance problem. You want to get that subquery using an index, so I'd suggest eliminating LIMIT 1 from the subquery (the WHERE clause already filters out the levels you don't want, and since there's no GROUP BY clause, COUNT(*) returns only one row anyway) and seeing if that does the job.
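Applied to the query above, that suggestion looks like this (everything else unchanged):
SELECT c.name
     , 1 + (
         SELECT COUNT(*)
         FROM msd.characters AS rankc
         WHERE rankc.level > c.level
       ) AS jobRank
FROM characters AS c
JOIN accounts AS a ON c.accountid = a.id
WHERE c.gm = 0 AND a.banned = 0
ORDER BY c.`level` DESC, c.exp DESC
LIMIT 1 OFFSET 0;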

Using index with IN clause and ordering by primary key

I am having a problem with the following task using MySQL. I have a table Records(id,enterprise, department, status). Where id is the primary key, and enterprise and department are foreign keys, and status is an integer value (0-CREATED, 1 - APPROVED, 2 - REJECTED).
Now, usually the application need to filter something for a concrete enterprise and department and status:
SELECT * FROM Records WHERE status = 0 AND enterprise = 11 AND department = 21
ORDER BY id desc LIMIT 0,10;
The order by is required, since I have to provide the user with the most recent records. For this query I have created an index (enterprise, department, status), and everything works fine. However, for some privileged users the status should be omitted:
SELECT * FROM Records WHERE enterprise = 11 AND department = 21
ORDER BY id desc LIMIT 0,10;
This obviously breaks the index - it's still good for filtering, but not for sorting. So, what should I do? I don't want create a separate index (enterprise, department), so what if I modify the query like this:
SELECT * FROM Records WHERE enterprise = 11 AND department = 21
AND status IN (0,1,2)
ORDER BY id desc LIMIT 0,10;
MySQL definitely does use the index now, since it's given values for status, but how quick will the sorting by the primary key be? Will it take the 10 most recent values for each status and then merge them, or will it first merge the ids for all statuses together and only then take the first ten (which, I'd guess, would be much slower)?
All of the queries will benefit from one composite index:
INDEX(enterprise, department, status, id)
enterprise and department can swapped, but keep the rest of the columns in that order.
The first query will use that index for both the WHERE and the ORDER BY, thereby be able to find the 10 rows without scanning the table or doing a sort.
The second query is missing status, so my index is less than perfect. This would be better:
INDEX(enterprise, department, id)
At that point, it works like above. (Note: If the table is InnoDB, then this 3-column index is identical to your 2-column INDEX(enterprise, department) -- the PK is silently included.)
The third query gets dicier because of the IN. Still, my 4-column index will be nearly the best. It will use the first 3 columns, but it won't be able to use id for the ORDER BY, and it won't be able to consume the LIMIT. Hence the EXPLAIN will say Using temporary and/or Using filesort. Don't worry, performance should still be nice.
My second index is not as good for the third query.
See my Index Cookbook.
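A sketch of adding both suggested indexes (the index names are illustrative):
ALTER TABLE Records
  ADD INDEX idx_ent_dep_status_id (enterprise, department, status, id),
  ADD INDEX idx_ent_dep_id (enterprise, department, id);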
"How quick will sorting by id be"? That depends on two things.
Whether the sort can be avoided (see above);
How many rows in the query without the LIMIT;
Whether you are selecting TEXT columns.
I was careful to say whether the INDEX is used all the way through the ORDER BY, in which case there is no sort, and the LIMIT is folded in. Otherwise, all the rows (after filtering) are written to a temp table, sorted, then 10 rows are peeled off.
The "temp table" I just mentioned is necessary for various complex queries, such as those with subqueries, GROUP BY, ORDER BY. (As I have already hinted, sometimes the temp table can be avoided.) Anyway, the temp table comes in 2 flavors: MEMORY and MyISAM. MEMORY is favorable because it is faster. However, TEXT (and several other things) prevent its use.
If MEMORY is used then Using filesort is a misnomer -- the sort is really an in-memory sort, hence quite fast. For 10 rows (or even 100) the time taken is insignificant.