We are encountering deadlocks in our system and the deadlock graph shows this format of wait resources.
waitresource="KEY: 500:xxxxxxxxxx (f4d477997e11)"
waitresource="KEY: 500:xxxxxxxxxx (8d4830b45673)"
Both the victim and the winner have the same HOBT ID and point to the same object, in our case a non-clustered index on a table. Only the hash value is different, so I'm guessing they are different rows.
When I run the following query:
select *,%%lockres%%
from dbo.myTabke
where %%lockres%% IN('(f4d477997e11)','(8d4830b45673)')
Now this yields no results.
Also my index itself which is my deadlock wait resource has columns in this order -
test_NonClustIndex(colA, colB, colC)
The statements in my sessions (both updates and selects on the same table) have predicates like
WHERE colA = #a and colB = #b and colC = #c.
Now for both the winner and the victim sessions, #a and #b is the same value. Only #c is different.
Will I be able to avoid the deadlock if I flip the order of index to
test_NonClustIndex(colC, colA, colB)?
There are other indexes on the table that don't show up as a wait resource on the deadlock graph.
Regarding not finding the %%lockres%% value, you'll need to specify an index hint for the non-clustered index so that the hash of the non-clustered key is returned instead of the clustered key:
--clustered key hashes
SELECT *,%%lockres%%
FROM dbo.myTabke
WHERE %%lockres%% IN('(f4d477997e11)','(8d4830b45673)');
--non-clustered key hashes
SELECT *,%%lockres%%
FROM dbo.myTabke WITH(INDEX=test_NonClustIndex)
WHERE %%lockres%% IN('(f4d477997e11)','(8d4830b45673)');
I can't say whether changing the non-clustered index key order will help avoid the deadlock without additional details, but I wouldn't expect that to be the solution. Add the deadlock XML to your question.
Related
Here is the table_a schema I have:
Field           type
id (PRIMARY)    bigint
status          tinyint
err_code        bigint
...             ...
The sql I want to execute will be:
select * from table_a where id > 123456 and status = -1 and err_code = 100001 order by id asc LIMIT 500
I'd like to run the SQL above in real time.
My question is what kind of index I should use here. I already created a composite index, idx_id_status_err_code, but it seems that MySQL does not choose it.
EXPLAIN reports two possible keys, PRIMARY and idx_id_status_err_code, but MySQL uses the primary key instead of idx_id_status_err_code.
Another thing: there are some concurrent write operations, so I add a row lock (FOR UPDATE, not share mode) to the target rows. I'm not sure whether these write locks will affect the SQL I mentioned above.
Any help is appreciated.
where id > 123456 and status = -1 and err_code = 100001 order by id
needs
INDEX(status, err_code, -- 1st because they are tested with "=", either order
id) -- for range test (>) and for ORDER BY
Since that handles all of the WHERE, GROUP BY, and ORDER BY, the Optimizer can even handle the LIMIT 500, thereby stopping after 500 rows.
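To make the "stop after 500 rows" point concrete, here is a toy model in Python rather than MySQL itself. It treats an index as a sorted list of tuples and counts how many entries a scan has to examine. The data (100,000 rows, about 1% matching), the seed, and the function names scan_a and scan_b are all made up for illustration.

```python
import bisect
import random

random.seed(0)

# Hypothetical data: (id, status, err_code); roughly 1% of rows match the filter.
rows = []
for i in range(1, 100_001):
    match = random.random() < 0.01
    rows.append((i, -1 if match else 0, 100001 if match else 0))

index_a = sorted((s, e, i) for i, s, e in rows)  # models INDEX(status, err_code, id)
index_b = sorted(rows)                           # models INDEX(id, status, err_code)

def scan_a(min_id, limit):
    """Equality columns first: matching entries are contiguous and ordered by id."""
    start = bisect.bisect_right(index_a, (-1, 100001, min_id))
    ids, examined = [], 0
    for s, e, i in index_a[start:]:
        examined += 1
        if (s, e) != (-1, 100001):
            break  # walked past the last matching entry
        ids.append(i)
        if len(ids) == limit:
            break
    return ids, examined

def scan_b(min_id, limit):
    """Range column first: every entry with id > min_id has to be filtered."""
    start = bisect.bisect_right(index_b, (min_id, float("inf"), float("inf")))
    ids, examined = [], 0
    for i, s, e in index_b[start:]:
        examined += 1
        if s == -1 and e == 100001:
            ids.append(i)
            if len(ids) == limit:
                break
    return ids, examined

ids_a, examined_a = scan_a(0, 500)
ids_b, examined_b = scan_b(0, 500)
print("entries examined with (status, err_code, id):", examined_a)
print("entries examined with (id, status, err_code):", examined_b)
```

Both scans return the same 500 ids in id order, but the first examines only the entries it returns, while the second has to filter many non-matching entries along the way.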
When you start an INDEX with the column(s) of the PRIMARY KEY (id), there is little reason for the Optimizer to pick the INDEX instead of simply reaching into the data. This is especially true since you are fetching columns that are not in the index (SELECT *).
Avoid "index hints". What helps today may hurt tomorrow (when the data distribution changes).
You mentioned a "row lock"; let's hear more about why you think you need one. If you are afraid that some other thread will change one of the rows this SELECT picked, then that is better fixed by adding a suitable WHERE to the UPDATE -- to make sure the row still has that status and err_code.
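A minimal sketch of that guarded UPDATE, using Python's built-in sqlite3 module as a stand-in for MySQL. The helper name claim and the new status value 1 are assumptions for illustration; the idea is only that the UPDATE repeats the conditions and the caller checks how many rows were actually changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (id INTEGER PRIMARY KEY, status INTEGER, err_code INTEGER)"
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [(1, -1, 100001), (2, -1, 100001)],
)

# Step 1: pick candidate rows without taking any explicit lock.
picked = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM table_a WHERE status = -1 AND err_code = 100001 ORDER BY id"
    )
]

# Meanwhile, some other writer changes row 2.
conn.execute("UPDATE table_a SET status = 0 WHERE id = 2")

def claim(row_id):
    """Update the row only if it still has the status and err_code we selected on."""
    cur = conn.execute(
        "UPDATE table_a SET status = 1 "  # 1 is a hypothetical 'being processed' value
        "WHERE id = ? AND status = -1 AND err_code = 100001",
        (row_id,),
    )
    return cur.rowcount == 1

results = {row_id: claim(row_id) for row_id in picked}
print(results)
```

A row that was changed in the meantime simply matches zero rows in the UPDATE, so the caller can skip it instead of holding a lock across the SELECT.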
I have the following query:
select * from `tracked_employments`
where `tracked_employments`.`file_id` = 10006000
and `tracked_employments`.`user_id` = 1003230
and `tracked_employments`.`can_be_sent` = 1
and `tracked_employments`.`type` = 'jobchange'
and `tracked_employments`.`file_type` = 'file'
order by `tracked_employments`.`id` asc
limit 1000
offset 2000;
and this index
explain tells me that it does not use the index, but when I replace * with id it does use it. Why does it make a difference what columns I select?
Both you and Akina have misconceptions about how InnoDB indexing works.
Let me explain the two ways that that query may be executed.
Case 1. Index is used.
This assumes the datatypes, etc, all match the 5-column composite index that seems to exist on the table. Note: because all the tests are for =, the order of the columns in the WHERE clause and the INDEX does not matter.
In InnoDB, id (or whatever column(s) are in the PRIMARY KEY) is implicitly added onto the index.
The lookup will go directly (in the Index's BTree) to the first row that matches all 5 tests. From there, it will scan forward. Each 'row' in the index has the PK, so it can reach over into the data's BTree to find any other columns needed for * (cf SELECT *).
But, it must skip over 2000 rows before delivering the 1000 that are desired. This is done by actually stepping over each one, one at a time. That is, OFFSET is not necessarily fast.
Case 2. Don't bother with the index.
This happens based on some nebulous analysis of the 3000 rows that need to be touched and the size of the table.
The rationale behind possibly scanning the table without using the index is that the bouncing between the index BTree and the data BTree may be more costly than simply scanning the data BTree. Note that the data BTree is already in the desired order -- namely by id. (Assuming that is the PK.) That avoids a sort of up to 1000 rows.
Also, certain datatype issues may prevent the use of the index.
I do need to ask what the client will do with 1000 rows all at once. If it is a web page, that seems awfully big.
Case 3 -- Just SELECT id
In this case, all the info is available in the index, so there is no need to reach into the data's BTree.
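The difference between Case 1 and Case 3 can be observed with a small experiment. The sketch below uses SQLite (via Python's sqlite3) purely as a stand-in: like InnoDB, it appends the row identifier to every secondary index entry, but its planner and its EXPLAIN QUERY PLAN output are not MySQL's. The index name idx_five and the extra payload column are assumptions, since the question's actual index definition is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tracked_employments (
        id INTEGER PRIMARY KEY,
        file_id INTEGER, user_id INTEGER, can_be_sent INTEGER,
        type TEXT, file_type TEXT,
        payload TEXT  -- hypothetical column that is NOT part of the index
    )""")
conn.execute("""
    CREATE INDEX idx_five ON tracked_employments
        (file_id, user_id, can_be_sent, type, file_type)""")

WHERE = """WHERE file_id = 10006000 AND user_id = 1003230 AND can_be_sent = 1
           AND type = 'jobchange' AND file_type = 'file'"""

def plan(select_list):
    """Return SQLite's plan text for SELECT <select_list> with the 5 equality tests."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT {select_list} FROM tracked_employments {WHERE}"
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last field is the human-readable detail

plan_id = plan("id")
plan_star = plan("*")
print("SELECT id :", plan_id)
print("SELECT *  :", plan_star)
```

With SELECT id the plan reports a covering index, meaning the table itself is never touched; with SELECT * the index is still used here, but each hit requires a lookup into the table. In MySQL, run EXPLAIN on the real query and look for "Using index" in the Extra column to confirm the same thing.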
The problem I have is the following:
I have a table that contains about 100000000 rows
it has 22 fields - some numeric, some text
it has a primary key id (auto-incremented integer)
it has a field another_id of type bigint, and a unique key on it
it has a field called state that can take only 4 integer values (0 to 3)
I need queries of the following form to execute as fast as possible:
SELECT COUNT(*)
FROM my_table
WHERE another_id IN ( <about 100 values> )
AND state = ...
for different values of state.
What should the index look like? I was thinking about two options:
KEY another_id:state (another_id, state)
KEY state:another_id (state, another_id)
Is there any difference in performance between those two variants? Is there anything else to consider?
Edit: engine is InnoDB
For the query you show, you should create the index with state, another_id in that order.
Define the index with any columns referenced in equality conditions first, after them add one column referenced in a range condition or ORDER BY or GROUP BY.
You may also like my answer to Does Order of Fields of Multi-Column Index in MySQL Matter or my presentation How to Design Indexes, Really, or the video.
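I agree with the answer above. One clarification, though: you would want a hash index rather than a B-tree index, since it should be faster for equality lookups. A hash index wouldn't work well with queries that involve inequalities such as <=. Note, however, that InnoDB does not offer user-defined hash indexes, so this applies only to engines that do, such as MEMORY.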
I have a very large table 20-30 million rows that is completely overwritten each time it is updated by the system supplying the data over which I have no control.
The table is not sorted in a particular order.
The rows in the table are unique, there is no subset of columns that I can be assured to have unique values.
Is there a way I can run a SELECT query followed by a DELETE query on this table with a fixed limit without having to trigger any expensive sorting/indexing/partitioning/comparison whilst being certain that I do not delete a row not covered by the previous select.
I think you're asking for:
SELECT * FROM MyTable WHERE x = 1 AND y = 3;
DELETE FROM MyTable WHERE NOT (x = 1 AND y = 3);
In other words, use NOT against the same search expression you used in the first query to get the complement of the set of rows. This should work for most expressions, unless some of your terms return NULL.
If there are no indexes, then both the SELECT and DELETE will incur a table-scan, but no sorting or temp tables.
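The NULL caveat is easy to miss, so here is a small demonstration using Python's sqlite3 module as a stand-in (three-valued logic works the same way in MySQL). The four sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, 3), (2, 3), (None, 3), (None, 4)],
)

selected = conn.execute(
    "SELECT COUNT(*) FROM MyTable WHERE x = 1 AND y = 3"
).fetchone()[0]

# Complement of the same search expression.
deleted = conn.execute(
    "DELETE FROM MyTable WHERE NOT (x = 1 AND y = 3)"
).rowcount

remaining = conn.execute("SELECT x, y FROM MyTable").fetchall()
print(selected, deleted, remaining)
```

The row (NULL, 3) is neither selected nor deleted: x = 1 AND y = 3 evaluates to NULL for it, and NOT NULL is still NULL. The row (NULL, 4) is deleted, because NULL AND FALSE is FALSE. If the columns are nullable, the expression has to handle NULL explicitly (for example with IS NULL tests) for the two statements to be true complements.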
Re your comment:
Right, unless you use ORDER BY, you aren't guaranteed anything about the order of the rows returned. Technically, the storage engine is free to return the rows in any arbitrary order.
In practice, you will find that InnoDB at least returns rows in a somewhat predictable order: it reads rows in some index order. Even if your table has no keys or indexes defined, every InnoDB table is stored as a clustered index, even if InnoDB has to generate a hidden one, called GEN_CLUST_INDEX, behind the scenes. That will be the order in which InnoDB returns rows.
But you shouldn't rely on that. The internal implementation is not a contract, and it could change tomorrow.
Another suggestion I could offer:
CREATE TABLE MyTableBase (
id INT AUTO_INCREMENT PRIMARY KEY,
A INT,
B DATE,
C VARCHAR(10)
);
CREATE VIEW MyTable AS SELECT A, B, C FROM MyTableBase;
With a table and a view like above, your external process can believe it's overwriting the data in MyTable, but it will actually be stored in a base table that has an additional primary key column. This is what you can use to do your SELECT and DELETE statements, and order by the primary key column so you can control it properly.
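A rough sketch of how the SELECT and DELETE could then be paired up, using Python's sqlite3 module as a stand-in. Two assumptions to note: the sample rows are invented, and they are inserted into MyTableBase directly because SQLite views are read-only without triggers (in MySQL a simple view like this one is normally insertable). The helper name take_batch is also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTableBase (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        A INT,
        B DATE,
        C VARCHAR(10)
    );
    CREATE VIEW MyTable AS SELECT A, B, C FROM MyTableBase;
""")
conn.executemany(
    "INSERT INTO MyTableBase (A, B, C) VALUES (?, ?, ?)",
    [(10, "2020-01-01", "a"), (20, "2020-01-02", "b"), (30, "2020-01-03", "c")],
)

def take_batch(limit):
    """Read up to `limit` rows in primary-key order, then delete exactly those rows."""
    batch = conn.execute(
        "SELECT id, A, B, C FROM MyTableBase ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    conn.executemany(
        "DELETE FROM MyTableBase WHERE id = ?", [(row[0],) for row in batch]
    )
    return [row[1:] for row in batch]  # hide the surrogate key from the caller

first_batch = take_batch(2)
remaining = conn.execute("SELECT A, B, C FROM MyTable").fetchall()
print(first_batch, remaining)
```

Because the DELETE targets the ids that the SELECT returned, it cannot remove a row the SELECT did not cover, and the only ordering involved is the primary key the table is already stored in.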
I have a table(users) with columns as
id INT AUTO_INCREMENT PRIMARY
uid INT index
email CHAR(128) UNIQUE
activated TINYINT
And I'll need to query this table like this:
SELECT * FROM users WHERE uid = ? AND activated = 1
My question is: since there's already an index on the 'uid' column, do I need another index on the 'activated' column to get the best performance for the above query? This table (which will be a big one) will be heavily accessed by 'INSERT' and 'UPDATE' statements as well as 'SELECT' ones.
I've learned from other sources that indexes slow down 'INSERT' and 'UPDATE' statements, so if the index on the uid column is enough for the query above, I'd rather not add another index on activated, for the sake of insert and update performance.
MySQL will generally use only one index per table in a query anyway, so a separate index on activated is unlikely to help.
However, if you want really optimal performance, define one index across both columns, in this order:
index_name (uid, activated)
That will allow optimized lookups of just uid, or uid AND activated.
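One way to check the "just uid, or uid AND activated" claim is to ask a query planner. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, so the plan text is SQLite's and not MySQL's; the index name idx_uid_activated is an assumption. In MySQL the equivalent check is EXPLAIN on each query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        uid INTEGER,
        email CHAR(128) UNIQUE,
        activated TINYINT
    );
    CREATE INDEX idx_uid_activated ON users (uid, activated);
""")

def plan(where):
    """Return SQLite's plan text for SELECT * FROM users WHERE <where>."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM users WHERE {where}"
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last field is the readable detail

plan_uid = plan("uid = 42")
plan_both = plan("uid = 42 AND activated = 1")
print("uid only          :", plan_uid)
print("uid and activated :", plan_both)
```

Both plans use the composite index: the first searches on its leading column only, the second on both columns. That is why a separate single-column index on uid becomes redundant once the two-column index exists.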
It depends upon your data distribution and the selectivity of uid versus the selectivity of uid and activated. If you have lots of unique values of uid and this would have high selectivity ie searching for uid = x only returns a few rows then including activated in the index would provide little value. Whereas if uid = x returns lots of rows and uid = x and activated = 1 returns few rows then there's value in the index.
It's hard to provide a specific answer without knowing the data distribution.
Creating the index won't make your SELECTs slower.
However, it will make them significantly faster only if you search for the less common value.
This index will only be useful if the majority of your accounts are activated and you search for not-activated ones, or the other way round: the majority of your accounts are non-activated and you search for activated ones.
Creating this index will also improve UPDATE and DELETE concurrency: without this index, all accounts (both activated and not-activated) for a given uid will be locked for the duration of UPDATE operation in InnoDB.
However, an additional index will of course hamper the DML performance.