MySQL 5.1.* Strange trigger replication behavior - mysql

I have a MySQL master-slave configuration.
On both servers I have two tables: table1 and table2
I also have the following trigger on both servers:
Trigger: test_trigger
Event: UPDATE
Table: table1
Statement: insert into table2 values(null)
Timing: AFTER
The structure of table2 is the following:
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
+-------+---------+------+-----+---------+----------------+
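For reference, the trigger described above corresponds roughly to the following definition (reconstructed from the SHOW TRIGGERS-style listing; the original DDL is not shown in the question):
CREATE TRIGGER test_trigger
AFTER UPDATE ON table1
FOR EACH ROW
  INSERT INTO table2 VALUES (NULL);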
The problem is that, on MySQL 5.1.*, when the slave calls the trigger it adds the id that was inserted on the master and NOT the id it should insert according to its own auto_increment value.
Let's say I have the following data:
On Master:
SELECT * FROM table2;
Empty set (0.08 sec)
On Slave:
SELECT * FROM table2;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)
(just ignore the fact that the slave is not a complete mirror of the master)
Given the above scenario, when I update a row from table1 on Master, the Slave stops and returns an error:
Error 'Duplicate entry '1' for key 'PRIMARY'' on query.
I don't see why the slave tries to insert a specific ID.
It's very strange that on MySQL 5.0.* this doesn't happen.

Switch to row-based replication if possible.
Auto-increment is pretty much broken for anything but the most basic cases with statement-based replication.
For any statement that generates more than one auto_increment value (via triggers, multi-row inserts, etc.), only the first auto_increment value is written to the binary log, so only the first one is guaranteed to be correct on the slave.
If the slave reads an auto_increment value from the log, but does not 'use' it, the value gets used for the next statement (which can be completely unrelated). This happens when the slave skips the corresponding insert statement for some reason (an ignored table/db in the configuration, a conditional insert in a proc/trigger, etc.).
I had a similar problem with an audit-log type table (a trigger inserts an event in table2 for every change to table1) along with several other auto-increment related problems.
I'm not sure this solution will fit your case but I'm going to post it just in case:
Add an 'updated_count' column to table1. It starts at 0 (on insert) and is incremented by 1 on every update (using BEFORE INSERT/UPDATE triggers).
Remove table2's auto_increment and change its PK to a composite key (table1_pk, updated_count). Then use table1's PK and 'updated_count' in the AFTER INSERT/UPDATE triggers to populate table2's PK.
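A minimal sketch of that workaround, assuming table1 has an integer PK named id and table2 is given the composite key (table1_id, updated_count) -- all column and trigger names here are illustrative, not taken from the original schema:
ALTER TABLE table1 ADD COLUMN updated_count INT NOT NULL DEFAULT 0;

DELIMITER //

-- Bump the per-row update counter deterministically on both master and slave
CREATE TRIGGER table1_before_update BEFORE UPDATE ON table1
FOR EACH ROW
BEGIN
  SET NEW.updated_count = OLD.updated_count + 1;
END//

-- table2 no longer uses auto_increment; its PK is derived from table1,
-- so the replicated statement produces the same key on the slave
CREATE TRIGGER table1_after_update AFTER UPDATE ON table1
FOR EACH ROW
BEGIN
  INSERT INTO table2 (table1_id, updated_count)
  VALUES (NEW.id, NEW.updated_count);
END//

DELIMITER ;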

Related

Mysql High Concurrency Updates

I have a mysql table:
CREATE TABLE `coupons` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `code` VARCHAR(255),
  `user_id` INT,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code_idx` (`code`)
) ENGINE=InnoDB;
The table consists of thousands/millions of codes and initially user_id is NULL for everyone.
Now I have a web application which assigns a unique code to thousands of users visiting the application concurrently. I am not sure what is the correct way to handle this considering very high traffic.
The query I have written is:
UPDATE coupons SET user_id = <some_id> where user_id is NULL limit 1;
And the application runs this query with say a concurrency of 1000 req/sec.
What I have observed is the entire table gets locked and this is not scaling well.
What should I do?
Thanks.
As it is understood, coupons is prepopulated and a null user_id is updated to one that is not null.
explain update coupons set user_id = 1 where user_id is null limit 1;
This likely requires an architectural solution, but you may wish to review the EXPLAIN output after ensuring that the table has indexes for the columns involved and that they facilitate rapid updates.
Adding an index to coupons.user_id, for example, alters MySQL's strategy:
create unique index user_id_idx on coupons(user_id);
explain update coupons set user_id = 1 where user_id is null limit 1;
+----+-------------+---------+------------+-------+---------------+-------------+---------+-------+------+----------+------------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key         | key_len | ref   | rows | filtered | Extra                        |
+----+-------------+---------+------------+-------+---------------+-------------+---------+-------+------+----------+------------------------------+
|  1 | UPDATE      | coupons | NULL       | range | user_id_idx   | user_id_idx | 5       | const |    6 |   100.00 | Using where; Using temporary |
+----+-------------+---------+------------+-------+---------------+-------------+---------+-------+------+----------+------------------------------+
1 row in set (0.01 sec)
So you should work with a DBA to ensure that the database entity is optimized. Trade-offs need to be considered.
Also, since you have a client application, you have the opportunity to pre-fetch null coupons.user_id and do an update directly on coupons.id. Curious to hear of your solution.
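A rough sketch of that pre-fetch idea, assuming the application first picks a candidate id and then claims it (123 and 42 are placeholder values; if another session wins the race, the UPDATE changes 0 rows and the application retries with a different id):
-- Step 1: pick a candidate coupon that is still unassigned
SELECT id FROM coupons WHERE user_id IS NULL LIMIT 1;

-- Step 2: try to claim that specific row; keeping user_id IS NULL in the
-- WHERE clause makes the claim atomic under concurrency
UPDATE coupons SET user_id = 123 WHERE id = 42 AND user_id IS NULL;

-- Step 3: check how many rows were changed; 0 means the race was lost, so retry
SELECT ROW_COUNT();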
This question might be more suitable for DBAs (and I'm not a DBA), but I'll try to give you some idea of what's going on.
InnoDB does not actually lock the whole table when you perform your update query. What it does is the following: it takes record locks that prevent any other transaction from inserting, updating, or deleting rows where the value of coupons.user_id is NULL.
With the query you have at the moment (which depends on user_id being NULL), you cannot have concurrency, because your transactions will run one after another, not in parallel.
Even an index on coupons.user_id won't help, because when taking the lock InnoDB creates a shadow index for you if you don't have one. The outcome would be the same.
So, if you want to increase your throughput, there are two options I can think of:
Assign a user to a coupon asynchronously. Put all assignment requests in a queue, then process the queue in the background. This might not be suitable for your business rules.
Decrease the number of locked records. The idea here is to lock as few records as possible while performing an update. To achieve this you can add one or more indexed columns to your table, then use that index in the WHERE clause of the UPDATE query.
An example of such a column is a product_id, or a category, or maybe a user location (country, zip).
Then your query will look something like this:
UPDATE coupons SET user_id = <user_id> WHERE product_id = <product_id> AND user_id IS NULL LIMIT 1;
Now InnoDB will lock only the records with product_id = <product_id>, and this way you'll have concurrency.
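For completeness, a hedged sketch of the supporting index (the product_id column is hypothetical here; a composite index lets InnoDB narrow the locked range to one product's unassigned coupons):
-- Assumes a product_id column already exists on coupons
ALTER TABLE coupons ADD INDEX product_user_idx (product_id, user_id);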
Hope this helps!

Insert into mysql 2 records with the second record referencing the newly created id from record 1?

I need to insert data into a mysql table in one query. The query inserts more than 1 record, and the 2nd record needs to get the id of the first one and populate it in a parentid column. I'm new at scripting queries, and I have no idea how to accomplish this.
example:
| id | parentid |
|  1 | null     |
|  2 | 1        |
You could use LAST_INSERT_ID, but I don't think this would be considered "one query":
START TRANSACTION;
INSERT INTO tablename (parent_id) VALUES (NULL);
INSERT INTO tablename (parent_id) VALUES (LAST_INSERT_ID());
COMMIT;
No, it cannot be done in a single query.
MySQL does not implement the standard SQL feature of "deferrable constraints" that would be necessary for this query (INSERT) to succeed. A solution is possible in PostgreSQL or Oracle, however.
This is not possible to achieve in MySQL, since during the insertion of the second row its foreign key constraint will fail because the first row does not yet "officially" exist -- even though it was inserted. However, if the FK constraint check were deferred until the end of the SQL statement (or until the end of the transaction), the query would complete successfully... but that's not implemented in MySQL.
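To illustrate the deferrable-constraints feature mentioned above, here is a rough PostgreSQL sketch (not MySQL; table and column names are illustrative): the self-referencing FK is declared deferrable, so both rows can be inserted with explicit ids and the constraint is only checked at commit.
CREATE TABLE tablename (
  id       integer PRIMARY KEY,
  parentid integer REFERENCES tablename (id) DEFERRABLE INITIALLY DEFERRED
);

BEGIN;
INSERT INTO tablename (id, parentid) VALUES (1, NULL), (2, 1);
COMMIT;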

How can I force MySQL to obtain a table-lock for a transaction?

I'm trying to perform an operation on a MySQL database table using the InnoDB storage engine. This operation is an INSERT-or-UPDATE type operation where I have an incoming set of data and there may be some data already in the table which must be updated. For example, I might have this table:
test_table
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int(11)      | NO   | PRI | NULL    | auto_increment |
| value | varchar(255) | NO   |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
... and some sample data:
+----+-------+
| id | value |
+----+-------+
|  1 | foo   |
|  2 | bar   |
|  3 | baz   |
+----+-------+
Now, I want to "merge" the following values:
2, qux
4, corge
My code ultimately ends up issuing the following queries:
BEGIN;
SELECT id, value FROM test WHERE id=2 FOR UPDATE;
UPDATE test SET id=2, value='qux' WHERE id=2;
INSERT INTO test (id, value) VALUES (4, 'corge');
COMMIT;
(I'm not precisely sure what happens with the SELECT ... FOR UPDATE and the UPDATE because I'm using MySQL's Connector/J library for Java and simply calling the updateRow method on a ResultSet. For the sake of argument, let's assume that the queries above are actually what are being issued to the server.)
Note: the above table is a trivial example to illustrate my question. The real table is more complicated and I'm not using the PK as the field to match when executing SELECT ... FOR UPDATE. So it's not obvious whether the record needs to be INSERTed or UPDATEd by just looking at the incoming data. The database MUST be consulted to determine whether to use an INSERT/UPDATE.
The above queries work just fine most of the time. However, when there are more records to be "merged", the SELECT ... FOR UPDATE and INSERT lines can be interleaved, where I cannot predict whether SELECT ... FOR UPDATE or INSERT will be issued and in what order.
The result is that sometimes transactions deadlock because one thread has locked a part of the table for the UPDATE operation and is waiting on a table lock (for the INSERT, which requires a lock on the primary-key index), while another thread has already obtained a table lock for the primary key (presumably because it issued an INSERT query) and is now waiting for a row-lock (or, more likely, a page-level lock) which is held by the first thread.
This is the only place in the code where this table is updated and there are no explicit locks currently being obtained. The ordering of the UPDATE versus INSERT seems to be the root of the issue.
There are a few possibilities I can think of to "fix" this.
Detect the deadlock (MySQL throws an error) and simply re-try. This is my current implementation because the problem is somewhat rare. It happens a few times per day.
Use LOCK TABLES to obtain a table-lock before the merge process and UNLOCK TABLES afterward. This evidently won't work with MariaDB Galera -- which is likely in our future for this product.
Change the code to always issue INSERT queries first. This would result in any table-level locks being acquired first and avoid the deadlock.
The problem with #3 is that it will require more complicated code in a method that is already fairly complicated (a "merge" operation is inherently complex). That more-complicated code also means roughly double the number of queries (SELECT to determine if the row id already exists, then later, another SELECT ... FOR UPDATE/UPDATE to actually update it). This table is under a reasonable amount of contention, so I'd like to avoid issuing more queries if possible.
Is there a way to force MySQL to obtain a table-level lock without using LOCK TABLES? That is, in a way that will work if we move to Galera?
I think you may be able to do what you want by acquiring a set of row and gap locks:
START TRANSACTION;
SELECT id, value
FROM test
WHERE id in (2, 4) -- list all the IDs you need to UPSERT
FOR UPDATE;
UPDATE test SET value = 'qux' WHERE id = 2;
INSERT INTO test (id, value) VALUES (4, 'corge');
COMMIT;
The SELECT query will lock the rows that already exist, and create gap locks for the rows that don't exist yet. The gap locks will prevent other transactions from creating those rows.

Reliably assign a row to a user in MySQL

I have a table with millions of rows:
id | info | uid
The uid is null by default. I want to select 10 rows and assign them to a uid, but I want to avoid any potential concurrency issues. So I think the only way to do that is to somehow select 10 rows based on certain criteria, lock those rows and then make my changes before unlocking them.
Is there a way to do row-locking in MySQL and PHP? Or is there some other way I can guarantee that this doesn't happen:
user a queries the table where uid is null
finds row 1
user b queries the table where uid is null
finds row 1
user a processes the row and sets it back to null
user b processes the row and sets it back to null
See my problem?
What you probably need is SELECT ... FOR UPDATE. With this, retrieved rows are locked until a COMMIT or a ROLLBACK is made. So you can do something like:
START TRANSACTION;
SELECT * FROM yourTable WHERE uid IS NULL FOR UPDATE;
-- UPDATE to whatever you want
COMMIT;
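Filling in that placeholder for this specific case (claim 10 rows at a time), a rough sketch might look like the following; 123 is an illustrative uid, and feeding the selected ids back into the UPDATE is left to the application:
START TRANSACTION;

-- Lock 10 unassigned rows; concurrent sessions running the same statement
-- will block until this transaction commits (MySQL 8.0+ also offers SKIP LOCKED)
SELECT id FROM yourTable WHERE uid IS NULL LIMIT 10 FOR UPDATE;

-- Claim the locked rows using the ids returned by the SELECT
UPDATE yourTable SET uid = 123 WHERE id IN (/* ids from the SELECT above */);

COMMIT;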

MySQL - Huge difference in cardinality on what should be a duplicate table

On my development server I have a column indexed with a cardinality of 200.
The table has about 6 million rows, give or take, and I have confirmed that the row count is identical on the production server.
However, the production server's index has a cardinality of 31938.
They are both MySQL 5.5; however, my dev server is Ubuntu Server 13.10 and the production server is Windows Server 2012.
Any ideas on what would cause such a difference in what should be the exact same data?
The data was loaded into the production server from a MySQL dump of the dev server.
EDIT: It's worth noting that I have queries that take about 15 minutes to run on my dev server but seem to run forever on the production server, due to what I believe to be these indexing issues. Different numbers of rows are being pulled within sub-queries.
MySQL checksums might help you verify that the tables are the same:
-- a table
create table test.t ( id int unsigned not null auto_increment primary key, r float );
-- some data ( 18000 rows or so )
insert into test.t (r) select rand() from mysql.user join mysql.user u2;
-- a duplicate
create table test.t2 select * from test.t;
-- introduce a difference somewhere in there
update test.t2 set r = 0 order by rand() limit 1;
-- and prove the tables are different easily:
mysql> checksum table test.t;
+--------+------------+
| Table  | Checksum   |
+--------+------------+
| test.t | 2272709826 |
+--------+------------+
1 row in set (0.00 sec)
mysql> checksum table test.t2
-> ;
+---------+-----------+
| Table   | Checksum  |
+---------+-----------+
| test.t2 | 312923301 |
+---------+-----------+
1 row in set (0.01 sec)
Beware that CHECKSUM TABLE locks the tables while it runs.
For more advanced functionality, the Percona Toolkit can both checksum and sync tables (though it is built around master/slave replication scenarios, so it might not be a perfect fit for you).
Beyond checksumming, you might consider looking at REPAIR TABLE or OPTIMIZE TABLE.
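For example (the table name is a placeholder, and this assumes InnoDB): OPTIMIZE TABLE rebuilds the table and refreshes its index statistics, while ANALYZE TABLE only re-samples the statistics that the reported cardinality is derived from.
-- Rebuild the table and refresh its statistics
OPTIMIZE TABLE your_table;

-- Lighter-weight alternative: re-sample index statistics only
ANALYZE TABLE your_table;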