We have this table in MySQL 5.5:
CREATE TABLE IF NOT EXISTS `invoices` (
`id` varchar(36) NOT NULL,
`client_id` smallint(4) NOT NULL,
`invoice_number` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `client_id_2` (`client_id`,`invoice_number`),
KEY `client_id` (`client_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
We insert data into that table like this:
INSERT INTO `invoices` ( `id` , `client_id` , `invoice_number` )
VALUES (
UUID(),
10 ,
( SELECT (MAX(`invoice_number`) +1) as next_invoice_number FROM `invoices` WHERE `client_id` = 10 )
);
"10" is client_id value.
It works but, it has bad concurrency. How can I have working solution, which has good concurrency?
Composite-primary-key auto increment is not a solution. We need autoincrement per client_id value. Composite-primary-key auto increment gives autoincrement all over table not per client_id column value.
I'm not sure what you mean by bad concurrency here. Although every DML statement runs in an implicit transaction, you can also wrap the work in an explicit transaction block using START TRANSACTION ... COMMIT.
It seems that MySQL really does lock the whole table on INSERT INTO ... SELECT.
Below is a strategy that works for us (for a similar problem), in pseudocode:
function insert_invoice(){
    begin_transaction
    next_invoice_number = select_max_invoice_number + 1
    insert_row(next_invoice_number)
    end_transaction
}
function perform_insert(){
    try
        insert_invoice
    catch RecordNotUniqueError
        perform_insert   // retry until no collision
    end
}
This requires performing the queries from a higher-level programming language.
You basically start a transaction in which you first read the next invoice number for the client. Afterwards you perform the insert with next_invoice_number and hope for the best. If a concurrent process tries to insert the same invoice number for the same client, the unique key makes the transaction fail for one of the processes, which can then simply retry. In the end there is no concurrent operation left for the same invoice number and every transaction succeeds.
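In SQL terms, each attempt boils down to the following block (a sketch; the retry loop lives in the application, and MySQL error 1062, ER_DUP_ENTRY, is the signal to roll back and retry):

START TRANSACTION;

-- read the current maximum for this client
SELECT IFNULL(MAX(`invoice_number`), 0) + 1 INTO @next_invoice_number
FROM `invoices`
WHERE `client_id` = 10;

-- the UNIQUE KEY (client_id, invoice_number) rejects a concurrent duplicate
INSERT INTO `invoices` (`id`, `client_id`, `invoice_number`)
VALUES (UUID(), 10, @next_invoice_number);

COMMIT;
-- on error 1062: ROLLBACK and repeat from START TRANSACTION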
I see a number of issues here.
First, for each invoice registration you are scanning the same table to find what the next invoice number should be for this particular customer.
A far faster solution is to have a table with two columns: Customer_ID (key) and Last invoice ID.
Whenever you need to register a new invoice, you simply get-and-update the new invoice number from this new table and use it in the insert.
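A minimal sketch of that counter table (the table and column names are mine, not from the answer); the LAST_INSERT_ID(expr) form makes the get-and-update a single atomic statement, so no explicit transaction is needed for the counter itself:

CREATE TABLE `invoice_counters` (
    `client_id` smallint(4) NOT NULL,
    `last_invoice_number` int(11) NOT NULL DEFAULT 0,
    PRIMARY KEY (`client_id`)
) ENGINE=InnoDB;

-- atomically increment and remember the new value for this connection
UPDATE `invoice_counters`
SET `last_invoice_number` = LAST_INSERT_ID(`last_invoice_number` + 1)
WHERE `client_id` = 10;

-- returns the value set above; per connection, so concurrency-safe
SELECT LAST_INSERT_ID();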
Second, what makes you think that the operation you are showing in your example should not lock the table?
Since collisions happen only occasionally, the best solution is to minimize the probability of one, and the approach presented here will certainly do that.
Reformulate the query this way. This is probably simpler and faster.
INSERT INTO `invoices` ( `id` , `client_id` , `invoice_number` )
SELECT UUID(),
10 ,
MAX(`invoice_number`) +1
FROM `invoices`
WHERE `client_id` = 10;
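One caveat: for a client with no invoices yet, MAX() returns NULL, so the first row would get a NULL invoice_number. A variant that seeds the sequence (assuming numbering should start at 1):

INSERT INTO `invoices` ( `id` , `client_id` , `invoice_number` )
SELECT UUID(),
       10,
       IFNULL(MAX(`invoice_number`), 0) + 1
FROM `invoices`
WHERE `client_id` = 10;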
Is this in a transaction by itself, with autocommit=1?
Or is it part of a much larger set of commands, which are possibly part of what is leading to the error?
How will you subsequently get the UUID and/or invoice_number for the client? Doesn't the application need to display them and/or store them in some other table?
I recently encountered an error in my application with concurrent transactions. Previously, auto-incrementing for a compound key was implemented in the application itself using PHP. However, as I mentioned, the ids got duplicated and all sorts of issues happened, which I painstakingly fixed manually afterward.
Now I have read about related issues and found suggestions to use a trigger.
So I am planning on implementing a trigger somewhat like this:
DELIMITER $$
CREATE TRIGGER auto_increment_my_table
BEFORE INSERT ON my_table FOR EACH ROW
BEGIN
SET NEW.id = (SELECT IFNULL(MAX(id), 0) + 1 FROM my_table WHERE type = NEW.type);
END $$
DELIMITER ;
But my doubt regarding concurrency still remains: what if this trigger is executed concurrently and both inserts get the same MAX(id) when querying?
Is this the correct way to handle my issue, or is there a better way?
An example of how to solve auto-incrementing in a compound index:
CREATE TABLE test ( id INT,
type VARCHAR(192),
value INT,
PRIMARY KEY (id, type) );
-- create an additional service table which will help;
-- MyISAM assigns AUTO_INCREMENT values per distinct prefix value (type),
-- so each type gets its own independent sequence
CREATE TABLE test_service ( type VARCHAR(192),
                            id INT AUTO_INCREMENT,
                            PRIMARY KEY (type, id) ) ENGINE = MyISAM;
-- create a trigger which will generate the id value for each new row
DELIMITER $$
CREATE TRIGGER tr_bi_test_autoincrement
BEFORE INSERT
ON test
FOR EACH ROW
BEGIN
    INSERT INTO test_service (type) VALUES (NEW.type);
    SET NEW.id = LAST_INSERT_ID();
END $$
DELIMITER ;
db<>fiddle here
Creating a service table just to auto-increment a value seems less than ideal to me. – Mohamed Mufeed
This table is extremely tiny - you may delete all records except the one with the largest autoincremented value per group at any time. – Akina
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=61f0dc36db25dd5f0cf4647d8970cdee
You can schedule the removal of the excess rows (for example, daily) in a service EVENT procedure.
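A sketch of such an event (the name and schedule are my own, and the event scheduler must be enabled):

CREATE EVENT `ev_trim_test_service`
ON SCHEDULE EVERY 1 DAY
DO
    -- keep only the row with the largest id per type
    DELETE t
    FROM `test_service` t
    JOIN ( SELECT `type`, MAX(`id`) AS max_id
           FROM `test_service`
           GROUP BY `type` ) m
      ON t.`type` = m.`type` AND t.`id` < m.max_id;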
I have managed to solve this issue.
The answer is somewhat in the direction of Akina's answer, but not quite.
The way I solved it did indeed involve an additional table, but not the one he suggested.
I created an additional table to store metadata about transactions.
E.g., I had a journals table like this:
CREATE TABLE `journals` (
`id` bigint NOT NULL AUTO_INCREMENT,
`type` smallint NOT NULL DEFAULT '0',
`trans_no` bigint NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `transaction` (`type`,`trans_no`)
)
So I created a meta_journals table like this
CREATE TABLE `meta_journals` (
`type` smallint NOT NULL,
`next_trans_no` bigint NOT NULL,
PRIMARY KEY (`type`),
)
and seeded it with all the different types of journals and the next sequence number.
And whenever I insert a new transaction into journals, I make sure to increment next_trans_no for the corresponding type in the meta_journals table. This increment is issued inside the same database TRANSACTION, i.e. between BEGIN and COMMIT.
This allowed me to use the exclusive lock that the UPDATE statement acquires on the row of meta_journals. So when two insert statements are issued concurrently for the same journal type, one has to wait until the lock acquired by the other transaction is released by its COMMIT.
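In SQL terms, the sequence looks roughly like this (a sketch, assuming journal type 1):

START TRANSACTION;

-- takes an exclusive lock on this type's row until COMMIT
UPDATE `meta_journals`
SET `next_trans_no` = `next_trans_no` + 1
WHERE `type` = 1;

SELECT `next_trans_no` INTO @trans_no
FROM `meta_journals` WHERE `type` = 1;

INSERT INTO `journals` (`type`, `trans_no`) VALUES (1, @trans_no);

COMMIT;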
I am currently facing an issue with designing a database table and updating/inserting values into it.
The table is used to collect and aggregate statistics that are identified by:
the source
the user
the statistic
an optional material (e.g. item type)
an optional entity (e.g. animal)
My main issue is that my proposed primary key is too large because of the VARCHARs that are used to identify a statistic.
My current table is created like this:
CREATE TABLE `Statistics` (
`server_id` varchar(255) NOT NULL,
`player_id` binary(16) NOT NULL,
`statistic` varchar(255) NOT NULL,
`material` varchar(255) DEFAULT NULL,
`entity` varchar(255) DEFAULT NULL,
`value` bigint(20) NOT NULL)
In particular, the server_id is configurable, the player_id is a UUID, statistic is the representation of an enumeration that may change, and material and entity likewise. The value is then aggregated using SUM() to calculate the overall statistic.
So far it works, but I have to use DELETE and INSERT statements whenever I want to update a value, because I have no primary key and I can't figure out how to create one within the constraints of MySQL (an index key over all of these VARCHAR columns would exceed the maximum key length).
My main question is: how can I efficiently update values in this table, and insert them when they are not currently present, without resorting to deleting all the rows and inserting new ones?
The main issue seems to be the restriction MySQL puts on the primary key length. I don't think adding an id column would solve this.
Simply add an auto-incremented id:
CREATE TABLE `Statistics` (
statistics_id int auto_increment primary key,
`server_id` varchar(255) NOT NULL,
`player_id` binary(16) NOT NULL,
`statistic` varchar(255) NOT NULL,
`material` varchar(255) DEFAULT NULL,
`entity` varchar(255) DEFAULT NULL,
`value` bigint(20) NOT NULL
);
Voila! A primary key. But you probably want an index as well. One that comes to mind:
create index idx_statistics_server_player_statistic on statistics(server_id, player_id, statistic);
Depending on what your code looks like, you might want additional or different keys in the index, or more than one index.
Follow the steps below; I hope it will solve your problem:
- First, add a column to your table, let's call it detailed.
- In your project, before running the INSERT statement, get the maximum of detailed (SELECT MAX(detailed)+1 AS maxid FROM TABLE_NAME) and use it as the row number; this will help you FETCH and DELETE the record.
- You can also UPDATE with it, but during an UPDATE the maximum of detailed is not required.
I hope you understand this and that it helps.
I have dug a bit more through the internet and optimized my code a lot.
I asked this question because of bad performance, which I assumed was caused by the DELETE and INSERT statements following each other.
I was thinking that I could reduce the load with INSERT IGNORE statements followed by UPDATE statements, or with INSERT ... ON DUPLICATE KEY UPDATE statements. But those require keys to be useful, which I didn't have access to because of the constraints in MySQL.
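For reference, a sketch of how such a key could have been built despite the length limit: a UNIQUE index over column prefixes (the prefix length 32 and the sample values are my own guesses, and note that rows with NULL material/entity are not deduplicated, since a UNIQUE index treats NULLs as distinct):

CREATE UNIQUE INDEX uniq_statistics_identity
    ON Statistics (server_id(32), player_id, statistic(32), material(32), entity(32));

-- with a unique key in place, upserts become possible
INSERT INTO Statistics (server_id, player_id, statistic, material, entity, value)
VALUES ('server-1', UNHEX(REPLACE(UUID(), '-', '')), 'MINE_BLOCK', 'STONE', NULL, 10)
ON DUPLICATE KEY UPDATE value = value + VALUES(value);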
I have fixed the performance issues though:
By reducing the number of statements generated asynchronously (I know JDBC is blocking, but it worked; it just blocked thousands of threads) and disabling auto-commit, I was able to improve the performance by a factor of 600 (from 60 seconds down to 0.1 seconds).
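In SQL terms, the auto-commit change amounts to batching many statements into a single transaction, paying for one flush at COMMIT instead of one per statement; a sketch:

SET autocommit = 0;

-- many statements, one transaction, one disk flush at COMMIT
INSERT INTO `Statistics` (`server_id`, `player_id`, `statistic`, `value`)
VALUES ('server-1', UNHEX(REPLACE(UUID(), '-', '')), 'JUMP', 1);
-- ... the rest of the batch ...

COMMIT;
SET autocommit = 1;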
Next steps are to improve the connection string and gain even more performance.
I want to update a statistics count in MySQL.
The SQL is as follows:
REPLACE INTO `record_amount`(`source`,`owner`,`day_time`,`count`) VALUES (?,?,?,?)
Schema:
CREATE TABLE `record_amount` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'id',
`owner` varchar(50) NOT NULL ,
`source` varchar(50) NOT NULL ,
`day_time` varchar(10) NOT NULL,
`count` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `src_time` (`owner`,`source`,`day_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
However, it caused a DEADLOCK exception when run from multiple processes (i.e. Map-Reduce).
I've read some materials online and am confused about those locks. I know InnoDB uses row-level locks. I could just use a table lock to solve the business problem, but that is a little extreme. I found some possible solutions:
change REPLACE INTO to transaction with SELECT id FOR UPDATE and UPDATE
change REPLACE INTO to INSERT ... ON DUPLICATE KEY UPDATE
I have no idea which one is practical and better. Can someone explain it, or offer some links for me to read and study? Thank you!
Are you building a summary table, one source row at a time, and effectively doing UPDATE ... count = count+1? Throw away the code and start over. MAP-REDUCE on that is like using a sledgehammer on a thumbtack.
INSERT INTO summary (source, owner, day_time, count)
SELECT source, owner, day_time, COUNT(*)
FROM raw
GROUP BY source, owner, day_time
ON DUPLICATE KEY UPDATE count = count + VALUES(count);
A single statement approximately like that will do all the work at virtually disk I/O speed. No SELECT ... FOR UPDATE. No deadlocks. No multiple threads. Etc.
Further improvements:
Get rid of the AUTO_INCREMENT; turn the UNIQUE into the PRIMARY KEY.
day_time -- is that a DATETIME truncated to an hour? (Or something like that.) Use DATETIME; you will have much more flexibility in querying. A schema sketch with both changes follows.
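A sketch of the slimmed-down table with both suggestions applied (assuming day_time really is an hour-granularity timestamp):

CREATE TABLE `record_amount` (
    `owner` varchar(50) NOT NULL,
    `source` varchar(50) NOT NULL,
    `day_time` datetime NOT NULL,  -- e.g. 2024-01-01 13:00:00, truncated to the hour
    `count` int(11) NOT NULL,
    PRIMARY KEY (`owner`,`source`,`day_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;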
To discuss further, please elaborate on the source data (CREATE TABLE, number of rows, frequency of processing, etc.) and other details. If this is really a Data Warehouse application with a summary table, I may have more suggestions.
If the data is coming from a file, do LOAD DATA to shovel it into a temp table raw so that the above INSERT ... SELECT can work. If it is of manageable size, make raw ENGINE=MEMORY to avoid any I/O for it.
If you have multiple feeds, my high-speed-ingestion blog discusses how to have multiple threads without any deadlocks.
I have a problem similar to
SQL: selecting rows where column value changed from previous row
The accepted answer by ypercube, which I adapted to:
CREATE TABLE `schange` (
`PersonID` int(11) NOT NULL,
`StateID` int(11) NOT NULL,
`TStamp` datetime NOT NULL,
KEY `tstamp` (`TStamp`),
KEY `personstate` (`PersonID`, `StateID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `states` (
`StateID` int(11) NOT NULL AUTO_INCREMENT,
`State` varchar(100) NOT NULL,
`Available` tinyint(1) NOT NULL,
`Otherstatuseshere` tinyint(1) NOT NULL,
PRIMARY KEY (`StateID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SELECT
COALESCE((@statusPre <> s.Available), 1) AS statusChanged,
c.PersonID,
c.TStamp,
s.*,
@statusPre := s.Available
FROM schange c
INNER JOIN states s USING (StateID),
(SELECT @statusPre:=NULL) AS d
WHERE PersonID = 1 AND TStamp > "2012-01-01" AND TStamp < "2013-01-01"
ORDER BY TStamp ;
The query itself worked just fine in testing, and with the right mix of temporary tables I was able to generate reports with daily sum availability from a huge pile of data in virtually no time at all.
The real problem came when I discovered that the tables were using the MyISAM engine, which we have completely abandoned. I recreated the tables to use InnoDB and noticed the query no longer works as expected.
After some bashing of head into wall I discovered that MyISAM seems to go over the columns of each row in order (selecting statusChanged before updating @statusPre), while InnoDB seems to do all the variable assignments first, and only after that populate the result rows, regardless of whether the assignment happens in the SELECT or WHERE clause, in functions (COALESCE, comparisons, etc.), subqueries or otherwise.
Trying to accomplish this in a query without variables always seems to end the same way: a subquery requiring exponentially more time the more rows are in the set, resulting in an excruciating minutes- (or hours-) long wait to get the beginning and ending events for one status, while a finished report should include daily sums of multiple statuses.
Can this type of query work on the InnoDB engine, and if so, how should one go about it?
Or is the only feasible option to go for a database product that supports WITH statements?
Removing
KEY personstate (PersonID, StateID)
fixes the problem.
No idea why, though; presumably the index changes the execution plan, and with it the order in which rows are read and the variable assignments evaluated. It was not really required anyway; the timestamp key is the more important one and speeds up the query nicely.
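For what it's worth, on MySQL 8.0+ the user-variable trick (and its engine-dependent evaluation order) can be avoided entirely with a window function; a sketch against the same schema:

SELECT c.PersonID,
       c.TStamp,
       s.Available,
       -- the first row of the window has no predecessor, so LAG() is NULL; treat it as changed
       COALESCE(s.Available <> LAG(s.Available) OVER (ORDER BY c.TStamp), 1) AS statusChanged
FROM schange c
JOIN states s USING (StateID)
WHERE c.PersonID = 1 AND c.TStamp > '2012-01-01' AND c.TStamp < '2013-01-01'
ORDER BY c.TStamp;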
I have a table called promotion_codes:
CREATE TABLE promotion_codes (
id int(10) UNSIGNED NOT NULL auto_increment,
created_at datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
code varchar(255) NOT NULL,
order_id int(10) UNSIGNED NULL DEFAULT NULL,
allocated_at datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This table is pre-populated with available codes that will be assigned to orders that meet a specific criteria.
What I need to ensure is that after the ORDER is created, that I obtain an available promotion code and update its record to reflect that it has been allocated.
I am not 100% sure how to avoid grabbing the same record twice if simultaneous requests come in.
I have tried locking the row during a select and locking the row during an update - both still seem to allow a second (simultaneous) attempt to grab the same record - which is what I want to avoid:
UPDATE promotion_code
SET allocated_at = "' . $db_now . '", order_id = ' . $donation->id . '
WHERE order_id IS NULL LIMIT 1
You can add a second table which holds all used codes. Then you can use a unique constraint in this assignment table to make sure that a code is not assigned twice.
CREATE TABLE `used_codes` (
    `usage` INTEGER AUTO_INCREMENT PRIMARY KEY,
    `id` INTEGER NOT NULL UNIQUE, -- this makes sure that there are no two assignments of one code
    `allocated_at` DATETIME NOT NULL
);
You add the ID of a used code into the used_codes table, and afterwards query which code you just used. When these two operations are in one transaction, the entire transaction will fail when there is a second attempt to use the same code.
I did not test the following code; you might need to adjust it.
Also you need to make sure that your server meets the requirements for transactions (i.e. the tables use a transactional engine such as InnoDB).
-- There are changes which have to be atomic, so don't use autocommit
SET autocommit = 0;
START TRANSACTION;

-- pick one not-yet-used code and mark it as used
INSERT INTO `used_codes` (`id`, `allocated_at`)
SELECT `id`, NOW()
FROM `promotion_codes`
WHERE `id` NOT IN (SELECT `id` FROM `used_codes`)
LIMIT 1;

-- fetch the code that was just marked; LAST_INSERT_ID() is per-connection,
-- so parallel transactions do not see each other's value
SELECT `code` FROM `promotion_codes`
WHERE `id` = (SELECT `id` FROM `used_codes` WHERE `usage` = LAST_INSERT_ID());

COMMIT;
You can use the returned code if the transaction succeeded. If there was more than one process trying to use the same code, only one of them succeeds, while the rest fail with insert errors about the duplicated row. In your software you need to distinguish between the duplicated-row error and other errors, and re-execute the statement on duplication errors.
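As an aside, the row-locking attempts described in the question usually do work once the read and the claim share one transaction; a minimal sketch of that pattern (the literal id 42 and order_id 123 stand in for values supplied by the application):

START TRANSACTION;

-- lock one free code; a concurrent transaction blocks here until COMMIT
SELECT `id`, `code`
FROM `promotion_codes`
WHERE `order_id` IS NULL
LIMIT 1
FOR UPDATE;

-- claim the row just read (id 42 here), stamping the order that owns it
UPDATE `promotion_codes`
SET `allocated_at` = NOW(), `order_id` = 123
WHERE `id` = 42;

COMMIT;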