I have an application which sends bulk invitations to my users. For that I am inserting thousands of records into a single table. My stored procedure accepts a comma-separated string from the user as a parameter, then splits it on commas, parses each email out of the string in a loop, and inserts each individual email as a record in the table.
The main problem is that when multiple users send their requests to this stored procedure at the same time, MySQL throws a "deadlock" error, because each user connects to MySQL over a different connection.
So, my question is: what is the proper solution for this kind of task? Or is this a problem with my database configuration? I am using an Amazon RDS (MySQL) large instance, and a user can send 2,000 emails at a time. One more thing: I am not using transaction... commit... rollback.
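For reference, the procedure's loop looks roughly like this (a minimal sketch with illustrative table and procedure names, not my exact code):

DELIMITER //
CREATE PROCEDURE send_invitations(IN email_list TEXT)
BEGIN
  DECLARE next_email VARCHAR(255);
  WHILE LENGTH(email_list) > 0 DO
    -- take everything before the first comma as the next address
    SET next_email = TRIM(SUBSTRING_INDEX(email_list, ',', 1));
    -- drop the consumed address (and its comma) from the input
    SET email_list = IF(LOCATE(',', email_list) > 0,
                        SUBSTRING(email_list, LOCATE(',', email_list) + 1),
                        '');
    INSERT INTO invitations (email) VALUES (next_email);
  END WHILE;
END //
DELIMITER ;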
I have posted this use case as a question earlier, but I didn't get any proper answer. Here are those links:
1) Deadlock found when trying to get lock; try restarting transaction
2) https://stackoverflow.com/questions/19091968/deadlock-found-when-trying-to-get-lock-try-restarting-transaction-2nd-try
Thanks
I am working on a high-scale application on the order of 35,000 QPS, using Hibernate and MySQL.
A large table has an auto-increment primary key, and the generation strategy defined in Hibernate is IDENTITY. Show SQL is enabled as well.
Whenever an insert happens, I see only one query being fired in the DB, which is an INSERT statement.
A few questions follow:
1) I was wondering: how does Hibernate get the auto-increment value after the insert?
2) If the answer is "SELECT LAST_INSERT_ID()", why does it not show up in VividCortex or in the Show SQL logs?
3) How does "SELECT LAST_INSERT_ID()" account for multiple auto-increments in different tables?
4) If MySQL returns a value on insert, why aren't the MySQL clients built so that we can see what is being returned?
Thanks in Advance for all the help.
You should call SELECT LAST_INSERT_ID().
Practically, you can't do the same thing as the MySQL JDBC driver using another MySQL client. You'd have to write your own client that reads and writes the MySQL protocol.
The MySQL JDBC driver gets the last insert id by parsing packets of the MySQL protocol. The last insert id is returned in this protocol by a MySQL result set.
This is why SELECT LAST_INSERT_ID() doesn't show up in query metrics. It's not calling that SQL statement, it's picking the integer out of the result set at the protocol level.
You asked how it's done internally. A relevant line of code is https://github.com/mysql/mysql-connector-j/blob/release/8.0/src/main/protocol-impl/java/com/mysql/cj/protocol/a/result/OkPacket.java#L55
Basically, it parses an integer from a known position in a packet as it receives a result set.
I'm not going to go into any more detail about parsing the protocol. I don't have experience coding a MySQL protocol client, and it's not something I wish to do.
I think it would not be a good use of your time to implement your own MySQL client.
It probably uses the standard JDBC mechanism to get generated values.
It's not.
You execute it immediately after inserting into one table, and you thus get the values that were generated by that insert. But that's not what is being used, so it's irrelevant.
Not sure what you mean by that: the MySQL JDBC driver allows doing that, using the standard JDBC API.
(Too long for a comment.)
SELECT LAST_INSERT_ID() uses the value already available in the connection. (This may explain its absence from any log.)
Each table has its own auto_inc value.
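A quick way to see both behaviors (throwaway tables, just for illustration):

CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, v INT);
CREATE TABLE t2 (id INT AUTO_INCREMENT PRIMARY KEY, v INT);

INSERT INTO t1 (v) VALUES (10);   -- generates t1.id = 1
INSERT INTO t2 (v) VALUES (20);   -- generates t2.id = 1 (its own counter)

SELECT LAST_INSERT_ID();          -- 1: the value from the most recent
                                  -- insert on THIS connection (t2's)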
(I don't know any details about Hibernate.)
35K qps is possible, but it won't be easy.
Please give us more details on the queries -- SELECTs? writes? 35K INSERTs?
Are you batching the inserts in any way? You will need to do so (see the sketch below).
What do you then use the auto_inc value for?
Do you use BEGIN..COMMIT? What value of autocommit?
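For example, a single multi-row INSERT replaces many round trips (illustrative table):

INSERT INTO t (a, b) VALUES
    (1, 'x'),
    (2, 'y'),
    (3, 'z');   -- one statement and one commit instead of three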
I am using Node.js with the Express.js framework on top of it, and a MySQL database.
I have an endpoint for registration that takes 3 params:
Email, username, and password
It then queries the database using SELECT to see if the username or email is taken, and if not, it continues to hash the password, create a new row in the database, send a confirmation email, and so on.
The problem is that when someone submits two POST requests in quick succession, the time it takes to process the insertion lets two users end up with the same username/email.
Basically what is happening is that the second request queries the database before the first request has even inserted the data (the new user), and therefore the second request's check reports that the username and the email are free.
I was wondering how I can prevent issues like that in the future.
In a race condition like this, the place where the buck should stop is the database itself. So, you should add a unique constraint on the username field, if you don't already have one:
ALTER TABLE users ADD CONSTRAINT username_unique UNIQUE (username);
What will happen now, if two threads come in at almost the same time, is that each request will work its way through the code. But only one request will obtain a lock to write the new user record to the table. The other request will fail with a database error, which your Node application should be able to catch.
Note that you might also want a unique constraint on the email field.
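For example (assuming a users table shaped like the one implied above):

ALTER TABLE users ADD CONSTRAINT email_unique UNIQUE (email);

-- With the constraints in place, the slower of two near-simultaneous
-- requests fails loudly instead of creating a duplicate:
INSERT INTO users (username, email) VALUES ('alice', 'a@example.com');
INSERT INTO users (username, email) VALUES ('alice', 'b@example.com');
-- ERROR 1062 (23000): Duplicate entry 'alice' for key 'username_unique'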
Rather than making modifications to the schema, I suggest you deal with this in your Node app.
Using mutex locks is the ideal approach to solving these kinds of problems. There are different packages, like redlock, to solve these issues, although redlock needs Redis to work. There are other modules that don't require Redis.
Also have a read: Mutex locks in Node.
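(As an aside, and not a Node module: MySQL itself offers named advisory locks via GET_LOCK(), which give the same mutual exclusion; a sketch, with an illustrative lock name:)

SELECT GET_LOCK('register:alice', 5);    -- acquire, wait up to 5 seconds
-- ... run the SELECT check, then the INSERT ...
SELECT RELEASE_LOCK('register:alice');   -- always release when finished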
I have a large amount of data in a MySQL database. I want to poll data from the database and push it to ActiveMQ with Camel. The connection between the database and the queue is lost every 15 minutes, and some of the messages are lost during the interruption. I need to know which messages were lost so I can poll them again from the database, but the messages should not be sent more than once. And this should be done without any changes to the database schema (I cannot add a Boolean status field to my database).
Any suggestion is welcome.
Essentially, you need to have some unique identifier in the data you pull from the source database. Maybe it is whatever has already been defined as the primary key. Or, maybe the table has some timestamp field. Or, maybe some combination of fields will be unique.
Once you identify that, when you are putting the data into the target, reject any key that is already in the target. You could use Camel's "idempotency" features, but if you are able to check for the key in the target database, you probably won't need anything else.
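For example, if the target table keys on the source's identifier, a redelivered row is simply skipped (illustrative table and column names):

INSERT IGNORE INTO target_messages (source_id, payload)
VALUES (?, ?);   -- a row whose source_id already exists is skipped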
If you have to make the decision about what to send, but do not have access to your remote database from App #1, you'll need to keep a record on the other side of the firewall.
You would need to do this, even if the connection did not break every 15 minutes...because you could have failures for other reasons.
If you can have an idempotency database for App #1, another approach could be to transfer data from the local database to some other local table, and read from this. Then you poll this other table, and delete whenever the send is successful.
Example:
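(A sketch of that staging-table idea; the table and column names are illustrative.)

CREATE TABLE outbox LIKE source_data;

-- enqueue locally, using whatever selection you poll with:
INSERT INTO outbox
SELECT * FROM source_data WHERE id > ?;

-- after a row is confirmed delivered to the queue:
DELETE FROM outbox WHERE id = ?;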
It looks like you're using MySQL. If both databases are on MySQL, you could look into MySQL data replication, rather than using your own app with Camel.
I am sending bulk invitations to my users from my website. For that I am passing a comma-separated string of emails to my stored procedure in MySQL. In this stored procedure a while loop parses each email (using SUBSTRING(), separated by commas) and checks the existing database, then inserts it into the table if it's absent, or generates an email link with a GUID if that email already exists. The process works fine for small batches (e.g., below 200-250 emails), but if the batch is larger (250+ emails), the whole process gets stuck and subsequent requests get deadlock errors (the original error is: "Deadlock found when trying to get lock; try restarting transaction"). So I have planned to do the while loop in my JavaScript or C# file (programming-language file) instead, and send each email to the stored procedure individually.
In that scenario the number of MySQL connections would increase, and a max-connections error might occur.
So I want to ask: what is the best method for doing this kind of job with MySQL?
I think giving the emails to the procedure one at a time is a correct solution, yet you don't need to make a new connection, or even a new request, for each item. Most languages support prepared statement execution (and here's the answer on how to use them in C#).
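At the SQL level, the pattern is one prepared CALL reused for every email, something like this (the procedure name is illustrative):

PREPARE send_invite FROM 'CALL insert_invitation(?)';
SET @email = 'user1@example.com';
EXECUTE send_invite USING @email;
SET @email = 'user2@example.com';
EXECUTE send_invite USING @email;
-- ...one EXECUTE per address...
DEALLOCATE PREPARE send_invite;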
The deadlocks, in turn, can be caused by your own code, but without a snippet of it, it's hard to tell. Maybe your procedure isn't re-entrant, or the data can be accessed from some other location.
I'm getting data from an MSSQL DB ("A") and inserting it into a MySQL DB ("B") using the creation date in the MSSQL DB. I'm doing it with simple logic, but there's got to be a faster and more efficient way of doing this. Below is the sequence of steps involved:
1) Create one connection for the MSSQL DB and one connection for the MySQL DB.
2) Grab all of the data from A that meets the date range criterion provided.
3) Check to see which of the data obtained are not present in B.
4) Insert these new data into B.
As you can imagine, step 2 is basically a loop, which can easily max out the time limit on the server, and I feel like there must be a way of doing this much faster, ideally while the first query is made. Can anyone point me in the right direction to achieve this? Can you make "one" connection to both of the DBs and do something like below?
SELECT * FROM A.some_table_in_A.some_column WHERE
"it doesn't exist in" B.some_table_in_B.some_column
A linked server might suit this:
A linked server allows for access to distributed, heterogeneous queries against OLE DB data sources. After a linked server is created, distributed queries can be run against this server, and queries can join tables from more than one data source. If the linked server is defined as an instance of SQL Server, remote stored procedures can be executed.
Check out this HOWTO as well
If I understand your question right, you're just trying to move things from the MSSQL DB into the MySQL DB. I'm also assuming there is some sort of filter criterion you're using to do the migration. If this is correct, you might try using a stored procedure in MSSQL that can query the MySQL database with a distributed query. You can then use that stored procedure to do the loops or checks on the database side, and the front-end server will only need to make one connection.
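A rough sketch of that idea, assuming a linked server named MYSQL_LINK has already been created, with illustrative table and column names:

DECLARE @date_from DATETIME = '2013-01-01',
        @date_to   DATETIME = '2013-12-31';

INSERT OPENQUERY(MYSQL_LINK, 'SELECT id, created_at, payload FROM b_table')
SELECT a.id, a.created_at, a.payload
FROM a_table AS a
WHERE a.created_at BETWEEN @date_from AND @date_to
  AND NOT EXISTS (
      SELECT 1
      FROM OPENQUERY(MYSQL_LINK, 'SELECT id FROM b_table') AS b
      WHERE b.id = a.id
  );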
If the MySQL database has a primary key defined, you can at least skip step 3 ("Check to see which of the data obtained are not present in B"). Use INSERT IGNORE INTO... and it will attempt to insert all the records, silently skipping over ones where a record with the primary key already exists.