Is MySQL handling one SQL query at a time?

If you have 100,000 users, is MySQL executing one SQL query at a time?
Because in my PHP code I check if a certain row exists; if it doesn't, I create one. If it does, I just update the row counter.
It crossed my mind that perhaps 100 users are checking if the row exists at the same time, and when it doesn't they all create one row each.
If MySQL handles them sequentially, I know it won't be an issue: one user checks if the row exists and, since it doesn't, creates it; the next user checks, finds that it exists, and just updates the counter.
But if they all check whether it exists at the same time, and let's say it doesn't, then they each create a row and the whole table structure will fail.
Would be great if someone could shed some light on this topic.

Use a UNIQUE constraint or, if viable, make one of your data items the primary key, and MySQL will prevent duplicate rows from being created. You can even use the ON DUPLICATE KEY UPDATE ... syntax to specify the alternative operation if the row already exists.
From your comments, it sounds like you could use the user_id as your primary key, in which case, you'd be able to use something like this:
INSERT INTO usercounts (user_id,usercount)
VALUES (id-goes-here,1)
ON DUPLICATE KEY UPDATE usercount=usercount+1;

If you put the check and the insert into a transaction and lock the row you check (for example with SELECT ... FOR UPDATE), you can avoid this problem. That way the check and the create run as one atomic unit, and there shouldn't be any confusion.
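A minimal sketch of that approach, reusing the usercounts example above with 42 standing in for the real user id (this assumes InnoDB, since MyISAM has no transactions; note that two sessions can still deadlock on the gap lock here, which is why the ON DUPLICATE KEY answer is simpler):
START TRANSACTION;
-- Lock the row (or the gap where it would go) so concurrent checks wait
SELECT usercount FROM usercounts WHERE user_id = 42 FOR UPDATE;
-- If the SELECT returned no row:
INSERT INTO usercounts (user_id, usercount) VALUES (42, 1);
-- ...otherwise:
-- UPDATE usercounts SET usercount = usercount + 1 WHERE user_id = 42;
COMMIT;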

Related

Adding a UNIQUE key to a large existing MySQL table which is receiving INSERTs/DELETEs

I have a very large table (dozens of millions of rows) and a UNIQUE index needs to be added to a column on that table. I know for a fact that the table does contain duplicated values on that key, which I need to clean up (by deleting rows/resetting the value of the column to something unique that I can automatically generate). A plus is that the rows which are already duplicated do not get modified anymore.
What would be the right approach to perform a change like this, given that I will probably be using the Percona pt-osc tool and there are continuous deletes/inserts on the table? My plan was:
1. Add code that ensures no duplicate IDs get inserted anymore. I'll probably need a temporary separate table for this, since I want the database, not the application, to enforce it: insert into the "shadow table" (which has a unique index) in the same transaction as the main table, and roll back any insert that tries to insert a duplicate value (sketched below).
2. Backfill the table by zapping all invalid column values that fall within the primary key range below $current_pkey_value.
3. Then add the index and use pt-osc to change over the table.
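Roughly what I mean by step 1 (table and column names here are made up for illustration):
CREATE TABLE dedupe_shadow (
  the_key BIGINT NOT NULL,
  UNIQUE KEY uniq_the_key (the_key)
);

START TRANSACTION;
-- This fails with a duplicate-key error if the_key was already claimed
INSERT INTO dedupe_shadow (the_key) VALUES (12345);
INSERT INTO big_table (the_key, other_col) VALUES (12345, '...');
COMMIT;  -- the application rolls back instead if the shadow insert failed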
Is there anything I am missing?
Since we use pt-online-schema-change, the synchronisation from the existing table to the temp table is performed via triggers. The tool actually has a special option for this, --no-check-unique-key-change, which does exactly what we need: it agrees to perform the ALTER TABLE and sets up the triggers so that, if a conflict occurs, INSERT IGNORE is applied and the first row to use the now-unique value wins during synchronisation. For us this is a good tradeoff, because all the duplicates we have seen resulted from data races, not from actual conflicts in the value-generation process.
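A sketch of the invocation (database, table, column, and index names are placeholders; verify the option against your pt-osc version's documentation before relying on it):
pt-online-schema-change \
  --alter "ADD UNIQUE INDEX uniq_mycol (mycol)" \
  --no-check-unique-key-change \
  D=mydb,t=mytable --execute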

Which technique is more efficient for replacing records

I have an app that has to import TONS of data from a remote source. From 500 to 1500 entries per call.
Sometimes some of the incoming data will need to replace data already stored in the DB. If I had to guess, I would say about one in 300 or 400 entries needs to be replaced.
Each incoming entry has a unique ID. So I am trying to figure out if it is more efficient to always issue a delete command based on this ID or to check if there is already an entry THEN delete.
I found this SO post that talks about the heavy work a DB has to do to delete something. But it is discussing a different issue, so I'm not sure if it applies here.
Neither. Use INSERT ... ON DUPLICATE KEY UPDATE ....
Since you are using MySQL and you have a unique key then let MySQL do the work.
You can use
INSERT ... ON DUPLICATE KEY UPDATE
MySQL will try to insert a new record into the table; if the unique value already exists, MySQL will instead update all the fields you have set after ON DUPLICATE KEY UPDATE.
You can read more about the INSERT ... ON DUPLICATE KEY UPDATE syntax at
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
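A concrete sketch for this import case (table and column names are invented; VALUES(payload) refers to the value the INSERT would have written):
INSERT INTO entries (entry_id, payload)
VALUES (12345, 'fresh data from the remote source')
ON DUPLICATE KEY UPDATE payload = VALUES(payload);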

Check for the respective data in a specific column and, if not detected, insert; otherwise update

I need to insert a new row into table foo. But before inserting the data, I need to check whether a row has already been inserted for the respective username. If one has, I need to update the existing data with the new data.
I know how to do this using a PHP if condition. But I'd love to do it using MySQL functions/statements in just one line. Can anyone please help me?
For the example, kindly use the following statement; it is what should be updated:
$in = "insert into foo(username, text) values('user-x', 'user-x-text')";
Mysql_query($in);
When searching for similar questions, I found this post: Similar question with an answer. But I struggled to use that solution, since I don't know whether that code snippet will drag down server resources (speed, etc.), because this script will run about 20 times per user.
Thank you.
I think INSERT ... ON DUPLICATE KEY UPDATE should be able to work.
Make username a UNIQUE index; it doesn't have to be a primary key.
If I'm not mistaken, DUPLICATE KEY fires only when you have a collision on a column that is either a primary key or a unique index. In your case, the text column is neither, so it will be ignored for collisions.
INSERT ... ON DUPLICATE KEY UPDATE works on unique indexes, as confirmed by the MySQL docs.
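Putting it together for the statement in the question (a sketch; the index name is made up, and text is backtick-quoted since it is also a MySQL keyword):
ALTER TABLE foo ADD UNIQUE KEY uniq_username (username);

INSERT INTO foo (username, `text`)
VALUES ('user-x', 'user-x-text')
ON DUPLICATE KEY UPDATE `text` = VALUES(`text`);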

MySQL - Split up INSERT into 2 queries maybe

I have an INSERT query which looks like:
$db->Query("INSERT INTO `surfed` (user, site) VALUES('".$data['id']."', '".$id."')");
Basically I want to insert just like the above query, but if the site has already been submitted by another user I don't want to re-submit the same $id into the site column. Multiple users can view the same site, though, and all users need their own row for each site they have viewed, which causes the surfed table to take tens of thousands of inserts and dramatically slows down the site.
Is there any way to split up the insert so that if a site has already been submitted it won't be submitted again for another user? Maybe there's a way to use UPDATE so that there isn't an overload of inserts?
Thanks,
I guess the easiest way to do it would be to set up a stored procedure that executes a SELECT to check whether the user-site combination is already in the table. If it isn't, you execute the INSERT; if the combination already exists, you're done and skip the insert.
Check out the manual on stored procedures
http://dev.mysql.com/doc/refman/5.1/en/create-procedure.html
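A minimal sketch of such a procedure (the INT parameter types are guesses based on the query above; note that plain check-then-insert is still racy under concurrency unless a unique key backs it up):
DELIMITER //
CREATE PROCEDURE insert_surfed(IN p_user INT, IN p_site INT)
BEGIN
  -- Only insert when this user-site combination isn't already present
  IF NOT EXISTS (SELECT 1 FROM surfed WHERE user = p_user AND site = p_site) THEN
    INSERT INTO surfed (user, site) VALUES (p_user, p_site);
  END IF;
END //
DELIMITER ;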
You need a conditional statement that asks whether the id already exists: if it does, update; otherwise, insert.
If you don't need to know whether you actually inserted a line, you can use INSERT IGNORE ....
$db->Query("INSERT IGNORE INTO `surfed` (user, site) VALUES('".$data['id']."', '".$id."')");
But this assumes that you have a unique key defined for the columns.
IGNORE here will ignore the Integrity constraint violation error triggered by attempting to add the same unique key twice.
The MySQL Reference Manual on the INSERT syntax has some information on that: http://dev.mysql.com/doc/refman/5.5/en/insert.html
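For reference, the unique key that INSERT IGNORE relies on here could be added like this (the index name is made up):
ALTER TABLE surfed ADD UNIQUE KEY uniq_user_site (user, site);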

Preventing duplicate rows based on a column (MySQL)?

I'm building a system that updates its local database from other APIs frequently. I have Python scripts set up as cron jobs, and they do the job almost fine.
However, the one flaw is that the scripts take ages to run. When they are run for the first time, the process is quick, but after that it takes nearly 20 minutes to go through a list of 200k+ items received from the third-party API.
The problem is that the script first gets all the rows from the database and adds their must-be-unique column value to a list. Then, when going through the API results, it checks whether the current item's must-be-unique value exists in the list. This gets really heavy, as the list has over 200k values in it.
Is there a way to check in an INSERT query that, based on a single column, there is no duplicate, and if there is, simply not add the new row?
Any help will be appreciated =)
If you add a UNIQUE key to the column(s) that have to contain UNIQUE values, MySQL will complain when you insert a row that violates this constraint.
You then have three options:
INSERT IGNORE will try to insert, and in case of violation, do nothing.
INSERT ... ON DUPLICATE KEY UPDATE will try to insert, and in case of violation, update the row to the new values.
REPLACE will try to insert, and in case of violation, DELETE the offending existing row, and INSERT the new one.
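A sketch of all three options, assuming a table along these lines (names invented for illustration):
CREATE TABLE items (
  id INT AUTO_INCREMENT PRIMARY KEY,
  api_id VARCHAR(64) NOT NULL,
  payload TEXT,
  UNIQUE KEY uniq_api_id (api_id)
);

-- Option 1: silently skip the duplicate
INSERT IGNORE INTO items (api_id, payload) VALUES ('abc123', '...');

-- Option 2: update the existing row with the new values
INSERT INTO items (api_id, payload) VALUES ('abc123', '...')
ON DUPLICATE KEY UPDATE payload = VALUES(payload);

-- Option 3: delete the offending row, then insert the new one
REPLACE INTO items (api_id, payload) VALUES ('abc123', '...');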