PHP/MySQL: is a check for a unique value needed on a UNIQUE column? - mysql

Is it necessary to check for a unique value before inserting it into a database if unique_col is already defined as a unique key?
For example:
SELECT unique_col FROM table WHERE unique_col = :unique_value
INSERT INTO table (unique_col) VALUES (:unique_value)

Is it necessary to check? That depends on how you are handling the error.
In general, the database is going to do the check anyway, so an additional check on your part is redundant. If you do the check, another thread might insert the same value between your check and the insert, so you can still get an error (this is called a race condition).
So, don't do the check, but do check for the error.
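A minimal sketch of "don't do the check, but do check for the error". It is written in Python against the standard library's sqlite3 module, used here only as a runnable stand-in for MySQL; the table name items and the helper names are assumptions made for illustration. With a real MySQL driver the exception class would differ, but the shape is the same.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a UNIQUE column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (unique_col TEXT UNIQUE)")
    return conn


def insert_unique(conn: sqlite3.Connection, value: str) -> bool:
    """Insert value without checking first; report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO items (unique_col) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        # The database did the uniqueness check for us and rejected the row.
        return False


conn = make_db()
print(insert_unique(conn, "abc"))  # True: first insert succeeds
print(insert_unique(conn, "abc"))  # False: duplicate is rejected
```

The only decision left to the application is what to do when the function returns False.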

Related

Are there any disadvantages of a unique column in MySQL

I'd like to ask a question regarding unique columns in MySQL.
Which of these is the better way to approach the problem, and what are the advantages or disadvantages of each, if any?
Set a varchar column as unique
Do a SQL INSERT IGNORE
If affected rows > 0, proceed with running the code
versus
Leave a varchar column as not unique
Do a search query to look for an identical value
If no rows are returned by the query, do a SQL INSERT
Proceed with running the code
Neither of the 2 approaches is good.
You don't do INSERT IGNORE nor do you search. The searching part is also unreliable, because it fails at concurrency and compromises the integrity. Imagine this scenario: you and I try to insert the same info into the database. We connect at the same time. Code in question determines that there's no such record in the database, for both of us. We both insert the same data. Now your column isn't unique, therefore we'll end up with 2 records that are the same - your integrity now fails.
What you do is set the column to unique, insert and catch the exception in the language of your choice.
MySQL will fail in case of duplicate record, and any proper db driver for MySQL will interpret this as an exception.
Since you haven't mentioned what the language is, it's difficult to move forward with examples.
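The concurrency scenario described above can be replayed deterministically. This is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL, runs both "clients" on one connection in a fixed order instead of using real threads, and the people table, EMAIL value and function names are made up for illustration.

```python
import sqlite3

EMAIL = "same@example.com"  # hypothetical value both clients try to insert


def row_exists(conn: sqlite3.Connection) -> bool:
    """The 'search' step: look for an identical value."""
    found = conn.execute("SELECT 1 FROM people WHERE email = ?", (EMAIL,)).fetchone()
    return found is not None


def simulate_race(unique: bool) -> int:
    """Replay two clients that both check before either inserts.

    Returns the number of rows holding EMAIL afterwards.
    """
    conn = sqlite3.connect(":memory:")
    column = "email TEXT UNIQUE" if unique else "email TEXT"
    conn.execute(f"CREATE TABLE people ({column})")

    # Step 1: both clients run the check, and both see no row.
    checks = [row_exists(conn), row_exists(conn)]

    # Step 2: each client acts on its own (now stale) check result.
    for already_there in checks:
        if already_there:
            continue
        try:
            conn.execute("INSERT INTO people (email) VALUES (?)", (EMAIL,))
        except sqlite3.IntegrityError:
            pass  # only reachable when the column is UNIQUE

    return conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]


print(simulate_race(unique=False))  # 2: the duplicate slipped in
print(simulate_race(unique=True))   # 1: the constraint rejected it
```

Without the constraint the stale check lets the second row through; with it, the second insert fails and the application only has to handle the error.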
Defining a column as a unique index has a few advantages. First of all, when you define it as a unique index, MySQL can optimize the index for unique values (as with a primary key): because it doesn't have to check whether more rows share the same value, it can use an optimized algorithm for lookups.
You are also assured that there will never be a duplicate entry in your database, instead of handling this in multiple places in your code.
When you don't define it as UNIQUE, you first need to check whether a record exists in your table and then insert, which requires 2 queries (and even a full table lock) instead of 1. This decreases performance and is more error prone.
http://dev.mysql.com/doc/refman/5.0/en/constraint-primary-key.html
I'm leaving aside the fact that you would use INSERT IGNORE, which ignores the error when the entry already exists in the database (you could still use it for high-performance operations in some special cases). A normal INSERT will give you feedback if an entry already exists.
Putting a constraint like UNIQUE on the column is better for query performance and data reliability, but there is a trade-off on writes, so it's up to you which you prefer. In your case, since you are effectively doing an "insert if not exists" query anyway, I guess it's better to just use the constraint.

MySQL DUPLICATE KEY UPDATE fails to update due to a NOT NULL field which is already set

I have a MySQL DB which is using strict mode, so I need to fill all NOT NULL values when I insert a row. The API I'm creating uses only the ON DUPLICATE KEY UPDATE functionality to do both inserts and updates.
The client application complains if any NOT NULL attributes are missing from an insert, which is expected.
Basic example (id is the primary key and there are two NOT NULL fields, aaa and xxx):
INSERT INTO tablename (aaa, xxx, id ) VALUES ( "value", "value", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa), xxx=VALUES(xxx)
All good so far. Once it is inserted, the system would allow doing updates. Nevertheless, I get the following error when updating only one of the fields.
INSERT INTO tablename (aaa, id ) VALUES ( "newValue", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa)
java.sql.SQLException: Field 'xxx' doesn't have a default value
This exception is a lie, as the row is already inserted and the xxx attribute has "value" as its value. I would expect the statement above to be equivalent to:
UPDATE tablename SET aaa="newValue" WHERE id=1
I would be glad if someone can shed some light about this issue.
Edit:
I can use the SQL query in phpMyAdmin successfully to update just one field, so I am afraid that this is not a SQL problem but a driver problem with JDBC. In that case there may be no solution.
@Marc B: Your insight is probably true and would indicate what I just described. That would mean that there is a bug in JDBC, as it should not do that check when the insert is of the ON DUPLICATE type, since there may be a value for the row after all. I can't provide real table data, but I believe that everything explained above is quite clear.
@ruakh: It does not fail to insert, nor am I expecting delayed validation. One requirement I have is to have both inserts and updates done using the same query, as the servlet does not know whether the row exists or not. The Java API service only fails to update a row that has NOT NULL fields which were already filled when the insert was done. The exception is a lie because the field DOES have a value, as it was inserted before the update.
This is a typical case of DRY / SRP fail; in an attempt to not duplicate code you've created a function that violates the single responsibility principle.
The semantics of an INSERT statement is that you expect no conflicting rows; the ON DUPLICATE KEY UPDATE option is merely there to avoid handling the conflict inside your code, requiring another separate query. This is quite different from an UPDATE statement, where you would expect at least one matching row to be present.
Imagine that MySQL only checked the columns when an INSERT doesn't conflict, and that for some reason the row had just been removed from the database: your code, which expects to perform an update, would then have to deal with an exception it doesn't expect. Given the difference in statement behaviour, it's good practice to separate your insert and update logic.
Theory aside, MySQL puts together an execution plan when a query is run; in the case of an INSERT statement it has to assume that the insert might succeed when attempted, because that's the optimal strategy. It avoids having to check indices etc. only to find out later that a column is missing.
This is by design and not a bug in JDBC.
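The behaviour discussed above can be reproduced in a small sketch. It uses Python's sqlite3 module as a stand-in, with SQLite's ON CONFLICT ... DO UPDATE (SQLite 3.24 or later) playing the role of MySQL's ON DUPLICATE KEY UPDATE, so it is an analogy rather than a reproduction of the MySQL/JDBC case. In this sketch the partial upsert is rejected on the NOT NULL column even though the row already exists, while a plain UPDATE goes through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename ("
    "id INTEGER PRIMARY KEY, aaa TEXT NOT NULL, xxx TEXT NOT NULL)"
)

# Full upsert: every NOT NULL column is supplied, so the INSERT half is valid.
UPSERT_ALL = (
    "INSERT INTO tablename (aaa, xxx, id) VALUES (?, ?, ?) "
    "ON CONFLICT(id) DO UPDATE SET aaa = excluded.aaa, xxx = excluded.xxx"
)
# Partial upsert: xxx is omitted, so the INSERT half is invalid on its own.
UPSERT_PARTIAL = (
    "INSERT INTO tablename (aaa, id) VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET aaa = excluded.aaa"
)

conn.execute(UPSERT_ALL, ("value", "value", 1))

try:
    conn.execute(UPSERT_PARTIAL, ("newValue", 1))
    partial_upsert_failed = False
except sqlite3.IntegrityError:
    # The row to insert is validated before the conflict handling applies.
    partial_upsert_failed = True

# Separate update logic, as the answer recommends, touches only aaa.
conn.execute("UPDATE tablename SET aaa = ? WHERE id = ?", ("newValue", 1))
row = conn.execute("SELECT aaa, xxx FROM tablename WHERE id = 1").fetchone()
print(partial_upsert_failed, row)
```

Keeping a dedicated UPDATE path for partial changes avoids having to supply every NOT NULL column just to satisfy the INSERT half of the statement.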

Enforcing unique columns

If a column is made unique in a database table structure, is there any need to do a check to see if a new value to be inserted already exists in the table via script? Or would it be fine just to insert values letting the DBMS filter non-new values?
When you try to insert a duplicate value into a unique column, your insert query will fail. So it is a good idea to check whether your insert queries succeeded. In fact, regardless of the situation, you should always check whether your insert query went through :)
You should always validate your data before inserting it into the database. That being said, what will happen if you try to insert a non-unique value into a column defined as unique is an SQLException.
In order to validate this before insertion, you could for example do a
select 1
from mytable_with_unique_column
where my_unique_column = myNewValue
If the query returns anything, then simply do not try to insert as that will throw an SQLException.
Verifying a unique constraint yourself is definitely overkill.
When you put a unique constraint on your column, an implicit index is created for this column. Thus, the DBMS can (and will) verify your data much faster. Unfortunately, when you try to insert a duplicate value into your column, you will get a constraint violation exception that you have to deal with (but you have to deal with such an error when using script verification too).
Good luck.
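You can combine the insert statement and the validation select into one statement. INSERT ... VALUES does not accept a WHERE clause, so use the INSERT ... SELECT form (FROM DUAL is the MySQL/Oracle spelling; some DBMSs omit it):
insert into mytable_with_unique_column (...)
select ...
from dual
where not exists
(
select 1
from mytable_with_unique_column
where my_unique_column = myNewValue
)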
This will only insert a new row if there isn't already a row with the given unique value.
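A hedged sketch of the combined statement, using Python's sqlite3 module as a stand-in DBMS. SQLite accepts a SELECT without a FROM clause, so this sketch has no FROM DUAL. The table and column names come from the example above; insert_if_absent and make_db are hypothetical helpers.

```python
import sqlite3

# One statement: the row is inserted only if the inner SELECT finds nothing.
INSERT_IF_ABSENT = (
    "INSERT INTO mytable_with_unique_column (my_unique_column) "
    "SELECT ? "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM mytable_with_unique_column WHERE my_unique_column = ?"
    ")"
)


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable_with_unique_column (my_unique_column TEXT UNIQUE)"
    )
    return conn


def insert_if_absent(conn: sqlite3.Connection, value: str) -> bool:
    """Run the combined statement; True means a new row was written."""
    cur = conn.execute(INSERT_IF_ABSENT, (value, value))
    return cur.rowcount == 1


conn = make_db()
print(insert_if_absent(conn, "myNewValue"))  # True: row was inserted
print(insert_if_absent(conn, "myNewValue"))  # False: row already existed
```

The UNIQUE constraint is kept on the column in this sketch, so the database still has the final say if two connections run the statement at the same moment.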

Does MySQL handle one SQL query at a time?

If you have 100,000 users, does MySQL execute one SQL query at a time?
Because in my PHP code I check if a certain row exists; if it doesn't it creates one. If it does, it just updates the row counter.
It crossed my mind that perhaps 100 users are checking if the row exists at the same time, and when it doesn't they all create one row each.
If MySQL is handling them sequentially I know that it won't be an issue, then one user will check if it exists, if not, create it. The other user will check if it exists, and since that's the case, it just updates the counter.
But if they all check if it exists at the same time and let's say it doesn't, then they all create one row and the whole table structure will fail.
Would be great if someone could shed some light on this topic.
Use a UNIQUE constraint or, if viable, make the primary key one of your data items and the SQL server will prevent duplicate rows from being created. You can even use the "ON DUPLICATE KEY UPDATE ..." syntax to specify the alternate operation if the row already exists.
From your comments, it sounds like you could use the user_id as your primary key, in which case, you'd be able to use something like this:
INSERT INTO usercounts (user_id,usercount)
VALUES (id-goes-here,1)
ON DUPLICATE KEY UPDATE usercount=usercount+1;
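A runnable sketch of the same "create or bump the counter" idea in a single statement. It uses Python's sqlite3 module as a stand-in, where the equivalent of MySQL's ON DUPLICATE KEY UPDATE is ON CONFLICT ... DO UPDATE (SQLite 3.24 or later). The usercounts table follows the example above; bump_usercount and make_db are hypothetical helpers.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usercounts (user_id INTEGER PRIMARY KEY, usercount INTEGER)"
    )
    return conn


def bump_usercount(conn: sqlite3.Connection, user_id: int) -> int:
    """Create the row with count 1, or increment it if it already exists."""
    with conn:  # one transaction per call
        conn.execute(
            "INSERT INTO usercounts (user_id, usercount) VALUES (?, 1) "
            "ON CONFLICT(user_id) DO UPDATE SET usercount = usercount + 1",
            (user_id,),
        )
    row = conn.execute(
        "SELECT usercount FROM usercounts WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = make_db()
print(bump_usercount(conn, 42))  # 1: row created
print(bump_usercount(conn, 42))  # 2: existing row incremented
```

Because the check and the write happen inside one statement, the application code never has a stale "does it exist?" answer to act on.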
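If you put the check and the insert into a transaction, you can avoid this problem. This way, the check and the create run as one unit of work and there shouldn't be any confusion, although depending on the isolation level and locking you may still need to handle a duplicate-key error.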

Best way to test for duplicate keys in a database

This is more of a correctness question. Say I have a table with a primary key column in my database. In my DAO code I have a function called insertRow(string key) that will return true if the key doesn't exist in the table and insert a new row with the key. Otherwise, if a row already exists with that key it returns false. Is it better/worse to have insertRow first check for the existence of the key or just go ahead and do the insert and catch the duplicate key error? Or is saving on a single select statement too trivial an optimization to even bother worrying about?
So in pseudocode:
boolean insertRow(String key){
    // potentially a select + insert
    if((select count(*) from mytable where key = :key) == 0){
        insert into mytable values(:key)
        return true;
    }
    return false;
}
or
boolean insertRow(String key){
    try{
        // always just 1 insert
        insert into mytable values(:key)
        return true;
    } catch (DuplicateKeyException ex){
        return false;
    }
}
Insert the row, catch the duplicate key error. My personal choice
I reckon this might perform better, depending on the cost of throwing the exception against the cost of hitting the db twice.
Only by testing both scenarios will you know for sure.
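One way to run such a test is a small timing harness like the sketch below. It uses Python's sqlite3 module with an in-memory database, so the numbers say nothing about MySQL over a network; the point is only the shape of the comparison. The column name row_key, the workload and all function names are assumptions.

```python
import sqlite3
import time


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (row_key TEXT PRIMARY KEY)")
    return conn


def insert_with_check(conn: sqlite3.Connection, key: str) -> bool:
    """Strategy 1: select first, insert only if nothing was found."""
    found = conn.execute(
        "SELECT 1 FROM mytable WHERE row_key = ?", (key,)
    ).fetchone()
    if found:
        return False
    conn.execute("INSERT INTO mytable (row_key) VALUES (?)", (key,))
    return True


def insert_and_catch(conn: sqlite3.Connection, key: str) -> bool:
    """Strategy 2: always insert, treat a duplicate key error as 'False'."""
    try:
        conn.execute("INSERT INTO mytable (row_key) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False


def time_strategy(insert_fn, keys):
    """Run one strategy on a fresh database; return (seconds, results)."""
    conn = make_db()
    start = time.perf_counter()
    results = [insert_fn(conn, key) for key in keys]
    return time.perf_counter() - start, results


# Hypothetical workload: 2000 attempts over 500 distinct keys.
KEYS = [str(i % 500) for i in range(2000)]

for fn in (insert_with_check, insert_and_catch):
    elapsed, results = time_strategy(fn, KEYS)
    print(f"{fn.__name__}: {elapsed:.4f}s, {sum(results)} rows inserted")
```

The ratio of duplicates to new keys in KEYS is worth varying, since the relative cost of the two strategies depends on how often the exception path is taken.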
Try the insert, then catch the error.
Otherwise, you could still have a concurrency issue between two active SPIDs (let's say two web users on the system at the same time), in which case you'd have to catch the error anyway:
User1: Check for key "newkey"? Not in database.
User2: Check for key "newkey"? Not in database.
User1: Insert key "newkey". Success.
User2: Insert key "newkey". Duplicate Key Error.
You can mitigate this by using explicit transactions or setting the transaction isolation level, but it's just easier to use the second technique, unless you are sure only one application thread is running against the database at all times.
In my opinion, this is an excellent case for using exceptions (since the duplicate is exceptional), unless you're counting on there to, most of the time, be a row already (i.e., you're doing "insert, but update if exists" logic.)
If the purpose of the code is to update, then you should either use the select or an INSERT ... ON DUPLICATE KEY UPDATE clause (if supported by your database engine.) Alternatively, make a stored procedure that handles this logic for you.
The second one, because the first option hits the db twice while the second hits it just once.
The short answer is that you need to test it for yourself. My gut feeling is that doing a small select to check for the existence will perform better, but you need to verify that for yourself at volume and see whichever performs better.
In general, I don't like to leave my error checking entirely to the exception engine of whatever it is I'm doing. In other words, if I can check to see if what I'm doing is valid rather than just having an exception thrown, that's generally what I do.
I would suggest, however, using an EXISTS query rather than count(*):
if(exists (select 1 from mytable where key = "somekey"))
return false
else
insert the row
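As a rough illustration of the EXISTS form, here is a sketch using Python's sqlite3 module as a stand-in database. The column name row_key and the helper key_exists are hypothetical. Note that the check on its own is still subject to the concurrency issue described earlier, so the insert can fail even after key_exists returns False.

```python
import sqlite3


def key_exists(conn: sqlite3.Connection, key: str) -> bool:
    """Ask the database for a yes/no answer instead of counting rows."""
    (found,) = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM mytable WHERE row_key = ?)", (key,)
    ).fetchone()
    return bool(found)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (row_key TEXT PRIMARY KEY)")
conn.execute("INSERT INTO mytable (row_key) VALUES (?)", ("somekey",))

print(key_exists(conn, "somekey"))   # True
print(key_exists(conn, "otherkey"))  # False
```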
All that being said (from an abstract, engine-neutral perspective), I'm pretty sure that MySQL has some keywords that can be used to insert a row into a table only if the primary key doesn't exist. This may be your best bet, assuming you're OK with using MySQL-specific keywords.
Another option would be to place the logic entirely in the SQL statement.
Another two options in MySQL are to use
insert ignore into....
and
insert into .... on duplicate key update field=value
including on duplicate key update field=field
See: http://dev.mysql.com/doc/refman/5.0/en/insert.html
Edit:
You can test affected_rows to see whether or not the insert had an effect.
Now that I've found Martin Fowler's book online, a decent way to do it is with a key table; see pg 222 for more info.
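A sketch of the "insert ignore, then test the affected rows" pattern. Python's sqlite3 module stands in for MySQL here, so the statement is spelled INSERT OR IGNORE (SQLite's syntax) and the affected-row count is read from cursor.rowcount; the table, the column name row_key and the helper are assumptions.

```python
import sqlite3


def insert_ignore(conn: sqlite3.Connection, key: str) -> bool:
    """Insert the key, silently skipping duplicates; True if a row was added."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO mytable (row_key) VALUES (?)", (key,)
    )
    # rowcount is the number of rows the statement changed: 1 if inserted,
    # 0 if the existing key caused the insert to be ignored.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (row_key TEXT PRIMARY KEY)")

print(insert_ignore(conn, "somekey"))  # True: row was inserted
print(insert_ignore(conn, "somekey"))  # False: duplicate was ignored
```

This keeps the logic in one statement without raising an exception, at the cost of also hiding other errors that the IGNORE modifier suppresses in MySQL.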