I have a MySQL table that has a unique index over two columns, "username" and "provider". The index works fine and does not allow inserting rows with a duplicate username+provider combination.
But I do want to allow insertion of the same username+provider combination when another column, "active", has the value false. I have no problem with multiple duplicate rows as long as they are all marked active = FALSE.
MySQL, unlike Microsoft SQL Server, does not support filtered (conditional) indexes.
Since I could not find a more appropriate solution, I thought of using a trigger (after insertion into the table) to check whether the value that was just inserted is a duplicate, taking the "active" value into account.
Can I get the trigger to fail the INSERT statement? Is there a better solution for this case?
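As a rough illustration of the trigger idea, a trigger can abort the statement by raising an error with SIGNAL (available from MySQL 5.5). The sketch below is only one possible shape: the table name `accounts`, the column types, and the index and trigger names are assumptions, and the original unique index is assumed to have been replaced by a plain index so that inactive duplicates are allowed.

```sql
-- Hypothetical table; a plain (non-unique) index replaces the unique one.
CREATE TABLE accounts (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  username VARCHAR(64) NOT NULL,
  provider VARCHAR(64) NOT NULL,
  active   BOOLEAN NOT NULL DEFAULT TRUE,
  KEY idx_user_provider (username, provider)
);

DELIMITER //
CREATE TRIGGER accounts_before_insert
BEFORE INSERT ON accounts
FOR EACH ROW
BEGIN
  -- Reject the new row only if it is active and an active twin already exists.
  IF NEW.active AND EXISTS (
       SELECT 1
       FROM accounts
       WHERE username = NEW.username
         AND provider = NEW.provider
         AND active
     ) THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'Duplicate active username/provider combination';
  END IF;
END//
DELIMITER ;
```

A matching BEFORE UPDATE trigger would be needed to cover rows that are switched back to active, and unlike a real unique index this check is not guaranteed to hold when two sessions insert the same combination at the same moment.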
Related
I have a very large table (tens of millions of rows) and a UNIQUE index needs to be added to a column on that table. I know for a fact that the table contains duplicated values on that key, which I need to clean up (by deleting rows or resetting the value of the column to something unique that I can generate automatically). A plus is that the rows which are already duplicated do not get modified anymore.
What would be the right approach to perform a change like this, given that I will probably be using the Percona pt-osc tool and there are continuous deletes/inserts on the table? My plan was:
Add code that ensures no dupe IDs get inserted anymore. Probably I need to add a separate table for this temporarily, since I want the database to enforce this for me and not the application - so insert into the "shadow table" with a unique index in a transaction together with my main table, rollback all inserts that try to insert duplicate values
Backfill the table by zapping all invalid column values which are within the primary key range below $current_pkey_value
Then add the index and use pt-osc to change over the table
Is there anything I am missing?
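The first two steps of the plan might be sketched roughly as follows. All names here (`big_table`, `ext_id`, `payload`, `big_table_ext_ids`) and the column types are hypothetical, since the question does not give the real schema.

```sql
-- Step 1: a temporary "shadow" table whose primary key enforces uniqueness.
CREATE TABLE big_table_ext_ids (
  ext_id VARCHAR(64) NOT NULL,
  PRIMARY KEY (ext_id)
);

-- Every new row is written together with its shadow row. If ext_id is
-- already taken, the first INSERT fails with error 1062 and the
-- application rolls the whole transaction back.
START TRANSACTION;
INSERT INTO big_table_ext_ids (ext_id) VALUES ('abc-123');
INSERT INTO big_table (ext_id, payload) VALUES ('abc-123', 'some data');
COMMIT;

-- Step 2: list the existing duplicates that the backfill has to clean up.
SELECT ext_id, COUNT(*) AS copies, MIN(id) AS first_id
FROM big_table
GROUP BY ext_id
HAVING COUNT(*) > 1;
```

On a table of this size the GROUP BY query is likely to be expensive, so running it against a replica or in primary-key ranges may be preferable.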
Since we use pt-online-schema-change we are using triggers for performing the synchronisation from the existing table to a temp table. The tool actually has a special configuration key for this, --no-check-unique-key-change, which will do exactly what we need - agree to perform the ALTER TABLE and set up triggers in such a way that if a conflict occurs, INSERT .. IGNORE will be applied and the first row having used the now-unique value will win in the insert during synchronisation. For us this is a good tradeoff because all the duplicates we have seen resulted from data races, not from actual conflicts in the value generation process.
I'd like to ask a question regarding unique columns in MySQL.
I would like to ask the experts which is the better way to approach this problem, and what the advantages or disadvantages are, if there are any.
Set a varchar column as unique
Do a SQL INSERT IGNORE
If affected rows > 0 proceed with running the code
versus
Leave a varchar column as not-unique
Do a search query to look for identical value
If there are no rows returned by the query, do a SQL INSERT
proceed with running the code
Neither of the 2 approaches is good.
You don't do INSERT IGNORE nor do you search. The searching part is also unreliable, because it fails at concurrency and compromises the integrity. Imagine this scenario: you and I try to insert the same info into the database. We connect at the same time. Code in question determines that there's no such record in the database, for both of us. We both insert the same data. Now your column isn't unique, therefore we'll end up with 2 records that are the same - your integrity now fails.
What you do is set the column to unique, insert and catch the exception in the language of your choice.
MySQL will fail in case of duplicate record, and any proper db driver for MySQL will interpret this as an exception.
Since you haven't mentioned what the language is, it's difficult to move forward with examples.
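Independent of the client language, the database side of this approach can be sketched as below. The table and column names are made up for illustration.

```sql
-- Hypothetical table with a UNIQUE constraint on the varchar column.
CREATE TABLE coupons (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  code VARCHAR(32) NOT NULL,
  UNIQUE KEY uq_code (code)
);

-- The first insert succeeds.
INSERT INTO coupons (code) VALUES ('WELCOME');

-- The second insert is rejected by the server with error 1062
-- (SQLSTATE 23000, duplicate entry); the driver surfaces this as an
-- exception or error code that the calling code can catch.
INSERT INTO coupons (code) VALUES ('WELCOME');
```

Because the server performs the check and the insert as one operation, two concurrent clients cannot both succeed with the same value.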
Defining a column as a unique index has a few advantages. First of all, when you define it as a "unique index", MySQL can optimize the index for unique values (the same as for a primary key): because MySQL doesn't have to check whether there are more rows with the same value, it can use an optimized algorithm for the lookups.
You are also assured that there will never be a duplicate entry in your database, instead of handling this in multiple places in your code.
When you don't define it as UNIQUE, you first need to check whether a record exists in your table and then insert it, which requires 2 queries (and even a full table lock) instead of 1. That decreases performance and is more error prone.
http://dev.mysql.com/doc/refman/5.0/en/constraint-primary-key.html
I'm leaving aside the fact that you would use INSERT IGNORE, which IGNORES the error when the entry already exists in the database (you could still use it for high-performance operations, maybe in some sort of special case). A normal INSERT will give you feedback if an entry already exists.
Putting a constraint like UNIQUE on the column is better when it comes to query performance and data reliability, but there is also a trade-off when it comes to writes, so it's up to you which you prefer. In your case, since you would be doing an insert-if-not-exists query anyway, I guess it's better to just use the constraint.
I have a MySQL DB which is using strict mode, so I need to fill all NOT NULL values when I insert a row. The API I'm creating uses just the ON DUPLICATE KEY UPDATE functionality to do both inserts and updates.
The client application complains if any NOT NULL attributes are missing from the insert, which is expected.
Basic example (id is the primary key and there are two NOT NULL fields, aaa and xxx):
INSERT INTO tablename (aaa, xxx, id ) VALUES ( "value", "value", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa), xxx=VALUES(xxx)
All good so far. Once it is inserted, the system would allow doing updates. Nevertheless, I get the following error when updating only one of the fields.
INSERT INTO tablename (aaa, id ) VALUES ( "newValue", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa)
java.sql.SQLException: Field 'xxx' doesn't have a default value
This exception is a lie, as the row is already inserted and the xxx attribute has "value" as its value. I would expect the statement above to be equivalent to:
UPDATE tablename SET aaa="newValue" WHERE id=1
I would be glad if someone can shed some light about this issue.
Edit:
I can run the SQL query in phpMyAdmin successfully to update just one field, so I am afraid that this is not an SQL problem but a driver problem with JDBC. In that case there may be no solution.
#Marc B: Your insight is probably true and would indicate what I just described. That would mean that there is a bug in JDBC, as it should not do that check when the insert is of the ON DUPLICATE type, since there may be a default value for the row after all. I can't provide real table data, but I believe everything explained above is quite clear.
#ruakh: It does not fail to insert, nor am I expecting delayed validation. One requirement I have is that both inserts and updates are done using the same query, as the servlet does not know whether the row exists or not. The Java API service only fails to update a row that has NOT NULL fields which were already filled when the insert was done. The exception is a lie because the field DOES have a default value, as it was inserted before the update.
This is a typical case of DRY / SRP fail; in an attempt to not duplicate code you've created a function that violates the single responsibility principle.
The semantics of an INSERT statement is that you expect no conflicting rows; the ON DUPLICATE KEY UPDATE option is merely there to avoid handling the conflict inside your code, requiring another separate query. This is quite different from an UPDATE statement, where you would expect at least one matching row to be present.
Imagine that MySQL would only check the columns when an INSERT doesn't conflict and for some reason a row was just removed from the database and your code that expects to perform an update has to deal with an exception it doesn't expect. Given the difference in statement behaviour it's good practice to separate your insert and update logic.
Theory aside, MySQL puts together an execution plan when a query is run; in the case of an INSERT statement it has to assume that it might succeed when attempted, because that's the most optimal strategy. It prevents having to check indices etc. only to find out later that a column is missing.
This is by design and not a bug in JDBC.
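A minimal sketch of keeping the two paths separate, reusing the table and column names from the question (the column types are assumptions):

```sql
-- Assumed definition of the table from the question.
CREATE TABLE tablename (
  id  INT PRIMARY KEY,
  aaa VARCHAR(32) NOT NULL,
  xxx VARCHAR(32) NOT NULL
);

-- Create-or-overwrite path: every NOT NULL column is supplied, so the
-- INSERT part is valid whether or not the row already exists.
INSERT INTO tablename (aaa, xxx, id) VALUES ('value', 'value', 1)
ON DUPLICATE KEY UPDATE aaa = VALUES(aaa), xxx = VALUES(xxx);

-- Partial-change path: a plain UPDATE only touches the listed column.
UPDATE tablename SET aaa = 'newValue' WHERE id = 1;

-- Number of rows changed by the UPDATE; 0 means the row was missing
-- or already held the same value.
SELECT ROW_COUNT();
```

The caller then decides which statement to send based on whether it has the full set of NOT NULL values at hand.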
MySQL: In update trigger's body, can I obtain the value of a column that is specified in the where clause of the triggering query if the where clause does not match any rows at all?
I have to do the following, but WITHOUT using a direct query such as ON DUPLICATE KEY UPDATE and so on:
If I have:
UPDATE my_table SET idiotism_level=5 WHERE name='Pencho'
... and the where clause matches NO ROWS, I'd want to automatically trigger the insertion of a row having name='Pencho' before the update, so that the UPDATE would then presumably match and work properly.
Is it possible?
In other database systems (PostgreSQL) this could be done with a RULE, which does not exist in MySQL. It has to be a rule and not a trigger because you need to analyse the query itself, not the result of the query.
But for MySQL you can run pre-query jobs by using MySQL-Proxy. You should be able to intercept your update query and build an insert by running an extra 'check row exists' query from MySQL-Proxy. (I'm not saying this is a nice solution, but if you have no way to make the code behave better, you can fix it at this level.)
No. An update trigger fires once for each row that gets updated, not once for each update command that's executed. There's no way to make the trigger fire if nothing is updated. You would need to handle this in your application by checking the number of updated rows returned by your query.
If name has a unique index on it you can use REPLACE
REPLACE INTO my_table (idiotism_level, name) VALUES (5, 'Pencho');
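One thing worth keeping in mind is that REPLACE deletes the conflicting row and inserts a new one, so columns not listed in the statement fall back to their defaults. A small sketch, with an assumed table definition (the `notes` column is hypothetical and `idiotism_level` follows the question's UPDATE statement):

```sql
-- Assumed definition; only name and idiotism_level appear in the question.
CREATE TABLE my_table (
  name           VARCHAR(64) NOT NULL,
  idiotism_level INT NOT NULL DEFAULT 0,
  notes          VARCHAR(255) DEFAULT NULL,
  UNIQUE KEY uq_name (name)
);

INSERT INTO my_table (name, idiotism_level, notes)
VALUES ('Pencho', 1, 'keep me');

-- Deletes the existing 'Pencho' row and inserts a fresh one.
REPLACE INTO my_table (idiotism_level, name) VALUES (5, 'Pencho');

-- idiotism_level is now 5, but notes has been reset to NULL.
SELECT name, idiotism_level, notes FROM my_table;
```

If the other columns have to survive, a plain UPDATE followed by a check of the affected-row count is the safer route.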
If you have 100,000 users, does MySQL execute one SQL query at a time?
Because in my PHP code I check if a certain row exists; if it doesn't it creates one. If it does, it just updates the row counter.
It crossed my mind that perhaps 100 users are checking if the row exists at the same time, and when it doesn't they all create one row each.
If MySQL is handling them sequentially I know that it won't be an issue, then one user will check if it exists, if not, create it. The other user will check if it exists, and since that's the case, it just updates the counter.
But if they all check if it exists at the same time and let's say it doesn't, then they all create one row and the whole table structure will fail.
Would be great if someone could shed some light on this topic.
Use a UNIQUE constraint or, if viable, make the primary key one of your data items and the SQL server will prevent duplicate rows from being created. You can even use the "ON DUPLICATE KEY UPDATE ..." syntax to specify the alternate operation if the row already exists.
From your comments, it sounds like you could use the user_id as your primary key, in which case, you'd be able to use something like this:
INSERT INTO usercounts (user_id,usercount)
VALUES (id-goes-here,1)
ON DUPLICATE KEY UPDATE usercount=usercount+1;
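If you put the check and the insert into a transaction, you can avoid this problem, provided the check takes a lock (for example with SELECT ... FOR UPDATE). A plain transaction on its own does not turn the check and the create into a single atomic query, so the UNIQUE constraint remains the more reliable safeguard.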
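A rough sketch of the transactional variant, written as a stored procedure so the check and the write sit in one place. The procedure name and parameter are hypothetical; `usercounts` is the table from the answer above and is assumed to have `user_id` as its primary key. The check is done with a locking read, since an ordinary SELECT would not stop another session from doing the same check concurrently.

```sql
DELIMITER //
CREATE PROCEDURE bump_usercount(IN p_user_id INT)
BEGIN
  DECLARE v_count INT DEFAULT NULL;

  START TRANSACTION;

  -- Locking read: v_count stays NULL when no row exists yet.
  SELECT usercount INTO v_count
  FROM usercounts
  WHERE user_id = p_user_id
  FOR UPDATE;

  IF v_count IS NULL THEN
    INSERT INTO usercounts (user_id, usercount) VALUES (p_user_id, 1);
  ELSE
    UPDATE usercounts SET usercount = usercount + 1
    WHERE user_id = p_user_id;
  END IF;

  COMMIT;
END//
DELIMITER ;

CALL bump_usercount(42);
```

Under heavy concurrency, callers of a routine like this may still see lock waits or deadlock errors and need to retry, which is one reason the single-statement ON DUPLICATE KEY UPDATE form is usually simpler.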