If I do a SQL dump, I only get one option to choose insert, update, or replace for the queries. Can I have it INSERT if not exists otherwise UPDATE?
Probably an easy answer, but I'm just not sure.
I believe REPLACE will work for you (the mysqldump --replace option), since it behaves like this: if a row in the destination table matches the row in the REPLACE statement (based on PRIMARY KEY or UNIQUE KEY values), that row is deleted and replaced with the row from the dump file. If there is no such match, the source row is inserted into the destination table.
Related
I have some data which I want to add to an existing mysql database. The new data may have entries, which are already saved on DB. Since some of my columns are unique, I get, as expected, an ER_DUP_ENTRY error.
Bulk Insert
Let's say I want to use following statement to save "A", "B" and "C" in a column names of table mytable and "A" is already saved there.
insert into mytable (names) values ("A"), ("B"), ("C");
Is there a way to directly use bulk insert to save "B" and "C" while ignoring "A"? Or do I have to build an insert statement for every new row? This leads to another question:
Normalize Data
Should I make sure there are no duplicate entries before running the actual insert statement? In my case I would need to select the data from the database, eliminate duplicates, and then perform the insert shown above. Or is that a task the database is supposed to handle?
If you have UNIQUE constraints that are blocking import, you have a few ways you can work around that:
INSERT IGNORE INTO mytable ...
If any individual rows violate a UNIQUE constraint, they are skipped. Other rows are inserted.
REPLACE INTO mytable ...
If any rows violate a UNIQUE constraint, the existing row is deleted and the new row is inserted. Keep in mind the side effects of doing this, such as foreign keys that cascade on delete from the deleted row, or the INSERT generating a new auto-increment id.
INSERT INTO mytable ... ON DUPLICATE KEY UPDATE ...
More flexibility. This does not delete the original row, but allows you to set new values for any columns you choose on a case by case basis. See also my answer to "INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE"
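As a rough sketch of the three options applied to the mytable example from the question: the table definition and the seen counter column are assumptions made for illustration, and each statement is meant to be tried on its own against a table that already holds 'A'.

```sql
-- Assumed schema: only mytable and names come from the question.
CREATE TABLE mytable (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    names VARCHAR(50) NOT NULL UNIQUE,
    seen  INT NOT NULL DEFAULT 1      -- hypothetical counter column
);
INSERT INTO mytable (names) VALUES ('A');

-- Option 1: 'A' is skipped with a warning, 'B' and 'C' are inserted.
INSERT IGNORE INTO mytable (names) VALUES ('A'), ('B'), ('C');

-- Option 2: any existing row with a matching name is deleted,
-- then the new row is inserted (and gets a new auto-increment id).
REPLACE INTO mytable (names) VALUES ('A'), ('B'), ('C');

-- Option 3: existing rows are kept and updated in place;
-- here the hypothetical counter is bumped for each duplicate.
INSERT INTO mytable (names) VALUES ('A'), ('B'), ('C')
ON DUPLICATE KEY UPDATE seen = seen + 1;
```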
If you want to use bulk-loading with mysqlimport or the SQL statement equivalent LOAD DATA INFILE, there are options that match the INSERT IGNORE or REPLACE solutions, but not the INSERT...ON DUPLICATE KEY UPDATE solution.
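For the bulk-loading case, a minimal sketch might look like the following. The file path and the one-column file layout are assumptions, and the server's secure_file_priv setting may restrict where the file can live.

```sql
-- Hypothetical file with one name per line.
-- IGNORE skips input rows that collide with a UNIQUE key.
LOAD DATA INFILE '/tmp/names.txt'
IGNORE INTO TABLE mytable
LINES TERMINATED BY '\n'
(names);

-- REPLACE deletes the colliding row and inserts the input row instead.
LOAD DATA INFILE '/tmp/names.txt'
REPLACE INTO TABLE mytable
LINES TERMINATED BY '\n'
(names);
```

mysqlimport exposes the same two behaviours through its --ignore and --replace options.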
Read docs for more information:
https://dev.mysql.com/doc/refman/8.0/en/insert.html
https://dev.mysql.com/doc/refman/8.0/en/replace.html
https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
https://dev.mysql.com/doc/refman/8.0/en/mysqlimport.html
https://dev.mysql.com/doc/refman/8.0/en/load-data.html
In some situations, I like to do this:
1. LOAD DATA into a temp table
2. Clean up the data
3. Normalize as needed. (2 SQLs per column that needs normalizing -- details)
4. Augment Summary table(s) (INSERT .. ON DUPLICATE KEY .. SELECT x, y, count(*), sum(z), .. GROUP BY x,y)
5. Copy clean data from temp table to real table(s) ("Fact" table). (INSERT [IGNORE] .. SELECT [DISTINCT] .. or IODKU, i.e. INSERT .. ON DUPLICATE KEY UPDATE, with SELECT.)
More on Normalizing:
I do it outside any transactions. There are multiple reasons why this is better.
At worst (as a result of other failures), I occasionally throw an unused entry in the normalization table. No big deal.
No burning of AUTO_INCREMENT ids (except in edge cases).
Very fast.
Since REPLACE is a DELETE plus INSERT it is almost guaranteed to be worse than IODKU. However, both burn ids when the rows exist.
If at all possible, do not "loop" through the rows; instead find SQL statements to handle them all at once.
Depending on the details, de-dup in step 2 (if lots of dups) or in step 5 (dups are uncommon).
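The five steps above could be sketched roughly as follows. Every table, column, and file name here is hypothetical, and the clean-up rule in step 2 is just a placeholder for whatever the data actually needs.

```sql
-- Hypothetical target tables: a normalization table, a summary table, a fact table.
CREATE TABLE IF NOT EXISTS hosts (
    host_id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    host_name VARCHAR(100) NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS daily_summary (
    host_id     INT UNSIGNED NOT NULL,
    hit_date    DATE NOT NULL,
    hit_count   INT UNSIGNED NOT NULL,
    total_bytes BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (host_id, hit_date)
);
CREATE TABLE IF NOT EXISTS hits (
    hit_id   BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    host_id  INT UNSIGNED NOT NULL,
    hit_date DATE NOT NULL,
    bytes    INT UNSIGNED NOT NULL
);

-- Step 1: load the raw file into a temp table.
CREATE TEMPORARY TABLE staging (
    host_name VARCHAR(100) NOT NULL,
    hit_date  DATE NOT NULL,
    bytes     INT UNSIGNED NOT NULL,
    host_id   INT UNSIGNED NULL          -- filled in by step 3
);
LOAD DATA INFILE '/tmp/hits.csv'
INTO TABLE staging
FIELDS TERMINATED BY ','
(host_name, hit_date, bytes);

-- Step 2: clean up (placeholder rule).
UPDATE staging SET host_name = LOWER(TRIM(host_name));

-- Step 3: normalize, two statements per column.
-- 3a: add only the names that are missing, so no ids are burned.
INSERT INTO hosts (host_name)
SELECT DISTINCT s.host_name
FROM staging AS s
LEFT JOIN hosts AS h ON h.host_name = s.host_name
WHERE h.host_id IS NULL;
-- 3b: pull the ids back into the staging rows.
UPDATE staging AS s
JOIN hosts AS h ON h.host_name = s.host_name
SET s.host_id = h.host_id;

-- Step 4: augment the summary table.
INSERT INTO daily_summary (host_id, hit_date, hit_count, total_bytes)
SELECT s.host_id, s.hit_date, COUNT(*), SUM(s.bytes)
FROM staging AS s
GROUP BY s.host_id, s.hit_date
ON DUPLICATE KEY UPDATE
    hit_count   = daily_summary.hit_count + VALUES(hit_count),
    total_bytes = daily_summary.total_bytes + VALUES(total_bytes);

-- Step 5: copy the clean rows into the fact table.
INSERT INTO hits (host_id, hit_date, bytes)
SELECT s.host_id, s.hit_date, s.bytes
FROM staging AS s;
```

Note that VALUES() inside ON DUPLICATE KEY UPDATE is deprecated in recent MySQL 8.0 releases, though it still works.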
I am attempting to import data into a table that has a field as follows:
result_id
This field is set to AUTO_INCREMENT, PRIMARY and UNIQUE.
The data I am importing has information in the result_id field that is the same (in places) as the current data in the table. SQL won't let me import as there are duplicates (which is fair enough).
Is there a way to get SQL to append the data I am importing without using the duplicate values in result_id, basically continuing the numbering in that field? I am asking because I am importing about 25,000 records and I don't want to manually remove or alter the result_id values in the data being imported.
Thanks,
H.
How are you importing your data to MySQL?
If you are using SQL queries or a script, then it should contain statements like INSERT INTO.... Open the file in a text editor and replace every INSERT with INSERT IGNORE. Rows with duplicate primary keys will then be skipped instead of raising an error.
Alternatively, if you want the rows in your import script to replace older rows that have the same primary key, use REPLACE in place of INSERT.
Hope it helps...
[EDIT]
Since your primary key is auto-increment, you can do this: in the table you want to import into, add a dummy column, say "dummy", and allow it to be NULL. Your import script will contain statements like INSERT INTO () values (). In the list of column names, replace "result_id" with "dummy" and execute the script.
After executing the script, simply remove the "dummy" column from the table. It is a bit dirty and time consuming, but it will do the job.
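A minimal sketch of that dummy-column trick, assuming a table called results with a score column (both names are made up here; only result_id comes from the question):

```sql
-- Hypothetical table; only result_id is taken from the question.
CREATE TABLE results (
    result_id INT AUTO_INCREMENT PRIMARY KEY,
    score     INT NOT NULL
);

-- Add a throwaway column to absorb the old ids from the import file.
ALTER TABLE results ADD COLUMN dummy INT NULL;

-- In the import script, "result_id" in the column list becomes "dummy",
-- so MySQL assigns fresh AUTO_INCREMENT values to result_id.
INSERT INTO results (dummy, score) VALUES (17, 88), (18, 92);

-- Drop the throwaway column once the import is done.
ALTER TABLE results DROP COLUMN dummy;
```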
The REPLACE INTO statement in MySQL works by deleting the existing row and then inserting the new one. In my table, the primary key (id) is auto-incremented, so I was expecting it to delete the row and then insert a new row with the next id at the end of the table.
However, it does the unexpected and inserts it with the same id! Is this the expected behaviour, or am I missing something here? (I am not setting the id when calling the REPLACE INTO statement)
This is expected behavior if you have another UNIQUE index in your table, which you must have; otherwise it would add the row as you expect. See the documentation:
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted. See Section 13.2.5, “INSERT Syntax”.
https://dev.mysql.com/doc/refman/5.5/en/replace.html
This also makes a lot of sense, because how else would MySQL find the row to replace? It could only scan the whole table, and that would be time consuming. I created an SQL Fiddle to demonstrate this.
That is expected behavior. Technically, in cases where ALL unique keys (not just primary key) on the data to be replaced/inserted are a match to an existing row, MySQL actually deletes your existing row and inserts a new row with the replacement data, using the same values for all the unique keys. So, if you look to see the number of affected rows on such a query you will get 2 affected rows for each replacement and only one for the straight inserts.
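The affected-rows difference can be checked with ROW_COUNT(). This is a small sketch on a made-up items table, not the asker's schema:

```sql
-- Hypothetical table with a PRIMARY KEY and an additional UNIQUE key.
CREATE TABLE items (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    code VARCHAR(10) NOT NULL UNIQUE,
    qty  INT NOT NULL
);

-- No matching row yet: a straight insert.
REPLACE INTO items (id, code, qty) VALUES (1, 'abc', 5);
SELECT ROW_COUNT();   -- 1 (one row inserted)

-- All unique keys match the existing row: delete plus insert.
REPLACE INTO items (id, code, qty) VALUES (1, 'abc', 9);
SELECT ROW_COUNT();   -- 2 (one row deleted, one row inserted)
```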
I have to pull data from a SQL database table to my DB2 table. If records already exist UPDATE, for new records INSERT, for extra records in destination table DELETE those extra records. Destination table looks exactly like source table. For INSERT/UPDATE I am fine, how do I do DELETE from dest table?
DB2 has a MERGE command. This allows you to write a single SQL statement to do an INSERT, UPDATE and DELETE based on conditions you define. It is a very clean way of doing this.
So what you will do is add an "Execute SQL Task" element to your SSIS package, and add the DB2 merge statement to the task.
See this link (at the bottom are examples) - http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=%2Fcom.ibm.db2.udb.admin.doc%2Fdoc%2Fr0010873.htm
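As a rough sketch only: assuming the source rows have already been staged in a DB2 table called src, and that both tables have columns id and val (all of these names are hypothetical), the MERGE could look like the block below. I am not certain a single MERGE can remove destination rows that have no counterpart in the source, so the sketch handles those with a separate DELETE.

```sql
-- Hypothetical tables: dest (destination) and src (staged copy of the source).
MERGE INTO dest AS d
USING src AS s
    ON d.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val);

-- Remove the extra rows that no longer exist in the source.
DELETE FROM dest d
WHERE NOT EXISTS (SELECT 1 FROM src s WHERE s.id = d.id);
```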
If all you want is a copy of the source table, then avoid the complexity and delete everything in the target first. After that, everything is just an insert.
In MySQL, I'm trying to find an efficient way to perform an UPDATE if a row already exists in a table, or an INSERT if the row doesn't exist.
I've found two possible ways so far:
The obvious one: open a transaction, SELECT to find if the row exists, INSERT if it doesn't exist or UPDATE if it exists, commit transaction
first INSERT IGNORE into the table (so no error is raised if the row already exists), then UPDATE
The second method avoids the transaction.
Which one do you think is more efficient, and are there better ways (for example using a trigger)?
INSERT ... ON DUPLICATE KEY UPDATE
You could also perform an UPDATE and check the number of rows affected; if it's less than 1, then it didn't find a matching row, so perform the INSERT.
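Both suggestions can be sketched on a made-up page_hits table (the table and its columns are assumptions). One caveat with the UPDATE-first approach: by default MySQL reports rows actually changed, so an UPDATE that sets a row to the values it already has also reports 0.

```sql
-- Hypothetical counter table.
CREATE TABLE page_hits (
    page VARCHAR(100) PRIMARY KEY,
    hits INT NOT NULL DEFAULT 0
);

-- Single-statement upsert: inserts the row, or bumps the counter if it exists.
INSERT INTO page_hits (page, hits) VALUES ('/home', 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;
SELECT ROW_COUNT();   -- 1 if a row was inserted, 2 if an existing row was updated

-- UPDATE-first variant: the client checks the affected-rows count
-- and only runs the INSERT when the UPDATE changed nothing.
UPDATE page_hits SET hits = hits + 1 WHERE page = '/about';
SELECT ROW_COUNT();   -- 0 here, because '/about' does not exist yet
INSERT INTO page_hits (page, hits) VALUES ('/about', 1);
```

The UPDATE-then-INSERT variant can still hit a duplicate-key error if another session inserts the same row between the two statements, which the single-statement form avoids.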
There is another way - REPLACE.
REPLACE INTO myTable (col1) VALUES (value1)
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted. See Section 12.2.5, “INSERT Syntax”.
In mysql there's a REPLACE statement that, I believe, does more or less what you want it to do.
REPLACE INTO would be a solution, it uses the UNIQUE INDEX for replacing or inserting something.
REPLACE INTO
yourTable
SET
column = value;
Please be aware that this works differently from what you might expect: REPLACE is meant quite literally. It first checks whether there is a UNIQUE INDEX collision that would prevent an INSERT, removes (DELETE) all rows that collide, and then INSERTs the row you've given it.
This, for example, leads to subtle problems like Triggers not firing (because they check for an update, which never occurs) or values reverted to the defaults (because you must specify all values).
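The "values reverted to the defaults" problem is easy to reproduce. A small sketch on a hypothetical users table:

```sql
-- Hypothetical table with a defaulted column.
CREATE TABLE users (
    id     INT PRIMARY KEY,
    name   VARCHAR(50) NOT NULL,
    visits INT NOT NULL DEFAULT 0
);
INSERT INTO users (id, name, visits) VALUES (1, 'alice', 42);

-- Only id and name are given, so the old row (with visits = 42) is deleted
-- and the new row gets the column default for visits.
REPLACE INTO users (id, name) VALUES (1, 'alicia');

SELECT id, name, visits FROM users;   -- 1, 'alicia', 0
```

With INSERT ... ON DUPLICATE KEY UPDATE name = 'alicia' instead, the existing row would be kept and visits would stay at 42.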
If you're doing a lot of these, it might be worth writing them to a file, and then using 'LOAD DATA INFILE ... REPLACE ...'