I'm attempting to insert, update, and delete all in one MySQL query. I have a DB with about 100 records, each with a primary key, and I'm updating it from a CSV file. What I would like to happen: if a record is in the CSV and not in the DB, add it; if it's in both the DB and the CSV, update it; if it's in the DB and not in the CSV, delete it. I have the insert and update parts working, but I'm hung up on the delete part.
Here is my query so far:
INSERT INTO mydb (tourID, agent)
VALUES (:tourID, :agent)
ON DUPLICATE KEY UPDATE tourID = :tourID
Is there anything like 'on non duplicate key delete'?
Have an extra column named "toDelete". At the beginning of your transaction, set it to true for all rows. When you update a row, set it to false. When you are done, delete every row where toDelete is still true.
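A minimal SQL sketch of that approach, assuming the table and column names from the question (mydb, tourID, agent); the flag column name and the transaction handling are up to you:

-- One-time setup: add the flag column.
ALTER TABLE mydb ADD COLUMN toDelete TINYINT(1) NOT NULL DEFAULT 0;
-- Step 1: flag every row for deletion.
UPDATE mydb SET toDelete = 1;
-- Step 2: for each CSV row, upsert and clear the flag.
INSERT INTO mydb (tourID, agent, toDelete)
VALUES (:tourID, :agent, 0)
ON DUPLICATE KEY UPDATE agent = :agent, toDelete = 0;
-- Step 3: remove every row the CSV no longer contains.
DELETE FROM mydb WHERE toDelete = 1;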
Here's some pseudocode:
For each record in the CSV file, run your INSERT INTO query.
Run a DELETE FROM mydb WHERE tourID NOT IN (comma-separated list of tourIDs from CSV)
How you come up with that comma-separated list of tourIDs from CSV depends on how you're processing your CSV file.
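For example, if the CSV contained tourIDs 101, 102, and 107, the cleanup query would be (the values here are hypothetical; build the list while you loop over the CSV):

-- The IN list is built from the tourIDs seen in the CSV.
DELETE FROM mydb WHERE tourID NOT IN (101, 102, 107);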
I accidentally inserted the same value in the same field for many rows:
I updated my table and the same file name was uploaded into every row in the table.
What SQL command can I use to revert this unintended modification?
Generally, when writing a DELETE statement that isn't by primary key, I like to run the WHERE clause as a SELECT first:
SELECT * FROM table WHERE name='Sprouts';
If that result set is correct, you can feel fairly safe swapping in the DELETE:
DELETE FROM table WHERE name='Sprouts';
I added a record to a table in a MySQL database. Now I am trying to delete that particular record from the table, but it is not getting deleted, and I get this message:
#1062 - Duplicate entry '3107' for key 'PRIMARY'
How do I delete this entry?
You cannot get this error from a DELETE; only an UPDATE or INSERT can return it.
So either you are not actually running a DELETE, or you have a trigger on your table that is making other changes, and those changes are producing the error.
If child tables have ON DELETE CASCADE, check them for triggers as well.
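To check, you can list the triggers defined on a table (the table name here is a placeholder):

-- Shows any triggers that might fire as a side effect of your DELETE.
SHOW TRIGGERS WHERE `Table` = 'your_table';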
DELETE FROM table_name [WHERE condition];
Use this statement, replacing table_name with your table name and filling in the condition.
If you do not specify a WHERE clause, every record will be deleted.
I can't seem to find the answer to this anywhere. I am reading a CSV into a data frame using the read.csv function, then writing the data frame contents to a MySQL table using dbWriteTable. This works great for the initial run to create the table, but each run after this needs to do either an insert or an update, depending on whether the record already exists in the table.
The 1st column in the data frame is the primary key, and the other columns contain data that might change every time I pull a new copy of the CSV. Each time I pull the CSV, if the primary key already exists, I want it to update that record with the new data, and if the primary key does not exist (e.g. a new key since the last run), I want it to just insert the record into the table.
This is my current dbWriteTable. This creates the table just fine the 1st time it's run, and also inserts a "Timestamp" column into the table that is set to "on update CURRENT_TIMESTAMP" so that I know when each record was last updated.
dbWriteTable(mydb, value=csvData, name=Table, row.names=FALSE,
             field.types=list(PrimaryKey="VARCHAR(10)", Column2="VARCHAR(255)",
                              Column3="VARCHAR(255)", Timestamp="TIMESTAMP"),
             append=TRUE)
Now the next time I run this, I simply want it to update any PrimaryKeys that are already in the table, and add any new ones. I also don't want to lose any records in the event a PrimaryKey disappears from the CSV source.
Is it possible to do this kind of update using dbWriteTable, or some other R function?
If that's not possible, is it possible to just run a MySQL query that would delete any duplicate PrimaryKey records and keep only the one record with the most current timestamp? So I would run a dbWriteTable to append the new data, and then run a MySQL query to prune out the older records.
Obviously I couldn't define that 1st column as an actual PrimaryKey in the DB as my append/delete solution wouldn't work due to duplicate keys, and that's fine, I can always add an auto increment integer column to the table for the "real" primary key if needed.
Thoughts?
Consider using a temp table (an exact replica of the final table but with fewer records) and then run an INSERT and an UPDATE query against the final table, which handles both cases without overlap (plus, primary keys are constraints, and the queries will error out if an attempt is made to duplicate any):
Records to append if they do not exist, using the LEFT JOIN ... IS NULL query.
Records to update if they do exist, using the UPDATE ... INNER JOIN query.
Concerning the former, there is a recurring debate among SQL coders over whether LEFT JOIN ... IS NULL, NOT IN, or NOT EXISTS is the optimal solution, which of course "depends". The LEFT JOIN used here avoids subqueries, but consider those alternatives if needed (a NOT EXISTS variant is sketched after the code below).
# DELETE LAST SET OF TEMP DATA
dbSendQuery(mydb, "DELETE FROM tempTable")
# APPEND R DATA FRAME TO TEMP DATA
dbWriteTable(mydb, value=csvData, name="tempTable", row.names=FALSE,
field.types=list(PrimaryKey="VARCHAR(10)", Column2="VARCHAR(255)",
Column3="VARCHAR(255)", Timestamp="TIMESTAMP"),
append=TRUE, overwrite=FALSE)
# LEFT JOIN ... NULL QUERY TO APPEND NEW RECORDS NOT IN TABLE
dbSendQuery(mydb, "INSERT INTO finalTable (Column1, Column2, Column3, Timestamp)
SELECT Column1, Column2, Column3, Timestamp
FROM tempTable f
LEFT JOIN finalTable t
ON f.PrimaryKey = t.PrimaryKey
WHERE f.PrimaryKey IS NULL;")
# UPDATE INNER JOIN QUERY TO UPDATE MATCHING RECORDS
dbSendQuery(mydb, "UPDATE finalTable f
INNER JOIN tempTable t
ON f.PrimaryKey = t.PrimaryKey
SET f.Column1 = t.Column1,
f.Column2 = t.Column2,
f.Column3 = t.Column3,
f.Timestamp = t.Timestamp;")
For the most part, the queries above will work in most SQL backends should you ever need to change databases. Some RDBMSs do not support UPDATE with INNER JOIN, but equivalent alternatives are available. Finally, the beauty of this route is that all processing is handled in the SQL engine and not in R.
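For reference, a NOT EXISTS variant of the append step mentioned above might look like this (same assumed table and column names):

INSERT INTO finalTable (PrimaryKey, Column2, Column3, Timestamp)
SELECT t.PrimaryKey, t.Column2, t.Column3, t.Timestamp
FROM tempTable t
WHERE NOT EXISTS
    (SELECT 1 FROM finalTable f WHERE f.PrimaryKey = t.PrimaryKey);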
Sounds like you're trying to do an upsert.
I'm kind of rusty with MySQL, but the general idea is that you need a staging table to upload the new CSV into, and then do the insert/update in the database itself.
For that you need to use dbSendQuery with INSERT ... ON DUPLICATE KEY UPDATE.
http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
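A sketch of what that query could look like, assuming a staging table named stagingTable and the column names from the question (both are placeholders):

INSERT INTO finalTable (PrimaryKey, Column2, Column3)
SELECT PrimaryKey, Column2, Column3
FROM stagingTable
ON DUPLICATE KEY UPDATE
    Column2 = VALUES(Column2),
    Column3 = VALUES(Column3);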
Database: MySQL
I have a temporary table where all incoming data are loaded from a CSV file.
I have a master table where data is transferred from the temporary table.
Now, I need to check whether any row in the temporary table already exists in the master table.
If it exists, I need to update the data in the master table;
else, I need to insert that row into the master table.
The problem here is that there is no PK column in the temporary table, so I can't tell which rows to compare between the temp and master tables.
Is there any way to check for an already existing row without a primary key?
There is no guarantee that any column will be unique.
Please help me out.
Leave the decision to the master table.
From MySQL documentation
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE col_name=expr, ... ]
So INSERT INTO yourMasterTable SELECT ... FROM yourTempTable ON DUPLICATE KEY UPDATE ... will handle your needs.
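A sketch with placeholder table and column names; note this works only because the master table's PRIMARY KEY (or a UNIQUE index) is what detects the duplicates, which is exactly what "leave the decision to the master table" means:

INSERT INTO masterTable (id, col1, col2)
SELECT id, col1, col2
FROM tempTable
ON DUPLICATE KEY UPDATE
    col1 = VALUES(col1),
    col2 = VALUES(col2);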
I have two MySQL databases with identical table structure, each populated with several thousand records. I need to merge the two into a single database but I can't import one into the other because of duplicate IDs. It's a relational database with many linked tables (fields point to other table record IDs).
Edit: The goal is to have data from both databases in one final database, without overwriting data, and with foreign keys updated to match the new record IDs.
I'm not sure how to go about merging the two databases. I could write a script, I suppose, but there are many tables and it would take a while. Has anyone else encountered this problem, and what is the best way to go about it?
Just ignore the duplicates. The first time a key is inserted, the row goes in; the second time, it is ignored.
INSERT IGNORE INTO myTable SELECT * FROM myOtherTable;
See the full INSERT syntax in the MySQL manual.
The trick was to increment the IDs in one database by 1000 (or any offset large enough that they won't overlap the IDs already in the target database), then import it (sketched below).
Thanks for everyone's answers.
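A sketch of that offset trick, with placeholder database and table names; 1000 stands in for any offset larger than the highest ID in the target:

-- Shift the IDs in the source database clear of the target's range.
UPDATE source_db.mytable SET id = id + 1000;
-- (Any foreign key columns pointing at these IDs need the same offset.)
-- Then import into the target as usual.
INSERT INTO target_db.mytable SELECT * FROM source_db.mytable;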
Are the duplicate IDs supposed to correspond to each other? You could create a new table with an auto-increment field and save the existing keys as two columns.
That would just be a "bulk copy", though. If there is some underlying relationship, it would dictate how to combine the data.
If you have two tables A1 and A2 and you want to merge them into AA, you can do this:
INSERT INTO aa SELECT * FROM A1;
INSERT INTO aa SELECT * FROM A2
ON DUPLICATE KEY UPDATE
    aa.nonkeyfield1 = a2.nonkeyfield1,
    aa.nonkeyfield2 = a2.nonkeyfield2, ...;
This will overwrite the fields of duplicate-key rows with A2's data.
A slightly slower method with simpler syntax is:
INSERT INTO aa SELECT * FROM A1;
REPLACE INTO aa SELECT * FROM A2;
This will do the same thing, except that instead of updating duplicate rows it first deletes the row that came from A1 and then inserts the row from A2.
If you want to merge a whole database with foreign keys, this will not work, because it will break the links between tables.
If you have a whole database and you do not want to overwrite data:
I'd import the first database as normal into database A.
Import the second database into a database B.
Set all foreign keys to ON UPDATE CASCADE.
Double-check this.
Now run the following statement on all tables on database B.
SELECT @increment := MAX(pk) FROM A.table1;
UPDATE B.table1 SET pk = pk + @increment WHERE pk IS NOT NULL
ORDER BY pk DESC;
(The WHERE clause is there to stop MySQL from giving an error in strict mode.)
If you write a script with those two lines per table in your database, you can then insert all tables into database AA. Remember to disable foreign key checks during the update with:
SET foreign_key_checks = 0;
... do lots of inserts ...
SET foreign_key_checks = 1;
Good luck.
Create a new database table with an auto-incremented primary key as the first column. Then add the column names from your databases and import each one. Finally, drop the old primary key field and rename the new one to match your primary key's name.
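A sketch of that approach with hypothetical table and column names:

-- New table with a fresh auto-increment key alongside the old one.
CREATE TABLE merged (
    new_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    old_id INT,
    name VARCHAR(255)
);
-- Import both sources, keeping the old keys in old_id for reference.
INSERT INTO merged (old_id, name) SELECT id, name FROM db1.mytable;
INSERT INTO merged (old_id, name) SELECT id, name FROM db2.mytable;
-- Drop the old key field and rename the new one to match.
ALTER TABLE merged DROP COLUMN old_id;
ALTER TABLE merged CHANGE new_id id INT NOT NULL AUTO_INCREMENT;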