Suppose I have a MySQL table with three fields: key, value1, value2
I want to load data for two fields (key,value1) from file inserts.txt.
Content of inserts.txt:
1;2
3;4
with:
LOAD DATA LOCAL INFILE 'inserts.txt'
REPLACE
INTO TABLE `test_insert_timestamp`
FIELDS TERMINATED BY ';'
(`key`, value1);
But when a REPLACE occurs, I want to leave value2 unchanged.
How could I achieve this?
REPLACE follows a delete-and-reinsert algorithm. As the documentation puts it:
MySQL uses the following algorithm for REPLACE (and LOAD DATA ... REPLACE):
1. Try to insert the new row into the table
2. While the insertion fails because a duplicate-key error occurs for a primary key or unique index:
   a. Delete from the table the conflicting row that has the duplicate key value
   b. Try again to insert the new row into the table
(https://dev.mysql.com/doc/refman/5.7/en/replace.html)
So you can't keep a value from a row that is going to be deleted.
What you want is to emulate ON DUPLICATE KEY UPDATE logic.
You can't do that within a single LOAD DATA statement. What you have to do is load your data into a temporary table first, then run an INSERT from your temporary table into your destination table, where you will be able to use the ON DUPLICATE KEY UPDATE feature.
The whole process is fully detailed in the most upvoted answer to this question: MySQL LOAD DATA INFILE with ON DUPLICATE KEY UPDATE
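A minimal sketch of that two-step approach, assuming the table from the question (`test_insert_timestamp` with columns `key`, value1, value2); the staging table name and column types are made up:

```sql
-- Stage the file in a temporary table (name and types are illustrative)
CREATE TEMPORARY TABLE staging (`key` INT PRIMARY KEY, value1 INT);

LOAD DATA LOCAL INFILE 'inserts.txt'
INTO TABLE staging
FIELDS TERMINATED BY ';'
(`key`, value1);

-- Upsert: on a duplicate key, update value1 only and leave value2 untouched
INSERT INTO test_insert_timestamp (`key`, value1)
SELECT `key`, value1 FROM staging
ON DUPLICATE KEY UPDATE value1 = VALUES(value1);

DROP TEMPORARY TABLE staging;
```

Unlike REPLACE, the ON DUPLICATE KEY UPDATE clause turns the failed insert into an in-place UPDATE, so any column not listed in the clause keeps its existing value.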
Related
I have a CSV file that I am loading into my database. I want the previous data in the table to be overwritten and not appended every time I load my CSV file. Is it possible to do this within a single query?
Is the only solution to TRUNCATE the table and then utilize the LOAD DATA INFILE queries?
Assuming you have a primary key, you can use REPLACE. As the documentation states:
The REPLACE and IGNORE modifiers control handling of input rows that
duplicate existing rows on unique key values:
If you specify REPLACE, input rows replace existing rows. In other words, rows that have the same value for a primary key or unique index
as an existing row. See Section 13.2.9, “REPLACE Statement”.
However, if you want to replace the existing table, then truncate the table first and then load.
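If "overwritten" really means the old contents should be gone entirely, the plain two-statement version looks like this (file and table names are placeholders):

```sql
-- Discard all existing rows, then load the file fresh
TRUNCATE TABLE my_table;

LOAD DATA INFILE '/path/to/data.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ','
IGNORE 1 LINES;
```

Note that TRUNCATE TABLE causes an implicit commit in MySQL, so the two statements cannot be wrapped in one atomic transaction. REPLACE alone would not remove rows that are absent from the new file.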
I am googling around and reading posts on this site, but it's still not clear what I should do to insert only unique records.
I basically have a giant file with a single column of data in it that needs to be imported into a db table, where several thousand of the records from my text file already exist.
There are no ids in the text file I need to import, and from what I am reading, INSERT IGNORE looks to be solving for duplicate ids. I want new ids created for any of the new records added, but obviously I can't have duplicates.
This would ideally be done with LOAD DATA INFILE... but I'm really not sure:
INSERT IGNORE?
ON DUPLICATE KEY UPDATE?
The easiest way to achieve what you want is to read the entire file into a new table using LOAD DATA, and then insert all non-duplicates into the table which already exists.
LOAD DATA LOCAL INFILE 'large_file.txt' INTO TABLE new_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
(name);
This assumes that the first line of your giant file does not contain a column header. If it does, then you can add IGNORE 1 LINES to the statement above.
You can now INSERT the names in new_table into your already-existing table using INSERT IGNORE:
INSERT IGNORE INTO your_table_archive (name)
SELECT name
FROM new_table;
Any duplicate records which MySQL encounters will not be inserted, and instead the original ones will be retained. And you can drop new_table once you have what you need from it.
Best way to illustrate my problem is with a quick example.
Imagine the following file, loaded into MySQL using a LOAD DATA INFILE ... INTO TABLE statement:
Color,Shape
red,square
blue,triangle
green,circle
(Note: My primary key = Color. Unique Key = Shape)
No matter how many times I use the command I still (correctly) just have 3 records, as it doesn't allow duplicate records.
However, if I amend record 3 within MySQL and change it from circle to diamond and re-run the Load Data command I end up with 4 records.
Color,Shape
red,square
blue,triangle
green,diamond
green,circle
I now have 2 x green values in my Primary Key field. If I try to edit one of them I get a "Duplicate Entry for Primary Key Field" error.
I would have expected the Load Data Infile command to skip the record as it creates a duplicate value in the Primary Key field. Instead it seems to only ignore it if the entire record is a duplicate. It doesn't seem to validate fields to ensure that the Primary Key field is always unique.
Why is it failing to do this?
Here's my query for loading mysql table using csv file.
LOAD DATA LOCAL INFILE 'table.csv' REPLACE INTO TABLE table1
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\'' LINES TERMINATED BY 'XXX' IGNORE 1 LINES
SET date_modified = CURRENT_TIMESTAMP;
Suppose my CSV contains 500 records with 15 columns. I changed three rows and terminated them with 'XXX'. I now want to update the MySQL table with this file. My primary key is an auto-incremented value. When I run this query, all 500 rows are getting updated with old data, and the rows I changed are getting added as new ones. I don't want the new ones; I want my table to be replaced with the CSV as-is. I tried changing my primary key to non-AI, and it still didn't work. Any pointers, please? Thanks.
I am making some assumptions here.
1) You don't have the autonumber value in your file.
Since your primary key is not in your file, MySQL will not be able to match rows. An autonumber primary key is an artificial key, so it is not part of the data; MySQL adds this artificial primary key when the row is inserted.
Let's assume your file contained some unique identifier; let's call it Identification_Number. If this number is both in the file and used as the primary key in your table, then MySQL will be able to identify the rows from the file and match them to the rows in the table.
While a lot of people only use autonumbers in a database, I always check whether there is a natural key in the data. If I identify one, I do some performance testing with this natural key in a table, then decide on a key based on the performance metrics of both.
Hopefully I did not get your question wrong but I suspect this might be the case.
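A sketch of the natural-key approach described above, using the Identification_Number name from the answer (the table layout and file name are placeholders):

```sql
-- The primary key comes from the data itself, so REPLACE can
-- match rows in the file to rows already in the table
CREATE TABLE readings (
    Identification_Number INT PRIMARY KEY,
    col1 VARCHAR(50),
    col2 VARCHAR(50)
);

LOAD DATA LOCAL INFILE 'readings.csv'
REPLACE INTO TABLE readings
FIELDS TERMINATED BY ','
IGNORE 1 LINES
(Identification_Number, col1, col2);
```

With an auto-increment key instead, every file row gets a fresh id on load, so nothing ever collides and REPLACE degenerates into a plain append, which is exactly the behavior described in the question.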
I'm fetching data from a text file or log periodically, and it gets inserted into the database every time it is fetched. Is there a way in MySQL to insert only when the log files are updated, or do I have to handle that in the programming language? In other words, is there a type of insert that, when it sees a duplicate primary key, doesn't raise a "Duplicate entry" error but just ignores the row?
Put the fetch in a logrotate postrotate script, and fetch from the just rotated log.
Ignoring duplicates can be done with either INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE syntax (which will either silently skip the lines causing a duplicate unique key, or give you the possibility to alter some values in the existing row).
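Both forms, sketched on a hypothetical log_entries table where entry_id is the primary key (table and column names are illustrative):

```sql
-- Variant 1: silently skip rows whose primary/unique key already exists
INSERT IGNORE INTO log_entries (entry_id, message)
VALUES (42, 'disk full');

-- Variant 2: on a duplicate key, update chosen columns instead of failing
INSERT INTO log_entries (entry_id, message)
VALUES (42, 'disk full')
ON DUPLICATE KEY UPDATE message = VALUES(message);
```

Variant 1 leaves the existing row untouched; variant 2 lets you refresh selected columns from the incoming values.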