MySQL losing 11 records on insert - mysql

I download an XML file containing 1048 records, and then I successfully create a table ($today) in my DB and load the XML data into that MySQL table.
I then run a second script which contains this query:
INSERT INTO t1 (modelNumber, salePrice)
SELECT modelNumber, salePrice
FROM `'.$today.'`
ON DUPLICATE KEY UPDATE
    t1.modelNumber = `'.$today.'`.modelNumber,
    t1.salePrice   = `'.$today.'`.salePrice
It works, but I'm losing 11 records: the total count in t1 is 1037, while the $today table has exactly the number of records contained in the XML file (1048).
How can I correct this problem?

Run some queries on the $today table to find your 11 duplicates.
The ON DUPLICATE KEY clause turns the insert of those 11 rows into updates, so they never show up as new records.
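For example, a quick way to list them (a sketch, assuming modelNumber is the column the unique key on t1 is built on; $today is interpolated exactly as in your script):
SELECT modelNumber, COUNT(*) AS dupes
FROM `'.$today.'`
GROUP BY modelNumber
HAVING COUNT(*) > 1;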

If there is a duplicate key in your file, you update the old row.
ON DUPLICATE KEY UPDATE
means that if the insert doesn't work because of a duplicate key, you get the update specified after that clause instead.
There are probably 11 entries that are duplicate keys, and they update rather than insert. I would change it to this (a bit of a hack, but the quickest way I can think of, without any more info, to find the culprits):
INSERT INTO t1 (modelNumber, salePrice)
SELECT modelNumber, salePrice
FROM `'.$today.'`
ON DUPLICATE KEY UPDATE
    t1.modelNumber = `'.$today.'`.modelNumber,
    t1.salePrice   = '999999999'
Then you can look for entries with that salePrice of 999999999, and you at least know which duplicate keys (or even whether there are any) you need to look for in your XML.
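For instance (a sketch, using the same flag value as above):
SELECT modelNumber, salePrice
FROM t1
WHERE salePrice = '999999999';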

Related

duplicate entry for key primary errors while replicating a table

I have some CRM data that lives in an MS SQL Server that I must move to MySQL daily. I've got some python-pandas read_sql() and to_sql() scripts that move the tables. I'm running into duplicate primary key errors after doing some upsert logic. I have the GUID from CRM as the primary key for the table; in MySQL it is a varchar(64) datatype. I'm unsure what's triggering the duplicate warning.
mysql_table:
GUID-PK      Name   favorite_number  modifiedon
00000B9D...  Ben    10               '2017-01-01'
000A82A5...  Harry  9                '2017-05-15'
000A9896...  Fred   5                '2017-12-19'
(the GUIDs are longer; I'm shortening them for the example)
I pull all the new records from MS SQL into a temporary table in MySQL based on modified dates that are greater than my current table. Some of these could be new records some could be records that already exist in my current table but have been updated.
mysql_temp_table:
GUID-PK      Name   favorite_number  modifiedon
00000B9D...  Ben    15               '2018-01-01'
000A82BB...  John   3                '2018-03-15'
000A4455...  Ray    13               '2018-04-01'
I want to replace any modified records, straight up, so I delete all the common records from the mysql_table. In this example, I want to remove Ben from the mysql_table, so that it can be replaced by Ben from the mysql_temp_table:
DELETE FROM mysql_table WHERE GUID-PK IN (SELECT GUID-PK FROM mysql_temp_table)
Then I want to just move the whole temp table into the replicated table with:
INSERT INTO mysql_table (SELECT * FROM temp_table)
But that gives me an error:
"Duplicate entry '0' for key 'PRIMARY'") [SQL: 'INSERT INTO mysql_table SELECT * FROM mysql_temp_table'
I can see that many of the GUIDs start with '000', and it seems like this is being interpreted as '0'. Shouldn't this be caught by the DELETE ... IN statement above? I'm stuck on where to go next. Thanks in advance.
I suspect that the DELETE statement is failing with an error.
That's because the dash character isn't a valid character in an identifier. If the column name is really GUID-PK, then that needs to be properly escaped in the SQL text, either by enclosing it in backticks (the normal pattern in MySQL), or if sql_mode includes ANSI_QUOTES, then the identifiers can be enclosed in double quotes.
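A sketch of the DELETE with the identifier escaped (assuming the column really is named GUID-PK):
DELETE FROM mysql_table
WHERE `GUID-PK` IN (SELECT `GUID-PK` FROM mysql_temp_table);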
Another possibility is that temp_table does not have a PRIMARY or UNIQUE KEY constraint defined on the GUID-PK column, and there are multiple rows in temp_table that have the same value for GUID-PK, leading to a duplicate key exception on the INSERT into mysql_table.
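You can check for that with something like this (a sketch):
SELECT `GUID-PK`, COUNT(*) AS cnt
FROM mysql_temp_table
GROUP BY `GUID-PK`
HAVING COUNT(*) > 1;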
Another guess (since we're not seeing the definition of the temp_table) is that the columns are in a different order, such that SELECT * FROM temp_table isn't returning columns in the order expected in mysql_table. I'd address that issue by explicitly listing the columns, of both the target table for the INSERT, and in the SELECT list.
Given that the GUID-PK column is a unique key, I would tend to avoid two separate statements (a DELETE followed by an INSERT) and just use an INSERT ... ON DUPLICATE KEY UPDATE statement:
INSERT INTO mysql_table (`guid-pk`, `name`, `favorite_number`, `modifiedon` )
SELECT s.`guid-pk`, s.`name`, s.`favorite_number`, s.`modifiedon`
FROM temp_table s
ORDER
BY s.`guid-pk`
ON DUPLICATE KEY
UPDATE `name` = VALUES( `name` )
, `favorite_number` = VALUES( `favorite_number` )
, `modifiedon` = VALUES( `modifiedon` )
You may have AUTOCOMMIT disabled.
If you are performing both actions in the same TRANSACTION and do not have AUTOCOMMIT enabled, your second READ COMMITTED statement will fail. INSERTs, UPDATEs, and DELETEs are executed using the READ COMMITTED isolation level.
Your INSERT is being performed on the data set as it appeared before your DELETE. You need to either:
A. Explicitly COMMIT your DELETE within the TRANSACTION (see the sketch below)
or
B. Split the two statements into individual TRANSACTIONs
or
C. Re-enable AUTOCOMMIT
If this is not the case, you will need to investigate the data sets for your DELETE and INSERT statements, because a DELETE will not just fail silently.
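A minimal sketch of option A, committing the DELETE before running the INSERT (using the table and column names from the question):
START TRANSACTION;
DELETE FROM mysql_table
WHERE `GUID-PK` IN (SELECT `GUID-PK` FROM mysql_temp_table);
COMMIT;
START TRANSACTION;
INSERT INTO mysql_table SELECT * FROM mysql_temp_table;
COMMIT;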

MS Access Unique Constraint with more than 10 fields

I know that this is a limitation of Access, but does anyone know of a good workaround that would allow me to avoid duplicate records in a situation where my table has 30 fields and I don't want any duplicate combinations of those 30 fields?
I'm basically batch loading financial data on a regular basis, and I only want to add records if some of the information for a particular project id has changed since the last load. When I run the append query that adds new records, I was hoping to use the constraint to block the inserts, but now I'm trying to figure out another solution.
To only insert the non-duplicating records, you need to filter out the duplicate ones in the query with a WHERE NOT EXISTS subquery, like this:
INSERT INTO tTgt (project_id, field1, ..., field30)
SELECT project_id, field1, ..., field30
FROM tSrc
WHERE NOT EXISTS (
SELECT project_id
FROM tTgt
WHERE tTgt.project_id = tSrc.project_id
AND tTgt.field1 = tSrc.field1
...
AND tTgt.field30 = tSrc.field30
)
The subquery will be rather lengthy, but in the end it's the same work that an index would have to do.

Which one faster on Check and Skip Insert if existing on SQL / MySQL

I have read many articles about this. I want to hear from you.
My problem is:
A table: ID (INT, unique, auto-increment), Title (varchar), Content (text), Keywords (varchar)
My PHP code always inserts new records, but it must not accept duplicate records based on Title or Keywords. So the Title or Keywords column can't be the primary key field. My PHP code needs to check for an existing record and insert around 10-20 records at a time.
So, I check like this:
SELECT * FROM TABLE WHERE TITLE=XXX
And if it returns nothing, then I do the INSERT.
I read some other posts. One suggests:
INSERT IGNORE INTO Table values()
Another suggests:
SELECT COUNT(ID) FROM TABLE
If it returns 0, then do the INSERT.
I don't know which of those approaches is faster.
And I have one more question: what is the difference between these queries, and which is faster:
SELECT COUNT(ID) FROM ..
SELECT COUNT(0) FROM ...
SELECT COUNT(1) FROM ...
SELECT COUNT(*) FROM ...
All of them show me the total number of records in the table, but I don't know whether MySQL treats the number 0 or 1 as my ID field. Even if I do SELECT COUNT(1000), I still get the total record count, while my table only has 4 columns.
I'm using MySQL Workbench; is there any option to test query speed in this app?
I would use the INSERT ... ON DUPLICATE KEY UPDATE command. One important comment from the documentation states: "...if there is a single multiple-column unique index on the table, then the update uses (seems to use) all columns (of the unique index) in the update query."
So if there is a UNIQUE(Title,Keywords) constraint on the table in the example, then you would use:
INSERT INTO table (Title,Content,Keywords) VALUES ('blah_title','blah_content','blah_keywords')
ON DUPLICATE KEY UPDATE Content='blah_content';
It should work, and it is one query to the database.
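A sketch of adding that constraint first, assuming a hypothetical table name articles (the question doesn't name the table) and that the combined Title and Keywords lengths fit within MySQL's index size limit:
ALTER TABLE articles ADD UNIQUE KEY uq_title_keywords (Title, Keywords);
INSERT INTO articles (Title, Content, Keywords)
VALUES ('blah_title', 'blah_content', 'blah_keywords')
ON DUPLICATE KEY UPDATE Content = 'blah_content';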
SELECT COUNT(*) FROM ... is faster than SELECT COUNT(ID) FROM .... Or build something like this:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=3;

Which is a faster way for checking for duplicate entries, then creating a new entry?

I want to check if an entry exists; if it does I'll increment its count field by 1, and if it doesn't I'll create a new entry with its count initialized to 1. Simple enough, right? It seems so; however, I've stumbled upon a lot of ways to do this and I'm not sure which way is the fastest.
1) I could use this to check for an existing entry, then depending, either update or create:
if(mysql_num_rows(mysql_query("SELECT userid FROM plus_signup WHERE userid = '$userid'")))
2) Or should I use WHERE EXISTS?
SELECT DISTINCT store_type FROM stores
WHERE EXISTS (SELECT * FROM cities_stores
WHERE cities_stores.store_type = stores.store_type);
3) Or use this to insert an entry, then if it exists, update it:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
4) Or perhaps I can set the id column as a unique key then just wait to see if there's a duplicate error on entry? Then I could update that entry instead.
I'll have around 1 million entries to search through; the primary key is currently a bigint. All I want to match when searching through the entries is just the bigint id field; no two entries have the same id at the moment and I'd like to keep it that way.
Edit: Oh shoot, I created this in the wrong section. I meant to put it into serverfault.
I believe it's 3.
Set an INDEX or a UNIQUE constraint and then use the syntax of number 3.
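A sketch of that, assuming the plus_signup table from the question, a hypothetical hit_count column for the counter, and that userid is not already declared unique:
ALTER TABLE plus_signup ADD UNIQUE KEY uq_userid (userid);
INSERT INTO plus_signup (userid, hit_count) VALUES (12345, 1)
ON DUPLICATE KEY UPDATE hit_count = hit_count + 1;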
It depends on which case will happen more often.
If it is more likely that the record does not exist, I'd go for an INSERT IGNORE INTO, checking the affected rows afterwards; if that is 0, the record already exists, so an UPDATE is issued.
Otherwise I'd go for INSERT INTO ... ON DUPLICATE KEY UPDATE.
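A sketch of the INSERT IGNORE variant, again assuming the plus_signup table and a hypothetical hit_count column; the application code checks the affected-row count from the first statement and only runs the UPDATE when it is 0:
INSERT IGNORE INTO plus_signup (userid, hit_count) VALUES (12345, 1);
-- run this only when the INSERT IGNORE reported 0 affected rows
UPDATE plus_signup SET hit_count = hit_count + 1 WHERE userid = 12345;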

Multiple set and where clauses in Update query in mysql

I don't think this is possible, as I couldn't find anything, but I thought I would check on here in case I am not searching for the correct thing.
I have a settings table in my database which has two columns. The first column is the setting name and the second column is the value.
I need to update all of these at the same time. I wanted to see if there was a way to update these values at the same time in one query, like the following:
UPDATE table SET col1='setting name' WHERE col2='1 value' AND SET col1='another name' WHERE col2='another value';
I know the above isn't correct SQL, but this is the sort of thing that I would like to do, so I was wondering whether there is another way this can be done instead of having to perform separate SQL queries for each setting I want to update.
Thanks for your help.
You can use INSERT INTO .. ON DUPLICATE KEY UPDATE to update multiple rows with different values.
You do need a unique index (like a primary key) to make the "duplicate key" part work.
Example:
INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE b = VALUES(b), c = VALUES(c);
-- VALUES(x) points back to the value you gave for field x
-- so for b it is 2 and 5, for c it is 3 and 6 for rows 1 and 4 respectively (if you assume that a is your unique key field)
If you have a specific case I can give you the exact query.
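Applied to the settings table from the question, a sketch assuming a table named settings whose name column (the setting name) carries the unique key and whose value column holds the value:
INSERT INTO settings (`name`, `value`)
VALUES ('setting1', 'value1'), ('setting2', 'value2')
ON DUPLICATE KEY UPDATE `value` = VALUES(`value`);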
UPDATE table
SET col2 = CASE col1
             WHEN 'setting1' THEN 'value1'
             WHEN 'setting2' THEN 'value2'
             ELSE col2
           END
WHERE col1 IN ('setting1', 'setting2');
I decided to use multiple queries all in one go, so the code goes like this:
UPDATE table SET col2='value1' WHERE col1='setting1';
UPDATE table SET col2='value2' WHERE col1='setting2';
etc.
I've just done a test where I inserted 1500 records into the database. Doing it without starting a DB transaction took 35 seconds. I blanked the database and did it again, but this time started a transaction first and committed it once the 1500th record was inserted, and it took 1 second, so it definitely seems like doing it in a DB transaction is the way to go.
You need to run separate SQL queries and make use of transactions if you want them to run atomically.
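A minimal sketch of that, again assuming a settings table with name and value columns:
START TRANSACTION;
UPDATE settings SET `value` = 'value1' WHERE `name` = 'setting1';
UPDATE settings SET `value` = 'value2' WHERE `name` = 'setting2';
COMMIT;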
UPDATE table SET col1=if(col2='1 value','setting name','another name') WHERE col2='1 value' OR col2='another value'
@Frits Van Campen,
The INSERT INTO ... ON DUPLICATE KEY UPDATE works for me.
I have been doing this for years when I want to update more than a thousand records from an Excel import.
The only problem with this trick is that when there is no record to update, instead of ignoring the row, this method inserts a record, and in some instances that is a problem. Then I need to add another field, and after the import I have to delete all the records that were inserted instead of updated.