I have read a lot before asking, but since I am a newbie with MySQL the answers were a bit confusing to me, so let me put my own question and see if someone can help.
I have a table called "parse" and this table has data I need to throw into another table called "updates".
Ok, so "updates" is created, but now I have updated the data in "parse" (the source of the data), and both tables have a unique ID column called "link".
I was planning to run an INSERT ... ON DUPLICATE KEY UPDATE but am not sure how to do it. Basically, the column "link" is the unique ID that must stay the same in both tables, while the rest of the columns in "updates", such as price, should be updated.
Can you guide me on how to do this?
Thanks
The appropriate statement would have this form:
INSERT INTO destination (
-- List columns in `destination` here.
)
SELECT
-- List columns from `source` here. Ensure they correspond to the columns listed for `destination` exactly otherwise you'll get an error (or worse: unintentional data corruption).
FROM
source
ON DUPLICATE KEY UPDATE
-- List both `destination` and `source` columns again here, excluding immutable and key columns. Refer to the source column via `VALUES()`.
In your case, something like this:
INSERT INTO `updates` (
link, /* PK */
a,
b,
c,
d
)
SELECT
link,
a,
b,
c,
d
FROM
`parse`
ON DUPLICATE KEY UPDATE
a = VALUES(a),
b = VALUES(b),
c = VALUES(c),
d = VALUES(d);
I'm a beginner at SQL, so this is probably a pretty newbie question, but I can't seem to get my head straight on it. I have a pair of tables called MATCH and SEGMENT.
MATCH.id int(11) ai pk
MATCH.name varchar(45)
etc.
SEGMENT.id int(11) ai pk
SEGMENT.name varchar(45)
etc.
Each row in MATCH can have one or more SEGMENT rows associated with it. The name in MATCH is unique on each row. Right now I do an inner join on the name fields to figure out which segments go with which match. I want to copy the tables to a new set of tables and set up a foreign key in SEGMENT that contains the unique ID from the MATCH row both to improve performance and to fix some problems where the names aren't always precisely the same (and they should be).
Is there a way to do a single INSERT or UPDATE statement that will do the name comparisons and add the foreign key to each row in the SEGMENT table - at least for the rows where the names are precisely the same? (For the ones that don't match, I may have to write a SQL function to "clean" the name by removing extra blanks and special characters before comparing)
Thanks for any help anyone can give me!
Here's one way I would consider doing it: add the FK column, add the constraint definition, then populate the column with an UPDATE statement using a correlated subquery:
ALTER TABLE `SEGMENT` ADD COLUMN match_id INT(11) COMMENT 'FK ref MATCH.id' ;
ALTER TABLE `SEGMENT` ADD CONSTRAINT fk_SEGMENT_MATCH
FOREIGN KEY (match_id) REFERENCES `MATCH`(id) ;
UPDATE `SEGMENT` s
SET s.match_id = (SELECT m.id
FROM `MATCH` m
WHERE m.name = s.name) ;
A correlated subquery (like in the example UPDATE statement above) usually isn't the most efficient approach to getting a column populated. But it seems a lot of people think it's easier to understand than the (usually) more efficient alternative, an UPDATE using a JOIN operation like this:
UPDATE `SEGMENT` s
JOIN `MATCH` m
ON m.name = s.name
SET s.match_id = m.id ;
Add an ID field to your MATCH table and populate it.
Then add a column MATCHID (which will be your foreign key) to your SEGMENT table. Note you won't be able to declare it as a foreign key until you have mapped the records correctly.
Use the following query to update the foreign keys:
UPDATE SEGMENT A
INNER JOIN `MATCH` B
    ON A.NAME = B.NAME
SET A.MATCHID = B.ID ;
I'm trying to write a query to update a FK column in table B using the primary key column in table A. If there are duplicate entries in table A, I'd like to use the max id of the duplicate entry to insert into table B.
I have the first part of the query written but I'm unsure about the duplicate entry part.
Here's what I have so far...
UPDATE calliope_media.videos v
JOIN calliope_media.video_ingress_queue viq ON v.provider_unique_id = viq.provider_unique_id
SET v.video_ingress_id = viq.id;
This is how your query should look.
UPDATE B
SET B.the_column_ID = (SELECT MAX(A.some_ID)
FROM A
WHERE A.matching_value = B.matching_value)
This is the overall structure. I haven't adapted to your specific requirements, since I don't fully understand them. But this should get you back on track.
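Adapting that pattern to the tables from the question (assuming provider_unique_id is the matching value and you want the highest viq.id when the queue holds duplicates), it might look like the sketch below. Note that rows with no match in the queue get NULL; wrap the assignment in a WHERE EXISTS if you want to leave those untouched.

```sql
UPDATE calliope_media.videos v
SET v.video_ingress_id = (SELECT MAX(viq.id)          -- pick the highest id among duplicates
                          FROM calliope_media.video_ingress_queue viq
                          WHERE viq.provider_unique_id = v.provider_unique_id);
```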
I have this Query:
INSERT INTO `items` (`id`,`image`)
VALUES(112,'asdf.jpg'),(113,'foobar.png')
ON DUPLICATE KEY UPDATE
`id` = VALUES(`id`),
`image` = IF(image_uploaded = 0, VALUES(`image`),image);
The problem: it runs, but not the way I want.
What I want: the image should only be updated if the field "image_uploaded" is set to 0.
Any ideas?
The background: I have a DB table with data. Each night a cronjob calls an API function to fetch new data from another DB and write it to my table. This function returns all items from the second DB, so it currently just overwrites my existing data. My application, however, allows changes to the data I got from the 2nd DB, and those changes are then clobbered in my own table. So the problem is: I need the ability to edit data via my app AND update data via the API without colliding. The user may change the "image", but all other columns should be updated from the 2nd DB. The image should only be overwritten if it wasn't uploaded manually.
Without playing around with ON DUPLICATE KEY... I'm not sure it can handle this situation, so I'd work around it with another (temporary) table.
It has the same structure as your target table, plus a column (sid in the example) that indicates whether the entry already exists in your target table.
CREATE TEMPORARY TABLE tmp_whatever (
sid int,
id int, image varchar(50)
);
Now we first insert the data you want to put into your target_table into the newly created table, using COALESCE() and a LEFT JOIN to check whether each entry already exists. I'm assuming here that id is your primary key.
INSERT INTO tmp_whatever (sid, id, image)
SELECT
COALESCE(t.id, 0),
id, image
FROM (
SELECT 112 AS id,'asdf.jpg' AS image
UNION ALL
SELECT 113,'foobar.png'
) your_values v
LEFT JOIN your_target_table t ON v.id = t.id;
Then we update the target_table...
UPDATE your_target_table t INNER JOIN tmp_whatever w ON t.id = w.id AND w.sid <> 0
SET t.image = w.image
WHERE t.image_uploaded = 0;
And finally we insert the rows not already existing...
INSERT INTO your_target_table (id, image)
SELECT
id, image
FROM tmp_whatever
WHERE sid = 0;
While I was writing this down, it occurred to me that I might have made wrong assumptions about what your problem is. "It's working, but not as I want" is really not the way to ask a question or describe a problem. I answered because I'm having a good day :) (but it's also the reason you're getting downvotes, btw)
Anyway, another cause of "not as I want" could be that you're missing a unique index on your table. ON DUPLICATE KEY UPDATE fires when an insert would violate any unique index, the primary key included, so make sure the column you expect to collide on is actually covered by one.
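If, say, id is the column that's supposed to collide but is neither the primary key nor covered by a unique index yet, it can be added like this (uq_items_id is just an arbitrary index name):

```sql
ALTER TABLE `items` ADD UNIQUE KEY uq_items_id (`id`);
```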
I've got a MySQL table that has a lot of entries. It's got a unique key defined as (state, source), so there are no duplicates for that combination of columns. However, now I am realizing that much of the state data is not entered consistently. For example, in some rows it is entered as "CA" and in others it might be spelled out as "California."
I'd like to update all the entries that say "California" to be "CA" and if it creates a conflict in the unique key, drop the row. How can I do that?
You may be better off dumping your data and using an external tool like Google Refine to clean it up. Look at using foreign keys in the future to avoid these issues.
I don't think you can do this in one SQL statement. And if you have foreign key relationships from other tables to the one you are trying to clean-up then you definitely do not want to do this in one step (even if you could).
CREATE TABLE state_mappings (
`old` VARCHAR(64) NOT NULL,
`new` VARCHAR(64) NOT NULL
);
INSERT INTO state_mappings VALUES ('California', 'CA'), ...;
INSERT IGNORE INTO MyTable (state, source)
SELECT sm.`new`, s.source FROM MyTable s JOIN state_mappings sm
ON s.state = sm.`old`;
-- Update tables with foreign keys here
DELETE FROM MyTable WHERE state IN (SELECT DISTINCT `old` FROM state_mappings);
DROP TABLE state_mappings;
I'm no SQL pro, so these statements can probably be optimized, but you get the gist.
I'm trying to remove duplicates (sometimes even triplicates, unfortunately!) from a MySQL table. My issue is that the only unique data available is the primary key, so in order to identify duplicates you have to take all the other columns into account.
I've managed to identify all records that have duplicates and copied them, along with their duplicates (including their primary keys), to the table temp. The source table is called translation and it has an integer primary key named TranslationID. How do I move on from here? Thanks!
edit Available columns are:
TranslationID
LanguageID
Translation
Etymology
Type
Source
Comments
WordID
Latest
DateCreated
AuthorID
Gender
Phonetic
NamespaceID
Index
EnforcedOwner
The duplicate issue concerns the rows whose Latest column is set to 1.
edit #2 Thank you, everyone for your time! I've solved the problem by using WouterH's answer, resulting in the following query:
DELETE from translation USING translation, translation as translationTemp
WHERE translation.Latest = 1
AND (translation.TranslationID > translationTemp.TranslationID)
AND (translation.LanguageID = translationTemp.LanguageID)
AND (translation.Translation = translationTemp.Translation)
AND (translation.Etymology = translationTemp.Etymology)
AND (translation.Type = translationTemp.Type)
AND (translation.Source = translationTemp.Source)
AND (translation.Comments = translationTemp.Comments)
AND (translation.WordID = translationTemp.WordID)
AND (translation.Latest = translationTemp.Latest)
AND (translation.AuthorID = translationTemp.AuthorID)
AND (translation.NamespaceID = translationTemp.NamespaceID)
You can remove duplicates without a temporary table or subquery: delete every row that has the same data as another row but a higher TranslationID, so the copy with the lowest ID survives. (Comparing with a plain "not equal" instead of ">" would delete every copy, including the one you want to keep.)
DELETE FROM translation USING translation, translation AS translationTemp
WHERE (translation.TranslationID > translationTemp.TranslationID)
AND (translation.LanguageID = translationTemp.LanguageID)
AND (translation.Translation = translationTemp.Translation)
AND (translation.Etymology = translationTemp.Etymology)
AND ... -- compare the remaining data columns the same way
Create a SELECT statement with your current SELECT as a sub-select, so that it returns a single column of the IDs that should be removed. Then use that SELECT in a DELETE FROM statement.
Example (pseudo code):
SELECT1 = SELECT ... AS temp; # the table you have right now
SELECT2 = SELECT TranslationID FROM (SELECT1)
Final query will look like this:
DELETE FROM table_name WHERE TranslationID IN (SELECT2);
You just need to insert the SELECT with the sub-select into the final query. The sub-select wrapper matters: MySQL won't let you DELETE from a table that the same statement selects from directly, but it accepts the query once the select is wrapped in a derived table.
To stop duplicates in the future, note that switching the storage engine, e.g.
ALTER TABLE table_name ENGINE=InnoDB;
does not by itself prevent them. What prevents them is a unique constraint over the columns that define a duplicate; TranslationID is already the primary key, so it can't be the column that's duplicated.
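For example (the column subset below is only a guess based on the column list in the question; pick whichever columns really define a duplicate for your data):

```sql
ALTER TABLE translation
  ADD UNIQUE KEY uq_no_duplicates (LanguageID, WordID, NamespaceID, Latest);
```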
If the duplicates/triplicates are identical except for the primary key, then you can select all records from temp that are identical to another record except for having a larger primary key; that gives you every copy except the one with the minimum key in each duplicate group. You can then delete these records from translation.
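Sketched as SQL, assuming temp has the same columns as translation (only a few of the data columns are shown in the self-join; the rest are compared the same way):

```sql
DELETE FROM translation
WHERE TranslationID IN (
    SELECT t1.TranslationID
    FROM temp t1
    JOIN temp t2
      ON  t1.TranslationID > t2.TranslationID  -- t1 is a copy with a larger key
      AND t1.LanguageID  = t2.LanguageID
      AND t1.Translation = t2.Translation
      AND t1.WordID      = t2.WordID           -- ...and the remaining data columns
);
```

The subquery runs against temp rather than translation, so it avoids MySQL's restriction on selecting from the table being deleted from.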
Instead of identifying the rows that aren't unique, I would try to copy the valid data to a new table, then remove the old one and replace it with this new, cleaned table.
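A sketch of that approach, keeping the lowest TranslationID per group and assuming the column order given in the question (translation_clean is a placeholder name; Index needs backticks because it's a reserved word):

```sql
CREATE TABLE translation_clean LIKE translation;

INSERT INTO translation_clean
SELECT MIN(TranslationID), LanguageID, Translation, Etymology, Type, Source,
       Comments, WordID, Latest, DateCreated, AuthorID, Gender, Phonetic,
       NamespaceID, `Index`, EnforcedOwner
FROM translation
GROUP BY LanguageID, Translation, Etymology, Type, Source, Comments, WordID,
         Latest, DateCreated, AuthorID, Gender, Phonetic, NamespaceID,
         `Index`, EnforcedOwner;

-- swap the tables once the cleaned copy has been verified
RENAME TABLE translation TO translation_old, translation_clean TO translation;
```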
I can see two ways:
Using the DISTINCT keyword in your SQL query (source);
Using a GROUP BY statement on all columns (source).
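Either way, the primary key has to stay out of the select list, since TranslationID differs between the copies. A sketch of the DISTINCT variant, with the column list taken from the question:

```sql
SELECT DISTINCT LanguageID, Translation, Etymology, Type, Source, Comments,
                WordID, Latest, DateCreated, AuthorID, Gender, Phonetic,
                NamespaceID, `Index`, EnforcedOwner
FROM translation;
```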