fixing duplicate fields in mysql table with update statement - mysql

I have inherited a table with a field "sku" which should be unique, but thanks to a failing sku-generating method it is now littered with dozens of duplicates.
I need to quickly fix these duplicates (other parts of the application are failing when encountering these duplicate records) by running an update and appending the record ID to the SKU (which is a valid solution for the time being for this application).
I'm trying to run:
UPDATE
main_product_table
SET sku = CONCAT(sku, '-', CAST(product_id as CHAR) )
WHERE sku IN (
SELECT sku FROM main_product_table
GROUP BY sku
HAVING COUNT(*) > 1
);
But I receive:
You can't specify target table 'main_product_table' for update in FROM clause
Is there a way to accomplish the same thing? Is MySQL complaining because I have main_product_table both in the UPDATE and in the subquery that finds the duplicates?
Thanks!

Try this:
UPDATE
main_product_table
SET sku = CONCAT(sku, '-', CAST(product_id as CHAR) )
WHERE sku IN (
select * from ( SELECT sku FROM main_product_table
GROUP BY sku
HAVING COUNT(*) > 1) as p
);
Wrapping the duplicate-finding query in a derived table (aliased as p) forces MySQL to materialize it, so the target table is no longer read directly by the subquery and the error goes away.
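An equivalent way to write this, as a sketch, is a multi-table UPDATE that joins the table to the aggregated duplicate list (the table aliases t and dup are just illustrative; the derived table is materialized, so the same restriction doesn't apply):
UPDATE main_product_table t
JOIN (
    SELECT sku
    FROM main_product_table
    GROUP BY sku
    HAVING COUNT(*) > 1
) dup ON dup.sku = t.sku
SET t.sku = CONCAT(t.sku, '-', CAST(t.product_id AS CHAR));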

Related

What sql query to use for only deleting duplicate results for wp_comments table?

I need to finish the select query below. The query shows me the count of comments with the same comment_ID. I ultimately just want to delete the duplicates and leave the non-duplicates alone. This is a WordPress database.
screenshot of my current query results
SELECT `comment_ID`, `comment_ID`, count(*) FROM `wp_comments` GROUP BY `comment_ID` HAVING COUNT(*) > 1 ORDER BY `count(*)` ASC
example of 2 entries where I need to delete one of them
First back up your bad table in case you goof something up.
CREATE TABLE wp_comments_bad_backup SELECT * FROM wp_comments;
Do you actually have duplicate records here (duplicate in all columns)? If so, try this:
CREATE TABLE wp_comments_deduped SELECT DISTINCT * FROM wp_comments;
RENAME TABLE wp_comments TO wp_comments_not_deduped;
RENAME TABLE wp_comments_deduped TO wp_comments;
If they don't have exactly the same contents and you don't care which contents you keep from each pair of duplicate rows, try something like this:
CREATE TABLE wp_comments_deduped
SELECT comment_ID,
MAX(comment_post_ID) comment_post_ID,
MAX(comment_author) comment_author,
MAX(comment_author_email) comment_author_email,
MAX(comment_author_url) comment_author_url,
MAX(comment_author_IP) comment_author_IP,
MAX(comment_date) comment_date,
MAX(comment_date_gmt) comment_date_gmt,
MAX(comment_content) comment_content,
MAX(comment_karma) comment_karma,
MAX(comment_approved) comment_approved,
MAX(comment_agent) comment_agent,
MAX(comment_type) comment_type,
MAX(comment_parent) comment_parent,
MAX(user_id) user_id
FROM wp_comments
GROUP BY comment_ID;
RENAME TABLE wp_comments TO wp_comments_not_deduped;
RENAME TABLE wp_comments_deduped TO wp_comments;
Then you'll need to double-check whether your deduplicating worked:
SELECT comment_ID, COUNT(*) num FROM wp_comments GROUP BY comment_ID;
Then, once you're happy with it, put back WordPress's indexes.
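Re-adding the keys might look something like this sketch, assuming the standard WordPress schema; compare against SHOW CREATE TABLE on your backup table for the exact definitions, and restore AUTO_INCREMENT on comment_ID as well if it was lost:
-- Sketch: re-create the usual wp_comments indexes after deduplicating.
ALTER TABLE wp_comments
    ADD PRIMARY KEY (comment_ID),
    ADD KEY comment_post_ID (comment_post_ID),
    ADD KEY comment_approved_date_gmt (comment_approved, comment_date_gmt),
    ADD KEY comment_date_gmt (comment_date_gmt),
    ADD KEY comment_parent (comment_parent);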
Pro tip: Use a plugin like Duplicator when you migrate from one WordPress setup to another; its authors have sorted out all this data migration for you.
I would recommend adding a unique auto-increment column to the table, call it tempid, so you can distinguish the rows within one duplicate set. Then use the query below to remove the duplicate copies (keeping the row with the lowest tempid in each set), and drop the `tempid` column at the end:
DELETE `wp_comments`
FROM `wp_comments`
JOIN (
    SELECT `comment_ID`, MIN(`tempid`) AS `keep_tempid`
    FROM `wp_comments`
    GROUP BY `comment_ID`
    HAVING COUNT(*) > 1
) AS `dups` ON `dups`.`comment_ID` = `wp_comments`.`comment_ID`
WHERE `wp_comments`.`tempid` <> `dups`.`keep_tempid`;
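For the tempid helper column itself, a sketch of the add/drop steps (assuming the table has no other AUTO_INCREMENT column) could look like:
-- Add a throwaway auto-increment column so each row gets a distinct number.
ALTER TABLE `wp_comments`
    ADD COLUMN `tempid` INT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE;

-- ... run the DELETE above ...

-- Remove the helper column again (its index is dropped with it).
ALTER TABLE `wp_comments` DROP COLUMN `tempid`;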
I'm not clear on why the question's query selects the same 'comment_ID' field twice from the same table, but I believe this will delete only the first of the two identical records. Before running a DELETE statement, however, be sure to make a backup of the original table.
DELETE FROM `wp_comments`
WHERE comment_ID IN
(
    SELECT comment_ID
    FROM
    (
        -- Number the rows within each comment_ID group
        -- (window functions require MySQL 8.0 or later)
        SELECT
            `comment_ID`,
            ROW_NUMBER() OVER (PARTITION BY `comment_ID` ORDER BY `comment_ID`) AS r
        FROM `wp_comments`
    ) AS ranked
    WHERE r > 1
)
LIMIT 1;

mySQL UPDATE WHERE with subquery gives error

I want to update a table with a subquery and always get an error.
Now I made a very simplified version (which doesn't make much sense, but it reproduces the error):
UPDATE a_test SET categoryID = '2956' WHERE id IN (
(
SELECT id from a_test
)
)
This ends in this error:
#1093 - Table 's_articles_categories' is specified twice, both as a target for 'UPDATE' and as a separate source for data
Why do I get this error?
When I use aliases for the table a_test I get the same error.
This is the full query i want to use with the same error:
UPDATE s_articles_categories SET categoryID = '2956' WHERE id IN
(
SELECT s_articles_categories.id FROM `s_articles`
LEFT JOIN s_articles_categories ON s_articles.id = s_articles_categories.articleID
WHERE s_articles_categories.categoryID NOT IN (
SELECT id FROM s_categories
WHERE s_categories.id NOT IN (SELECT parent FROM s_categories WHERE parent IS NOT null GROUP BY parent)
)
)
One solution to the simplified query is to wrap the subquery inside another subquery:
UPDATE a_test
SET categoryID = '2956'
WHERE id IN (SELECT id FROM (SELECT id FROM a_test) x );
This trick forces MySQL to materialize the subquery on a_test, so that the values coming from the subquery aliased as x are not affected as the update proceeds.
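Applied to the full query from the question, the same derived-table wrapping would look something like this sketch:
UPDATE s_articles_categories SET categoryID = '2956' WHERE id IN
(
    SELECT id FROM
    (
        SELECT s_articles_categories.id FROM `s_articles`
        LEFT JOIN s_articles_categories ON s_articles.id = s_articles_categories.articleID
        WHERE s_articles_categories.categoryID NOT IN (
            SELECT id FROM s_categories
            WHERE s_categories.id NOT IN (SELECT parent FROM s_categories WHERE parent IS NOT NULL GROUP BY parent)
        )
    ) x
);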

Mysql Update statement to delete duplicates

I would like to run an update on my unique id column, such as replacing white space, but the update statement breaks because of the resulting duplicates. I tried to write the update as follows, but the syntax is wrong:
UPDATE __cat_product
SET product_id = REPLACE(product_id,' ','')
ON DUPLICATE KEY DELETE product_id
WHERE product_id LIKE "%A %"
How do I do this the right way?
There is no ON DUPLICATE KEY syntax that is part of an UPDATE statement.
I'm having difficulty figuring out what you actually want to do.
You want to update a column in a table, to remove all space characters. And when you tried to run that statement, the statement encountered an error like Duplicate entry 'foo' for key 'PRIMARY'.
Assuming that product_id is varchar, try this on for size:
UPDATE __cat_product t
JOIN ( SELECT MAX(s.product_id) AS product_id
, REPLACE(s.product_id,' ','')
FROM __cat_product s
WHERE s.product_id LIKE '%A %'
AND NOT EXISTS ( SELECT 1
FROM __cat_product d
WHERE d.product_id = REPLACE(s.product_id,' ','')
)
GROUP BY REPLACE(s.product_id,' ','')
) r
ON r.product_id = t.product_id
SET t.product_id = REPLACE(t.product_id,' ','')
The inline view aliased as r gets the values of product_id that can actually be updated, because the "updated" product_id value doesn't already exist in the table. And the GROUP BY ensures that we're only getting one product_id per targeted replacement value.
There may be rows in the table that still match the predicate product_id LIKE '%A %'. The spaces cannot be removed from product_id column on those rows without throwing a duplicate key exception.
A separate DELETE statement could be used to remove those rows.
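As a sketch (assuming it is acceptable to discard those colliding rows entirely, and after taking a backup), that cleanup DELETE could look like:
-- Remove leftover rows whose space-stripped product_id already exists
-- on another row; run this only after the UPDATE above.
DELETE t
FROM __cat_product t
JOIN __cat_product d
  ON d.product_id = REPLACE(t.product_id, ' ', '')
WHERE t.product_id LIKE '%A %';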

Using a variable in an insert/update query does not work

I am trying to insert values into a table using the following query:
INSERT IGNORE itemmaster
(
SKU,
product_id
)
SELECT
@massimport_SKU := main.SKU,
@massimport_product_id := prd.entity_id
FROM itemmaster AS main
INNER JOIN product_entity AS prd ON prd.sku = main.SKU
ON DUPLICATE KEY UPDATE product_id = @massimport_product_id
SKU is a unique key.
The problem is that the value of product_id is always the same id. If I only execute the SELECT, the product_id values are different, but after the insert there is only one value in the product_id column. I think this is a problem with the variable @massimport_product_id, because if I use ON DUPLICATE KEY UPDATE product_id = prd.entity_id instead, the query works perfectly.
But since it's an automatically generated query that works well in all other cases, I hope somebody could explain to me why this happens.
Thx
Don't use user variables for this; the order in which MySQL evaluates expressions containing user variables within a statement is undefined, so it can't work reliably.
Use this:
INSERT IGNORE itemmaster( SKU, product_id )
SELECT main.SKU, prd.entity_id
FROM itemmaster AS main
INNER JOIN product_entity AS prd ON prd.sku = main.SKU
ON DUPLICATE KEY UPDATE product_id = prd.entity_id
;
Demo --> http://www.sqlfiddle.com/#!2/592187/1
However, the INNER JOIN in this query only ever returns SKU values that already exist in the itemmaster table, so the INSERT is useless in this case, and the same thing can be done more simply with a multi-table update:
UPDATE itemmaster main
JOIN product_entity AS prd ON prd.sku = main.SKU
SET main.product_id = prd.entity_id;
Demo: http://www.sqlfiddle.com/#!2/762eb4/1

mySQL - INSERT query that matches the same records as this SELECT query?

I've got a select query I'm using to pick out contacts in my DB that haven't been spoken to in a while. I'd like to run an INSERT query to enter in a duplicate note for all the records that are returned with this select query... problem is I'm not exactly sure how to do it.
The SELECT query itself is likely a bit of a convoluted mess. I basically want to have the most recent note from each partner selected, then select ONLY partners that haven't got a note from a certain date and back... the SELECT query goes:
SELECT * FROM
(
SELECT * FROM
(
SELECT
partners.partners_id,
partners.CompanyName,
notes.Note,
notes.DateCreated
FROM
notes
JOIN
partners ON notes.partners_id = partners.partners_id
ORDER BY notes.DateCreated DESC
) AS Part1
GROUP BY partners_id
ORDER BY DateCreated ASC
) AS Part2
WHERE
DateCreated <= '2013-01-15'
How would I run an INSERT query that targets only the same records as this SELECT?
The insert would enter records such as:
INSERT INTO notes
(
notes_id,
partners_id,
Note,
CreatedBy,
DateCreated
)
SELECT
UUID(),
partners.partners_id,
'Duplicated message!',
'User',
'2013-02-14'
FROM
partners
If you want to do this all in SQL, you could use an UPDATE statement.
UPDATE tablename SET note='duplicate' where id in ( your statement here);
Note that in order for this to work 'id' needs to be a column from 'tablename'. Then, your statement has to return a single column, not *. The column returned needs to be the id that will let your update statement know which rows to update in 'tablename'.
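If the goal is really to INSERT a note for each matching partner, as in the question, the same idea works with INSERT ... SELECT. A sketch, reusing the question's subquery as-is (including its GROUP BY trick) so only partners whose latest note is on or before 2013-01-15 get the new note:
INSERT INTO notes
(
    notes_id,
    partners_id,
    Note,
    CreatedBy,
    DateCreated
)
SELECT
    UUID(),
    Part2.partners_id,
    'Duplicated message!',
    'User',
    '2013-02-14'
FROM
(
    SELECT * FROM
    (
        SELECT
            partners.partners_id,
            partners.CompanyName,
            notes.Note,
            notes.DateCreated
        FROM notes
        JOIN partners ON notes.partners_id = partners.partners_id
        ORDER BY notes.DateCreated DESC
    ) AS Part1
    GROUP BY partners_id
    ORDER BY DateCreated ASC
) AS Part2
WHERE DateCreated <= '2013-01-15';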