I have dups in my MySQL database that I can find with this query:
select checksum, count(*) c from MY_DATA group by checksum having c > 1;
I get a list of about 200 dups. What I want to do is update a column in the same table to mark them as dups.
I tried this:
update MY_DATA SET DONT_PARSE=1
where (select count(*) c from MY_DATA group by checksum having c > 1);
I get the error: You can't specify target table 'MY_DATA' for update in FROM clause
Anyone have a solution for this?
Use a join:
UPDATE MY_DATA d JOIN
       (SELECT checksum, COUNT(*) AS cnt
        FROM MY_DATA
        GROUP BY checksum
        HAVING cnt > 1
       ) c
       ON d.checksum = c.checksum
SET DONT_PARSE = 1;
Unfortunately, MySQL doesn't allow you to reference the table being updated in a subquery of the same UPDATE (or DELETE). A join, however, can usually be used to get around this; another common workaround is sketched below.
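One other workaround that sometimes comes up, assuming the same MY_DATA / checksum / DONT_PARSE columns from the question, is to wrap the subquery in an extra derived table so MySQL materializes the duplicate checksums before the UPDATE runs. Whether it works can depend on the MySQL version and the derived_merge optimizer switch, so treat it as a sketch rather than a guaranteed fix:
-- The extra derived table (x) is materialized first, which sidesteps the
-- "can't specify target table" restriction in many MySQL versions.
UPDATE MY_DATA
SET DONT_PARSE = 1
WHERE checksum IN (
    SELECT checksum
    FROM (
        SELECT checksum
        FROM MY_DATA
        GROUP BY checksum
        HAVING COUNT(*) > 1
    ) AS x
);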
Your query also seems incorrect (the WHERE clause is incomplete), so you could use an INNER JOIN against the subquery that retrieves the duplicates.
You could try using
update MY_DATA m
INNER JOIN (
select checksum, count(*) c
from MY_DATA
group by checksum
having c > 1
) t on t.checksum = m.checksum
SET DONT_PARSE=1
Thanks to this question, Rename Mysql Duplicate Value, I was able to come up with this query to eliminate the duplicate rows.
UPDATE table1
INNER JOIN (SELECT OBJECTID, CONCAT(IDENT, '_1') AS IDENT
            FROM table1
            GROUP BY IDENT
            HAVING COUNT(*) > 1) t
        ON t.OBJECTID = table1.OBJECTID
SET table1.IDENT = t.IDENT;
This works well but I want to only rename the rows where the column IDENT is duplicated and the NAME column is different. Any ideas how to do this?
Change the grouping to be both NAME and IDENT.
UPDATE table1
JOIN (
SELECT MAX(objectid) AS max_id, name, CONCAT(ident, '_1') AS new_ident
FROM table1
GROUP BY name, ident
HAVING COUNT(*) > 1
) AS t ON t.max_id = table1.objectid
SET table1.ident = t.new_ident;
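If the requirement is read literally (rename where the same IDENT occurs with more than one distinct NAME), a sketch along these lines may be closer; the column names come from the question, and renaming every row in such a group is an assumption:
-- Sketch only: flags IDENT values that appear with more than one distinct NAME,
-- then renames every row carrying such an IDENT.
UPDATE table1
JOIN (
    SELECT ident
    FROM table1
    GROUP BY ident
    HAVING COUNT(DISTINCT name) > 1
) AS dup ON dup.ident = table1.ident
SET table1.ident = CONCAT(table1.ident, '_1');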
I'm struggling to write a proper SQL script to increment a field in a specific way.
These two scripts run without any exception, but nothing happens to the results.
Script 1:
UPDATE
myTable T1,
(
SELECT id,
(#s:=#s+1) AS seq
FROM myTable, (SELECT (#s:=0) AS s ) s
WHERE infotext IS NULL ORDER BY grouptext
) T2
SET sequence = seq
WHERE T1.id = T2.id
Script 2:
UPDATE myTable AS target
INNER JOIN (
SELECT supfault_id,
(#s:=#s+1) AS seq
FROM myTable, (SELECT (#s:=0) AS s ) s
WHERE infotext IS NULL ORDER BY grouptext
) AS ordered ON ordered.id = target.id
SET sequence = seq
This one gets the last value (in descending order) from tbl_1, increments it by one, and then updates tbl_2:
set #inc = 0;
select cast(valToIncrement as signed) into #inc from
(select REPLACE(fkid,' ','') as valToIncrement from tbl_1 ORDER BY fkid)as a ORDER BY valToIncrement desc limit 1;
update tbl_2 set fkid = #inc + 1 where fkid = 122;
The subqueries work well separately, so I wondered why I couldn't update my sequence value with seq from the subquery.
I'm not an expert, but I felt that some kind of virtual table was needed for my subquery.
Here is a solution for the INNER JOIN case:
CREATE TEMPORARY TABLE supportGroupSeqcalculation AS
SELECT supfault_id,
(#s:=#s+1) AS seq
FROM myTable, (SELECT (#s:=0) AS s ) s
WHERE infotext IS NULL
ORDER BY grouptext;
UPDATE myTable AS target
INNER JOIN supportGroupSeqcalculation AS ordered ON ordered.supfault_id = target.supfault_id
SET sequence = seq;
DROP TEMPORARY TABLE supportGroupSeqcalculation;
We can capture the rows in the required order in a temporary table and record that order as the sequence value.
It is not strictly necessary to drop the temporary table, since it exists only in the current session.
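On MySQL 8.0 or later you could also skip the user variable entirely and let ROW_NUMBER() produce the sequence; a sketch using the table and column names from the question:
-- Sketch for MySQL 8.0+: ROW_NUMBER() replaces the @s counter.
UPDATE myTable AS target
INNER JOIN (
    SELECT supfault_id,
           ROW_NUMBER() OVER (ORDER BY grouptext) AS seq
    FROM myTable
    WHERE infotext IS NULL
) AS ordered ON ordered.supfault_id = target.supfault_id
SET target.sequence = ordered.seq;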
I have a query, a generalized version of which I've reproduced below:
SELECT TT.column
FROM Table1 TT
JOIN Table2 T USING (PRIMARYKEY)
GROUP BY T.Date
I want to take the output of this query -- a single column output with multiple rows sorted by date -- and group concat it in another query as a derived table:
SELECT
T.column2,
GROUP_CONCAT(
SELECT TT.column
FROM Table1 TT
JOIN Table2 T USING (PRIMARYKEY)
GROUP BY T.Date) AS concat_output
FROM Table1 TT
JOIN Table2 T USING (PRIMARYKEY)
GROUP BY T.Date
However, this returns an error at the line of the GROUP_CONCAT command.
Thoughts on how to make this work?
EDIT: To give some more detail on why I wanted the derived table to work:
At the moment, without using GROUP_CONCAT, I get multiple rows that look like
a
a
b
b
a
a
c
c
d
a
If I try to GROUP_CONCAT as described in Mukesh's answer, using DISTINCT, I get, for example, a, b, c, d as a row, when really I want a,b,a,c,d,a.
Thoughts?
Try this query
SELECT
T.column2,
GROUP_CONCAT(
DISTINCT TT.column
) AS concat_output
FROM Table1 TT
JOIN Table2 T USING (PRIMARYKEY)
GROUP BY T.Date
For more detail, refer to this link:
http://www.mysqltutorial.org/mysql-group_concat/
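Given the edit above (the duplicates should be preserved, so DISTINCT is not wanted), here is a sketch that simply drops DISTINCT and orders the concatenated values by date; grouping by T.column2 instead of T.Date is an assumption, so that all values for one column2 land in a single row:
SELECT T.column2,
       -- no DISTINCT, so repeated values such as a,b,a,c,d,a are kept
       GROUP_CONCAT(TT.column ORDER BY T.Date) AS concat_output
FROM Table1 TT
JOIN Table2 T USING (PRIMARYKEY)
GROUP BY T.column2;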
Due to its geographic capabilities I'm migrating my database from MySQL to PostgreSQL/PostGIS, and SQL that used to be trivial is now becoming painfully slow to get working.
In this case I use a nested query to obtain results in two columns, the first column holding an ID and the second a count, and then write those counts into table1.
EDIT: This is the original working MySQL code that I need to get working in PostgreSQL:
UPDATE table1 INNER JOIN (
SELECT id, COUNT(*) AS cnt
FROM table2
GROUP BY id
) AS c ON c.id = table1.id
SET table1.cnt = c.cnt
The result I get is that all rows end up with the same count, namely the first count returned by the nested select.
In MySQL this would be solved easily.
How would this work in PostgreSQL?
Thank you!
UPDATE table1 dst
SET cnt = src.cnt
FROM (SELECT id, COUNT(*) AS cnt
      FROM table2
      GROUP BY id) AS src
WHERE src.id = dst.id;
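PostgreSQL has no UPDATE ... JOIN; the derived table goes in a FROM clause instead and the join condition moves into WHERE, as above. Rows in table1 with no matching id in table2 are simply left untouched.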
What would be the best way to return one item for each id instead of all of the other items within the table? Currently the query below returns all manufacturers:
SELECT m.name
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id
I have solved my question by using DISTINCT in my query:
SELECT DISTINCT m.name, m.id
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id
ORDER BY m.name
There are 4 main ways I can think of to delete duplicate rows (the examples below use Oracle's rowid and NVL; a MySQL adaptation is sketched after method 4).
Method 1
Delete every row whose rowid is greater than the smallest (or less than the greatest) rowid for the same key values. Example:
delete from tableName a where rowid> (select min(rowid) from tableName b where a.key=b.key and a.key2=b.key2)
Method 2
Usually faster, but you must recreate all indexes, constraints and triggers afterward: pull all rows as DISTINCT into a new table, then drop the first table and rename the new table to the old table name. Example:
create table t2 as select distinct * from t1; drop table t1; rename t2 to t1;
Method 3
Delete using WHERE EXISTS based on rowid. Example:
delete from tableName a where exists (select 'x' from tableName b where a.key1=b.key1 and a.key2=b.key2 and b.rowid > a.rowid)
Note: if the key columns can contain NULLs, use NVL on the column names.
Method 4
Collect the first row for each key value and delete the rows not in this set. Example:
delete from tableName a where rowid not in(select min(rowid) from tableName b group by key1, key2)
Note that you don't have to use NVL for method 4.
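If you need this in MySQL, which has neither rowid nor NVL, a hedged adaptation of the same idea using a surrogate key; the AUTO_INCREMENT id column and the tableName/key1/key2 names are assumptions:
-- MySQL sketch: an AUTO_INCREMENT column `id` stands in for rowid.
-- The self-join keeps the row with the smallest id in each key1/key2 group,
-- deletes the rest, and avoids the "can't specify target table" restriction.
DELETE a
FROM tableName a
JOIN tableName b
  ON a.key1 = b.key1
 AND a.key2 = b.key2
 AND a.id > b.id;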
Using DISTINCT is often bad practice. It may be a sign that there is something wrong with your SELECT statement, or that your data structure is not normalized.
In your case I would use this (on the assumption that default_ps_products_manufacturers has unique records):
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE EXISTS (SELECT 1 FROM default_ps_products p WHERE p.manufacturer_id = m.id)
Or an equivalent query with IN:
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE m.id IN (SELECT p.manufacturer_id FROM default_ps_products p)
The only thing: among all the possible queries, it is better to pick the one with the better execution plan, which may depend on your vendor and/or the physical structure, statistics, etc. of your database.
I think in most cases EXISTS will work better.
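If in doubt, you can compare the plans directly with EXPLAIN; for example, using the same two queries from this answer (MySQL syntax):
EXPLAIN SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE EXISTS (SELECT 1 FROM default_ps_products p WHERE p.manufacturer_id = m.id);

EXPLAIN SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE m.id IN (SELECT p.manufacturer_id FROM default_ps_products p);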