Why am I getting "Duplicate entry" error on SELECT DISTINCT query? - mysql

I have the following query to append data into a table if it is unique:
INSERT INTO belgarath.players(tour_id, player_id, player_name_oc)
SELECT DISTINCT 0, ID_P, NAME_P FROM oncourt.players_atp
LEFT JOIN belgarath.players
ON belgarath.players.tour_id = 0
AND belgarath.players.player_id=oncourt.players_atp.ID_P;
I run this once on an empty table and it's fine. I delete a row and run it expecting MySQL to append the one deleted row. However, I get the following error code: Error Code: 1062. Duplicate entry '0-43042' for key 'players.unique_plyrs' . I have a unique key across tour_id and player_id and clearly it's failing because I'm trying to append a duplicate record.
Why would I be getting this if I'm only selecting distinct records to insert? How do I avoid getting this in future?

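This should resolve your issue. Add a WHERE clause that keeps only the rows where belgarath.players.player_id IS NULL, that is, the source rows with no match in the target table. Without it, the LEFT JOIN returns every row of oncourt.players_atp whether or not it already exists in belgarath.players, and DISTINCT only removes duplicates within the SELECT result, not rows that are already in the target.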
INSERT INTO belgarath.players(tour_id, player_id, player_name_oc)
SELECT DISTINCT 0, ID_P, NAME_P FROM oncourt.players_atp
LEFT JOIN belgarath.players
ON belgarath.players.tour_id = 0
AND belgarath.players.player_id=oncourt.players_atp.ID_P
WHERE belgarath.players.player_id is NULL;
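To see the effect of the IS NULL filter, here is a minimal sketch that reproduces the scenario with Python's built-in sqlite3 module instead of MySQL. The simplified table definitions and the three sample players are assumptions made for illustration; only the shape of the INSERT ... SELECT matches the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for oncourt.players_atp and belgarath.players.
    CREATE TABLE players_atp (ID_P INTEGER PRIMARY KEY, NAME_P TEXT);
    CREATE TABLE players (
        tour_id INTEGER,
        player_id INTEGER,
        player_name_oc TEXT,
        UNIQUE (tour_id, player_id)
    );
    INSERT INTO players_atp VALUES (1, 'john'), (2, 'steve'), (3, 'bill');
    -- State after one row (player 2) has been deleted from the target table.
    INSERT INTO players VALUES (0, 1, 'john'), (0, 3, 'bill');
""")

# Anti-join: keep only source rows that have no match in the target table.
ANTI_JOIN_INSERT = """
    INSERT INTO players (tour_id, player_id, player_name_oc)
    SELECT DISTINCT 0, a.ID_P, a.NAME_P
    FROM players_atp AS a
    LEFT JOIN players AS p
           ON p.tour_id = 0 AND p.player_id = a.ID_P
    WHERE p.player_id IS NULL
"""

def count_players():
    return conn.execute("SELECT COUNT(*) FROM players").fetchone()[0]

count_before = count_players()
conn.execute(ANTI_JOIN_INSERT)   # inserts only the missing player
count_after_first = count_players()
conn.execute(ANTI_JOIN_INSERT)   # nothing left to insert, and no unique-key error
count_after_second = count_players()

rows = conn.execute(
    "SELECT tour_id, player_id, player_name_oc FROM players ORDER BY player_id"
).fetchall()
print(count_before, count_after_first, count_after_second, rows)
```

Running the statement twice is the point of the sketch: the second run finds no unmatched rows, so it inserts nothing rather than violating the unique key.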

I hope this hint about the DISTINCT keyword helps. DISTINCT selects distinct rows: it applies to the whole select list, not only to the column that immediately follows the keyword. The example below illustrates this.
create table test(id1 int, id2 int);
insert into test values(1,1),(1,2),(1,3);
Here I have created a test table. When I use the DISTINCT keyword as in the query below:
select distinct id1, id2 from test;
Then we'll get output like this:
id1 id2
1 1
1 2
1 3
You are inserting tour_id as 0, and you have defined tour_id and player_id as a unique key on the belgarath.players table. Your SELECT returns tour_id as 0 for every row. DISTINCT only guarantees that the returned rows differ from each other: if player_id is 1, 2, 3 and the names are john, steve, bill, the SELECT returns (0, 1, john), (0, 2, steve), (0, 3, bill), and so on. It does not check whether those rows already exist in belgarath.players.
If your oncourt.players_atp table also has a unique constraint and contains a tour_id column, you can copy the tour ID from there. If tour_id is not present there and you want to generate it inside belgarath.players only, you can define tour_id as an AUTO_INCREMENT column in the table definition. MySQL then generates unique IDs, and your query only needs to insert player_id and player_name_oc.
Hope this helps.

Related

SQL adding in table rows with same id and different second id

I have a database table with two columns.
The first is the category id, the second is the group id.
Both are part of the primary key.
I need to add rows to the table so that each existing category id also has group ids 1 and 2.
That is, every category needs to have groups 1, 2 and 3.
INSERT INTO `ps_category_group` (`id_category`, `id_group`)
SELECT `id_category`, 2 FROM (
SELECT DISTINCT `id_category`,`id_group` From `ps_category_group`) as x;
gives me
1062 - Duplicate entry '2-2' for key 'PRIMARY'
You are selecting values from the same table for inserting into the table. Then you are asking why you are getting primary key violations? Of course you are; that is exactly what your query is doing.
Based on the description, you seem to want:
INSERT INTO `ps_category_group` (`id_category`, `id_group`)
SELECT `id_category`, 2
FROM `ps_category_group`
ON DUPLICATE KEY UPDATE id_category = VALUES(id_category); -- This does nothing if the value already exists in the table
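As a rough illustration of the "skip rows that already exist" idea, the sketch below uses Python's sqlite3 module. SQLite has no ON DUPLICATE KEY UPDATE, so it uses INSERT OR IGNORE as an analogue (an assumption of this sketch, not the syntax from the answer). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ps_category_group (
        id_category INTEGER,
        id_group INTEGER,
        PRIMARY KEY (id_category, id_group)
    );
    -- Hypothetical starting data: the categories have incomplete group sets.
    INSERT INTO ps_category_group VALUES (1, 1), (2, 1), (2, 2), (3, 3);
""")

# For each required group, add it to every category that appears in the table.
# OR IGNORE silently skips (category, group) pairs that would violate the primary key.
for group_id in (1, 2, 3):
    conn.execute(
        """
        INSERT OR IGNORE INTO ps_category_group (id_category, id_group)
        SELECT DISTINCT id_category, ? FROM ps_category_group
        """,
        (group_id,),
    )

rows = conn.execute(
    "SELECT id_category, id_group FROM ps_category_group ORDER BY 1, 2"
).fetchall()
print(rows)
```

After the loop every category has groups 1, 2 and 3, and the pairs that were already present are left untouched.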

Using WHERE on grouped rows after UNION statement

I have a database schema with two tables, song and edited_song. These tables are identical, except for one extra column in edited_song called deleted. The edited_song-table contains a reference to the id in the song-table. I want to find all the songs which aren't deleted.
I have a UNION-statement in which I GROUP on the id of the result of two SELECT-statements. I want to exclude results where the deleted column has the value 1. An example of the setup can be seen here.
CREATE TABLE if not exists song
(
id int(11) NOT NULL auto_increment ,
title varchar(255),
PRIMARY KEY (id)
);
CREATE TABLE if not exists editedsong
(
id int(11) NOT NULL auto_increment ,
title varchar(255),
deleted tinyint(1),
PRIMARY KEY (id)
);
INSERT INTO song (id, title) VALUES
(1, 'Born in the USA');
INSERT INTO editedsong (id, title, deleted) VALUES
(1, 'Born in the USA', 1);
And the query is here:
SELECT * FROM
((SELECT *, 0 AS deleted FROM song WHERE id=1)
UNION
(SELECT * FROM editedsong WHERE id=1)) AS song
WHERE song.deleted!=1
GROUP BY song.id
The UNION-statement is used instead of a join as there is a LOT of text in these two tables and a join results in writing to disk. This is a simplified form of the real query, but it reproduces the problem I'm experiencing. I would expect the query to yield no results as the GROUP BY should preserve the first row and throw away all following. Why doesn't it do this? Is it because the WHERE is executed before the GROUP BY? If it is, what is a good way to overcome this problem?
http://sqlfiddle.com/#!2/5cdb6c/3
The reason that the code in the SQLFiddle doesn't work is that the WHERE clause is excluding the deleted record from editedsong before the GROUP BY is executed.
You can use HAVING to apply criteria after a GROUP BY clause.
This appears to work:
SELECT *, max(deleted) as md FROM
((SELECT *, 0 AS deleted FROM song)
UNION
(SELECT * FROM editedsong)) AS song
-- WHERE song.deleted!=1
GROUP BY song.id
HAVING md != 1
This returns the record from song, not the record from editedsong for records that haven't been deleted. If you want the other, reverse the order of the items in the UNION clause.
This syntax for GROUP BY is unusual, and I'm surprised it's supported. Most database systems I've worked with require every field in the output to have some treatment specified (MAX, COUNT, GROUP BY, etc.), so a SELECT * is incompatible with GROUP BY. MySQL allows it when the ONLY_FULL_GROUP_BY SQL mode is disabled, taking the non-aggregated columns from an arbitrary row of each group, but I think most servers wouldn't like it (me neither).
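The difference between filtering before and after grouping can be checked with a small sketch using Python's sqlite3 module. It selects only the grouped column and an aggregate, so it does not depend on the permissive GROUP BY behaviour. The second song, 'Thunder Road', is a hypothetical row added so that one non-deleted song remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE editedsong (id INTEGER PRIMARY KEY, title TEXT, deleted INTEGER);
    INSERT INTO song VALUES (1, 'Born in the USA'), (2, 'Thunder Road');
    INSERT INTO editedsong VALUES (1, 'Born in the USA', 1);
""")

# Both tables stacked into one row set, with deleted = 0 for the originals.
COMBINED = """
    SELECT id, title, 0 AS deleted FROM song
    UNION ALL
    SELECT id, title, deleted FROM editedsong
"""

# WHERE runs before grouping: it drops only the deleted copy, so song 1 survives.
where_ids = [r[0] for r in conn.execute(
    f"SELECT id FROM ({COMBINED}) AS s WHERE deleted != 1 GROUP BY id ORDER BY id"
)]

# HAVING runs after grouping: a group is dropped if any of its rows is deleted.
having_ids = [r[0] for r in conn.execute(
    f"SELECT id FROM ({COMBINED}) AS s GROUP BY id HAVING MAX(deleted) != 1 ORDER BY id"
)]

print(where_ids, having_ids)
```

With WHERE, song 1 is still returned because its undeleted row from song passes the filter; with HAVING, the whole group for song 1 is excluded.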

Select records based on multiple value of foreign key

I am currently writing a query. I want to select all records from a table based on multiple values of a foreign key, for example all records related to both 1 and 2.
For example, the table might have:
id  name  uid
1   bil   3
2   test  3
3   test  4
4   test  4
5   bil   5
6   bil   5
I want to select all records related to 3 that are also related to 4. In this case it is record number 2.
SELECT id
FROM `table`
WHERE uid = value1 AND like_id
IN (SELECT like_id
FROM likes
WHERE uid = uid2)
LIMIT 0 , 30
It's not at all clear where "value1" is coming from, or "uid2" is coming from, or where the column "like_id" is coming from. Those column names do not appear in your sample table. Your example query references two different table names (table and likes), yet you only show data for one example table, and that table does not have a column named like_id.
Let's assume that "value1" and "uid2" in your query are literals, or bind parameters supplied to the query. That seems reasonable, given that your specification (variously) mentions values of 1, 2, 3 and 4. But we're still left with the "like_id" column. Given that it's referenced in the SELECT list of the IN subquery, we're going to presume that's a column in the "likes" table, and given that it's referenced in the outer query, we're going to assume that it's also a column in the (unfortunately named) table table.
(Bottom line: it's not at all clear how your query is returning a "correct" result, given that you've made it impossible to replicate a working test case.)
Given a single table, as shown in your example data, e.g.
CREATE TABLE likes (id INT, name VARCHAR(4), uid INT);
INSERT INTO likes VALUES (1,'bil',3),(2,'test',3),(3,'test',4)
,(4,'test',4),(5,'bil',5),(6,'bil',5);
ALTER TABLE likes ADD PRIMARY KEY (id);
-- Non-unique index: the sample data repeats (uid, name) pairs, so a UNIQUE key would be rejected.
ALTER TABLE likes ADD INDEX likes_ix (uid, name);
Assuming that we're running a query against that single table, and that we're matching "likes" associated with uid=3 to "likes" associated with uid=4, and that the matching is done on the "name" column, then
SELECT t.id
FROM `likes` t
WHERE t.uid = 3
AND EXISTS
( SELECT 1
FROM `likes` s
WHERE s.name = t.name
AND s.uid = 4
)
That will return the id of the row from the likes table for uid=3 where we also find a row in the likes table for uid=4 with a matching name value.
Given a limited number of rows to be inspected from the likes table on the outer query, the correlated subquery only needs to run a limited number of times, which should give reasonable performance.
For large sets, a join operation generally performs better to return an equivalent result:
SELECT t.id
FROM `likes` t
JOIN `likes` s
ON s.name = t.name
AND s.uid = 4
WHERE t.uid = 3
GROUP
BY t.id
The key to optimum performance for either query is going to be appropriate indexes.
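To check that the EXISTS form and the JOIN form return the same rows, here is a small sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the index is created as a plain (non-unique) index; both are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (id INTEGER PRIMARY KEY, name TEXT, uid INTEGER);
    INSERT INTO likes VALUES (1,'bil',3),(2,'test',3),(3,'test',4),
                             (4,'test',4),(5,'bil',5),(6,'bil',5);
    CREATE INDEX likes_ix ON likes (uid, name);
""")

# Correlated EXISTS: each uid=3 row is kept once if any uid=4 row shares its name.
exists_ids = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM likes t
    WHERE t.uid = 3
      AND EXISTS (SELECT 1 FROM likes s WHERE s.name = t.name AND s.uid = 4)
    ORDER BY t.id
""")]

# Self-join: a uid=3 row appears once per matching uid=4 row ...
ungrouped_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM likes t
    JOIN likes s ON s.name = t.name AND s.uid = 4
    WHERE t.uid = 3
""").fetchone()[0]

# ... so GROUP BY t.id collapses the repeats back to one row per id.
join_ids = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM likes t
    JOIN likes s ON s.name = t.name AND s.uid = 4
    WHERE t.uid = 3
    GROUP BY t.id
    ORDER BY t.id
""")]

print(exists_ids, ungrouped_join_count, join_ids)
```

The ungrouped count shows why the join version needs the GROUP BY: two uid=4 rows are named 'test', so row 2 would otherwise be returned twice.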

Deleting duplicates in mysql (2 tables)

I have two tables (id_test, test) , each of them has an ID column, which is unique, and two entries with the same id in the two tables are the same. Now, i have another column in one of the tables (id_test) that also should be unique, so I want to eliminate duplicates according to this other column, let's call it YD.
To identify the duplicates I used
SELECT ID, YD AS x, COUNT(*) AS y
FROM id_test
GROUP BY x
HAVING y>1;
Now I want to delete these entries in both tables. How can I do it?
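This query shows the lowest ID for every YD in the id_test table (MIN(ID) makes the choice deterministic; a bare ID with GROUP BY YD returns an arbitrary ID from each group):
SELECT MIN(ID) AS ID, YD
FROM id_test
GROUP BY YD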
and these are the rows you have to keep. The following query returns the IDs you have to delete:
SELECT id_test.ID
FROM id_test LEFT JOIN (select MIN(ID) AS ID, YD from id_test group by YD) id_test_keep
on id_test.ID=id_test_keep.ID and id_test.YD = id_test_keep.YD
WHERE id_test_keep.ID IS NULL
I would need more details about your tables, but I think what you need is this:
DELETE FROM test
WHERE
test.ID IN (
SELECT id_test.ID
FROM id_test LEFT JOIN (select MIN(ID) AS ID, YD from id_test group by YD) id_test_keep
on id_test.ID=id_test_keep.ID and id_test.YD = id_test_keep.YD
WHERE id_test_keep.ID IS NULL)
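Since the question asks to remove the duplicates from both tables, one possible approach is to compute the list of IDs to delete once and then apply it to each table inside a single transaction. The sketch below does this with Python's sqlite3 module. The payload column and the sample rows are hypothetical, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE id_test (ID INTEGER PRIMARY KEY, YD TEXT);
    CREATE TABLE test (ID INTEGER PRIMARY KEY, payload TEXT);  -- payload is a made-up column
    INSERT INTO id_test VALUES (1,'a'),(2,'b'),(3,'a'),(4,'b'),(5,'c');
    INSERT INTO test VALUES (1,'p1'),(2,'p2'),(3,'p3'),(4,'p4'),(5,'p5');
""")

# IDs that are not the lowest ID of their YD group are the duplicates to remove.
doomed = [r[0] for r in conn.execute("""
    SELECT ID
    FROM id_test
    WHERE ID NOT IN (SELECT MIN(ID) FROM id_test GROUP BY YD)
    ORDER BY ID
""")]

# Delete from both tables in one transaction so they stay in sync.
with conn:
    params = [(i,) for i in doomed]
    conn.executemany("DELETE FROM test WHERE ID = ?", params)
    conn.executemany("DELETE FROM id_test WHERE ID = ?", params)

remaining_id_test = [r[0] for r in conn.execute("SELECT ID FROM id_test ORDER BY ID")]
remaining_test = [r[0] for r in conn.execute("SELECT ID FROM test ORDER BY ID")]
print(doomed, remaining_id_test, remaining_test)
```

Collecting the IDs first matters because once rows are deleted from id_test, the duplicate information needed to clean up test is gone.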
As documented under ALTER TABLE Syntax (emphasis added):
IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
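Therefore (on MySQL versions that still support it; ALTER IGNORE was deprecated in 5.6.17 and removed in 5.7.4):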
ALTER IGNORE TABLE id_test ADD UNIQUE (YD)
I would avoid a SELECT ... IN subquery here, because it becomes impractically slow when the tables are large.
Instead, create a new table with the same structure and insert only the non-duplicate rows into it:
INSERT INTO test_new (ID, YD) SELECT t.ID, t.YD FROM test t LEFT JOIN test_id ti ON t.ID = ti.id WHERE ti.id IS NULL;
Then drop the table test and rename test_new to test.

Deleting duplicate rows from a table

I have a table in my database which has duplicate records that I want to delete. I don't want to create a new table with distinct entries for this. What I want is to delete duplicate entries from the existing table without the creation of any new table. Is there any way to do this?
id action
L1_name L1_data
L2_name L2_data
L3_name L3_data
L4_name L4_data
L5_name L5_data
L6_name L6_data
L7_name L7_data
L8_name L8_data
L9_name L9_data
L10_name L10_data
L11_name L11_data
L12_name L12_data
L13_name L13_data
L14_name L14_data
L15_name L15_data
These are all my fields:
id is unique for every row.
L11_data should be unique for its respective action field.
L11_data holds company names, while action holds the names of the industries.
So in my data I have duplicate company names in L11_data within their respective industries.
What I want is a unique name, along with the other company data, for each industry stored in action. I hope I have stated my problem clearly enough to be understood.
Yes, assuming you have a unique ID field, you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their group of values.
Example query:
DELETE FROM Table
WHERE ID NOT IN
(
SELECT MIN(ID)
FROM Table
GROUP BY Field1, Field2, Field3, ...
)
Notes:
I freely chose "Table" and "ID" as representative names
The list of fields ("Field1, Field2, ...") should include all fields except for the ID
This may be a slow query depending on the number of fields and rows, however I expect it would be okay compared to alternatives
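Here is a minimal sketch of the MIN(ID) approach, run against an in-memory SQLite database through Python's sqlite3 module. The companies table and its rows are hypothetical, with only the id, action and L11_data columns from the question. Note that MySQL rejects this direct form with error 1093 ("You can't specify target table ... for update in FROM clause"), so there the subquery would need to be wrapped in a derived table or rewritten as a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: action is the industry, L11_data the company name.
    CREATE TABLE companies (id INTEGER PRIMARY KEY, action TEXT, L11_data TEXT);
    INSERT INTO companies VALUES
        (1, 'Steel',  'Acme'),
        (2, 'Steel',  'Acme'),   -- duplicate of row 1
        (3, 'Retail', 'Acme'),   -- same company, different industry: kept
        (4, 'Steel',  'Bolt'),
        (5, 'Retail', 'Acme');   -- duplicate of row 3
""")

# Keep the lowest id of every (action, L11_data) group and delete the rest.
with conn:
    conn.execute("""
        DELETE FROM companies
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM companies
            GROUP BY action, L11_data
        )
    """)

remaining = conn.execute(
    "SELECT id, action, L11_data FROM companies ORDER BY id"
).fetchall()
print(remaining)
```

Grouping on action and L11_data only, rather than on every non-ID field, reflects the question's definition of a duplicate; the answer's general form groups on all fields except the ID.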
EDIT: In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.
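ALTER IGNORE TABLE `table` ADD UNIQUE INDEX (your cols);
With IGNORE, only the first row of each set of duplicates on the unique key is kept and the other conflicting rows are dropped (on MySQL versions that still support ALTER IGNORE).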
DELETE
FROM table_x a
WHERE rowid < ANY (
SELECT rowid
FROM table_x b
WHERE a.someField = b.someField
AND a.someOtherField = b.someOtherField
)
AND (
a.someField,
a.someOtherField
) IN (
SELECT c.someField,
c.someOtherField
FROM table_x c
GROUP BY c.someField,
c.someOtherField
HAVING count(*) > 1
)
In the above query, the combination of someField and someOtherField must uniquely identify the duplicates.