I am using phpMyAdmin to create a unique index on a field. It comes back with Error 1062 - Duplicate Key on . . . and then shows me the offending data. The issue is that the data is NOT a duplicate: each record has a unique entry in that field. Thinking it was an anomaly, I deleted that entry and tried again. It gave me the same error, this time on the last row before the deleted record.
Table schema:
CREATE TABLE prospects (
client_id int(11) NOT NULL AUTO_INCREMENT,
company varchar(64) DEFAULT NULL,
created_on timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
... some other fields like first_name...
PRIMARY KEY (client_id),
KEY first_name (first_name,last_name)
) ENGINE=MyISAM AUTO_INCREMENT=1958 DEFAULT CHARSET=latin1
Alter table statement failing:
ALTER TABLE acceler6_accelrefer.prospects ADD UNIQUE company_ui (company);
Any help or insight would be appreciated.
As much as you don't want to hear it again from the comments, you've got a duplicate company name. Note, this does not mean the entire record is a duplicate, but when you add a unique on company, every record has to have a unique company. I'm going to guess you've sometimes got more than one prospect entry per company.
To verify, try this:
SELECT count(company), count(distinct company) FROM prospects;
If these numbers are identical, then OK, you win: you don't have more than one record with the same company. But I'm certain they'll be different.
To find out exactly which ones are duplications you can do this:
SELECT company, count(company) AS counter
FROM prospects
GROUP BY company
HAVING counter > 1;
If you just want fast lookup of the client_id's by company, drop the UNIQUE and just use a regular key.
ALTER TABLE acceler6_accelrefer.prospects ADD KEY company_ui (company);
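For anyone who wants to try the duplicate check without touching the real table, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the COLLATE NOCASE clause are assumptions of mine, not something from the question: NOCASE is there to imitate a case-insensitive collation such as the latin1 default, under which 'Acme' and 'acme' count as the same value. That is one possible reason values that look unique can still collide, though nothing in the question confirms it is the cause here.

```python
import sqlite3

def find_duplicate_companies(conn):
    """Return (company, count) for every company value that occurs more than once."""
    return conn.execute(
        "SELECT company, COUNT(*) AS counter "
        "FROM prospects "
        "GROUP BY company "
        "HAVING COUNT(*) > 1 "
        "ORDER BY company"
    ).fetchall()

conn = sqlite3.connect(":memory:")
# Assumption: COLLATE NOCASE imitates a case-insensitive MySQL collation.
conn.execute(
    "CREATE TABLE prospects ("
    " client_id INTEGER PRIMARY KEY,"
    " company TEXT COLLATE NOCASE)"
)
# Hypothetical sample data: two spellings of one company, one other, one NULL.
conn.executemany(
    "INSERT INTO prospects (company) VALUES (?)",
    [("Acme",), ("acme",), ("Globex",), (None,)],
)

duplicates = find_duplicate_companies(conn)
print(duplicates)
```

The two spellings land in the same group, so the query reports one duplicated company with a count of 2, while the NULL row is left alone.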
Context:
my user_feed table is stored in a MariaDB database
the table uses the InnoDB storage engine
Clarity note: Throughout the question, whenever I use the term "user's feed" what I'm referring to is all the records in the user_feed table that have identical values set for the user_id field.
So initially, on login, the user gets the top 40 posts that have their user id as a foreign key in the user_feed table. The query gets the top 40 posts using an ORDER BY date_created clause. When the user scrolls down to, let's say post number 30, I want to query for the next 40 posts in their feed. Right now, I plan on using the date created of the last post the user has in the app, to determine what posts to get from the user_feed table.
My question is: If I set the date_created timestamp of a post when it is inserted into the user_feed table, is it possible that two posts for a particular user's feed will have the same timestamp?
user_feed table:
CREATE TABLE `user_feed` (
`user_id` int(1) unsigned NOT NULL,
`post_id` int(1) unsigned NOT NULL,
`reposter_id` int(1) unsigned DEFAULT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`user_id`,`post_id`),
KEY `user_id` (`user_id`),
KEY `date_created` (`date_created`),
KEY `post_id` (`post_id`),
KEY `reposter_id` (`reposter_id`),
CONSTRAINT `user_feed_ibfk_1` FOREIGN KEY (`post_id`) REFERENCES `post` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `user_feed_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Update: So the complexity that it takes to use time as an indicating factor for determining uniqueness seems to be out of the question. The next option would be to have a column id that increments with each new post, so I can query for the next 40 records in the user_feed table that correspond to a particular user_id and have ids that are smaller than the 40th post's id that was received from the last query. However, there seem to be some issues with this approach as well:
One, a user's feed cannot have records with identical post_ids i.e. if you looked at a user's entire feed, you wouldn't see two records that corresponded to the same post. This means that whenever a repost is made, a deletion has to occur if a particular user's feed contains a record with the same post_id of the one that was reposted. Then, there will be an insertion of a new record that has the reposter_id field set. The other option would be to update the existing record by setting the reposter_id field to the id of the reposter and the date_created field to the date it was reposted. Using updates seems to be more efficient, but with a new auto_incrementing id column, I'd have to update the auto_incrementing column manually by getting the next possible auto_increment value and use it to update the id field.
The immediate problem I see with this is: what if, while the repost's id field is being updated, another user creates a post meant for this user's feed? Because the id column doesn't need to be set manually for new posts (the id of a post that never existed before will never already be in the user_feed table), the insert of the new post record could beat the update of the other record and take the same id that was retrieved for the update, leading to a "primary key already exists" exception.
The other issue that seems to exist with using a unique id column for the entire table is that a single post will have a unique id for each user's feed it's placed in (this is a fan-out system for getting a user's feed, if you haven't noticed by now). And a single post can be reposted millions of times, so each of those reposts will also have a unique id. It seems like the value of the id column would increment too quickly, unless each user's feed had its own auto-incrementing field, i.e. to get the next highest id value for a user's feed I'd have to add 1 to the result of the following query:
SELECT MAX(id) FROM user_feed WHERE user_id = :user_id
where :user_id is the id of the user whose feed is changing.
Any feedback on the two points above?
What a tangled web. Let's start with some principles...
Inserts, replacements, and deletes can occur while a client is scanning the list?
You need a unique key for each row in the table. This should be the PRIMARY KEY, and may as well be an id .. AUTO_INCREMENT.
You need to fetch a range of rows ordered by time. Suggest INDEX(date_created, id). This is ordered, and has no dups (because of id). There is no need to say UNIQUE instead of INDEX. Do not use OFFSET.
You need to replace an existing item. For that, you need a unique key. It could be id, if you can hang on to the value until you need to do the replacement. Or you could have another column (or combination of columns) that are unique in order to determine which row to reinsert or replace or update. Note that you will probably want the old timestamp to be kept. You probably don't care if the id changes; in the rare case of a duplicate timestamp, the pair of items may be swapped.
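The "order by (date_created, id) and don't use OFFSET" idea can be sketched roughly like this. It uses Python's sqlite3 module as a stand-in for MariaDB so that it runs anywhere. The auto-increment id column, the fetch_page helper, the index name, the leading user_id in the index, and the sample rows are all assumptions for illustration. The important part is the WHERE clause, which resumes strictly after the last row the client already has, even when two rows share a timestamp.

```python
import sqlite3

def fetch_page(conn, user_id, page_size, cursor=None):
    """Return up to page_size feed rows, newest first.

    cursor is the (date_created, id) pair of the last row already shown,
    or None for the first page.
    """
    if cursor is None:
        return conn.execute(
            "SELECT id, post_id, date_created FROM user_feed "
            "WHERE user_id = ? "
            "ORDER BY date_created DESC, id DESC LIMIT ?",
            (user_id, page_size),
        ).fetchall()
    last_date, last_id = cursor
    # Strictly older timestamp, or same timestamp with a smaller id.
    return conn.execute(
        "SELECT id, post_id, date_created FROM user_feed "
        "WHERE user_id = ? "
        "AND (date_created < ? OR (date_created = ? AND id < ?)) "
        "ORDER BY date_created DESC, id DESC LIMIT ?",
        (user_id, last_date, last_date, last_id, page_size),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_feed ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_id INTEGER NOT NULL,"
    " post_id INTEGER NOT NULL,"
    " date_created TEXT NOT NULL)"
)
conn.execute("CREATE INDEX feed_order ON user_feed (user_id, date_created, id)")
# Hypothetical data: posts 102 and 103 deliberately share a timestamp.
conn.executemany(
    "INSERT INTO user_feed (user_id, post_id, date_created) VALUES (?, ?, ?)",
    [(1, 101, "2020-01-01 10:00:00"),
     (1, 102, "2020-01-01 10:00:05"),
     (1, 103, "2020-01-01 10:00:05"),
     (1, 104, "2020-01-01 10:00:09")],
)

first = fetch_page(conn, 1, 2)
last = first[-1]
second = fetch_page(conn, 1, 2, cursor=(last[2], last[0]))
print(first)
print(second)
```

Because the id breaks ties, the post that shares a timestamp with the last row of the first page is neither skipped nor repeated on the second page.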
I have a composite unique key on two columns, "user_id" & "project_id".
When I try to run a DELETE query on single rows or multiple rows, I get an error.
ERROR 1062: 1062: Duplicate entry '87-1736' for key 'index_on_user_id_and_project_id'
SQL Statement:
DELETE FROM `members` WHERE `id`='39142'
The table has a single column primary key, 2 single column unique indexes (for user_id and project_id), and 1 composite unique index on user_id and project_id. No foreign keys in the database.
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL DEFAULT '0',
`project_id` int(11) NOT NULL DEFAULT '0',
`created_on` datetime DEFAULT NULL,
`mail_notification` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `index_on_user_id_and_project_id` (`user_id`,`project_id`),
KEY `index_members_on_user_id` (`user_id`),
KEY `index_members_on_project_id` (`project_id`)
This error only shows up for certain entries (a lot of entries) and it is consistently those entries that are problematic (e.g. 87 and 1736 pair shown above).
I have tried looking for duplicates and none were found. I was able to find some entries in there with "0"s in the fields and I removed those entries. No NULL fields were found.
I have tried:
looking for duplicates, found none.
looking for zero or NULL values in the index fields, deleted, but did not solve
removing the composite unique index, did not solve.
alter ignore table ... add unique index (user_id, project_id), it found no duplicates, threw a warning about IGNORE being deprecated, and did not solve
How do I delete these problematic entries?
It is impossible for a delete statement itself to generate a duplicate key error. At least, I cannot think of any way for that to happen in an unbroken database. After all, if you are removing a value, it can't conflict with another value.
That leaves the possibility that something else is going on. The only reasonable alternative is a trigger on the table. It is unfortunate that the error message doesn't specify the table name, but that is the only cause that I can readily think of.
I've bumped into this before when I've had the target table (t1) tied to a history table (t1_hist). t1_hist was populated by a trigger on changes to the t1 (any add/change/delete). Once I deleted the unwanted records from the t1_hist, I was able to delete from the t1. This required a second pass delete from the t1_hist table because it recorded my deletes from t1.
Simply, mine was: DELETE FROM t1 WHERE customer_number > 50000;
Same error. Did the same from t1_hist first (and last), then no problem.
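Here is a small reproduction of that situation, sketched with Python's sqlite3 module rather than MySQL so it can be run without a server. The trigger name, the column layout, and the stale history row are assumptions made up for the demo. SQLite words its error differently from MySQL's 1062, but the mechanism is the same: the DELETE itself is fine, and the INSERT fired by the trigger is what collides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, customer_number INTEGER);
CREATE TABLE t1_hist (id INTEGER PRIMARY KEY, customer_number INTEGER);
-- Hypothetical audit trigger: every row deleted from t1 is copied to t1_hist.
CREATE TRIGGER t1_after_delete AFTER DELETE ON t1
BEGIN
    INSERT INTO t1_hist (id, customer_number)
    VALUES (OLD.id, OLD.customer_number);
END;
INSERT INTO t1 (id, customer_number) VALUES (1, 50001);
-- A stale history row that already uses the same key.
INSERT INTO t1_hist (id, customer_number) VALUES (1, 50001);
""")

def try_delete(conn):
    """Run the delete; return None on success or the error text on failure."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM t1 WHERE customer_number > 50000")
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

first_attempt = try_delete(conn)   # fails: the trigger's INSERT hits a duplicate
with conn:
    conn.execute("DELETE FROM t1_hist WHERE customer_number > 50000")
second_attempt = try_delete(conn)  # succeeds once the history row is gone
remaining = conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
print(first_attempt, second_attempt, remaining)
```

On a real MySQL server, SHOW TRIGGERS lists the triggers in the current database, which is a quick way to check whether something like this is attached to the table.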
I have the following (simplified) database table that represents teams in a tournament. Each team belongs to a pool and has a rank (1st, 2nd, etc) in that pool.
CREATE TABLE `team` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`pool_id` bigint(20) NOT NULL,
`rank` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `pool_id_rank_idx` (`pool_id`,`rank`)
)
The unique key is there because a single pool cannot have two teams with the same rank.
When teams play each other, their ranks change and I need to update them. I'd like to be able to update all the ranks in one query, but I cannot figure out a way to do it that doesn't sometimes cause a "duplicate entry" error. Here's an example of a situation that causes this problem:
Two teams (A and B) are ranked 1 and 2 respectively in a pool. They play each other and B beats A. Now I need to switch their ranks. The query I'm using is something like this:
UPDATE team
SET rank = CASE id WHEN idA THEN 2 WHEN idB THEN 1 END
WHERE id IN (idA,idB);
idA and idB are the ids of the corresponding teams
This seems like it should work nicely, but I get this error:
ERROR 1062 (23000): Duplicate entry '1-2' for key 'pool_id_rank_idx'
I think this happens because MySQL checks the unique keys after each row is changed.
Is there any way to postpone the unique key check until after all the changes are done?
Rank is a dynamic value, so why have you added it to a unique key? Or do you mean that two teams in one pool cannot have the same rank?
When you update the data row by row, two identical (pool_id, rank) pairs exist for a moment. That is why you get the error.
Remove Unique Key:
ALTER TABLE team DROP INDEX pool_id_rank_idx
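If dropping the unique key is not acceptable, a commonly used workaround is to park one team on a temporary value inside a transaction and then assign the final ranks. This is a different technique from the one suggested above. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, and swap_ranks plus the sample ids are hypothetical. It relies on rank being nullable, since a unique index allows several NULLs, and on the engine being transactional so that other sessions never see the intermediate state.

```python
import sqlite3

def swap_ranks(conn, id_a, id_b):
    """Exchange the ranks of two teams without dropping the unique key."""
    with conn:  # one transaction: commit on success, roll back on error
        rank_a = conn.execute(
            'SELECT "rank" FROM team WHERE id = ?', (id_a,)).fetchone()[0]
        rank_b = conn.execute(
            'SELECT "rank" FROM team WHERE id = ?', (id_b,)).fetchone()[0]
        # Step 1: park team A on NULL so its old rank becomes free.
        conn.execute('UPDATE team SET "rank" = NULL WHERE id = ?', (id_a,))
        # Step 2: team B takes A's old rank; nothing else holds it now.
        conn.execute('UPDATE team SET "rank" = ? WHERE id = ?', (rank_a, id_b))
        # Step 3: team A takes B's old rank, which B has just released.
        conn.execute('UPDATE team SET "rank" = ? WHERE id = ?', (rank_b, id_a))

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE team ('
    ' id INTEGER PRIMARY KEY,'
    ' pool_id INTEGER NOT NULL,'
    ' "rank" INTEGER,'
    ' UNIQUE (pool_id, "rank"))'
)
# Hypothetical teams: id 10 is ranked 1st, id 20 is ranked 2nd in pool 1.
conn.executemany(
    'INSERT INTO team (id, pool_id, "rank") VALUES (?, ?, ?)',
    [(10, 1, 1), (20, 1, 2)],
)

swap_ranks(conn, 10, 20)
ranks = dict(conn.execute('SELECT id, "rank" FROM team'))
print(ranks)
```

It costs three statements instead of one, but the constraint stays in place and keeps guarding against two teams ending up with the same rank.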
I have a simple table set up with two columns; each column is a key value. The values stored in each field are varchar(45), representing an email address and a keyword. It is possible that the information collected may duplicate itself, as it is related to site browsing data collection. To avoid duplicate entries, I tried INSERT IGNORE INTO, then REPLACE INTO, and finally I'm trying the following:
insert into <table name> (user_email, key_token) values ('<email>#<this>.com', 'discountsupplies') on duplicate key update user_email='<email>#<this>.com',key_token='discountsupplies';
but I am still seeing duplicate records being inserted into the table.
The SQL that generated the table:
DROP TABLE IF EXISTS `<database name>`.`<table name>` ;
CREATE TABLE IF NOT EXISTS `<database name>`.`<table name>` (
`user_email` VARCHAR(45) NOT NULL ,
`key_token` VARCHAR(45) NOT NULL,
PRIMARY KEY (`user_email`, `key_token`) )
ENGINE = InnoDB;
While I saw several questions that were close to this one, I did not see any that addressed why this might be happening, and I'd like to figure out what I'm not understanding about this behavior. Any help is appreciated.
As an addendum, After adding the UNIQUE KEY statements, I went back and tried both REPLACE and INSERT IGNORE to achieve my goal, and none of these options is excluding duplicate entries.
Also adding: UNIQUE INDEX (user_email, key_token)
doesn't seem to help either.
I'm going to do this check via a manual look-up routine until I can figure this out. If I find an answer I'll be happy to update the post.
Added Unique Index lines below the original create table statement -
-- -----------------------------------------------------
-- Table `<db name>`.`<table name>`
-- -----------------------------------------------------
DROP TABLE IF EXISTS `<db name>`.`<table name>` ;
CREATE TABLE IF NOT EXISTS `<db name>`.`<table name>` (
`user_email` VARCHAR(45) NOT NULL ,
`key_token` VARCHAR(45) NOT NULL,
PRIMARY KEY (`user_email`, `key_token`),
UNIQUE KEY (user_email),
UNIQUE KEY (key_token)
)
ENGINE = InnoDB;
CREATE UNIQUE INDEX ix_<table name>_useremail on `<db name>`.`<table name>`(user_email);
CREATE UNIQUE INDEX ix_<table name>_keytoken on `<db name>`.`<table name>`(key_token);
it seems to be ok (no errors when creating tables during the source step), but I'm still getting duplicates when running the on duplicate query.
You have a composite primary key on both columns.
This means that the combination of the fields is UNIQUE, not each field on its own.
These rows are all possible in the table:
1#example.com 1
2#example.com 1
2#example.com 2
since no combination of (user_email, key_token) repeats in the table, while user_email and key_token by themselves can repeat.
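To see this behaviour in isolation, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The table name tokens and the add_token helper are made up for the demo, and SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE. A repeated email or a repeated token on its own is accepted. Only a repeat of the exact pair is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tokens ("
    " user_email TEXT NOT NULL,"
    " key_token TEXT NOT NULL,"
    " PRIMARY KEY (user_email, key_token))"
)

def add_token(conn, user_email, key_token):
    """Insert the pair unless that exact pair exists; True if a row was added."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO tokens (user_email, key_token) VALUES (?, ?)",
            (user_email, key_token),
        )
    return cur.rowcount == 1

results = [
    add_token(conn, "1#example.com", "1"),
    add_token(conn, "2#example.com", "1"),  # key_token repeats: allowed
    add_token(conn, "2#example.com", "2"),  # user_email repeats: allowed
    add_token(conn, "2#example.com", "2"),  # exact pair repeats: ignored
]
row_count = conn.execute("SELECT COUNT(*) FROM tokens").fetchone()[0]
print(results, row_count)
```

So rows that look like "duplicates" because they share an email are not duplicates as far as the composite key is concerned.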
If you want each separate column to be UNIQUE, define the UNIQUE constraints on the fields:
CREATE TABLE IF NOT EXISTS `<database name>`.`<table name>` (
`user_email` VARCHAR(45) NOT NULL ,
`key_token` VARCHAR(45) NOT NULL,
PRIMARY KEY (`user_email`, `key_token`),
UNIQUE KEY (user_email),
UNIQUE KEY (key_token)
)
ENGINE = InnoDB;
Update
Having duplicates in a column marked as UNIQUE would be a level 1 bug in MySQL.
Could you please run the following queries:
SELECT user_email
FROM mytable
GROUP BY
user_email
HAVING COUNT(*) > 1;
SELECT key_token
FROM mytable
GROUP BY
key_token
HAVING COUNT(*) > 1;
and see if they return something?
PRIMARY KEY (user_email, key_token) means the combination of both will be unique, but if you also want individual emails and key_tokens to be unique, you have to use UNIQUE separately for each column.
PRIMARY KEY (`user_email`, `key_token`),
UNIQUE KEY (user_email),
UNIQUE KEY (key_token)
final solution for now: query table to get list of key_tokens by user_email, test current key_token against list entries, if found don't insert.
Not optimal or pretty, but it works....
To me it looks like you selected composite Primary Key solely for performance reasons where it should be an index like so
CREATE TABLE IF NOT EXISTS `<database name>`.`<table name>` (
`user_email` VARCHAR(45) NOT NULL ,
`key_token` VARCHAR(45) NOT NULL,
PRIMARY KEY (`user_email`),
INDEX (`user_email`, `key_token`)
)
Of course if you are concerned about getting a duplicate key_token you'll still need a unique index.
Sorry I'm awfully late to reply, but perhaps someone will stumble on this like I have :)
RESOLVED
From the developer: the problem was that a previous version of the code was still writing to the table which used manual ids instead of the auto increment. Note to self: always check for other possible locations where the table is written to.
We are getting duplicate keys in a table. They are not inserted at the same time (6 hours apart).
Table structure:
CREATE TABLE `table_1` (
`sales_id` int(10) unsigned NOT NULL auto_increment,
`sales_revisions_id` int(10) unsigned NOT NULL default '0',
`sales_name` varchar(50) default NULL,
`recycle_id` int(10) unsigned default NULL,
PRIMARY KEY (`sales_id`),
KEY `sales_revisions_id` (`sales_revisions_id`),
KEY `sales_id` (`sales_id`),
KEY `recycle_id` (`recycle_id`)
) ENGINE= MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=26759 ;
The insert:
insert into `table_1` ( `sales_name` ) VALUES ( "Blah Blah" )
We are running MySQL 5.0.20 with PHP5 and using mysql_insert_id() to retrieve the insert id immediately after the insert query.
I have had a few duplicate key errors suddenly appear in MySQL databases in the past, even though the primary key was defined and auto_increment. Each and every time, it has been because the table had become corrupted.
If it is corrupt, checking the table should expose the problem. You can do this by running:
CHECK TABLE tbl_name
If it comes back as corrupt in any way (it will usually say the size is bigger than it actually should be), then just run the following to repair it:
REPAIR TABLE tbl_name
Does the sales_id field have a primary (or unique) key? If not, then something else is probably making inserts or updates that is re-using existing numbers. And by "something else" I don't just mean code; it could be a human with access to the database doing it accidentally.
As the others said, with your example it's not possible.
It's unrelated to your question, but you don't have to make a separate KEY for the primary key column -- it's just adding an extra not-unique index to the table when you already have the unique (primary) key.
We are getting duplicate keys in a table.
Do you mean you are getting errors as you try to insert, or do you mean you have some values stored in the column more than once?
Auto-increment only kicks in when you omit the column from your INSERT, or try to insert NULL or zero. Otherwise, you can specify a value in an INSERT statement, overriding the auto-increment mechanism. For example:
INSERT INTO table_1 (sales_id) VALUES (26759);
If the value you specify already exists in the table, you'll get an error.
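A quick way to watch generated and explicit ids collide is the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL, so AUTOINCREMENT here is SQLite's flavour and the details of how the counter advances differ from MySQL's. The insert_sale helper and the "legacy writer" rows are hypothetical, standing in for a second code path that supplies its own ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 ("
    " sales_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sales_name TEXT)"
)

def insert_sale(conn, sales_name, sales_id=None):
    """Insert a row; return its id, or None if that key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO table_1 (sales_id, sales_name) VALUES (?, ?)",
                (sales_id, sales_name),  # None lets the database pick the id
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None

auto_id = insert_sale(conn, "Blah Blah")                    # generated id
manual_id = insert_sale(conn, "legacy writer", auto_id + 1)  # explicit id
clash = insert_sale(conn, "legacy writer again", manual_id)  # same explicit id
next_auto = insert_sale(conn, "Blah Blah 2")                 # generated again
print(auto_id, manual_id, clash, next_auto)
```

The third insert is the only one that fails, and it fails because of the explicit value, not because of the auto-increment mechanism.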
Please post the results of this query:
SELECT `sales_id`, COUNT(*) AS `num`
FROM `table_1`
GROUP BY `sales_id`
HAVING `num` > 1
ORDER BY `num` DESC
If you have a unique key on other fields, that could be the problem.
If you have reached the highest value for your auto_increment column, MySQL will keep trying to re-insert it. For example, if sales_id was a signed tinyint column, you would get duplicate key errors after you reached id 127.