ERROR 1062 (23000): Duplicate entry '37205' for key 'citystate.PRIMARY' - mysql

I am getting the error shown in the title. I created a new table successfully, and then had to insert data from another table into it. The first insert worked, but when I inserted data from a second table into the new table, I got this error. What can I do to avoid it?
This is what I used to create the table:
CREATE TABLE cityState (
city VARCHAR(90) NOT NULL,
state CHAR(2) NOT NULL,
zipCode CHAR(5) NOT NULL UNIQUE,
primary key (zipCode)
);
This next set of commands worked
INSERT INTO cityState (city, state, zipCode)
SELECT city, state, zipCode
FROM crew;
This next set of commands produces the error; basically, I'm trying to fix the duplicate problem:
INSERT INTO cityState (city, state, zipCode)
SELECT city, state, zipCode
FROM passenger;

If you want to just ignore duplicate zipcodes, change INSERT to INSERT IGNORE.
But your whole table design is confusing to me; some cities have multiple zipcodes and some zipcodes have multiple cities. If you get rid of the unique constraint on zipcode, you may want to just do:
INSERT INTO cityState (city, state, zipCode)
SELECT city, state, zipCode FROM crew
UNION DISTINCT
SELECT city, state, zipCode FROM passenger
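As a minimal sketch of the INSERT IGNORE option mentioned at the start of this answer, applied to the failing statement from the question (assuming the unique key on zipCode stays in place):

```sql
-- Rows whose zipCode already exists in cityState are skipped
-- and reported as warnings instead of aborting the statement.
INSERT IGNORE INTO cityState (city, state, zipCode)
SELECT city, state, zipCode
FROM passenger;

-- Optional: inspect what was skipped by the statement above.
SHOW WARNINGS;
```

IGNORE also downgrades some other errors to warnings, so it is worth reading the SHOW WARNINGS output rather than assuming only duplicates were skipped.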

It is difficult to know exactly what is going on without some examples of the data you are trying to copy from one table to another, but looking at your SQL, you've declared the zipCode field of your cityState table to be UNIQUE and to serve as the PRIMARY KEY. My guess is that some record from your passenger table has a zipCode that already exists in the cityState table, so the UNIQUE index rejects it because the value would no longer be unique.
Keep in mind that there are cases where a single zip code could serve more than one city. These cases are rare, but they do exist. For example, the zip code 94608 in California is used for both Emeryville and parts of Oakland.
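To check that guess against the real data, something like the following could be run before the second insert. It is only a sketch using the table and column names from the question; the column aliases are made up for readability.

```sql
-- Zip codes in passenger that are already present in cityState.
SELECT p.zipCode,
       p.city AS passenger_city,
       c.city AS existing_city
FROM passenger AS p
JOIN cityState AS c ON c.zipCode = p.zipCode;

-- Zip codes that occur more than once within passenger itself;
-- these would also violate the primary key on insert.
SELECT zipCode, COUNT(*) AS occurrences
FROM passenger
GROUP BY zipCode
HAVING COUNT(*) > 1;
```

If the first query shows the same zip code with different city names, that is the multi-city case described above, and ignoring duplicates would silently drop one of the cities.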

Related

Remove Duplicates in MySQL

I have a database table that was generated by importing several thousand text documents each very large. For some reason, some files were imported multiple times.
I am trying to remove duplicate rows by using following query:
ALTER IGNORE TABLE mytable ADD UNIQUE INDEX myindex (LASTNAME, FIRSTNAME, HOUSENUMBER, STREET, CITY, ZIP, DOB, SEX);
but I was getting an error
1062 - Duplicate entry
Apparently, IGNORE has been deprecated.
How can I remove duplicates from my database?
I guess I have to do a DELETE with a JOIN but I can't figure out the code.
The table is InnoDB and currently has about 40,000,000 rows (there should be about 17,000,000). Each row has a primary key.
Considering the size, I am hesitant to temporarily change the table to MyISAM.
"Each row has a primary key"
Is it a unique number?
Create an AUX table like this (assuming ID is the PK):
create table mytable_aux as (
select LASTNAME, FIRSTNAME, HOUSENUMBER, STREET, CITY, ZIP, DOB, SEX, MIN(ID) as id
from mytable
group by LASTNAME, FIRSTNAME, HOUSENUMBER, STREET, CITY, ZIP, DOB, SEX);
Then delete everything that is not in the aux table:
delete from mytable where id not in (select aux.id from mytable_aux aux);
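On a table of this size the NOT IN subquery may be slow. One possible variant, sketched here under the assumption that ID is an integer primary key, stores only the surviving ids in an indexed helper table and deletes with an anti-join. The name mytable_keep and the BIGINT type are assumptions, not something from the question.

```sql
-- Hypothetical helper table: one id (the smallest) per distinct row.
-- BIGINT is an assumption; match the real type of mytable.ID.
CREATE TABLE mytable_keep (id BIGINT NOT NULL PRIMARY KEY)
SELECT MIN(ID) AS id
FROM mytable
GROUP BY LASTNAME, FIRSTNAME, HOUSENUMBER, STREET, CITY, ZIP, DOB, SEX;

-- Delete every row whose id is not in the keep list (anti-join).
DELETE t
FROM mytable AS t
LEFT JOIN mytable_keep AS k ON k.id = t.ID
WHERE k.id IS NULL;
```

Trying this on a copy of the table first is probably wise, since a single DELETE touching more than half of 40,000,000 rows is a long transaction.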
Assuming it is just one table and you have the SQL dump available...
CREATE the table with all the relationships established but no data inserted. Keep the INSERT statements stored in a separate .sql file.
Change all the INSERT statements to INSERT IGNORE.
Import the updated .sql file containing only the INSERT IGNORE statements. The duplicates will be automatically ignored.
Please note that, without comparing manually, you won't be able to figure out which or how many records were ignored.
However, if you're absolutely sure that you really don't need the duplicates based on the relationships defined on the table, then this approach works fairly well.
Also, if you'd like to do the same with multiple tables, you'll have to make sure that you CREATE all the tables at the start, define the foreign keys / dependencies AND, most importantly, arrange the new .sql file in such a manner that the table that has no dependency gets the INSERT statements loaded first. Likewise, the last set of INSERT statements will be for the table with the most number of dependencies.
Hope that helps.
If those are the only fields in your table you can always:
create table temp_unique as
select distinct LASTNAME, FIRSTNAME, HOUSENUMBER, STREET, CITY, ZIP, DOB, SEX
from mytable
then rename (or drop if you dare) mytable and rename temp_unique to mytable, then create your indexes (make sure to create any other indexes or FKs or whatever that already exist).
If you're working on a live table you'll have to delete the underlying records one at a time. That's quite a bit different -- add a uid then perform deletes. If that's your situation, let us know, we can refactor.
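The rename step described above might look roughly like this. The name mytable_old and the index definition are placeholders; the real indexes and foreign keys have to be taken from the original table definition.

```sql
-- Swap the deduplicated copy in; a multi-table RENAME is atomic,
-- so no query sees a moment where mytable does not exist.
RENAME TABLE mytable TO mytable_old,
             temp_unique TO mytable;

-- CREATE TABLE ... SELECT does not copy indexes, so recreate them.
-- The index name and columns below are placeholders.
ALTER TABLE mytable ADD INDEX idx_name (LASTNAME, FIRSTNAME);
```

Keeping mytable_old around until the new table has been checked gives a way back if something is missing.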

MySQL relationship issue

My table structure is below:
person (id, fname, lname, ph, mob, dob, email)
address (id, address1, address2, address3, town, county, postcode)
person_address (id, person_id, address_id)
My issue is this: if one person has more than one address, how do I work out which is the active or current address? Should I add a direct link to the address table, such as person (id, fname, lname, ph, mob, dob, email, address_id),
or should I link the person table to person_address, as in person (id, fname, lname, ph, mob, dob, email, person_address_id)?
Any ideas?
Presumably, a person can have only one current address. If so, you should add a column into the person table, called something like CurrentAddress.
If you require a current address, you can even declare CurrentAddress to be NOT NULL.
If a person could have more than one current address, then use a flag in person_address.
Now, if you want the current address to be the most recently inserted address, you can use a trigger to reset the value on each insert. Or, if your database is not too big (thousands of rows, not millions of rows), you can calculate it on the fly by choosing the person_address record with the most recent creation time.
EDIT:
@Joanvo's point is a good one. You can fix it by adding a foreign key constraint in person referring to person_address. You will have to create a unique constraint on person_address(person_id, address_id) and use that for the foreign key.
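A rough sketch of that arrangement, assuming the three tables from the question already exist, that the id columns are INT, and that the new column is named CurrentAddress as suggested above. The constraint names are made up.

```sql
-- The foreign key below needs a unique key on the referenced pair.
ALTER TABLE person_address
  ADD UNIQUE KEY uq_person_address (person_id, address_id);

-- CurrentAddress is nullable here so a person can be inserted
-- before any person_address row exists for them.
ALTER TABLE person
  ADD COLUMN CurrentAddress INT NULL,
  ADD CONSTRAINT fk_person_current_address
    FOREIGN KEY (id, CurrentAddress)
    REFERENCES person_address (person_id, address_id);
```

Because the key pairs person.id with CurrentAddress, the current address can only be one that is actually linked to that same person.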
You should add a bit in the person_address table to indicate whether it is the current address or not. Make sure there is no more than one current address per person, via SQL or code checks:
person_address (id, person_id, address_id, current)
If you add a reference to the person_address table in the person table, that would create a circular dependency between person_address and person, since person_address already holds a reference to person. That is something you should avoid in database design.
Now, if you want a clean design, I would recommend adding a reference from the person table to the address table, as you did in your first example. Your design would then be as follows:
person (id, fname, lname, ph, mob, dob, email, address_id)
EDIT:
Another solution is a current_flag in your person_address table: when you want the current address for a person, you look for the person_address entry that has current_flag set to 1 for that person.
person_address (id, person_id, address_id, current_flag)
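With the current_flag column in place, reading and switching the current address could look like the sketch below. The ids 42 and 7 are placeholder values, and current_flag is assumed to be a TINYINT holding 0 or 1.

```sql
-- Fetch the current address of person 42.
SELECT a.*
FROM person_address AS pa
JOIN address AS a ON a.id = pa.address_id
WHERE pa.person_id = 42
  AND pa.current_flag = 1;

-- Make address 7 the current one for person 42. A single UPDATE
-- sets the flag to 1 on the matching row and 0 on all others,
-- so there is never more than one current address for the person.
UPDATE person_address
SET current_flag = (address_id = 7)
WHERE person_id = 42;
```

Doing the switch in one statement avoids a window in which the person has zero or two current addresses.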

Check if entry exists and not insert in mysql

I am doing an insert from an imported table and adding data into multiple tables using mysql.
Basically when doing the insert there are some null fields as the data has been imported from a csv.
What I want to do is extract the data and not create multiple null entries. An example is when adding contacts which have no entries. Basically I want to have one entry in the table which can be bound to the id within the table.
How can i do this?
My current code is
Insert into Contact(FirstName, Surname, Position, TelephoneNo, EmailAddress, RegisteredDate)
Select Distinct Import.FirstName, Import.SecondName, Import.JobTitle,
Import.ContactTelNumber, Import.EmailAddress, Import.RegistrationDate
FROM Import
This basically imports and does no checks but where can I add check for this?
It's hard to infer exactly what you mean from your description. It would help if you showed a couple of example lines, one that you want included and one that you want to be excluded.
But you can add a variety of conditions in the WHERE clause of your SELECT. For example, if you just want to make sure that at least one column in Import is non-null, you could do this:
INSERT INTO Contact(FirstName, Surname, Position, TelephoneNo,
EmailAddress, RegisteredDate)
SELECT DISTINCT FirstName, SecondName, JobTitle,
ContactTelNumber, EmailAddress, RegistrationDate
FROM Import
WHERE COALESCE(FirstName, SecondName, JobTitle, ContactTelNumber,
EmailAddress, RegistrationDate) IS NOT NULL
COALESCE() is a function that accepts a variable number of arguments, and returns the first non-null argument. If all the arguments are null, it returns null. So if we coalesce all the columns, and we get a null, then we know that all the columns are null, and we exclude that row.
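A quick way to see that behaviour in isolation, using literal values rather than the Import table:

```sql
SELECT COALESCE(NULL, NULL, 'x', 'y');  -- first non-null argument: 'x'
SELECT COALESCE(NULL, NULL);            -- all arguments null: NULL

-- If the CSV import stored empty strings instead of NULL (an
-- assumption worth checking), NULLIF can map them to NULL first.
SELECT COALESCE(NULLIF('', ''), NULLIF('', ''));  -- NULL
```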
Re your comment:
Okay, it sounds like you want a unique constraint over the whole row, and you want to copy only rows that don't violate the unique constraint.
One way to accomplish this would be the following:
ALTER TABLE Contact ADD UNIQUE KEY (FirstName, Surname, Position, TelephoneNo,
EmailAddress, RegisteredDate);
INSERT IGNORE INTO Contact(FirstName, Surname, Position, TelephoneNo,
EmailAddress, RegisteredDate)
SELECT DISTINCT FirstName, SecondName, JobTitle,
ContactTelNumber, EmailAddress, RegistrationDate
FROM Import;
The INSERT IGNORE means if it encounters an error like a duplicate row, don't insert it, but also don't abort the insert for other rows.
The unique constraint creates an index, so it will take some time to run that ALTER TABLE, depending on the size of your table.
Also, it may be impractical to have a key containing many columns. An index can have at most 16 columns, and its total key length is limited as well (1000 bytes for MyISAM; for InnoDB, 767 or 3072 bytes depending on version and row format). However, I would expect that what you really want is to restrict to one row per EmailAddress or some other subset of the columns.
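If one row per EmailAddress is indeed what is wanted, a narrower key might be enough. This is a sketch only; the key name uq_contact_email is made up, and the ALTER fails if Contact already contains duplicate e-mail addresses.

```sql
ALTER TABLE Contact ADD UNIQUE KEY uq_contact_email (EmailAddress);

-- The first row seen for each e-mail address is inserted;
-- later rows with the same address are skipped.
INSERT IGNORE INTO Contact(FirstName, Surname, Position, TelephoneNo,
                           EmailAddress, RegisteredDate)
SELECT DISTINCT FirstName, SecondName, JobTitle,
       ContactTelNumber, EmailAddress, RegistrationDate
FROM Import;
```

Rows whose EmailAddress is NULL are not deduplicated by this key, because a unique index allows any number of NULLs.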

Remove Duplicate Data From mysql database

Working on database-related tasks often means that a client sends you its data as Excel sheets, and you push that data into database tables after some manipulation in Excel. I have done this many times.
A very common problem faced in this approach is that it might result in duplicate rows at times, because data sent is mostly from departments like HR and finance where people are not well aware of data normalization techniques [:-)].
I will use Employee table where column names are id, name, department and email.
Below are the SQL scripts for generating the test data.
Create schema TestDB;
CREATE TABLE EMPLOYEE
(
ID INT,
NAME Varchar(100),
DEPARTMENT INT,
EMAIL Varchar(100)
);
INSERT INTO EMPLOYEE VALUES (1,'Anish',101,'anish@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (2,'Lokesh',102,'lokesh@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (3,'Rakesh',103,'rakesh@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (4,'Yogesh',104,'yogesh@howtodoinjava.com');
-- These are the duplicate rows
INSERT INTO EMPLOYEE VALUES (5,'Anish',101,'anish@howtodoinjava.com');
INSERT INTO EMPLOYEE VALUES (6,'Lokesh',102,'lokesh@howtodoinjava.com');
Solution:
DELETE e1 FROM EMPLOYEE e1, EMPLOYEE e2 WHERE e1.name = e2.name AND e1.id > e2.id;
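Before running a delete like this, it may help to preview which rows it would remove. The sketch below uses the same self-join as a SELECT and, as a stricter variant, compares every non-id column rather than name alone.

```sql
-- Rows that have an otherwise identical row with a smaller id.
SELECT e1.*
FROM EMPLOYEE AS e1
JOIN EMPLOYEE AS e2
  ON  e1.name       = e2.name
  AND e1.department = e2.department
  AND e1.email      = e2.email
  AND e1.id > e2.id;
```

For the test data above this returns the rows with id 5 and 6, which are the ones the DELETE removes.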
Delete all duplicate records in the Excel sheet itself using a filter, and then insert those records into your database.
Use the DISTINCT keyword to keep only unique rows.
See the Stack Overflow question "How to insert Distinct Records from Table A to Table B (both tables have same structure)".
Add a unique constraint on the name field and use the query below:
INSERT INTO EMPLOYEE VALUES
Please refer to "INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE".
You can always check, before inserting into the database, whether the record already exists. In your case you can put the condition on its unique key, which will be different for every employee. You can use a single column as the unique key, or a composite key that uses more than one column to uniquely identify a record.

MySQL update if exists

I have been trying to insert or update records in a MySQL table. I cannot use ON DUPLICATE KEY because the match has nothing to do with the primary key.
Basically i have to update a record in the database
INSERT INTO table (city, state, gender, value) VALUES ("delhi","delhi","M",22)
If a record of that city, state, gender exists, then simply overwrite the value.
Can I achieve this without sending two queries from the programming language?
Actually, you can still use ON DUPLICATE KEY; just add a unique index on the following columns, e.g.
ALTER TABLE tbl_name ADD UNIQUE index_name (city, state, gender);
Your query will now be:
INSERT INTO tbl_name (city, state, gender, value)
VALUES ('delhi','delhi','M', 22)
ON DUPLICATE KEY UPDATE value = 22;
Keep in mind that constructs such as ON DUPLICATE KEY and REPLACE INTO were specifically designed to avoid exactly that: sending two queries. The only other way to avoid two queries from your application layer is to declare a database function that does the same thing.
Therefore, add either a UNIQUE(city, state, gender) key or a primary key that spans the same columns. The difference between the two lies in the value range of each column; primary keys force NOT NULL whereas UNIQUE allows for columns to be NULL.
The difference is subtle but can sometimes lead to unexpected results, because NULL values are never considered equal to each other in a unique index. For example, let's say you have this data in your database:
nr | name
123 | NULL
If you try to insert another (123, NULL) it will not complain when you use UNIQUE(nr,name); this may seem like a bug, but it's not.
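The behaviour is easy to reproduce with a throwaway table; null_demo and uq_nr_name are made-up names for this sketch.

```sql
CREATE TABLE null_demo (
  nr   INT,
  name VARCHAR(50),
  UNIQUE KEY uq_nr_name (nr, name)
);

INSERT INTO null_demo VALUES (123, NULL);
INSERT INTO null_demo VALUES (123, NULL);  -- accepted: NULLs never collide

INSERT INTO null_demo VALUES (123, 'a');
INSERT INTO null_demo VALUES (123, 'a');   -- rejected with ERROR 1062
```

For the city, state, gender case this means ON DUPLICATE KEY UPDATE only fires reliably if all three columns are declared NOT NULL.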