Cannot insert duplicate key: MySQL to SQL server - mysql

Suppose we have the following chunk of data (SQL table):
Col-A Col-B Col-C Col-D
1 1 1 1
1 1 1 2
1 1 1 3
2 2 2 4
2 2 2 5
In MySQL the table is defined as:
CREATE TABLE `my_table` (
`Col-A` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`Col-B` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`Col-C` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`Col-D` INT(10) UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (`Col-A`, `Col-B`, `Col-C`),
KEY `my_index` (`Col-D`) USING BTREE
);
I need to convert a MySQL database to SQL Server. Here is my initial attempt:
CREATE TABLE my_table (
[Col-A] INT NOT NULL DEFAULT(0),
[Col-B] INT NOT NULL DEFAULT(0),
[Col-C] INT NOT NULL DEFAULT(0),
[Col-D] INT NOT NULL DEFAULT(0),
CONSTRAINT my_pk PRIMARY KEY NONCLUSTERED ([Col-A], [Col-B], [Col-C])
)
CREATE INDEX my_idx ON my_table([Col-D])
When I try to import data (I use bcp), the following error occurs:
Cannot insert duplicate key ... The duplicate key is (1, 1, 1)
I suspect that something is wrong with my_pk and my_idx definitions. Any pointers or suggestions?

Do you know the definition of a primary key?
http://en.wikipedia.org/wiki/Primary_key
In the relational model of database design, a unique key or primary key is a set of attributes whose values uniquely define the characteristics of each row.
When the combination of Col-A + Col-B + Col-C is not unique, you violate the primary key constraint, and thus SQL Server won't allow it.
Your options are:
Extend the PK to include Col-D
Drop the PK and use a (clustered or nonclustered) index on Col-A, Col-B and Col-C
Fix the data so it doesn't violate the PK constraint (either drop records or alter/correct incorrect records)
Add a synthetic (or surrogate) key (see mrjoltcola's answer)
Which option to choose is up to you and depends on your requirements. We can't answer that for you based only on the information in your question.
Why MySQL allowed this data to get in there in the first place... *shrugs* MySQL is a "funny" beast. Maybe the PK constraint was added after the data was already in the table, maybe it's a really old version, maybe you're using MyISAM instead of InnoDB. I'm not sure which, but each of these reasons (or a combination of them) is a good guess or, at least, was a decent guess some time / versions ago. Either way, it shouldn't have been possible (even if the PK constraint was added later, MySQL should've refused to add it since the data in the table was conflicting), but MySQL had, and still has, its own weird ways of reasoning about these kinds of things. Strict mode helps if I recall correctly, but I can't remember whether that only works on InnoDB tables or also on MyISAM etc. They made a nice mess of it back in the day. I (or you) shouldn't have to worry about remembering the differences between MyISAM/InnoDB/whatever, which specific version allows what (not) to happen, or whether you need strict mode for basic stuff like PKs to work correctly.*
* Each RDBMS has its quirks; I'm sure there's a good reason for some switches/toggles/settings to tweak some details. I'm just saying PKs should be PKs no matter what.
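Before picking one of the options above, it can help to see exactly which key values repeat. Here is a minimal sketch of that check in Python, using an in-memory SQLite database as a stand-in for a staging copy of the data (the table name staging is made up, and SQLite is only an assumption for illustration; the same GROUP BY ... HAVING query should carry over to MySQL or SQL Server with the appropriate identifier quoting).

```python
import sqlite3

# In-memory SQLite stands in for a staging copy of the data. No key is
# declared, so the rows load even though (Col-A, Col-B, Col-C) repeats.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE staging ("Col-A" INT, "Col-B" INT, "Col-C" INT, "Col-D" INT)'
)
rows = [(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3), (2, 2, 2, 4), (2, 2, 2, 5)]
conn.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)", rows)

# Every (Col-A, Col-B, Col-C) combination that occurs more than once
# would violate a primary key declared on those three columns.
duplicate_groups = conn.execute(
    '''SELECT "Col-A", "Col-B", "Col-C", COUNT(*) AS n
       FROM staging
       GROUP BY "Col-A", "Col-B", "Col-C"
       HAVING COUNT(*) > 1
       ORDER BY "Col-A", "Col-B", "Col-C"'''
).fetchall()

print(duplicate_groups)  # [(1, 1, 1, 3), (2, 2, 2, 2)]
```

With the sample data from the question, both key combinations show up as duplicated, which matches the error reported by bcp.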

For your data requirement, you can't use columns (A,B,C) as the primary key. You need to either add (D) to the key, or add a surrogate key. See RobIII's answer https://stackoverflow.com/a/24703970/257090 for why.
I recommend you go with the latter, add an ID primary key so you have a single field key:
CREATE TABLE my_table (
ID INT IDENTITY PRIMARY KEY,
ColA INT NOT NULL DEFAULT(0),
ColB INT NOT NULL DEFAULT(0),
ColC INT NOT NULL DEFAULT(0),
ColD INT NOT NULL DEFAULT(0),
UNIQUE(ColA,ColB,ColC,ColD)
)
INSERT INTO my_table(cola, colb, colc, cold) VALUES(1,1,1,1)
INSERT INTO my_table(cola, colb, colc, cold) VALUES(1,1,1,2)
INSERT INTO my_table(cola, colb, colc, cold) VALUES(1,1,1,3)
INSERT INTO my_table(cola, colb, colc, cold) VALUES(2,2,2,4)
INSERT INTO my_table(cola, colb, colc, cold) VALUES(2,2,2,5)
SELECT * FROM my_table
ID ColA ColB ColC ColD
----------- ----------- ----------- ----------- -----------
1 1 1 1 1
2 1 1 1 2
3 1 1 1 3
4 2 2 2 4
5 2 2 2 5
(5 row(s) affected)
Now I can identify each row by a single key value.
delete from my_table where ID = 5
This is much more practical for any code you write against the database or ORMs you use.
NOTE: with surrogate (or synthetic) keys it is still important that you add any additional constraints needed to enforce the integrity of the actual data. A surrogate key doesn't keep you from inserting 1,1,1,1 multiple times, so add a unique constraint/index on those fields in addition to the primary key ID.
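To illustrate the NOTE, here is a small Python sketch that runs against in-memory SQLite instead of SQL Server (an assumption made only so the example is self-contained; INTEGER PRIMARY KEY AUTOINCREMENT stands in for INT IDENTITY PRIMARY KEY). It shows the unique constraint rejecting a repeated 1,1,1,1 even though the surrogate ID alone would have accepted it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite stand-in for the SQL Server table above: a surrogate ID plus a
# UNIQUE constraint over the real data columns.
conn.execute(
    """
    CREATE TABLE my_table (
        ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        ColA INT NOT NULL DEFAULT 0,
        ColB INT NOT NULL DEFAULT 0,
        ColC INT NOT NULL DEFAULT 0,
        ColD INT NOT NULL DEFAULT 0,
        UNIQUE (ColA, ColB, ColC, ColD)
    )
    """
)

insert = "INSERT INTO my_table (ColA, ColB, ColC, ColD) VALUES (?, ?, ?, ?)"
conn.execute(insert, (1, 1, 1, 1))
conn.execute(insert, (1, 1, 1, 2))

try:
    # Same data as the first row: the new ID would be unique, the data is not.
    conn.execute(insert, (1, 1, 1, 1))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(rejected, row_count)
```

Without the UNIQUE line, the third insert would succeed and the table would hold two rows that differ only in ID.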

Related

MySQL autoincrement value range

The scenario is that I have 2 tables of the same structure; however, I only want to allow PHP permissions to update table B, while table A can only be updated via the DBMS.
These 2 tables are merged into a single PHP array, so I would like to set primary key ranges to separate them at this point and avoid primary key conflicts (a simple autoincrement integer for best indexing).
As far as I know, the simplest would be to constrain table A to have primary key auto increment values from 1000000 to 1999999 and table B from 2000000 upwards.
Is it possible to constrain min-max autoincrement values? (I know I can start them at a given integer, so I am asking if there is a simple 'max' to put on table A.)
This simple configuration would ensure integrity.
Would an 'after_insert' type trigger work to remove the new row and throw an SQL error?
You could create one table with id as mediumint (max 8,388,607 signed, or 16,777,215 unsigned):
create table tableA( id mediumint(5) not null auto_increment, `test` varchar(5), primary key (id)) ;
and a second one with int and the auto_increment value set above the signed mediumint max:
create table tableB( id int(5) not null auto_increment, `test` varchar(5), primary key (id)) auto_increment=8388608 ;
https://dev.mysql.com/doc/refman/8.0/en/integer-types.html
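But I think it would be much more elegant to use the auto_increment_increment mechanism.
auto_increment_increment = 2    # global, for all tables, in the MySQL option file (my.ini / my.cnf)
SET @@auto_increment_increment=2;    -- at run time, just for one session
Start tableA at auto_increment=1 and tableB at auto_increment=2 and you will never collide: one table will have odd ids and the other will have even ids, so you do not have to carve out fixed id ranges. As far as I know, the values MySQL generates follow auto_increment_offset + N × auto_increment_increment, so the session that inserts into tableB also needs SET @@auto_increment_offset=2; to land on the even series.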
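The odd/even idea can be sketched in a few lines of Python. This is only a model of the arithmetic, assuming generated values follow offset + N × increment; the function name auto_increment_series is made up and nothing here talks to MySQL.

```python
from itertools import count, islice


def auto_increment_series(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return (offset + n * increment for n in count())


# tableA: offset 1, tableB: offset 2, both stepping by 2.
table_a_ids = list(islice(auto_increment_series(1, 2), 5))
table_b_ids = list(islice(auto_increment_series(2, 2), 5))

print(table_a_ids)  # [1, 3, 5, 7, 9]
print(table_b_ids)  # [2, 4, 6, 8, 10]

# The two series never share a value, so merged arrays keep unique ids.
assert set(table_a_ids).isdisjoint(table_b_ids)
```

The same pattern extends to more than two tables by raising the increment and giving each table its own offset.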

Update of primary key would cause duplicate entries in foreign table

I have two tables described by the following SQL Fiddle. My application needs to insert new records in tblA in between two already existing records. For example, if tblA has 6 records with AID ranging from 0 to 5 and I want to insert a new record with AID being 4, I increment the AID of tuple 4 and tuple 5 by one and then insert the new record. Thus, I use the following prepared statement to increment the value of the column AID of the tuples of both tblA and tblB (via cascading) by one:
update tblA set AID = (AID + 1) where AID >= ? order by AID desc;
On my test installation the above statement works great. However, on our production system we get the following error message in some, but not all, cases:
Foreign key constraint for table 'tblA', record '4' would lead to a duplicate entry in table 'tblB'
Now, it is unclear to me what exactly causes the problem and how to solve the issue.
I appreciate any tips. Thanks in advance!
About tblB
This
create table if not exists tblB(
BID integer not null,
AID integer not null,
constraint fkB_A foreign key(AID) references tblA(AID),
primary key(AID, BID)
);
should probably be
create table if not exists tblB(
BID integer not null,
AID integer not null,
constraint fkB_A foreign key(AID) references tblA(AID)
on update cascade,
-- ^^^^^^^^^^^^^^^^
primary key(AID, BID)
);
Surrogate ID numbers in the relational model of data and in SQL databases are meaningless. Unless you know more than you've included in your question, AID and BID are meaningless. In a properly designed database, there's never a need to insert a row between two other rows based solely on their surrogate ID numbers.
If your real-world requirement is simply to insert a timestamp between "2015-12-01 23:07:00" and "2015-12-04 14:58:00", you don't need the ID number 4 to do that.
-- Use single quotes around timestamps.
insert into tblA values (-42, '2015-12-03 00:00:00');
select * from tblA order by RecordDate;
AID RecordDate
--
0 2015-11-07 16:55:00
1 2015-11-08 22:16:00
2 2015-11-10 14:26:00
3 2015-12-01 23:07:00
-42 2015-12-03 00:00:00
5 2015-12-04 14:58:00
6 2015-12-13 10:07:00
About tblA
This
create table if not exists tblA(
AID integer not null,
RecordDate varchar(25),
constraint pkA primary key(AID)
);
should probably be
create table if not exists tblA(
AID integer not null,
RecordDate varchar(25) not null,
-- ^^^^^^^^
constraint pkA primary key(AID)
);
Without that not null, you can insert data like this.
AID RecordDate
--
17 Null
18 Null
19 Null
Since surrogate ID numbers are meaningless, these rows are all essentially both identical and identically useless.
About the update statement
update tblA
set AID = (AID + 1)
where AID >= 4
order by AID desc;
Standard SQL doesn't permit order by in this position in an update statement. MySQL documents this as
If the ORDER BY clause is specified, the rows are updated in the order
that is specified.
The relational model and SQL are set-oriented. Updates are supposed to happen "all at once". IMHO, you'd be better off learning standard SQL and using a dbms that better supports standard SQL. (PostgreSQL springs to mind.) But adding on update cascade to tblB (above) will let your update statement succeed in MySQL.
update tblA
set AID = (AID + 1)
where AID >= 4 order by AID desc;
Adding on update cascade might solve your problem:
create table if not exists tblB(
BID integer not null,
AID integer not null,
constraint fkB_A foreign key(AID)
references tblA(AID)
on update cascade,
primary key(AID, BID));
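For anyone wondering why the order by AID desc is in the update at all, here is a rough Python model of shifting keys one row at a time. It assumes, purely for illustration, that uniqueness is checked after every single row rather than once at the end of the statement; shift_keys is a made-up helper, not anything MySQL provides.

```python
def shift_keys(keys, start, descending):
    """Add 1 to every key >= start, one row at a time.

    Uniqueness is checked after each row, so a transient collision
    with a not-yet-updated row raises ValueError.
    """
    live = set(keys)
    to_update = sorted((k for k in live if k >= start), reverse=descending)
    for key in to_update:
        live.remove(key)
        if key + 1 in live:
            raise ValueError(f"duplicate key {key + 1}")
        live.add(key + 1)
    return sorted(live)


# Highest key first: 5 -> 6, then 4 -> 5, no collision, and 4 is now free.
print(shift_keys(range(6), 4, descending=True))  # [0, 1, 2, 3, 5, 6]

# Lowest key first: 4 -> 5 collides with the row that still holds 5.
try:
    shift_keys(range(6), 4, descending=False)
except ValueError as err:
    print(err)  # duplicate key 5
```

This only models the primary key on tblA. The rows in tblB still have to follow each change, which is what on update cascade takes care of.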

MySQL - clustered index on the "many" side of a "one to many" relationship

I'm sure this is simple stuff to many of you, so I hope you can help easily.
If I have a MySQL table on the "many" side of a "one to many" relationship - like this:
Create Table MyTable(
ThisTableId int auto_increment not null,
ForeignKey int not null,
Information text
)
Since this table would always be used via a join using ForeignKey, it would seem useful to make ForeignKey a clustered index so that foreign keys would always be sorted adjacently for the same source record. However, ForeignKey is not unique, so I gather that it is either not possible or bad practice to make this a clustered index? If I try and make a composite primary key using (ForeignKey, ThisTableId) to achieve both the useful clustering and uniqueness, then there is an error "There can only be one auto column and it must be defined as a key".
I think perhaps I am approaching this incorrectly, in which case, what would be the best way to index the above table for maximum speed?
InnoDB requires that if you have an auto-increment column, it must be the first column in a key.
So you can't define the primary key as (ForeignKey, ThisTableId) -- if ThisTableId is auto-increment.
You could do it if ThisTableId were just a regular column (not auto-increment), but then you would be responsible for assigning a value that is at least unique among other rows with the same value in ForeignKey.
One method I have seen used is to make the column BIGINT UNSIGNED, and use a BEFORE INSERT trigger to assign the column a value from the function UUID_SHORT().
@ypercube correctly points out another solution: The InnoDB rule is that the auto-increment column should be the first column of some key, and if you create a normal secondary key, that's sufficient. This allows you to create a table like the following:
CREATE TABLE `MyTable` (
`ForeignKey` int(11) NOT NULL,
`ThisTableId` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`ForeignKey`,`ThisTableId`),
KEY (`ThisTableId`)
) ENGINE=InnoDB;
And the auto-increment works as expected:
mysql> INSERT INTO MyTable (ForeignKey) VALUES (123), (234), (345), (456);
mysql> select * from MyTable;
+------------+-------------+
| ForeignKey | ThisTableId |
+------------+-------------+
| 123 | 1 |
| 234 | 2 |
| 345 | 3 |
| 456 | 4 |
+------------+-------------+

auto_increment usage in composite key

I have a table with a composite key:
emp_tbl(
companyId int not null,
empId int not null auto_increment,
name varchar2,
....
...
primary key(companyId,empId)
);
In MySQL, this is what happens when I start inserting data:
Emp_tbl
companyId empId
1 1
1 2
1 3
2 1
2 2
Note that when the companyId changes, the auto_increment value is reset to 1 again. I want to disable that; I don't want the auto_increment to reset. I am expecting a result like this:
companyId empId
1 1
1 2
1 3
2 4
2 5
Is it possible to do it?
Thanks
This is what happens with a composite primary key that incorporates an auto_increment. Recreate the primary key so that it's purely your auto_increment field (empId), then create a unique index on companyId and empId.
EDIT
Note that this only applies to MyISAM and BDB tables. If you used InnoDB for your tables, then it would also work as you wanted
If you do not want empId to reset, just reverse the order of the columns in the primary key definition:
primary key(empId, companyId)
Note that the column order in a composite key matters.
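The difference between the two results in the question can be modelled in Python. This is a sketch of the numbering rules only, assuming the grouped behaviour is "highest empId seen so far for this companyId, plus one" and the other is a single table-wide counter; both function names are made up.

```python
def grouped_ids(company_ids):
    """Counter restarts per companyId (auto_increment as second key column)."""
    last_per_company = {}
    result = []
    for company_id in company_ids:
        last_per_company[company_id] = last_per_company.get(company_id, 0) + 1
        result.append((company_id, last_per_company[company_id]))
    return result


def global_ids(company_ids):
    """One counter for the whole table (auto_increment as first key column)."""
    return [(company_id, emp_id)
            for emp_id, company_id in enumerate(company_ids, start=1)]


inserts = [1, 1, 1, 2, 2]  # companyId of each inserted row, in order

print(grouped_ids(inserts))  # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
print(global_ids(inserts))   # [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5)]
```

The first line reproduces the output the asker is seeing, the second the output they want.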

mySQL KEY Partitioning using three table fields (columns)

I am writing a data warehouse, using MySQL as the back-end. I need to partition a table based on two integer IDs and a name string. I have read (parts of) the MySQL documentation regarding partitioning, and it seems the most appropriate partitioning scheme in this scenario would be either HASH or KEY partitioning.
I have opted for KEY partitioning because I (chickened out and) don't want to be responsible for providing a 'collision free' hashing algorithm for my fields. Instead, I am relying on MySQL's hashing to generate the hash values required for partitioning.
I have included below a snippet of the schema of the table that I would like to partition based on the composite of the following fields:
school_id, course_id, ssname (student surname).
BTW, before anyone points out that this is not the best way to store school related information, I'll have to point out that I am only using the case below as an analogy to what I am trying to model.
My current CREATE TABLE statement looks like this:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
FOREIGN KEY (school_id) REFERENCES school(id) ON DELETE RESTRICT ON UPDATE CASCADE,
FOREIGN KEY (course_id) REFERENCES course(id) ON DELETE RESTRICT ON UPDATE CASCADE,
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname(16))
) ENGINE=innodb;
I would like to know how to modify the statement above so that the table is partitioned using the three fields I mentioned at the beginning of this question (namely school_id, course_id and the starting letter of the student's surname).
Another question I would like to ask is this:
What happens in 'edge' situations, for example if I attempt to insert a record that contains a valid* school_id, course_id or surname for which no underlying partitioned table file exists? Will MySQL automatically create the underlying file?
Case in point: I have the following schools: New York Kindergarten, Belfast Elementary; and the following courses: Lie Algebra in Infinitesimal Dimensions, Entangled Entities.
Also assume I have the following students (surnames): Bush, Blair, Hussein.
When I add a new school (or course, or student), can I insert them into the foobar table? (Actually, I can't think why not.) The reason I ask is that I foresee adding more schools, courses, etc., which means that MySQL will have to create additional tables behind the scenes (as the hash will generate new keys).
I will be grateful if someone with experience in this area can confirm (preferably with links backing their assertion) that my understanding (i.e. no manual administration is required if I add new schools, courses or students to the database) is correct.
I don't know if my second question was well formed (clear) or not. If not, I will be glad to clarify further.
*VALID - by valid, I mean that it is valid in terms of not breaking referential integrity.
I doubt partitioning is as useful as you think. That said, there are a couple of other problems with what you're asking for (note: the entirety of this answer applies to MySQL 5; version 6 might be different):
columns used in KEY partitioning must be a part of the primary key. school_id, course_id and ssname are not part of the primary key.
more generally, every UNIQUE key (including the primary key) must include all columns in the partitioning expression. This means you can only partition on the intersection of the columns in the UNIQUE keys. In your example, the intersection is empty.
most partitioning schemes (other than KEY) require integer or null values. If not NULL, ssname will not be an integer value.
foreign keys and partitioning aren't supported simultaneously. This is a strong argument not to use partitioning.
Fortunately, collision free hashing is one thing you don't need to worry about, because partitioning is going to result in collisions (otherwise, you'd only have a single row in each partition). If you could ignore the above problems as well as the limitations on functions used in partitioning expressions, you could create a HASH partition with:
CREATE TABLE foobar (
...
) ENGINE=innodb
PARTITION BY HASH (school_id + course_id + ORD(ssname))
PARTITIONS 2
;
What should work is:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
PRIMARY KEY (id, school_id, course_id),
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
PARTITION BY HASH (school_id + course_id)
PARTITIONS 2
;
or:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
PRIMARY KEY (id, school_id, course_id, ssname),
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
PARTITION BY KEY (school_id, course_id, ssname)
PARTITIONS 2
;
As for the files that store tables, MySQL will create them, though it may do it when you define the table rather than when rows are inserted into it. You don't need to worry about how MySQL manages files. Remember, there are a limited number of partitions, defined when you create the table by the PARTITIONS *n* clause.
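To make the "limited number of partitions" point concrete, here is a small Python sketch of how rows would spread over the two partitions in the PARTITION BY HASH (school_id + course_id) example. It assumes HASH partitioning places a row in partition MOD(expr, number of partitions); the sample ids are made up, and KEY partitioning uses MySQL's own internal hash, which is not modelled here.

```python
NUM_PARTITIONS = 2  # matches PARTITIONS 2 in the examples above


def hash_partition(expr_value, num_partitions):
    """Partition number for a HASH-partitioned row: expr mod partition count."""
    return expr_value % num_partitions


# Hypothetical (school_id, course_id, ssname) rows.
rows = [
    (1, 10, "Bush"),
    (1, 11, "Blair"),
    (2, 10, "Hussein"),
    (3, 12, "Bush"),  # school 3 added later; no new partition is needed
]

placement = {
    row: hash_partition(row[0] + row[1], NUM_PARTITIONS) for row in rows
}

for row, partition in placement.items():
    print(row, "-> partition", partition)
```

Every row, including one for a school that did not exist when the table was created, lands in one of the existing partitions. Several rows sharing a partition is the expected outcome, not a collision to be avoided.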