I have one employee table:
create table employee
(number integer primary key,
name varchar(20),
salary integer,
manager integer,
birthyear integer,
startyear integer);
The manager column holds the employee number of the employee's manager. A couple of rows would look something like this:
number | name | salary | manager | birthyear | startyear |
32 | Smythe, Carol | 9050 | 199 | 1929 | 1967 |
33 | Hayes, Evelyn | 10100 | 199 | 1931 | 1963 |
35 | Evans, Michael | 5000 | 32 | 1952 | 1974 |
So to clarify, Michael Evans's manager is Carol Smythe. Two more things: there are no foreign key constraints on this table, and there are a couple of NULL values in the manager column.
Now, I would like to create a Managers table which contains all managers. I would do something like this:
create table Mgr(
Mgr_id INTEGER PRIMARY KEY,
bonus INTEGER,
FOREIGN KEY (Mgr_id) REFERENCES employee(manager));
But this doesn't work and I get an error. Can someone please explain why? I have searched for an answer but can't find any good explanation. Thanks in advance.
ERROR:
ERROR 1005 (HY000): Can't create table johnson.mgr (errno: 150)
You are trying to create Primary key and Foreign key on same field and table.
You need to change definition of employee table like below,
create table employee
(number integer primary key,
name varchar(20),
salary integer,
manager integer,
birthyear integer,
startyear integer,
FOREIGN KEY (manager) REFERENCES Mgr(Mgr_id)
);
And remove the FOREIGN KEY clause from the Mgr table definition.
The Corresponding columns in the foreign key and the referenced key must have similar internal data types so that they can be compared without a type conversion. The size and sign of integer types must be the same. The length of string types need not be the same. For nonbinary (character) string columns, the character set and collation must be the same.
So check that the referencing column and the referenced column are declared with matching data types.
These conditions must be satisfied to not get error 150:
The two tables must be ENGINE=InnoDB.
The two tables must have the same charset.
The PK column(s) in the parent table and the FK column(s) must be the same data type.
The PK column(s) in the parent table and the FK column(s), if they have a defined collation, must have the same collation.
If there is data already in the foreign key table, the FK column value(s) must match values in the parent table PK columns.
One more problem is that you are creating a foreign key and a primary key on the same column.
Hope this helps.
See http://dev.mysql.com/doc/refman/5.6/en/innodb-foreign-key-constraints.html
What you want to do makes sense. How you're doing it doesn't. You need to change the way your database identifies managers. Right now, it identifies managers by the values in employee.manager. When you're finished, it will identify managers by the rows in Mgr.
Make sure you have usable data first. Look for id numbers that are no longer in the employee table. Try this.
select manager
from employee
where manager not in (select number
from employee);
Every row returned represents an error--a manager's id number for which there is no corresponding row in employee. You have to fix those before you can make much progress.
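To see what that check reports, here is a minimal sketch that runs the same query from Python. It uses the built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only so the example is self-contained. The first three rows come from the question; the rows numbered 199 and 40 are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table employee (
        number    integer primary key,
        name      varchar(20),
        salary    integer,
        manager   integer,
        birthyear integer,
        startyear integer
    )
""")

# The first three rows come from the question; rows 199 and 40 are made up.
rows = [
    (32, "Smythe, Carol", 9050, 199, 1929, 1967),
    (33, "Hayes, Evelyn", 10100, 199, 1931, 1963),
    (35, "Evans, Michael", 5000, 32, 1952, 1974),
    (199, "Hypothetical, Boss", 12000, None, 1925, 1960),
    (40, "Hypothetical, Orphan", 4000, 777, 1950, 1975),
]
conn.executemany("insert into employee values (?, ?, ?, ?, ?, ?)", rows)

# Same check as above: manager ids with no matching employee row.
orphans = [r[0] for r in conn.execute("""
    select distinct manager
    from employee
    where manager not in (select number from employee)
""")]
print(orphans)
```

Only the made-up manager id 777 is reported. Rows whose manager is NULL are not returned, because a comparison with NULL never evaluates to true.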
After you fix those errors, create the table Mgr.
create table Mgr (
Mgr_id INTEGER PRIMARY KEY,
bonus INTEGER NOT NULL DEFAULT 0,
FOREIGN KEY (Mgr_id) REFERENCES employee(number)
);
Populate Mgr with a query. Something along these lines should work.
insert into Mgr (Mgr_id)
select distinct manager
from employee
where manager is not null;
Update that table with the correct value for bonus. You might have to do that manually if you don't have that data stored somewhere handy.
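As a rough end-to-end sketch of creating and populating Mgr, the following uses Python's built-in sqlite3 module in place of MySQL. Two things here are assumptions of the sketch rather than anything stated above: Mgr_id is made to reference employee.number (the employee's own id), and rows with a NULL manager are skipped. The employee table is trimmed to three columns and the row for 199 is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when asked
conn.execute("""
    create table employee (
        number  integer primary key,
        name    varchar(20),
        manager integer
    )
""")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [
        (32, "Smythe, Carol", 199),
        (33, "Hayes, Evelyn", 199),
        (35, "Evans, Michael", 32),
        (199, "Hypothetical, Boss", None),  # made-up row so that 199 exists
    ],
)

# Assumption: Mgr_id points at employee.number, the employee's own id.
conn.execute("""
    create table Mgr (
        Mgr_id integer primary key,
        bonus  integer not null default 0,
        foreign key (Mgr_id) references employee(number)
    )
""")

# Skip NULLs: employees without a manager must not produce a Mgr row.
conn.execute("""
    insert into Mgr (Mgr_id)
    select distinct manager
    from employee
    where manager is not null
""")
conn.commit()

managers = conn.execute("select Mgr_id, bonus from Mgr order by Mgr_id").fetchall()
print(managers)
```

Each distinct manager id ends up as one row in Mgr with the default bonus of 0, ready to be updated with real values.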
As far as the database is concerned now, you could drop the column employee.manager. And you should drop that column, but not now. You have to consider what will happen to application code that thinks you can identify managers by looking at the employee table.
You can just drop the column, and let application programs fail until they're fixed.
You can warn all the application developers that these changes will be made on, say, March 1, and they'd better get their code ready.
You can make the changes, leave the column in place, warn the application developers, and take steps to prevent changes to employee.manager.
You can make these changes, rename the employee table, and create an updatable view having the old structure (you'll need to join employee and Mgr) and the name "employee". This option is close to ideal--it requires no changes to application code--but I'm not sure to what degree MySQL supports updatable views. It might not be possible.
Related
Here is a gross oversimplification of an intense setup I am working with. table_1 and table_2 both have auto-increment surrogate primary keys as the ID. info is a table that contains information about both table_1 and table_2.
table_1 (id, field)
table_2 (id, field, field)
info ( ???, field)
I am trying to decide if I should make the primary key of info a composite of the IDs from table_1 and table_2. If I were to do this, which of these makes most sense?
( in this example I am combining ID 11209 with ID 437 )
INT(9) 11209437 (I can imagine why this is bad)
VARCHAR (10) 11209-437
DECIMAL (10,4) 11209.437
Or something else?
Would it be fine to use this as the primary key on a MySQL MyISAM DB?
I would use a composite (multi-column) key.
CREATE TABLE INFO (
t1ID INT,
t2ID INT,
PRIMARY KEY (t1ID, t2ID)
)
This way you can have t1ID and t2ID as foreign keys pointing to their respective tables as well.
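Here is a small sketch of how such a key behaves, run through Python's built-in sqlite3 module rather than MySQL (an assumption made so the example is self-contained). The field column, the extra row 438, and the helper rejected are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    create table table_1 (id integer primary key, field text);
    create table table_2 (id integer primary key, field text);
    create table info (
        t1ID  integer not null,
        t2ID  integer not null,
        field text,
        primary key (t1ID, t2ID),
        foreign key (t1ID) references table_1(id),
        foreign key (t2ID) references table_2(id)
    );
    insert into table_1 values (11209, 'A');
    insert into table_2 values (437, 'X'), (438, 'Y');
""")

conn.execute("insert into info values (11209, 437, 'some')")
conn.execute("insert into info values (11209, 438, 'data')")  # same t1ID, new pair: fine

def rejected(sql):
    """Return True if the statement violates a constraint (hypothetical helper)."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_pair = rejected("insert into info values (11209, 437, 'again')")
missing_parent = rejected("insert into info values (11209, 999, 'orphan')")
row_count = conn.execute("select count(*) from info").fetchone()[0]
print(duplicate_pair, missing_parent, row_count)
```

The pair (11209, 437) can be stored only once, while either id on its own may repeat. The two ids stay in separate columns, so they can still be sorted, counted, and joined individually.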
I would not make the primary key of the "info" table a composite of the two values from other tables.
Others can articulate the reasons better, but it feels wrong to have a column that is really made up of two pieces of information. What if you want to sort on the ID from the second table for some reason? What if you want to count the number of times a value from either table is present?
I would always keep these as two distinct columns. You could use a two-column primary key in mysql ...PRIMARY KEY(id_a, id_b)... but I prefer using a two-column unique index, and having an auto-increment primary key field.
The syntax is CONSTRAINT constraint_name PRIMARY KEY(col1,col2,col3), for example:
CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
The above example will work if you are writing it while you are creating the table, for example:
CREATE TABLE person (
P_Id int ,
............,
............,
CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
);
To add this constraint to an existing table, use the following syntax:
ALTER TABLE table_name ADD CONSTRAINT constraint_name PRIMARY KEY (P_Id,LastName)
Suppose you have already created a table; you can then use this query to make a composite primary key:
alter table employee add primary key(emp_id,emp_name);
Aside from personal design preferences, there are cases where one wants to make use of composite primary keys. Tables may have two or more fields that provide a unique combination, and not necessarily by way of foreign keys.
As an example, each US state has a set of unique Congressional districts. While many states may individually have a CD-5, there will never be more than one CD-5 in any of the 50 states, and vice versa. Therefore, creating an autonumber field for Massachusetts CD-5 would be redundant.
If the database drives a dynamic web page, writing code to query on a two-field combination could be much simpler than extracting/resubmitting an autonumbered key.
So while I'm not answering the original question, I certainly appreciate Adam's direct answer.
Composite primary keys are what you want when you create a many-to-many relationship with a fact table. For example, you might have a holiday rental package that includes a number of properties in it. On the other hand, the property could also be available as a part of a number of rental packages, either on its own or with other properties. In this scenario, you establish the relationship between the property and the rental package with a property/package fact table. The association between a property and a package will be unique, and you will only ever join using property_id with the property table and/or package_id with the package table. Each relationship is unique, and an auto_increment key is redundant as it won't feature in any other table. Hence defining the composite key is the answer.
CREATE TABLE `mom`.`sec_subsection` (
`idsec_sub` INT(11) NOT NULL ,
`idSubSections` INT(11) NOT NULL ,
PRIMARY KEY (`idsec_sub`, `idSubSections`)
);
@AlexCuse I wanted to add this as a comment to your answer but gave up after making multiple failed attempts to add newlines in comments.
That said, t1ID is unique in table_1, but that doesn't make it unique in the INFO table as well.
For example:
Table_1 has:
Id Field
1 A
2 B
Table_2 has:
Id Field
1 X
2 Y
INFO then can have:
t1ID t2ID field
1 1 some
1 2 data
2 1 in-each
2 2 row
So in INFO table to uniquely identify a row you need both t1ID and t2ID
I have set up a database on MySQL with two tables that are related to each other as follows:
create table employee (
employeeID int not null auto_increment primary key,
emFirstName varchar(50) not null,
emLastName varchar(50) not null)
engine=innodb;
create table address (
employeeID int not null primary key,
emAddress varchar(50) not null,
foreign key `emid` (employeeID) references employee(employeeID)
on delete cascade on update cascade)
engine=innodb;
Notice that on the address table I set employeeID, which references employeeID on the employee table, to be the primary key, putting the two tables in a one-to-one relationship. I added a record to each as follows:
MariaDB [dummy]> select * from employee;
+------------+-------------+--------------+
| employeeID | emFirstName | emLastName |
+------------+-------------+--------------+
| 1 | January | Ananda Putra |
+------------+-------------+--------------+
MariaDB [sampoerna]> select * from address;
+------------+-----------+
| employeeID | emAddress |
+------------+-----------+
| 1 | Sidomulyo |
+------------+-----------+
I tried to add another record to the address table with the employeeID of 1 and it failed, indicating that the one-to-one relationship worked. But why did I get the opposite result when I tried to reverse engineer these tables in MySQL Workbench? It gave me this:
This schema defines a one-to-many relationship, doesn't it?
Have I set up my tables incorrectly, or what? Thanks for the answer.
Your test tests the PK & FK, not the 1:1. In this code there can be an employee id that is not in the address table. For 1:1 you need another FK the other way plus another CK (candidate key).
You are confusing "relationship" meaning FK with "relationship" meaning relation/association. In the relational model tables represent relation(ship)s/associations on/among values. Similarly in the original ERM (entity-relationship model), lines from relation(ship)s/association diamonds to entity boxes represent FKs between their tables. Those are the "relationships" that have a cardinality.
A FK constraint says that values for a list of columns must be values for another list of columns.
In pseudo-ER methods & some bad presentations of true ER all relation(ship)s/associations are reified to associative entities and FKs get called relationships. But every FK has an associated binary relation(ship)/association on its referencing entity & its referenced entity. It is represented by a table that is just the two corresponding columns from the referenced table. That's what the cardinality refers to when we attribute one to a FK.
You are saying that the binary relationship "employee eid has address id", which here is the (eid,id) projection of the address table, is 1:1. That would be enforced by eid & id each being a CK (PK or UNIQUE NOT NULL). Then to enforce those two particular tables being consistent you need FKs both ways. If you intend there to be employees without addresses, that's 1:0-or-1, with a FK one way, the current code. Nullable FKs introduce more 0-ors.
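To make the 1:0-or-1 point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MariaDB (an assumption for illustration; the column types are adapted slightly to SQLite). The second employee and the 'Elsewhere' address are made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    create table employee (
        employeeID  integer primary key autoincrement,
        emFirstName varchar(50) not null,
        emLastName  varchar(50) not null
    );
    create table address (
        employeeID integer not null primary key,
        emAddress  varchar(50) not null,
        foreign key (employeeID) references employee(employeeID)
            on delete cascade on update cascade
    );
    insert into employee (emFirstName, emLastName) values ('January', 'Ananda Putra');
    insert into address values (1, 'Sidomulyo');
""")

# A second address for employee 1 is rejected: at most one address per employee.
try:
    conn.execute("insert into address values (1, 'Elsewhere')")
    second_address_allowed = True
except sqlite3.IntegrityError:
    second_address_allowed = False

# An employee with no address row is accepted: nothing forces "exactly one".
conn.execute("insert into employee (emFirstName, emLastName) values ('Made-up', 'Person')")
without_address = conn.execute("""
    select e.employeeID
    from employee e
    left join address a on a.employeeID = e.employeeID
    where a.employeeID is null
""").fetchall()
print(second_address_allowed, without_address)
```

The duplicate address is refused, but the second employee exists happily with no address at all, which is why the schema as written is 1:0-or-1 rather than strictly 1:1.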
I'm sure this is simple stuff to many of you, so I hope you can help easily.
If I have a MySQL table on the "many" side of a "one to many" relationship - like this:
Create Table MyTable(
ThisTableId int auto_increment not null,
ForeignKey int not null,
Information text
)
Since this table would always be used via a join using ForeignKey, it would seem useful to make ForeignKey a clustered index so that foreign keys would always be sorted adjacently for the same source record. However, ForeignKey is not unique, so I gather that it is either not possible or bad practice to make this a clustered index? If I try and make a composite primary key using (ForeignKey, ThisTableId) to achieve both the useful clustering and uniqueness, then there is an error "There can only be one auto column and it must be defined as a key".
I think perhaps I am approaching this incorrectly, in which case, what would be the best way to index the above table for maximum speed?
InnoDB requires that if you have an auto-increment column, it must be the first column in a key.
So you can't define the primary key as (ForeignKey, ThisTableId) -- if ThisTableId is auto-increment.
You could do it if ThisTableId were just a regular column (not auto-increment), but then you would be responsible for assigning a value that is at least unique among other rows with the same value in ForeignKey.
One method I have seen used is to make the column BIGINT UNSIGNED, and use a BEFORE INSERT trigger to assign the column a value from the function UUID_SHORT().
@ypercube correctly points out another solution: The InnoDB rule is that the auto-increment column should be the first column of some key, and if you create a normal secondary key, that's sufficient. This allows you to create a table like the following:
CREATE TABLE `MyTable` (
`ForeignKey` int(11) NOT NULL,
`ThisTableId` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`ForeignKey`,`ThisTableId`),
KEY (`ThisTableId`)
) ENGINE=InnoDB;
And the auto-increment works as expected:
mysql> INSERT INTO MyTable (ForeignKey) VALUES (123), (234), (345), (456);
mysql> select * from MyTable;
+------------+-------------+
| ForeignKey | ThisTableId |
+------------+-------------+
| 123 | 1 |
| 234 | 2 |
| 345 | 3 |
| 456 | 4 |
+------------+-------------+
I am a noob in MySQL. I want to create the following self-referencing table:
EMPLOYEE
+-----+------+------+
|Name |E-ID |M-ID |
+-----+------+------+
|ABC |12345 |67890 |
|DEF |67890 |12345 |
+-----+------+------+
I use the following commands:
CREATE TABLE EMPLOYEE (
NAME VARCHAR(20) ,
`E-ID` CHAR(6) NOT NULL ,
`M-ID` CHAR(6) NULL ,
PRIMARY KEY (`E-ID`) ,
FOREIGN KEY (`M-ID`) REFERENCES EMPLOYEE(`E-ID`)
);
Now my problem is, how do I enter the two records? Each time, the foreign key constraint will fail. I tried entering:
INSERT INTO EMPLOYEE VALUES('12345','67890');
I also tried:
INSERT INTO EMPLOYEE VALUES('12345','67890'),('67890','12345');
Both of the above commands fail, giving the error:
ERROR 1452 (23000): Cannot add or update a child row: a foreign key
constraint fails BLAH BLAH
Guys, actually I was trying to implement the tables given in slide number 25 of the following ppt: The Relational Data Model and Relational Database Constraints
The constraints are:
SUPERSSN Of EMPLOYEE references SSN of EMPLOYEE.
MGRSSN of DEPARTMENT references SSN of EMPLOYEE.
DNO of EMPLOYEE references DNumber of DEPARTMENT.
After I have created the tables, how do I add records? It will always fail the foreign key constraints.
As MySQL does not support deferrable constraints (which are the "natural solution" to such a problem), you will need to do this in two steps:
INSERT INTO employee (name, `E-ID`) values ('Arthur', '123456');
INSERT INTO employee (name, `E-ID`) values ('Ford', '67890');
UPDATE employee
SET `M-ID` = '67890'
WHERE `E-ID` = '123456';
UPDATE employee
SET `M-ID` = '123456'
WHERE `E-ID` = '67890';
Your circular reference does sound strange to me though. An employee being the manager of an employee who is in turn his manager?
Allow me two comments on your table definition:
Avoid column (or table) names with special characters that need quoted identifiers. Using E_ID instead of E-ID will save you some trouble in the long run.
If your employee ID can be shorter than 6 characters, then you most probably want to use VARCHAR(6) instead of CHAR(6) due to the padding of the values with the CHAR datatype.
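The two-step approach can be tried out with a short sketch like the one below. It uses Python's built-in sqlite3 module in place of MySQL (an assumption for illustration) and the column names e_id and m_id, which are hypothetical and follow the naming advice above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when asked
conn.execute("""
    create table employee (
        name varchar(20),
        e_id varchar(6) not null primary key,
        m_id varchar(6) null,
        foreign key (m_id) references employee(e_id)
    )
""")

# Inserting a row whose manager does not exist yet is rejected.
try:
    conn.execute("insert into employee values ('Arthur', '12345', '67890')")
    direct_insert_ok = True
except sqlite3.IntegrityError:
    direct_insert_ok = False

# Step 1: insert both rows with no manager.
conn.execute("insert into employee (name, e_id) values ('Arthur', '12345')")
conn.execute("insert into employee (name, e_id) values ('Ford', '67890')")
# Step 2: now that both keys exist, point the rows at each other.
conn.execute("update employee set m_id = '67890' where e_id = '12345'")
conn.execute("update employee set m_id = '12345' where e_id = '67890'")
conn.commit()

rows = conn.execute("select name, e_id, m_id from employee order by e_id").fetchall()
print(direct_insert_ok, rows)
```

The direct insert fails because the referenced manager row is missing, while the insert-then-update sequence succeeds with the constraint enabled the whole time.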
There are good answers here, but figured I'd point out the quick-and-dirty:
SET FOREIGN_KEY_CHECKS = 0;
INSERT INTO EMPLOYEE (`E-ID`, `M-ID`) VALUES ('12345','67890'),('67890','12345');
SET FOREIGN_KEY_CHECKS = 1;
It will obviously fail because the table is empty.
INSERT INTO EMPLOYEE VALUES('12345','67890');
M-ID depends on E-ID, so remove the constraint in order to insert the records. The best thing to do is to create another table for M-ID and have it reference the Employee table.
I need to load data from a single file with 100,000+ records into multiple tables on MySQL while maintaining the relationships defined in the file/tables; that is, the relationships already match. The solution should work on the latest version of MySQL and needs to use the InnoDB engine; MyISAM does not support foreign keys.
I am completely new to using Pentaho Data Integration (aka Kettle) and any pointers would be appreciated.
I might add that it is a requirement that the foreign key constraints are NOT disabled. It's my understanding that if there is something wrong with the database's referential integrity, MySQL will not check for referential integrity when the foreign key constraints are turned back on. SOURCE: 5.1.4. Server System Variables -- foreign_key_checks
All approaches should include some form of validation and a rollback strategy should an insert fail, or fail to maintain referential integrity.
Again, I am completely new to this and doing my best to provide as much information as possible. If you have any questions or requests for clarification, just let me know.
If you are able to post the XML from the kjb and ktr files (jobs/transformations) that would be SUPER. Might even hunt down every comment/answer you've ever made anywhere and up vote them... :-) ...really, it's really important to me to find an answer for this.
Thanks!
SAMPLE DATA: To better elaborate with an example, let's assume I am trying to load a file containing employee names, the offices they have occupied in the past and their job title history, separated by tabs.
File:
EmployeeName<tab>OfficeHistory<tab>JobLevelHistory
John Smith<tab>501<tab>Engineer
John Smith<tab>601<tab>Senior Engineer
John Smith<tab>701<tab>Manager
Alex Button<tab>601<tab>Senior Assistant
Alex Button<tab>454<tab>Manager
NOTE: The single table database is completely normalized (as much as a single table may be) -- and for example, in the case of "John Smith" there is only one John Smith; meaning there are no duplicates that would lead to conflicts in referential integrity.
The MyOffice database schema has the following tables:
Employee (nId, name)
Office (nId, number)
JobTitle (nId, titleName)
Employee2Office (nEmpID, nOfficeId)
Employee2JobTitle (nEmpId, nJobTitleID)
So in this case, the tables should look like:
Employee
1 John Smith
2 Alex Button
Office
1 501
2 601
3 701
4 454
JobTitle
1 Engineer
2 Senior Engineer
3 Manager
4 Senior Assistant
Employee2Office
1 1
1 2
1 3
2 2
2 4
Employee2JobTitle
1 1
1 2
1 3
2 4
2 3
Here's the MySQL DDL to create the database and tables:
create database MyOffice2;
use MyOffice2;
CREATE TABLE Employee (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
name CHAR(50) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB;
CREATE TABLE Office (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
office_number INT NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB;
CREATE TABLE JobTitle (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
title CHAR(30) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB;
CREATE TABLE Employee2JobTitle (
employee_id MEDIUMINT NOT NULL,
job_title_id MEDIUMINT NOT NULL,
FOREIGN KEY (employee_id) REFERENCES Employee(id),
FOREIGN KEY (job_title_id) REFERENCES JobTitle(id),
PRIMARY KEY (employee_id, job_title_id)
) ENGINE=InnoDB;
CREATE TABLE Employee2Office (
employee_id MEDIUMINT NOT NULL,
office_id MEDIUMINT NOT NULL,
FOREIGN KEY (employee_id) REFERENCES Employee(id),
FOREIGN KEY (office_id) REFERENCES Office(id),
PRIMARY KEY (employee_id, office_id)
) ENGINE=InnoDB;
My Notes in Response to Selected Answer:
PREP:
(a) Using the sample data, create a CSV by changing the <TAB> delimiters to commas.
(b) Install MySQL and create sample database using the MySQL DDL sample
(c) Install Kettle (it's Java based and will run on anything that runs Java)
(d) Download KTR file
Dataflow by Step: (My Notes)
Open the KTR file in Kettle, double-click the "CSV file input" step and browse to the CSV file that you created. The delimiter should already be set to comma. Then click OK.
Double click "Insert Employees" and select DB connector then follow these directions on Creating a New Database Connection
I put together a sample transformation (right click and choose save link) based on what you provided. The only step I feel a bit uncertain about is the last table inputs. I'm basically writing the join data to the table and letting it fail if a specific relationship already exists.
Note:
This solution doesn't really meet the "All approaches should include some form of validation and a rollback strategy should an insert fail, or fail to maintain referential integrity." criterion, though it probably won't fail. If you really want to set up something complex we can, but this should definitely get you going with these transformations.
Dataflow by Step
1. We start with reading in your file. In my case I converted it to CSV but tab is fine too.
2. Now we're going to insert the employee names into the Employee table using a combination lookup/update.
After the insert we append the employee_id to our datastream as id and remove the EmployeeName from the data stream.
3. Here we're just using a Select Values step to rename the id field to employee_id
4. Insert Job Titles just like we did employees and append the title id to our datastream also deleting the JobLevelHistory from the datastream.
5. Simple rename of the title id to title_id (see step 3)
6. Insert offices, get id's, remove OfficeHistory from the stream.
7. Simple rename of the office id to office_id (see step 3)
8. Copy Data from the last step into two streams with the values employee_id,office_id and employee_id,title_id respectively.
9. Use a table insert to insert the join data. I've got it selected to ignore insert errors as there could be duplicates and the PK constraints will make some rows fail.
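For readers who want to see the logic of these steps outside Kettle, here is a rough sketch in Python using the built-in sqlite3 module instead of MySQL. It is not the transformation itself, only an approximation under stated assumptions: the helper names lookup_or_insert and load are hypothetical, the unique constraints on the lookup columns are an addition of this sketch, and the whole load runs in one transaction so that a failed insert rolls everything back.

```python
import sqlite3

# Sample rows from the question: name, office, job title (tab-separated).
SAMPLE = """John Smith\t501\tEngineer
John Smith\t601\tSenior Engineer
John Smith\t701\tManager
Alex Button\t601\tSenior Assistant
Alex Button\t454\tManager"""

# The unique constraints are an assumption of this sketch, not part of the DDL above.
SCHEMA = """
create table Employee (id integer primary key autoincrement, name text not null unique);
create table Office (id integer primary key autoincrement, office_number integer not null unique);
create table JobTitle (id integer primary key autoincrement, title text not null unique);
create table Employee2Office (
    employee_id integer not null references Employee(id),
    office_id integer not null references Office(id),
    primary key (employee_id, office_id));
create table Employee2JobTitle (
    employee_id integer not null references Employee(id),
    job_title_id integer not null references JobTitle(id),
    primary key (employee_id, job_title_id));
"""

def lookup_or_insert(conn, table, column, value):
    """Return the id of the row holding value, inserting it first if needed."""
    row = conn.execute(f"select id from {table} where {column} = ?", (value,)).fetchone()
    if row:
        return row[0]
    return conn.execute(f"insert into {table} ({column}) values (?)", (value,)).lastrowid

def load(conn, text):
    with conn:  # commits on success, rolls back if any statement raises
        for line in text.splitlines():
            name, office, title = line.split("\t")
            emp_id = lookup_or_insert(conn, "Employee", "name", name)
            office_id = lookup_or_insert(conn, "Office", "office_number", int(office))
            title_id = lookup_or_insert(conn, "JobTitle", "title", title)
            # "insert or ignore" skips pairs that are already present.
            conn.execute("insert or ignore into Employee2Office values (?, ?)", (emp_id, office_id))
            conn.execute("insert or ignore into Employee2JobTitle values (?, ?)", (emp_id, title_id))

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # keep foreign key checks enabled throughout
conn.executescript(SCHEMA)
load(conn, SAMPLE)

employee_count = conn.execute("select count(*) from Employee").fetchone()[0]
e2o = conn.execute("select * from Employee2Office order by rowid").fetchall()
e2t = conn.execute("select * from Employee2JobTitle order by rowid").fetchall()
print(e2o)
print(e2t)
```

The lookup-or-insert helper plays the role of the combination lookup/update steps, and the two final inserts correspond to the copy into two streams followed by the table inserts. The resulting join tables match the ones listed in the question.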