Composite index for a relationship table - mysql

I have the following tables:
CREATE TABLE `students` (
`student_id` int NOT NULL AUTO_INCREMENT,
`student_name` varchar(40) NOT NULL DEFAULT '',
PRIMARY KEY (`student_id`)
);
CREATE TABLE `courses` (
`course_id` int NOT NULL AUTO_INCREMENT,
`course_name` varchar(40) NOT NULL DEFAULT '',
PRIMARY KEY (`course_id`)
);
CREATE TABLE `students_courses` (
`id` int NOT NULL AUTO_INCREMENT,
`student_id` int NOT NULL DEFAULT '0',
`course_id` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
);
Here, I am using the students_courses table to store the relationships between students and courses, because one student can enroll in more than one course.
What I am unsure about is what should be indexed on that table, and how.
1) Shall I index student_id and course_id separately like this:
CREATE TABLE `students_courses` (
`id` int NOT NULL AUTO_INCREMENT,
`student_id` int NOT NULL DEFAULT '0',
`course_id` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY (`student_id`),
KEY (`course_id`)
);
2) Or create a composite index on both student_id and course_id:
CREATE TABLE `students_courses` (
`id` int NOT NULL AUTO_INCREMENT,
`student_id` int NOT NULL DEFAULT '0',
`course_id` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY (`student_id`, `course_id`)
);
3) If I go with the composite key, should I remove the id primary key and make the PRIMARY KEY composite?
I will mainly be using this relationship table in JOINs, so I am a bit confused here.

Let's say we stick with the auto-increment id column as the primary key. We still need to keep the data consistent, i.e., there must be no duplicate rows for a given (student_id, course_id) combination. We can either handle this in application code (a SELECT before every INSERT/UPDATE) or fix it structurally by defining a composite UNIQUE constraint on (student_id, course_id).
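If the surrogate id is kept, the structural fix could look roughly like the sketch below. The constraint name uq_student_course and the sample values are assumptions made for illustration; they do not come from the question.

```sql
-- Keep `id` as the primary key, but reject duplicate enrollments.
-- The constraint name uq_student_course is hypothetical.
ALTER TABLE students_courses
  ADD UNIQUE KEY uq_student_course (student_id, course_id);

-- The first insert succeeds (sample values, assumed for illustration).
INSERT INTO students_courses (student_id, course_id) VALUES (1, 10);

-- Repeating the same pair now fails with a duplicate-key error (MySQL error 1062).
INSERT INTO students_courses (student_id, course_id) VALUES (1, 10);
```

With the constraint in place, the application no longer needs a SELECT before every insert; it only has to handle the duplicate-key error.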
A primary key is essentially a UNIQUE NOT NULL key. Looking at your table definition, this new UNIQUE constraint already qualifies as a primary key (both columns are NOT NULL). So, in this particular case, you don't really need the surrogate primary key id.
The difference in overhead for random DML (INSERT/UPDATE/DELETE) will be minimal, since a UNIQUE index carries similar overhead. You can therefore define a natural composite primary key on (student_id, course_id) instead:
-- Drop the id column
ALTER TABLE students_courses DROP COLUMN id;
-- Add the composite Primary Key
ALTER TABLE students_courses ADD PRIMARY KEY (student_id, course_id);
The above also enforces the UNIQUE constraint on the combination (student_id, course_id). Moreover, you save 4 bytes per row (an int is 4 bytes), which comes in handy for large tables.
When joining from students to students_courses, the primary key above is a sufficient index, because student_id is its leftmost column. However, to join from courses to students_courses you need another key, so define one more index on course_id:
ALTER TABLE students_courses ADD INDEX (course_id);
Moreover, you should define Foreign Key constraints to ensure data integrity:
ALTER TABLE students_courses ADD FOREIGN KEY (student_id)
REFERENCES students(student_id);
ALTER TABLE students_courses ADD FOREIGN KEY (course_id)
REFERENCES courses(course_id);
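Put together, the statements above amount to roughly the following definition for a fresh table. This is a sketch under assumptions: the index and constraint names (idx_course, fk_sc_student, fk_sc_course) are made up, the DEFAULT '0' clauses are dropped because a default of 0 would never match a real student or course, and the literal ids in the queries are sample values.

```sql
CREATE TABLE students_courses (
  student_id int NOT NULL,
  course_id  int NOT NULL,
  PRIMARY KEY (student_id, course_id),   -- also enforces uniqueness of the pair
  KEY idx_course (course_id),            -- hypothetical name; serves lookups by course
  CONSTRAINT fk_sc_student FOREIGN KEY (student_id) REFERENCES students (student_id),
  CONSTRAINT fk_sc_course  FOREIGN KEY (course_id)  REFERENCES courses (course_id)
) ENGINE=InnoDB;

-- students -> students_courses: can use the primary key (leftmost column student_id).
SELECT c.course_name
FROM students s
JOIN students_courses sc ON sc.student_id = s.student_id
JOIN courses c ON c.course_id = sc.course_id
WHERE s.student_id = 1;

-- courses -> students_courses: can use idx_course.
SELECT s.student_name
FROM courses c
JOIN students_courses sc ON sc.course_id = c.course_id
JOIN students s ON s.student_id = sc.student_id
WHERE c.course_id = 10;
```

Prefixing either SELECT with EXPLAIN shows which index the optimizer actually chooses for your data.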

Related

SQL: Unique index for two columns where value of one is NOT DEFAULT

I have the following table structure:
CREATE TABLE `users` (
`id` varchar(36) NOT NULL,
`group_id` varchar(36) NOT NULL,
`group_owner` tinyint(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`));
With group_owner acting as a boolean, I am trying to create a unique index so that any given group ID can have only one row where group_owner = 1.
Something like:
UNIQUE KEY (`group_id`,`group_owner`) ... WHERE group_owner NOT DEFAULT
How would I go about doing this?
You have the wrong structure for your data. What you need is a groups table. And that should have the foreign key reference:
CREATE TABLE `users` (
user_id varchar(36) PRIMARY KEY,
group_id int NOT NULL -- same type as groups.group_id, as the foreign key requires
);
-- GROUPS is a reserved word as of MySQL 8.0.2, so the table name is quoted
create table `groups` (
group_id int auto_increment primary key,
group_name varchar(36) not null,
owner_user_id varchar(36),
constraint fk_groups_owner foreign key (owner_user_id) references users (user_id)
);
alter table users add constraint fk_users_groups
foreign key (group_id) references `groups`(group_id);
Voila! Only one owner per group.
Notes:
I made group_id an integer. I find that auto-incremented values are better for primary keys. I would recommend doing the same for users as well.
This allows a user to be in only one group.
In practice, you will insert a group with a NULL owner. Then insert the user with the appropriate group, and then set the owner to the user.
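The insert order described in the last note can be sketched as follows. The group name and user id are made-up sample values, and the table name is backtick-quoted on the assumption that the server is MySQL 8.0 or later, where GROUPS is a reserved word.

```sql
-- 1. Create the group without an owner yet.
INSERT INTO `groups` (group_name, owner_user_id) VALUES ('Example group', NULL);
SET @gid = LAST_INSERT_ID();  -- auto-generated group_id of the row just inserted

-- 2. Create the user as a member of that group (sample user id).
INSERT INTO users (user_id, group_id) VALUES ('user-0001', @gid);

-- 3. Promote that user to owner of the group.
UPDATE `groups` SET owner_user_id = 'user-0001' WHERE group_id = @gid;
```

Running the three statements inside one transaction keeps a group from being left without an owner if a later step fails.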
Actually, the above data model runs the risk that a user could be the owner of a group s/he is not a member of. That is fixed with a slight tweak:
CREATE TABLE `users` (
user_id varchar(36) PRIMARY KEY,
group_id int NOT NULL, -- same type as groups.group_id, as the foreign key requires
UNIQUE (group_id, user_id)
);
-- GROUPS is a reserved word as of MySQL 8.0.2, so the table name is quoted
create table `groups` (
group_id int auto_increment primary key,
group_name varchar(36) not null,
owner_user_id varchar(36),
-- the owner must be a user row that carries this same group_id
constraint fk_groups_owner foreign key (group_id, owner_user_id) references users (group_id, user_id)
);
alter table users add constraint fk_users_groups
foreign key (group_id) references `groups`(group_id);

Foreign Key Reference Composite Primary Key

The database is going to store information about hardware devices and their gathered data. I created a devices table to store the available hardware devices:
CREATE TABLE IF NOT EXISTS `devices` (
`deviceID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`deviceType` int(10) unsigned NOT NULL,
`updateFrequency` int(10) unsigned NOT NULL,
PRIMARY KEY (`deviceID`,`deviceType`)
)
The deviceID will correspond to a real hardware id (from 1 to 12). Since there are two types of hardware devices I thought it would be appropriate to create a deviceType which will be either 0 or 1 depending on which hardware device and make a composite primary key.
To store that data I created another table.
CREATE TABLE IF NOT EXISTS `data` (
`dataID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`deviceID` int(11) unsigned NOT NULL,
`payload` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
PRIMARY KEY (`dataID`),
KEY `fk_data_devices` (`deviceID`),
CONSTRAINT `fk_data_devices`
FOREIGN KEY (`deviceID`)
REFERENCES `devices` (`deviceID`)
ON DELETE CASCADE
)
The problem is obviously that I cannot reference the composite key in one column inside data. Would it make sense to create an additional column inside data for deviceType and foreign key reference that as well or would it make more sense to assign deviceID and deviceType inside devices to another id and reference that inside data?
Thanks in advance!
You have a parent table with a composite primary key on columns (deviceID, deviceType). If you want to create a child table, you would need to:
1) create one column in the child table for each column that is part of the primary key in the parent table (deviceID, deviceType)
2) create a composite foreign key from that tuple of columns to the corresponding column tuple in the parent table
Consider:
CREATE TABLE IF NOT EXISTS `data` (
`dataID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`deviceID` int(11) unsigned NOT NULL,
`deviceType` int(10) unsigned NOT NULL,
`payload` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
PRIMARY KEY (`dataID`),
CONSTRAINT `fk_data_pk`
FOREIGN KEY (`deviceID`, `deviceType`)
REFERENCES `devices` (`deviceID`, `deviceType`)
ON DELETE CASCADE
);
NB: creating a composite foreign key is functionally different from creating two foreign keys, each pointing at one of the columns in the parent table.
Given this data in the parent table:
deviceID  deviceType
1         0
2         1
If you create a separate foreign key on each column, they will allow you to insert a record in the child table with values like (1, 1) or (2, 0), because each value exists somewhere in its own column. The composite foreign key will not allow it, since these specific tuples do not exist in the parent table.
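The behaviour described above can be checked with a few inserts against the devices table and the composite-key version of data. The updateFrequency value and the payload are placeholder values assumed for illustration, and both tables are assumed to use InnoDB.

```sql
-- Parent rows matching the example data (updateFrequency 60 is a placeholder).
INSERT INTO devices (deviceID, deviceType, updateFrequency)
VALUES (1, 0, 60), (2, 1, 60);

-- Accepted: the tuple (1, 0) exists in devices.
INSERT INTO data (deviceID, deviceType, payload) VALUES (1, 0, '{}');

-- Rejected with a foreign key error (MySQL error 1452):
-- the tuple (1, 1) does not exist in devices, even though
-- deviceID 1 and deviceType 1 each appear in some row.
INSERT INTO data (deviceID, deviceType, payload) VALUES (1, 1, '{}');
```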

SQL Join Table - Does it require a primary key at all or just unique keys?

I have created an application for the CakePHP framework which uses a join table.
I am unsure whether I need a primary key to uniquely identify each row of the join table, as shown in the first block of SQL.
Do the two fields need to be set as unique keys, or can they both be set as the primary key, with id removed as the primary key?
I was also asked why I am declaring atomic primary keys using a table constraint rather than a column constraint. Does this mean I shouldn't set unique keys for a join table?
CREATE TABLE IF NOT EXISTS `categories_invoices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) NOT NULL,
`invoice_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `category_id` (`category_id`,`invoice_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=163 ;
I was thinking the solution is possibly to set both keys as unique and remove the primary key as shown here:
CREATE TABLE IF NOT EXISTS `categories_invoices` (
`category_id` int(11) NOT NULL,
`invoice_id` int(11) NOT NULL,
UNIQUE KEY `category_id` (`category_id`,`invoice_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I did in fact test deleting the primary key 'id' for the join table leaving only 'category_id' and 'invoice_id' and the application still worked. This has left both fields as unique fields within the join table. Is this in fact the correct practice?
You don't need both. The compound unique key can replace the primary key (unless the Cake framework cannot deal with compound primary keys):
CREATE TABLE IF NOT EXISTS categories_invoices (
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (category_id, invoice_id)
)
ENGINE = MyISAM
DEFAULT CHARSET = latin1 ;
It's also good to have another index, with the reverse order, besides the index created for the Primary Key:
INDEX (invoice_id, category_id)
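The reason for the second index is that a composite index mainly helps queries that filter on its leading column. A rough sketch of the two lookup directions, assuming both indexes exist (the literal ids are sample values):

```sql
-- Lookup by category: the primary key (category_id, invoice_id) leads with category_id.
SELECT invoice_id FROM categories_invoices WHERE category_id = 5;

-- Lookup by invoice: INDEX (invoice_id, category_id) leads with invoice_id.
SELECT category_id FROM categories_invoices WHERE invoice_id = 42;

-- EXPLAIN reports which index the optimizer actually picks.
EXPLAIN SELECT category_id FROM categories_invoices WHERE invoice_id = 42;
```

Because each index contains both columns, either query can be answered from the index alone.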
If you want to define Foreign Key constraints, you should use the InnoDB engine. With MyISAM you can't have Foreign Keys. So, it would be:
CREATE TABLE IF NOT EXISTS categories_invoices (
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (category_id, invoice_id),
INDEX invoice_category_index (invoice_id, category_id)
)
ENGINE = InnoDB
DEFAULT CHARSET=latin1 ;
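With the table on InnoDB, the foreign keys themselves could be added along these lines. The parent tables categories and invoices, their id columns, and the constraint names are assumptions based on CakePHP naming conventions; none of them appear in the question, and the parent columns would need to be int(11) on InnoDB tables as well.

```sql
-- Assumes parent tables categories(id) and invoices(id) exist (hypothetical names).
ALTER TABLE categories_invoices
  ADD CONSTRAINT fk_ci_category FOREIGN KEY (category_id) REFERENCES categories (id),
  ADD CONSTRAINT fk_ci_invoice  FOREIGN KEY (invoice_id)  REFERENCES invoices (id);
```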
If Cake cannot cope with composite Primary Keys:
CREATE TABLE IF NOT EXISTS categories_invoices (
id int(11) NOT NULL AUTO_INCREMENT,
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY category_invoice_unique (category_id, invoice_id),
INDEX invoice_category_index (invoice_id, category_id)
)
ENGINE = InnoDB
DEFAULT CHARSET=latin1 ;
There is nothing wrong with the second method. It is referred to as a composite key and is very common in database design, especially in your circumstance.
http://en.wikipedia.org/wiki/Relational_database#Primary_keys

How can I add a foreign key when creating a new table?

I have these two CREATE TABLE statements:
CREATE TABLE GUEST (
id int(15) not null auto_increment PRIMARY KEY,
GuestName char(25) not null
);
CREATE TABLE PAYMENT (
id int(15) not null auto_increment
Foreign Key(id) references GUEST(id),
BillNr int(15) not null
);
What is the problem in the second statement? It did not create a new table.
The answer to your question is almost the same as the answer to this one.
You need to specify in the table containing the foreign key the name of the table containing the primary key, and the name of the primary key field (using "references").
This has some code showing how to create foreign keys by themselves, and in CREATE TABLE.
Here's one of the simpler examples from that:
CREATE TABLE parent (id INT NOT NULL,
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE child (id INT, parent_id INT,
INDEX par_ind (parent_id),
FOREIGN KEY (parent_id) REFERENCES parent(id)
ON DELETE CASCADE
) ENGINE=INNODB;
I would suggest having a unique key for the PAYMENT table. For its part, the foreign key column should not be auto_increment, as it refers to an already existing key.
CREATE TABLE GUEST(
id int(15) not null auto_increment PRIMARY KEY,
GuestName char(25) not null
) ENGINE=INNODB;
CREATE TABLE PAYMENT(
id int(15) not null auto_increment PRIMARY KEY, -- an auto_increment column must be defined as a key
Guest_id int(15) not null,
INDEX G_id (Guest_id),
Foreign Key(Guest_id) references GUEST(id),
BillNr int(15) not null
) ENGINE=INNODB;
Make sure you're using the InnoDB engine for either the database, or for both tables. From the MySQL Reference:
For storage engines other than InnoDB, MySQL Server parses the FOREIGN KEY syntax in CREATE TABLE statements, but does not use or store it.
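One way to check which engine the tables currently use, and to convert them if needed, is sketched below. It assumes the tables live in the currently selected database and are named GUEST and PAYMENT as in the question.

```sql
-- Show the storage engine of each table in the current database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('GUEST', 'PAYMENT');

-- Convert any table that is not already InnoDB.
ALTER TABLE GUEST ENGINE = InnoDB;
ALTER TABLE PAYMENT ENGINE = InnoDB;
```

A foreign key declared while the table was on another engine was never stored, so it has to be added again with ALTER TABLE ... ADD FOREIGN KEY after the conversion.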
create table course(ccode int(2) primary key, course varchar(10));
create table student1(
rollno int(5) primary key, name varchar(10), coursecode int(2) not null,
mark1 int(3), mark2 int(3),
foreign key(coursecode) references course(ccode));
There should be a space between int(15) and not null.