MySQL - Supertype/Subtype design - mysql

I need to create the following database:
For semi-trucks I don't need extra subtypes, while for Car I need to have only those 3 subtypes and also for Sedan I need the four subtypes.
For SELECTs I will use JOINs (normalized database) but I need to find an easy way to make INSERTs.
Vehicle table stores common information
Semi-truck stores specific information for semis
Car table has specific fields for cars and a car_type field which is linked to the three subtypes
Van, Suv and Sedan (and other types if I would need them) should be in one table CAR_TYPE
However, for Sedan type I need to have additional subtypes which maybe should be contained in another table. These subtypes are not needed for Suvs and Vans (in real life suv, vans can have the same subtypes as sedans but not in my case).
I need this database to be created exactly as it is in the diagram.
So far, my first approach is to have the following tables:
Vehicle: veh_id, veh_type(Semi, car), ..., other_fields
Vehicle_semis: veh_id, ..., other_semis_fields
Vehicle_car: veh_id, car_type(Van, Suv, Sedan), other_car_specific_fields
Car_type: car_type_id, type
Sedan_type: sedan_type_id, type
My problem is that I'm not sure this would be the right approach, and I don't know exactly how to create relationships between the tables.
Any ideas?
Thank you!
UPDATE:
The following diagram is based on @Mike's answer:

Before I get started, I want to point out that "gas" describes either fuel or a kind of engine, not a kind of sedan. Think hard before you keep going down this path. (Semantics are more important in database design than most people think.)
What you want to do is fairly simple, but not necessarily easy. The important point in this kind of supertype/subtype design (also known as an exclusive arc) is to make it impossible to have rows about sedans referencing rows about semi-trucks, etc.
MySQL makes the code more verbose, because it doesn't enforce CHECK constraints (versions before 8.0.16 parse them but ignore them). You're lucky; in your application, the CHECK constraints can be replaced by additional tables and foreign key constraints. Comments refer to the SQL above them.
create table vehicle_types (
veh_type_code char(1) not null,
veh_type_name varchar(10) not null,
primary key (veh_type_code),
unique (veh_type_name)
);
insert into vehicle_types values
('s', 'Semi-truck'), ('c', 'Car');
This is the kind of thing I might implement as a CHECK constraint on other platforms. You can do that when the meaning of the codes is obvious to users. I'd expect users to know or to figure out that 's' is for semis and 'c' is for cars, or that views/application code would hide the codes from users.
create table vehicles (
veh_id integer not null,
veh_type_code char(1) not null,
other_columns char(1) default 'x',
primary key (veh_id),
unique (veh_id, veh_type_code),
foreign key (veh_type_code) references vehicle_types (veh_type_code)
);
The UNIQUE constraint lets the pair of columns {veh_id, veh_type_code} be the target of a foreign key reference. That means a "car" row can't possibly reference a "semi" row, even by mistake.
insert into vehicles (veh_id, veh_type_code) values
(1, 's'), (2, 'c'), (3, 'c'), (4, 'c'), (5, 'c'),
(6, 'c'), (7, 'c');
create table car_types (
car_type char(3) not null,
primary key (car_type)
);
insert into car_types values
('Van'), ('SUV'), ('Sed');
create table veh_type_is_car (
veh_type_car char(1) not null,
primary key (veh_type_car)
);
Something else I'd implement as a CHECK constraint on other platforms. (See below.)
insert into veh_type_is_car values ('c');
Only one row ever.
create table cars (
veh_id integer not null,
veh_type_code char(1) not null default 'c',
car_type char(3) not null,
other_columns char(1) not null default 'x',
primary key (veh_id ),
unique (veh_id, veh_type_code, car_type),
foreign key (veh_id, veh_type_code) references vehicles (veh_id, veh_type_code),
foreign key (car_type) references car_types (car_type),
foreign key (veh_type_code) references veh_type_is_car (veh_type_car)
);
The default value for veh_type_code, along with the foreign key reference to veh_type_is_car, guarantees that rows in this table can be only about cars, and can only reference vehicles that are cars. On other platforms, I'd just declare the column veh_type_code as veh_type_code char(1) not null default 'c' check (veh_type_code = 'c').
insert into cars (veh_id, veh_type_code, car_type) values
(2, 'c', 'Van'), (3, 'c', 'SUV'), (4, 'c', 'Sed'),
(5, 'c', 'Sed'), (6, 'c', 'Sed'), (7, 'c', 'Sed');
create table sedan_types (
sedan_type_code char(1) not null,
primary key (sedan_type_code)
);
insert into sedan_types values
('g'), ('d'), ('h'), ('e');
create table sedans (
veh_id integer not null,
veh_type_code char(1) not null,
car_type char(3) not null,
sedan_type char(1) not null,
other_columns char(1) not null default 'x',
primary key (veh_id),
foreign key (sedan_type) references sedan_types (sedan_type_code),
foreign key (veh_id, veh_type_code, car_type) references cars (veh_id, veh_type_code, car_type)
);
insert into sedans (veh_id, veh_type_code, car_type, sedan_type) values
(4, 'c', 'Sed', 'g'), (5, 'c', 'Sed', 'd'), (6, 'c', 'Sed', 'h'),
(7, 'c', 'Sed', 'e');
If you have to build additional tables that reference sedans, such as gas_sedans, diesel_sedans, etc., then you need to build one-row tables similar to "veh_type_is_car" and set foreign key references to them.
In production, I'd revoke permissions on the base tables, and either use
updatable views to do the inserts and updates, or
stored procedures to do the inserts and updates.
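As a rough check of the composite-key idea in the answer above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, chosen only so the script runs without a server). Because SQLite enforces CHECK constraints, the one-row veh_type_is_car table is replaced by a CHECK here. The tables are trimmed to the columns that matter, and try_insert is a hypothetical helper, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE vehicles (
    veh_id        INTEGER NOT NULL PRIMARY KEY,
    veh_type_code CHAR(1) NOT NULL CHECK (veh_type_code IN ('s', 'c')),
    UNIQUE (veh_id, veh_type_code)          -- target of the composite foreign key
);
CREATE TABLE cars (
    veh_id        INTEGER NOT NULL PRIMARY KEY,
    veh_type_code CHAR(1) NOT NULL DEFAULT 'c' CHECK (veh_type_code = 'c'),
    car_type      CHAR(3) NOT NULL,
    FOREIGN KEY (veh_id, veh_type_code)
        REFERENCES vehicles (veh_id, veh_type_code)
);
INSERT INTO vehicles VALUES (1, 's'), (2, 'c');
INSERT INTO cars (veh_id, car_type) VALUES (2, 'Van');
""")


def try_insert(sql):
    """Hypothetical helper: report whether the database accepts a statement."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Vehicle 1 is a semi. The default code 'c' gives (1, 'c'), which is not in vehicles.
semi_as_car = try_insert("INSERT INTO cars (veh_id, car_type) VALUES (1, 'SUV')")

# Supplying the semi's real code matches vehicles but violates the CHECK on cars.
wrong_code = try_insert(
    "INSERT INTO cars (veh_id, veh_type_code, car_type) VALUES (1, 's', 'SUV')"
)

car_count = conn.execute("SELECT COUNT(*) FROM cars").fetchone()[0]
print(semi_as_car, wrong_code, car_count)
```

Both inserts are refused: the first because the pair (1, 'c') does not exist in vehicles, the second because cars accepts no code other than 'c'. In the MySQL version above, the foreign key to veh_type_is_car plays the role of that CHECK.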

I refer you to the "Info" tab under the following three tags:
class-table-inheritance
single-table-inheritance
shared-primary-key
The first two describe the two major design patterns for dealing with a class/subclass (aka type/subtype) situation when designing a relational database. The third describes a technique for using a single primary key that gets assigned in the superclass table and gets propagated to the subclass tables.
They don't completely answer the questions you raise, but they shed some light on the whole topic. This topic, of mimicking inheritance in SQL, comes up over and over again in both SO and the DBA area.

Related

sql creating view irrelevant result error

I'd like to create a view of romance movies that are on Netflix.
Here's the code I used, along with screenshots of the result.
Sadly, the result is redundant and includes unrelated movies.
CREATE VIEW DB2020_romance_on_netflix AS
SELECT
distinct(DB2020_MOVIEINFO.MOVIE_ID), DB2020_MOVIEINFO.title, DB2020_MOVIEINFO.plot, DB2020_genre.genre, DB2020_ON_NETFLIX.on_netflix
FROM
DB2020_genre, DB2020_MOVIEINFO, DB2020_ON_NETFLIX
WHERE
DB2020_genre.genre = 'romance' and
DB2020_genre.mov_id in (SELECT mov_id
FROM DB2020_ON_NETFLIX
WHERE on_netflix = 'yes');
select * from DB2020_romance_on_netflix;
Here's more code needed for checking.
CREATE TABLE DB2020_MOVIEINFO (
MOVIE_ID Int primary key,
title VARCHAR(30) NOT NULL,
plot VARCHAR(500) NOT NULL,
main_target VARCHAR(10) NOT NULL,
country char(6) NOT NULL,
INDEX (MOVIE_ID)
);
DESCRIBE DB2020_MOVIEINFO;
INSERT INTO DB2020_MOVIEINFO VALUES (1, 'Parasite', 'Greed and class discrimination threaten the newly formed symbiotic relationship between the wealthy Park family and the destitute Kim clan.','Adult','Korea');
INSERT INTO DB2020_MOVIEINFO VALUES (2, 'Before Sunset', 'Celine and Jesse, who met nine years ago in Vienna, cross paths again for a single day in Paris. Together, they try to find out what might have happened if they had acted on their feelings back then.','Adult','US');
INSERT INTO DB2020_MOVIEINFO VALUES (3,'Before Sunrise', 'While travelling on a train in Europe, Jesse, an American man, meets Celine, a French woman. On his last day in Europe before returning to the US, he decides to spend his remaining hours with her.','All', 'US');
CREATE TABLE DB2020_genre(
mov_id INT NOT NULL,
genre VARCHAR(15) NOT NULL,
release_date DATE NOT NULL,
INDEX (genre),
PRIMARY KEY (mov_id, genre),
FOREIGN KEY (mov_id) REFERENCES DB2020_MOVIEINFO(MOVIE_ID)
ON DELETE CASCADE ON UPDATE CASCADE);
DESCRIBE DB2020_genre;
INSERT INTO DB2020_genre VALUES (1, 'thriller',str_to_date('20190530','%Y%m%d') );
INSERT INTO DB2020_genre VALUES (2, 'romance',str_to_date('20041024','%Y%m%d'));
INSERT INTO DB2020_genre VALUES (3, 'romance',str_to_date('19960316','%Y%m%d'));
CREATE TABLE DB2020_ON_NETFLIX(
mov_id INT NOT NULL,
on_netflix VARCHAR(10) NOT NULL,
service_start_year VARCHAR(10),
PRIMARY KEY (mov_id, on_netflix),
INDEX(mov_id),
FOREIGN KEY (mov_id) REFERENCES DB2020_MOVIEINFO(MOVIE_ID)
ON DELETE CASCADE ON UPDATE CASCADE);
DESCRIBE DB2020_ON_NETFLIX;
INSERT INTO DB2020_ON_NETFLIX VALUES (1, 'no', NULL);
INSERT INTO DB2020_ON_NETFLIX VALUES (2, 'yes', '2018');
INSERT INTO DB2020_ON_NETFLIX VALUES (3, 'yes', '2018');
I wonder what caused this problem. Big thanks in advance for helping me.
This worked. A natural join produced redundant rows; an explicit left outer join on the movie ID fixed that.
CREATE VIEW DB2020_romance_on_netflix AS
SELECT
DB2020_MOVIEINFO.title, DB2020_MOVIEINFO.plot, DB2020_genre.genre
FROM
DB2020_genre left outer join DB2020_MOVIEINFO on DB2020_genre.mov_id = DB2020_MOVIEINFO.MOVIE_ID
WHERE
DB2020_genre.genre = 'romance' and
DB2020_genre.mov_id in (SELECT mov_id
FROM DB2020_ON_NETFLIX
WHERE on_netflix = 'yes');
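To see why the view in the question returned unrelated movies, here is a small sketch that runs both query shapes against the sample data. It uses Python's built-in sqlite3 module in place of MySQL, and the shortened table names (movieinfo, genre, netflix) are assumptions made for brevity. The first query lists the tables with commas and no join condition, as in the question; the second joins on the movie ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movieinfo (movie_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE genre     (mov_id INTEGER NOT NULL, genre TEXT NOT NULL);
CREATE TABLE netflix   (mov_id INTEGER NOT NULL, on_netflix TEXT NOT NULL);

INSERT INTO movieinfo VALUES (1, 'Parasite'), (2, 'Before Sunset'), (3, 'Before Sunrise');
INSERT INTO genre     VALUES (1, 'thriller'), (2, 'romance'), (3, 'romance');
INSERT INTO netflix   VALUES (1, 'no'), (2, 'yes'), (3, 'yes');
""")

# No join condition: each matching genre row is paired with every movie
# and every netflix row (a cross product).
cross_rows = conn.execute("""
    SELECT m.title
    FROM genre g, movieinfo m, netflix n
    WHERE g.genre = 'romance'
      AND g.mov_id IN (SELECT mov_id FROM netflix WHERE on_netflix = 'yes')
""").fetchall()

# Explicit join conditions tie each row to the same movie.
joined_rows = conn.execute("""
    SELECT m.title
    FROM genre g
    JOIN movieinfo m ON m.movie_id = g.mov_id
    JOIN netflix   n ON n.mov_id   = g.mov_id
    WHERE g.genre = 'romance' AND n.on_netflix = 'yes'
    ORDER BY m.movie_id
""").fetchall()

print(len(cross_rows))
print(joined_rows)
```

With this data the comma-style query yields 2 x 3 x 3 = 18 rows (including 'Parasite'), while the joined query yields only the two romance titles.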

duplicate value for composite primary key

I have a table friends:
When I insert two new records like (1, 2, true), (1, 2, false), I get a "duplicate entry '1-2'" error, which is logical. But when I insert the two records (1, 2, true), (2, 1, false), it goes through. My question is why? I thought (1, 2) and (2, 1) would also count as duplicates for the composite primary key (request_to, request_from).
My SQL query:
INSERT INTO `friends` (`request_to`, `request_from`, `confirmed`, `date_confirmation`) VALUES ('11', '12', b'1', NULL), ('12', '11', '', NULL)
Seems to me like the PRIMARY key "request_to" is enough to create a constraint violation for the duplicate (1) .
You have a composite primary key on (request_to, request_from).
That means that you cannot insert the same pair of values twice. In your example that fails, (1, 2) is duplicated. In your example that works, (1, 2) <> (2, 1), so it is okay (for this constraint).
If you want uniqueness regardless of direction, add a unique index on expressions (MySQL 8.0.13 and later support functional key parts; each expression needs its own parentheses):
create unique index unq_friends_to_from on
friends((least(request_to, request_from)), (greatest(request_to, request_from)));
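The difference between the two constraints can be sketched as follows. This uses Python's built-in sqlite3 module rather than MySQL (an assumption, for a self-contained run); SQLite's two-argument min() and max() stand in for least() and greatest(), and the table names friends_pk_only and friends_symmetric are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Composite primary key only: (1, 2) and (2, 1) are different keys.
CREATE TABLE friends_pk_only (
    request_to   INTEGER NOT NULL,
    request_from INTEGER NOT NULL,
    PRIMARY KEY (request_to, request_from)
);

-- Same key plus a unique index on the ordered pair.
CREATE TABLE friends_symmetric (
    request_to   INTEGER NOT NULL,
    request_from INTEGER NOT NULL,
    PRIMARY KEY (request_to, request_from)
);
CREATE UNIQUE INDEX unq_friends_pair ON friends_symmetric (
    min(request_to, request_from),
    max(request_to, request_from)
);

INSERT INTO friends_pk_only   VALUES (1, 2);
INSERT INTO friends_symmetric VALUES (1, 2);
""")


def insert_reversed(table):
    """Try to insert the reversed pair (2, 1) and report the outcome."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (2, 1)")
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


pk_only_result = insert_reversed("friends_pk_only")
symmetric_result = insert_reversed("friends_symmetric")
print(pk_only_result, symmetric_result)
```

The primary key alone accepts the reversed pair, because column order is part of the key; the expression index maps both (1, 2) and (2, 1) to the same ordered pair and refuses the second row.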

sql integrity constraint parent key not found

I am trying to do something simple: create four tables and insert their data. I have spent hours on the web researching integrity constraints and tried several IDEs in case there's a bug, but nothing seems to work. The code is shown below (executed in order).
I can insert the data for the first two tables, i.e. vod_actor and vod_classification, but when I try to add the data for the third and fourth tables I get the following error:
ORA-02291: integrity constraint (SYSTEM.VOD_FILM_CLASS_FK) violated - parent key not found
I don't understand why because the FK for vod_film is the PK for vod_classification which already has its data populated.
Any help would be greatly appreciated. I am a beginner please bear that in mind. Thanks
CREATE TABLE vod_actor (
dbActorId CHAR(4) NOT NULL,
dbFirstname VARCHAR2(50) NOT NULL,
dbLastname VARCHAR2(50) NOT NULL,
dbDateOfBirth DATE,
dbNationality VARCHAR2(30),
dbBiography CLOB,
CONSTRAINT vod_actor_PK PRIMARY KEY (dbActorId)
);
CREATE TABLE vod_classification (
dbClassId CHAR(4) NOT NULL,
dbDescription VARCHAR(250) NOT NULL,
CONSTRAINT vod_classification_PK PRIMARY KEY (dbClassId)
);
CREATE TABLE vod_film (
dbFilmId CHAR(4) NOT NULL,
dbTitle VARCHAR2(100) NOT NULL,
dbDirector_firstname VARCHAR2(50) NOT NULL,
dbDirector_lastname VARCHAR2(50) NOT NULL,
dbGenre VARCHAR2(20),
dbUK_release_date DATE,
dbFilename VARCHAR2(50),
dbRuntime NUMBER(4),
dbClass CHAR(3),
CONSTRAINT vod_film_PK PRIMARY KEY (dbFIlmId),
CONSTRAINT vod_film_class_FK FOREIGN KEY (dbClass) REFERENCES
vod_classification (dbClassId) ON DELETE SET NULL
);
CREATE TABLE vod_role (
dbFilmId Char(4) NOT NULL,
dbActorId CHAR(4) NOT NULL,
dbCharacterName VARCHAR2(25) NOT NULL,
dbFirstAppearance NUMBER(6),
dbDescription CLOB,
CONSTRAINT vod_role_PK PRIMARY KEY (dbFilmId, dbActorId, dbCharacterName),
CONSTRAINT vod_role_film_FK FOREIGN KEY (dbFilmId) REFERENCES vod_film (dbFilmId)
ON DELETE CASCADE,
CONSTRAINT vod_role_actor_FK FOREIGN KEY (dbActorId) REFERENCES vod_actor (dbActorId)
ON DELETE CASCADE
);
//Insert into vod_actor & vod_classification works fine
Executing code below gives the error:
INSERT INTO vod_film VALUES ('1', 'Toy Story 3', 'Lee', 'Unkrich', 'Comedy', '19-JUL-2010', 'ToyStory3.mpg', '103', 'U');
INSERT INTO vod_film VALUES ('2', 'Lord of the Rings: Fellowship of the ring', 'Peter', 'Jackson', 'Fantasy', '19-DEC-2001', 'Fellowship.mpg', '178', '12');
INSERT INTO vod_film VALUES ('3', 'Lord of the Rings: Two Towers', 'Peter', 'Jackson', 'Fantasy', '18-DEC-2002', 'TwoTowers.mpg', '179', '12');
INSERT INTO vod_film VALUES ('4', 'Lord of the Rings: Return of the King', 'Peter', 'Jackson', 'Fantasy', '17-DEC-2003', 'KingReturns.mpg', '201', '12');
INSERT INTO vod_film VALUES ('5', 'Face/Off', 'John', 'Woo', 'Action', '7-NOV-1997', 'FaceOff.mpg', '138', '18');
INSERT INTO vod_film VALUES ('6', 'The Nutty Professor', 'Tom', 'Shadyac', 'Comedy', '4-OCT-1996', 'NuttyProf.mpg', '95', '12');
I think the issue in this case is the different character lengths of the PK and FK fields.
CREATE TABLE vod_classification (
dbClassId CHAR(4) NOT NULL,
....
CREATE TABLE vod_film (
...
dbClass CHAR(3),
Given the constraint
CONSTRAINT vod_film_class_FK FOREIGN KEY (dbClass) REFERENCES
vod_classification (dbClassId)
the length mismatch appears to be the issue: char(3) <> char(4). Make them both the same, most likely by changing the 3 to 4.
If I remember right, char pads values with trailing spaces, so 'U  ' (U plus two spaces, for 3 characters) will never equal 'U   ' (U plus three spaces, for 4 characters). That is one of the reasons I prefer varchar: no space padding. Why was char chosen here?

Mysql improve performance of count with join

I need to improve a query similar to the following, which simply performs a count based on some filtering while also joining a second table. The books table could have millions of records.
CREATE TABLE author
(id int auto_increment primary key, name varchar(20), style varchar(20));
CREATE TABLE books
(
id int auto_increment primary key not null,
author_id int not null,
title varchar(20) not null,
level int not null,
date datetime not null,
CONSTRAINT fk_author FOREIGN KEY (author_id) REFERENCES author(id)
);
CREATE INDEX idx_level_date ON books(level, date);
INSERT INTO author
(name, style)
VALUES
('John', 'Fact'),
('Sarah', 'Fact'),
('Michael', 'Fiction');
INSERT INTO books
(id, author_id, title, level, date)
VALUES
(1, 1, 'John Book 1', 1, '2012-01-13 13:10:30'),
(2, 1, 'John Book 2', 1, '2011-03-12 12:10:20'),
(3, 1, 'John Book 3', 2, '2012-01-23 12:40:30'),
(4, 2, 'Sarah Book 1', 1, '2009-10-15 13:10:30'),
(5, 2, 'Sarah Book 2', 2, '2013-01-30 12:10:30'),
(6, 3, 'Michael Book 1', 3, '2012-11-13 12:10:30');
It runs extremely quickly once I remove the join, but I really need the join in there as I may need to filter based on the author table.
Can anyone help by suggesting additional indexing that could speed things up?
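You always need to index foreign key fields as well as primary key fields. (In MySQL's InnoDB, the primary key is indexed automatically, and declaring a foreign key creates an index on the referencing column if a suitable one does not already exist.)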
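As an illustration of the indexing advice, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The question's actual query is not shown, so the COUNT query below is a hypothetical example of a filtered count with a join, and the index name idx_books_author is likewise an assumption. SQLite, unlike InnoDB, does not index a foreign key column on its own, so the index is created explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, style TEXT);
CREATE TABLE books (
    id        INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES author (id),
    title     TEXT NOT NULL,
    level     INTEGER NOT NULL,
    date      TEXT NOT NULL
);
CREATE INDEX idx_level_date   ON books (level, date);
CREATE INDEX idx_books_author ON books (author_id);  -- index on the foreign key

INSERT INTO author (id, name, style) VALUES
    (1, 'John', 'Fact'), (2, 'Sarah', 'Fact'), (3, 'Michael', 'Fiction');
INSERT INTO books (id, author_id, title, level, date) VALUES
    (1, 1, 'John Book 1',    1, '2012-01-13 13:10:30'),
    (2, 1, 'John Book 2',    1, '2011-03-12 12:10:20'),
    (3, 1, 'John Book 3',    2, '2012-01-23 12:40:30'),
    (4, 2, 'Sarah Book 1',   1, '2009-10-15 13:10:30'),
    (5, 2, 'Sarah Book 2',   2, '2013-01-30 12:10:30'),
    (6, 3, 'Michael Book 1', 3, '2012-11-13 12:10:30');
""")

# Hypothetical query: count books at a given level by authors of a given style.
QUERY = """
    SELECT COUNT(*)
    FROM books b
    JOIN author a ON a.id = b.author_id
    WHERE b.level = ? AND a.style = ?
"""

fact_level_1 = conn.execute(QUERY, (1, "Fact")).fetchone()[0]
print(fact_level_1)

# The plan text varies between SQLite versions, so it is printed rather than checked.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1, "Fact")):
    print(row)
```

On MySQL the equivalent check is to run EXPLAIN on the real query and confirm that both the filter columns and the join column are served by an index.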