How to implement a superclass/subclass structure in MySQL? - mysql

These are the tables in my database, I need to create a couple superclass/subclass structures.
The first is where...
Superclass-Crew_Member
Subclasses-Director, Producer, Other_Directing, Other_Production, Art, Camera, Sound, Grip, Electrical, Post.
The second is...
Superclass-Producer
Subclasses-Salaries, Budget
+---------------------+
| Tables_in_film_crew |
+---------------------+
| art |
| budget |
| camera |
| crew_member |
| director |
| electrical |
| equipment |
| grip |
| location |
| manufacturer |
| other_directing |
| other_production |
| post_production |
| producer |
| salaries |
| sound |
+---------------------+
So how exactly would I go about creating those relationships?
Edit:
Maybe I should have clarified some other things too.
Here's what's contained in crew_member (Superclass):
+-------------+-------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+-------------------+----------------+
| Member_ID | int(5) | NO | PRI | NULL | auto_increment |
| Member_Name | varchar(25) | YES | | [INSERT EXAMPLE] | |
| DOB | date | YES | | [INSERT EXAMPLE] | |
| Address1 | varchar(25) | YES | | [INSERT EXAMPLE] | |
| Address2 | varchar(25) | YES | | [INSERT EXAMPLE] | |
+-------------+-------------+------+-----+-------------------+----------------+
Meanwhile here's what's contained in Other_Directing (Example Subclass):
+---------------+--------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------+------+-----+---------+----------------+
| O_Director_ID | int(4) | NO | PRI | NULL | auto_increment |
| FAD_ID | int(5) | NO | MUL | NULL | |
| SAD_ID | int(5) | NO | MUL | NULL | |
| SUD_ID | int(5) | NO | MUL | NULL | |
+---------------+--------+------+-----+---------+----------------+
Now all the foreign keys are referring to Member_ID from Crew_Member. All the other tables (except Director and Producer) are created in similar ways.

You could start by following some general rules which have to be taken in consideration when creating a database. Put the information that different groups have in common in 1 table and the specific data in smaller satellite tables.
I would put the generic information about a crew member in the first table:
so we would have an id, name, address and whatever all the members have in common.
Then you create "sub-tables" which relate to the "crew-member" table through the value crew_member_id. In this tables you put only the specific information related to directors, producers etc..
So the fields here might be something like: id, crew_member_id, directed movies, etc..
Even with the superclass producer you should work in the same way. Relate the subtales with the superclass through its primary key to have relations between them.
I would suggest you to read some articles about database designing. It might save your life in the future, cause after a database is made then it becomes much harder to correct mistakes.
http://www.datanamic.com/support/lt-dez005-introduction-db-modeling.html

Yeah this is a really good question, that I was researching as well.
The ideas I came up with were:
1> Have a parent table as the super class with sattelite tables for the attributes for each subclass joined by a foreign key. You could then represent it as a view.
2> Have a parent table as the super class and another single table for all of the extra attributes. This would have to be matched by joined by two foreign keys.
3> One table which holds all of the classes.(Terrible idea)
Theres other ideas but I think the first is the best bet.
Here's more info that suggests the first way.
http://www.tomjewett.com/dbdesign/dbdesign.php?page=subclass.php

Related

Learning SQL: Query for rows in diff tables based on if a column value is set

I am doing a side project to help me learn SQL.
I have setup 2 different tables:
computers
+------------------+-----------+------+-----+---------+-------+
│| Field | Type | Null | Key | Default | Extra |
│+------------------+-----------+------+-----+---------+-------+
│| serial_number | char(25) | NO | PRI | NULL | |
│| operating_system | char(10) | YES | | NULL | |
│| purchase_year | int(4) | YES | | NULL | |
│| assigned_to | char(100) | YES | | NULL | |
│+------------------+-----------+------+-----+---------+-------+
employees
│+------------+-----------+------+-----+---------+-------+
│| Field | Type | Null | Key | Default | Extra |
│+------------+-----------+------+-----+---------+-------+
│| email | char(100) | NO | PRI | NULL | |
│| first_name | char(25) | NO | | NULL | |
│| last_name | char(25) | NO | | NULL | |
│| office | char(5) | NO | | NULL | |
│| assigned | char(25) | YES | | NULL | |
│+------------+-----------+------+-----+---------+-------+
These both have a few entries while I am testing, but in trying to write a search function based off the employee email, I am reaching a snag with SQL queries. I'm pouring through the documentation, but not understanding it well, and can't find a good example of what I am trying to do to follow along with.
Here is what I am attempting to do with the query:
I want to grab a the employee row matching email address provided, and if the "employees.assigned" field is set (not null, think EXIST is used?) then I want to also grab the "computers.serial_number" row matching that column value
I can do what I want with 2 separate queries, but I want to see if it is possible with only one to clean up code and make the query as fast as possible. Any further documentation you think is worthwhile for this project is very welcome as well!
For those people finding this on google:
What I found worked for my need:
SELECT * FROM employees LEFT JOIN computers ON employees.assigned=computers.serial_number WHERE email='email#example.com';

Having more than one value inserted into the same column

I have a webapp that I'm building. This webapp will take as input some products (cars, motos, boats, houses, etc...) and each product will have one or more photos associated with it. The id of each of photo is generated by the uniqid() function of php.
My problem is:I can't seem to fit more than two id_photos into the same column
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | auto_increment |
| brand | enum('Alfa Romeo','Aston Martin','Audi') | NO | | NULL | |
| color | varchar(20) | NO | | NULL | |
| type | enum('gasoline','diesel','eletric') | YES | | NULL | |
| price | mediumint(8) unsigned | YES | | NULL | |
| mileage | mediumint(8) unsigned | YES | | NULL | |
| model | text | YES | | NULL | |
| year | year(4) | YES | | NULL | |
| id_photos | varchar(30) | YES | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
What I would like to happen is something like this: INSERT INTO cars(id_photos) values ('id_1st_photo', 'id_2nd_photo')
Ending up having something like this:
| 60 | Audi | Yellow | diesel | 252352 | 1234112 | R8 | 1990 | id_1st_photo id_2nd_photo |
Eventually I would have to grab those photos from the folders they are in which is something like this: /var/www/website/$login/photos/id_of_photo with the query select id_photos from cars where carid=$id.
You may found some data types that is not proprelly good for the data that the server will receive but I'm one week into mysql and I'll worry about data types later on.
First of all I don't know if that is possible, if it's not how can I design something to work like that?
I have found this question that is quite the same of mine but I can't seem to implement something like this: add multiple values in one column
You can insert the concatenated values into a field. But it is not a good practice. You can create another table with foreign key having the id of the parent table.
You can easily adapt the approach in the linked question and even remove one table needed:
You first table stays almost the same, but has the id_photos column removed:
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | auto_increment |
| brand | enum('Alfa Romeo','Aston Martin','Audi') | NO | | NULL | |
| color | varchar(20) | NO | | NULL | |
| type | enum('gasoline','diesel','eletric') | YES | | NULL | |
| price | mediumint(8) unsigned | YES | | NULL | |
| mileage | mediumint(8) unsigned | YES | | NULL | |
| model | text | YES | | NULL | |
| year | year(4) | YES | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
Then you'll add a second table to store the links to the photo ids:
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | |
| id_photos | varchar(30) | NO | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
Both tables are linked by the field carid (You should even make carid in the second table a foreign key pointing to the one in the first table).
Each id_photos then results in a new row in the second table.
To query the data you probably need a JOIN between both tables and maybe a GROUP BY to reduce the result to one row per carid again, but this depends on the other usecases.
You can insert the string formatted woth multiple photo name
INSERT INTO cars(id_photos) values ('id_1st_photo, id_2nd_photo')
In this way you don'have a well normalized database structure so you have problem when retrive the singole foto name ..
i suggest you of normalize the id_photo column in a separata table with reference to the master table and in this way store each single photo in one row

Database schema: Key/Value table or all keys in one record

I guess that this is somewhat of a philosophical question. I need to collect pathology results for a group of patients and store them in a database. In the past I have used a very simple table structure (simplified):
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| Updated | datetime | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Name | varchar(255) | NO | | NULL | |
| Value | varchar(255) | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
More often in schema design I see:
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Ph_Value | varchar(255) | NO | | NULL | |
| K_Value | varchar(255) | NO | | NULL | |
| Ca_Value | varchar(255) | NO | | NULL | |
| Ph_Value_updated | datetime | NO | | NULL | |
| K_Value_updated | datetime | NO | | NULL | |
| Ca_Value_updated | datetime | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
It seems to me that the first design is much more flexible, expandable etc. However, I do wonder about performance hits when the records run to the millions.
The issue with the second is that there may be a couple of hundred fields that need to be recorded on occasions.
I would be really interested to get comments / advice / guidance on this.
You are absolutely right, the first schema is a lot more flexible: you can add new keys on a live database without changing the schema. However, flexibility is usually bought with the time and/or the space. In this case, it's both: you need more space to store all keys for the same row because the ID is replicated N times, and the joins or orderings required to get the fields together would take time.
There is no reason to pay for flexibility unless you need it. If most of your queries need most of the columns, the second result is the most economical. However, if most of your queries ask for a single column, getting the flexibility may be worth spending the CPU time and the database space.
In my opinion, If that name/value pairs won't be changed much so the second option is much better in the terms of space and number of rows.
Also you can have another solution to optimize the first schema , to put the names in another table and just put name_id instead of repeating the same name several times.
The other schema is to have patient table and a table for each value that contains patient_id and value and the table name is the name for that value

Fetching/Inserting huge chunks of data from/to a large table

We have an Online Judge (something similar to SPOJ.pl) where we conduct these 3 hour long contests during the weekends, by the end of which we have close to around 1000 submissions. And we store all these runs on a single table (which includes the submitted code). The present structure of the table is as follows :
+------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+---------+----------------+
| rid | int(11) | NO | PRI | NULL | auto_increment |
| pid | int(11) | YES | | NULL | |
| tid | int(11) | YES | | NULL | |
| language | tinytext | YES | | NULL | |
| name | tinytext | YES | | NULL | |
| code | longtext | YES | | NULL | |
| time | tinytext | YES | | NULL | |
| result | tinytext | YES | | NULL | |
| error | text | YES | | NULL | |
| access | tinytext | YES | | NULL | |
| submittime | int(11) | YES | | NULL | |
| output | longtext | YES | | NULL | |
+------------+----------+------+-----+---------+----------------+
Now the problem is that, every time we use the ORDER BY clause while querying within this, it ends up sorting the whole table. And in case of more than 1000 rows, each with a considerable amount of data, the time taken is significant. Please note that this is after OPTIMIZE-ing the tables at regular intervals say there have been changes made to the submissions. We do have two options :
Split up the tables after say around 100 entries.
Store the huge chunks of data (the submitted code) as files instead of inserting them as values into the table to reduce the overhead.
Is there another alternative/workaround to this were we can actually maintain the table structure as it is? I could really use some help here. Thanks.
My recommendation would be to do something called vertical partitioning: split the table into multiple tables, with different columns.
In this case, I would have one table that has all the small data: rid, pid, tid, language, name, time, result, access, submittime.
A second table would have: rid, code, error, output.
This way, you can do the sort on the first table and then join in the other fields after the sort. I put code, error, and output together since they sort of seem to go together.

Is many to many needed on this DB?

I am designing a DB for a possible PHP MySQL project I may be undertaking.
I am a complete novice at relational DB design, and have only worked with single table DB's before.
This is a diagram of the tables:
So, 'Cars' contains each model of car, and the other 3 tables contains parts that the car can be fitted with.
So each car can have different parts from each of the three tables, and each part can be fitted to different cars from the parts table. In reality, there will be about 10 of these parts tables.
So, what would be the best way to link these together? do I need another table in the middle etc?
and what would I need to do with keys in terms of linking.
There is some inheritance in your parts. The common attributes seem to be:
part_number
price
and there are some specifics for your part types exhaust, software and intake.
There are two strategies:
- have three tables and one view over the three tables
- have one table with a parttype column and may be three views for the tables.
If you'd like to play with your design you might want to look at my companies website http://www.uml2php.com. UML2PHP will automatically convert your UML design to a database design and let you "play" with the result.
At:
http://service.bitplan.com/uml2phpexamples/carparts/
you'll find an example applicaton along your design. The menu does not allow you to access all tables via the menu yet.
via:
http://service.bitplan.com/uml2phpexamples/carparts/index.php?function=dbCheck
the table definitions are accessible:
mysql> describe CP01_car;
+-------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| car_id | varchar(255) | NO | PRI | NULL | |
| model | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| model_year | decimal(10,0) | YES | | NULL | |
+-------------+---------------+------+-----+---------+-------+
mysql> describe CP01_part;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
| car_car_id | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe cp01_exhaust;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| type | varchar(255) | YES | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_intake;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_software;
+-------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| power_gain | decimal(10,0) | YES | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+---------------+------+-----+---------+-------+
The above tables have been generated from the UML model and the result does not fit your needs yet.
Especially if you think of having 10 or more table likes this. The field car_car_id that links your parts to the car table should be available in all the tables. And according to the design proposal the base "table" for the parts should be a view like this:
mysql>
create view partview as
select oid,part_number,price from CP01_software
union select oid,part_number,price from CP01_exhaust
union select oid,part_number,price from CP01_intake;
of course the car_car_id column also needs to be selected;
Now you can edit every table by itself and the partview will show all parts together.
To be able to distinguish the parts types you might want to add another column "part_type".
I would do it like this. Instead of having three different tables for car parts:
table - cars table - parts (this would have only an id and a part
number and a type maybe)
table - part_connections (connectin cars with parts)
table - part_options (with all the options which arent in the
parts table like "power gain")
table - part_option_connections (which connects the parts to the
various part options)
In this way it is much easier to add new parts (because you won't need a new table) and its closer to being normalized as well.