Is many to many needed on this DB? - mysql

I am designing a DB for a possible PHP MySQL project I may be undertaking.
I am a complete novice at relational DB design, and have only worked with single table DB's before.
This is a diagram of the tables:
So, 'Cars' contains each model of car, and the other 3 tables contain parts that the car can be fitted with.
Each car can have different parts from each of the three tables, and each part from a parts table can be fitted to different cars. In reality, there will be about 10 of these parts tables.
So, what would be the best way to link these together? Do I need another table in the middle, etc.?
And what would I need to do with keys in terms of linking?

There is some inheritance in your parts. The common attributes seem to be:
part_number
price
and there are some specifics for your part types exhaust, software and intake.
There are two strategies:
- have three tables and one view over the three tables
- have one table with a parttype column and maybe three views for the tables.
If you'd like to play with your design you might want to look at my company's website http://www.uml2php.com. UML2PHP will automatically convert your UML design to a database design and let you "play" with the result.
At:
http://service.bitplan.com/uml2phpexamples/carparts/
you'll find an example application along the lines of your design. The menu does not allow you to access all tables yet.
via:
http://service.bitplan.com/uml2phpexamples/carparts/index.php?function=dbCheck
the table definitions are accessible:
mysql> describe CP01_car;
+-------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| car_id | varchar(255) | NO | PRI | NULL | |
| model | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| model_year | decimal(10,0) | YES | | NULL | |
+-------------+---------------+------+-----+---------+-------+
mysql> describe CP01_part;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
| car_car_id | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe cp01_exhaust;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| type | varchar(255) | YES | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_intake;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
mysql> describe CP01_software;
+-------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| oid | varchar(32) | NO | | NULL | |
| power_gain | decimal(10,0) | YES | | NULL | |
| part_number | varchar(255) | NO | PRI | NULL | |
| price | varchar(255) | YES | | NULL | |
+-------------+---------------+------+-----+---------+-------+
The above tables have been generated from the UML model and the result does not fit your needs yet,
especially if you think of having 10 or more tables like this. The field car_car_id that links your parts to the car table should be available in all the tables. And according to the design proposal the base "table" for the parts should be a view like this:
mysql> create view partview as
select oid,part_number,price from CP01_software
union select oid,part_number,price from CP01_exhaust
union select oid,part_number,price from CP01_intake;
Of course, the car_car_id column also needs to be selected.
Now you can edit every table by itself and the partview will show all parts together.
To be able to distinguish the part types you might want to add another column "part_type".
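As a rough sketch of the view with a "part_type" column, here is a small Python script. It uses an in-memory SQLite database as a stand-in for MySQL, assumes car_car_id has already been added to every parts table, and the sample rows are made up. It uses UNION ALL rather than UNION, since the part_type literal already keeps rows from different tables distinct.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names follow the answer,
# the sample rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CP01_software (part_number TEXT PRIMARY KEY, price TEXT,
                            car_car_id TEXT, power_gain INTEGER);
CREATE TABLE CP01_exhaust  (part_number TEXT PRIMARY KEY, price TEXT,
                            car_car_id TEXT, type TEXT);
CREATE TABLE CP01_intake   (part_number TEXT PRIMARY KEY, price TEXT,
                            car_car_id TEXT);

-- Each branch tags its rows with a literal part_type value.
CREATE VIEW partview AS
    SELECT 'software' AS part_type, part_number, price, car_car_id FROM CP01_software
    UNION ALL
    SELECT 'exhaust', part_number, price, car_car_id FROM CP01_exhaust
    UNION ALL
    SELECT 'intake', part_number, price, car_car_id FROM CP01_intake;

INSERT INTO CP01_software VALUES ('SW-1', '199', 'car-1', 20);
INSERT INTO CP01_exhaust  VALUES ('EX-1', '450', 'car-1', 'sport');
INSERT INTO CP01_intake   VALUES ('IN-1', '120', 'car-2');
""")

# All parts linked to one car, whatever table they live in.
rows = conn.execute(
    "SELECT part_type, part_number FROM partview "
    "WHERE car_car_id = ? ORDER BY part_number",
    ("car-1",),
).fetchall()
print(rows)
```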

I would do it like this. Instead of having three different tables for car parts:
table - cars
table - parts (this would have only an id, a part number and maybe a type)
table - part_connections (connecting cars with parts)
table - part_options (with all the options which aren't in the parts table, like "power gain")
table - part_option_connections (which connects the parts to the various part options)
In this way it is much easier to add new parts (because you won't need a new table), and it's closer to being normalized as well.
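A minimal sketch of the many-to-many part of this layout, again in Python with in-memory SQLite standing in for MySQL. The column names (car_id, part_id, part_type) and the sample rows are assumptions made for illustration, and the two part_options tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE cars  (car_id INTEGER PRIMARY KEY, model TEXT NOT NULL);
CREATE TABLE parts (part_id INTEGER PRIMARY KEY,
                    part_number TEXT NOT NULL UNIQUE,
                    part_type TEXT NOT NULL);
-- The "table in the middle": one row per (car, part) pairing.
CREATE TABLE part_connections (
    car_id  INTEGER NOT NULL REFERENCES cars(car_id),
    part_id INTEGER NOT NULL REFERENCES parts(part_id),
    PRIMARY KEY (car_id, part_id)
);
INSERT INTO cars  VALUES (1, 'Model A'), (2, 'Model B');
INSERT INTO parts VALUES (1, 'EX-1', 'exhaust'), (2, 'SW-1', 'software');
INSERT INTO part_connections VALUES (1, 1), (1, 2), (2, 1);
""")

# Which parts fit car 1?
parts_for_car = [r[0] for r in conn.execute(
    "SELECT p.part_number FROM parts p "
    "JOIN part_connections pc ON pc.part_id = p.part_id "
    "WHERE pc.car_id = ? ORDER BY p.part_number", (1,))]

# Which cars can take part EX-1?
cars_for_part = [r[0] for r in conn.execute(
    "SELECT c.model FROM cars c "
    "JOIN part_connections pc ON pc.car_id = c.car_id "
    "JOIN parts p ON p.part_id = pc.part_id "
    "WHERE p.part_number = ? ORDER BY c.model", ("EX-1",))]

print(parts_for_car, cars_for_part)
```

The composite primary key on part_connections keeps the same car/part pairing from being stored twice.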

Related

Learning SQL: Query for rows in diff tables based on if a column value is set

I am doing a side project to help me learn SQL.
I have setup 2 different tables:
computers
+------------------+-----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+-----------+------+-----+---------+-------+
| serial_number | char(25) | NO | PRI | NULL | |
| operating_system | char(10) | YES | | NULL | |
| purchase_year | int(4) | YES | | NULL | |
| assigned_to | char(100) | YES | | NULL | |
+------------------+-----------+------+-----+---------+-------+
employees
+------------+-----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+-----------+------+-----+---------+-------+
| email | char(100) | NO | PRI | NULL | |
| first_name | char(25) | NO | | NULL | |
| last_name | char(25) | NO | | NULL | |
| office | char(5) | NO | | NULL | |
| assigned | char(25) | YES | | NULL | |
+------------+-----------+------+-----+---------+-------+
Both have a few entries while I am testing, but in trying to write a search function based on the employee email, I have hit a snag with SQL queries. I'm poring over the documentation, but not understanding it well, and I can't find a good example of what I am trying to do to follow along with.
Here is what I am attempting to do with the query:
I want to grab the employee row matching the email address provided, and if the "employees.assigned" field is set (not null; I think EXISTS is used?), I also want to grab the "computers" row whose serial_number matches that column value.
I can do what I want with 2 separate queries, but I want to see if it is possible with only one to clean up code and make the query as fast as possible. Any further documentation you think is worthwhile for this project is very welcome as well!
For those people finding this on google:
What I found worked for my need:
SELECT * FROM employees LEFT JOIN computers ON employees.assigned=computers.serial_number WHERE email='email@example.com';
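To show why LEFT JOIN fits here, the sketch below runs the same kind of query against an in-memory SQLite database (a stand-in for MySQL) from Python. The tables are trimmed to a few columns and the two employees are made-up sample data. An employee without a computer still comes back, with NULL (None in Python) in the computer columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE computers (serial_number TEXT PRIMARY KEY, operating_system TEXT);
CREATE TABLE employees (email TEXT PRIMARY KEY, first_name TEXT NOT NULL,
                        assigned TEXT);
INSERT INTO computers VALUES ('SN-001', 'linux');
INSERT INTO employees VALUES ('ann@example.com', 'Ann', 'SN-001'),
                             ('bob@example.com', 'Bob', NULL);
""")

# LEFT JOIN keeps the employee row even when no computer matches.
QUERY = """
SELECT e.first_name, c.serial_number, c.operating_system
FROM employees AS e
LEFT JOIN computers AS c ON e.assigned = c.serial_number
WHERE e.email = ?
"""

with_computer = conn.execute(QUERY, ("ann@example.com",)).fetchone()
without_computer = conn.execute(QUERY, ("bob@example.com",)).fetchone()
print(with_computer, without_computer)
```

With an INNER JOIN the second lookup would return no row at all, which is why the two-query version can be collapsed into one only with LEFT JOIN.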

What is the most practical way to setup a database for multi-lang content?

I'm still learning MySQL and while working on a new project that requires multi-language content, I have stumbled upon a question about the most practical way to design a database that will support this functionality and at the same time be the most efficient database setup.
Table content_quote:
+--------------+-----------------------+------+-----+---------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------------------+-----------------------------+
| quote_id | int(11) unsigned | NO | PRI | NULL | auto_increment |
| url_slug | varchar(255) | NO | | NULL | |
| author_id | mediumint(8) unsigned | NO | | NULL | |
| quote | mediumtext | NO | | NULL | |
| category | varchar(15) | NO | | NULL | |
| likes | int(11) unsigned | NO | | 0 | |
| publish_time | datetime | NO | | 0000-00-00 00:00:00 | on update CURRENT_TIMESTAMP |
| locale | char(5) | NO | | NULL | |
+--------------+-----------------------+------+-----+---------------------+-----------------------------+
Now, I can just store a standard locale value like en-US in the locale field, but I have quite a few tables like that and I'm not sure which path is correct: either leave it like that, or create a locale table to store all the locales and change the current locale field to tinyint(2) with a foreign key pointing to the new table.
Example:
+-----------+------------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+-------------------+
| locale_id | tinyint(2) unsigned | NO | PRI | NULL | auto_increment |
| locale | char(50) | NO | | NULL | |
+-----------+------------------+------+-----+---------+-------------------+
More than the answer itself, I'm interested to know what are the advantages/disadvantages of both approaches.
Advantages and disadvantages of a separate locales table (they are reversed when no locales table is used):
Advantages
You can list all available locales, even ones that are not used yet. This lets you build a dropdown of available locales in a form.
It prevents typos, since there will be only one en_US value available.
Disadvantages
You have to JOIN the new table every time just to get a string like en_US.
Keep in mind that space will not be an issue. Don't try to make a decision based on 5 chars vs tinyint size.
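The trade-off can be tried out with a small Python sketch. It uses in-memory SQLite in place of MySQL, a cut-down content_quote table, and made-up rows; the table name locales is an assumption. It shows the extra JOIN needed to read the locale string and the foreign key rejecting an unknown locale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE locales (locale_id INTEGER PRIMARY KEY,
                      locale TEXT NOT NULL UNIQUE);
CREATE TABLE content_quote (
    quote_id  INTEGER PRIMARY KEY,
    quote     TEXT NOT NULL,
    locale_id INTEGER NOT NULL REFERENCES locales(locale_id)
);
INSERT INTO locales VALUES (1, 'en_US'), (2, 'de_DE');
INSERT INTO content_quote VALUES (1, 'Hello', 1);
""")

# Advantage: the full list is available even if de_DE has no content yet.
available = [r[0] for r in conn.execute(
    "SELECT locale FROM locales ORDER BY locale")]

# Disadvantage: reading the locale string now needs a JOIN.
row = conn.execute(
    "SELECT q.quote, l.locale FROM content_quote q "
    "JOIN locales l ON l.locale_id = q.locale_id").fetchone()

# Advantage: a locale id that is not in the lookup table is rejected.
try:
    conn.execute("INSERT INTO content_quote VALUES (2, 'Hallo', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(available, row, rejected)
```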

Having more than one value inserted into the same column

I have a webapp that I'm building. This webapp will take as input some products (cars, motos, boats, houses, etc...) and each product will have one or more photos associated with it. The id of each photo is generated by the uniqid() function of PHP.
My problem is: I can't seem to fit more than two id_photos into the same column
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | auto_increment |
| brand | enum('Alfa Romeo','Aston Martin','Audi') | NO | | NULL | |
| color | varchar(20) | NO | | NULL | |
| type | enum('gasoline','diesel','eletric') | YES | | NULL | |
| price | mediumint(8) unsigned | YES | | NULL | |
| mileage | mediumint(8) unsigned | YES | | NULL | |
| model | text | YES | | NULL | |
| year | year(4) | YES | | NULL | |
| id_photos | varchar(30) | YES | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
What I would like to happen is something like this: INSERT INTO cars(id_photos) values ('id_1st_photo', 'id_2nd_photo')
Ending up having something like this:
| 60 | Audi | Yellow | diesel | 252352 | 1234112 | R8 | 1990 | id_1st_photo id_2nd_photo |
Eventually I would have to grab those photos from the folders they are in which is something like this: /var/www/website/$login/photos/id_of_photo with the query select id_photos from cars where carid=$id.
You may find some data types that are not a good fit for the data the server will receive, but I'm one week into MySQL and I'll worry about data types later on.
First of all, I don't know if that is possible. If it's not, how can I design something that works like that?
I have found this question, which is much the same as mine, but I can't seem to implement something like it: add multiple values in one column
You can insert the concatenated values into a field. But it is not a good practice. You can create another table with foreign key having the id of the parent table.
You can easily adapt the approach in the linked question and even drop one of the tables it needs:
Your first table stays almost the same, but has the id_photos column removed:
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | auto_increment |
| brand | enum('Alfa Romeo','Aston Martin','Audi') | NO | | NULL | |
| color | varchar(20) | NO | | NULL | |
| type | enum('gasoline','diesel','eletric') | YES | | NULL | |
| price | mediumint(8) unsigned | YES | | NULL | |
| mileage | mediumint(8) unsigned | YES | | NULL | |
| model | text | YES | | NULL | |
| year | year(4) | YES | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
Then you'll add a second table to store the links to the photo ids:
+-----------+------------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------------------------------+------+-----+---------+----------------+
| carid | int(11) | NO | MUL | NULL | |
| id_photos | varchar(30) | NO | | NULL | |
+-----------+------------------------------------------+------+-----+---------+----------------+
Both tables are linked by the field carid (You should even make carid in the second table a foreign key pointing to the one in the first table).
Each id_photos then results in a new row in the second table.
To query the data you probably need a JOIN between both tables and maybe a GROUP BY to reduce the result to one row per carid again, but this depends on your other use cases.
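Here is a small Python sketch of that two-table layout, using in-memory SQLite as a stand-in for MySQL. The second table's name, car_photos, is an assumption (the answer does not name it), the cars table is trimmed to a few columns, and the rows are sample data taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE cars (carid INTEGER PRIMARY KEY, brand TEXT NOT NULL, model TEXT);
-- One row per photo; carid points back at the car it belongs to.
CREATE TABLE car_photos (
    carid     INTEGER NOT NULL REFERENCES cars(carid),
    id_photos TEXT NOT NULL,
    PRIMARY KEY (carid, id_photos)
);
INSERT INTO cars VALUES (60, 'Audi', 'R8');
""")

# Each photo id becomes its own row instead of sharing one column.
conn.executemany(
    "INSERT INTO car_photos (carid, id_photos) VALUES (?, ?)",
    [(60, "id_1st_photo"), (60, "id_2nd_photo")],
)

# All photo ids for one car.
photos = [r[0] for r in conn.execute(
    "SELECT id_photos FROM car_photos WHERE carid = ? ORDER BY id_photos",
    (60,))]

# JOIN plus GROUP BY: back to one row per car, here with a photo count.
per_car = conn.execute(
    "SELECT c.carid, c.brand, COUNT(p.id_photos) "
    "FROM cars c LEFT JOIN car_photos p ON p.carid = c.carid "
    "GROUP BY c.carid, c.brand").fetchall()

print(photos, per_car)
```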
You can insert a single string containing multiple photo names:
INSERT INTO cars(id_photos) values ('id_1st_photo, id_2nd_photo')
But then the database structure is not well normalized, so you will have problems when retrieving a single photo name.
I suggest you normalize the id_photos column into a separate table that references the master table, storing each photo in its own row.

Mysql paginate more than one table

I'm working on an e-commerce website with two MySQL tables: products and taxonomies. Products and taxonomies have a many-to-many relationship, and taxonomies form a tree: a parent_id field in the taxonomies table identifies the parent of each taxonomy.
When a user selects a taxonomy, I want to get all the products that belong to it and to all its offspring taxonomies. I did this by first finding all the offspring taxonomies of the selected taxonomy and then getting a paginated product result from there, but my site has 5000 taxonomies in total, and my solution makes the site slow like a dog. Any advice on how I could do this with better performance?
products table:
+-------------------+----------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------------------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| code | bigint(20) | NO | UNI | NULL | |
| SKU | varchar(255) | NO | | NULL | |
| name | varchar(100) | NO | | NULL | |
| description | varchar(2000) | NO | | NULL | |
| short_description | varchar(200) | NO | | NULL | |
| price | decimal(8,2) | NO | | 0.00 | |
| discounted_price | decimal(8,2) | NO | | 0.00 | |
| stock | smallint(5) unsigned | NO | | 0 | |
| sales | smallint(5) unsigned | NO | | 0 | |
| num_reviews | smallint(6) | NO | | 0 | |
| weight | decimal(5,2) | NO | | 0.00 | |
| overall_rating | decimal(3,2) | NO | | 5.00 | |
| activity_id | int(10) unsigned | YES | MUL | NULL | |
| created_at | timestamp | NO | | 0000-00-00 00:00:00 | |
| updated_at | timestamp | NO | | 0000-00-00 00:00:00 | |
+-------------------+----------------------+------+-----+---------------------+----------------+
taxonomies table:
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(100) | YES | UNI | NULL | |
| parent_id | int(10) unsigned | YES | MUL | NULL | |
| num_products | smallint(6) | NO | | 0 | |
+--------------+------------------+------+-----+---------+----------------+
product_taxonomy table:
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| product_id | int(10) unsigned | NO | MUL | NULL | |
| taxonomy_id | int(10) unsigned | NO | MUL | NULL | |
+-------------+------------------+------+-----+---------+----------------+
If the hierarchy is only a single level deep, you can use the following query:
SELECT * FROM `product_taxonomy`
INNER JOIN (SELECT * FROM `taxonomies` WHERE `id` = 100 OR `parent_id` = 100) `taxonomies`
ON `product_taxonomy`.`taxonomy_id` = `taxonomies`.`id`
LEFT JOIN `products` ON `product_taxonomy`.`product_id` = `products`.`id`
You can add limit, offset to the above query for pagination.
100 in the above query represents the taxonomy id requested by the user.
Apart from this I would suggest:
1) If possible, rename id in your products table to product_id, as it is referenced in your product_taxonomy table and, I presume, in other tables; similarly, rename id in taxonomies to taxonomy_id.
This way, the column names are the same on both sides when you join.
2) I hope product_taxonomy.product_id and product_taxonomy.taxonomy_id are indexed for faster querying.
Update:
What you mentioned in the comment below is a hierarchical data problem, which is not what a relational database is ideally intended for.
Solution 1
If you know for sure that you will have only 4 levels / generations, then you can do 4 join queries.
I can elaborate on this if you need to.
Solution 2
If you are not too deep into or committed to the architecture of this project, I would recommend restructuring it so that recursion is taken care of by the server-side scripting, i.e. you change your CMS/taxonomy management so that whenever you add, remove or modify a taxonomy, the script updates a table called taxonomy_childs with all possible offspring for a given category, so that you have flat data at your disposal when you need it.
Personally I would prefer this. I always like my database to match my business logic requirement.
I can elaborate on this if you need to.
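A rough sketch of Solution 2, in Python with in-memory SQLite standing in for MySQL. The layout of taxonomy_childs (taxonomy_id, child_id), the helper names, and the sample data are assumptions. Each taxonomy is also stored as its own "child" so that one join covers the selected taxonomy and all its offspring. The rebuild assumes parent_id links form a proper tree with no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxonomies (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE product_taxonomy (product_id INTEGER NOT NULL,
                               taxonomy_id INTEGER NOT NULL);
CREATE TABLE taxonomy_childs (
    taxonomy_id INTEGER NOT NULL,
    child_id    INTEGER NOT NULL,
    PRIMARY KEY (taxonomy_id, child_id)
);
INSERT INTO taxonomies VALUES (1, 'Clothing', NULL), (2, 'Shoes', 1),
                              (3, 'Boots', 2), (4, 'Toys', NULL);
INSERT INTO product_taxonomy VALUES (10, 1), (11, 2), (12, 3), (13, 4), (12, 2);
""")


def rebuild_taxonomy_childs(conn):
    """Recompute every (taxonomy, offspring) pair; run after any taxonomy change."""
    parents = dict(conn.execute("SELECT id, parent_id FROM taxonomies"))
    pairs = []
    for node in parents:
        ancestor = node  # start with the node itself, then walk up the tree
        while ancestor is not None:
            pairs.append((ancestor, node))
            ancestor = parents[ancestor]
    conn.execute("DELETE FROM taxonomy_childs")
    conn.executemany("INSERT INTO taxonomy_childs VALUES (?, ?)", pairs)


def products_page(conn, taxonomy_id, page, per_page):
    """Paginated product ids for a taxonomy and all of its offspring."""
    rows = conn.execute(
        "SELECT DISTINCT pt.product_id "
        "FROM taxonomy_childs tc "
        "JOIN product_taxonomy pt ON pt.taxonomy_id = tc.child_id "
        "WHERE tc.taxonomy_id = ? "
        "ORDER BY pt.product_id LIMIT ? OFFSET ?",
        (taxonomy_id, per_page, (page - 1) * per_page))
    return [r[0] for r in rows]


rebuild_taxonomy_childs(conn)
first_page = products_page(conn, 1, page=1, per_page=2)
second_page = products_page(conn, 1, page=2, per_page=2)
print(first_page, second_page)
```

The recursion cost is paid once when a taxonomy changes, so the read path is a flat, indexable join with LIMIT and OFFSET.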
Solution 3
As mentioned earlier, hierarchical data is not a strong point of a relational database. Having said that, you can implement something called the Nested Set Model.
Please read more at http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
You would need to add 3 columns to your taxonomy table: level_depth, lft, rht.
Please let me know which solution you would want me to elaborate on.
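For Solution 3, a minimal sketch of the read side of a nested set, in Python with in-memory SQLite as a stand-in for MySQL. The lft/rht values below are filled in by hand for a tiny made-up tree, the level_depth column is left out, and keeping lft/rht up to date on insert or delete is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxonomies (id INTEGER PRIMARY KEY, name TEXT,
                         lft INTEGER NOT NULL, rht INTEGER NOT NULL);
CREATE TABLE product_taxonomy (product_id INTEGER NOT NULL,
                               taxonomy_id INTEGER NOT NULL);
-- Clothing(1..6) contains Shoes(2..5), which contains Boots(3..4).
INSERT INTO taxonomies VALUES (1, 'Clothing', 1, 6), (2, 'Shoes', 2, 5),
                              (3, 'Boots', 3, 4), (4, 'Toys', 7, 8);
INSERT INTO product_taxonomy VALUES (10, 1), (11, 2), (12, 3), (13, 4);
""")

# A node is the selected taxonomy or one of its offspring exactly when
# its lft value lies inside the selected taxonomy's lft..rht interval.
QUERY = """
SELECT DISTINCT pt.product_id
FROM taxonomies AS parent
JOIN taxonomies AS node ON node.lft BETWEEN parent.lft AND parent.rht
JOIN product_taxonomy AS pt ON pt.taxonomy_id = node.id
WHERE parent.id = ?
ORDER BY pt.product_id
LIMIT ? OFFSET ?
"""

clothing = [r[0] for r in conn.execute(QUERY, (1, 10, 0))]
shoes = [r[0] for r in conn.execute(QUERY, (2, 10, 0))]
print(clothing, shoes)
```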

Database schema: Key/Value table or all keys in one record

I guess that this is somewhat of a philosophical question. I need to collect pathology results for a group of patients and store them in a database. In the past I have used a very simple table structure (simplified):
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| Updated | datetime | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Name | varchar(255) | NO | | NULL | |
| Value | varchar(255) | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
More often in schema design I see:
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Ph_Value | varchar(255) | NO | | NULL | |
| K_Value | varchar(255) | NO | | NULL | |
| Ca_Value | varchar(255) | NO | | NULL | |
| Ph_Value_updated | datetime | NO | | NULL | |
| K_Value_updated | datetime | NO | | NULL | |
| Ca_Value_updated | datetime | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
It seems to me that the first design is much more flexible, expandable etc. However, I do wonder about performance hits when the records run to the millions.
The issue with the second is that there may be a couple of hundred fields that need to be recorded on occasions.
I would be really interested to get comments / advice / guidance on this.
You are absolutely right, the first schema is a lot more flexible: you can add new keys on a live database without changing the schema. However, flexibility is usually bought with time and/or space. In this case, it's both: you need more space to store all keys for the same row because the ID is replicated N times, and the joins or orderings required to get the fields together take time.
There is no reason to pay for flexibility unless you need it. If most of your queries need most of the columns, the second schema is the most economical. However, if most of your queries ask for a single column, getting the flexibility may be worth spending the CPU time and the database space.
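To make the cost concrete, here is a small Python sketch against in-memory SQLite (standing in for MySQL). The table name results, the simplified primary key, and the sample values are all made up for illustration. Reading one value from the key/value layout is a plain lookup, while getting the fields together as one row per patient needs a grouped pivot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (
    ID        INTEGER PRIMARY KEY,
    Updated   TEXT NOT NULL,
    PatientId TEXT NOT NULL,
    Name      TEXT NOT NULL,
    Value     TEXT NOT NULL
);
INSERT INTO results (Updated, PatientId, Name, Value) VALUES
    ('2020-01-01 08:00:00', 'P1', 'Ph', '7.40'),
    ('2020-01-01 08:00:00', 'P1', 'K',  '4.1'),
    ('2020-01-01 08:00:00', 'P1', 'Ca', '2.3'),
    ('2020-01-01 09:00:00', 'P2', 'Ph', '7.35');
""")

# Single key: cheap and simple in the key/value layout.
single = conn.execute(
    "SELECT Value FROM results WHERE PatientId = ? AND Name = ?",
    ("P1", "K")).fetchone()[0]

# All keys as columns: one CASE per key plus a GROUP BY, which is the
# extra work the wide table avoids.
wide = conn.execute("""
SELECT PatientId,
       MAX(CASE WHEN Name = 'Ph' THEN Value END) AS Ph_Value,
       MAX(CASE WHEN Name = 'K'  THEN Value END) AS K_Value,
       MAX(CASE WHEN Name = 'Ca' THEN Value END) AS Ca_Value
FROM results
GROUP BY PatientId
ORDER BY PatientId
""").fetchall()

print(single, wide)
```

With a couple of hundred keys the pivot query grows by one CASE expression per key, which is worth weighing against how often the full row is really needed.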
In my opinion, if the set of name/value pairs won't change much, the second option is much better in terms of space and number of rows.
You can also optimize the first schema by putting the names in another table and storing just a name_id instead of repeating the same name several times.
Another option is to have a patient table plus one table per value, each containing patient_id and value, where the table name is the name of that value.