Database Design - Is redundancy better than adding more relationship?

Database Design - Is redundancy better than adding more relationship? - mysql

Hi I wonder which one of these database designs is better:
I have Shopping db, it has two tables: Categories and Brands. Both of those tables have the columns: name, description, and image_url.
So is it okay to keep the duplicate column? Or should I create a table that stores those information?
To visualize it better, here's a chart
The redundant one
| CATEGORIES | | BRANDS |
| - name | | - name |
| - description | | - description |
| - image_url | | - image_url |
or adding a new table?
| CATEGORIES | | INFORMATIONS | | BRANDS |
| |--->| - name |<-----| |
| - info_id | | - description | | - info_id |
| - image_url |
Thanks

In this case it's really NOT the same information. It's category_name, category_description, brand_name and brand_description.
Creating a table to keep generic information about many things is a bad idea.
When it comes to the discussion about (de)normalisation it's generally considered a better design to normalise data and I would start with such a design. As your data evolves and grows it might be necessary to de-normalise some tables as the data retrieval can be faster at the cost of storage space and risk of loosing data consistency.

Related

Good practice on saving properties in relational database

Let's assume I have two types of users in my system.
Those who can program and those who cannot.
I need to save both types of users in the same table.
The users who can program have lots properties different to those who can't, defined in another table.
What's either advantages of the following solutions and are there any better solutions?
Solution 1
One table containing a column with the correspondig property.
Table `users`:
----------------------------
| id | name | can_program |
----------------------------
| 1 | Karl | 1 |
| 2 | Ally | 0 |
| 3 | Blake | 1 |
----------------------------
Solution 2
Two tables related to each other via primary key and foreign key.
One of the tables containing the users and the other table only containing the id of those who can program.
Table users:
--------------
| id | name |
--------------
| 1 | Karl |
| 2 | Ally |
| 3 | Blake |
--------------
Table can_program:
---------------------
| id | can_program |
---------------------
| 1 | 1 |
| 3 | 1 |
---------------------

You have a 1-1 relationship between a user and the property that allows him to program. I would recommend storing this information as an additional column in table users. Creating an additional table will basically results in an additional storage structure with a 1-1 relationship to the original table.

Why not just have some kind of programmer_profiles table that the users table has a one-to-many relationship with?
If there's an associated record in programmer_profiles then they can program, otherwise it's presumed they can't.
This is more flexible since you can add in other x_profiles tables that provide different properties even if some of these have the same names.

MySQL - At what point should more than one table be used?

Edit for future viewers: Aside from the accepted answer which helped me I found some really good info here .
I've got a database with a single table for displaying inventory on a website (RVs). It stores the typical info: year, make, model, etc. I originally made it with 6 extra columns for storing "special features", but I don't like having such a hard limit on what options can be listed. Since I've never messed with more than a single table my gut instinct was to just add 24 or so more columns to cover everything, but something in my head told me that there might be a better way. So when do I decide N columns is too many? The data in these columns will commonly not be unique.
(Sorry for crappy diagram)
Current table design:
-----------------------------------------------------------------------
| id | year | make | model | price | ft_1 | ft_2 | ft_3 | ft_4 | ft_5 |
-----------------------------------------------------------------------
| | | | | | | | | | |
-----------------------------------------------------------------------
Possible better design:
table #1
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
table #2
---------------------------------------------
| unique_id(?) | feature | unit_ref |
---------------------------------------------
| 0 | "Diesel Pusher" | 2,6,14 |
---------------------------------------------
I feel like a bonus of the second table might be that I could more easily propagate a dropdown containing all the previously entered features to speed up adding new units to inventory.
Is this the right way to go about it, or should I just add more columns and be content?
Thanks.

Believe it or not, your best option would likely be to add a third table.
Since each record in your rvs table can be linked to multiple rows in the features table, and each feature can correspond to multiple rvs, you have a many-to-many relationship which is inherently difficult to maintain in a relational dbms. By adding a third "intersection" table you convert it to a one-to-many-to-one relationship which can be enforced declaratively by the dbms.
Your table structure would then become something like
rvs
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
features
--------------------------
| id | feature |
--------------------------
| 1192 | "Diesel Pusher" |
--------------------------
rv_features
----------------------
| rv_id | feature_id |
----------------------
| | |
----------------------
How do you make use of this? Suppose you want to record the fact that the 2016 Travelmore CampMaster has a 25kW diesel generator. You would first add a record to rvs like
--------------------------------------------------
| id | year | make | model | price |
--------------------------------------------------
| 0231 | 2016 | Travelmore | CampMaster | 750000 |
| 2101 | 2016 | Travelmore | Domestant | 650000 |
--------------------------------------------------
(Note the value in the id column is entirely arbitrary; its sole purpose is to serve as the primary key which uniquely identifies the record. It can encode meaningful information, but it must be something that will not change throughout the life of the record it identifies.)
You then add (or already have) the generator in the features table:
--------------------------------
| id | feature |
--------------------------------
| 1192 | Diesel Pusher 450hp |
| 3209 | diesel generator 25kW |
--------------------------------
Finally, you associate the rv to the feature with a record in rv_features:
----------------------
| rv_id | feature_id |
----------------------
| 0231 | 3209 |
| 0231 | 1192 |
| 2101 | 3209 |
----------------------
(I've added a few other records to each table for context.)
Now, to retrieve the features of the 2016 CampMaster, you use the following SQL query:
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rv.feature_id = f.id
AND r.id = '2031';
to get
----------------------------------------------------------
| year | make | model | feature |
----------------------------------------------------------
| 2016 | Travelmore | CampMaster | diesel generator 25kW |
| 2016 | Travelmore | CampMaster | Diesel Pusher 450hp |
----------------------------------------------------------
To see the rvs with a 25kW generator, change the query to
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rv.feature_id = f.id
AND f.id = '3209';
Sherantha's link to A Quick-Start Tutorial on Relational Database Design actually looks like a good intro to table design and normalization; you might find it useful.

There is a thing calles "third normal form" it says that everything without the unique ids shuld be unique. This means you need to make a table for year, a table for make a table for models etc and a table where you can combine all these ids to one connected dataset.
But this is not always practical, io think the best way to take this is something in between, like tables for entrys that repeat very often, but there dont need to be an extra table for price with unique ids, that would be overkill i think.

Based upon your scenario, if you believe no. of features columns remain same then no need for second table. And in case if there any possibility that features can be increased at any time in future then you should break up your table into two. (RVS & Features). Then create a third table that identify RVS & features as it seems there is many-to-many relationship. So I suggest you to use three tables.

I think it is better for you to be more familiar with relational database design. This is a short but great article I have found earlier.

Sum query for MySQL where field contain certain values

I need help with a Query, i have a table like this:
| ID | codehwos |
| --- | ----------- |
| 1 | 16,17,15,26 |
| 2 | 15,32,12,23 |
| 3 | 53,15,21,26 |
I need an outpout like this:
| codehwos | number_of_this_code |
| -------- | ---------------------- |
| 15 | 3 |
| 17 | 1 |
| 26 | 2 |
I want to sum all the time a code is used in a row.
Can anyone make a query for doing it for all the code in one time?
Thanks

You have a very poor data format. You should not store lists in strings and never store lists of numbers in strings. SQL has a great data structure for storing lists. Hint: it is called a "table" not a "string".
That said, sometimes one is stuck with other people's really poor design choices. We wouldn't make them ourselves, but we still need to get something done. Assuming you have a list of codes, you can do what you want with:
select c.code, count(*)
from codes c join
table t
on find_in_set(c.code, t.codehwos) > 0
group by c.code;
If you have any influence over the data structure, then advocate for a junction table, the right way to store this data in a relational database.

Relationship between two tables and their primary keys

I am little confused if I should go in this way
I have tables like
| account | | Acc_registration_Info |
| AccID_PK | | AccRegInfo_PK |
| | | |
| | | |
Should I connect them between both primary keys? Also how to secure them in case of mismatching IDs?
I am trying to follow by Advanture Works DB structure, but this is little hard to understand, some of AW DB tables are splitted as hell (like users and their passwords in different tables).
I don't really feel confident about making so much tables and relate them one-to-one by PKs... My other hard decision is to connect Shop table with details shop informations table by PK, etc. etc.
On the other hand making too much non-primary columns to connect other tables doesn't look awesome

i think you have to make one primary key of a table the foreign key of the ather, that's how it work:
| account | | Acc_registration_Info |
| AccID_PK | | AccRegInfo_PK |
| #AccRegInfo_FK | | |
| | | |
like that if you want to know the reg info for an account you have just to pick the #AccRegInfo_FK of that account (in account table) and compart it to AccRegInfo_PK (in reg info table) and you ll get what you wnat , and of course what is called in relation databases joint

CRM Releationships MySQL

Our Company is developing a CRM and we came now to the point where we have to decide how we want to handle the releationships. This is an important point because there are going to be tons of them. And changing the structure later would be simply not cool..
I know 3 ways how we could do it:
One releationship table:
The way i would do this is creating one table holding all the releationships.
Table: releationships
+----+-------------+-----------+--------------+------------+
| id | record_type | record_id | belongs_type | belongs_id |
+----+-------------+-----------+--------------+------------+
| 1 | person | 42 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 2 | person | 43 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 3 | note | 23 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 4 | attachment | 13 | company | 12 |
+----+-------------+-----------+--------------+------------+
Multiple releationship tables:
I think this is the way how it for example the SugarCRM does.
Table: company_realationships
+----+-----------+------------+--------+
| id | record_id | has_type | has_id |
+----+-----------+------------+--------+
| 1 | 12 | person | 42 |
+----+-----------+------------+--------+
| 2 | 12 | person | 43 |
+----+-----------+------------+--------+
| 3 | 12 | note | 23 |
+----+-----------+------------+--------+
| 2 | 12 | attachment | 13 |
+----+-----------+------------+--------+
All in the record table:
Table: person
+----+-----------+------------+
| id | name | company_id |
+----+-----------+------------+
| 42 | luke | 12 |
+----+-----------+------------+
| 43 | other guy | 12 |
+----+-----------+------------+
ect.
So my Question is wich is the Best way of handling lots of releationships?
Are there other ways to do it?
What are disadvantages / advantages?
Is there a special way how hightraffic sides handle their releationships?
Thanks for your help guys :)

So my Question is wich is the Best way of handling lots of releationships?
The third one or the variation of it (see below).
Every "M:N" relationship should be represented by its own junction table. OTOH, a "1:N" relationship doesn't need additional table - just a proper foreign key in the table on the side of the "N".
If I understand your description correctly, the third option models a 1:N relationship between company and person. If by any chance you wanted to model a M:N relationship between them, you'd have a junction table: company_person ( company_id, person_id, PK (company_id, person_id) ).
Are there other ways to do it?
Sometimes, inheritance (aka. category, subtype, generalization hierarchy etc.) can be used to lower the number of possible "relatable" combinations. In a nutshell, make a relationship to a parent, then every child inherited from that parent is automatically involved in that relationship.
For an example, take a look at this post.
What are disadvantages / advantages?
Enforcing constraints (including FKs) declaratively is better (less prone to errors and probably more performant) than enforcing them through triggers, which is again better than enforcing them in the client code.
Choose a design that better adheres to that principle. For example, your options 1 and 2 don't allow the DBMS to enforce FKs declaratively.
Is there a special way how hightraffic sides handle their releationships?
Good logical design followed by good physical implementation is the only solid basis for good performance. It's hard to "bolt-on" the performance on top of a bad design.
Perhaps, you'd like to take a look at:
ERwin Methods Guide
Use The Index, Luke!
And when it comes to performance, don't guess! Measure on realistic amounts of data.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008