Two tables with same columns or one table with additional column? - mysql

Say I have two tables (Apples and Oranges) with the same columns and just a different table name. Would there be any advantages/disadvantages to turning this into one table (let's say it's called Fruit) with an additional column 'type' which would then store a value of either Apple or Orange?
Edit to clarify:
CREATE TABLE apples
(
    id int,
    weight int,
    variety varchar(255)
);
CREATE TABLE oranges
(
    id int,
    weight int,
    variety varchar(255)
);
OR
CREATE TABLE fruit
(
    id int,
    weight int,
    variety varchar(255),
    type ENUM('apple', 'orange')
);

Depends on constraints:
Do you have foreign keys or CHECKs on apples that don't exist on oranges (or vice-versa)?
Do you need to keep keys unique across both tables (so no apple can have the same ID as some orange)?
If the answers to these two questions are "yes" and "no", keep the tables separate (so constraints can be made table-specific [1]).
If the answers are "no" and "yes", merge them together (so you can create a key that spans both).
If the answers are "yes" and "yes", consider emulating inheritance [2].
[1] Lookup data is a typical example of tables that look similar, yet must be kept separate so FKs can be kept separate.
[2] Specifically, this is the "all classes in separate tables" strategy for representing inheritance (aka category, subclassing, subtyping, generalization hierarchy, etc.). You might want to take a look at this post for more info.
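To make the "yes" and "yes" case concrete, here is a minimal sketch of the "all classes in separate tables" layout. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run as-is; the table names fruit, apple and orange and the helper insert_orphan_orange are assumptions for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    -- Parent table: owns the key space shared by every kind of fruit.
    CREATE TABLE fruit (
        id      INTEGER PRIMARY KEY,
        weight  INTEGER,
        variety TEXT
    );
    -- Child tables: keyed by the parent's id.
    -- Table-specific FKs and CHECKs would be declared here.
    CREATE TABLE apple (
        id INTEGER PRIMARY KEY REFERENCES fruit (id)
    );
    CREATE TABLE orange (
        id INTEGER PRIMARY KEY REFERENCES fruit (id)
    );
""")

conn.execute("INSERT INTO fruit (id, weight, variety) VALUES (1, 150, 'Gala')")
conn.execute("INSERT INTO apple (id) VALUES (1)")


def insert_orphan_orange(fruit_id):
    """Try to add an orange that has no parent fruit row.

    Returns True if the database rejected the insert.
    """
    try:
        conn.execute("INSERT INTO orange (id) VALUES (?)", (fruit_id,))
    except sqlite3.IntegrityError:
        return True
    return False
```

Every apple and orange draws its id from fruit, so keys are unique across both, while each child table can still carry its own constraints. Note that this sketch does not stop the same fruit id from appearing in both apple and orange; enforcing that needs extra work, such as a type discriminator column.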

If there really are no further business rules (and resultant underlying data requirements) that separate the two sub-types, then I would use one table with an FK to a FruitType lookup table.
You don't mention what you will be using to access the schema, which may affect which approach you take (e.g. if you are using a platform which provides an ORM to your database then this may be worth noting).

The advantage would be normalization. Your tables would then be in 2NF (second normal form).
Your fruit type would be a foreign key to a table with those fruits like so:
CREATE TABLE fruit_type (type varchar(15) PRIMARY KEY);
CREATE TABLE fruits (id int, weight int, variety varchar(255), type varchar(15), FOREIGN KEY (type) REFERENCES fruit_type (type));
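A runnable sketch of the lookup-table approach, again using Python's sqlite3 in place of MySQL. The sample rows and the helpers varieties_of and add_fruit are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE fruit_type (type TEXT PRIMARY KEY);
    CREATE TABLE fruits (
        id      INTEGER PRIMARY KEY,
        weight  INTEGER,
        variety TEXT,
        type    TEXT NOT NULL REFERENCES fruit_type (type)
    );
    INSERT INTO fruit_type (type) VALUES ('apple'), ('orange');
    INSERT INTO fruits (weight, variety, type) VALUES
        (150, 'Gala',  'apple'),
        (170, 'Fuji',  'apple'),
        (200, 'Navel', 'orange');
""")


def varieties_of(fruit_type):
    """Return the varieties stored for one fruit type, sorted by name."""
    rows = conn.execute(
        "SELECT variety FROM fruits WHERE type = ? ORDER BY variety",
        (fruit_type,),
    )
    return [row[0] for row in rows]


def add_fruit(weight, variety, fruit_type):
    """Insert a fruit; return False if its type is not in the lookup table."""
    try:
        conn.execute(
            "INSERT INTO fruits (weight, variety, type) VALUES (?, ?, ?)",
            (weight, variety, fruit_type),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

Adding a new kind of fruit is then a single INSERT into fruit_type rather than a schema change, which is the main difference from the ENUM column in the question.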

Related

Relational database and PHP: one-to-many relations with multiple one-tables

Let’s assume there are some rows in a table cars, and each of these rows has an owner. If this owner were always a person (conveniently situated in a table persons), this would be your standard one-to-many relation.
However, what if the owner could not only be a person, but also a company (in a table companies)? How would this relationship be modeled and how would it be handled in PHP?
My first idea was to create a column person and a column company and check that one of them always stays NULL, while the other is filled – however, that seems somewhat inelegant and becomes impractical once there is a higher number of possible related tables.
My current assumption would be to not simply create the foreign key as an integer column person in the table, but to create a further table called tables, which gives IDs to the tables, and then split the foreign key into two integer columns: owner_table, containing the ID of the table (e.g. 0 for persons and 1 for companies), and owner_id, containing the owner ID.
Is this a viable and practical solution or is there some standard design pattern regarding such issues? Is there a name for this type of problem? And are there any PHP frameworks supporting such relations?
EDIT: Found a solution: Such structures are called polymorphic relations, and Laravel supports them.
There are multiple ways to do it.
You can go with two nullable foreign keys: one referencing company and the other referencing person. Then you can add a check constraint that ensures exactly one of them is null. With PostgreSQL:
CREATE TABLE car (
    <your car fields>,
    company_id INT REFERENCES company,
    person_id INT REFERENCES person,
    CHECK ((company_id IS NULL AND person_id IS NOT NULL)
        OR (company_id IS NOT NULL AND person_id IS NULL))
);
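Here is a small runnable sketch of the two-nullable-foreign-keys idea, using Python's sqlite3 rather than PostgreSQL (SQLite also supports CHECK constraints). The plate column and the try_insert_car helper are hypothetical, added only so there is something to insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE car (
        id         INTEGER PRIMARY KEY,
        plate      TEXT NOT NULL,  -- hypothetical car field
        person_id  INTEGER REFERENCES person (id),
        company_id INTEGER REFERENCES company (id),
        -- Exactly one of the two owner columns must be filled in.
        CHECK ((company_id IS NULL AND person_id IS NOT NULL)
            OR (company_id IS NOT NULL AND person_id IS NULL))
    );
    INSERT INTO person  (id, name) VALUES (1, 'Alice');
    INSERT INTO company (id, name) VALUES (1, 'Acme');
""")


def try_insert_car(plate, person_id, company_id):
    """Insert a car; return False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO car (plate, person_id, company_id) VALUES (?, ?, ?)",
            (plate, person_id, company_id),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

A car owned by a person or by a company is accepted; a car with no owner, or with both, is rejected by the CHECK. The downside raised in the question remains: each additional owner type means another column and a longer CHECK.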
Or you can use table inheritance (beware its limitations: in PostgreSQL, a foreign key that references the parent table does not see rows stored in the child tables):
CREATE TABLE car_owner (
    car_owner_id SERIAL PRIMARY KEY
);
CREATE TABLE company (
    <company fields>
) INHERITS (car_owner);
CREATE TABLE person (
    <person fields>
) INHERITS (car_owner);
CREATE TABLE car (
    <car fields>,
    car_owner_id INT REFERENCES car_owner
);

Database design confusion

I'm developing a classifieds site. And I'm totally stuck at database design level.
An advertisement can only be in one category.
In my database I have a table called "ads", which has the columns common to all advertisements.
CREATE TABLE Ads (
AdID int not null,
AdDate datetime not null,
AdCategory int not null,
AdHeading varchar(255) not null,
AdText varchar(255) not null,
etc...
);
I also have a lot of categories.
Ads that are posted in "cars" category, for example, have additional columns like make, model, color, etc. Ads, posted in "housing" have columns like housing type, sqft. etc...
I did something like:
CREATE TABLE Cars (
AdID int not null,
CarMake varchar (255) not null,
CarModel varchar(255) not null,
...
);
CREATE TABLE Housing (
AdID int not null,
HousingType varchar (255) not null
...
);
AdId in those is a foreign key to Ads.
But when I need to retrieve information from Ads, I have to look up all those additional tables and check if AdId in Ads equals to AdId in those tables.
For every category I need a new table. I'm gonna end up with like 15 tables or so.
I had an idea to have boolean columns in the Ads table like is_Cars, is_Housing, etc., but having 15 columns, of which 14 would be NULL, seems horrible.
Is there any better way to design this database? I need my database to be in a 3rd normal form, this is the most important requirement.
Don't worry too much - it's a well known dilemma, there are no 'silver bullets' and all solutions have some trade-offs. Your solution sounds good to me, and is commonly used in the industry. On the down side it has JOINS as you mentioned (which is a well-known trade-off of normalization anyway), and also each new product type requires a new TABLE. On the up side the table structure precisely reflects your business logic, it's readable and efficient in storage.
Your other suggestion, as far as I understand, was a single table where each row has a "type" indication - car, house etc (btw no need for multiple columns such as 'is_car', 'is_house' - it's simpler to have a single column 'type', e.g. type=1 indicates car, type=2 indicates house etc). Then multiple columns where some of them are unused for some product types.
Well, here the advantage is capability to add new types dynamically (even user-defined types) without changing the database schema. Also no 'JOINs'. On the down side you'll be storing & retrieving lots of 'null' cells, and also the schema would be less descriptive: e.g. it's harder to put a constraint "carModel column is not nullable", because it is nullable for houses (you can use triggers, but it's less readable).
Personally I prefer the 1st solution (of course depending on the use case, but the 1st solution is my first instinct). And I can use it with some peace of mind after considering the trade-offs, e.g. understanding that I'm tolerating those JOINs as payment for a readable & compact schema.
One, you are confusing categories and product specifications.
Two, you need to read up on Table Inheritance.
If you don't mind nulls, use Single Table Inheritance. All "categories" (cars, houses, ...) go in one table and have a "type" column.
If you don't like nulls, use Class Table Inheritance. Make a master table with the primary keys that you point your category foreign key at. Make child tables for each type (cars, houses, ...) whose primary key is also a foreign key to the master table. This is easier with an ORM like Hibernate.
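As a rough sketch of Class Table Inheritance for the classifieds case, using Python's sqlite3 as a stand-in for the real database: the column set is trimmed down, AdCategory is stored as text for readability, and the sample rows and the load_ad helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    -- Master table: columns common to every ad.
    CREATE TABLE Ads (
        AdID       INTEGER PRIMARY KEY,
        AdCategory TEXT NOT NULL,
        AdHeading  TEXT NOT NULL
    );
    -- Child tables: primary key is also a foreign key to the master table.
    CREATE TABLE Cars (
        AdID     INTEGER PRIMARY KEY REFERENCES Ads (AdID),
        CarMake  TEXT NOT NULL,
        CarModel TEXT NOT NULL
    );
    CREATE TABLE Housing (
        AdID        INTEGER PRIMARY KEY REFERENCES Ads (AdID),
        HousingType TEXT NOT NULL
    );
    INSERT INTO Ads (AdID, AdCategory, AdHeading) VALUES
        (1, 'cars',    'Red hatchback'),
        (2, 'housing', 'Sunny flat');
    INSERT INTO Cars (AdID, CarMake, CarModel) VALUES (1, 'Toyota', 'Yaris');
    INSERT INTO Housing (AdID, HousingType) VALUES (2, 'Apartment');
""")


def load_ad(ad_id):
    """Fetch one ad with whatever category-specific columns it has."""
    return conn.execute(
        """
        SELECT a.AdID, a.AdHeading, c.CarMake, c.CarModel, h.HousingType
        FROM Ads a
        LEFT JOIN Cars    c ON c.AdID = a.AdID
        LEFT JOIN Housing h ON h.AdID = a.AdID
        WHERE a.AdID = ?
        """,
        (ad_id,),
    ).fetchone()
```

The NOT NULL constraints live on the child tables, so a car ad must have a make and model while a housing ad never stores them. The LEFT JOINs return NULL for the columns of categories the ad does not belong to; if AdCategory is already known, joining only the matching child table is enough.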

Storing variable values in a database efficiently

I am currently dealing with a data structure similar to the one linked here:
http://sqlfiddle.com/#!2/2ad8f/1
There will be a field (fruits in this case) that can contain highly variable options - quantity, colour, type, etc. I am trying to work out an efficient way of storing this data and using it programmatically in a frontend.
I have thought about creating new fields (e.g. a field for quantity, a field for colour, etc), however the data can be highly variable and I will be dealing with many, many rows. Potentially 1-2 million. I don't want to create a "texture" field for example that is only used for 100/1,000,000 rows.
The "fruits" here would never be ordered by or referenced by the database storage engine.
My best idea so far is to store a JSON object as a string (see the second insert in link), however is there a more efficient method?
If you want to place all your attributes into one text container, you may as well be using a text file instead of a relational database. The database will have a lot of overhead that you are simply not using so why have it?
If you want this in a relational form, then let's go through some simple modeling.
We have different kinds of fruit. These fruit can have different attributes, and even different kinds of attributes. Here is one simple way:
create table Fruit(
ID int auto_increment primary key,
Name varchar( 20 ) not null, -- Apple, Orange, etc.
Type varchar( 20 ), -- Macintosh, Granny Smith, Navel, etc.
Size char( 1 ), -- S, M, L
Qty int not null
-- other data such as price, shelflife, whatever
);
So now we create a table for each type of disparate attribute:
create table Attr(
ID int auto_increment primary key,
Type varchar( 20 ), -- Color, Texture, Taste, etc.
Value varchar( 10 ) -- Red, Green, Juicy, Sweet, Sour, etc.
);
Each fruit can have several attributes and each attribute may apply to several kinds of fruit, so you need a many-to-many cross table between them:
create table FruitAttr(
FruitID int,
AttrID int,
primary key( FruitID, AttrID )
);
with FruitID a foreign key to Fruit and AttrID a foreign key to Attr. Now we can create a Basket table which will define each individual basket.
create table Basket(
ID int auto_increment primary key,
Name varchar( 20 ) not null, -- Graduation, Funeral, Birthday, etc.
Price decimal (19,4)
-- other basket-specific attributes
);
A basket is made up of several selections of fruit and each fruit may appear in several types of basket. So there is the same relationship between Basket and Fruit as between Fruit and Attr: many-to-many. As we've already modeled one of those tables, I'll leave that to you.
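One possible shape for that remaining cross table, sketched with Python's sqlite3 so it can be run directly. The name BasketFruit, its Qty column, the trimmed-down Fruit and Basket tables and the basket_contents helper are all assumptions for illustration, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Fruit  (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Basket (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- Many-to-many cross table between Basket and Fruit.
    CREATE TABLE BasketFruit (
        BasketID INTEGER NOT NULL REFERENCES Basket (ID),
        FruitID  INTEGER NOT NULL REFERENCES Fruit (ID),
        Qty      INTEGER NOT NULL DEFAULT 1,  -- hypothetical per-basket count
        PRIMARY KEY (BasketID, FruitID)
    );
    INSERT INTO Fruit  (ID, Name) VALUES (1, 'Apple'), (2, 'Orange'), (3, 'Pear');
    INSERT INTO Basket (ID, Name) VALUES (1, 'Birthday'), (2, 'Graduation');
    INSERT INTO BasketFruit (BasketID, FruitID, Qty) VALUES
        (1, 1, 3), (1, 2, 2), (2, 2, 4);
""")


def basket_contents(basket_name):
    """Return (fruit name, quantity) pairs for one basket, sorted by fruit."""
    return conn.execute(
        """
        SELECT f.Name, bf.Qty
        FROM Basket b
        JOIN BasketFruit bf ON bf.BasketID = b.ID
        JOIN Fruit f        ON f.ID = bf.FruitID
        WHERE b.Name = ?
        ORDER BY f.Name
        """,
        (basket_name,),
    ).fetchall()
```

Changing the makeup of a basket is then an INSERT, UPDATE or DELETE on BasketFruit, with no string parsing involved.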
There are enhancements and changes that may be made to tailor these tables closer to your specific uses, but we now have a workable solution.
So very quickly we have gone from one table to five tables. That may seem like we've complicated everything but if you have to work with them, you will find we have made our (meaning your) life a whole lot easier, especially when you add new types of baskets or fruit, change the makeup of a basket, substitute one fruit (severe core rot suddenly makes Granny Smiths unavailable), or any number of ways you will need to change your data.
After all, it is a relational database and relations are established between tables, not between substrings within strings. So the DML and queries to work with these relations will be so much easier than trying to manipulate text strings.

MySQL database with user created tables with custom column numbers

I have a person table and I want users to be able to create custom many to many relations of information with them. Educations, residences, employments, languages, and so on. These might require different number of columns. E.g.
Person_languages(person_fk,language_fk)
Person_Educations(person,institution,degree,field,start,end)
I thought of something like this. (Not correct sql)
create Tables(
table_id PRIMARY_KEY,
table_name_fk FOREIGN_KEY(Table_name),
person_fk FOREIGN_KEY(Person),
table_description TEXT
)
Table holding all custom table name and descriptions
create Table_columns(
column_id PRIMARY_KEY,
table_fk FOREIGN_KEY(Tables),
column_name_fk FOREIGN_KEY(Columns),
rank_column INT,
)
Table holding the columns in each custom table and the order they are to be displayed in.
create Table_rows(
row_id PRIMARY_KEY,
table_fk FOREIGN_KEY(Tables),
row_nr INT,
)
Table holding the rows of each custom table.
create Table_cells(
cell_id PRIMARY_KEY,
table_fk FOREIGN_KEY(Tables),
row_fk FOREIGN_KEY(Table_rows),
column_fk FOREIGN_KEY(Table_columns),
cell_content_type_fk FOREIGN_KEY(Content_types),
cell_object_id INT,
)
Table holding cell info.
If any custom table starts to be used with most persons and becomes large, the idea was to maybe then extract it into a separate hard-coded many-to-many table just for that table.
Is this a stupid idea? Is there a better way to do this?
I strongly advise against such a design - you are on the road to an extremely fragmented and hard to read design.
IIUC your base problem is that you have a common set of (universal) properties for a person, which may be extended by other (non-universal) properties.
I'd tackle this by keeping the universal properties in the person table and creating two more tables: property_types, which translates a property name into an INT primary key, and person_properties, which combines person PK, property PK and value.
If you set the PK of this table to be (person,property) you get the best possible index locality for the person, which makes requesting all properties for a person a very fast query.
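A minimal sketch of that layout, using Python's sqlite3 in place of MySQL. The column names, the sample data and the properties_of helper are assumptions; only the table names property_types and person_properties and the (person, property) primary key come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL              -- universal properties live here
    );
    CREATE TABLE property_types (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE       -- e.g. 'language', 'degree'
    );
    CREATE TABLE person_properties (
        person   INTEGER NOT NULL REFERENCES person (id),
        property INTEGER NOT NULL REFERENCES property_types (id),
        value    TEXT NOT NULL,
        PRIMARY KEY (person, property)  -- keeps one person's rows together
    );
    INSERT INTO person (id, name) VALUES (1, 'Kari'), (2, 'Sam');
    INSERT INTO property_types (id, name) VALUES (1, 'language'), (2, 'degree');
    INSERT INTO person_properties (person, property, value) VALUES
        (1, 1, 'Norwegian'),
        (1, 2, 'MSc'),
        (2, 1, 'English');
""")


def properties_of(person_id):
    """Return all non-universal properties of one person as a dict."""
    rows = conn.execute(
        """
        SELECT t.name, pp.value
        FROM person_properties pp
        JOIN property_types t ON t.id = pp.property
        WHERE pp.person = ?
        ORDER BY t.name
        """,
        (person_id,),
    )
    return dict(rows.fetchall())
```

With this primary key each person can hold only one value per property. If a person may have several values of the same kind (several languages, say), the key would need an extra column, which the answer does not cover.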

How to structure my Users Database?

I have a website that allows users to be different types. Each of these types can do specific things. I am asking if I should set up 1 table for ALL my users and store the types in an enum, or should I make different tables for each type. Now, if the only thing different was the type it would be easy for me to choose only using one table. However, here's a scenario.
The 4 users are A, B, C, D.
User A has data for:
name
email
User B has data for:
name
email
phone
User C has data for:
name
email
phone
about
User D has data for:
name
email
phone
about
address
If I were to create a single table, should I just leave different fields null for the different users? Or should I create a whole separate table for each user?
It would be much better to create a single table for all of them, even though some fields will be nullable, and add an extra column (an enum) for the type of user. If you keep your current design, you will have to use joins and unions to fetch the records (which adds extra overhead on the server).
CREATE TABLE users
(
ID INT,
name VARCHAR(50),
email VARCHAR(50),
phone VARCHAR(50),
about VARCHAR(50),
address VARCHAR(50),
userType ENUM() -- put types of user here
)
Another suggested design is to create two tables, one for the users and the other for the types. The main advantage here is that whenever you have another type of user, you don't have to alter the table; you only add a record to the user type table, which is then referenced by the users table.
CREATE TABLE UserType
(
    ID INT PRIMARY KEY,
    name VARCHAR(50)
);
CREATE TABLE users
(
    ID INT,
    name VARCHAR(50),
    email VARCHAR(50),
    phone VARCHAR(50),
    about VARCHAR(50),
    address VARCHAR(50),
    TypeID INT,
    CONSTRAINT rf_fk FOREIGN KEY (TypeID) REFERENCES UserType(ID)
);
Basic database design principles suggest one table for the common elements and additional tables, JOINed back to the base table, for the attributes that are unique to each type of user.
Your example suggests one and only one additional field per user type in a straightforward inheritance hierarchy. Is that really what the data looks like, or did you simplify for the example? If that's a true representation of your requirements, I might be tempted (for expedience) to use a single table. But if the real requirements are more complex, I'd bite the bullet and do it "correctly".
Try creating four tables:
Table 1: Name, email
Table 2: Name, phone
Table 3: Name, about
Table 4: Name, address
Name is your primary key on all four tables. There are no nulls in the database. You're not storing an enumerated type but derive the type from table joins:
To find all User A select all records in table 1 not in table 2
To find all User B select all records in table 2 not in table 3
To find all User C select all records in table 3 not in table 4
To find all User D select all records in table 4
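The four lookups above can be sketched as anti-joins. This uses Python's sqlite3 as a stand-in for MySQL; the table names user_email, user_phone, user_about and user_address, the sample users and the users_of_type helper are made up for illustration, since the answer only calls them Table 1 to Table 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE user_email   (name TEXT PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE user_phone   (name TEXT PRIMARY KEY REFERENCES user_email (name),
                               phone TEXT NOT NULL);
    CREATE TABLE user_about   (name TEXT PRIMARY KEY REFERENCES user_email (name),
                               about TEXT NOT NULL);
    CREATE TABLE user_address (name TEXT PRIMARY KEY REFERENCES user_email (name),
                               address TEXT NOT NULL);
    INSERT INTO user_email VALUES
        ('ann', 'ann@example.com'), ('bob', 'bob@example.com'),
        ('cat', 'cat@example.com'), ('dan', 'dan@example.com');
    INSERT INTO user_phone VALUES ('bob', '555-0101'), ('cat', '555-0102'),
                                  ('dan', '555-0103');
    INSERT INTO user_about VALUES ('cat', 'Likes cats'), ('dan', 'Likes dogs');
    INSERT INTO user_address VALUES ('dan', '1 Main St');
""")

# Tables in the order Table 1..4; user types A..D map onto them by position.
LEVELS = ["user_email", "user_phone", "user_about", "user_address"]


def users_of_type(letter):
    """Return names present in this type's table but not in the next one."""
    index = "ABCD".index(letter)
    table = LEVELS[index]
    if index + 1 < len(LEVELS):
        next_table = LEVELS[index + 1]
        # Table names come from the fixed LEVELS list, never from user input.
        sql = (
            f"SELECT t.name FROM {table} t "
            f"WHERE NOT EXISTS (SELECT 1 FROM {next_table} n WHERE n.name = t.name) "
            "ORDER BY t.name"
        )
    else:
        sql = f"SELECT name FROM {table} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]
```

The user type is never stored; it falls out of which tables hold a row for that name. This only works while the types form a strict chain, as they do in the question's example.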
You should not create tables for different people because this will lead to a bloated database. It's best to create a single table with all the fields you need. If you don't use the field, pass in null values.
I would suggest that you use 1 single table with nullable fields. And a table of something like roles.