I'm developing a classifieds site and I'm completely stuck at the database design level.
An advertisement can only be in one category.
In my database I have a table called "Ads" that holds the columns common to all advertisements.
CREATE TABLE Ads (
AdID int not null,
AdDate datetime not null,
AdCategory int not null,
AdHeading varchar(255) not null,
AdText varchar(255) not null
-- etc.
);
I also have a lot of categories.
Ads that are posted in the "cars" category, for example, have additional columns like make, model, color, etc. Ads posted in "housing" have columns like housing type, square footage, etc.
I did something like:
CREATE TABLE Cars (
AdID int not null,
CarMake varchar(255) not null,
CarModel varchar(255) not null
-- ...
);
CREATE TABLE Housing (
AdID int not null,
HousingType varchar(255) not null
-- ...
);
AdID in those tables is a foreign key to Ads.
But when I need to retrieve information from Ads, I have to look up all those additional tables and check whether AdID in Ads equals AdID in those tables.
For every category I need a new table. I'm going to end up with 15 tables or so.
I had an idea to add boolean columns to the Ads table, like is_Cars, is_Housing, etc., but having 15 columns where 14 would be NULL seems horrible.
Is there a better way to design this database? I need my database to be in third normal form; this is the most important requirement.
Don't worry too much. It's a well-known dilemma: there are no silver bullets, and every solution has trade-offs. Your solution sounds good to me and is commonly used in the industry. On the downside, it requires JOINs, as you mentioned (a well-known trade-off of normalization anyway), and each new product type requires a new table. On the upside, the table structure precisely reflects your business logic, and it's readable and efficient in storage.
Your other suggestion, as far as I understand, was a single table where each row has a "type" indicator: car, house, etc. (By the way, there's no need for multiple columns such as 'is_car' and 'is_house'; it's simpler to have a single 'type' column, e.g. type=1 indicates a car, type=2 a house.) The table then has many columns, some of which are unused for some product types.
Here the advantage is the ability to add new types dynamically (even user-defined types) without changing the database schema. There are also no JOINs. On the downside, you'll be storing and retrieving lots of NULL cells, and the schema will be less descriptive: e.g. it's harder to declare "the carModel column is not nullable", because it has to be nullable for houses (you can use triggers, but that's less readable).
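Besides triggers, a conditional CHECK constraint is another way to say "CarModel is required, but only for cars" in a single-table design. The sketch below is an illustration, not what the answer above prescribes. It uses SQLite through Python's sqlite3 module only so that it is self-contained; the reduced column list and the category values 'car' and 'housing' are assumptions. MySQL enforces CHECK constraints only from version 8.0.16 onwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Ads (
        AdID        INTEGER PRIMARY KEY,
        AdCategory  TEXT NOT NULL CHECK (AdCategory IN ('car', 'housing')),
        AdHeading   TEXT NOT NULL,
        CarModel    TEXT,  -- only meaningful for cars
        HousingType TEXT,  -- only meaningful for housing
        -- conditional NOT NULL rules, one per category
        CHECK (AdCategory <> 'car' OR CarModel IS NOT NULL),
        CHECK (AdCategory <> 'housing' OR HousingType IS NOT NULL)
    )
""")

# A complete car ad is accepted.
conn.execute(
    "INSERT INTO Ads (AdCategory, AdHeading, CarModel) VALUES (?, ?, ?)",
    ("car", "Red hatchback", "Golf"),
)

def insert_car_without_model():
    """Return True if the database accepted a car ad with no CarModel."""
    try:
        conn.execute(
            "INSERT INTO Ads (AdCategory, AdHeading) VALUES (?, ?)",
            ("car", "No model given"),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = insert_car_without_model()
ad_count = conn.execute("SELECT COUNT(*) FROM Ads").fetchone()[0]
```

The rule lives in the table definition rather than in a trigger, but every new category still needs its own CHECK, so the readability cost the answer mentions does not disappear.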
Personally I prefer the first solution (depending on the use case, of course, but it is my first instinct). And I can use it with some peace of mind after considering the trade-offs, i.e. understanding that I'm tolerating those JOINs as the price of a readable and compact schema.
One, you are confusing categories and product specifications.
Two, you need to read up on Table Inheritance.
If you don't mind nulls, use Single Table Inheritance. All "categories" (cars, houses, ...) go in one table and have a "type" column.
If you don't like nulls, use Class Table Inheritance. Make a master table with the primary keys that you point your category foreign key at. Make child tables for each type (cars, houses, ...) whose primary key is also a foreign key to the master table. This is easier with an ORM like Hibernate.
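A minimal sketch of Class Table Inheritance, reusing the Ads, Cars and Housing names from the question. It runs on SQLite through Python's sqlite3 module purely to stay self-contained; the reduced column lists and the sample rows are assumptions. Each child table's primary key is also its foreign key to Ads, so one pair of LEFT JOINs returns every ad with whatever category-specific columns it has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Ads (
        AdID       INTEGER PRIMARY KEY,
        AdCategory TEXT NOT NULL,
        AdHeading  TEXT NOT NULL
    );
    -- Child tables: the primary key doubles as the foreign key to Ads.
    CREATE TABLE Cars (
        AdID     INTEGER PRIMARY KEY REFERENCES Ads (AdID),
        CarMake  TEXT NOT NULL,
        CarModel TEXT NOT NULL
    );
    CREATE TABLE Housing (
        AdID        INTEGER PRIMARY KEY REFERENCES Ads (AdID),
        HousingType TEXT NOT NULL
    );
    INSERT INTO Ads VALUES (1, 'cars', 'Red hatchback'), (2, 'housing', 'Sunny flat');
    INSERT INTO Cars VALUES (1, 'Volkswagen', 'Golf');
    INSERT INTO Housing VALUES (2, 'apartment');
""")

# One query over the master table; columns from other categories come back as NULL.
rows = conn.execute("""
    SELECT a.AdID, a.AdHeading, c.CarMake, c.CarModel, h.HousingType
    FROM Ads AS a
    LEFT JOIN Cars    AS c ON c.AdID = a.AdID
    LEFT JOIN Housing AS h ON h.AdID = a.AdID
    ORDER BY a.AdID
""").fetchall()
```

When the category is already known, joining Ads to just that one child table avoids touching the others.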
I have a question about a best-practice approach to a one-to-many relationship using a MySQL API, VueJS as the framework, and a data table using a JSON format.
I have a little example available here: https://codepen.io/rasenkantenstein/pen/MWYEvzK. This is a Vuetify data table which hosts information about desserts. One dessert can have many ingredients...
My problem is similar: I have a MySQL membership table. In the case of a family membership, one membership can relate to many people in the family. In a classic schema I would have a membership table and a members table with a foreign key to the membership.
The membership table contains attributes such as a contact address, entry date, etc. The members table contains attributes such as name, birthdate, etc. There are also transactions, which relate to the membership table among other things.
My approach is to try to hold all information in one membership table:
CREATE TABLE `Membership` (
`MembershipId` bigint(20) NOT NULL AUTO_INCREMENT,
`MembershipType` varchar(100) COLLATE latin1_german1_ci NOT NULL,
`MembershipMembers` json NOT NULL COMMENT 'contains a JSON String with family name, given name and birthdate',
`MembershipZipCode` int(11) NOT NULL, -- and more attributes
However, two tables also seem feasible. Unfortunately, I wouldn't know how to update both tables with the JS Axios API inside the same data table.
I am looking for some resources to help my decision making, e.g. links, keywords, or hard advice.
I am creating a table for dietary_supplement where a supplement can have many ingredients.
I am having trouble designing the table for the ingredients.
The issue is that an ingredient can have many names or an acronym.
For example, vitaminB1 has other names like Thiamine and thiamin.
The acronym BHA can stand for both butylated hydroxyanisole and beta hydroxy acid (the latter is actually an ingredient in skincare products, but I am using it anyway because it makes a good example).
I am also concerned about spacing and "-". For example, someone can spell vitaminA without a space and someone else can write vitamin A. Also, beta hydroxy acid can be written as β-hydroxy acid (with "-") or β hydroxy acid (without "-").
I have two options in mind:
1) Put all the names for one ingredient in a single column, using semicolons to separate the names, e.g. beta hydroxy acid;BHA;β-hydroxy acid;β hydroxy acid
- This would be easy, but I am not sure it is a smart way to design the database when I have to perform searches etc.
2) Create a table for all the names and relate it to a table for ingredients.
- This is the option I am leaning towards, but I wonder if there are better ways to do this. And do I have to create separate rows for the same item when only the spacing and "-" differ?
Make a mapping table of 'name' to 'canonical_name' (or id). It would have rows like
name       canonical_name
Thiamine   vitaminB1
thiamin    vitaminB1
vitaminB1  vitaminB1
B1         vitaminB1
By using a collation ending with _ci, you don't need to worry about capitalization.
When ingesting the data for a supplement, first look up the name to get the canonical_name, then use the latter in any other table(s).
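In that 2-column table, have
PRIMARY KEY(name),
INDEX(canonical_name, name)
so that you can go either direction. (The name has to be the unique side, since several names share one canonical_name. If one name can stand for several ingredients, as BHA does, make the key PRIMARY KEY(name, canonical_name) instead.)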
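A rough sketch of that lookup in both directions. The table name ingredient_name and the helper functions are hypothetical, and SQLite's NOCASE collation stands in for a MySQL _ci collation. The table is keyed on name here, because several names share one canonical_name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredient_name (
        name           TEXT NOT NULL COLLATE NOCASE,  -- case-insensitive matching
        canonical_name TEXT NOT NULL,
        PRIMARY KEY (name)
    );
    -- Secondary index for the reverse direction (canonical name -> all names).
    CREATE INDEX idx_canonical ON ingredient_name (canonical_name, name);
    INSERT INTO ingredient_name VALUES
        ('Thiamine',  'vitaminB1'),
        ('thiamin',   'vitaminB1'),
        ('vitaminB1', 'vitaminB1'),
        ('B1',        'vitaminB1');
""")

def canonical(name):
    """Map any known spelling to its canonical name, or None if unknown."""
    row = conn.execute(
        "SELECT canonical_name FROM ingredient_name WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

def aliases(canonical_name):
    """List every spelling recorded for a canonical name."""
    rows = conn.execute(
        "SELECT name FROM ingredient_name WHERE canonical_name = ?",
        (canonical_name,),
    )
    return {r[0] for r in rows}
```

For example, canonical("THIAMINE") resolves to the same canonical name as canonical("thiamin"), while an unknown spelling returns None and can be queued for review.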
Create one table for ingredients and one for supplements, give them a column in common, and join on that column when you want to select.
It might be something like this:
CREATE TABLE Ingredient (
Id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
, ImagePath VARCHAR(63)
, Description TEXT
-- other ingredient's non-name dependent properties
);
CREATE TABLE IngredientName (
Id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
, IngredientId INTEGER UNSIGNED NOT NULL
, IsMain TINYINT(1) UNSIGNED NOT NULL DEFAULT 0
, Name VARCHAR(63) NOT NULL
, KEY IX_IngredientName_IngredientId_IsMain (IngredientId, IsMain)
, UNIQUE KEY IX_IngredientName_IngredientId_Name (IngredientId, Name)
, CONSTRAINT FK_IngredientName_IngredientId FOREIGN KEY (`IngredientId`) REFERENCES `Ingredient` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE
);
Or you can add Ingredient.Name as the main name and then get rid of IngredientName.IsMain.
For spaces, you should apply some name normalization in your application, such as removing consecutive spaces, capitalizing, and normalizing spaces around commas, dashes, etc. You can also apply such normalization in the database with a trigger if you like.
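One way such application-side normalization might look, as a sketch. Both function names are hypothetical. tidy keeps the name readable for storage, while lookup_key goes further than the answer suggests and drops spaces and dashes entirely so that spelling variants compare equal. Whether that is too aggressive depends on your data.

```python
import re

def tidy(raw: str) -> str:
    """Trim the name and collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()

def lookup_key(raw: str) -> str:
    """Build a matching key that ignores case, whitespace and dash characters.

    The character class covers ASCII '-' plus the Unicode dashes
    U+2010 through U+2015.
    """
    return re.sub(r"[\s\-\u2010-\u2015]+", "", raw).casefold()
```

Storing lookup_key(name) in an indexed column next to the display name would let "vitamin A" and "vitaminA", or "β-hydroxy acid" and "β hydroxy acid", resolve to the same row without a separate row per spelling.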
There are some other possibilities.
You should first think about what the use cases for the DB will be.
This is very important. There is no single best universal DB design.
If you need some special search cases, you might need a special DB design, or at least indexes.
P.S. I believe that putting different names in one field as a delimiter-separated value is a bad idea.
I have tables called userAccounts, userProfiles and usersearches.
Each userAccount may have multiple profiles. Each user may have many searches.
I have the DB set up and working with this. However, each search may involve several user profiles.
I.e., each user account may have a profile for each member of their family.
They then want to search and include all or some of their family members in the search. The way I would like it to work is to have a column in usersearches called profiles that holds a list of the profile IDs included in that search. (But as far as I know, you can't do this in SQL.)
The only way I can think of to do this is to have 10 columns called profile1, profile2 ... profile10, place each profile ID into a column, and put 0 or NULL in the unused ones. (But this is clearly messy.)
Creating columns of the form name1...nameN is a clear violation of the Zero, One or Infinity Rule. Arbitrarily having ten of them is not the right approach: that limit is an assumption that will prove to be either wildly generous or too constrained most of the time. Since you're using a relational database, try to store your data relationally.
Consider the schema:
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT NOT NULL,
name VARCHAR(255),
UNIQUE KEY index_on_name (name)
);
CREATE TABLE profiles (
id INT PRIMARY KEY AUTO_INCREMENT NOT NULL,
user_id INT NOT NULL,
name VARCHAR(255),
email VARCHAR(255),
KEY index_on_user_id (user_id)
);
With that you can create zero or more profile records as required. You can also add or remove fields from the profile records without impacting the main user records.
If you ever want to search for all profiles associated with a user:
SELECT ... FROM profiles
JOIN users ON
users.id=profiles.user_id
WHERE users.name=?
Using a simple JOIN or subquery you can easily exercise this relationship.
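The same idea covers the original problem of one search including several profiles: instead of profile1...profile10 columns, a junction table holds one row per (search, profile) pair. The sketch below is an assumption about how that could look. The searches and search_profiles tables and the sample data are hypothetical, and it runs on SQLite through Python's sqlite3 module only to be self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE profiles (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        name    TEXT
    );
    CREATE TABLE searches (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        label   TEXT
    );
    -- Junction table: zero, one or many profiles per search.
    CREATE TABLE search_profiles (
        search_id  INTEGER NOT NULL REFERENCES searches (id),
        profile_id INTEGER NOT NULL REFERENCES profiles (id),
        PRIMARY KEY (search_id, profile_id)
    );
    INSERT INTO profiles VALUES (1, 10, 'Alice'), (2, 10, 'Bob'), (3, 10, 'Carol');
    INSERT INTO searches VALUES (100, 10, 'summer trip');
    INSERT INTO search_profiles VALUES (100, 1), (100, 3);
""")

def profiles_in_search(search_id):
    """Return the names of the profiles included in one search."""
    rows = conn.execute("""
        SELECT p.name
        FROM search_profiles AS sp
        JOIN profiles AS p ON p.id = sp.profile_id
        WHERE sp.search_id = ?
        ORDER BY p.name
    """, (search_id,))
    return [r[0] for r in rows]

names = profiles_in_search(100)
```

Adding or removing a family member from a search is then a single INSERT or DELETE on search_profiles, with no limit on the number of profiles.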
I am currently dealing with a data structure similar to the one linked here:
http://sqlfiddle.com/#!2/2ad8f/1
There will be a field (fruits in this case) that can contain highly variable options: quantity, colour, type, etc. I am trying to work out an efficient way of storing this data and using it programmatically in a frontend.
I have thought about creating new fields (e.g. a field for quantity, a field for colour, etc.); however, the data can be highly variable and I will be dealing with many, many rows, potentially 1-2 million. I don't want to create a "texture" field, for example, that is only used for 100 of 1,000,000 rows.
The "fruits" here would never be ordered by or referenced by the database storage engine.
My best idea so far is to store a JSON object as a string (see the second insert in the link), but is there a more efficient method?
If you want to place all your attributes into one text container, you may as well be using a text file instead of a relational database. The database will have a lot of overhead that you are simply not using so why have it?
If you want this in a relational form, then let's go through some simple modeling.
We have different kinds of fruit. These fruits can have different attributes, and even different kinds of attributes. Here is one simple way:
create table Fruit(
ID int auto_increment primary key,
Name varchar( 20 ) not null, -- Apple, Orange, etc.
Type varchar( 20 ), -- McIntosh, Granny Smith, Navel, etc.
Size char( 1 ), -- S, M, L
Qty int not null
-- other data such as price, shelf life, whatever
);
So now we create a single table that holds every kind of disparate attribute:
create table Attr(
ID int auto_increment primary key,
Type varchar( 20 ), -- Color, Texture, Taste, etc.
Value varchar( 10 ) -- Red, Green, Juicy, Sweet, Sour, etc.
);
Each fruit can have several attributes and each attribute may apply to several kinds of fruit, so you need a many-to-many cross table between them:
create table FruitAttr(
FruitID int,
AttrID int,
primary key( FruitID, AttrID )
);
with FruitID a foreign key to Fruit and AttrID a foreign key to Attr. Now we can create a Basket table which will define each individual basket.
create table Basket(
ID int auto_increment primary key,
Name varchar( 20 ) not null, -- Graduation, Funeral, Birthday, etc.
Price decimal(19,4)
-- other basket-specific attributes
);
A basket is made up of several selections of fruit and each fruit may appear in several types of basket. So there is the same relationship between Basket and Fruit as between Fruit and Attr: many-to-many. As we've already modeled one of those tables, I'll leave that to you.
There are enhancements and changes that may be made to tailor these tables more closely to your specific uses, but we now have a workable solution.
So very quickly we have gone from one table to five tables. That may seem like we've complicated everything but if you have to work with them, you will find we have made our (meaning your) life a whole lot easier, especially when you add new types of baskets or fruit, change the makeup of a basket, substitute one fruit (severe core rot suddenly makes Granny Smiths unavailable), or any number of ways you will need to change your data.
After all, it is a relational database and relations are established between tables, not between substrings within strings. So the DML and queries to work with these relations will be so much easier than trying to manipulate text strings.
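To make the last point concrete, here is a small sketch of querying the Fruit, Attr and FruitAttr tables from above. The columns are cut down and the sample rows are invented for illustration, and it runs on SQLite through Python's sqlite3 module only to be self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fruit (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Attr  (ID INTEGER PRIMARY KEY, Type TEXT, Value TEXT);
    CREATE TABLE FruitAttr (
        FruitID INTEGER REFERENCES Fruit (ID),
        AttrID  INTEGER REFERENCES Attr (ID),
        PRIMARY KEY (FruitID, AttrID)
    );
    INSERT INTO Fruit VALUES (1, 'Apple'), (2, 'Orange');
    INSERT INTO Attr  VALUES (1, 'Color', 'Red'), (2, 'Taste', 'Sweet'), (3, 'Color', 'Orange');
    INSERT INTO FruitAttr VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

def fruits_with(attr_type, value):
    """Names of all fruits that carry the given attribute."""
    rows = conn.execute("""
        SELECT f.Name
        FROM Fruit AS f
        JOIN FruitAttr AS fa ON fa.FruitID = f.ID
        JOIN Attr      AS a  ON a.ID = fa.AttrID
        WHERE a.Type = ? AND a.Value = ?
        ORDER BY f.Name
    """, (attr_type, value))
    return [r[0] for r in rows]

def attributes_of(fruit_name):
    """All (type, value) attribute pairs attached to one fruit."""
    rows = conn.execute("""
        SELECT a.Type, a.Value
        FROM Attr AS a
        JOIN FruitAttr AS fa ON fa.AttrID = a.ID
        JOIN Fruit     AS f  ON f.ID = fa.FruitID
        WHERE f.Name = ?
        ORDER BY a.Type
    """, (fruit_name,))
    return rows.fetchall()
```

A rarely used attribute such as texture costs one row in Attr plus one row in FruitAttr per fruit that has it, rather than a mostly empty column across millions of rows.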
Say I have two tables (Apples and Oranges) with the same columns and just a different table name. Would there be any advantages or disadvantages to turning them into one table (let's say it's called Fruit) with an additional column 'type', which would then store a value of either Apple or Orange?
Edit to clarify:
CREATE TABLE apples
(
id int,
weight int,
variety varchar(255)
);
CREATE TABLE oranges
(
id int,
weight int,
variety varchar(255)
);
OR
CREATE TABLE fruit
(
id int,
weight int,
variety varchar(255),
type ENUM('apple', 'orange')
);
Depends on constraints:
Do you have foreign keys or CHECKs on apples that don't exist on oranges (or vice-versa)?
Do you need to keep keys unique across both tables (so no apple can have the same ID as some orange)?
If the answers to these two questions are "yes" and "no", keep the tables separate (so constraints can be made table-specific [1]).
If the answers are "no" and "yes", merge them together (so you can create a key that spans both).
If the answers are "yes" and "yes", consider emulating inheritance [2].
[1] Lookup data is a typical example of tables that look similar, yet must be kept separate so FKs can be kept separate.
[2] Specifically, this is the "all classes in separate tables" strategy for representing inheritance (aka category, subclassing, subtyping, generalization hierarchy, etc.). You might want to take a look at this post for more info.
If there really are no further business rules (and resulting data requirements) that separate the two subtypes, then I would use one table with an FK to a FruitType lookup table.
You don't mention what you will be using to access the schema, which may affect the approach you take (e.g. if you are using a platform that provides an ORM for your database, that may be worth noting).
The advantage would be normalization. Your tables would then be in 2NF (second normal form).
Your fruit type would be a foreign key to a table with those fruits like so:
CREATE TABLE fruit_type (type varchar(15) PRIMARY KEY);
CREATE TABLE fruits (id int, weight int, variety varchar(255), type varchar(15), FOREIGN KEY (type) REFERENCES fruit_type (type));
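A short sketch of what the lookup table buys you: once fruits.type is declared as a foreign key, the database itself rejects a type that is not listed in fruit_type. This runs on SQLite through Python's sqlite3 module only to be self-contained. The primary keys, the NOT NULL on type, and the sample rows are assumptions added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE fruit_type (type TEXT PRIMARY KEY);
    CREATE TABLE fruits (
        id      INTEGER PRIMARY KEY,
        weight  INTEGER,
        variety TEXT,
        type    TEXT NOT NULL REFERENCES fruit_type (type)
    );
    INSERT INTO fruit_type VALUES ('apple'), ('orange');
    INSERT INTO fruits VALUES (1, 180, 'Granny Smith', 'apple');
""")

def add_fruit(weight, variety, fruit_type):
    """Insert a fruit; return False if its type is not in fruit_type."""
    try:
        conn.execute(
            "INSERT INTO fruits (weight, variety, type) VALUES (?, ?, ?)",
            (weight, variety, fruit_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False

ok_orange = add_fruit(140, "Navel", "orange")
ok_banana = add_fruit(120, "Cavendish", "banana")  # not a known type
```

Supporting a new kind of fruit then means inserting one row into fruit_type, whereas the ENUM variant from the question requires an ALTER TABLE.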