E-commerce structure for products (MySQL)

I am considering how to structure my MySQL database for an e-commerce solution. To be more specific, I am looking at the product structure.
These are the tables I have come up with so far. What do you think?
Explanation of structure
The application is multilingual. Therefore the product table is split into 2 tables.
If a product has e.g. 2 variants (Small, Medium), a total of 3 rows will be inserted. This is because each variant can have its own information. When the product is shown on the webpage, product 1 will be shown with a drop-down box containing Small & Medium. A product with no variants will naturally only insert 1 row.
products
id master product_number
1 0 123
2 1 456
3 1 678
products_descriptions
id product_id country_code image name description vat price
1 1 en-us image.jpg t-shirt Nice t-shirt 25 19.99
2 2 en-us image.jpg t-shirt Nice t-shirt 25 19.99
3 3 en-us image.jpg t-shirt Nice t-shirt 25 19.99
products_to_options
product_id option_id
2 1
3 2
options
id name
1 Small
2 Medium

Your Products table is schizophrenic: its entity is sometimes Product and sometimes Variant. This leads to very cumbersome behavior. For example, you'd like the question "how many different products do we have?" to be answered by select count(*) from products, but here this gives the wrong answer. To get the correct answer you have to know the Magic Number 0 and query select count(*) from products where master=0. "List all products and how many variants we have for each" is another query that should be straightforward but now isn't. There are other anomalies, like the fact that the first line in products_descriptions is a shirt that has a price and a picture but no size (size is stored in the variants, but they have prices and pictures of their own).
Your problem sounds like you have products in two contexts: (1) something that can be displayed as an item in your store, and (2) something that can be ordered by your customer. (1) probably has a name like "Halloween T-Shirt" or so, and it probably has an image that the customer sees. (2) is what the customer orders, so it has a (1), but also a variant specification like "small" or maybe a color "red". It probably has a price, too, and an order_id so your shop can know what specific item to ship.
You should give each context an entity. Here's how I'd do it:
displayable_product
id name
1 "Baseball Cap"
2 "T-Shirt"
orderable_product
id  d_product_id  order_id  size    color  price
1   1             123               red    9.99
2   2             456       small          19.99
3   2             789       medium         21.99
displayable_content
id d_product_id locale name image
1 1 en_US "Baseball Cap" baseballcap.jpg
2 1 es_US "Gorra de Beisbol" baseballcap.jpg
3 2 en_US "Nice T-Shirt" nicetshirt.jpg
4 2 es_US "Camiseta" nicetshirt.jpg
You should probably use locale instead of country in the display tables to account for countries with more than one language (USA, Switzerland, and others), and you might separate size and color into their own variants table. And if you need country-dependent data on the orderables (like different prices/currencies for shipping to different countries), you'd have to extract a country-dependent orderable_content table, too.
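As a minimal sketch of why the two-entity split helps, the two queries that were awkward in the original layout become plain counts and joins. The example uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the snippet is runnable); the column types and the NULLs for missing size/color values are also assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE displayable_product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orderable_product (
    id           INTEGER PRIMARY KEY,
    d_product_id INTEGER NOT NULL REFERENCES displayable_product(id),
    order_id     INTEGER NOT NULL,
    size         TEXT,      -- NULL when the product has no size variant
    color        TEXT,      -- NULL when the product has no color variant
    price        NUMERIC NOT NULL
);
INSERT INTO displayable_product VALUES (1, 'Baseball Cap'), (2, 'T-Shirt');
INSERT INTO orderable_product VALUES
    (1, 1, 123, NULL,     'red', 9.99),
    (2, 2, 456, 'small',  NULL,  19.99),
    (3, 2, 789, 'medium', NULL,  21.99);
""")

# "How many different products do we have?" needs no magic number.
product_count = conn.execute(
    "SELECT COUNT(*) FROM displayable_product"
).fetchone()[0]

# "List all products and how many variants we have for each."
variants_per_product = conn.execute("""
    SELECT d.name, COUNT(o.id)
    FROM displayable_product AS d
    LEFT JOIN orderable_product AS o ON o.d_product_id = d.id
    GROUP BY d.id, d.name
    ORDER BY d.id
""").fetchall()

print(product_count, variants_per_product)
```

The LEFT JOIN keeps products that have no orderable rows yet, so they show up with a count of 0 instead of disappearing.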

Related

1NF, 2NF, AND 3NF Normalization?

I have a teacher who doesn't like to explain to the class but loves putting up review questions for upcoming tests. Can anyone explain the image above? My main concern is the red underline, which shows that supplier and supplierPhone are repeated values. I thought that repeated values occurred when there were many occurrences of the same item in a column.
Another question I have is that if the Supplier is a repeating value, why isn't Part_Name a repeating value, given that they both have 2 items with the same names in their columns?
Example:
It's repeated because the result of the tuple is always the same. E.g. ABC Plastics will always have the same phone number, therefore having 2 rows with ABC Plastics means that we have redundant information in the phone number.
Part1 Company1 12341234
Part2 Company1 12341234
We could represent the same information with:
Part1 Company1
Part2 Company1
And
Company1 12341234.
Therefore having two rows with the same phone number is redundant.
This should answer your second question as well.
Essentially you're looking for tuples such that given the tuple (X, Y) exists, if there exists another tuple (X, Y') then Y' = Y
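The rule in the last sentence can be checked mechanically. Below is a small sketch in Python; the function name holds_functional_dependency and the row layout (part, supplier, phone) are assumptions made for illustration, using the sample rows from this answer.

```python
def holds_functional_dependency(rows, x_index, y_index):
    """Return True if column x_index determines column y_index (X -> Y).

    X -> Y holds when no two rows share the same X but differ in Y.
    """
    seen = {}
    for row in rows:
        x, y = row[x_index], row[y_index]
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


rows = [
    ("Part1", "Company1", "12341234"),
    ("Part2", "Company1", "12341234"),
]

# Supplier -> phone holds, so the phone can move to its own table.
supplier_determines_phone = holds_functional_dependency(rows, 1, 2)

# Decomposition: (part, supplier) pairs plus one phone per supplier.
part_supplier = [(part, supplier) for part, supplier, _ in rows]
supplier_phone = {supplier: phone for _, supplier, phone in rows}

print(supplier_determines_phone, part_supplier, supplier_phone)
```

After the split the phone number is stored once per supplier, which is exactly the redundancy the answer points out.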
Looks like five tables to me.
model (entity)
modelid description price
1 Laserjet 423.56
2 256 Colour 89.99
part (entity)
partid name
PF123 Paper Spool
LC423 Laserjet cartridge
MT123 Power supply
etc
bill_of_materials (many to many relationship model >--< part )
modelid partid qty
1 PF123 2
1 LC423 4
1 MT123 1
2 MT123 2
supplier (entity)
supplier_id phone name
1 416-234-2342 ABC Plastics
2 905.. Jetson Carbons
3 767... ACME Power Supply
etc.
part_supplier (many to many relationship part >--< supplier )
part_id supplier_id
PF123 1
LC423 2
MT123 3
etc.
You have one row in model, part, supplier for each distinct entity.
You have rows in bill_of_materials for each part that goes into each model.
You have a row in part_supplier for each supplier that can furnish each part. Notice that more than one part can come from one supplier, and more than one supplier can furnish each part. That's a many-to-many relationship.
The trick: Figure out what physical things you have in your application domain. Then make a table for each one. Then figure out how they relate to each other (that's what makes it relational.)
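A rough sketch of those five tables, using Python's sqlite3 module as a stand-in database (an assumption for runnability). It assumes the second model has modelid 2, as the bill_of_materials rows suggest, and it leaves the truncated phone numbers as NULL rather than guessing them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Entity tables: one row per distinct thing.
CREATE TABLE model    (modelid INTEGER PRIMARY KEY, description TEXT, price NUMERIC);
CREATE TABLE part     (partid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT, phone TEXT);

-- Relationship tables: one row per pairing.
CREATE TABLE bill_of_materials (
    modelid INTEGER NOT NULL REFERENCES model(modelid),
    partid  TEXT    NOT NULL REFERENCES part(partid),
    qty     INTEGER NOT NULL,
    PRIMARY KEY (modelid, partid)
);
CREATE TABLE part_supplier (
    part_id     TEXT    NOT NULL REFERENCES part(partid),
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    PRIMARY KEY (part_id, supplier_id)
);

INSERT INTO model VALUES (1, 'Laserjet', 423.56), (2, '256 Colour', 89.99);
INSERT INTO part VALUES
    ('PF123', 'Paper Spool'),
    ('LC423', 'Laserjet cartridge'),
    ('MT123', 'Power supply');
-- Phones for suppliers 2 and 3 are truncated in the source, so left NULL here.
INSERT INTO supplier (supplier_id, name, phone) VALUES
    (1, 'ABC Plastics', '416-234-2342'),
    (2, 'Jetson Carbons', NULL),
    (3, 'ACME Power Supply', NULL);
INSERT INTO bill_of_materials VALUES
    (1, 'PF123', 2), (1, 'LC423', 4), (1, 'MT123', 1), (2, 'MT123', 2);
INSERT INTO part_supplier VALUES ('PF123', 1), ('LC423', 2), ('MT123', 3);
""")

# Parts, quantities and suppliers for model 1, resolved through both
# many-to-many relationship tables.
parts_for_model_1 = conn.execute("""
    SELECT p.name, b.qty, s.name
    FROM bill_of_materials AS b
    JOIN part          AS p  ON p.partid = b.partid
    JOIN part_supplier AS ps ON ps.part_id = p.partid
    JOIN supplier      AS s  ON s.supplier_id = ps.supplier_id
    WHERE b.modelid = 1
    ORDER BY p.partid
""").fetchall()

print(parts_for_model_1)
```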

Edit product selling location using mysql

I'm building an e-commerce platform (PHP + MySQL) and I want to add an attribute (feature) to products: the ability to specify (enable/disable) the selling status for a specific city.
Here are simplified tables:
cities
id name
==========
1 Roma
2 Berlin
3 Paris
4 London
products
id name cities
==================
1 TV 1,2,4
2 Phone 1,3,4
3 Book 1,2,3,4
4 Guitar 3
In this simple example it is easy to query (using FIND_IN_SET or LIKE) to check the availability of a product for a specific city.
This is OK for the 4 cities in this example or even 100 cities, but will it be practical for a large number of cities and a very large number of products?
For better "performance" or better database design, should I add another table to JOIN in the query (productid, cityid, status)?
availability
id productid cityid status
=============================
1 1 1 1
2 1 2 1
3 1 4 1
4 2 1 1
5 2 3 1
6 2 4 1
7 3 1 1
8 3 2 1
9 3 3 1
10 3 4 1
11 4 3 1
For better "performance" or better database design should I add another table
Yes, you should definitely create another table to hold that information, as you posted, rather than storing it in a comma-separated list, which goes against normalization. Also, with a comma-separated list there is no way to get good performance when you JOIN to find out which products are available in which cities.
At any point in time if you want to get back a comma separated list like 1,2,4 of values then you can do a GROUP BY productid and use GROUP_CONCAT(cityid) to get the same.
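A minimal sketch of the separate availability table and the GROUP_CONCAT round trip, run against SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption; SQLite also provides GROUP_CONCAT). Using (productid, cityid) as the primary key instead of a surrogate id is a choice made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE availability (
    productid INTEGER NOT NULL REFERENCES products(id),
    cityid    INTEGER NOT NULL REFERENCES cities(id),
    status    INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (productid, cityid)
);
INSERT INTO cities   VALUES (1, 'Roma'), (2, 'Berlin'), (3, 'Paris'), (4, 'London');
INSERT INTO products VALUES (1, 'TV'), (2, 'Phone'), (3, 'Book'), (4, 'Guitar');
""")

# Same pairs as the comma-separated lists in the question.
pairs = [(1, 1), (1, 2), (1, 4), (2, 1), (2, 3), (2, 4),
         (3, 1), (3, 2), (3, 3), (3, 4), (4, 3)]
conn.executemany(
    "INSERT INTO availability (productid, cityid) VALUES (?, ?)", pairs
)

# Products sold in Paris: an ordinary indexed join, no FIND_IN_SET or LIKE.
sold_in_paris = [row[0] for row in conn.execute("""
    SELECT p.name
    FROM products AS p
    JOIN availability AS a ON a.productid = p.id
    WHERE a.cityid = 3 AND a.status = 1
    ORDER BY p.id
""")]

# Rebuild the comma-separated list per product when it is needed for display.
# The concatenation order is not guaranteed, so the ids are sorted in Python.
city_lists = {
    productid: sorted(int(c) for c in csv.split(","))
    for productid, csv in conn.execute(
        "SELECT productid, GROUP_CONCAT(cityid) FROM availability GROUP BY productid"
    )
}

print(sold_in_paris, city_lists)
```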

MySQL database to store product, color, size and stock

I have an assignment for a shopping cart dealing with a shirt store, and I am confused about the database design for storing shirt attributes such as color, size and stock for each item.
Let's say I want to store the shirt below in the db:
Product name: Nike shirt
Available colors: black, white, blue
Size: M, L, XL
Stock: Black - M - 5 pc
White - L - 10 pc
Blue - M - 2 pc
Blue - XL - 3 pc
(and so on...)
Instead of storing above info iteratively in a table like so:
table shirt
id product color size stock
---------------------------------------------
1 Nike Shirt black M 5
2 Nike Shirt white L 10
3 Nike Shirt blue M 2
4 Nike Shirt blue XL 3
....
What is the best way to design the tables to store these attributes and products effectively?
I know that I could JOIN multiple tables together, but I need advice on how to split these attributes into different tables and fetch the info when people go to the respective page, so I can show them how much stock is left for a specific size.
Here's your table.
Shirt
id product color size stock
---------------------------------------------
1 Nike Shirt black M 5
2 Nike Shirt white L 10
3 Nike Shirt blue M 2
4 Nike Shirt blue XL 3
....
You see how you've duplicated the product name "Nike Shirt" and the color "blue". In a normalized relational database, we don't want to duplicate any information. What do you think would happen if someone accidentally changed "Nike Shirt" to "Nike Skirt" in row 4?
So, let's normalize your table.
We'll start with a Product table.
Product
id product
------ ------------
0 Nike Shirt
The id numbers here start with zero; note that MySQL AUTO_INCREMENT columns start at one by default.
Next, let's create a Color table.
Color
id color
------ -------
0 black
1 white
2 blue
Next, let's create a Size table.
Size
id size
------ -----
0 XS
1 S
2 M
3 L
4 XL
5 XXL
Ok, now we have 3 separate object tables. How do we put them together so we can see what's in stock?
You had the right idea with your original table.
Stock
id product color size stock
---------------------------------------------
0 0 0 2 5
1 0 1 3 10
2 0 2 2 2
3 0 2 4 3
The product, color, and size numbers are foreign keys back to the Product, Color, and Size tables. The reason we do this is to eliminate duplication of the information. You can see that any piece of information is stored in one place and one place only.
The id isn't necessary on the Stock table. The product, color, and size should be unique, so those 3 fields could make a compound key to the Stock table.
In an actual retail store, a product could have many different attributes. The attributes would probably be stored in a key/value table. For your simple table, we can break the table up into normalized relational tables.
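The layout described above could look roughly like this. The sketch uses Python's sqlite3 module in place of MySQL (an assumption for runnability) and drops the surrogate id on the Stock table in favour of the compound key the answer mentions; the UNIQUE constraints on the lookup tables are an addition for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, product TEXT NOT NULL UNIQUE);
CREATE TABLE color   (id INTEGER PRIMARY KEY, color   TEXT NOT NULL UNIQUE);
CREATE TABLE size    (id INTEGER PRIMARY KEY, size    TEXT NOT NULL UNIQUE);
CREATE TABLE stock (
    product INTEGER NOT NULL REFERENCES product(id),
    color   INTEGER NOT NULL REFERENCES color(id),
    size    INTEGER NOT NULL REFERENCES size(id),
    stock   INTEGER NOT NULL,
    PRIMARY KEY (product, color, size)   -- compound key, no surrogate id
);
INSERT INTO product VALUES (0, 'Nike Shirt');
INSERT INTO color   VALUES (0, 'black'), (1, 'white'), (2, 'blue');
INSERT INTO size    VALUES (0, 'XS'), (1, 'S'), (2, 'M'), (3, 'L'), (4, 'XL'), (5, 'XXL');
INSERT INTO stock   VALUES (0, 0, 2, 5), (0, 1, 3, 10), (0, 2, 2, 2), (0, 2, 4, 3);
""")

# Resolve the foreign keys back to readable names for the product page.
stock_report = conn.execute("""
    SELECT p.product, c.color, z.size, s.stock
    FROM stock AS s
    JOIN product AS p ON p.id = s.product
    JOIN color   AS c ON c.id = s.color
    JOIN size    AS z ON z.id = s.size
    ORDER BY s.color, s.size
""").fetchall()

# The compound key rejects a second row for the same product/color/size.
try:
    conn.execute("INSERT INTO stock VALUES (0, 2, 4, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(stock_report, duplicate_rejected)
```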
Substituting surrogate keys for text identifiers isn't normalizing. The original functional dependency (product, color, size) -> stock remains unchanged in your final version, and is in fact fully normalized. – reaanb Jul 31 '15 at 21:33
This is a horrendously complicated business domain. Your suggested single-table layout has a couple of challenges. Firstly, lots of repeated entries are a smell in database design. Secondly, it kind of supposes that all entities have the same attributes; as @gilbertleblanc writes, there are likely going to be lots of other attributes a real-world application would need to store (manufacturer, material, allergy info, etc.).
So, I would split this into two questions:
Which products and variants exist?
How much stock do we have of each item?
The minimal way to represent your sample data would be:
products
----------
id
name
color
size
product_stock
------------
product_id
stock_quantity
In real-life scenarios, the stock table is usually a ledger of transactions - the sum of movement gives you the current amount in stock.
product_id date movement
---------------------------------
1 1 Jan 2021 10
1 2 Jan 2021 -1
1 3 Jan 2021 -4
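With a ledger like this, the current stock is just the sum of movements per product. A small sketch using Python's sqlite3 module as a stand-in for MySQL; the table name stock_movement, the column name moved_on and the ISO date format are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock_movement (
    product_id INTEGER NOT NULL,
    moved_on   TEXT    NOT NULL,   -- ISO date, e.g. 2021-01-01
    movement   INTEGER NOT NULL    -- positive = goods in, negative = goods out
);
INSERT INTO stock_movement VALUES
    (1, '2021-01-01', 10),
    (1, '2021-01-02', -1),
    (1, '2021-01-03', -4);
""")

# Current stock per product is the running total of all movements.
current_stock = dict(conn.execute(
    "SELECT product_id, SUM(movement) FROM stock_movement GROUP BY product_id"
))

print(current_stock)
```

Adding a WHERE clause on moved_on gives the stock level as of any past date, which a single stock_quantity column cannot do.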

How to design database to store attributes of attributes into database

I am designing a database for a jewellery website in which a lot of products will be stored. While designing tables for the attributes of products I got stuck. My problem is how to store product attributes, their sub-attributes, and so on in the database.
Up to now I have created 3 tables for products and attributes:
tbl_attribute
Structure: attribute_id*, attribute_name
tbl_products
Structure: product_id*, category_id(FK), product_name, seo, description, metatags, length, width, height, weight, image, status
tbl_products_attribute
Structure: product_id(fk), attribute_id(fk), value
I have a situation: suppose a Necklace has 5 stones (Stone is an attribute) and each stone has the following sub-attributes:
1. Stone Name
2. Stone Color
3. Stone Treated Method
4. Stone Clarity
5. Stone Shape
6. Stone Price
and so on. I have many attributes like Stone, so can you please help me design the tables for these attributes?
I have to search (filter / faceted search) the products in the front end based on those attributes.
Like: www.firemountaingems.com
Detailed attributes and sub attributes list:
Alphabetical
Availability (Sold Individually, Sold in Bulk)
Birth Stone
Brands
Color (red, green, blue)
Design Type
Gender (Male, Female)
Images
Karat (18k, 22k, 24k)
Link
Make (Hand, Machine, Oxidised)
Making Percent
Material Type (Leather, Gemstone, etc)
Metal Art
Metal Stamp
Metal Type (Gold, Silver, Platinum etc)
Model
Name
Price
Purity
Shapes (round, oval, emerald, diamond etc)
Short Description
Sides (single, both)
Size (small, big etc)
Special Price
Status
Stone (Name, Color, Treated Method, Clarity, Shape, Price, Main Stone Color (Red, Pink, Green))
Stringing Material
Warranty
Wastage Percent
Weight
Wire - Wrapping Wire
Up to now I have searched many tutorials and articles on the net as well as on Stack Overflow.
Your help is highly appreciated.
In this case you can't afford to create tables like this. The best approach is to have only 1 products table and 1 attributes table holding all attributes of a product, with the columns attribute_level and parent_id, where parent_id refers to the id of the same table.
Example:
products table:
ID | Name
----------
1 | Necklace
attributes table:
ID | ParentID | AttributeName | ProductID | AttributeLevel | AttributeDescription
----------------------------------------------------------------------------------
1 | | Stone | 1 | 1 | stone description in general
2 | 1 | StoneName | 1 | 2 | Ruby
In this way you can build a hierarchy of attribute levels within one table.
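A rough sketch of that self-referencing attributes table, using Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability). The snake_case column names are adapted from the example headers, and the filter query at the end is only one possible way to search on a sub-attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE attributes (
    id                    INTEGER PRIMARY KEY,
    parent_id             INTEGER REFERENCES attributes(id),  -- NULL for top level
    attribute_name        TEXT    NOT NULL,
    product_id            INTEGER NOT NULL REFERENCES products(id),
    attribute_level       INTEGER NOT NULL,
    attribute_description TEXT
);
INSERT INTO products VALUES (1, 'Necklace');
INSERT INTO attributes VALUES
    (1, NULL, 'Stone',     1, 1, 'stone description in general'),
    (2, 1,    'StoneName', 1, 2, 'Ruby');
""")

# Self-join: pair each sub-attribute with its parent attribute.
sub_attributes = conn.execute("""
    SELECT parent.attribute_name, child.attribute_name, child.attribute_description
    FROM attributes AS child
    JOIN attributes AS parent ON parent.id = child.parent_id
    WHERE child.product_id = 1
""").fetchall()

# Filter: products that have a stone named Ruby.
ruby_products = [row[0] for row in conn.execute("""
    SELECT DISTINCT p.name
    FROM products AS p
    JOIN attributes AS a ON a.product_id = p.id
    WHERE a.attribute_name = 'StoneName' AND a.attribute_description = 'Ruby'
""")]

print(sub_attributes, ruby_products)
```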

SQL "shortcut" identifiers or a long string of joins?

QUESTION: Is it okay to have "shortcut" identifiers in a table so that I don't have to do a long string of joins to get the information I need?
To understand what I'm talking about, I'm going to have to lay out an example here that looks pretty complicated, but I've simplified the problem quite a bit, and it should be easily understood (I hope).
The basic setup: A "company" can be an "affiliate", a "client" or both. Each "company" can have multiple "contacts", some of which can be "users" with log in privileges.
`Company` table
----------------------------------------------
ID Company_Name Address
-- ----------------------- -----------------
1 Acme, Inc. 101 Sierra Vista
2 Spacely Space Sprockets East Mars Colony
3 Cogswell Cogs West Mars Colony
4 Stark Industries Los Angeles, CA
We have four companies in our database.
`Affiliates` table
---------------------
ID Company_ID Price Sales
-- ---------- ----- -----
1 1 50 456
2 4 50 222
3 1 75 14
Each company can have multiple affiliate id's so that they can represent the products at different pricing levels to different markets.
Two of our companies are affiliates (Acme, Inc. and Stark Industries), and Acme has two affiliate ID's.
`Clients` table
--------------------------------------
ID Company_ID Referring_affiliate_id
-- ---------- ----------------------
1 2 1
2 3 1
3 4 3
Each company can only be a client once.
Three of our companies are clients (Spacely Space Sprockets, Cogswell Cogs, and Stark Industries, who is also an affiliate).
In all three cases, they were referred to us by Acme, Inc., using one of their two affiliate ID's.
`Contacts` table
-----------------------------------------
ID Name Email
-- -------------- ---------------------
1 Wylie Coyote wcoyote@acme.com
2 Cosmo Spacely boss@spacely.com
3 H. G. Cogswell ceo@cogs.com
4 Tony Stark tony@stark.com
5 Homer Simpson simpson@burnscorp.com
Each company has at least one contact, but in this table, there is no indication of which company each contact works for, and there's also an extra contact (#5). We'll get to that in a moment.
Each of these contacts may or may not have a login account on the system.
`Contacts_type` table
--------------------------------------
contact_id company_id contact_type
---------- ---------- --------------
1 1 Administrative
2 2 Administrative
3 3 Administrative
4 4 Administrative
5 1 Technical
4 2 Technical
Associates a contact with one or more companies.
Each contact is associated with a company, and in addition, contact 5 (Homer Simpson) is a technical contact for Acme, Inc, and contact 4 (Tony Stark) is both an administrative contact for company 4 (Stark Industries) and a technical contact for company 3 (Cogswell Cogs).
`Users` table
-------------------------------------------------------------------------------------
ID contact_id company_id client_id affiliate_id user_id password access_level
-- ---------- ---------- --------- ------------ -------- -------- ------------
1 1 1 1 1 wylie A03BA951 2
2 2 2 2 NULL cosmo BF16DA77 3
3 3 3 3 NULL cogswell 39F56ACD 3
4 4 4 4 2 ironman DFA9301A 2
The users table is essentially a list of contacts that are allowed to login to the system.
Zero or one user per contact; one contact per user.
Contact 1 (Wylie Coyote) works for company 1 (Acme) and is a customer (1) and also an affiliate (1)
Contact 2 (Cosmo Spacely) works for company 2 (Spacely Space Sprockets) and is a customer (2) but not an affiliate
etc...
NOW finally onto the problem, if there is one...
Do I have a circular reference via the client_id and affiliate_id columns in the Users table? Is this a bad thing? I'm having a hard time wrapping my head around this.
When someone logs in, it checks their credentials against the users table and uses users.contact_id, users.client_id, and users.affiliate_id to do a quick look up rather than having to join together a string of tables to find out the same information. But this causes duplication of data.
Without client_id in the users table, I would have to find the following information out like this:
affiliate_id: join `users`.`contact_id` to `contacts_types`.`company_id` to `affiliates`.`company_id`
client_id: join `users`.`contact_id` to `contacts_types`.`company_id` to `clients`.`company_id`
company_id: join `users`.`contact_id` to `contacts_types`.`company_id` to `company`.`company_id`
user's name: join `users`.`contact_id` to `contacts_types`.`contact_id` to `contacts`.`contact_id` > `name`
In each case, I wouldn't necessarily know if the user even has an entry in the affiliate table or the clients table, because they likely have an entry in only one of those tables and not both.
Is it better to do these kinds of joins and thread through multiple tables to get the information I want, or is it better to have a "shortcut" field to get me the information I want?
I have a feeling that over all, this is overly complicated in some way, but I don't see how.
I'm using MySQL.
it's better to do the joins. you should only be denormalizing your data when you have timed evidence of a slow response.
having said that, there are various ways to reduce the amount of typing:
use "as" to give shorter names to your fields
create views. these are "virtual tables" that already have your standard joins built-in, so that you don't have to repeat that stuff every time.
use "with" in sql. this lets you define something like a view within a single query.
it's possible mysql doesn't support all the above - you'll need to check the docs [update: ok, mysql supports views, and "with" is available from mysql 8.0 onwards. so you can add views to do the work of affiliate_id, client_id etc and treat them just like tables in your queries, but keeping the underlying data nicely organised.]
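a minimal sketch of the view idea, run on sqlite through python's sqlite3 module as a stand-in for mysql (an assumption for runnability; mysql has CREATE VIEW too). the view name user_details is hypothetical, the users table is stripped of its shortcut columns, and only the administrative contact rows from the question are loaded to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE clients (
    id INTEGER PRIMARY KEY,
    company_id INTEGER UNIQUE,          -- each company can be a client once
    referring_affiliate_id INTEGER
);
CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contacts_type (contact_id INTEGER, company_id INTEGER, contact_type TEXT);
-- users without the client_id / affiliate_id / company_id shortcut columns
CREATE TABLE users (id INTEGER PRIMARY KEY, contact_id INTEGER UNIQUE, user_id TEXT);

INSERT INTO company VALUES
    (1, 'Acme, Inc.'), (2, 'Spacely Space Sprockets'),
    (3, 'Cogswell Cogs'), (4, 'Stark Industries');
INSERT INTO clients VALUES (1, 2, 1), (2, 3, 1), (3, 4, 3);
INSERT INTO contacts VALUES
    (1, 'Wylie Coyote'), (2, 'Cosmo Spacely'),
    (3, 'H. G. Cogswell'), (4, 'Tony Stark');
INSERT INTO contacts_type VALUES
    (1, 1, 'Administrative'), (2, 2, 'Administrative'),
    (3, 3, 'Administrative'), (4, 4, 'Administrative');
INSERT INTO users VALUES (1, 1, 'wylie'), (2, 2, 'cosmo'), (3, 3, 'cogswell'), (4, 4, 'ironman');

-- the joins are written once here and reused by every query
CREATE VIEW user_details AS
SELECT u.user_id       AS login,
       ct.name         AS contact_name,
       co.company_name AS company_name,
       cl.id           AS client_id      -- NULL when the company is not a client
FROM users AS u
JOIN contacts      AS ct ON ct.id = u.contact_id
JOIN contacts_type AS t  ON t.contact_id = u.contact_id
JOIN company       AS co ON co.id = t.company_id
LEFT JOIN clients  AS cl ON cl.company_id = co.id;
""")

# at login time the lookup is a single select against the view
cosmo = conn.execute(
    "SELECT * FROM user_details WHERE login = ?", ("cosmo",)
).fetchone()
wylie = conn.execute(
    "SELECT * FROM user_details WHERE login = ?", ("wylie",)
).fetchone()

print(cosmo, wylie)
```

the left join answers the "i wouldn't know if the user has an entry" worry: a missing client row simply shows up as NULL instead of dropping the user from the result.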