I have a website into which I am building "categories" that would work much like the tags on StackOverflow.
What I am confused about is how best to structure the tables for this sort of thing. For example, I know I'd need a table for the categories themselves: the name, who made it, what date it was made, etc.
What I am not sure about is how to store it in the database when a record gets n different categories. Should I keep the category ids in the item table and just comma-separate the ids? Or should I have a separate table with something like item_categories with item_id, category_id, etc...and just join that table and the item table, and the categories table when getting the category?
The latter seems slow because of the join, but more organized and clean.
Or is there another way to structure this that I have not thought of? What is a good way to go about structuring this sort of data?
Make three tables: one for the pages, one for the categories (along with meta information etc.), and one to bind them together. That last table only needs a pageid and a categoryid to link records from both tables together.
Don't ever store comma-separated values in a database if you need to use them in joins or searches.
You should use a separate table like you say. It's called normalization.
If you're using a column with comma separated values think of the performance when accessing the values versus doing a join. You will have to split every value and then do a comparison to see if there's a match.
Or should I have a separate table with something like item_categories with item_id, category_id, etc...and just join that table and the item table, and the categories table when getting the category?
Yes, this. That's a classic M:N relationship in SQL.
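A minimal sketch of the three-table layout the answers describe, using Python's sqlite3 module as a stand-in for MySQL. The table names follow the question (items, categories, item_categories); the sample rows and the helper functions categories_for and items_in are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; sample data is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE categories (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    created_by TEXT,
    created_at TEXT
);
-- Link table: one row per (item, category) pair, nothing else.
CREATE TABLE item_categories (
    item_id     INTEGER NOT NULL REFERENCES items(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (item_id, category_id)
);
INSERT INTO items (id, name) VALUES (1, 'First page'), (2, 'Second page');
INSERT INTO categories (id, name) VALUES (1, 'mysql'), (2, 'schema'), (3, 'php');
INSERT INTO item_categories (item_id, category_id) VALUES (1, 1), (1, 2), (2, 1);
""")

def categories_for(item_id):
    """Return the category names attached to one item."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM item_categories AS ic
        JOIN categories AS c ON c.id = ic.category_id
        WHERE ic.item_id = ?
        ORDER BY c.name
        """,
        (item_id,),
    )
    return [name for (name,) in rows]

def items_in(category_name):
    """Return the item names filed under one category."""
    rows = conn.execute(
        """
        SELECT i.name
        FROM categories AS c
        JOIN item_categories AS ic ON ic.category_id = c.id
        JOIN items AS i ON i.id = ic.item_id
        WHERE c.name = ?
        ORDER BY i.id
        """,
        (category_name,),
    )
    return [name for (name,) in rows]

item_one_categories = categories_for(1)
mysql_items = items_in("mysql")
```

Both lookups go through the composite primary key of the link table, so the join the question worries about is an indexed lookup rather than a string split over every row.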
Related
I'm working on the database (MySQL) - car dealership. Since the product (car) has a lot of features and unique values (gearbox, model, manufacturer...), I wonder, how to create a well designed database for it.
Should I use:
Table cars
columns -> id, name, manufacturer, model, gearbox...
Or:
Table cars
columns -> id, name, manufacturer_id, gearbox_id...
Table manufacturers
columns -> id, name
Table gearbox
columns -> id, name
There are a lot of unique values as I mentioned and I think it's not good to store them again and again, but if I create a lot of tables + link them with link table to product table (car), there will be a lot of joins when I make a query to get all of the values.
And these are only a few of them; there are many more values I need to store for every product in the database.
You have 3 options here:
You could store each car as a separate table and then have a row corresponding to the gearbox, etc. This is awful, no one does it, don't do it.
You could serialize all the gearbox, etc. data as JSON strings and put them in your car cells. This is also awful, some people have stupidly done this, but not that often. Don't do it.
You could do things the normal, good way and implement separate tables for every class of object with foreign keys linking them. This is the way to go.
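As a rough illustration of the third option, here is the second layout from the question (cars with manufacturer_id and gearbox_id) written out with foreign keys. SQLite is used as a stand-in for MySQL, the table name gearboxes and the sample rows are assumptions, and only two of the many lookup tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE manufacturers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE gearboxes (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE cars (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    manufacturer_id INTEGER NOT NULL REFERENCES manufacturers(id),
    gearbox_id      INTEGER NOT NULL REFERENCES gearboxes(id)
);
INSERT INTO manufacturers (id, name) VALUES (1, 'Volkswagen'), (2, 'Toyota');
INSERT INTO gearboxes (id, name) VALUES (1, 'manual'), (2, 'automatic');
INSERT INTO cars (id, name, manufacturer_id, gearbox_id)
VALUES (1, 'Golf', 1, 1), (2, 'Corolla', 2, 2), (3, 'Polo', 1, 1);
""")

# One query with two joins returns the readable values for every car.
car_rows = conn.execute("""
    SELECT c.name, m.name, g.name
    FROM cars AS c
    JOIN manufacturers AS m ON m.id = c.manufacturer_id
    JOIN gearboxes AS g ON g.id = c.gearbox_id
    ORDER BY c.id
""").fetchall()

# The foreign key rejects a car that points at a manufacturer that does not exist.
try:
    conn.execute(
        "INSERT INTO cars (name, manufacturer_id, gearbox_id) VALUES ('Ghost', 99, 1)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Each manufacturer and gearbox name is stored once, and the joins are on integer primary keys, which is the cheap case for a relational database.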
I have a table called UserComments.
It contains 3 columns:
id, user_id, and comment_id.
I query this table two separate ways: once by user id and once by comment id. Both of these fields are indexed.
I want to add an additional column tags.
I will only need this column when querying by comment id.
Does it make more sense to add the column to the existing table (and not return it back to avoid data transfer)?
OR
Create a new table and perform the join when necessary?
Why is one better than the other?
You should use a separate table for the specific purpose of tags.
Let's take this Stack Overflow question as an example. You have created a question with 3 tags. This means that ONE comment has THREE tags, or in other words a one-to-many relationship.
The proper way to model one-to-many is with a separate table. Now, let's look at the differences.
One Table:
You will have one table. You will not be able to model a one-to-many so you will have to create your own method for having multiple tags such as a CSV for the tags.
example:
id, user_id, comment_id, tags
'2', '276', '2738', 'mysql,sql,sql-server'
Can you see how this is getting confusing already? You will need to write your own code to parse out the CSV. Now, imagine you wanted to search by tag. Oh man... the nightmare that will become... and the slowness if you use a SQL regex or LIKE...
On the other hand, a two-table design would have a second table:
comment_id, tag
123, mysql
123, sql
123, sql-server
You grab all entries with 123, you have your list. Now if you want to search by tag, EASY.
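To make the comparison concrete, here is a small sketch of both variants side by side, with SQLite standing in for MySQL. The table names user_comments_csv and comment_tags and the sample rows are assumptions for illustration; the columns follow the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One-table variant: tags packed into a CSV string.
CREATE TABLE user_comments_csv (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    comment_id INTEGER NOT NULL,
    tags       TEXT
);
INSERT INTO user_comments_csv VALUES (2, 276, 2738, 'mysql,sql-server');

-- Two-table variant: one row per (comment, tag).
CREATE TABLE comment_tags (
    comment_id INTEGER NOT NULL,
    tag        TEXT NOT NULL,
    PRIMARY KEY (comment_id, tag)
);
CREATE INDEX idx_comment_tags_tag ON comment_tags (tag);
INSERT INTO comment_tags VALUES
    (123, 'mysql'), (123, 'sql'), (123, 'sql-server'),
    (2738, 'mysql'), (2738, 'sql-server');
""")

# Searching the CSV column for the tag 'sql' with LIKE also matches
# 'mysql' and 'sql-server', so comment 2738 comes back by mistake.
csv_hits = [row[0] for row in conn.execute(
    "SELECT comment_id FROM user_comments_csv WHERE tags LIKE '%sql%'"
)]

# The same search against comment_tags is an exact, indexable comparison.
tag_hits = [row[0] for row in conn.execute(
    "SELECT comment_id FROM comment_tags WHERE tag = ? ORDER BY comment_id",
    ("sql",),
)]

# All tags of one comment, no string parsing needed.
tags_123 = [row[0] for row in conn.execute(
    "SELECT tag FROM comment_tags WHERE comment_id = ? ORDER BY tag",
    (123,),
)]
```

The false match in csv_hits is the kind of bug that hand-rolled CSV searching tends to produce; the pattern can be tightened, but a leading-wildcard LIKE still cannot use an ordinary index.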
My guess is you already have a separate table somewhere else for users, and you grab all of a user's comments using this comment table. You did that inherently because users and comments are a one-to-many relationship. Same concept here.
Adding as answer because consensus agrees:
Generally speaking, more tables is better. Reason being, you want to avoid redundant data. Your User table should be on its own. Your comments table should have its own ID and a field for the UserID - join on that. And subsequent things you need that are not comments or new users should have their own tables with the same scheme.
From this you will have the benefit of having your Users sitting on their own, and be able to easily join each user to an indefinite number of comments with no redundancy.
I would do something like this. I would create a table just for tags rather than having a column containing n instances of, say, the 'sql-server' tag, when you can relate it to a Tag table. So sql-server has an id of 1. An int 1 takes less space than the varchar 'sql-server', and the design is easy to expand on.
Comment
CommentID
..etc
UserComment
UserCommentID
CommentID
UserID
CommentTag
CommentTagID
UserCommentID
TagID
Tag
TagID
Description
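A short sketch of the Tag and CommentTag part of that layout, again with SQLite standing in for MySQL. The column names come from the listing above; the sample rows, the UNIQUE constraint, and the helper tags_for are assumptions. It shows one practical payoff of storing the tag text once: renaming a tag is a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tag (
    TagID       INTEGER PRIMARY KEY,
    Description TEXT NOT NULL UNIQUE
);
CREATE TABLE CommentTag (
    CommentTagID  INTEGER PRIMARY KEY,
    UserCommentID INTEGER NOT NULL,
    TagID         INTEGER NOT NULL REFERENCES Tag(TagID),
    UNIQUE (UserCommentID, TagID)   -- assumed: a tag is attached at most once
);
INSERT INTO Tag (TagID, Description) VALUES (1, 'sql-server'), (2, 'mysql');
INSERT INTO CommentTag (UserCommentID, TagID) VALUES (10, 1), (10, 2), (11, 1);
""")

def tags_for(user_comment_id):
    """Return the tag descriptions attached to one UserComment row."""
    rows = conn.execute(
        """
        SELECT t.Description
        FROM CommentTag AS ct
        JOIN Tag AS t ON t.TagID = ct.TagID
        WHERE ct.UserCommentID = ?
        ORDER BY t.Description
        """,
        (user_comment_id,),
    )
    return [description for (description,) in rows]

before = tags_for(11)
# Renaming the tag touches one row in Tag; every CommentTag row follows.
conn.execute("UPDATE Tag SET Description = 'mssql' WHERE TagID = 1")
after = tags_for(11)
```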
I am designing a relational database of products where there are products that are copies/bootlegs of each other, and I'd like to be able to show that through the system.
So initially during my first draft, I had a field in products called " isacopyof " thinking to just list a comma delimited list of productIDs that are copies of the current product.
Obviously once I started implementing, that wasn't going to work out.
So far, most many-to-many relationship solutions revolve around an associative table listing related id from table A and related id from table B. That works, but my situation involves related items from the SAME table of products...
How can I build a solution around that ? Or maybe I am thinking in the wrong direction ?
You're overthinking.
If you have a products table with a productid key, you can have a clones table with productid1 and productid2 fields mapping from products to products and a multi-key on both fields. No issue, and it's still 3NF.
Because something is a copy, that means you have a parent and child relationship... Hierarchical data.
You're on the right track for the data you want to model. Rather than have a separate table to hold the relationship, you can add a column to the existing table to hold the parent_id value--the primary key value indicating the parent to the current record. This is an excellent read about handling hierarchical data in MySQL...
Sadly, MySQL before version 8.0 doesn't have hierarchical query syntax, so for things like these I highly recommend looking at databases that do:
PostgreSQL (free)
SQL Server (Express is free)
Oracle (Express is also free)
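A sketch of the parent_id idea with a hierarchical query, run here on SQLite because it ships with Python. The recursive common table expression below is one common form of hierarchical query syntax; the exact keywords differ slightly between the engines listed above. The products table layout and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES products(id)   -- NULL for an original
);
INSERT INTO products (id, name, parent_id) VALUES
    (1, 'Original', NULL),
    (2, 'Bootleg A', 1),
    (3, 'Copy of bootleg A', 2),
    (4, 'Unrelated', NULL);
""")

def copies_of(product_id):
    """Return (name, depth) for every direct or indirect copy of a product."""
    return conn.execute(
        """
        WITH RECURSIVE descendants(id, name, depth) AS (
            -- Anchor: direct copies of the requested product.
            SELECT id, name, 1 FROM products WHERE parent_id = ?
            UNION ALL
            -- Recursive step: copies of the copies found so far.
            SELECT p.id, p.name, d.depth + 1
            FROM products AS p
            JOIN descendants AS d ON p.parent_id = d.id
        )
        SELECT name, depth FROM descendants ORDER BY depth, id
        """,
        (product_id,),
    ).fetchall()

all_copies = copies_of(1)
```

A single parent_id column fits when every copy has exactly one source; if a product can be a copy of several others, a separate mapping table is the better fit.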
There's no reason you can't have links to the same product table in your 'links' table.
There are a few ways to do this, but a basic design might simply be 2 columns:
ProductID1, ProductID2
Where both these columns link back to ProductID in your product table. If you know which is the 'real' product and which is the copy, you might have logic/constraints which place the 'real' productID in ProductID1 and the 'copy' productID in ProductID2.
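Something along these lines would do for the self-referencing link table, sketched with SQLite in place of MySQL. The names clones, productid1 and productid2 come from the answers above; the convention that productid1 is the real product, the CHECK constraint, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    productid INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Both columns point back at the same products table.
CREATE TABLE clones (
    productid1 INTEGER NOT NULL REFERENCES products(productid),  -- the real product
    productid2 INTEGER NOT NULL REFERENCES products(productid),  -- the copy
    PRIMARY KEY (productid1, productid2),
    -- A product cannot be a copy of itself. Older MySQL versions parse
    -- CHECK constraints but do not enforce them.
    CHECK (productid1 <> productid2)
);
INSERT INTO products (productid, name) VALUES
    (1, 'Original figure'), (2, 'Bootleg A'), (3, 'Bootleg B'), (4, 'Unrelated');
INSERT INTO clones (productid1, productid2) VALUES (1, 2), (1, 3);
""")

def copies_of(productid):
    """Names of the products recorded as copies of the given product."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM clones AS c
        JOIN products AS p ON p.productid = c.productid2
        WHERE c.productid1 = ?
        ORDER BY p.productid
        """,
        (productid,),
    )
    return [name for (name,) in rows]

def originals_of(productid):
    """Names of the products that the given product is a copy of."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM clones AS c
        JOIN products AS p ON p.productid = c.productid1
        WHERE c.productid2 = ?
        ORDER BY p.productid
        """,
        (productid,),
    )
    return [name for (name,) in rows]

figure_copies = copies_of(1)
bootleg_sources = originals_of(3)
```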
Say I have a store that sells products that fall under various categories... and each category has associated properties... like a drill bit might have coating, diameter, helix angle, or whatever. The issue is that I'd like the user to be able to edit these properties. If I wasn't interested in having the user change the properties, and I was building the store for a certain set of categories, I'd have one table for drill bits, etc. Alternatively, I could just modify the schema online but that doesn't seem to be done very often (unless we're talking phpmyadmin or something), and plus that doesn't fit in well at all with the way models are coupled to tables.
In general, I'm interested in implementing a multi-table database structure with various datatypes (because diameter might be a decimal, coating would be a string/index into a table, etc), within mysql. Any idea how this might be done?
If I understand correctly what you're asking, an admittedly hacky solution would be to have a products table that has two related tables, product_properties and product_properties_lookup (or some better name), where product_properties_lookup has an entry for every possible property a product can have, and product_properties contains the value of a property as a string along with the ID of the property and the ID of the product. You could then coerce the property value into whatever type you wanted. Not ideal, but I'm not sure what else to do short of adding individual columns to the DB for property types.
Just use the database. It does all of this already. For free. And fast. How is having a table of products point to a table of properties with data types any different from a table with columns? It's not. Except that if you use the DB's tables you get to use SQL to query them in all sorts of neat and efficient ways compared to your own (crosstabs suck in SQL DBs).
Get a new product, make a new table. No big deal. Get a new property, alter the table. If you have 1M products in that table, yeah, it may be a slow update (depends on the DB). Do you have 1M products? I don't think WalMart has 1M products.
Building Databases on top of Databases is a silly thing. Just use the one that's there. It is putty in your hands. Mold it to your whim.
Create a Property table first. This will contain all properties. It should have (at minimum) a Name column and a Type column ('string', 'boolean', 'decimal', etc.). Note: Primary keys are implied for all these tables.
Next, create a CategoryProperty table. Here you will be able to assign properties to a category. It should have these columns: CategoryID, PropertyID. Both foreign keys.
Then, create a Category table. This describes the categories. It should have a Name column and possibly some other columns like Description.
Then, create a ProductCategory table. Here, you will assign the categories for each product. It should have these columns: CategoryID, ProductID. Both foreign keys.
Next, create a PropertyValue table. Here, you will "instantiate" the properties and give them values. Columns include ProductID, PropertyID, and PropertyValue. The primary key can consist of ProductID and PropertyID.
Finally, create a Product table that just describes each product with columns like Name, Price, etc.
Note how for each relationship there is a separate table. If you only want one category for each product, you can do away with the ProductCategory table and just put a CategoryID field in the Product table. Similarly, if you want each property to belong to only one category, you can put a PropertyID column in the Category table and get rid of the CategoryProperty table.
Lastly, you will not be able to verify the data type for each property since each property has a different type (and they are rows, not columns). So just make the PropertyValue column a string and then perform your validation either as a trigger, or in your application, by checking the Type column of the Property table for that property.
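A cut-down sketch of the Property, Product and PropertyValue tables with the validation done in the application, as the last paragraph suggests. SQLite stands in for MySQL, the Category tables are left out to keep it short, and the helpers parse_boolean, set_property and get_properties as well as the accepted boolean spellings are assumptions.

```python
import sqlite3
from decimal import Decimal, InvalidOperation

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Property (
    PropertyID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL,
    Type       TEXT NOT NULL          -- 'string', 'boolean' or 'decimal'
);
CREATE TABLE Product (
    ProductID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL
);
CREATE TABLE PropertyValue (
    ProductID     INTEGER NOT NULL REFERENCES Product(ProductID),
    PropertyID    INTEGER NOT NULL REFERENCES Property(PropertyID),
    PropertyValue TEXT NOT NULL,      -- always stored as a string
    PRIMARY KEY (ProductID, PropertyID)
);
INSERT INTO Property VALUES (1, 'coating', 'string'), (2, 'diameter', 'decimal');
INSERT INTO Product VALUES (1, 'Drill bit');
""")

def parse_boolean(text):
    if text not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return text == "true"

# Maps Property.Type to the function that checks and converts a stored string.
PARSERS = {"string": str, "decimal": Decimal, "boolean": parse_boolean}

def set_property(product_id, property_id, raw_value):
    """Validate raw_value against Property.Type, then store it as text."""
    row = conn.execute(
        "SELECT Type FROM Property WHERE PropertyID = ?", (property_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown property {property_id}")
    try:
        PARSERS[row[0]](raw_value)
    except (InvalidOperation, ValueError) as exc:
        raise ValueError(f"{raw_value!r} is not a valid {row[0]}") from exc
    conn.execute(
        "INSERT INTO PropertyValue VALUES (?, ?, ?)",
        (product_id, property_id, raw_value),
    )

def get_properties(product_id):
    """Return {property name: typed value} for one product."""
    rows = conn.execute(
        """
        SELECT p.Name, p.Type, v.PropertyValue
        FROM PropertyValue AS v
        JOIN Property AS p ON p.PropertyID = v.PropertyID
        WHERE v.ProductID = ?
        """,
        (product_id,),
    )
    return {name: PARSERS[type_](value) for name, type_, value in rows}

set_property(1, 1, "TiN")
set_property(1, 2, "6.35")
try:
    set_property(1, 2, "wide")   # not a decimal, so it is refused
    rejected = False
except ValueError:
    rejected = True
drill_bit = get_properties(1)
```

The same check could live in a trigger instead; doing it in the application keeps the type rules in one dictionary that is easy to extend when a new Type value is introduced.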
If you're using a recentish version of MySQL (5.1.5 or greater) you can store your data as XML in the database. You can then query that data using things like this.
Suppose I have a table that contains some items and I have a widgetpack that contains numerous widgets. I can get my total number of widgets:
SELECT SUM( EXTRACTVALUE( infoxml, '/info/widget_count/text()' ) ) AS widget_count
FROM items -- table name assumed; the original does not name it
WHERE product_type = 'widgetpack'
assuming the table has an infoxml column and each widgetpack's infoxml column contains XML that looks like this
<info>
<widget_count>10</widget_count>
<!-- Any other unstructured info can go in here too -->
</info>
DB purists will cringe at this, and it is kinda hacky. But often it's easier to keep all your unstructured data in one place.
Have a look at this database schema on DatabaseAnswers.org:
http://www.databaseanswers.org/data_models/products_and_generic_characteristics/index.htm
Maybe consider an Entity-Attribute-Value (EAV) approach (not for the whole model of course!).
Related questions
Entity Attribute Value Database vs. strict Relational Model Ecommerce question
Approach to generic database design
How do you build extensible data model
The best way to describe this scenario is to use an example. Consider Netflix: do they store their orders (DVDs they mail out) in a separate table from their member lists (NOT the members table, but a joiner table of members and movies, i.e. a list of movies each member has created), or are orders distinguished by using additional information in the same row of the same table?
For those not familiar with Netflix, imagine a service that lets you create a wish list of movies. This wish list is subsequently sent to you incrementally, say two movies at a time.
I would like to implement a similar idea using a MySQL database, but I am unsure whether to create two tables (one for orders and one for lists) and dynamically move items from the lists table to the orders table (this process should be semi-automatic based on the member returning an item, where before a new one is sent out, a table with some controls will be checked to see if the user is still eligible/has not gone over his monthly limit)...
Thoughts and pros and cons would be fantastic!
EDIT: my current architecture is: member, items, members_items. What I am asking is whether to store orders in the same table as members_items or create a separate table.
Moving rows from one database table to another to change their status is simply bad practice. In an RDBMS, you relate rows from one table to other rows in other tables using primary and foreign key constraints.
As for your example, I see about four tables just to get started. Comparing this to Netflix, the granddaddy of movie renting, is a far cry from reality. Just keep that in mind.
A User table to house your members.
A Movie table that knows about all of the available movies.
A Wishlist or Queue table that has a one-to-many relationship between a User and Movies.
An Order or Rental table that maps users to the movies that are currently at home.
Statuses of the movies in the Movie table could be in yet another table where you relate a User to a Movie to a MovieStatus or something, which brings your table count to 6. To really lay this out and design it properly you may end up with even more, but hopefully this sort of gives you an idea of where to begin.
EDIT: Saw your update on exactly what you're looking for. I thought you were designing from scratch. The simple answer to your question is: have two tables. Wishlists (or members_items as you have them) and Orders (member_orders?) are fundamentally different so keeping them separated is my suggestion.
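Here is a rough sketch of the two-table suggestion, with SQLite standing in for MySQL. The wish list stays in members_items and shipping an item only adds a row to member_orders, so nothing is moved between tables. The column names, the MAX_OUT limit of two items at a time, and the helpers ship_next and return_item are assumptions for illustration.

```python
import sqlite3

MAX_OUT = 2  # assumed limit: how many items a member may have at home

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE items   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- The wish list: which items a member wants, and in what order.
CREATE TABLE members_items (
    member_id INTEGER NOT NULL REFERENCES members(id),
    item_id   INTEGER NOT NULL REFERENCES items(id),
    position  INTEGER NOT NULL,
    PRIMARY KEY (member_id, item_id)
);
-- The orders: one row per item actually sent out.
CREATE TABLE member_orders (
    id          INTEGER PRIMARY KEY,
    member_id   INTEGER NOT NULL REFERENCES members(id),
    item_id     INTEGER NOT NULL REFERENCES items(id),
    shipped_at  TEXT NOT NULL,
    returned_at TEXT                  -- NULL while the item is still out
);
INSERT INTO members VALUES (1, 'Ann');
INSERT INTO items VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
INSERT INTO members_items VALUES (1, 1, 1), (1, 2, 2), (1, 3, 3);
""")

def ship_next(member_id):
    """Ship wish-list items not yet ordered, up to the free slots; return their titles."""
    out = conn.execute(
        "SELECT COUNT(*) FROM member_orders WHERE member_id = ? AND returned_at IS NULL",
        (member_id,),
    ).fetchone()[0]
    slots = MAX_OUT - out
    if slots <= 0:
        return []
    rows = conn.execute(
        """
        SELECT mi.item_id, i.title
        FROM members_items AS mi
        JOIN items AS i ON i.id = mi.item_id
        WHERE mi.member_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM member_orders AS o
              WHERE o.member_id = mi.member_id AND o.item_id = mi.item_id
          )
        ORDER BY mi.position
        LIMIT ?
        """,
        (member_id, slots),
    ).fetchall()
    for item_id, _title in rows:
        conn.execute(
            "INSERT INTO member_orders (member_id, item_id, shipped_at) "
            "VALUES (?, ?, CURRENT_TIMESTAMP)",
            (member_id, item_id),
        )
    return [title for _item_id, title in rows]

def return_item(member_id, item_id):
    """Mark an outstanding order as returned."""
    conn.execute(
        "UPDATE member_orders SET returned_at = CURRENT_TIMESTAMP "
        "WHERE member_id = ? AND item_id = ? AND returned_at IS NULL",
        (member_id, item_id),
    )

first_batch = ship_next(1)     # two free slots
second_batch = ship_next(1)    # both slots are taken
return_item(1, 1)
third_batch = ship_next(1)     # one slot freed by the return
wishlist_size = conn.execute("SELECT COUNT(*) FROM members_items").fetchone()[0]
```

Eligibility rules such as a monthly limit would be further conditions inside ship_next; the wish list itself never changes when an order is created.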
A problem with storing orders in the members table is that there's a variable number (0, 1, or several) of orders per member. The way to do this using a relational database is to have two separate tables.
I feel like they would store their movies as follows (simplified of course):
tables:
Titles
Members
Order
Order_Has_Titles
This way an order, which has a foreign key to Members, would then have a pivot table, as many orders could have many titles as part of them.
When you have a many-to-many relationship in the database you then need to create a pivot table:
Order_Has_Titles:
ID (auto-inc)
Order_FkId (int 11)
Title_FkId (int 11)
This way you're able to put multiple movies in each order.
Of course this is simplified, and you would have many other components which would be a part of it; however, at a basic level, you can see it here.