Database design of a tree-like category system - mysql

I'm using the Adjacency List Model to create categories, and it works perfectly.
When retrieving articles in a certain category (for example electronics), I would like to also retrieve the articles in the sub categories (for example electronics->cameras, or even electronics->cameras->camera lenses).
The way I am doing it now is pulling from the DB all the category IDs of the sub categories of electronics, and finding all articles with a category_id in this list.
This seems to me very inefficient and time-consuming, since this could result in many queries to retrieve these sub categories.
Another way I thought of doing this is having every article associated with the whole category tree (for example an article about camera lenses will also be associated with the camera and electronics categories in a MANY_MANY table), so that when I retrieve all articles in electronics it will also appear.
This would add a lot of redundant data to the database though, as I might have to store 3 or 4 categories for each article. Also, it would complicate actions like moving an article to another category.
Is this the right way to go? Or is there a better/simpler way that I have not thought of?
Any help appreciated!

Have a read of this article about Nested Set Modelling: Managing Hierarchical Data in MySQL.
Using the suggested technique, you can get entire trees or subtrees in one single SELECT. It's a bit more complicated than the "normal" approach, but it's totally worth it if you're going to be doing lots of reads from the table.
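To make the single-SELECT claim concrete, here is a minimal sketch of a nested set lookup. It uses Python's built-in sqlite3 rather than MySQL so it runs anywhere; the table names, column names and sample rows are all made up for illustration, and the lft/rgt numbers were assigned by hand.

```python
import sqlite3

# Hypothetical schema and data, not taken from the linked article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    title TEXT,
    category_id INTEGER REFERENCES categories(id)
);

-- electronics(1,8) contains cameras(2,5) and phones(6,7);
-- cameras contains camera lenses(3,4); books(9,10) is a separate tree.
INSERT INTO categories VALUES
    (1, 'electronics', 1, 8),
    (2, 'cameras', 2, 5),
    (3, 'camera lenses', 3, 4),
    (4, 'phones', 6, 7),
    (5, 'books', 9, 10);
INSERT INTO articles VALUES
    (1, 'Buying a TV', 1),
    (2, 'DSLR basics', 2),
    (3, 'Prime vs zoom', 3),
    (4, 'Best novels', 5);
""")

def articles_in_subtree(conn, category_name):
    """Return titles of articles in a category and all of its descendants."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM categories AS parent
        JOIN categories AS node ON node.lft BETWEEN parent.lft AND parent.rgt
        JOIN articles AS a ON a.category_id = node.id
        WHERE parent.name = ?
        ORDER BY a.id
        """,
        (category_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

The price is paid on writes: inserting or moving a category means renumbering lft/rgt for part of the tree, which is why this fits read-heavy tables best.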

Related

Form Functionality Versus Table Storage

I have some experience getting data out of databases, but not so much in design. To work on this, I'm taking some personal projects and trying to create them in Access. I've run across an issue that I was able to find a solution to, but I find it clumsy and was hoping for some opinions on what I can do better.
My current project is a monthly budget. At this stage I would like a Form to appear as follows:
Category
    Sub-Category    Budget_Amount
    Sub-Category    Budget_Amount
Category
    Sub-Category    Budget_Amount
etc.
I found that I can do this if all the sub-categories are the fields in a table and the category names are hard-coded as labels in the form. However, I would like my table structure to be like this:
Category:
    ID
    Category_Name
Sub-Category:
    ID
    Sub-Category
    ID_Category
    Show_Category
Budget:
    Id
    Sub-CategoryID
    Budget_Amount
The reason I want this structure is that not all sub-categories will be used every month, and in my mind it will be easier to match what was budgeted versus what was spent. I am also trying to practice minimizing space taken up by the database. Are there any ways to do this easily? Or am I restricted to my current solution?
I would dive into normalization principles first, as you don't seem to be too familiar with them yet.
"The reason I want this structure..." is not a good reason at all.
Your basic idea is a sound, normalized database structure.
I'd suggest you structure your tables like this:
tblCategory -> CatID (autonumber PK), Categorie
tblSubCategory -> SubCatID (autonumber, PK), CatID (number, FK), Subcategory
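As a rough sketch of those two tables, here they are in SQLite via Python's sqlite3, standing in for Access. The table and field names are the ones suggested above; AUTOINCREMENT is only an approximation of an Access autonumber, and the sample rows and the helper function are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE tblCategory (
    CatID INTEGER PRIMARY KEY AUTOINCREMENT,
    Categorie TEXT NOT NULL
);
CREATE TABLE tblSubCategory (
    SubCatID INTEGER PRIMARY KEY AUTOINCREMENT,
    CatID INTEGER NOT NULL REFERENCES tblCategory(CatID),
    Subcategory TEXT NOT NULL
);
-- Made-up sample data
INSERT INTO tblCategory (Categorie) VALUES ('Housing'), ('Food');
INSERT INTO tblSubCategory (CatID, Subcategory) VALUES
    (1, 'Rent'), (1, 'Utilities'), (2, 'Groceries');
""")

def subcategories_by_category(conn):
    """Group sub-category names under their category name, in key order."""
    rows = conn.execute("""
        SELECT c.Categorie, s.Subcategory
        FROM tblCategory AS c
        JOIN tblSubCategory AS s ON s.CatID = c.CatID
        ORDER BY c.CatID, s.SubCatID
    """).fetchall()
    grouped = {}
    for category, subcategory in rows:
        grouped.setdefault(category, []).append(subcategory)
    return grouped
```

A form that shows each category with its sub-categories underneath can be driven by this kind of join instead of hard-coded labels.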
As for budgets, I'd expect that budgets tie into projects. There's no logical reason to tie budgets directly into any kind of category.
We would need more, and especially more explicit, information on what you're trying to achieve.
Similar question:
Database Design Question - Categories / Subcategories
There are sites that could help you start understanding design in MS Access, for instance
http://www.functionx.com/access/index.htm

MySql create a link between 1 or 2 tables

Is there a cleaner way to do this?
The Products table is linked to the sub_categories table, but if a category has no sub_categories, I link the Products table to the categories table instead.
There are a number of ways to achieve this. Depending on how deep your categories go and on your preferred implementation, you can use either of the following approaches.
Adjacency List model
A single categories table with a self referencing parent_id column that is populated for each sub category.
Nested Set model
A single categories table with "lft" and "rgt" columns to denote the position within the set. "lft" and "rgt" mean Left and Right respectively as "LEFT" and "RIGHT" are reserved words in SQL.
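For the Adjacency List option, a minimal sketch of how a single categories table removes the need for two different links might look like this. It uses Python's sqlite3 so it is self-contained; the table names, columns and rows are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES categories(id)  -- NULL for a top-level category
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
INSERT INTO categories VALUES
    (1, 'Clothing', NULL),
    (2, 'Shirts', 1),
    (3, 'Gift cards', NULL);     -- has no sub-categories
INSERT INTO products VALUES
    (1, 'Oxford shirt', 2),      -- linked to a sub-category
    (2, '50 EUR gift card', 3);  -- linked straight to a top-level category
""")

def product_paths(conn):
    """Return (product, category path) pairs, one level of parent deep."""
    rows = conn.execute("""
        SELECT p.name, c.name, parent.name
        FROM products AS p
        JOIN categories AS c ON c.id = p.category_id
        LEFT JOIN categories AS parent ON parent.id = c.parent_id
        ORDER BY p.id
    """).fetchall()
    paths = []
    for product, category, parent in rows:
        path = category if parent is None else f"{parent} > {category}"
        paths.append((product, path))
    return paths
```

Every product has exactly one foreign key, whether its category is a leaf or a top-level entry, so there is no special case for categories without sub-categories.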
There is a fantastic full blog post with examples and diagrams explaining in great detail how both these approaches work - here.
I would also recommend looking at libraries in your chosen language that may take some of the work out of what you want to achieve.

Relationship database design - object specific many to many, do I solve with self join table or new table

Being new to relational database design, I am trying to clarify one piece of information to properly design this database. Although I am using Filemaker as the platform, I believe this is a universal question.
I am working from the logic that, ideally, all relationships should be one to many, with separate tables or join tables used to achieve this.
I have a database with multiple products, made by multiple brands, in multiple product categories. I also want this to be as scalable as possible when it comes to reporting, being able to slice and dice the data in as many ways as possible since the needs of the users are constantly changing.
So when I ask the question "Does each Brand have multiple products" I get a yes, and "Does each product have multiple brands" the answer is no. So this is a one to many relationship, but it also seems that a self-join table might give me everything that I need.
This methodology also seems to go down a rabbit hole for other "product related" information such as product category, each product is tied to one product category, but only one product category is related to a product.
So I see 2 possibilities, make three tables and join them with primary and foreign keys, one for Brand, one for Product Category, and one for Products.
Or the second possibility is to create one table that has the brand and product category and product info all in one table (since they are all product related) and simply do self-joins and other query based tables to give me the future reporting requirements that will be changing over time.
I am looking for input from experiences that might point me in the right direction.
Thanks in advance!
Could you ever want to store additional information about a brand (company URL, phone number, etc.) or about a product category (description, etc.)?
If the answer is yes, you definitely want to use three tables. If you don't, you'll be repeating all that information for every single item that belongs to the same brand or same category.
If the answer is no, there is still an advantage to using three tables - it will prevent typos or other spelling inconsistencies from getting into your database. For example, it would prevent you from writing a brand as "Coca Cola" for some items and as "Coca-Cola" for other items. These inconsistencies get harder and harder to find and correct as your database grows. By having each brand listed only once in its own table, it will always be written the same way.
The disadvantage of multiple tables is the SQL for your queries is more complicated. There's definitely a tradeoff, but when in doubt, normalize into multiple tables. You'll learn when it's better to de-normalize with more experience.
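A small sketch of the three-table layout, again using Python's sqlite3 so it can be run as-is. All table names, column names and rows are invented for illustration (the URL is a placeholder), and Filemaker would express the same relationships through its own relationship graph rather than SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE brands (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,   -- each brand is spelled exactly once
    url TEXT                     -- extra brand info lives here, not on products
);
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    brand_id INTEGER NOT NULL REFERENCES brands(id),
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
INSERT INTO brands VALUES (1, 'Coca-Cola', 'https://example.com'), (2, 'Acme', NULL);
INSERT INTO categories VALUES (1, 'Soft drinks'), (2, 'Hardware');
INSERT INTO products VALUES
    (1, 'Classic 330ml', 1, 1),
    (2, 'Zero 330ml', 1, 1),
    (3, 'Anvil', 2, 2);
""")

def product_counts(conn):
    """Count products per (brand, category): one example of a reporting slice."""
    return conn.execute("""
        SELECT b.name, c.name, COUNT(*)
        FROM products AS p
        JOIN brands AS b ON b.id = p.brand_id
        JOIN categories AS c ON c.id = p.category_id
        GROUP BY b.id, c.id
        ORDER BY b.name, c.name
    """).fetchall()
```

The two joins are the extra query complexity mentioned above; in exchange, renaming a brand is a single-row update in brands.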
I am not sure where you see room for a self-join here. It seems to me you are saying: I have a table of products; each product has one brand and one (?) category. If that's the case then you need either three tables:
Brands -< Products >- Categories
or - in Filemaker only - you can replace either or both the Brands and the Categories tables with a value list (assuming you won't be renaming brands/categories and at the expense of some reporting capabilities). So really it depends on what type of information you want to get out in the end.
If you truly want your solution to be scalable you need to parse and partition your data now. Otherwise you will be faced with the re-structuring of the solution down the road when the solution grows in size. You will also be faced with parsing and relocating the data to new tables. Since you've also included the SQL and MySQL tags if you plan on connecting Filemaker to an external data source then you will definitely need to up your game structurally.
Building everything in one table is essentially using Filemaker to do Excel work and it won't cut it if you are connecting to SQL, MySQL, etc.
Self join tables are a great tool. However, they should really only be used for calculating small data points and should not be used as pivot points or foundations for your reporting features. It can grow out of control as time goes on and you need to keep your backend clean.
Use summary and sub-summary reporting features to slice product based data.
For retail and general product management solutions, whether it's Filemaker, SQL, or whatever, the "Brand" or "Vendor" is its own table. Then you would have a "Products" table (the match key being the "Brand ID").
The "Product Category" field should be a field in the "Products" table. You can manage the category values by building a standard value list or building a value list based on a "Product Category" table. The second scenario is better for long term administration.

Migrating from MySQL to MongoDB - best practices

So, it may be best to just try it out and see through some trial and error, but I'm trying to figure out the best way to migrate a pretty simple structure from MySQL to MongoDB. Let's say that I have a main table in MySQL called 'articles' and I have two other tables, one called 'categories' and the other 'category_linkage'. The categories all have an ID and a name. The articles all have an ID and other data. The linkage table relates articles to categories, so that you can have unlimited categories related to each article.
From a MongoDB approach, would it make sense to just store the article data and the category IDs that belong to that article in the same collection, thus having just 2 data collections (one for the articles and one for the categories)? My thinking is that to add/remove categories from an article, you would just update($pull/$push) on that particular article document, no?
In my opinion, a good model would look like this:
{
    "article_name": "name",
    "category": ["category1_name", "category2_name", ...],
    "other_data": "other data value"
}
That is, embed the category names directly in the article document. Updating an article's categories is easy, but removing a category altogether requires modifying every article that belongs to it. If categories are removed frequently, then keeping them separate might be a better idea performance-wise.
This approach also makes it easy to query by category name (no need to map a name to an id with a separate query).
Thus, the "correct" way to model the data depends on the assumed use case, as is typically the case with MongoDB and other NoSQL databases.
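To illustrate the trade-off without needing a MongoDB server, here is a rough sketch in which plain Python dicts stand in for article documents. No driver is used; the helper names and sample articles are made up, and the comments only indicate which update operator each helper loosely corresponds to.

```python
# Plain dicts stand in for MongoDB documents; nothing here talks to a database.
articles = [
    {"article_name": "Intro to SQL", "category": ["databases", "tutorials"]},
    {"article_name": "Sharding 101", "category": ["databases"]},
    {"article_name": "Git basics", "category": ["tutorials"]},
]

def add_category(article, name):
    """Append a category name to one article (roughly what $push does)."""
    article["category"].append(name)

def remove_category(article, name):
    """Drop every occurrence of a name from one article (roughly what $pull does)."""
    article["category"] = [c for c in article["category"] if c != name]

def find_by_category(articles, name):
    """Query by category name directly; no name-to-id lookup is needed."""
    return [a["article_name"] for a in articles if name in a["category"]]

def delete_category_everywhere(articles, name):
    """Removing a category altogether touches every article that carries it."""
    touched = 0
    for article in articles:
        if name in article["category"]:
            remove_category(article, name)
            touched += 1
    return touched
```

The count returned by delete_category_everywhere is the cost mentioned above: with embedded names, deleting one category rewrites as many documents as there are articles in it.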
If you have access to a Mac computer, you could give the MongoHub GUI a try. It has an "Import from MySQL" feature.

structuring my mysql table correctly

I have a list of businesses and each business could be part of any number of categories. So what I would normally do is have a table 'business' then a table 'categories' and a table 'businesscategories' which would have the id of the business and category so therefore a business could be linked to any number of categories.
However, I wondered if there's a much simpler way of assigning businesses to any number of categories? Just keeping it all to 1 or 2 tables would be brilliant if possible...
Thanks
No, it wouldn't be brilliant. Your original approach is right.
The keyword here is "normalization". Only your original approach presents a normalized model of your data.
Don't worry about having numerous tables. The tables have to accommodate the logical structure of the information, not the other way around.
(If you want, though, you can represent bounded data by an enum rather than a category table. But that's a minor decision.)
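For reference, a minimal sketch of the three-table layout described in the question, using Python's sqlite3 in place of MySQL so it runs standalone. The table names follow the question; the column names, sample rows and helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE businesscategories (
    business_id INTEGER NOT NULL REFERENCES business(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (business_id, category_id)  -- one row per pairing, no duplicates
);
INSERT INTO business VALUES (1, 'Corner Cafe'), (2, 'Book Nook');
INSERT INTO categories VALUES (1, 'Food'), (2, 'Retail'), (3, 'Events');
INSERT INTO businesscategories VALUES (1, 1), (1, 3), (2, 2), (2, 3);
""")

def categories_of(conn, business_name):
    """List the category names a business belongs to."""
    rows = conn.execute("""
        SELECT c.name
        FROM business AS b
        JOIN businesscategories AS bc ON bc.business_id = b.id
        JOIN categories AS c ON c.id = bc.category_id
        WHERE b.name = ?
        ORDER BY c.name
    """, (business_name,)).fetchall()
    return [name for (name,) in rows]

def businesses_in(conn, category_name):
    """List the business names assigned to a category."""
    rows = conn.execute("""
        SELECT b.name
        FROM categories AS c
        JOIN businesscategories AS bc ON bc.category_id = c.id
        JOIN business AS b ON b.id = bc.business_id
        WHERE c.name = ?
        ORDER BY b.name
    """, (category_name,)).fetchall()
    return [name for (name,) in rows]
```

The same junction table answers both questions (which categories a business has, and which businesses a category has), which is hard to do cleanly with fewer tables.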