I ask a similar question (1 table 150,000,000,000 rows) now I will add some details.
500,000 Items
Unlimited # Categories
15 Sections
The site allows users to create their own categories and place as many items into that category. Before they can add anything they must choose what section the category is best represented.
Each of the above will have: id, title, description, imageURL
I have two issues:
Each CATEGORY/ITEM will beable to re-arrange items/categories greatest to worst. COLUMN: rank
Users will be acknowledged for contributing most to category. COLUMN: king
This feature of the site is pretty simple but the ranking process is throwing me for a loop. I have tried multiple test runs cramming as much data into one table as possible but the results are crashing my spirits. The division of data to tables is not easy because of the individual ranking for each category.
The original design was to Have the above 3 and individual tables of each category/item to allow individual ranking(boost speed/performance) then:
User contributor: sectionID, categoryID, itemID, userID
Individual Rank: categoryID/itemID, rank
The outcome would be 150,000,000,000 tabled labyrinth. Has anyone dealt with this concept before? What is the best plan of action? Am I on the right track?
I just got High Performance MySQL, 3rd Edition
Optimization, Backups, and Replication and Beginning PHP and MySQL: From Novice to Professional 4th (fourth) EditionI am not promoting or endorsing these books...
These are the first of many steps I will be taking to tackle this design problem I face. Any comments and assistance will be appreciated. Thoughts; Concerns??
Related
Okay, my partners and myself created these a while ago. We are going to be transferring this into SQL through Visual Basic soon, but I want to make sure everything is ready to go. Two major complaints that we were not able to fix was...
"By having Transaction and Product directly connected you are unable to allow multiple products on the same order (customer can't order both a latte and cappuccino on the same order)."
"Membership table: Not sure what the data in DiscountTypeTotal means - do you have multiple pieces of data in the same field? (then your table isn't on 1NF).
It looks like you need to allow each member to have multiple discounts - so you need another table to capture that."
How do we effectively correct these? How else would we connect Transaction with Products? I understand that the customer can only purchase one item per transactions, so would we have products and another table for multiple items? Allowing multiple customers to have discounts, I am lost. Any help would be appreciated.
Being new to relational database design, I am trying to clarify one piece of information to properly design this database. Although I am using Filemaker as the platform, I believe this is a universal question.
Using the logic of ideally having all one to many relationships, and using separate tables or join tables to solve these.
I have a database with multiple products, made by multiple brands, in multiple product categories. I also want this to be as scale-able as possible when it comes to reporting, being able to slice and dice the data in as many ways as possible since the needs of the users are constantly changing.
So when I ask the question "Does each Brand have multiple products" I get a yes, and "Does each product have multiple brands" the answer is no. So this is a one to many relationship, but it also seems that a self-join table might give me everything that I need.
This methodology also seems to go down a rabbit hole for other "product related" information such as product category, each product is tied to one product category, but only one product category is related to a product.
So I see 2 possibilities, make three tables and join them with primary and foreign keys, one for Brand, one for Product Category, and one for Products.
Or the second possibility is to create one table that has the brand and product category and product info all in one table (since they are all product related) and simply do self-joins and other query based tables to give me the future reporting requirements that will be changing over time.
I am looking for input from experiences that might point me in the right direction.
Thanks in advance!
Could you ever want to store additional information about a brand (company URL, phone number, etc.) or about a product category (description, etc.)?
If the answer is yes, you definitely want to use three tables. If you don't, you'll be repeating all that information for every single item that belongs to the same brand or same category.
If the answer is no, there is still an advantage to using three tables - it will prevent typos or other spelling inconsistencies from getting into your database. For example, it would prevent you from writing a brand as "Coca Cola" for some items and as "Coca-Cola" for other items. These inconsistencies get harder and harder to find and correct as your database grows. By having each brand only listed once in it's own table, it will always be written the same way.
The disadvantage of multiple tables is the SQL for your queries is more complicated. There's definitely a tradeoff, but when in doubt, normalize into multiple tables. You'll learn when it's better to de-normalize with more experience.
I am not sure where do you see a room for a self-join here. It seems to me you are saying: I have a table of products; each product has one brand and one (?) category. If that's the case then you need either three tables:
Brands -< Products >- Categories
or - in Filemaker only - you can replace either or both the Brands and the Categories tables with a value list (assuming you won't be renaming brands/categories and at the expense of some reporting capabilities). So really it depends on what type of information you want to get out in the end.
If you truly want your solution to be scalable you need to parse and partition your data now. Otherwise you will be faced with the re-structuring of the solution down the road when the solution grows in size. You will also be faced with parsing and relocating the data to new tables. Since you've also included the SQL and MySQL tags if you plan on connecting Filemaker to an external data source then you will definitely need to up your game structurally.
Building everything in one table is essentially using Filemaker to do Excel work and it won't cut it if you are connecting to SQL, MySQL, etc.
Self join tables are a great tool. However, they should really only be used for calculating small data points and should not be used as pivot points or foundations for your reporting features. It can grow out of control as time goes on and you need to keep your backend clean.
Use summary and sub-summary reporting features to slice product based data.
For retail and general product management solutions, whether it's Filemaker/SQL/or whatever the "Brand" or "Vendor" is it's own table. Then you would have a "Products" table (the match key being the "Brand ID").
The "Product Category" field should be a field in the "Products" table. You can manage the category values by building a standard value list or building a value list based on a "Product Category" table. The second scenario is better for long term administration.
I'm relatively new to databases and MySQL, but I'm using it to connect a database to a program I've made in VB.NET. Along with many programming languages, I understand SQL, but I have very little experience with databases. Also, I'm using MySQL Workbench (if it helps to know).
I am creating a program which retrieves information from the database. This program in particular is a guide for cooking.
The Database
My database consists of one table named "recipes". Within the table are four columns, each named (in order): ID, Recipe Name, Origin, Ingredients.
My only problem is I plan on storing around 80 or so recipes within the database; however, this will not be a difficult task because I'm simply copy-and-pasting from a Wikia page.
The Problem
The Wikia page in which I'm copying my ingredients from are in a numerical list, therefor I cannot simply copy nearly ten steps, and past them into my ingredients column because it will not let you (typing it would take ages as well). This an issue because I need to retrieve all the ingredients in a list, and I thought it would be inefficient to create over ten different columns.
Conclusion
Is there a more efficient way to store a list of items rather than creating multiple columns? How can I combat this issue?
Have multiple tables. Have a table of recipies with acolumns ID, Recipe Name and Origin, and aother table of ingredients which contains ID, Recipe ID and ingredient (ie, one row per recipe per ingrediant).
You initial ideas (ie, either all ingredients in one column, or many columns, one for each ingredient) would be inefficient and also difficult to interrogate. For example finding which recipes contained a particular ingredient would be difficult.
So I have this application that I'm drawing up and I start to think about my users. Well, My initial thought was to create a table for each group type. I've been thinking this over though and I'm not sure that this is the best way.
Example:
// Users
Users [id, name, email, age, etc]
// User Groups
Player [id, years playing, etc]
Ref [id, certified, etc]
Manufacturer Rep [id, years employed, etc]
So everyone would be making an account, but each user would have a different group. They can also be in multiple different groups. Each group has it's own list of different columns. So what is the best way to do this? Lets say I have 5 groups. Do I need 8 tables + a relational table connecting each one to the user table?
I just want to be sure that this is the best way to organize it before I build it.
Edit:
A player would have columns regarding the gear that they use to play, the teams they've played with, events they've gone to.
A ref would have info regarding the certifications they have and the events they've reffed.
Manufacturer reps would have info regarding their position within the company they rep.
A parent would have information regarding how long they've been involved with the sport, perhaps relations with the users they are parent of.
Just as an example.
Edit 2:
**Player Table
id
user id
started date
stopped date
rank
**Ref Table
id
user id
started date
stopped date
is certified
certified by
verified
**Photographer / Videographer / News Reporter Table
id
user id
started date
stopped date
worked under name
website / channel link
about
verified
**Tournament / Big Game Rep Table
id
user id
started date
stopped date
position
tourney id
verified
**Store / Field / Manufacturer Rep Table
id
user id
started date
stopped date
position
store / field / man. id
verified
This is what I planned out so far. I'm still new to this so I could be doing it completely wrong. And it's only five groups. It was more until I condensed it some.
Although I find it weird having so many entities which are different from each other, but I will ignore this and get to the question.
It depends on the group criteria you need, in the case you described where each group has its own columns and information I guess your design is a good one, especially if you need the information in a readable form in the database. If you need all groups in a single table you will have to save the group relevant information in a kind of object, either a blob, XML string or any other form, but then you will lose the ability to filter on these criteria using the database.
In a relational Database I would do it using the design you described.
The design of your tables greatly depends on the requirements of your software.
E.g. your description of users led me in a wrong direction, I was at first thinking about a "normal" user of a software. Basically name, login-information and stuff like that. This I would never split over different tables as it really makes tasks like login, session handling, ... really complicated.
Another point which surprised me, was that you want to store the equipment in columns of those user's tables. Usually the relationship between a person and his equipment is not 1 to 1 and in most cases the amount of different equipment varies. Thus you usually have a relationship between users and their equipment (1:n). Thus you would design an equipment table and there refer to the owner's user id.
But after you have an idea of which data you have in your application and which relationships exist between your data, the design of the tables and so on is rather straitforward.
The good news is, that your data model and database design will develop over time. Try to start with a basic model, covering the majority of your use cases. Then slowly add more use cases / aspects.
As long as you are in the stage of planning and early implementation phasis, it is rather easy to change your database design.
I am currently in the process of rolling a custom order-processing system. My current structure is pretty standard, invoices, purchase orders, and items are all kept in separate tables. Items know which form(s) they are on by keeping track of the form's id, but forms don't know what items are in them (in the database). This was all well and good until I had a new requirement added to the mix: stocking orders.
The way a stocking order works is that sometimes a customer places an order for more units than what is in stock, so we want to place an order with our supplier for enough units to fulfill the order and replenish our stock. However, we often have to build these orders up as the minimums are pretty high, so one stocking order is usually comprised of several customer orders (sometimes for the same item) PLUS a few line items that are NOT connected to an order and are just for stocking purposes.
This presents a problem with my current architecture because I now need to keep track of what comes in from the stocking orders as often suppliers ship partial orders, where items have been allocated, and which incoming items are for stock.
My initial idea was to create a new database table that mostly mimics the items table, but is kind of like an aggregate (but not calculated) table that only keeps track of the items and their corresponding metadata (how many units received, how many for stock, etc) for only the stocking orders. I would have to keep the two tables in synch if something changed from one of them (like a quantity).
Is this overkill, and maybe there's a better way to do it? Database architecture is definitely not my forte, so I was hoping that someone could either tell me that this is an ok way to do things or that there is a better, more correct way to do it.
Thanks so much!
For what it's worth, what I'm using: VB, .NET 4.0, MySQL 5.0
Also, if you want clarification on anything, please ask! I will be closely monitoring this question.
Visit databaseanswers.com. Navigate to "free data models", and look for "Inventory Management". You should find some good examples.
you didnt mention them but you will need tables for:
SUPPLIERS, ORDERS, and INVENTORY
also, the base tables you mention 'knowing about' - these probably need associative style many to many tables which tell you things like which items are on which order, and which suppliers supply which items, lead times, costs etc.
it would be helpful to see your actual schema.
I use a single Documents table, with a DocType field. Client documents (Order, Invoice, ProForma, Delivery, Credit Notes) are mixed with Suppliers documents (PO, Reception).
This makes it quite easy to calculate Client backorders, Supplier backorders, etc...
I am just a bit unhappy because I have different tables for SUPPLIERS and CLIENTS, and therefore the DOCUMENTS table has both a SupplierId field and a ClientId field, which is a bit bad. If I had to redo it I might consider a single Companies table containing both Clients and Suppliers.
I use a PK (DocId) that's autoincrement, and a unique key (DocNum) that's like XYY00000, with
x= doc type
YY= year
00000 = increment.
This is because a doc can be saved but is only at validation time it receives a DocNum.
To track backorders (Supplier or Client), you will need to have a Grouping field in the DocDetails table, so that if you have an Order line 12345, you copy that Link field to every Detail line related to it (invoice, Delivery).
Hope I am not to confusing. The thing works well, after 3 years and over 50,000 docs.
This approach also implies that you will have a stock holding which is allocated for orders - without individual item tracking its a bit tricky to manage this. Consider, customer A orders
3 pink widgets
1 blue widget
But you only have 1 pink widget in stock - you order 3 pink widgets, and 1 blue.
Customer B orders
2 pink widgets
But you've still only got 1 in stock - you order another pink widget
3 pink widgets arrive from the first supplier order. What are you going to do? You can either reserve all of them for customer A's order and wait for the blue and red widget to arrive, or you can fulfill customer B's order.
What if the lead time on a pink widget is 3 days and for a blue widget it's 3 weeks? Do you ship partial orders to your customers? Do you have a limit on the amount of stock you will hold?
Just keeping a new table of backorders is not going to suffice.
This stuff gets scary complicated really quickly. You certainly need to spend a lot more time analysing the problem.