Database design issue - trying to avoid circular reference - mysql

I have been at this for a day and a bit now, trying to figure out how to best model the database (MySQL) for an app I'm developing for a friend who owns a bakery. The assumptions are as follows:
many (external) Bakers produce many Products
BakersProducts is updated fortnightly by certain staff who either call bakers for their product prices, or the bakers fax through their pricelist themselves, which the staff then update via a front-end UI.
the manager should be able to generate an order based on the products that she anticipates having.
So the front-end UI must be able to allow the manager to purely choose the products she would like in the order, and then present her with a list of Bakers to choose from for each product in the order.
In other words, Orders_has_Products should also include a reference to BakersProducts.bpID. I'm sure though that if I do this, I would create a circular reference (of sorts) to Products.
I'm sure I've gone about this the wrong way, and would really appreciate anyone's advice as to how I can restructure my design to accommodate the chosen product price, i.e. to include BakersProducts.bpID.
Thank you!

This is not a circular reference, since
Order_has_products references Products
Order_has_products references BakersProducts
BakersProducts references Products
a circular reference would be if, for example,
Order_has_products references Products
Products references BakersProducts
BakersProducts references Order_has_products
Aside from that, circular references are relatively normal in a database (e.g. an Employees table with a manager field, where the manager is herself an employee, is a one-table circular reference).
What your design has is a simple redundancy, because one product is referenced twice in the Order_has_products table: once directly from the Products table, and once via the related BakersProducts record. There is a possibility of the two getting out of sync, but since you stated that the business rule is that the product is chosen before the baker, it's quite all right.
I would include the productID even if it was the other way around, because a little denormalization can go a long way when speeding up queries. Otherwise you would have to go through the BakersProducts table even for simple questions like 'Did we have any bagels on Wednesday?'
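As a rough illustration of the redundancy described above, here is a minimal MySQL (InnoDB) sketch. Only Products, BakersProducts, Orders_has_Products, productID, productName and bpID come from the question; the Bakers and Orders tables and the price, quantity and orderDate columns are assumptions made for the example. One possible way to stop the two references drifting apart is a composite foreign key on (bpID, productID), which is an addition of this sketch and not something the question's schema is known to have.

```sql
-- Hypothetical sketch; Bakers, Orders, price, quantity and orderDate are assumed names.
CREATE TABLE Products (
  productID   INT NOT NULL AUTO_INCREMENT,
  productName VARCHAR(100) NOT NULL,
  PRIMARY KEY (productID)
) ENGINE=InnoDB;

CREATE TABLE Bakers (
  bakerID   INT NOT NULL AUTO_INCREMENT,
  bakerName VARCHAR(100) NOT NULL,
  PRIMARY KEY (bakerID)
) ENGINE=InnoDB;

CREATE TABLE BakersProducts (
  bpID      INT NOT NULL AUTO_INCREMENT,
  bakerID   INT NOT NULL,
  productID INT NOT NULL,
  price     DECIMAL(8,2) NOT NULL,
  PRIMARY KEY (bpID),
  -- Extra unique index so the (bpID, productID) pair can be a foreign key target.
  UNIQUE KEY uq_bp_product (bpID, productID),
  FOREIGN KEY (bakerID)   REFERENCES Bakers (bakerID),
  FOREIGN KEY (productID) REFERENCES Products (productID)
) ENGINE=InnoDB;

CREATE TABLE Orders (
  orderID   INT NOT NULL AUTO_INCREMENT,
  orderDate DATE NOT NULL,
  PRIMARY KEY (orderID)
) ENGINE=InnoDB;

CREATE TABLE Orders_has_Products (
  orderID   INT NOT NULL,
  productID INT NOT NULL,          -- chosen first by the manager
  bpID      INT NULL,              -- filled in once a baker is picked
  quantity  INT NOT NULL DEFAULT 1,
  PRIMARY KEY (orderID, productID),
  FOREIGN KEY (orderID)   REFERENCES Orders (orderID),
  FOREIGN KEY (productID) REFERENCES Products (productID),
  -- While bpID is NULL this constraint is not checked; once set, the
  -- BakersProducts row must be for the same productID as this order line.
  FOREIGN KEY (bpID, productID) REFERENCES BakersProducts (bpID, productID)
) ENGINE=InnoDB;

-- "Did we have any bagels on Wednesday?" without touching BakersProducts.
SELECT COUNT(*) AS bagel_lines
FROM Orders_has_Products AS ohp
JOIN Orders   AS o ON o.orderID   = ohp.orderID
JOIN Products AS p ON p.productID = ohp.productID
WHERE p.productName = 'Bagel'
  AND o.orderDate = '2024-01-03';  -- hypothetical date
```

The nullable bpID mirrors the stated business rule that the product is chosen before the baker.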

I think the mix-up is from a business process standpoint: you're getting requisitions mixed up with orders.
A requisition has a list of products needed without necessarily specifying the supplier of each, whereas an order is directed at a specific supplier, for specific price look-up codes (what bpID seems to represent). One requisition may spawn multiple orders if it is split across multiple suppliers, and even a single product may have its order split across multiple suppliers, perhaps due to vendor volume limits or locality of delivery.
You may want to provide a view of a requisition that shows the order line item(s) generated from each requisition line item, but that is a user interface concern.
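A minimal sketch of how the requisition/order split could look in MySQL follows. All table and column names here are hypothetical except productID and bpID, which stand for the question's Products.productID and BakersProducts.bpID; they are left as plain columns so the sketch runs on its own.

```sql
-- Hypothetical sketch of requisitions (what is needed) vs. orders (what is
-- sent to one supplier). Names other than productID and bpID are assumptions.
CREATE TABLE Requisitions (
  requisitionID INT NOT NULL AUTO_INCREMENT,
  createdOn     DATE NOT NULL,
  PRIMARY KEY (requisitionID)
) ENGINE=InnoDB;

CREATE TABLE RequisitionLines (
  requisitionLineID INT NOT NULL AUTO_INCREMENT,
  requisitionID     INT NOT NULL,
  productID         INT NOT NULL,  -- would reference Products.productID
  quantity          INT NOT NULL,
  PRIMARY KEY (requisitionLineID),
  FOREIGN KEY (requisitionID) REFERENCES Requisitions (requisitionID)
) ENGINE=InnoDB;

CREATE TABLE PurchaseOrders (
  purchaseOrderID INT NOT NULL AUTO_INCREMENT,
  bakerID         INT NOT NULL,    -- the single supplier this order goes to
  PRIMARY KEY (purchaseOrderID)
) ENGINE=InnoDB;

CREATE TABLE PurchaseOrderLines (
  purchaseOrderLineID INT NOT NULL AUTO_INCREMENT,
  purchaseOrderID     INT NOT NULL,
  requisitionLineID   INT NOT NULL,  -- which need this line fulfils
  bpID                INT NOT NULL,  -- would reference BakersProducts.bpID
  quantity            INT NOT NULL,
  PRIMARY KEY (purchaseOrderLineID),
  FOREIGN KEY (purchaseOrderID)   REFERENCES PurchaseOrders (purchaseOrderID),
  FOREIGN KEY (requisitionLineID) REFERENCES RequisitionLines (requisitionLineID)
) ENGINE=InnoDB;

-- How much of each requisition line has been placed with suppliers so far.
SELECT rl.requisitionLineID,
       rl.quantity                  AS requested,
       COALESCE(SUM(pol.quantity), 0) AS ordered
FROM RequisitionLines AS rl
LEFT JOIN PurchaseOrderLines AS pol
       ON pol.requisitionLineID = rl.requisitionLineID
GROUP BY rl.requisitionLineID, rl.quantity;
```

Because several PurchaseOrderLines rows may point at one requisition line, a single product can be split across multiple suppliers.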

One way of solving this is simply to eliminate the Products table and move productName into the BakersProducts table.
This would only work if you do not expect different bakers to carry the same product, i.e. if each product is unique to one baker.
If you do expect the bakers to carry the same product, then you may want to leave the separate Products table, but instead of having Order_has_Products.Products_productID, I would change it to Order_has_Products.bpID. If/when you need to access the productName (or other product related metadata that may go in that table) you could just do a join between BakersProducts and Products.

Relationship database design - object specific many to many, do I solve with self join table or new table

Being new to relational database design, I am trying to clarify one piece of information to properly design this database. Although I am using Filemaker as the platform, I believe this is a universal question.
I am working from the principle that relationships should ideally all be one-to-many, with separate tables or join tables used to resolve anything else.
I have a database with multiple products, made by multiple brands, in multiple product categories. I also want this to be as scalable as possible when it comes to reporting, so that the data can be sliced and diced in as many ways as possible, since the needs of the users are constantly changing.
So when I ask the question "Does each Brand have multiple products" I get a yes, and "Does each product have multiple brands" the answer is no. So this is a one to many relationship, but it also seems that a self-join table might give me everything that I need.
This methodology also seems to go down a rabbit hole for other product-related information such as product category: each product is tied to one product category, and only one product category is related to a product.
So I see two possibilities. The first is to make three tables and join them with primary and foreign keys: one for Brand, one for Product Category, and one for Products.
The second is to create one table that holds the brand, product category, and product info together (since they are all product related) and simply use self-joins and other query-based tables to meet the reporting requirements that will change over time.
I am looking for input from experience that might point me in the right direction.
Thanks in advance!

Could you ever want to store additional information about a brand (company URL, phone number, etc.) or about a product category (description, etc.)?
If the answer is yes, you definitely want to use three tables. If you don't, you'll be repeating all that information for every single item that belongs to the same brand or same category.
If the answer is no, there is still an advantage to using three tables: it will prevent typos or other spelling inconsistencies from getting into your database. For example, it would prevent you from writing a brand as "Coca Cola" for some items and as "Coca-Cola" for other items. These inconsistencies get harder and harder to find and correct as your database grows. By having each brand listed only once in its own table, it will always be written the same way.
The disadvantage of multiple tables is the SQL for your queries is more complicated. There's definitely a tradeoff, but when in doubt, normalize into multiple tables. You'll learn when it's better to de-normalize with more experience.

I am not sure where you see room for a self-join here. It seems to me you are saying: I have a table of products; each product has one brand and one (?) category. If that's the case then you need either three tables:
Brands -< Products >- Categories
or - in Filemaker only - you can replace either or both the Brands and the Categories tables with a value list (assuming you won't be renaming brands/categories and at the expense of some reporting capabilities). So really it depends on what type of information you want to get out in the end.

If you truly want your solution to be scalable you need to parse and partition your data now. Otherwise you will be faced with the re-structuring of the solution down the road when the solution grows in size. You will also be faced with parsing and relocating the data to new tables. Since you've also included the SQL and MySQL tags, if you plan on connecting Filemaker to an external data source then you will definitely need to up your game structurally.
Building everything in one table is essentially using Filemaker to do Excel work and it won't cut it if you are connecting to SQL, MySQL, etc.
Self join tables are a great tool. However, they should really only be used for calculating small data points and should not be used as pivot points or foundations for your reporting features. It can grow out of control as time goes on and you need to keep your backend clean.
Use summary and sub-summary reporting features to slice product based data.
For retail and general product management solutions, whether in Filemaker, SQL, or anything else, the "Brand" or "Vendor" is its own table. Then you would have a "Products" table (the match key being the "Brand ID").
The "Product Category" field should be a field in the "Products" table. You can manage the category values by building a standard value list or building a value list based on a "Product Category" table. The second scenario is better for long term administration.
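For readers who connect to an SQL backend, the three-table layout discussed in this thread might look roughly like the MySQL sketch below. The table names follow the "Brands -< Products >- Categories" diagram; every column name is an assumption made for illustration.

```sql
-- Hypothetical three-table layout; all column names are assumptions.
CREATE TABLE Brands (
  brand_id   INT NOT NULL AUTO_INCREMENT,
  brand_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (brand_id),
  UNIQUE KEY uq_brand_name (brand_name)       -- each brand is spelled one way only
) ENGINE=InnoDB;

CREATE TABLE Categories (
  category_id   INT NOT NULL AUTO_INCREMENT,
  category_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (category_id),
  UNIQUE KEY uq_category_name (category_name)
) ENGINE=InnoDB;

CREATE TABLE Products (
  product_id   INT NOT NULL AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  brand_id     INT NOT NULL,                  -- one brand per product
  category_id  INT NOT NULL,                  -- one category per product
  PRIMARY KEY (product_id),
  FOREIGN KEY (brand_id)    REFERENCES Brands (brand_id),
  FOREIGN KEY (category_id) REFERENCES Categories (category_id)
) ENGINE=InnoDB;

-- One example of slicing the data: product counts per brand and category.
SELECT b.brand_name, c.category_name, COUNT(*) AS product_count
FROM Products   AS p
JOIN Brands     AS b ON b.brand_id    = p.brand_id
JOIN Categories AS c ON c.category_id = p.category_id
GROUP BY b.brand_name, c.category_name;
```

Renaming a brand or category then means updating a single row, and new reporting dimensions can be added as further lookup tables without touching existing product rows.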

SSAS many to many dimension hierarchy

First of all, this is a very, very simple data warehouse that I made only to ask the following specific question.
Scenario:
I have one fact table, FactSales, and two dimensions, DimShop and DimProduct. The dimensions are separate from each other and each is directly connected to the fact table. Some shops can sell only selected products and, vice versa, some products can be sold only in specific shops. This gives us a many-to-many relationship. The problem is that when I try to slice my cube I get all combinations of shops and products.
Question:
How can I create a hierarchy between two separate dimensions in SSAS with a many-to-many relationship? I tried to use a bridge table but I was unable to configure the hierarchy in SSAS. Is it even possible?

If you're trying to report on "what can happen" rather than "what did happen", you need a separate fact table & cube to represent the relationship between products and the shops that can sell the products. It's not really a hierarchy since it's many to many.
A simple cross reference fact should be fine:
FACT_PRODUCT_SHOP
ProductID
ShopID
Then when doing reports that want to see what products are allowed to be sold in what stores, you can use this fact table. The sales fact only shows "what actually happens".
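On the relational side, the cross-reference fact could be sketched as follows. FactSales, DimShop, DimProduct, FACT_PRODUCT_SHOP, ProductID and ShopID come from the thread; the remaining columns (SalesID, Amount, the name columns) are assumptions, and the SSAS cube configuration itself is not shown.

```sql
-- Hypothetical minimal star schema plus the factless cross-reference table.
CREATE TABLE DimProduct (
  ProductID   INT NOT NULL PRIMARY KEY,
  ProductName VARCHAR(100) NOT NULL
);

CREATE TABLE DimShop (
  ShopID   INT NOT NULL PRIMARY KEY,
  ShopName VARCHAR(100) NOT NULL
);

CREATE TABLE FactSales (                -- "what actually happened"
  SalesID   INT NOT NULL PRIMARY KEY,
  ProductID INT NOT NULL,
  ShopID    INT NOT NULL,
  Amount    DECIMAL(10,2) NOT NULL,
  FOREIGN KEY (ProductID) REFERENCES DimProduct (ProductID),
  FOREIGN KEY (ShopID)    REFERENCES DimShop (ShopID)
);

CREATE TABLE FACT_PRODUCT_SHOP (        -- "what can happen"
  ProductID INT NOT NULL,
  ShopID    INT NOT NULL,
  PRIMARY KEY (ProductID, ShopID),
  FOREIGN KEY (ProductID) REFERENCES DimProduct (ProductID),
  FOREIGN KEY (ShopID)    REFERENCES DimShop (ShopID)
);

-- Allowed shop/product pairs that have no recorded sales yet.
SELECT ps.ShopID, ps.ProductID
FROM FACT_PRODUCT_SHOP AS ps
LEFT JOIN FactSales AS s
       ON s.ProductID = ps.ProductID
      AND s.ShopID    = ps.ShopID
WHERE s.SalesID IS NULL;
```

Only pairs listed in FACT_PRODUCT_SHOP appear in such reports, which avoids the full cross product of shops and products.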
You can even modify this fact to be your Inventory fact table, just by adding a date, an "In Stock amount", an "On order amount", etc.

It is possible to implement such a design but it may not perform well.
Basically instead of product and shop key in the fact table, you need an alternative key.
This key will be the unique combination of products and shops. That needs to be prepared in the ETL.
In a new dimension named "Shops and Products", on top of this key, you can create 2 hierarchies Product and Shop in the same dimension.
Additionally, you can also create an unnatural hierarchy as you requested. But since it is an unnatural hierarchy, it may not perform well.
So in addition to Product and Shop hierarchies, you can provide following unnatural hierarchies: Shop -> Product, Product -> Shop.

How does one store 'packages' of multiple products within a database

I have a site that provides an e-commerce service, basically allowing sellers in the industry to place their products up for sale to clients. Sellers select the products they wish to sell and set the price. They are then hands off.
In a simplified form, I have the following relevant tables in my database for storing product and order information:
Product_Info
--------------------
ID (autonumber)
name
...

Order_Head
--------------------
ID (autonumber)
CustomerID
...

Order_Line
--------------------
ID (autonumber)
OrderHeadID
ProductID
...

This works great for simplified orders where customers choose any number of products and add them to their cart. However, I'm now faced with the problem of adding seller created and managed 'packages', wherein sellers can group multiple products together into a single item and sell it at a lower price than the individual items would cost together. For instance, if an orange costs $15 and an apple costs $20, a package containing 2 oranges and 1 apple may only cost $35 instead of $50.
The twist, and the part that has me stymied right now, is that I would very much like packages to be able to contain other packages. For example, a seller could make an "assorted oranges" package containing 3 oranges. They could then make an "assorted fruit" package that contains 2 apples and 1 "assorted oranges".
How to manage that is confusing me on two fronts: how to list the products within a package when I could be referencing an ID from either the product table or the package table, and how to record the products in the order_line table since the productID could be pointing to either a product or a package. And, of course, this needs to be designed in an efficient manner so we're not taxing the database server unnecessarily.
I'm primarily a web developer and haven't done much with e-commerce before. Could anyone offer some direction as to an appropriate set of tables and table modifications to apply to the database? I don't know why, as it doesn't seem like it should be that complicated, but this one has me stuck.
For reference I'm using MySQL 5.1, connected to ColdFusion MX 7 (soon to be 9).
EDIT:
Thank you for the responses so far. I will take a little time to think on them further. In the meantime I wanted to clarify how the order process works since it appears to be more relevant than I may have originally assumed.
My product works similar to Shutterfly. Photographers post photos, and clients may purchase prints. The photographers need tools to set all the pricing, and will often offer packages if it is from a professional shoot. Fulfillment is done by a lab that my product submits orders to automatically, but they have no knowledge of pricing, packages, etc.
Photographers also have the option of running specials to provide clients with x% off or BOGO deals, but I can handle that separately. For right now I'm more concerned about an efficient and simple way to store the photographer defined packages, the client's image selection for each product in the package as they shop (currently stored in an order_line mirror table), and the eventual order details, so that they can be queried for display and reporting quickly and easily.

Create an additional table which lists the items which are members of each package, and the quantity included in a package.
Table Package_Items
CREATE TABLE Package_Items (
  package_id INT NOT NULL,              -- the Product_Info row that is the package
  item_id INT NOT NULL,                 -- a member product (or a nested package)
  item_quantity INT NOT NULL DEFAULT 1, -- how many of item_id the package contains
  FOREIGN KEY (package_id) REFERENCES Product_Info (ID),
  FOREIGN KEY (item_id) REFERENCES Product_Info (ID),
  PRIMARY KEY (package_id, item_id)
);
The package_id column references the row in Product_Info which is the main package item. item_id refers to other items which are package members. There can be multiple package_id, item_id combinations.
Using this method, you can create new rows in Product_Info which represent packages. All that needs to be done to add items to the package is to add corresponding rows in Package_Items. If a row added to Package_Items happens also to be a package product itself, no extra work needs to be done.
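To illustrate how nested packages resolve with this table, here is a sketch using the question's fruit example. The Product_Info definition is a minimal stand-in, the IDs are made up, and the recursive query assumes MySQL 8.0 or later: WITH RECURSIVE is not available in the MySQL 5.1 mentioned in the question, where the same walk would have to be done level by level in application code.

```sql
-- Minimal stand-ins so the sketch runs on its own; the real Product_Info has more columns.
CREATE TABLE Product_Info (
  ID   INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  PRIMARY KEY (ID)
) ENGINE=InnoDB;

CREATE TABLE Package_Items (
  package_id    INT NOT NULL,
  item_id       INT NOT NULL,
  item_quantity INT NOT NULL DEFAULT 1,
  PRIMARY KEY (package_id, item_id),
  FOREIGN KEY (package_id) REFERENCES Product_Info (ID),
  FOREIGN KEY (item_id)    REFERENCES Product_Info (ID)
) ENGINE=InnoDB;

-- Hypothetical IDs for the question's example.
INSERT INTO Product_Info (ID, name) VALUES
  (1, 'orange'),
  (2, 'apple'),
  (3, 'assorted oranges'),   -- package: 3 oranges
  (4, 'assorted fruit');     -- package: 2 apples + 1 "assorted oranges"

INSERT INTO Package_Items (package_id, item_id, item_quantity) VALUES
  (3, 1, 3),
  (4, 2, 2),
  (4, 3, 1);

-- Expand package 4 down to plain products (MySQL 8.0+ only).
WITH RECURSIVE contents (item_id, quantity) AS (
  SELECT item_id, item_quantity
  FROM Package_Items
  WHERE package_id = 4
  UNION ALL
  -- Multiply quantities while descending into nested packages.
  SELECT child.item_id, c.quantity * child.item_quantity
  FROM contents AS c
  JOIN Package_Items AS child ON child.package_id = c.item_id
)
SELECT c.item_id, SUM(c.quantity) AS total_quantity
FROM contents AS c
-- Keep only leaves, i.e. rows that are not themselves packages.
WHERE NOT EXISTS (
  SELECT 1 FROM Package_Items AS p WHERE p.package_id = c.item_id
)
GROUP BY c.item_id;
-- For package 4: item 1 (orange) x 3 and item 2 (apple) x 2.
```

A package that contains itself, directly or through other packages, would make this recursion run until MySQL aborts it at its recursion depth limit.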
Just take care not to add a package to itself.

This is a tricky set of requirements, especially because you haven't told us about fulfillment yet, or any other discounting schemes, or what happens when customers return a single item from a package for a refund...
Michael's design is a good way of storing the hierarchical nature of your product offering.
The next question is "where do you store the price of the package" - as it is not the sum of all products. I'd recommend storing the price in the "package_items" table.
What do you then do with the "order_line" table? You have three options:
1. add the package with its price (but not the items that make up the package),
2. add the package with its price and the items that make up the package at a price of zero,
3. add the items with their regular price, along with a discount line for the package.
If whoever fulfills the order knows about packages, you could go for option 1. However, the customer, comparing their shipping note against the products they receive, might be confused.
If you want to show each line item, but hold on to the "package price", option two allows you to show the line items; if a customer wants to return an item, you have to work out what the value of that item is in some off-line way.
Most supermarket tills use option 3 - it's also a nice way of showing customers who are just shopping for items that they got a "bonus" discount because their order matched a package.
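As a small worked example of the third option, using the question's prices (oranges at $15, apples at $20, package price $35): the items go in at their regular price and the package appears as a negative discount line. The Quantity and UnitPrice columns and all IDs are assumptions, since the question elides the remaining Order_Line columns.

```sql
-- Hypothetical Order_Line with assumed Quantity and UnitPrice columns.
CREATE TABLE Order_Line (
  ID          INT NOT NULL AUTO_INCREMENT,
  OrderHeadID INT NOT NULL,
  ProductID   INT NOT NULL,
  Quantity    INT NOT NULL,
  UnitPrice   DECIMAL(8,2) NOT NULL,
  PRIMARY KEY (ID)
) ENGINE=InnoDB;

-- Made-up IDs: 1 = orange, 2 = apple, 3 = the package itself.
INSERT INTO Order_Line (OrderHeadID, ProductID, Quantity, UnitPrice) VALUES
  (100, 1, 2,  15.00),   -- 2 oranges at regular price
  (100, 2, 1,  20.00),   -- 1 apple at regular price
  (100, 3, 1, -15.00);   -- package discount line: 35.00 - 50.00

SELECT OrderHeadID, SUM(Quantity * UnitPrice) AS order_total
FROM Order_Line
WHERE OrderHeadID = 100
GROUP BY OrderHeadID;
-- order_total = 35.00
```

With this layout a returned item still carries its own regular price on its line, and the discount line shows what the package saved.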
If there's any chance that the order items won't all be shipped at the same time, you have to decide what to print on the shipping note - this is a huge pain in the backside, and often affects terms and conditions, and (in Europe) tax.
If you plan to offer other types of discount (e.g. "spend x, get y free", or "10% discount on orders over x"), you need to get this very clearly defined, because the way discounts can compound each other often upsets retailers - that's why they usually offer "cheapest item free" in package deals.

MySQL: how to do row-level security (like Oracle's Virtual Private Database)?

Say that I have vendors selling various products. So, at a basic level, I will have the following tables: vendor, product, vendor_product.
If vendor-1 adds Widget 1 to the product table, I want only vendor-1 to see that information (because that information is "owned" by vendor-1). Same goes for vendor-2. Say vendor-2 adds Widget 2, only vendor-2 should see that information.
If vendor-1 tries to add Widget 2, which was already entered by vendor-2, a duplicate entry for Widget 2 should not be made in the product table. This means that, somehow, I need to know that vendor-2 now also "owns" Widget 2.
A problem with having multiple "owners" of a piece of information is how to deal with owners editing/deleting the data. Perhaps vendor-1 no longer wants Widget 2 to be available to him/her, but that doesn't necessarily apply for vendor-2.
Finally, I want the ability to flag(?) certain records as "yes, I have reviewed this data and it is correct" such that it then becomes available to all the vendors. Say I flag Widget 1 as good data, that product should now be seen by all vendors.
It seems that the solution is row level security. The problem is that I'm not too familiar with its concepts or how to implement it in MySQL. Any help is highly appreciated. Thanks.
NOTE: this problem is somewhat discussed here: Database Design: use composite key as FK, flag data for sharing?. When I asked the question, I wasn't sure how to phrase the question very well. Hopefully, I explained my problem better this time.

MySQL doesn't natively support row-level security on tables. However, you can approximate it with views: create a view on your table that exposes only the rows you want a given client to see, then grant that client access to the view only, not to the underlying tables.
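A minimal sketch of the view approach, assuming each vendor connects with its own MySQL account and that the vendor table records that account name in a hypothetical mysql_user column. The view name, the account and the password are made up; the table names vendor, product and vendor_product come from the question, while their columns are assumptions.

```sql
-- Hypothetical tables; column names are assumptions.
CREATE TABLE vendor (
  vendor_id  INT NOT NULL AUTO_INCREMENT,
  mysql_user VARCHAR(32) NOT NULL,      -- assumed: the vendor's MySQL account name
  PRIMARY KEY (vendor_id)
) ENGINE=InnoDB;

CREATE TABLE product (
  product_id   INT NOT NULL AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (product_id)
) ENGINE=InnoDB;

CREATE TABLE vendor_product (
  vendor_id  INT NOT NULL,
  product_id INT NOT NULL,
  PRIMARY KEY (vendor_id, product_id),
  FOREIGN KEY (vendor_id)  REFERENCES vendor (vendor_id),
  FOREIGN KEY (product_id) REFERENCES product (product_id)
) ENGINE=InnoDB;

-- The view runs with its definer's privileges, so vendors need no rights on
-- the base tables. USER() still reports the account of the connected session,
-- which is what restricts the rows.
CREATE SQL SECURITY DEFINER VIEW my_products AS
SELECT p.product_id, p.product_name
FROM product AS p
JOIN vendor_product AS vp ON vp.product_id = p.product_id
JOIN vendor         AS v  ON v.vendor_id   = vp.vendor_id
WHERE v.mysql_user = SUBSTRING_INDEX(USER(), '@', 1);

-- Made-up account that may read the view and nothing else.
CREATE USER 'vendor1'@'localhost' IDENTIFIED BY 'change-me';
GRANT SELECT ON my_products TO 'vendor1'@'localhost';
```

If the application connects with one shared database account instead, the filter has to come from the application (for example a vendor id passed into each query), because USER() would then be the same for every vendor.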
See http://www.sqlmaestro.com/resources/all/row_level_security_mysql/

You already suggested a vendor, product and vendor_product mapping table. You want vendors to share the same product if they both want to use it, but you don't want duplicate products. Right?
If so, then define a unique index/constraint on the natural key that identifies a product (product name?).
If a vendor adds a product, and it doesn't exist, insert it into the product table, and map it to that vendor via the vendor_product table.
If the product already exists, but is mapped to another vendor, do not insert anything into the product table, and add another mapping row mapping the new vendor to the existing product (so that now the product is mapped to two vendors).
When a vendor removes a product, instead of actually removing it, just delete the vendor_product row mapping the two. Then, if no other vendors are still referencing the product, you can remove it. Alternatively, you could run a script periodically that deletes all products that no longer have vendors referencing them.
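The add and remove steps above could be sketched in MySQL roughly as follows. The table names and the product_id, vendor_id and reviewed columns appear in this thread; product_name as the natural key and the vendor_name column are assumptions.

```sql
-- Hypothetical schema; product_name as the natural key is an assumption.
CREATE TABLE vendor (
  vendor_id   INT NOT NULL AUTO_INCREMENT,
  vendor_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (vendor_id)
) ENGINE=InnoDB;

CREATE TABLE product (
  product_id   INT NOT NULL AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  reviewed     TINYINT(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (product_id),
  UNIQUE KEY uq_product_name (product_name)   -- blocks duplicate products
) ENGINE=InnoDB;

CREATE TABLE vendor_product (
  vendor_id  INT NOT NULL,
  product_id INT NOT NULL,
  PRIMARY KEY (vendor_id, product_id),
  FOREIGN KEY (vendor_id)  REFERENCES vendor (vendor_id),
  FOREIGN KEY (product_id) REFERENCES product (product_id)
) ENGINE=InnoDB;

INSERT INTO vendor (vendor_id, vendor_name) VALUES (1, 'vendor-1'), (2, 'vendor-2');

-- vendor-2 adds Widget 2: the product row is created and mapped.
INSERT IGNORE INTO product (product_name) VALUES ('Widget 2');
INSERT IGNORE INTO vendor_product (vendor_id, product_id)
  SELECT 2, product_id FROM product WHERE product_name = 'Widget 2';

-- vendor-1 adds the same product: the first statement inserts nothing
-- because of the unique key, the second adds only the mapping row.
INSERT IGNORE INTO product (product_name) VALUES ('Widget 2');
INSERT IGNORE INTO vendor_product (vendor_id, product_id)
  SELECT 1, product_id FROM product WHERE product_name = 'Widget 2';

-- vendor-1 later removes Widget 2: only the mapping row goes away.
DELETE FROM vendor_product
WHERE vendor_id = 1
  AND product_id = (SELECT product_id FROM product WHERE product_name = 'Widget 2');

-- Periodic cleanup of products that no vendor references any more.
DELETE p
FROM product AS p
LEFT JOIN vendor_product AS vp ON vp.product_id = p.product_id
WHERE vp.product_id IS NULL;
```

INSERT IGNORE is used here for brevity; it also hides other errors, so a stricter version might check for the existing row explicitly.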
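Next, have a flag on the product table that says that you've reviewed the product, and then use something like this to query for products viewable by a given vendor (we'll say vendor id 7):
select product.*
from product
-- joining on vendor 7 only gives at most one mapping row per product
left join vendor_product
on vendor_product.product_id = product.product_id
and vendor_product.vendor_id = 7
-- keep products mapped to vendor 7, plus any reviewed product
where vendor_product.vendor_id is not null
or product.reviewed = 1;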
Finally, if a product is owned by multiple vendors, then you can either disallow edits or perhaps "split" the single product into a new unique product when one of the owning vendors tries to edit it, and allow them to edit their own copy of the product. They would likely need to modify the product name though, unless you come up with some other natural key to base your unique constraint on.

It sounds to me like you want to normalize your data. What you have is a 1 (product) to many (vendors) relationship. That the relationship is 1:1 in most cases and 1:n only in some doesn't really matter, I would say: in general terms it's still 1:n and therefore you should design your database this way. The basic layout would probably be this:
Vendor Table
VendorId VendorName OtherVendorRelatedInformation

WidgetTable
WidgetId WidgetName WidgetFlag CreatorVendor OtherWidgetInformation

WidgetOwnerships
VendorId WidgetId OwnershipStatus OtherInformation

Update: The question of who is allowed to do what is a business problem so you need to have all the rules laid out. In the above structure you can flag which vendor created the widget. And in the ownership you can flag what the status of the ownership is, for example
CreatorFullOwnership
SharedOwnership
...
You would have to make up the flags based on your business rules and then design the business logic and data access part accordingly.

Interesting Database Architecture Scenario

I am currently in the process of rolling a custom order-processing system. My current structure is pretty standard, invoices, purchase orders, and items are all kept in separate tables. Items know which form(s) they are on by keeping track of the form's id, but forms don't know what items are in them (in the database). This was all well and good until I had a new requirement added to the mix: stocking orders.
The way a stocking order works is that sometimes a customer places an order for more units than what is in stock, so we want to place an order with our supplier for enough units to fulfill the order and replenish our stock. However, we often have to build these orders up as the minimums are pretty high, so one stocking order is usually comprised of several customer orders (sometimes for the same item) PLUS a few line items that are NOT connected to an order and are just for stocking purposes.
This presents a problem with my current architecture because I now need to keep track of what comes in from the stocking orders (suppliers often ship partial orders), where items have been allocated, and which incoming items are for stock.
My initial idea was to create a new database table that mostly mimics the items table, but is kind of like an aggregate (but not calculated) table that keeps track of the items and their corresponding metadata (how many units received, how many for stock, etc.) for the stocking orders only. I would have to keep the two tables in sync if something changed in one of them (like a quantity).
Is this overkill, and maybe there's a better way to do it? Database architecture is definitely not my forte, so I was hoping that someone could either tell me that this is an ok way to do things or that there is a better, more correct way to do it.
Thanks so much!
For what it's worth, what I'm using: VB, .NET 4.0, MySQL 5.0
Also, if you want clarification on anything, please ask! I will be closely monitoring this question.

Visit databaseanswers.com. Navigate to "free data models", and look for "Inventory Management". You should find some good examples.

You didn't mention them, but you will need tables for:
SUPPLIERS, ORDERS, and INVENTORY
Also, the base tables you mention as 'knowing about' each other probably need associative (many-to-many) tables which tell you things like which items are on which order, which suppliers supply which items, lead times, costs, etc.
It would be helpful to see your actual schema.

I use a single Documents table, with a DocType field. Client documents (Order, Invoice, ProForma, Delivery, Credit Notes) are mixed with Suppliers documents (PO, Reception).
This makes it quite easy to calculate Client backorders, Supplier backorders, etc...
I am just a bit unhappy because I have different tables for SUPPLIERS and CLIENTS, and therefore the DOCUMENTS table has both a SupplierId field and a ClientId field, which is a bit bad. If I had to redo it I might consider a single Companies table containing both Clients and Suppliers.
I use a PK (DocId) that's autoincrement, and a unique key (DocNum) that's like XYY00000, with
X = doc type
YY = year
00000 = increment.
This is because a doc can be saved at any time, but it only receives a DocNum at validation time.
To track backorders (Supplier or Client), you will need to have a Grouping field in the DocDetails table, so that if you have an Order line 12345, you copy that Link field to every Detail line related to it (invoice, Delivery).
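A rough MySQL sketch of this single-table approach follows. Documents, DocDetails, DocId, DocNum and DocType come from the description above; the LinkId, ItemId and Quantity columns and the one-letter DocType codes ('O' for a client order, 'D' for a delivery) are assumptions made for the example.

```sql
-- Hypothetical sketch; LinkId, ItemId, Quantity and the DocType codes are assumed.
CREATE TABLE Documents (
  DocId   INT NOT NULL AUTO_INCREMENT,
  DocNum  CHAR(8) NULL,                 -- XYY00000, assigned only at validation
  DocType CHAR(1) NOT NULL,             -- e.g. 'O' = client order, 'D' = delivery
  PRIMARY KEY (DocId),
  UNIQUE KEY uq_docnum (DocNum)         -- unvalidated docs keep DocNum NULL
) ENGINE=InnoDB;

CREATE TABLE DocDetails (
  DetailId INT NOT NULL AUTO_INCREMENT,
  DocId    INT NOT NULL,
  LinkId   INT NOT NULL,                -- copied from the originating order line
  ItemId   INT NOT NULL,
  Quantity INT NOT NULL,
  PRIMARY KEY (DetailId),
  FOREIGN KEY (DocId) REFERENCES Documents (DocId)
) ENGINE=InnoDB;

-- Client backorders: quantity ordered minus quantity delivered, per order line.
SELECT d.LinkId,
       d.ItemId,
       SUM(CASE WHEN doc.DocType = 'O' THEN d.Quantity ELSE 0 END)
     - SUM(CASE WHEN doc.DocType = 'D' THEN d.Quantity ELSE 0 END) AS backordered
FROM DocDetails AS d
JOIN Documents  AS doc ON doc.DocId = d.DocId
WHERE doc.DocType IN ('O', 'D')
GROUP BY d.LinkId, d.ItemId
HAVING backordered > 0;
```

The same query shape would give supplier backorders by swapping in the codes for purchase orders and receptions.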
Hope I am not too confusing. The thing works well, after 3 years and over 50,000 docs.

This approach also implies that you will have a stock holding which is allocated for orders. Without individual item tracking it's a bit tricky to manage this. Consider: customer A orders
3 pink widgets
1 blue widget
But you only have 1 pink widget in stock - you order 3 pink widgets, and 1 blue.
Customer B orders
2 pink widgets
But you've still only got 1 in stock - you order another pink widget
3 pink widgets arrive from the first supplier order. What are you going to do? You can either reserve all of them for customer A's order and wait for the blue widget to arrive, or you can fulfill customer B's order.
What if the lead time on a pink widget is 3 days and for a blue widget it's 3 weeks? Do you ship partial orders to your customers? Do you have a limit on the amount of stock you will hold?
Just keeping a new table of backorders is not going to suffice.
This stuff gets scary complicated really quickly. You certainly need to spend a lot more time analysing the problem.