Interesting Database Architecture Scenario - mysql

I am currently in the process of rolling a custom order-processing system. My current structure is pretty standard, invoices, purchase orders, and items are all kept in separate tables. Items know which form(s) they are on by keeping track of the form's id, but forms don't know what items are in them (in the database). This was all well and good until I had a new requirement added to the mix: stocking orders.
The way a stocking order works is that sometimes a customer places an order for more units than what is in stock, so we want to place an order with our supplier for enough units to fulfill the order and replenish our stock. However, we often have to build these orders up as the minimums are pretty high, so one stocking order is usually comprised of several customer orders (sometimes for the same item) PLUS a few line items that are NOT connected to an order and are just for stocking purposes.
This presents a problem with my current architecture because I now need to keep track of what comes in from the stocking orders as often suppliers ship partial orders, where items have been allocated, and which incoming items are for stock.
My initial idea was to create a new database table that mostly mimics the items table, but is kind of like an aggregate (but not calculated) table that only keeps track of the items and their corresponding metadata (how many units received, how many for stock, etc) for only the stocking orders. I would have to keep the two tables in synch if something changed from one of them (like a quantity).
Is this overkill, and maybe there's a better way to do it? Database architecture is definitely not my forte, so I was hoping that someone could either tell me that this is an ok way to do things or that there is a better, more correct way to do it.
Thanks so much!
For what it's worth, what I'm using: VB, .NET 4.0, MySQL 5.0
Also, if you want clarification on anything, please ask! I will be closely monitoring this question.

Visit databaseanswers.com. Navigate to "free data models", and look for "Inventory Management". You should find some good examples.

you didnt mention them but you will need tables for:
SUPPLIERS, ORDERS, and INVENTORY
also, the base tables you mention 'knowing about' - these probably need associative style many to many tables which tell you things like which items are on which order, and which suppliers supply which items, lead times, costs etc.
it would be helpful to see your actual schema.

I use a single Documents table, with a DocType field. Client documents (Order, Invoice, ProForma, Delivery, Credit Notes) are mixed with Suppliers documents (PO, Reception).
This makes it quite easy to calculate Client backorders, Supplier backorders, etc...
I am just a bit unhappy because I have different tables for SUPPLIERS and CLIENTS, and therefore the DOCUMENTS table has both a SupplierId field and a ClientId field, which is a bit bad. If I had to redo it I might consider a single Companies table containing both Clients and Suppliers.
I use a PK (DocId) that's autoincrement, and a unique key (DocNum) that's like XYY00000, with
x= doc type
YY= year
00000 = increment.
This is because a doc can be saved but is only at validation time it receives a DocNum.
To track backorders (Supplier or Client), you will need to have a Grouping field in the DocDetails table, so that if you have an Order line 12345, you copy that Link field to every Detail line related to it (invoice, Delivery).
Hope I am not to confusing. The thing works well, after 3 years and over 50,000 docs.

This approach also implies that you will have a stock holding which is allocated for orders - without individual item tracking its a bit tricky to manage this. Consider, customer A orders
3 pink widgets
1 blue widget
But you only have 1 pink widget in stock - you order 3 pink widgets, and 1 blue.
Customer B orders
2 pink widgets
But you've still only got 1 in stock - you order another pink widget
3 pink widgets arrive from the first supplier order. What are you going to do? You can either reserve all of them for customer A's order and wait for the blue and red widget to arrive, or you can fulfill customer B's order.
What if the lead time on a pink widget is 3 days and for a blue widget it's 3 weeks? Do you ship partial orders to your customers? Do you have a limit on the amount of stock you will hold?
Just keeping a new table of backorders is not going to suffice.
This stuff gets scary complicated really quickly. You certainly need to spend a lot more time analysing the problem.

Related

Forming my ERD, couple of issues with it

Okay, my partners and myself created these a while ago. We are going to be transferring this into SQL through Visual Basic soon, but I want to make sure everything is ready to go. Two major complaints that we were not able to fix was...
"By having Transaction and Product directly connected you are unable to allow multiple products on the same order (customer can't order both a latte and cappuccino on the same order)."
"Membership table: Not sure what the data in DiscountTypeTotal means - do you have multiple pieces of data in the same field? (then your table isn't on 1NF).
It looks like you need to allow each member to have multiple discounts - so you need another table to capture that."
How do we effectively correct these? How else would we connect Transaction with Products? I understand that the customer can only purchase one item per transactions, so would we have products and another table for multiple items? Allowing multiple customers to have discounts, I am lost. Any help would be appreciated.

Stock management database design

I'm creating an Intranet for my company, and we want to have a stock management in it. We sell and rent alarm systems, and we want to have a good overview of what product is still in our offices, what has been rented or sold, at what time, etc.
At the moment I thought about this database design :
Everytime we create a new contract, this contract is about a location or a sale of an item. So we have an Product table (which is the type of product : alarms, alarm watches, etc.), and an Item table, which is the item itself, with it unique serial number. I thought about doing this, because I'll need to have a trace of where a specific item is, if it's at a client house (rented), if it's sold, etc. Products are related to a specific supplier, to whom we can take orders. But here, I have a problem, shouldn't the order table be related to Product ?
The main concern here is the link between Stock, Item, Movement stock. I wanted to create a design where I'd be able to see when a specific Item is pulled out of our stock, and when it enters the stock with the date. That's why I thought about a Movement_stock table. The Type_Movement is either In / Out.
But I'm a bit lost here, I really don't know how to do it nicely. That's why I'm asking for a bit of help.
I have the same need, and here is how I tackled your stock movement issue (which became my issue too).
In order to modelize stock movement (+/-), I have my supplying and my order tables. Supplying act as my +stock, and my orders my -stock.
If we stop to this, we could compute our actual stock which would be transcribed into this SQL query:
SELECT
id,
name,
sup.length - ord.length AS 'stock'
FROM
product
# Computes the number of items arrived
INNER JOIN (
SELECT
productId,
SUM(quantity) AS 'length'
FROM
supplying
WHERE
arrived IS TRUE
GROUP BY
productId
) AS sup ON sup.productId = product.id
# Computes the number of order
INNER JOIN (
SELECT
productId,
SUM(quantity) AS 'length'
FROM
product_order
GROUP BY
productId
) AS ord ON ord.productId = product.id
Which would give something like:
id name stock
=========================
1 ASUS Vivobook 3
2 HP Spectre 10
3 ASUS Zenbook 0
...
While this could save you one table, you will not be able to scale with it, hence the fact that most of the modelization (imho) use an intermediate stock table, mostly for performance concerns.
One of the downside is the data duplication, because you will need to rerun the query above to update your stock (see the updatedAt column).
The good side is client performance. You will deliver faster responses through your API.
I think another downside could be if you are managing high traffic store. You could imagine creating another table that stores the fact that a stock is being recomputed, and make the user wait until the recomputation is finished (push request or long polling) in order to check if every of his/her items are still available (stock >= user demand). But that is another deal...
Anyway even if the stock recomputation query is using anonymous subqueries, it should actually be quite fast enough in most of the relatively medium stores.
Note
You see in the product_order, I duplicated the price and the vat. This is for reliability reasons: to freeze the price at the moment of the purchase, and to be able to recompute the total with a lot of decimals (without loosing cents in the way).
Hope it helps someone passing by.
Edit
In practice, I use it with Laravel, and I use a console command, which will compute my product stock in batch (I also use an optional parameter to compute only for a certain product id), so my stock is always correct (relative to the query above), and I never manually update the stock table.
This is an interesting discussion and one that also could be augmented with stock availability as of a certain date...
This means storing:
Planned Orders for the Product on a certain date
Confirmed Orders as of a certain date
Orders Delivered
Orders Returned (especially if this is a hire product)
Each one of these product movements could be from and to a location
The user queries would then include:
What is my overall stock on hand
What is due to be delivered on a certain date
What will the stock on hand be as of a date overall
What will the stock on hand be as of a date for a location
The inventory design MUST take into account the queries and use cases of the users to determine design and also breaking normalisation rules to provide adequate performance at the right time.
Lots to consider and it all depends on the software use cases.

Database model for a 24/7 Staff roster at a casino

We presently use a pen/paper based roster to manage table games staff at the casino. Each row is an employee, each column is a 20 minute block of time and each cell represents what table the employee is assigned to, or alternatively they've been assigned to a break. The start and end time of shifts for employees vary as do the games/skills they can deal. We need to keep a copy of the rosters for 7 years, with paper this is fairly easy, I'm wanting to develop a digital application and am having difficulty how to store the data in a database for archiving.
I'm fairly new to working with databases, I think I understand how to model the data for a graph database like neo4j, but I had difficulty when it came to working with time. I've tried to learn about RDBMS databases like MySQL, below is how I think the data should be modelled. Please point out if I'm going in the wrong direction or if a different database type would be more appropriate, it would be greatly appreciated!
Basic Data
Here is some basic data to work with before we factor in scheduling/time.
Employee
- ID Number
- Name
- Skills (Blackjack, Baccarat, Roulette, etc)
Table
- ID Number
- Skill/Type (Can only be one skill)
It may be better to store the roster data as a file like JSON instead? Time sensitive data wouldn't be so much of a problem then. The benefit of going digital with a database would be queries, these could help assist time consuming tasks where human error is common.
Possible Queries
Note: Staff that are on shift are either on a break or on the floor (assigned to a table), Skills have a major or minor type based on difficulty to learn.
What staff have been on the floor for 80 minutes or more? (They are due for a break)
What open tables can I assign this employee to based on their skillset?
I need an employee that has Baccarat skill but is not already been assigned to a Baccarat table.
What employee(s) was on this table during this period of time?
Where was this employee at this point in time?
Who is on shift right now?
How many staff on shift can deal Blackjack?
How many staff have 3 major skills?
What staff have had the Baccarat skill for at least 3 months?
These queries could also be sorted by alphabetical order or time, skill etc.
I'm pretty sure I know how to perform these queries with cypher for neo4j provided I model the data right. I'm not as knowledgeable with SQL queries, I've read it can get a bit complicated depending on the query and structure.
----------------------------------------------------------------------------------------
MYSQL Specific
An employee table could contain properties such as their ID number and Name, but am I right that for their skills and shifts these would be separate tables that reference the employee by a unique integer(I think this is called a foreign key?).
Another table could store the gaming Tables, these would have their own ID and reference a skill/gametype with a foreign key.
To record data like the pen/paper roster, each day could have a table with columns starting from 0000 increasing by 20 in value going all the way to 2340? Prior to the time columns I could have one for staff where each employee is represented with their foreign key, the time columns would then have foreign keys to the assigned gaming Tables, the row data is bound to have many cells that aren't populated since the employee shift won't be 24/7. If I'm using foreign keys to reference gaming Tables I now have a problem when the employee is on break? Unless I treat say the first gaming Table entry as a break?
I may need to further complicate things though, management will over time try different gaming Table layouts, some of the gaming Tables can be converted from say Blackjack to Baccarat. this is bound to happen quite a bit over 7 years, would I want to be creating new gaming Table entries or add a column to use a foreign key and refer to a new table that stores the history of game types during periods of time? Employees will also learn to deal new games during their career, very rarely they may also have the skill removed.
----------------------------------------------------------------------------------------
Neo4j Specific
With this data would I have an Employee and a Table node that have "isA" relationship edges mapping to actual employees or tables?
I imagine with the skills for the two types I would be best with a Skill node and establish relationships like so?: Blackjack->isA->Skill, Employee->hasSkill->Blackjack, Table->typeIs->Blackjack?
TIME
I find difficulty when I want this database to now work with a timeline. I've come across the following suggestions for connecting nodes with time:
Unix Epoch seems to be a common recommendation?
Connecting nodes to a year/month/day graph?
Lucene timeline? (I don't know much about this or how to work with it, have seen some mention it)
And some cases with how time and data relate:
Staff have varied days and start/end times from week to week, this could be shift node with properties {shiftStart,shiftEnd,actualStart,actualEnd}, staff may arrive late or get sick during shift. Would this be the right way to link each shift to an employee? Employee(node)->Shifts(groupNode)->Shift(node)
Tables and Staff may have skill data modified, with archived data this could be an issue, I think the solution is to have time property on the relationship to the skill?
We open and close tables throughout the day, each table has open/close times for each day, this could change in a month depending on what management wants, in addition the times are not strict, for various reasons a manager may open or close tables during the shift. The open/closed status of a table node may only be relevant for queries during the shift, which confuses me as I'd want this for queries but for archiving with time it might not make sense?
It's with queries that I have trouble deciding when to use a node or add a property to a node. For an Employee they have a name and ID number, if I wanted to find an employee by their ID number would it be better to have that as a node of it's own? It would be more direct right, instead of going through all employees for that unique ID number.
I've also come across labels just recently, I can understand that those would be useful for typing employee and table nodes rather than grouping them under a node. With the shifts for an employee I think should continue to be grouped with a shifts node, If I were to do cypher queries for employees working shifts through a time period a label might be appropriate, however should it be applied to individual shift nodes or the shifts group node that links back to the employee? I might need to add a property to individual shift nodes or the relationship to the shifts group node? I'm not sure if there should be a shifts group node, I'm assuming that reducing the edges connecting to the employee node would be optimal for queries.
----------------------------------------------------------------------------------------
If there are any great resources I can learn about database development that'd be great, there is so much information and options out there it's difficult to know what to begin with. Thanks for your time :)
Thanks for spending the time to put a quality question together. Your requirements are great and your specifications of your system are very detailed. I was able to translate your specs into a graph data model for Neo4j. See below.
Above you'll see a fairly explanatory graph data model. In case you are unfamiliar with this, I suggest reading Graph Databases: http://graphdatabases.com/ -- This website you can get a free digital PDF copy of the book but in case you want to buy a hard copy you can find it on Amazon.
Let's break down the graph model in the image. At the top you'll see a time indexing structure that is (Year)->(Month)->(Day)->(Hour), which I have abbreviated as Y M D H. The ellipses indicate that the graph is continuing, but for the sake of space on the screen I've only showed a sub-graph.
This time index gives you a way to generate time series or ask certain questions on your data model that are time specific. Very useful.
The bottom portion of the image contains your enterprise data model for your casino. The nodes represent your business objects:
Game
Table
Employee
Skill
What's great about graph databases is that you can look at this image and semantically understand the language of your question by jumping from one node to another by their relationships.
Here is a Cypher query you can use to ask your questions about the data model. You can just tweak it slightly to match your questions.
MATCH (employee:Employee)-[:HAS_SKILL]->(skill:Skill),
(employee)<-[:DEALS]-(game:Game)-[:LOCATION]->(table:Table),
(game)-[:BEGINS]->(hour:H)<-[*]-(day:D)<-[*]-(month:M)<-[*]-(year:Y)
WHERE skill.type = "Blackjack" AND
day.day = 17 AND
month.month = 1 AND
year.year = 2014
RETURN employee, skill, game, table
The above query finds the sub-graph for all employees who have the skill Blackjack and their table and location on a specific date (1/17/14).
To do this in SQL would be very difficult. The next thing you need to think about is importing your data into a Neo4j database. If you're curious on how to do that please look at other questions here on SO and if you need more help, feel free to post another question or reach out to me on Twitter #kennybastani.
Cheers,
Kenny

Disregard changes to a product description when retrieving order records

The title is somewhat hard to understand, so here is the explanation:
I am building a system, that deals with retail transactions. Meaning - purchases. I have a database with products, where each product has an ID, that is also known to the POS system. When a customer makes a purchase, the data is sent to the back-end for parsing, and is saved. Now everything is fine and dandy, until there are changes to the products name, since my client wants to see the name of the product, as it was purchased then.
How do I save this data, while also keeping a nice, normal-formed database?
Solutions I could think of are:
De-normalization, where we correlate the incoming data with the info we have in the database, and then save only the final text values, not id's.
Versioning, where we keep multiple versions of every product, and save the transactions with the id of the products version, when it came in. The problem with this one is, that as our retail store chain grows, and there are more and more changes happening to the products, the complexity of the whole product will greatly increase.
Any thoughts on this?
This is called a slowly changing dimension.
Either solution that you mention works. My preference is the second, versioning. I would have a product table that has an effdate and enddate on the record. You can easily find the current record (where enddate is null) or the record at any point in time.
The first method always strikes me as more "quick-and-dirty", but it also works. It just gets cumbersome when you have more fields and more objects you are trying to track. It does, in general though, win on performance.
If the name has to be the name as it was originally, the easiest, simplest and most reliable way to do that is to save the name of the product in the invoice line item record.
You should still link to the product with a ProductID, of course.
If you want to keep a history of name changes, you can do that in a separate table if you wish:
ProductNameID
ProductID
Date
Description
And store a ProductNameID with the invoice line item.

Database design issue - trying to avoid circular reference

I have been at this for a day and a bit now, trying to figure out how to best model the database (MySQL) for an app I'm developing for a friend who owns a bakery. The assumptions are as follows:
many (external) Bakers produce many Products
BakersProducts is updated fortnightly by certain staff who either call bakers for their product prices, or the bakers fax through their pricelist themselves, which the staff then update via a front-end UI.
the manager should be able to generate an order based on the products that she anticipates having.
So the front-end UI must be able to allow the manager to purely choose the products she would like in the order, and then present her with a list of Bakers to choose from for each product in the order.
In other words, Orders_has_Products should also include a reference to BakersProducts.bpID. I'm sure though that if I do this, then I would create a circular reference (of sort) to Products.
Im sure I've gone about this the wrong way, and would really appreciate anyone's advice as to how I can restructure my design to acccommodate the chosen Product Price - ie. to include BakersProducts.bpID.
Thank you!
This is not a circular reference, since
Order_has_products references Products
Order_has_products references BakersProducts
BakersProducts references Products
a circular reference would be if, for example,
Order_has_products references Products
Products references BakersProducts
BakersProducts references Order_has_products
Aside from that, circular references are relatively normal in a database (i.e. Employees table with a manager field, where the manager is herself an employee is a one table circular reference)
What your design has is a simple redundancy, because one product is referenced twice in the Order_has_products table - once directly from the Products table, and once via the related BakersProducts record. There is posibility for getting out of sync, but, since you stated that the business rule is that the product is chosen before the baker, it's quite all right.
I would include the productID even if it was the other way around, because a little denormalization can go a long way when speeding up queries, because otherwise you would have to scan the BakersProducts table, even for simle questions, like, 'Did we have any bagels on wednesday?'
I think the mix-up is from a business process standpoint: you're getting requisitions mixed up with orders.
A requisition has a list of products needed without necessarily specifying the supplier of each, whereas an order is directed at a specific supplier, for specific price look-up codes (what bpID seems to represent). One requisition may spawn multiple orders if it is split across multiple suppliers, and even a single product may have its order split across multiple suppliers, perhaps due to vendor volume limits or locality of delivery.
You may want to provide a view of a requisition that shows the order line item(s) generated from each requisition line item, but that is a user interface concern.
one way of solving this is simply to eliminate the Products table, and move productName into the BakersProducts table.
This would essentially only work if you do not expect the bakers to carry the same product, if the products are unique to the bakers.
If you do expect the bakers to carry the same product, then you may want to leave the separate Products table, but instead of having Order_has_Products.Products_productID, I would change it to Order_has_Products.bpID. If/when you need to access the productName (or other product related metadata that may go in that table) you could just do a join between BakersProducts and Products.