I was looking at the following db model and I had some questions on it. I'm sure it's a good design as the guy behind it seems to be reasonably well qualified, although some things don't make sense:
Why's he separated out bidders and sellers? I thought you'd have users, and users can place bids and sell items. You'd have a bids table with a reference to the users table, and an auctions table with a reference to the users table. He talks a lot in his tutorials about making sure models are scalable and ready for change (don't have a status column, for instance; have statuses in another table and reference that), so what's up here?
Why are there fields like "planned close date" and "winner"? Isn't this data duplication, as the planned close date could be calculated using the last bid time (for auctions that use auto extend) and the winner is simply the last bid when the auction closes?
FYI: I'm trying to build my own auction site in PHP/MySQL from scratch and it's proving to be quite difficult, so tutorials on this would be great!
Thanks!
Why's he separated out bidders and sellers?
Each table has columns that are specific to that role, so he keeps them separate. I would actually go with a User table and make Bidder and Seller sub-types of it, like:
TABLE User (UserID (PK), ... all common fields for any user)
TABLE Bidder (UserID (PK,FK) ... all fields specific to bidders)
TABLE Seller (UserID (PK,FK) ... all fields specific to sellers)
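As a rough illustration of the three tables above, here is a minimal sketch using Python's built-in sqlite3 module (the original question targets MySQL, so treat the DDL as an approximation). The columns Name, CreditLimit and PayoutAccount are made up for the example; only UserID and the table names come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE User (
    UserID INTEGER PRIMARY KEY,
    Name   TEXT NOT NULL               -- stands in for "all common fields"
);
CREATE TABLE Bidder (
    UserID      INTEGER PRIMARY KEY REFERENCES User(UserID),
    CreditLimit REAL                   -- hypothetical bidder-only field
);
CREATE TABLE Seller (
    UserID        INTEGER PRIMARY KEY REFERENCES User(UserID),
    PayoutAccount TEXT                 -- hypothetical seller-only field
);
""")

# One user who both bids and sells: one row in each sub-type table.
conn.execute("INSERT INTO User VALUES (1, 'alice')")
conn.execute("INSERT INTO Bidder VALUES (1, 500.0)")
conn.execute("INSERT INTO Seller VALUES (1, 'ACC-1')")

row = conn.execute("""
    SELECT u.Name, b.CreditLimit, s.PayoutAccount
    FROM User u
    LEFT JOIN Bidder b ON b.UserID = u.UserID
    LEFT JOIN Seller s ON s.UserID = u.UserID
    WHERE u.UserID = 1
""").fetchone()
```

Because the sub-type tables share the user's primary key, a user can be a bidder, a seller, both, or neither, and the common fields live in one place.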
Concerning "planned close date" and "winner":
Yes, it is data duplication, but in some cases you have to live with that in order to scale properly.
Of course you can use the last bid time from the "Bids" table to calculate the close date of the auction, but if your site gets really big, you don't want to calculate this every time someone loads the "auctions ending soon" list - because you have to calculate it for every single active auction, every time, just to find the few ones that are ending soon.
(and this list will get loaded a lot, believe me!).
Same with the winner - it's just faster to load if you have the information in the auctions table, so you don't always have to join the "Bids" table and get the user from the last bid of every auction.
Think of the page in "My eBay" which shows all the auctions you won in the last 60 days - you would have to search all the bids of all auctions for the winner every single time someone loads this list!
A perfectly normalized database isn't always the best solution if you expect it to scale with lots of users.
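To make the trade-off concrete, here is a small sketch in Python with sqlite3. All column names are my own guesses rather than the ones in the model you linked, dates are stored as ISO strings, and I assume the highest bid wins. The point is only that both list pages read a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Auctions (
    AuctionID        INTEGER PRIMARY KEY,
    PlannedCloseDate TEXT NOT NULL,   -- stored, not derived on every read
    WinnerUserID     INTEGER          -- written once, when the auction closes
);
CREATE TABLE Bids (
    BidID     INTEGER PRIMARY KEY,
    AuctionID INTEGER NOT NULL REFERENCES Auctions(AuctionID),
    UserID    INTEGER NOT NULL,
    Amount    REAL NOT NULL,
    BidTime   TEXT NOT NULL
);
CREATE INDEX idx_auctions_close ON Auctions(PlannedCloseDate);
""")

def close_auction(conn, auction_id):
    """Copy the winning bidder into Auctions.WinnerUserID (a one-time write)."""
    conn.execute("""
        UPDATE Auctions
        SET WinnerUserID = (
            SELECT UserID FROM Bids
            WHERE AuctionID = ?
            ORDER BY Amount DESC, BidTime ASC
            LIMIT 1)
        WHERE AuctionID = ?
    """, (auction_id, auction_id))

conn.execute("INSERT INTO Auctions VALUES (1, '2024-01-02 12:00:00', NULL)")
conn.execute("INSERT INTO Auctions VALUES (2, '2024-03-01 12:00:00', NULL)")
conn.execute("INSERT INTO Bids VALUES (1, 1, 10, 5.0, '2024-01-01 09:00:00')")
conn.execute("INSERT INTO Bids VALUES (2, 1, 11, 7.5, '2024-01-01 10:00:00')")
close_auction(conn, 1)

# "Ending soon" touches only Auctions and can use the index; no join to Bids.
ending_soon = conn.execute(
    "SELECT AuctionID FROM Auctions WHERE PlannedCloseDate < ? "
    "ORDER BY PlannedCloseDate",
    ("2024-02-01 00:00:00",)).fetchall()

# "Auctions I won" is a single-table lookup as well.
won = conn.execute(
    "SELECT AuctionID FROM Auctions WHERE WinnerUserID = ?", (11,)).fetchall()
```

The cost is that the application has to keep the duplicated columns correct whenever a bid extends or closes an auction.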
I have a process wherein I need to keep the history of a database record's information, but the user needs to be able to change that record at any time they please.
Scenario:
Seller creates an item with price of $5 and name of "foo"
Buyer buys item, an order is created linking to that item id
A while later, seller updates item name to "foobar" and item price to $6
Buyer views order history. The item name should be "foo" and price should be $5 since that's what they bought it at, but they are "foobar" and $6, respectively
This happens because when the seller updates the item, they are updating the same item the order is related to.
I thought of 3 possible solutions to this problem, and I would like to get your thoughts on which one you think is best (maybe from your prior experience), or a better solution I have not yet thought of. This is my first time dealing with this situation, so not sure how best to proceed without needing a refactor later.
My solutions:
Make the item name and price immutable.
Bad UX, cause now user has to delete item and recreate it if they want to make a modification
Requires some kind of deleted_at column in case user wants to delete the item after it has been purchased so that I can still keep it for referencing later to grab history data
Create a second table for history purposes
Not horrible, but requires a second table with a different name, not a big fan of the idea
Would have to run queries potentially twice to check both tables for similar data, as opposed to just querying one table
Create two records in the same table, and mark a boolean flag or some other flag to differentiate from historical/current records
I like this one the best, but not sure if the boolean flag may have any negative performance implications
I've encountered this issue too, particularly in product catalogs where the price changes frequently. Or the price may be on sale or discounted for a specific customer for some reason.
The only solution I've found is to copy the relevant product details to the customer's order record at the time they buy the product. In your example, at least the product name and the product price would be copied.
This might seem like it goes against the philosophy of "don't store redundant data" but it's not redundant—it's a fact that the customer bought the product for some specific price on a specific date, and that is still a useful fact forever, even if the current price for that product changes.
There should still be a link to the original product table, so managers can track how many orders included each product, for example. But the current price in the product table does not affect the record of each customer's order.
You might also need to create a product history table, to keep a record of all the times the price or name was changed. But that's for historical record-keeping only, it wouldn't affect typical queries during shopping or buying activities.
In this design:
Product table always stores the current price.
When a customer buys a product, they copy the current price into their own order record.
When a manager changes a price, the app creates a new record in the ProductHistory table.
The most recent record for each product in the ProductHistory table matches the current price for the same product.
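Here is a minimal sketch of that design in Python with sqlite3, replaying the "foo" / "foobar" scenario from the question. The table and column names (OrderItem, NameAtPurchase, PriceAtPurchase, ChangedAt) and the helper functions are my own assumptions, and a real system would probably store money as integer cents or a decimal type rather than REAL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (
    ProductID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Price     REAL NOT NULL            -- always the current price
);
CREATE TABLE ProductHistory (
    ProductID INTEGER NOT NULL REFERENCES Product(ProductID),
    Name      TEXT NOT NULL,
    Price     REAL NOT NULL,
    ChangedAt TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE OrderItem (
    OrderItemID     INTEGER PRIMARY KEY,
    ProductID       INTEGER NOT NULL REFERENCES Product(ProductID),
    NameAtPurchase  TEXT NOT NULL,     -- copied, never updated afterwards
    PriceAtPurchase REAL NOT NULL
);
""")

def save_product(conn, product_id, name, price):
    """Write the current values and log the same values to ProductHistory."""
    conn.execute("INSERT OR REPLACE INTO Product VALUES (?, ?, ?)",
                 (product_id, name, price))
    conn.execute(
        "INSERT INTO ProductHistory (ProductID, Name, Price) VALUES (?, ?, ?)",
        (product_id, name, price))

def buy(conn, product_id):
    """Copy the product's current name and price into a new order line."""
    cur = conn.execute(
        "INSERT INTO OrderItem (ProductID, NameAtPurchase, PriceAtPurchase) "
        "SELECT ProductID, Name, Price FROM Product WHERE ProductID = ?",
        (product_id,))
    return cur.lastrowid

save_product(conn, 1, "foo", 5.0)
order_item_id = buy(conn, 1)
save_product(conn, 1, "foobar", 6.0)   # seller edits the item later

ordered = conn.execute(
    "SELECT NameAtPurchase, PriceAtPurchase FROM OrderItem WHERE OrderItemID = ?",
    (order_item_id,)).fetchone()
current = conn.execute(
    "SELECT Name, Price FROM Product WHERE ProductID = 1").fetchone()
history = conn.execute(
    "SELECT Name, Price FROM ProductHistory WHERE ProductID = 1 ORDER BY rowid"
).fetchall()
```

The order line still says "foo" at 5.0 after the edit, while the Product row and the latest ProductHistory row both show the new values.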
I'm creating a database to keep track of various statistics about myself and I'm wondering if there's a better way to store multiple entries for a single date.
E.g. in my schema I have a table AllergyMedicine, which can track multiple medicines taken on the same date. Is there a better way to do this?
Also, the tables Food and Allergy seem unnecessary. Is there a better way to group tables?
Any suggestions are appreciated!
I find it helps to state the problem in a semi structured way, as below.
The system monitors one or more **persons**.
Each person consumes zero or more **items**. Each consumption has an attribute of date and time.
Items can be **food**, or **medicines**.
Food can be of the types **snack**, **fruit** or **meal**.
A meal has a **type**.
A person may report **symptoms**. Each report will cover a period of time, and be reported at a specific date/time.
Symptoms may be associated with zero or more **allergies**.
I do not believe that "date" is an entity in your schema - it's an attribute of events that occur, e.g. consuming something, or noticing a symptom.
If the statements above are true, the schema might be:
Persons
  ID
  name
  ...
FoodItemType
  ID
  Name
FoodItem
  ID
  Name
  FoodItemTypeID (FK)
Medicine
  ID
  Name
FoodConsumption
  PersonID
  FoodID
  ConsumptionDateTime
MedicineConsumption
  PersonID
  MedicineID
  ConsumptionDateTime
Symptom
  ID
  Name
  ....
SymptomObservation
  PersonID
  SymptomID
  SymptomStartDateTime
  SymptomEndDateTime
  SymptomReportDateTime
Allergy
  ID
  Name
AllergySymptom
  AllergyID
  SymptomID
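To show how this handles several medicines on the same date, here is a small sketch of just the Persons, Medicine and MedicineConsumption part, using Python's sqlite3. The sample names and timestamps are invented, and date/times are stored as ISO text, which is an assumption on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons  (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Medicine (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE MedicineConsumption (
    PersonID            INTEGER NOT NULL REFERENCES Persons(ID),
    MedicineID          INTEGER NOT NULL REFERENCES Medicine(ID),
    ConsumptionDateTime TEXT NOT NULL   -- the date is an attribute of the event
);
""")

conn.execute("INSERT INTO Persons VALUES (1, 'me')")
conn.executemany("INSERT INTO Medicine VALUES (?, ?)",
                 [(1, "cetirizine"), (2, "loratadine")])
# Two medicines on the same day are simply two rows.
conn.executemany("INSERT INTO MedicineConsumption VALUES (?, ?, ?)", [
    (1, 1, "2024-05-01 08:00:00"),
    (1, 2, "2024-05-01 20:30:00"),
    (1, 1, "2024-05-02 08:05:00"),
])

taken = conn.execute("""
    SELECT m.Name, c.ConsumptionDateTime
    FROM MedicineConsumption c
    JOIN Medicine m ON m.ID = c.MedicineID
    WHERE c.PersonID = ? AND date(c.ConsumptionDateTime) = ?
    ORDER BY c.ConsumptionDateTime
""", (1, "2024-05-01")).fetchall()
```

FoodConsumption and SymptomObservation would follow the same pattern: one row per event, filtered by date when you need a daily view.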
Of course, if you take more than one medicine on one day, why not isolate that day (=date) in its own table?
So you'll have a table "days" with only dates, that you either prefill (like a calendar) or only fill with those days when you really took that medicine.
That way, you save a lot of space by "centering" the date in one table and relating everything else to it. Which is actually a very precise model of reality.
All your "FoodSnack", "FoodMeal", "AllergyMedicine" etc. with a date in them will become plain N:M mapping tables then.
You could even abstract further, reduce tables and make just three tables:
symptoms
causes
treatment
All of those related to the central "day" table (I wouldn't call it "Date", cause that's a keyword and easily mistaken also), plus related to each other, where applicable.
The title is somewhat hard to understand, so here is the explanation:
I am building a system that deals with retail transactions, meaning purchases. I have a database with products, where each product has an ID that is also known to the POS system. When a customer makes a purchase, the data is sent to the back-end for parsing and is saved. Everything is fine and dandy until a product's name changes, since my client wants to see the name of the product as it was when it was purchased.
How do I save this data, while also keeping a nice, normal-formed database?
Solutions I could think of are:
De-normalization, where we correlate the incoming data with the info we have in the database, and then save only the final text values, not IDs.
Versioning, where we keep multiple versions of every product, and save each transaction with the ID of the product version that was current when it came in. The problem with this one is that as our retail store chain grows, and more and more changes happen to the products, the complexity of the whole system will greatly increase.
Any thoughts on this?
This is called a slowly changing dimension.
Either solution that you mention works. My preference is the second, versioning. I would have a product table that has an effdate and enddate on the record. You can easily find the current record (where enddate is null) or the record at any point in time.
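A minimal sketch of the versioning approach, in Python with sqlite3. The table name ProductVersion, the helper functions and the sample data are made up for illustration; only the effdate/enddate idea comes from the paragraph above, and dates are stored as ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductVersion (
    VersionID INTEGER PRIMARY KEY,
    ProductID INTEGER NOT NULL,
    Name      TEXT NOT NULL,
    EffDate   TEXT NOT NULL,
    EndDate   TEXT              -- NULL marks the current version
);
""")

def rename_product(conn, product_id, new_name, when):
    """Close the current version (if any) and open a new one."""
    conn.execute(
        "UPDATE ProductVersion SET EndDate = ? "
        "WHERE ProductID = ? AND EndDate IS NULL",
        (when, product_id))
    conn.execute(
        "INSERT INTO ProductVersion (ProductID, Name, EffDate) VALUES (?, ?, ?)",
        (product_id, new_name, when))

def name_at(conn, product_id, when):
    """Return the product name that was in effect at the given date."""
    row = conn.execute("""
        SELECT Name FROM ProductVersion
        WHERE ProductID = ?
          AND EffDate <= ?
          AND (EndDate IS NULL OR EndDate > ?)
    """, (product_id, when, when)).fetchone()
    return row[0] if row else None

rename_product(conn, 42, "Widget", "2023-01-01")
rename_product(conn, 42, "Widget Pro", "2024-01-01")

current_name = conn.execute(
    "SELECT Name FROM ProductVersion WHERE ProductID = ? AND EndDate IS NULL",
    (42,)).fetchone()[0]
```

A transaction can then either store the VersionID it was sold under, or store the ProductID plus the purchase date and resolve the name with a lookup like name_at.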
The first method always strikes me as more "quick-and-dirty", but it also works. It just gets cumbersome when you have more fields and more objects you are trying to track. It does, in general though, win on performance.
If the name has to be the name as it was originally, the easiest, simplest and most reliable way to do that is to save the name of the product in the invoice line item record.
You should still link to the product with a ProductID, of course.
If you want to keep a history of name changes, you can do that in a separate table if you wish:
ProductNameID
ProductID
Date
Description
And store a ProductNameID with the invoice line item.
So I have this application that I'm drawing up and I start to think about my users. Well, My initial thought was to create a table for each group type. I've been thinking this over though and I'm not sure that this is the best way.
Example:
// Users
Users [id, name, email, age, etc]
// User Groups
Player [id, years playing, etc]
Ref [id, certified, etc]
Manufacturer Rep [id, years employed, etc]
So everyone would be making an account, but each user would have a different group. They can also be in multiple different groups. Each group has its own list of different columns. So what is the best way to do this? Let's say I have 5 groups. Do I need 8 tables + a relational table connecting each one to the user table?
I just want to be sure that this is the best way to organize it before I build it.
Edit:
A player would have columns regarding the gear that they use to play, the teams they've played with, events they've gone to.
A ref would have info regarding the certifications they have and the events they've reffed.
Manufacturer reps would have info regarding their position within the company they rep.
A parent would have information regarding how long they've been involved with the sport, perhaps relations with the users they are parent of.
Just as an example.
Edit 2:
**Player Table**
id
user id
started date
stopped date
rank
**Ref Table**
id
user id
started date
stopped date
is certified
certified by
verified
**Photographer / Videographer / News Reporter Table**
id
user id
started date
stopped date
worked under name
website / channel link
about
verified
**Tournament / Big Game Rep Table**
id
user id
started date
stopped date
position
tourney id
verified
**Store / Field / Manufacturer Rep Table**
id
user id
started date
stopped date
position
store / field / man. id
verified
This is what I planned out so far. I'm still new to this so I could be doing it completely wrong. And it's only five groups. It was more until I condensed it some.
I find it weird to have so many entities that are different from each other, but I will ignore this and get to the question.
It depends on the group criteria you need. In the case you described, where each group has its own columns and information, I guess your design is a good one, especially if you need the information in a readable form in the database. If you need all groups in a single table, you will have to save the group-relevant information in some kind of object (a blob, an XML string or any other form), but then you will lose the ability to filter on these criteria using the database.
In a relational Database I would do it using the design you described.
The design of your tables greatly depends on the requirements of your software.
E.g. your description of users led me in a wrong direction, I was at first thinking about a "normal" user of a software. Basically name, login-information and stuff like that. This I would never split over different tables as it really makes tasks like login, session handling, ... really complicated.
Another point which surprised me, was that you want to store the equipment in columns of those user's tables. Usually the relationship between a person and his equipment is not 1 to 1 and in most cases the amount of different equipment varies. Thus you usually have a relationship between users and their equipment (1:n). Thus you would design an equipment table and there refer to the owner's user id.
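As a rough sketch of that 1:n relationship, in Python with sqlite3 (the table and column names and the sample gear are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE Equipment (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES Users(id),  -- the owner
    description TEXT NOT NULL
);
""")

conn.execute("INSERT INTO Users VALUES (1, 'sam')")
# Any number of equipment rows can point at the same user.
conn.executemany(
    "INSERT INTO Equipment (user_id, description) VALUES (?, ?)",
    [(1, "helmet"), (1, "gloves"), (1, "pads")])

gear = [row[0] for row in conn.execute(
    "SELECT description FROM Equipment WHERE user_id = ? ORDER BY id", (1,))]
```

Adding a fourth piece of gear is then just another row, not another column on the player table.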
But once you have an idea of which data you have in your application and which relationships exist between your data, the design of the tables and so on is rather straightforward.
The good news is, that your data model and database design will develop over time. Try to start with a basic model, covering the majority of your use cases. Then slowly add more use cases / aspects.
As long as you are in the planning and early implementation phase, it is rather easy to change your database design.
I am currently in the process of rolling a custom order-processing system. My current structure is pretty standard, invoices, purchase orders, and items are all kept in separate tables. Items know which form(s) they are on by keeping track of the form's id, but forms don't know what items are in them (in the database). This was all well and good until I had a new requirement added to the mix: stocking orders.
The way a stocking order works is that sometimes a customer places an order for more units than what is in stock, so we want to place an order with our supplier for enough units to fulfill the order and replenish our stock. However, we often have to build these orders up as the minimums are pretty high, so one stocking order is usually comprised of several customer orders (sometimes for the same item) PLUS a few line items that are NOT connected to an order and are just for stocking purposes.
This presents a problem with my current architecture because I now need to keep track of what comes in from the stocking orders as often suppliers ship partial orders, where items have been allocated, and which incoming items are for stock.
My initial idea was to create a new database table that mostly mimics the items table, but is kind of like an aggregate (but not calculated) table that only keeps track of the items and their corresponding metadata (how many units received, how many for stock, etc) for only the stocking orders. I would have to keep the two tables in synch if something changed from one of them (like a quantity).
Is this overkill, and maybe there's a better way to do it? Database architecture is definitely not my forte, so I was hoping that someone could either tell me that this is an ok way to do things or that there is a better, more correct way to do it.
Thanks so much!
For what it's worth, what I'm using: VB, .NET 4.0, MySQL 5.0
Also, if you want clarification on anything, please ask! I will be closely monitoring this question.
Visit databaseanswers.com. Navigate to "free data models", and look for "Inventory Management". You should find some good examples.
You didn't mention them, but you will need tables for:
SUPPLIERS, ORDERS, and INVENTORY
also, the base tables you mention 'knowing about' - these probably need associative style many to many tables which tell you things like which items are on which order, and which suppliers supply which items, lead times, costs etc.
it would be helpful to see your actual schema.
I use a single Documents table, with a DocType field. Client documents (Order, Invoice, ProForma, Delivery, Credit Notes) are mixed with Suppliers documents (PO, Reception).
This makes it quite easy to calculate Client backorders, Supplier backorders, etc...
I am just a bit unhappy because I have different tables for SUPPLIERS and CLIENTS, and therefore the DOCUMENTS table has both a SupplierId field and a ClientId field, which is a bit bad. If I had to redo it I might consider a single Companies table containing both Clients and Suppliers.
I use a PK (DocId) that's autoincrement, and a unique key (DocNum) that's like XYY00000, with
X = doc type
YY= year
00000 = increment.
This is because a doc can be saved as a draft, but it only receives a DocNum when it is validated.
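To illustrate the XYY00000 format, here is a small Python sketch. The function name and the doc type letter in the example are made up, and in a real system the increment would have to come from the database so that two documents validated at the same time cannot get the same number.

```python
def make_doc_num(doc_type: str, year: int, sequence: int) -> str:
    """Build a DocNum of the form XYY00000.

    doc_type: single character identifying the document type (X)
    year:     calendar year; only the last two digits are used (YY)
    sequence: running increment for that type and year (00000)
    """
    if len(doc_type) != 1:
        raise ValueError("doc_type must be a single character")
    if not 0 <= sequence <= 99999:
        raise ValueError("sequence must fit in five digits")
    return f"{doc_type}{year % 100:02d}{sequence:05d}"


# Assigned at validation time, not when the draft is first saved.
doc_num = make_doc_num("I", 2024, 37)
```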
To track backorders (Supplier or Client), you will need a grouping (link) field in the DocDetails table, so that if you have an Order line 12345, you copy that link value to every Detail line related to it (Invoice, Delivery).
Hope I am not too confusing. The thing works well, after 3 years and over 50,000 docs.
This approach also implies that you will have a stock holding which is allocated for orders - without individual item tracking it's a bit tricky to manage this. Consider: customer A orders
3 pink widgets
1 blue widget
But you only have 1 pink widget in stock - you order 3 pink widgets, and 1 blue.
Customer B orders
2 pink widgets
But you've still only got 1 in stock - you order another pink widget
3 pink widgets arrive from the first supplier order. What are you going to do? You can either reserve all of them for customer A's order and wait for the blue widget to arrive, or you can fulfill customer B's order.
What if the lead time on a pink widget is 3 days and for a blue widget it's 3 weeks? Do you ship partial orders to your customers? Do you have a limit on the amount of stock you will hold?
Just keeping a new table of backorders is not going to suffice.
This stuff gets scary complicated really quickly. You certainly need to spend a lot more time analysing the problem.