Maintaining a price table that changes every day - MySQL

state (st_id,st_name)
district (d_id,d_name,st_id[FK])
product (pid,pnme)
price (max_price,min_price,pid[FK],d_id[FK])
1.) This is my table structure. I want to show the prices of products in 5 states and their districts, but in the price table I'm repeating every product (more than 10) for each district.
What's wrong with my price table? Could you please give me an idea of how to normalize it?
2.) Now I'm planning to add a date stamp (start date) field to the price table so that I can maintain a historical price list. How can I do that without repeating each product on every date (as shown below)? Is there a better solution that reduces the number of rows?
product | price | district | date (mm/dd/yy)
--------|-------|----------|----------------
fan     | 200   | delhi    | 3/15/2013
speaker | 400   | delhi    | 3/15/2013
fan     | 210   | chenni   | 3/15/2013
speaker | 403   | chenni   | 3/15/2013
fan     | 200   | delhi    | 3/16/2013
fan     | 210   | chenni   | 3/16/2013

1) There's nothing much wrong with your table design - however, the sample data doesn't make sense, as there's a repeat for product 1 and district 111. You might want to create a composite primary key on pid and d_id.
2) Again, nothing much wrong with the table design; you might consider only entering data when there's a change, so that retrieving the price for a given date means searching for the last record on or before the desired date. That reduces the size of the table.
General points: please pick a naming convention and stick to it - you use pid and d_id (one with an underscore, one without); in general, I prefer more descriptive column names, but consistency is key.
Also, there's nothing wrong with large tables, as long as the data isn't redundant. Your design seems to have no redundancies.
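The change-only approach from point 2 can be sketched as follows. This is a minimal illustration rather than a definitive design: it uses Python's sqlite3 module as a stand-in for MySQL, and the table name price_history, the start_date column, the numeric ids and the price change on 2013-03-20 are all assumptions made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL; names and sample rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price_history (
        pid        INTEGER NOT NULL,
        d_id       INTEGER NOT NULL,
        start_date TEXT    NOT NULL,   -- ISO dates sort correctly as text
        price      NUMERIC NOT NULL,
        PRIMARY KEY (pid, d_id, start_date)
    )
""")

# One row per price change, not one row per day.
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2013-03-15", 200),  # fan, delhi
        (1, 10, "2013-03-20", 205),  # fan, delhi: hypothetical price change
        (1, 20, "2013-03-15", 210),  # fan, chenni
    ],
)

def price_on(pid, d_id, day):
    """Return the price in effect on `day`, or None if no row applies yet."""
    row = conn.execute(
        """
        SELECT price FROM price_history
        WHERE pid = ? AND d_id = ? AND start_date <= ?
        ORDER BY start_date DESC
        LIMIT 1
        """,
        (pid, d_id, day),
    ).fetchone()
    return row[0] if row else None
```

With the primary key on (pid, d_id, start_date), the lookup is a short index range scan, and days on which nothing changed cost no rows at all.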

1.) This is my table structure. I want to show the prices of products in 5 states and their districts, but in the price table I'm repeating every product (more than 10) for each district.
If you're offering all your products in all those districts, and the price varies depending on which district the product is sold in, then it only makes sense that you'd repeat the product for each district.
What's wrong with my price table? Could you please give me an idea of how to normalize it?
It looks like your price table doesn't have a sensible primary key.
If you'd built the table along these lines . . .
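create table prices (
  district_id integer not null,
  product_id integer not null,
  min_price numeric(14,2) not null,
  max_price numeric(14,2) not null,
  primary key (district_id, product_id),
  -- MySQL parses but ignores inline "references" on a column, so declare the keys explicitly
  foreign key (district_id) references districts (district_id),
  foreign key (product_id) references products (product_id)
);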
you'd have a table in 5NF, assuming that minimum and maximum product prices vary among the districts. But your sample data couldn't possibly fit in it.
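To see why repeated sample data cannot fit in such a table, here is a small sketch of the composite primary key at work. It runs on Python's sqlite3 module rather than MySQL, leaves out the foreign keys to stay self-contained, and uses made-up prices; the ids 111 and 1 echo the district and product mentioned in the first answer.

```python
import sqlite3

# SQLite stands in for MySQL; the price values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE prices (
        district_id INTEGER NOT NULL,
        product_id  INTEGER NOT NULL,
        min_price   NUMERIC NOT NULL,
        max_price   NUMERIC NOT NULL,
        PRIMARY KEY (district_id, product_id)
    )
""")
conn.execute("INSERT INTO prices VALUES (111, 1, 190, 200)")

def try_insert(row):
    """Attempt an insert and report whether the table accepted it."""
    try:
        conn.execute("INSERT INTO prices VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Same district and product again: rejected by the primary key.
duplicate_accepted = try_insert((111, 1, 195, 210))
# Same product in another district: accepted.
other_district_accepted = try_insert((112, 1, 195, 210))
```

Repeating a product across districts is therefore fine; repeating the same (district, product) pair is what the key forbids, unless a date column becomes part of the key.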

1) In your Price (Price Range) table, I don't understand why (d_id, pid) repeats? There should be only one price range, unless you put an effective date column in the table.
2) You could have a future price table, a current price table, and a history price table. This allows you to enter price changes in advance, keeps the current price table short and allows you to get the historical prices infrequently when you need them. Your application code maintains the relationship between these price tables.
I'm not sure where city came from in your other Price table, since you defined state and district.

Related

DB design for product pricing history for multiple suppliers

I am currently trying to design the DB (MySQL) structure for my project, an online shop for a wholesale company. I have already created everything for products, their multiple variants etc., but I have a problem with prices and historic price data for multiple suppliers:
Please find below main assumptions for the project:
We are going to have several suppliers for products
Thanks to the above each product will have few different prices
We want to be able to have historic price data for each product with each supplier
Variant 1
At first I thought about adding 2 tables to my DB:
suppliers table: supplier_id, name
prices table: id, product_id, price_supplier1, price_supplier2, price_supplier3, timestamp
However, in this design, whenever we want to add another supplier we need to add a column to the table (I am not a DB expert, but I guess that's not the best approach).
Variant 2
Another idea was just to have price table with following:
suppliers table: supplier_id, name
prices table: id, product_id, supplier_id, price, timestamp
However, in this case, if we have 5 suppliers we get 5 records created for each product every single day. Even with only 1000 products and historic data for the last 6 months, such a table would grow very rapidly.
So to summarize - which approach is better or maybe there is a different one that I could implement? Thanks a lot for any suggestions.
You should go with variant 2. It's best practice to avoid frequent table restructuring, which you would have to do in variant 1 any time you add or remove a supplier (although MySQL is fairly fast at this in recent versions). Using a single column to identify the distinct supplier values is better. It also promotes query reuse, because you don't have to worry about columns changing or being dropped altogether. Also, space shouldn't really be a concern. To give you an idea, if your prices table had 1,000,000 rows (6 months), it would be about 40-50 MB in size (assuming only a primary key index). MySQL also offers compression and partitioning to reduce storage, if that's really a concern.
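As a rough sketch of variant 2, the following uses Python's sqlite3 module in place of MySQL. The recorded_at column name, the supplier names, the product id and the prices are assumptions for illustration only. It also works out the row count behind the size estimate above: 1000 products, 5 suppliers and about 180 days.

```python
import sqlite3

# Daily snapshots: products * suppliers * days
rows_after_six_months = 1000 * 5 * 180

# SQLite stands in for MySQL; all sample values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE prices (
        id          INTEGER PRIMARY KEY,
        product_id  INTEGER NOT NULL,
        supplier_id INTEGER NOT NULL REFERENCES suppliers (supplier_id),
        price       NUMERIC NOT NULL,
        recorded_at TEXT    NOT NULL
    );
    INSERT INTO suppliers VALUES (1, 'Supplier A'), (2, 'Supplier B');
    INSERT INTO prices (product_id, supplier_id, price, recorded_at) VALUES
        (42, 1, 9.50, '2024-01-01'),
        (42, 1, 9.75, '2024-01-02'),
        (42, 2, 9.20, '2024-01-01');
""")

def latest_prices(product_id):
    """Most recent price per supplier for one product."""
    return conn.execute(
        """
        SELECT s.name, p.price
        FROM prices p
        JOIN suppliers s ON s.supplier_id = p.supplier_id
        WHERE p.product_id = ?
          AND p.recorded_at = (
              SELECT MAX(p2.recorded_at) FROM prices p2
              WHERE p2.product_id = p.product_id
                AND p2.supplier_id = p.supplier_id
          )
        ORDER BY s.name
        """,
        (product_id,),
    ).fetchall()
```

Adding a sixth supplier is then one INSERT into suppliers; neither the schema nor the query changes.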

SQL Database design - three tables: product, sales order, and purchase order - how to store product quantity?

I have three tables: product, sales_order (where I sell products) and purchase_order (where I buy products). Now I can think of two ways of keeping the quantity of each product:
Have a column in the product table called quantity; when inserting into sales_order, I subtract the quantity; when inserting into purchase_order, I add the quantity
Instead of storing the quantity in the product table, I calculate the quantity from the sales_order and the purchase_order table each time I need to get the product table
I am wondering if the second approach is preferable to the first one? I like the second one more because it doesn't store any redundant data; however, I am not so sure if calculating the quantity every time is a bit too much calculation. I am wondering what is the convention and best practice here? Thank you!
I would use the first one. Add a quantity column to the product table and have your code subtract the ordered amount whenever an order is placed; you can then display the result alongside the order. You could also write a script that emails you when a product's quantity drops to a certain level, telling you to replenish stock. However, the second approach would also work, and SQL is very powerful, so I wouldn't worry about it being too demanding; it will probably work it out faster than we can.
I prefer the first one because in-memory calculations are faster than issuing select statements to check the sales orders and purchase orders assuming that the number of times the quantity value is retrieved is significantly more than the number of times the quantity value is updated.
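For comparison, the second approach from the question (deriving the quantity instead of storing it) might look like the sketch below. It runs on Python's sqlite3 module rather than MySQL, and the product_id and quantity column names are assumptions, since the question does not list the columns.

```python
import sqlite3

# SQLite stands in for MySQL; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase_order (product_id INTEGER NOT NULL, quantity INTEGER NOT NULL);
    CREATE TABLE sales_order (product_id INTEGER NOT NULL, quantity INTEGER NOT NULL);
    INSERT INTO product VALUES (1, 'fan');
    INSERT INTO purchase_order VALUES (1, 50), (1, 30);
    INSERT INTO sales_order VALUES (1, 20);
""")

def stock_on_hand(product_id):
    """Quantity bought minus quantity sold, computed on demand."""
    row = conn.execute(
        """
        SELECT
            (SELECT COALESCE(SUM(quantity), 0)
               FROM purchase_order WHERE product_id = ?)
          - (SELECT COALESCE(SUM(quantity), 0)
               FROM sales_order WHERE product_id = ?)
        """,
        (product_id, product_id),
    ).fetchone()
    return row[0]
```

The two sums are kept in separate subqueries on purpose: joining both order tables to product in a single query would multiply rows and inflate the totals. With an index on product_id in each order table, this stays cheap until the order history becomes very large.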

Storing pincodes and shipping charge for each pincode

I'm creating an e-commerce application using Magento. I need to build a custom shipping module for it, and I'm currently designing the tables to store the data.
The issue is that when a customer places an order, I need to find the shipping companies that provide service (pickup and delivery) in that location. Once I have the shipping company details, I need to get the shipping charge for that particular location. I have already asked a question about how to store pincodes and shipping company details.
The suggestion I got was to create tables as follows:
Shipping Companies
--------
ID (int, PK)
Name (string)
Pincodes
--------
ID (int, PK)
Pincode (string)
These entities have a many-to-many relationship. So create a table to link them:
Shipping Company Pincodes
--------
ID (int, PK)
Shipping Company ID (int, FK)
Pincode ID (int, FK)
Pickup (bit)
Delivery (bit)
Using this table structure I can track which shipping companies provide pickup and delivery. However, once I have these shipping company IDs, the next step is to get the shipping charge for delivering the product to that location. One of my colleagues suggested storing ranges of pincodes instead of every individual pincode, with each row holding the rates for multiple shipping companies. For example:
Pincode | Fedex Rate | DHL Rate | UPS Rate
--------|------------|----------|---------
67 - 69 | 7.7        | 6.5      | 5.5
But since I'm storing a range of pincodes, how will I identify that a shipping company does not provide delivery or pickup for some pincode in that range? Also, is there a better method to store the shipping rates for pincodes? There are around 19000+ pincodes. I thought of storing individual rates for each pincode and shipping company, but that would make the table huge.
Tens of thousands of rows is small for MySQL/MariaDB. I would forego the Pincodes table as well as the surrogate ID in the Shipping Company Pincodes table and use the Shipping Company ID and Pincode as a composite primary key. The Pincode looks like an integer (no less efficient than a surrogate id) and a meaningful natural (externally defined) key, meaning you'll likely need it frequently in queries. If it forms part of your primary keys, it'll be conveniently available and indexed by default. I would also add a Rate column to this table.
To summarize:
Shipping Companies
--------
ID (int, PK)
Name (string)
Shipping Company Pincodes
--------
Shipping Company ID (int, PK/FK)
Pincode (int, PK)
Pickup (bit)
Delivery (bit)
Rate (decimal)
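A minimal sketch of the summarized structure, run on Python's sqlite3 module instead of MySQL. The snake_case table and column names and the sample rows are assumptions; the rates reuse the figures from the question's example.

```python
import sqlite3

# SQLite stands in for MySQL; names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipping_companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE shipping_company_pincodes (
        shipping_company_id INTEGER NOT NULL REFERENCES shipping_companies (id),
        pincode  INTEGER NOT NULL,
        pickup   INTEGER NOT NULL,   -- 0 or 1
        delivery INTEGER NOT NULL,   -- 0 or 1
        rate     NUMERIC NOT NULL,
        PRIMARY KEY (shipping_company_id, pincode)
    );
    INSERT INTO shipping_companies VALUES (1, 'Fedex'), (2, 'DHL');
    INSERT INTO shipping_company_pincodes VALUES
        (1, 67, 1, 1, 7.7),
        (2, 67, 1, 0, 6.5),
        (2, 68, 1, 1, 6.5);
""")

def delivery_options(pincode):
    """Companies that deliver to a pincode, cheapest first."""
    return conn.execute(
        """
        SELECT c.name, p.rate
        FROM shipping_company_pincodes p
        JOIN shipping_companies c ON c.id = p.shipping_company_id
        WHERE p.pincode = ? AND p.delivery = 1
        ORDER BY p.rate
        """,
        (pincode,),
    ).fetchall()
```

Because this query filters on pincode alone, a secondary index on pincode would likely help in MySQL, where the composite primary key leads with the company ID. A company that does not serve a pincode simply has no row for it, which answers the range problem raised in the question.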
This addresses a more complex question, so it is not really addressing the one asked.
Is the main query "What will the companies charge to ship from Pin 12345 to Pin 29876?"
Plan A is a 360 million row table with all possible start/end pins. This may be the best, since it is very efficient to do SELECT ... WHERE pin_from = $from AND pin_to = $to while having PRIMARY KEY(pin_from, pin_to). This table might take 20GB; is that OK? The SELECT might typically take 10ms.
Plan B, which you alluded to, would need a table like
CREATE TABLE Rates (
  from_a INT NOT NULL, from_z INT NOT NULL, -- min and max pins for source pin range
  to_a INT NOT NULL, to_z INT NOT NULL, -- ditto for destination
  fedex DECIMAL(6,2) NULL, -- NULLable in case fedex does not run that route
  -- etc. (one such column per shipping company)
  PRIMARY KEY(from_a, from_z, to_a, to_z)
) ENGINE=InnoDB;
The table would be much smaller. The query is something like:
SELECT IFNULL(fedex, 'N/A') AS Fedex, ...
FROM Rates
WHERE $from BETWEEN from_a AND from_z
AND $to BETWEEN to_a AND to_z;
The problem is that there is no good way to index this. This encounters two problems -- testing within a range in that way is not optimizable, and it is essentially a 2-dimensional problem.
If the table is only thousands of rows, then a table scan is not "too bad". If it is millions of rows, it would probably be too slow.
Loading the table would be a lot of challenging code -- you don't want any overlapping rectangles. Updating the table would be even more challenging.
Plan C... Perhaps a SPATIAL index is exactly what you need. The (x,y) of a Spatial "Point" is the pair (pin_from, pin_to). Sorry, I don't know where to take it next.
Plan D... This is a variant on Plan B, but it greatly improves the efficiency. It adds 2 columns; x, y. They have values 0..190, calculated as floor(pin/100). The idea is to have 190*190 "buckets". In each bucket is every rectangle (a la Plan B) that has a point in the bucket. Yes, that means some rectangles will show up in more than one bucket; this is a small price to pay for significant performance improvement.
PRIMARY KEY(x, y, from_a, from_z, to_a, to_z)
SELECT ...
FROM Rates
WHERE x = FLOOR($from/100)
AND y = FLOOR($to/100)
AND $from BETWEEN from_a AND from_z -- the rest is Plan B's WHERE
AND $to BETWEEN to_a AND to_z;
Since a "bucket" cannot have more than 100*100 rows, and they are "clustered" in the table, the scan is reasonably bounded. If, say, the average rectangle is 10 pins by 10 pins, then the average bucket has only 100 rows -- quite efficient.
Sorry, loading and updating is still complex.
(I picked 100x100 for bucket size; there may be a better choice, based on the size of the typical rectangle. Note the advantage of 100: it leads to 0..190 range, allowing x and y to be small: a 1-byte TINYINT UNSIGNED.)
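The bucket arithmetic in Plan D can be sketched in a few lines of Python. The function names are hypothetical, and the sketch follows the answer's assumption that pins are small integers numbered roughly 0 to 19000; it also reproduces the Plan A row count.

```python
BUCKET = 100  # bucket width chosen in Plan D

def buckets_for_rectangle(from_a, from_z, to_a, to_z):
    """List every (x, y) bucket that a rate rectangle touches.

    When loading the table, the rectangle is stored once per bucket returned.
    """
    xs = range(from_a // BUCKET, from_z // BUCKET + 1)
    ys = range(to_a // BUCKET, to_z // BUCKET + 1)
    return [(x, y) for x in xs for y in ys]

def bucket_for_lookup(pin_from, pin_to):
    """The single bucket a query needs to scan, matching FLOOR(pin/100)."""
    return (pin_from // BUCKET, pin_to // BUCKET)

# Plan A for comparison: one row per possible (pin_from, pin_to) pair.
plan_a_rows = 19000 * 19000
```

A rectangle that stays inside one bucket is stored once; one that straddles a bucket boundary is stored in each bucket it overlaps, which is the duplication the answer accepts in exchange for a bounded scan.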

Can I create a table structure with dynamic columns in MySQL?

I've created a stock control database which contains two tables (actually more than two, but these are the two that are relevant to my question): Stock, and Receipts
I would like the link between the stock in the Stock table and the stock in the Receipts table to be a little clearer. This would be easy if a customer could only order one item of stock per receipt: I'd simply have a StockID column and a Quantity column in the Receipts table, with StockID as an FK to the ID in the Stock table. However, a customer can have any number of stock items on a receipt, which would mean a large number of columns in the Receipts table (i.e. StockID_1, Quantity_1, StockID_2, Quantity_2 etc.).
Is there a way around this within MySQL (something like a dynamically expanding set of columns), other than what I've done at the moment, which is to have an OrderContents column with the following structure (not enforced by the database): StockID1xQuantity,StockID2xQuantity and so on?
I would post an image of the DB structure, but I don't have enough reputation yet. My lecturer mentioned that it could be done by normalising the database into 4th or 5th normal form.
I'd suggest having 3 tables:
Stock (StockID) + stock specific fields
Receipt (ReceiptID) + receipt specific fields.
StockReceipt (ReceiptID, StockID, Quantity) (could have a StockReceiptID, or use StockID+ReceiptID as Primary Key)
A solution including prices could look like:
Stock (StockID, Price)
PriceHistory (StockID, Price, Date) or (DateFrom, DateTo)
Receipt (ReceiptID, ReceiptDate)
StockReceipt (ReceiptID, StockID, Quantity)
That way you can calculate TotalStockReceiptPrice and TotalReceiptPrice for any receipt in the past.
I suspect this might be what you're looking for:
Stock (StockID, StockPrice)
Receipt (ReceiptID)
StockReceipt (ReceiptID, StockID, Quantity)
SELECT r.ReceiptID, SUM(s.StockPrice * sr.Quantity) AS ReceiptPrice
FROM Receipt r
INNER JOIN StockReceipt sr ON r.ReceiptID = sr.ReceiptID
INNER JOIN Stock s ON sr.StockID = s.StockID
GROUP BY r.ReceiptID
This is all very normalised (again, no idea to what normal form - 3rd?). However it only works if the StockPrice on the Stock record NEVER changes. As soon as it changes your ReceiptPrices would all reflect the new price instead of what the customer actually paid.
If the price can change, you'd need to either keep a price history table (ItemID, Price, DateTo, DateFrom) or record the StockPrice on the StockReceipt record (and then get rid of the JOIN to the Stock record in the above query and make it use sr.StockPrice instead of s.StockPrice)
To do the INSERT you posted below, you'd have to do:
INSERT INTO StockReceipts (ReceiptID, StockID, Quantity, TotalStockPrice)
SELECT 1, 99, 2, s.StockPrice
FROM Stock s
WHERE s.StockID = 99
However it's quite likely that whatever is issuing this receipt (and triggers the INSERT) already knows the price so could just insert the value.
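The option of recording the price on the StockReceipt row can be sketched as follows, using Python's sqlite3 module in place of MySQL. The StockPrice column on StockReceipt, the helper function names and the sample prices are assumptions for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; sample values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (StockID INTEGER PRIMARY KEY, StockPrice NUMERIC NOT NULL);
    CREATE TABLE Receipt (ReceiptID INTEGER PRIMARY KEY);
    CREATE TABLE StockReceipt (
        ReceiptID  INTEGER NOT NULL REFERENCES Receipt (ReceiptID),
        StockID    INTEGER NOT NULL REFERENCES Stock (StockID),
        Quantity   INTEGER NOT NULL,
        StockPrice NUMERIC NOT NULL,   -- unit price copied at the time of sale
        PRIMARY KEY (ReceiptID, StockID)
    );
    INSERT INTO Stock VALUES (99, 5);
    INSERT INTO Receipt VALUES (1);
""")

def add_line(receipt_id, stock_id, quantity):
    """Add a receipt line, copying the current unit price from Stock."""
    conn.execute(
        """
        INSERT INTO StockReceipt (ReceiptID, StockID, Quantity, StockPrice)
        SELECT ?, s.StockID, ?, s.StockPrice
        FROM Stock s
        WHERE s.StockID = ?
        """,
        (receipt_id, quantity, stock_id),
    )

def receipt_total(receipt_id):
    """Total from the prices stored on the receipt lines, not from Stock."""
    return conn.execute(
        "SELECT SUM(StockPrice * Quantity) FROM StockReceipt WHERE ReceiptID = ?",
        (receipt_id,),
    ).fetchone()[0]

add_line(1, 99, 2)
# A later price change must not alter what the customer already paid.
conn.execute("UPDATE Stock SET StockPrice = 7 WHERE StockID = 99")
total_after_price_change = receipt_total(1)
```

The receipt total is computed without joining to Stock, so it keeps reflecting the price at the time of sale.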
No, relational databases do not allow dynamic columns. The definition of a relational table is that it has a header that names the columns, and every row has the same columns.
Your technique of repeating the groups of stock columns is a violation of First Normal Form, and it also has a lot of practical problems, for instance:
How do you know how many extra columns to create?
How do you search for a given value when you don't know which column it's in?
How do you enforce uniqueness?
The simplest solution is, as @OGHaza described, to store the extra stock/quantity data in rows of another table. That way the problems above are solved.
You don't need to create extra columns, just extra rows, which is easy with INSERT.
You can search for a given value over one column to find it.
You can put constraints on the column.
If you really want to understand relational concepts, a nice book that is easy to read is: SQL and Relational Theory: How to Write Accurate SQL Code by C. J. Date.
There are also situations where you want to expand a table definition with dynamic columns that aren't repeating -- they're just new attributes. This is not relational, but it doesn't mean that we don't need some data modeling techniques to handle the scenario you describe.
For this type of problem, you might like to read my presentation Extensible Data Modeling with MySQL, for an overview of different solutions, and their pros and cons.
PS: Fourth and Fifth normal form have nothing to do with this scenario. Your lecturer obviously doesn't understand them.

How should I store types of coupons in my db?

I'm creating a coupon system with many different types of coupons. I'd like to structure it so it can be extended. I'm not sure the best way to store the different types of coupons, for example:
Fixed amount off entire order
Fixed percentage off entire order
Fixed amount off one item (could be by sku of item or could be most expensive item)
Fixed percent off one item (could be by sku of item or could be most expensive item)
Buy x get y
Free product (by sku of item, by price)
x for $y (3 for $2, 10 for $15, etc.)
I'm using MySQL. Would it be best to store them as an array? As JSON? Any other ideas, or would people with similar issues care to share how they did it?
Off of the top of my head you could have tables designed as follows:
Table: Coupon_Summary
Columns: Coupon_ID(primary key), Coupon_Name
This table will hold 'top-level' data about the Coupon.
Table: Coupon_Description
Columns: Coupon_ID(foreign key), Coupon_Description
This table will hold the description of the coupon.
Table: Coupon_Value
Columns: Coupon_ID(foreign key), Coupon_Value, Coupon_Currency
This table will hold how much discount the coupon offers. Coupon_Value can be a percentage or a hard value(percentage will be appended with a % sign), if this is zero the coupon offers full discount, or the item is free in other words. This also includes the currency to base the discount amount off of, so that you can do conversions between currencies.
Table: Coupon_Target_Order
Columns: Coupon_ID(foreign key), Order_IDs
This table holds data related to which order the coupon affects. If the Order_ID is null or zero, the coupon is valid for all orders. Otherwise you can have multiple IDs for multiple orders.
I hope this was of some help =).
I would create another table (tblCouponType, for instance) and populate it with a unique numeric ID and a descriptive string for each of the types I have, adding to it as new types become available. Then add a column to your coupon table that references the numeric ID of the coupon type. This helps with the whole "relational" part of the database :)
I assume you have some sort of products table that contains all products you can sell. You can create a coupons table that looks something like:
id      discount       discount_percentage
INT PK  DECIMAL(10,2)  DECIMAL(5,2)
1       5.00           NULL
2       NULL           10.00
Then you could create a link table coupons_products like this:
coupon_id  required_product_id
INT FK     INT FK
1          4773
1          993
So in this example, coupon ID 1 gives a $5.00 discount and requires two products to be present on the order: 4773 and 993. Coupon ID 2 gives a 10% discount and requires no products.
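One way to check the "required products" rule from this example is sketched below, on Python's sqlite3 module rather than MySQL. The coupon_applies function is a hypothetical helper, and the extra product id 12 in the usage below is invented.

```python
import sqlite3

# SQLite stands in for MySQL; rows mirror the example tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons (
        id INTEGER PRIMARY KEY,
        discount NUMERIC,
        discount_percentage NUMERIC
    );
    CREATE TABLE coupons_products (
        coupon_id INTEGER NOT NULL REFERENCES coupons (id),
        required_product_id INTEGER NOT NULL,
        PRIMARY KEY (coupon_id, required_product_id)
    );
    INSERT INTO coupons VALUES (1, 5.00, NULL), (2, NULL, 10.00);
    INSERT INTO coupons_products VALUES (1, 4773), (1, 993);
""")

def coupon_applies(coupon_id, order_product_ids):
    """True if every product the coupon requires is on the order."""
    required = {
        row[0]
        for row in conn.execute(
            "SELECT required_product_id FROM coupons_products WHERE coupon_id = ?",
            (coupon_id,),
        )
    }
    return required.issubset(order_product_ids)
```

For instance, coupon_applies(1, {4773, 993, 12}) holds while coupon_applies(1, {4773}) does not; coupon 2 has no required products, so it applies to any order.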
With SomeSQL, I'd rather not use JSON, for it is nearly impossible to query efficiently.
My approach would be, simplistically speaking, to have one table for coupon types (columns: id, name, ...) and another one for the actual coupons.
In the coupon table, I would have a column "type_id" cross-referencing the "couponTypes" table (or a foreign key, for that matter).
This way, you can always add new coupon types later on without invalidating the data you had up to that point.
Querying "coupons by type" is a matter of
"SELECT id FROM couponTypes WHERE name = 'fixed sum'"; => $id
"SELECT * FROM coupons WHERE type_id = $id"; => all fixed sum coupons.
Welcome to MySQL!
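The type-table approach can also be queried in a single step with a join. The sketch below runs on Python's sqlite3 module instead of MySQL; the code column, the type names other than 'fixed sum' and the sample coupons are assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE couponTypes (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE coupons (
        id      INTEGER PRIMARY KEY,
        code    TEXT    NOT NULL,
        type_id INTEGER NOT NULL REFERENCES couponTypes (id)
    );
    INSERT INTO couponTypes VALUES (1, 'fixed sum'), (2, 'percentage');
    INSERT INTO coupons VALUES (1, 'SAVE5', 1), (2, 'TENOFF', 2), (3, 'SAVE10', 1);
""")

def coupons_of_type(type_name):
    """Coupon codes for one type, looked up by the type's name."""
    rows = conn.execute(
        """
        SELECT c.code
        FROM coupons c
        JOIN couponTypes t ON t.id = c.type_id
        WHERE t.name = ?
        ORDER BY c.id
        """,
        (type_name,),
    ).fetchall()
    return [code for (code,) in rows]
```

Adding a new coupon type is a single INSERT into couponTypes, and existing coupons are untouched. Passing the type name as a bound parameter also avoids building the SQL string by hand.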