I would like to build a very basic daily sales report. I am trying to decide how to structure the database to best accomplish this. Here is a use case for it:
On Jan 5, 2011, Provider A makes $500 total off of its products
On Jan 5, 2011, Provider A makes $200 total off of its products
On Jan 6, 2011, Provider B makes $450 total off of its products
On Jan 6, Provider B makes $75 total off of its products
The current structure I have is:
PROVIDER table
pk
provider
PRODUCT table
provider (FK)
product
start_date (sales)
end_date
The start_date and end_date are when sales on the product may occur. It is only used for reference, and does not really affect anything else.
SALES table
product (FK)
sales
How to store date ??
sales would be the daily proceeds ($) from sales of that product.
I'm not quite sure how to store the sales. Sales would only be calculated as a daily sum for each product. What would be the best way to structure the SALES table? Thank you.
Product Table:
This table should be 'transitory' in nature, meaning that its data can change over time.
In 2011 you may sell Product A for $15.99, but in 2012 you may want to sell the same product for $16.99, so the table needs to be flexible enough to accommodate this type of change.
Although there is flexibility with the data stored within this table, care must be taken if you ever delete a product. If you delete a product, it will either orphan any matching sales transactions in the sales table, or delete them (depending upon FK behavior).
Sales Table:
This table should be 'transactional' in nature. Meaning that a row represents a product linked to a sale, frozen in time.
If buyer A purchased Product A for $15.99 in 2011, you want to record this transaction as is, nothing changes with the data at any point in time, it reflects a transaction.
If buyer B purchased the same Product A but for $19.99 in 2012, you want to record this as a separate transaction. Sure it is the same product, but this row represents a new transaction.
With this setup, you can change prices in the product table as you see fit without affecting sales already recorded in the sales_transaction table.
Pseudo schema:
product:
id int(11) unsigned not null auto_increment
name varchar(255)
price decimal(14,2)
primary key (id)
unique key (name)
sales_transaction:
id int(11) unsigned not null auto_increment
product_id int(11) unsigned not null
provider_id int(11) unsigned not null
price decimal(14,2)
created_at datetime default '0000-00-00 00:00:00'
primary key (id)
foreign key (product_id) references product(id) on delete cascade
foreign key (provider_id) references provider(id) on delete cascade
provider:
id int(11) unsigned not null
name varchar(255) not null
primary key (id)
unique key (name)
Now, you can run queries to get summations of any product for any date for any provider, as you requested in your question.
Sample Query:
# On Jan 5, 2011, Provider A makes $500 total off of its products
SELECT prov.*, SUM(sales.price)
FROM
provider AS prov
INNER JOIN
sales_transaction AS sales on sales.provider_id = prov.id
WHERE
prov.name = 'Provider A'
AND
sales.created_at BETWEEN '2011-01-05 00:00:00' AND '2011-01-05 23:59:59'
GROUP BY
prov.id
The schema provided is skeletal in nature, so feel free to add columns as your business requirements dictate, but it should get you going in the right direction.
Also, a final piece of advice: I would recommend storing your datetimes in UTC. If you store them in a local timezone, you will run into any number of headaches as soon as sales need to be converted from or to another timezone.
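As a rough illustration of the transactional layout described above, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. The sample rows, the product names, and the daily_totals helper are assumptions made up for the example; the table and column names follow the pseudo schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE provider (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, price NUMERIC);
CREATE TABLE sales_transaction (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    provider_id INTEGER NOT NULL REFERENCES provider(id),
    price NUMERIC NOT NULL,
    created_at TEXT NOT NULL  -- stored in UTC as 'YYYY-MM-DD HH:MM:SS'
);
-- Made-up sample data for the sketch
INSERT INTO provider VALUES (1, 'Provider A'), (2, 'Provider B');
INSERT INTO product VALUES (1, 'Widget', 300), (2, 'Gadget', 450);
INSERT INTO sales_transaction (product_id, provider_id, price, created_at) VALUES
    (1, 1, 300, '2011-01-05 09:15:00'),
    (1, 1, 200, '2011-01-05 17:40:00'),
    (2, 2, 450, '2011-01-06 11:00:00');
""")

def daily_totals(conn):
    """Return (provider, day, total) rows, one per provider per day."""
    return conn.execute("""
        SELECT prov.name, date(s.created_at) AS day, SUM(s.price) AS total
        FROM sales_transaction AS s
        JOIN provider AS prov ON prov.id = s.provider_id
        GROUP BY prov.id, day
        ORDER BY day, prov.name
    """).fetchall()

totals = daily_totals(conn)
for name, day, total in totals:
    print(name, day, total)
```

Grouping on the date part of created_at gives the daily report without storing any precomputed sums.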
Why not a date field in the sales table? That would make each record the sales total for a certain product, on a certain date.
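A minimal sketch of that idea, using Python's sqlite3 module: one row per product per day, enforced by a composite primary key. The column name sale_date and the record_daily_total helper are assumptions for illustration, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    product_id INTEGER NOT NULL,
    sale_date  TEXT NOT NULL,       -- 'YYYY-MM-DD' (hypothetical column name)
    sales      NUMERIC NOT NULL,    -- daily proceeds for the product
    PRIMARY KEY (product_id, sale_date)
);
""")

def record_daily_total(conn, product_id, sale_date, amount):
    """Insert or overwrite the total for one product on one day."""
    conn.execute(
        "INSERT OR REPLACE INTO sales (product_id, sale_date, sales) VALUES (?, ?, ?)",
        (product_id, sale_date, amount),
    )

record_daily_total(conn, 1, "2011-01-05", 500)
record_daily_total(conn, 2, "2011-01-05", 200)
record_daily_total(conn, 1, "2011-01-05", 525)  # a correction replaces the earlier row

rows = conn.execute(
    "SELECT product_id, sale_date, sales FROM sales ORDER BY product_id"
).fetchall()
print(rows)
```

The composite key means a second total for the same product and date can only overwrite the first, never duplicate it.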
The situation. I am trying to make a report that lists the sales rep names (all of them; there are 50), the sales rep's state, and the total sales broken down by YYYY-MM for that state. If there are two or more sales reps in the state, they should each be listed with the same information for their state. How the final list is ordered does not matter, so long as all the information is included.
The problem. I also need totals by state in addition to the totals I have.
Here is my code:
SELECT
dim_sales_rep.sales_rep_name as 'Sales Rep',
dim_state.abbreviation as 'State',
date_format(dim_date.date, '%Y-%m') as 'Year-Month',
concat('$',sum(fact_sales.total_sales)) as 'Sales'
FROM
(
(
dim_sales_rep
JOIN dim_state ON dim_sales_rep.state_key = dim_state.state_key
)
JOIN fact_sales ON dim_sales_rep.sales_rep_key = fact_sales.sales_rep_key
)
JOIN dim_date ON fact_sales.date_key = dim_date.date_key
GROUP BY dim_date.year, dim_date.month, dim_state.abbreviation, dim_sales_rep.sales_rep_name
Sample Output:
Rep              State  Year-Month  Sales
Michele Harris   GA     2010-08     $679.79
T.S. Eliot       GA     2010-07     $2938.74
It should look like this:
Rep              State  Year-Month  Sales
Michele Harris   GA     2010-08     $679.79
Georgiana Woe    GA     2010-08     $482.98
State total                         $1162.77
Or like this:
Rep              State  Year-Month  YM Total  State Total
Michele Harris   GA     2010-08     $679.79   $1162.77
Georgiana Woe    GA     2010-08     $482.98   $1162.77
Here is the data structure:
table fact_sales
date_key (PK) Surrogate Key
account_key (PK) Surrogate Key
sales_rep_key (PK) Surrogate Key
total_sales Total sales dollars.
count_of_products Number of products sold
table dim_state
state_key (PK) Surrogate Key
abbreviation e.g. AL or CA
name e.g. California
table dim_account
account_key (PK) Surrogate Key
account_name
account_address
state_key Surrogate Key
effective_date Starting date that this record is active
expiration_date Ending date that this record is active
is_current Represents the active record
table dim_sales_rep
sales_rep_key (PK) Surrogate Key
sales_rep_name
state_key Surrogate Key
effective_date Starting date that this record is active
expiration_date Ending date that this record is active
is_current Represents the active record
table dim_date
date_key (PK) Surrogate Key
date e.g. 2011-01-01
month e.g. 01
year e.g. 2011
Notes:
PK: Denotes that the column is the primary key or is part of the primary key of the table.
Surrogate Keys are represented as numeric and do not represent actual values from an application. For example date_key could be 1,2,3,4, etc. and is not a real date.
Assume that dim_date contains all dates for all time.
Assume that if the column has the same name in a different table that they are equivalent.
When you group by just the state, you get one row per state in your result set. That's what GROUP BY means: aggregate your data broken out by the column values it mentions.
Use this:
GROUP BY dim_date.year, dim_date.month, dim_state.abbreviation, dim_sales_rep.sales_rep_name WITH ROLLUP
You can also format your dates nicely using MySQL's date functions.
Try this:
SELECT DATE_FORMAT(STR_TO_DATE(CONCAT_WS('-',2014,3,1),'%Y-%m-%e'),'%Y-%m')
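To make the effect of the subtotal rows concrete, here is a small sketch in Python with sqlite3. SQLite has no WITH ROLLUP, so the state subtotal is emulated with a UNION ALL of two GROUP BY queries; this is a substitute technique, not what MySQL does internally. The flat rep_sales table is a hypothetical stand-in for the joined fact and dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened view of the star schema, for illustration only
CREATE TABLE rep_sales (rep TEXT, state TEXT, ym TEXT, sales REAL);
INSERT INTO rep_sales VALUES
    ('Michele Harris', 'GA', '2010-08', 679.79),
    ('Georgiana Woe',  'GA', '2010-08', 482.98);
""")

rows = conn.execute("""
    SELECT rep, state, ym, total FROM (
        -- detail rows: one per rep, state and month
        SELECT rep, state, ym, ROUND(SUM(sales), 2) AS total
        FROM rep_sales
        GROUP BY ym, state, rep
        UNION ALL
        -- subtotal rows: rep is NULL, as in a ROLLUP super-aggregate row
        SELECT NULL AS rep, state, ym, ROUND(SUM(sales), 2) AS total
        FROM rep_sales
        GROUP BY ym, state
    )
    ORDER BY ym, state, rep IS NULL, rep
""").fetchall()

for rep, state, ym, total in rows:
    print(rep or "State total", state, ym, total)
```

As with ROLLUP, the subtotal row is recognizable by the NULL in the rep column, which the report layer can replace with a label.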
My reservation system allows us to purchase credits for clients in terms of predefined packages. I'm struggling with how I record and calculate available credits.
Let's say we're talking about a car wash service. A client can have multiple cars and can purchase the following services, 'Wash and Wax' and 'Detailing'.
Client 1 has two cars, Car A and Car B. He brings them both in and purchases:
Car A - 1 Wash and Wax
Car A - 1 Detailing
Car B - 10 Wash and Wax
Car B - 1 Detailing
This generates 4 rows in my Purchases table, one for each service purchased.
In my DB I have two related tables tracking purchases and reservations. Table 1 Purchases, Table 2 Reservations.
In Purchases I have the following fields of note:
id
client_id
car_id
service_id
credits_purchased
credits_on_schedule
credits_used
cart_id
Then in my Reservation table I have the following fields of note:
id
client_id
car_id
service_id
reservation_date
completed_datetime
car_in_datetime
car_out_datetime
purchase_id
I track the credits available by updating the Purchases table fields credits_used and credits_on_schedule as events happen.
For example, when the client makes a reservation, the system adds a new record to the Reservations table and then runs an update query that adds 1 to credits_on_schedule in the related Purchases row. When the reservation is marked complete, the system updates the Purchases row again, subtracting 1 from credits_on_schedule and adding 1 to credits_used. Simple math between credits_purchased, credits_used, and credits_on_schedule gives the credits a client has available.
I feel like this isn't a good way to track the credits. My question is what is a better implementation? Should I just track credits_purchased then use count queries on the Reservation table to calculate credits_used and credits_on_schedule? Should I be using a pivot table to track? I can't seem to wrap my head around what is the cleanest design.
It looks to me that the design is ok in general.
A reservation can only have one purchase related to it, so the purchase_id field is a foreign key in the Reservation table.
Nevertheless, my advice is to create a log of all these record updates.
As you mentioned, as events are fired the system updates the calculated fields.
What if for some reason the system fails at a certain point? You should be able to track these events.
One way to avoid this is, as you mentioned, to calculate credits_used with a count query over all completed reservations.
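A minimal sketch of the count-based approach, in Python with sqlite3. It assumes a reservation counts as used once completed_datetime is set and as scheduled while it is still NULL; that rule, the sample rows, and the credit_summary helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (
    id INTEGER PRIMARY KEY,
    client_id INTEGER, car_id INTEGER, service_id INTEGER,
    credits_purchased INTEGER NOT NULL
);
CREATE TABLE reservations (
    id INTEGER PRIMARY KEY,
    purchase_id INTEGER NOT NULL REFERENCES purchases(id),
    reservation_date TEXT NOT NULL,
    completed_datetime TEXT            -- NULL until the job is done (assumed rule)
);
INSERT INTO purchases VALUES (1, 1, 2, 1, 10);   -- e.g. Car B, 10 Wash and Wax
INSERT INTO reservations (purchase_id, reservation_date, completed_datetime) VALUES
    (1, '2014-03-01', '2014-03-01 10:30:00'),
    (1, '2014-03-08', '2014-03-08 11:00:00'),
    (1, '2014-03-15', NULL);
""")

def credit_summary(conn, purchase_id):
    """Derive (purchased, used, on_schedule, available) from reservation rows."""
    return conn.execute("""
        SELECT p.credits_purchased,
               COUNT(r.completed_datetime)               AS credits_used,
               COUNT(r.id) - COUNT(r.completed_datetime) AS credits_on_schedule,
               p.credits_purchased - COUNT(r.id)         AS credits_available
        FROM purchases AS p
        LEFT JOIN reservations AS r ON r.purchase_id = p.id
        WHERE p.id = ?
        GROUP BY p.id
    """, (purchase_id,)).fetchone()

summary = credit_summary(conn, 1)
print(summary)
```

Because the counters are derived on read, a failed update cannot leave them out of sync with the reservations; the trade-off is an aggregate query each time the balance is needed.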
This might sound like a silly question, but here it is; I am sure it has happened to many people here. You build a web app (php/mysql) with a db structure per the specifications, but then the specs change slightly and you need to change the db to reflect it. Here is a short example:
Order table
->order id
->user id
->closed
->timestamp
But because the orders are paid in a different currency than the one quoted in the db, I need to add an exchange rate field, which is only checked and known when closing the order, not upon insertion of the record. Thus I can either add the new field to the current table, leave it null/blank when inserting, and update it when necessary; or I can create a new table with the following structure:
Order exchange rates
->exchange id
->order id
->exchange rate
Although I believe that the latter is better because it is a less intrusive change and won't affect the rest of the application's functionality, you could end up with an insane number of joined queries to get all the necessary information. On the other hand, the former approach could mess up some other queries you have in the db, but it is definitely more practical and also more logical in terms of the overall db structure. Also, I don't think it is good practice to insert null and update later, but that might be just my lonely opinion...
Thus I would like to ask what do you think is the preferable approach.
I'm thinking of another alternative. Setup an exchange rate table like:
create table exchange_rate(
cur_code_from varchar2(3) not null
,cur_code_to varchar2(3) not null
,valid_from date not null
,valid_to date not null
,rate number(20,6) not null
);
alter table exchange_rate
add constraint exchange_rate_pk
primary key(cur_code_from, cur_code_to, valid_from);
The table should hold data that looks something like:
cur_code_from  cur_code_to  valid_from  valid_to    rate
=============  ===========  ==========  ==========  ========
EUR            EUR          2014-01-01  9999-12-31  1
EUR            USD          2014-01-01  9999-12-31  1,311702
EUR            SEK          2014-01-01  2014-03-30  8,808322
EUR            SEK          2014-04-01  9999-12-31  8,658084
EUR            GBP          2014-01-01  9999-12-31  0,842865
EUR            PLN          2014-01-01  9999-12-31  4,211555
Note the special case when you convert from and to the same currency.
From a normalization perspective, you don't need valid_to since it can be computed from the next valid_from, but from a practical point of view, it's easier to work with a valid-to-date than using a sub-query every time.
Then, to convert into the customer's currency, you would join with this table:
select o.order_value * x.rate as value_in_customer_currency
from orders o
join exchange_rate x on(
x.cur_code_from = 'EUR' -- Your default currency here
and x.cur_code_to = 'SEK' -- The customer's currency here
and o.order_close_date between x.valid_from and x.valid_to
)
where o.order_id = 1234;
Here I have used the rate valid as of the order_close_date. So if you have two orders, one with a close date of 2014-02-01 and one with a close date of 2014-04-05, they would pick up different rates.
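The date-range lookup can be tried out with a small sketch in Python and sqlite3. Dates are stored as ISO 'YYYY-MM-DD' strings so that BETWEEN compares them correctly as text; the two sample orders and the value_in_currency helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exchange_rate (
    cur_code_from TEXT NOT NULL, cur_code_to TEXT NOT NULL,
    valid_from TEXT NOT NULL, valid_to TEXT NOT NULL, rate REAL NOT NULL,
    PRIMARY KEY (cur_code_from, cur_code_to, valid_from)
);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_value REAL, order_close_date TEXT);
INSERT INTO exchange_rate VALUES
    ('EUR', 'SEK', '2014-01-01', '2014-03-30', 8.808322),
    ('EUR', 'SEK', '2014-04-01', '9999-12-31', 8.658084);
-- Made-up orders, valued in EUR
INSERT INTO orders VALUES (1, 100.0, '2014-02-01'), (2, 100.0, '2014-04-05');
""")

def value_in_currency(conn, order_id, cur_from, cur_to):
    """Convert an order using the rate valid on its close date, or None if no rate applies."""
    row = conn.execute("""
        SELECT o.order_value * x.rate
        FROM orders AS o
        JOIN exchange_rate AS x
          ON x.cur_code_from = ?
         AND x.cur_code_to = ?
         AND o.order_close_date BETWEEN x.valid_from AND x.valid_to
        WHERE o.order_id = ?
    """, (cur_from, cur_to, order_id)).fetchone()
    return None if row is None else row[0]

first = value_in_currency(conn, 1, "EUR", "SEK")
second = value_in_currency(conn, 2, "EUR", "SEK")
print(first, second)
```

An order whose close date falls in a gap between validity ranges gets no row back, so the ranges should be kept contiguous.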
I think you just need to add an exchange_rate_id column to the order table and create a lookup table Exchange_Rates with the columns ex_rate_id, description, deleted, and created_date.
When an order closes, you update the exchange_rate_id column in the order table with the id, and later you can join to the lookup table to pull the records.
Keep in mind that
one order has only one currency upon closing.
one currency can be applied to one or many orders.
It is a one-to-many relationship, so I don't think you need a separate table for it. I think that would be over-normalization.
Background:
We are setting up a promotions system to give away free products to registered customers. We're trying to design a database which is flexible enough to handle multiple products, and giveaways. The requirements are that products may be given away on a drip basis on a first come basis to qualified customers.
Example:
Apple wants to give away 1000 ipads in March.
They want to give away maximum of 1 per hour.
They want to give it to customers who are in California or New York.
They want to limit how many free ipads a customer can get (limit 1 per 15 days).
Data Structure:
Products - 1 entry per unique product. e.g. Apple iPad
ProductGiveAways
ProductID: AppleIpad
Quantity:1000
StartDate: 03/01/2014
End Date 03/31/2014
CustomerState: California,NewYork
PurchaseLimitDays: 15
Problem:
With the above structure we are able to do a query against our customers table and find out which are qualified for the promotion.
What I cannot figure out is the best way to:
Query customers in California or New York (is this a good use case for a join and another table?)
When a customer logs in to see what free items are available to him, how can I exclude the Apple iPad if the customer has already gotten this freebie?
In other words:
Say amazon.com wants to show me DVDs which I have not already bought. What is the proper way to query that?
Is the right approach to first get a list of previously bought products and then Query with a NOT clause?
I'm assuming you'll have a table for what has been given away. In this table I would include a column for recipient id which can map back to the customer table. You can then create queries to find eligible recipients by searching for customers who have not met disqualifying conditions.
select customerid
from customer
where customerid not in (
select recipientid
from givenaway
where ..... and ....
)
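Here is one way the exclusion query above could be filled in, sketched in Python with sqlite3. The columns productid, given_on and customer.state, and the eligible_customers helper, are assumptions, since the original leaves the conditions open. The sketch uses NOT EXISTS rather than NOT IN; the two behave the same here, but NOT IN returns no rows at all if the subquery yields a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customerid INTEGER PRIMARY KEY, state TEXT);
-- Hypothetical columns for the given-away table
CREATE TABLE givenaway (recipientid INTEGER NOT NULL, productid INTEGER NOT NULL, given_on TEXT NOT NULL);
INSERT INTO customer VALUES (1, 'California'), (2, 'New York'), (3, 'Texas');
INSERT INTO givenaway VALUES (1, 100, '2014-03-02');
""")

def eligible_customers(conn, productid, states):
    """Customers in the given states who have not yet received the product."""
    placeholders = ", ".join("?" for _ in states)
    sql = f"""
        SELECT c.customerid
        FROM customer AS c
        WHERE c.state IN ({placeholders})
          AND NOT EXISTS (
              SELECT 1 FROM givenaway AS g
              WHERE g.recipientid = c.customerid AND g.productid = ?
          )
        ORDER BY c.customerid
    """
    return [r[0] for r in conn.execute(sql, (*states, productid))]

eligible = eligible_customers(conn, 100, ["California", "New York"])
print(eligible)
```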
Because there's not a definitive data structure defined, I'm going to use the following which you can tailor to whatever data structure you have designed yourself:
Product
ProductId - INTEGER (IDENTITY and PRIMARY KEY)
ProductName - VARCHAR
States
StateId - INTEGER (IDENTITY and PRIMARY KEY)
StateName - VARCHAR
Customer
CustomerId - INTEGER (IDENTITY and PRIMARY KEY)
StateId - INTEGER (FOREIGN KEY)
Promotion
PromotionId - INTEGER (IDENTITY and PRIMARY KEY)
ProductId - INTEGER (FOREIGN KEY)
Quantity - INTEGER
StartDate - DATETIME
End Date - DATETIME
PurchaseLimitDays - INTEGER
PromotionState
PromotionId - INTEGER (FOREIGN KEY)
StateId - INTEGER (FOREIGN KEY)
So in answer to your questions:
Query customers in California or New York (is this a good use case for a join and another table?)
Personally, I would join to a centralized state table (PromotionState in my example above). I'm sure there's a better way, but you could use a condition such as the following, assuming p aliases Promotion and ps is the PromotionState row left-joined for the customer's state:
WHERE
(SELECT COUNT(*) FROM PromotionState x WHERE x.PromotionId = p.PromotionId) = 0
OR NOT(ps.PromotionId IS NULL)
Alternatively, you could use GROUP BY and HAVING, grouping by all the other columns and filtering with something like HAVING COUNT(*) = 0 OR SUM(CASE WHEN (conditions met) THEN 1 ELSE 0 END) = 0
When a customer logs in to see what free items are available to him, how can I exclude the Apple iPad if the customer has already gotten this freebie?
Say amazon.com wants to show me DVDs which I have not already bought. What is the proper way to query that?
As I've said you could use GROUP BY and HAVING to determine whether an item has been previously "won" by either using COUNT or SUM
Is the right approach to first get a list of previously bought products and then Query with a NOT clause?
There are probably better ways. Subqueries can get very heavy and sluggish, so I'd recommend trying some of the techniques above and then using a profiler to make the query more efficient.
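The GROUP BY and HAVING idea can be sketched as follows in Python with sqlite3. The Claim table, which records freebies a customer has already received, and the unclaimed_products helper are hypothetical; they are not part of the structure listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, ProductName TEXT);
-- Hypothetical table of freebies already received
CREATE TABLE Claim (CustomerId INTEGER, ProductId INTEGER);
INSERT INTO Product VALUES (1, 'Apple iPad'), (2, 'DVD box set');
INSERT INTO Claim VALUES (7, 1);
""")

def unclaimed_products(conn, customer_id):
    """Products for which the customer has zero matching claim rows."""
    return [r[0] for r in conn.execute("""
        SELECT p.ProductName
        FROM Product AS p
        LEFT JOIN Claim AS c
               ON c.ProductId = p.ProductId AND c.CustomerId = ?
        GROUP BY p.ProductId, p.ProductName
        HAVING COUNT(c.ProductId) = 0
        ORDER BY p.ProductName
    """, (customer_id,))]

print(unclaimed_products(conn, 7))
```

COUNT(c.ProductId) only counts rows where the left join found a match, so products the customer never claimed come out with a count of zero.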
Some database design
First, when you set the CustomerState to California,NewYork you are violating the First Normal Form of database design.
So let's reorganize your domain model.
State - 1 Entry per unique state
...
Customer - 1 Entry per unique customer
StateId: (California|NewYork|...)
...
Product - 1 Entry per unique product
...
ProductGiveAways - Many entries per product
ProductID
Quantity
StartDate
End Date
PurchaseLimitDays
...
ProductGiveAways_State
ProductGiveAwaysId
StateId
...
Customer_Product - 1 Entry per bought product by customer
CustomerId
ProductId
PurchaseDate
...
Technical issues
When you want to query customers in California or New York, all you have to do now is:
-- This is just an example; replace 'California' and 'New York' with their ids
SELECT * FROM Customer WHERE StateId IN ('California', 'New York')
When a customer logs in to see what free items are available to him :
-- Sketch only (MySQL syntax); the ProductGiveAwaysId key on ProductGiveAways is assumed,
-- and the End Date column is written as EndDate
SELECT DISTINCT p.*
FROM Product p
JOIN ProductGiveAways g ON g.ProductId = p.ProductId
JOIN ProductGiveAways_State gs ON gs.ProductGiveAwaysId = g.ProductGiveAwaysId
WHERE p.ProductId NOT IN (
SELECT cp.ProductId FROM Customer_Product cp JOIN ProductGiveAways g2 ON g2.ProductId = cp.ProductId
WHERE cp.CustomerId = /* the customer id */
AND ( TO_DAYS(now()) - TO_DAYS(cp.PurchaseDate) ) < g2.PurchaseLimitDays
)
AND gs.StateId = /* customer StateId */
AND now() BETWEEN g.StartDate AND g.EndDate -- eligible ProductGiveAways only
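For a version of the query above that can actually be run, here is a minimal sketch in Python with sqlite3. It assumes a ProductGiveAwaysId key on ProductGiveAways, an EndDate column without the space, a Name column on Product, and numeric state ids; all of these are illustrative choices. SQLite's julianday() stands in for MySQL's TO_DAYS(), and the current date is passed in as a parameter so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ProductGiveAways (
    ProductGiveAwaysId INTEGER PRIMARY KEY, ProductId INTEGER,
    StartDate TEXT, EndDate TEXT, PurchaseLimitDays INTEGER
);
CREATE TABLE ProductGiveAways_State (ProductGiveAwaysId INTEGER, StateId INTEGER);
CREATE TABLE Customer_Product (CustomerId INTEGER, ProductId INTEGER, PurchaseDate TEXT);
-- Made-up sample data; StateId 5 stands for California
INSERT INTO Product VALUES (1, 'Apple iPad'), (2, 'Kindle');
INSERT INTO ProductGiveAways VALUES
    (10, 1, '2014-03-01', '2014-03-31', 15),
    (11, 2, '2014-03-01', '2014-03-31', 15);
INSERT INTO ProductGiveAways_State VALUES (10, 5), (11, 5);
INSERT INTO Customer_Product VALUES (42, 1, '2014-03-10');
""")

def available_freebies(conn, customer_id, state_id, today):
    """Giveaway products open in the customer's state and not received within the limit window."""
    return [r[0] for r in conn.execute("""
        SELECT DISTINCT p.Name
        FROM Product AS p
        JOIN ProductGiveAways AS g ON g.ProductId = p.ProductId
        JOIN ProductGiveAways_State AS gs ON gs.ProductGiveAwaysId = g.ProductGiveAwaysId
        WHERE gs.StateId = :state
          AND :today BETWEEN g.StartDate AND g.EndDate
          AND NOT EXISTS (
              SELECT 1 FROM Customer_Product AS cp
              WHERE cp.CustomerId = :cust
                AND cp.ProductId = p.ProductId
                AND julianday(:today) - julianday(cp.PurchaseDate) < g.PurchaseLimitDays
          )
        ORDER BY p.Name
    """, {"state": state_id, "today": today, "cust": customer_id})]

print(available_freebies(conn, 42, 5, "2014-03-20"))
```

Customer 42 received the iPad 10 days before 2014-03-20, which is inside the 15-day limit, so only the other product is offered on that date.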
For Laravel, we use something like this. I hope you can relate to this query, or you can use an online Laravel query converter (Orator) to turn it into MySQL.
$user_id = auth()->user()->id;
// Active products that have not already been delivered to this user
Product::where('status', 'active')->whereNotIn('id', function ($query) use ($user_id) {
$query->select('product_id')->from((new OrderProduct)->getTable())->where('user_id', $user_id)->where('status', 'delivered');
});
I am developing an attendance system for a school which will cater to both employees and students.
The current db schema is
attendance table
id - primary key for this table
daydate int(11) - stores timestamp of current day
timing_in varchar(18) - Start time for institution
timing_out - Closing time for institution
status - Status for the day, can be working day - 1 or holiday - 2
Then there are different tables for staff & students, which store the actual attendance values.
For staff, the attendance is stored in attendance_staff. The database schema is
attendance_id - foreign key, references attendance table
staff_id - id of staff member, references staff master table
time_in - stores in timing of a staff member
time_out - stores out timing of a staff member
status - attendance status - can be one among the list, like present, absent, casual leave, half day, late mark, on duty, maternity leave, medical leave etc
For staff, I am storing both present and not-present entries in the table.
Now attendance for students has to be included with it.
Since the status of each day is already stored in the attendance table, can we store only the not-present values for each student in the student attendance table?
That is, the student attendance table would store entries only for students who are not present on a particular day.
The schema for attendance_student will be
attendance_id - references attendance table
student_id - references student table
status - will be leave / absent etc other than present.
Will it be efficient to calculate the present days from the attendance table using an outer join?
Thanks in advance.
You don't need an outer join to calculate attendance for students. You could simply count the records in your attendance table (one time, since it would be the same for all students) and then just select from your student attendance table to get absences.
If you'd prefer to count attendance with an outer join you could. It is likely to be more than efficient enough if you have an index on your attendance table primary key and on the foreign key from student attendance table to your attendance table.
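The count-and-subtract approach can be sketched in Python with sqlite3. For brevity the sketch stores daydate as text rather than an integer timestamp and omits the timing columns; the sample rows and the present_days helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- status: 1 = working day, 2 = holiday (as in the question)
CREATE TABLE attendance (id INTEGER PRIMARY KEY, daydate TEXT, status INTEGER);
-- only not-present entries are stored for students
CREATE TABLE attendance_student (attendance_id INTEGER, student_id INTEGER, status TEXT);
INSERT INTO attendance VALUES
    (1, '2014-03-03', 1), (2, '2014-03-04', 1), (3, '2014-03-05', 2), (4, '2014-03-06', 1);
INSERT INTO attendance_student VALUES (2, 501, 'absent');
""")

def present_days(conn, student_id):
    """Working days minus the student's not-present entries on working days."""
    working_days = conn.execute(
        "SELECT COUNT(*) FROM attendance WHERE status = 1"
    ).fetchone()[0]
    not_present = conn.execute("""
        SELECT COUNT(*)
        FROM attendance_student AS s
        JOIN attendance AS a ON a.id = s.attendance_id
        WHERE s.student_id = ? AND a.status = 1
    """, (student_id,)).fetchone()[0]
    return working_days - not_present

print(present_days(conn, 501))
```

The working-day count is the same for every student, so in a report over many students it can be computed once and reused.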