How to construct price time series from price changes using MS-Access 2007 queries? - ms-access

Background
I've got a table of price changes tblPriceChanges of various items tblItems (example below - data made up):
[tblPriceChanges]:
Timestamp Item Price
9AM 01/01/2013 Orange 50p
9AM 01/01/2013 Apple 30p
2PM 01/01/2013 Pen 80p
2PM 02/01/2013 Orange 55p
2PM 02/01/2013 Pen 85p
9AM 03/01/2013 Apple 25p
9AM 05/01/2013 Pencil 10p
9AM 05/01/2013 Pen 70p
2PM 05/01/2013 Pencil 15p <- Notice there can be multiple price changes on the same day
...
[tblItems]:
Item Category Ratio
Orange Fruit 1
Apple Fruit 3
Pen Stationary 2
Pencil Stationary 5
...
Problem
The end result is that I want to be able to chart how the average price of each category changes through time.
Specifically, the average price series of Fruits, for instance, should be calculated as the weighted average of Orange and Apple prices in a Ratio of 1:3. So in the end I'm looking to generate (via some combination of queries) the following table for the underlying data of the chart:
Timestamp Fruit Stationary
01/01/2013 40.0 80.0
02/01/2013 55.0 85.0
03/01/2013 50.0 85.0
04/01/2013 50.0 85.0
05/01/2013 50.0 75.0
...
(this data is also made up and so probably not consistent with the original example)
I've managed to get an inner join on the two tables, but I'm not too sure how to proceed. My main problem is how to handle days with no price changes, such as 04/01/2013. The average prices still exist on those days, but they are not getting picked up by any query I try.
So how can I use queries to construct the data for the chart?

One remark before we start on a solution: be careful not to use reserved words for field names. Timestamp is a reserved word. Access will let you name a column that way, but you may encounter strange issues later on, especially if you ever move to another database in the future or use some other tools that fetch data from your Access database.
So here, I renamed your Timestamp column to DateTimeStamp.
I think that your requirements would be more straightforward to implement in VBA than in pure SQL queries: while you can easily build a query that gets you the average of each category per day, you are going to struggle to fill in the data for the days where you have no price changes.
Simple but incomplete SQL solution
However, maybe having these holes in your data isn't that much of an issue since the graph will simply skip those missing values (it's not like they would show up as a data value of 0.00).
In that case, the following query should give you the results:
SELECT Dateserial(Year([DateTimeStamp]),
                  Month([DateTimeStamp]),
                  Day([DateTimeStamp])) AS NormalisedDate,
       tblItems.Category,
       SUM([Price]*[Ratio])/SUM([Ratio]) AS AvgPrice
FROM tblPriceChanges
INNER JOIN tblItems
        ON tblPriceChanges.Item=tblItems.Item
GROUP BY Dateserial(Year([DateTimeStamp]),
                    Month([DateTimeStamp]),
                    Day([DateTimeStamp])),
         tblItems.Category
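With your sample data, this returns the following (a day or category without a price change, such as 04/01/2013, gets no row, and each average only covers the items that changed on that day):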
NormalisedDate Category AvgPrice
01/01/2013 Fruit 35
01/01/2013 Stationary 80
02/01/2013 Fruit 55
02/01/2013 Stationary 85
03/01/2013 Fruit 25
05/01/2013 Stationary 22.0833333333333
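The gaps can be closed by carrying each item's last known price forward day by day, which is the part that is awkward in pure Access SQL. Below is a minimal sketch of that idea, written in Python rather than VBA purely for illustration. The function name `daily_category_averages`, the end-of-day rule (the last change on a day wins) and the choice to leave not-yet-priced items out of the average are assumptions, not part of the answer above.

```python
from datetime import datetime, timedelta

# Sample rows copied from the question: (timestamp, item, price in pence).
price_changes = [
    (datetime(2013, 1, 1, 9), "Orange", 50),
    (datetime(2013, 1, 1, 9), "Apple", 30),
    (datetime(2013, 1, 1, 14), "Pen", 80),
    (datetime(2013, 1, 2, 14), "Orange", 55),
    (datetime(2013, 1, 2, 14), "Pen", 85),
    (datetime(2013, 1, 3, 9), "Apple", 25),
    (datetime(2013, 1, 5, 9), "Pencil", 10),
    (datetime(2013, 1, 5, 9), "Pen", 70),
    (datetime(2013, 1, 5, 14), "Pencil", 15),
]

# item -> (category, ratio), copied from tblItems in the question.
items = {
    "Orange": ("Fruit", 1),
    "Apple": ("Fruit", 3),
    "Pen": ("Stationary", 2),
    "Pencil": ("Stationary", 5),
}


def daily_category_averages(price_changes, items):
    """Return {date: {category: weighted average of end-of-day prices}}."""
    changes = sorted(price_changes, key=lambda row: row[0])
    if not changes:
        return {}
    day = changes[0][0].date()
    last_day = changes[-1][0].date()
    latest = {}   # item -> most recent price seen so far
    series = {}
    i = 0
    while day <= last_day:
        # Apply every change up to and including this day (assumption:
        # when an item changes twice in a day, the last change wins).
        while i < len(changes) and changes[i][0].date() <= day:
            _, item, price = changes[i]
            latest[item] = price
            i += 1
        # Weighted average per category over the items priced so far.
        totals = {}
        for item, price in latest.items():
            category, ratio = items[item]
            weighted, weight = totals.get(category, (0, 0))
            totals[category] = (weighted + price * ratio, weight + ratio)
        series[day] = {c: w / r for c, (w, r) in totals.items()}
        day += timedelta(days=1)
    return series


series = daily_category_averages(price_changes, items)
for day, averages in series.items():
    print(day.strftime("%d/%m/%Y"), averages)
```

On the sample data this produces one row for every day from 01/01/2013 to 05/01/2013, and the row for 04/01/2013 simply repeats the values of 03/01/2013.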

Related

Is it possible to run 2 Where Clauses for Different Columns? Not sure how to set up this query

I am trying to find all of the times that products were bought outside of working hours (9-5). However, yellow shirts can be purchased from 7 AM to 5 PM, so they need a different time window. I'm not sure how to do 2 Where Clauses for that.
There are 100 products. Here is an example:
Product Time_Purchased
Toothbrush 8:00 AM
Yellow Shirt 7:00 AM
Orange Sweatshirt 9:00 AM
Tablet Decoration 10:00 AM
Yellow Shirt 6:00 AM
With this example, the output for Yellow Shirts should not include the 7 AM time, but it should include the 6 AM one.
This is the code I tried running:
PROC SQL
Select * FROM Example
Where (Time_Purchased NOT BETWEEN 32400 AND 61200)
AND Product = 'Yellow Shirt' AND Time_Purchased <25200
quit;
When I run this code, I only get Yellow Shirts purchased before 7:00 and it ignores all of the other products. I'm not sure how to edit the code so that it shows all of the products other than yellow shirts that were purchased outside of 9-5, plus the yellow shirts that were purchased outside of 7-5, using Where statements.
You can use the OR operator to separate the 2 conditions in your query. OR returns a row if either of the conditions is true. In your case this would be either 1) when the product is a yellow shirt and is sold outside 7-5 or 2) when the product is not a yellow shirt and is sold outside 9-5. The query would be similar to the one below:
PROC SQL;
SELECT * FROM Example
WHERE (Product = 'Yellow Shirt' AND Time_Purchased NOT BETWEEN 25200 AND 61200)
OR (Product NE 'Yellow Shirt' AND Time_Purchased NOT BETWEEN 32400 AND 61200);
QUIT;
This should work for you.
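To sanity-check the condition outside SAS, the same rule can be run against the sample rows from the question. This is only an illustrative sketch in Python; the helper name `bought_outside_hours` is hypothetical, and times are seconds since midnight as in the question's code (25200 = 7 AM, 32400 = 9 AM, 61200 = 5 PM).

```python
SEVEN_AM = 7 * 3600   # 25200
NINE_AM = 9 * 3600    # 32400
FIVE_PM = 17 * 3600   # 61200


def bought_outside_hours(product, seconds):
    """True if the purchase falls outside the selling window for that product."""
    # Yellow shirts may be sold from 7 AM; everything else from 9 AM.
    opening = SEVEN_AM if product == "Yellow Shirt" else NINE_AM
    # Both ends are inclusive, mirroring SQL's BETWEEN.
    return not (opening <= seconds <= FIVE_PM)


# Sample rows from the question: (product, time purchased in seconds).
purchases = [
    ("Toothbrush", 8 * 3600),
    ("Yellow Shirt", 7 * 3600),
    ("Orange Sweatshirt", 9 * 3600),
    ("Tablet Decoration", 10 * 3600),
    ("Yellow Shirt", 6 * 3600),
]

outside = [p for p in purchases if bought_outside_hours(*p)]
print(outside)
```

Only the 8 AM toothbrush and the 6 AM yellow shirt remain, which matches what the question asks for.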

How to set up crosstab queries to count days for negative stock counts?

Hello Stack Overflow (and anyone googling similar questions in the future)!
I have a dataset that regularly reports which products are absent on a warehouse stockcheck, which I am trying to use to analyse when stock is or isn’t available. I’m essentially trying to identify “Has a part been reported as missing? -> If so, count the number of days it is missing until another part in the same category is reported as missing, but the original part was not reported as missing on that date (as we can assume it’s back in stock)”.
I’ve managed to make this work in Excel, but my spreadsheet began to die from the calculation of 5 locations' worth of categories and parts, let alone the 600+ I’m working on! As a result, I’m trying to set up a similar function in Access to analyse which parts were out of stock, and for how long.
My dataset looks something like:
Location number | Location | Category | Date reported | Part Number | Part Description | Order number
1 | London | Car | 03/06/2021 | 2021 | Wheel A | 1
2 | London | Bus | 03/06/2021 | 1491 | Seat C | 2
3 | Manchester | Car | 01/06/2021 | 2021 | Wheel A | 3
My assumptions are that:-
• My data is fed by individual workers who each cover a location, and check all stock for a random selection of categories each visit (with the idea that they cover all of their location’s categories within a certain number of visits) and record which parts are missing. There is no particular visit plan – it can be a random number of days between each visit. This data gets fed into a central table, which I have access to.
• As my workers may not check all categories in a location on each visit, I must assume that a previously reported missing part is OOS until they check products in the same category, but do not report that part again.
I made this work in Excel by setting up another column that concatenated my location, part number, and date reported, and then set up three tables (all of which essentially have locations, categories, and parts down the rows, and dates across the columns):-
• Table1, to look if my concatenated code was reported for each day (and if so, output 1 – essentially working in days) – essentially, was each part reported as missing for each category and location?
• Table2, to look if any parts were reported for each category, for each location – essentially, how many parts were reported for each category for each location, and a value greater than 0 means we can assume that that category at that location has been checked by my workers for that date.
• Table3, that for each location+category+day asked as a formula – IF(category was checked as per table2 = yes , pull the value of 1 for that part/location/category in table 1 , re-use yesterday’s value for this part/location/day in this table). For the 1st day in my date range, I used the values for table1 for that day as a “starting up” point.
When I look at table 3, I can visually see the runs of days that products were out of stock, and can from there crunch numbers related to that, which is what I want!
My initial Access plan was to set up three crosstab queries, to mirror my three Excel tables. I can make Table1 and Table2 very easily, but for the life of me can’t make table3 work (I currently have a calculated expression that mirrors the formula I had in table 3, but something has gone amiss…).
I’m looking for a steer/advice on setting up the expression in my crosstab query, or other ideas/approaches I could use to calculate how long each part is missing for. Any help would be greatly appreciated, as I’ve lost my mind going in circles today!
Edit:-
Simplified dataset I'm working with:-
Location | Category | Date Reported | Part number | Part Description | Order number | Concatenate code | Concatenate Code 2
1 London | Car | 03/06/2021 | 2021 | Wheel | 1 | 1443502021 |
1 London | Bus | 03/06/2021 | 1491 | Seat | 2 | 1443501491 |
2 Manchester | Car | 05/06/2021 | 2021 | Wheel | 3 | 2443522021 |
1 London | Car | 05/06/2021 | 2021 | Wheel | 4 | 1443522021 |
1 London | Car | 07/06/2021 | 2021 | Wheel | 5 | 1443542021 |
1 London | Bus | 05/06/2021 | 1860 | Seatbelt | 6 | 1443521860 |
1 London | Bus | 05/06/2021 | 1860 | Seatbelt | 7 | 1443521860 |
2 manchester | Bus | 01/06/2021 | 1860 | Seatbelt | 8 | 2443481860 |
2 Manchester | Bus | 06/06/2021 | 1860 | Seatbelt | 9 | 2443531860 |
2 manchester | Bus | 04/06/2021 | 1491 | Seat | 10 | 2443511491 |
2 Manchester | Bus | 06/06/2021 | 1491 | Seat | 11 | 2443531491 |
I'm trying to output something like (which I've made work in Excel):-
Location
Category
Part code
01/06/2021
02/06/2021
03/06/2021
04/06/2021
05/06/2021
06/06/2021
07/06/2021
1
London
Car
2021
1
1
1
1
1
London
Car
2626
1
London
Bus
1491
1
1
1
London
Bus
1860
1
1
2
Manchester
Car
2021
1
1
2
Manchester
Car
2626
2
Manchester
Bus
1491
1
1
1
2
Manchester
Bus
1860
1
1
1
1
3
Liverpool
Car
2021
3
Liverpool
Car
2626
3
Liverpool
Bus
1491
Or to return how many consecutive days a part has been out of stock as of each day, like this version:-
Location
Category
Part code
01/06/2021
02/06/2021
03/06/2021
04/06/2021
05/06/2021
06/06/2021
07/06/2021
1
London
Car
2021
1
2
3
4
1
London
Car
2626
1
London
Bus
1491
1
2
1
London
Bus
1860
1
2
2
Manchester
Car
2021
1
2
2
Manchester
Car
2626
2
Manchester
Bus
1491
1
2
3
2
Manchester
Bus
1860
1
2
3
1
3
Liverpool
Car
2021
3
Liverpool
Car
2626
3
Liverpool
Bus
1491
My Access sql (that I then turned into a crosstab) to identify ordered parts per day:
SELECT DISTINCT T_stores.[Store Nos], T_stores.[Store Name], t_Stands.Brand, t_Productlookup.TPND, t_Productlookup.TITLE, t_gapdata.Quantity, t_gapdata.[Requested Date]
FROM ((T_stores
INNER JOIN t_Stands ON T_stores.[Store Nos] = t_Stands.[Store Nos])
INNER JOIN t_gapdata ON (t_Stands.[Brand] = t_gapdata.[Brand]) AND (t_Stands.[Store Nos] = t_gapdata.[Store No]))
INNER JOIN t_Productlookup ON t_gapdata.[Part Number] = t_Productlookup.[EAN];
And likewise, to identify if parts were ordered for a location's category:-
SELECT DISTINCT T_stores.[Store Nos], T_stores.[Store Name], t_Stands.Brand, t_Productlookup.TPND, t_Productlookup.TITLE, t_gapdata.Quantity, t_gapdata.[Requested Date]
FROM ((T_stores
INNER JOIN t_Stands ON T_stores.[Store Nos] = t_Stands.[Store Nos])
INNER JOIN t_gapdata ON (t_Stands.[Brand] = t_gapdata.[Brand]) AND (t_Stands.[Store Nos] = t_gapdata.[Store No]))
INNER JOIN t_Productlookup ON t_gapdata.[Part Number] = t_Productlookup.[EAN];
These first two work fine, but I'm struggling to put them together with some sort of Iif calculated field for a third query:-
SELECT First(q_gaps_per_product.[Store Nos]) AS [FirstOfStore Nos], First(q_gaps_per_product.[Store Name]) AS [FirstOfStore Name], First(q_gaps_per_product.Brand) AS FirstOfBrand, First(q_gaps_per_brand_store.[Order Id]) AS [FirstOfOrder Id], First(q_gaps_per_product.TPND) AS FirstOfTPND, First(q_gaps_per_product.TITLE) AS FirstOfTITLE, First(q_gaps_per_product.[Requested Date]) AS [FirstOfRequested Date], First(IIf([q_gaps_per_brand_store]![Requested Date]>=[q_gaps_per_product]![Requested Date],[Quantity],"PREVIOUS DAY")) AS Expr1, [q_gaps_per_product]![Store Nos] & [q_gaps_per_product]![Quantity] & [q_gaps_per_product]![TPND] AS Expr2
FROM q_gaps_per_product LEFT JOIN q_gaps_per_brand_store ON q_gaps_per_product.[Brand] = q_gaps_per_brand_store.[Brand]
GROUP BY [q_gaps_per_product]![Store Nos] & [q_gaps_per_product]![Quantity] & [q_gaps_per_product]![TPND];
Expr1 is supposed to be how many days a product is out of stock, with the idea that "PREVIOUS DAY" would return the same criteria for the previous day, to show either running gaps or that a product was in fact available as a 0, but I haven't got that far yet.
Expr2 is basically something I tried to make up to group the results by, as I had an insane number of results due to my janky table relationships.
I sort of think this query is DOA, and I need to go back to the drawing board to reproduce something like my Excel tables, i.e. how many consecutive days each product has been out of stock.
Sorry for the sheer storm of words!

SQL Query - Pull data from ambiguous column names for growth/decline %

Re-post due to bad data set and bad formatting. I am trying to divide data from two separate tables that have ambiguous column names.
I am newer to SQL, I know it should be simple, however I just can not figure it out. So far I have tried to rename columns, alias columns, union the table, and select multiple data sets.
I keep hitting roadblocks.
I am trying to measure growth or decline week over week. Ideally I want to take the total sales for Plates and do the following equation: (75/100-1) which would equal a -25% decline from last week.
What would be the best way to go about this?
The two example tables are below
LastWeekData
Product Day Month TotalSales
Plates 7 3 $100
Spoons 7 3 $150
Forks 7 3 $120
CurrentData
Product Day Month TotalSales
Plates 14 3 $75
Spoons 14 3 $100
Forks 14 3 $115
You can use table aliases to tell apart the identically named columns of the two tables. See demo here: http://sqlfiddle.com/#!9/0b0d81/29
select cur.Product,
       cur.Day,
       cur.Month,
       cur.TotalSales as currweek_TotalSales,
       pre.TotalSales as lastweek_TotalSales,
       round((cur.TotalSales/pre.TotalSales-1)*100) as percent_change
from CurrentData as cur
inner join LastWeekData as pre
        on pre.product=cur.product
where datediff(str_to_date(concat_ws('-','0001',cur.month,cur.day),'%Y-%m-%d'),
               str_to_date(concat_ws('-','0001',pre.month,pre.day),'%Y-%m-%d'))
      = 7
Result:
Product Day Month currweek_TotalSales lastweek_TotalSales percent_change
Plates 14 3 75 100 -25
Spoons 14 3 100 150 -33
Forks 14 3 115 120 -4
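The arithmetic behind the percent_change column can be checked by hand with the figures from the question. A small Python sketch, assuming TotalSales is stored as a plain number (without the $ sign) and last week's total is never zero; the helper name `percent_change` is hypothetical:

```python
# TotalSales per product, copied from the two example tables.
last_week = {"Plates": 100, "Spoons": 150, "Forks": 120}
current_week = {"Plates": 75, "Spoons": 100, "Forks": 115}


def percent_change(now, before):
    """Week-over-week change in percent, rounded to a whole number."""
    return round((now / before - 1) * 100)


# Match the rows by product, like the inner join on Product.
changes = {
    product: percent_change(current_week[product], last_week[product])
    for product in current_week
    if product in last_week
}
print(changes)
```

For Plates this is (75/100 - 1) * 100 = -25, i.e. a 25% decline from last week.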

Database table structure for storing statistics data

I am trying to create a table in my MySQL database for storing click data for my posts on a daily basis. What I came up with is something like this:
ID | post_id | click_type | created_date
1 1 page_click 2015-12-11 18:13:13
2 2 page_click 2015-12-13 11:16:34
3 3 page_click 2015-12-13 13:24:01
4 1 page_click 2015-12-15 15:31:10
With this layout I can get how many clicks post number 1 got in December 2015, and I can even get how many clicks a given post got on 15 December between 1 PM and 11 PM. However, let's say I am getting 2000 clicks per day: that means 2000 new rows per day, which is 60,000 per month and 720,000 per year.
Another approach that comes to my mind is the one below, which stores one row per post per day; if there is more than one click on that day, it just increases the count:
ID | post_id | click_type | created_date | count
1 1 page_click 2015-12-11 13
2 2 page_click 2015-12-11 26
3 3 page_click 2015-12-11 152
4 1 page_click 2015-12-12 14
5 2 page_click 2015-12-12 123
6 3 page_click 2015-12-12 163
In this approach, if every page is clicked at least once every day (which means creating the row), it will generate 1000 rows each day (let's say I have 1000 posts), 30,000 per month and 360,000 per year.
I am looking for advice on how to store these statistics so that I can get daily click statistics. I have some concerns about the performance (of course it's nothing for big data guys :D but sorry for my lack of experience). Do you think it will be OK if there are over 1 million rows in that table after 2-3 years? And which one do you think is going to be more effective for me?
720,000 records per year is not necessarily a lot of data. One option may be not to worry about it. Something to consider may be how long the click data matters. If after a year you don't really care anymore then you can have an historical data cleanup protocol that removes data that is older than you care about.
If you are worried about storing large amounts of data and you don't want to erase history, then you can consider pre-calculating your summarized statistics and storing them instead of your transaction detail.
The issue with this is that you have to know in advance what the smallest resolution of time will be that you will continue to care about. Also, if your motivation is saving space then you have to be careful that your summary data doesn't end up taking more space than the original transactions. This can easily happen if you store summarized data at multiple resolutions, as you might in a data warehouse arrangement.
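Storing a daily summary instead of one row per click comes down to "increment the counter if today's row exists, otherwise create it". The sketch below illustrates that pattern with Python's built-in sqlite3 module as a stand-in for MySQL, so the SQL is deliberately generic; the table name `daily_clicks`, the column name `click_count` and the helper `record_click` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_clicks (
        post_id      INTEGER NOT NULL,
        click_type   TEXT    NOT NULL,
        created_date TEXT    NOT NULL,  -- one row per post, type and day
        click_count  INTEGER NOT NULL,
        PRIMARY KEY (post_id, click_type, created_date)
    )
    """
)


def record_click(conn, post_id, click_type, day):
    """Add one click to the day's counter, creating the row on the first click."""
    cur = conn.execute(
        "UPDATE daily_clicks SET click_count = click_count + 1 "
        "WHERE post_id = ? AND click_type = ? AND created_date = ?",
        (post_id, click_type, day),
    )
    if cur.rowcount == 0:
        # No row yet for this post and day: start the counter at 1.
        conn.execute(
            "INSERT INTO daily_clicks (post_id, click_type, created_date, click_count) "
            "VALUES (?, ?, ?, 1)",
            (post_id, click_type, day),
        )


for _ in range(3):
    record_click(conn, 1, "page_click", "2015-12-11")
record_click(conn, 2, "page_click", "2015-12-11")

rows = conn.execute(
    "SELECT post_id, created_date, click_count FROM daily_clicks ORDER BY post_id"
).fetchall()
print(rows)
```

The composite primary key is what keeps the table at one row per post per day. With concurrent writers the update-then-insert pair would need to run in a transaction, or be replaced by the database's own upsert statement.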
This seems like a good application for rrdtool (http://oss.oetiker.ch/rrdtool/). There you can specify several resolutions for different time intervals, e.g.:
average 5 min for 1 day
average 30 min for 1 week
average 2 hours for 1 month
average 1 day for 1 Year
etc. It is also often used for graphs. rrdtool normally works with rrd files, but it can also be based on MySQL with rrdgraph_libdbi.

Access Calculated Field

I am having difficulty trying to make a calculated field that I need. So here is what I am trying to do:
I have a query that combines information from three tables. The fields that matter most for the application are as follows:
Family Income Age Patient
15,000 18 Yes
28,000 25 No
30,000 1 Yes
From here I want to make a calculated field that gives the correct program the patient was enrolled in, based on these fields, i.e.:
Program Minimum Income Maximum Income Minimum Age Maximum Age Patient
Children's 0 20,000 1 19 Yes
Adult 0 12,000 19 65 No
Non Patient 0 20,000 1 19 No
Adult 2 12,000 50,000 19 65 No
Etc.
to create:
Family Income Age Patient Program
15,000 18 Yes Children's
28,000 25 No Adult 2
30,000 1 Yes Children's 2
I know I can use IIf to hard-code it into the field, but then it will be really difficult for other people to update the information as the guidelines change. Is it possible to have the information stored in a table and use the information from that table, or will I need to use IIf?
Any ideas? Is it possible to dynamically create the IIf in SQL using VBA while pulling the information from the table?
EDIT:::
Thank you for your response and for formatting my tables, I still have no idea how you changed it, but it looks amazing!
I tried to add the SQL you added down below, but I was not able to make it work. I'm not sure if I made a mistake, so I have included the SQL of my query. The query currently returns 0 rows, so I think I messed something up. (The real query is embarrassing... I'm sorry for that.) Unfortunately, I have done everything in my power to avoid SQL, and now I am paying the price.
SELECT qry_CombinedIndividual.qry_PrimaryApplicant.[Application Date],
qry_CombinedIndividual.qry_PrimaryApplicant.[Eligibility Rep],
qry_CombinedIndividual.qry_PrimaryApplicant.Name,
qry_CombinedIndividual.qry_PrimaryApplicant.Clinic,
qry_CombinedIndividual.qry_PrimaryApplicant.Outreach,
qry_CombinedIndividual.qry_PrimaryApplicant.[Content Type ID],
qry_CombinedIndividual.qry_PrimaryApplicant.[Application Status],
qry_CombinedIndividual.qry_PrimaryApplicant.Renewal,
qry_CombinedIndividual.qry_Enrolled.EthnicityEnr,
qry_CombinedIndividual.qry_Enrolled.GenderEnr, qry_CombinedIndividual.AgeAtApp,
qry_CombinedIndividual.[Percent FPL], tbl_ChildrensMedical.MinPercentFPL,
tbl_ChildrensMedical.MaxPercentFPL, tbl_ChildrensMedical.MinAge,
tbl_ChildrensMedical.MaxAge, tbl_ChildrensMedical.Program
FROM qry_CombinedIndividual
INNER JOIN tbl_ChildrensMedical ON qry_CombinedIndividual.qry_Enrolled.Patient = tbl_ChildrensMedical.Patient
WHERE (((qry_CombinedIndividual.AgeAtApp)>=[tbl_ChildrensMedical].[MinAge]
And (qry_CombinedIndividual.AgeAtApp)<[tbl_ChildrensMedical].[MinAge])
AND ((qry_CombinedIndividual.[Percent FPL])>=[tbl_ChildrensMedical].[MinPercentFPL]
And (qry_CombinedIndividual.[Percent FPL])<[tbl_ChildrensMedical].[MaxPercentFPL]));
Also there are many different programs. Here is the real Children's Table (eventually I would like to add adults if possible)
*Note the actual table uses FPL (which takes family size into account, but is used the same way as income). I am again at a total loss as to how you formatted the table.
Program Patient MinPercentFPL MaxPercentFPL MinAge MaxAge
SCHIP (No Premium) No 0 210 1 19
SCHIP (Tier 1) No 210 260 1 19
SCHIP (Tier 2) No 260 312 1 19
Newborn No 0 300 0 1
Newborn (Patient) Yes 0 300 0 1
Children's Medical Yes 0 200 1 19
CHIP (20 Premium) Yes 200 250 1 19
CHIP (30 Premium) Yes 250 300 1 19
Do I have the correct implementation for the table I have? Or should I be changing something. I can also send more information/sample data if that would help.
Thank you again!
I just created some tables with your sample data and used the following SQL. Your 3rd 'patient' doesn't match any of the ranges (Age 1, Income $30K): the income is above the 20,000 maximum of the only program that covers a one-year-old patient.
SELECT tblPatient.PatName, tblPatient.FamInc, tblPatient.Age, tblPatient.Patient,
       tblPatientRange.Program, tblPatientRange.MinInc, tblPatientRange.MaxInc, tblPatientRange.MinAge,
       tblPatientRange.MaxAge, tblPatientRange.Patient
FROM tblPatient INNER JOIN tblPatientRange ON tblPatient.Patient = tblPatientRange.Patient
WHERE (((tblPatient.FamInc)>=[tblPatientRange]![MinInc] And (tblPatient.FamInc)<=[tblPatientRange]![MaxInc])
  AND ((tblPatient.Age)>=[tblPatientRange]![MinAge] And (tblPatient.Age)<=[tblPatientRange]![MaxAge]));
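The point of the range table is that the rules live in data rather than in nested IIf calls. The same lookup can be sketched outside Access to check which program each sample row should get. This is only an illustration in Python; the function name `matching_programs` is hypothetical, the rows are the sample data from the question, and both ends of each range are treated as inclusive, as in the query above.

```python
# (program, min income, max income, min age, max age, patient) from the question.
programs = [
    ("Children's", 0, 20000, 1, 19, True),
    ("Adult", 0, 12000, 19, 65, False),
    ("Non Patient", 0, 20000, 1, 19, False),
    ("Adult 2", 12000, 50000, 19, 65, False),
]


def matching_programs(income, age, is_patient, programs):
    """Return the names of all programs whose ranges contain this applicant."""
    return [
        name
        for name, min_inc, max_inc, min_age, max_age, patient in programs
        if patient == is_patient
        and min_inc <= income <= max_inc
        and min_age <= age <= max_age
    ]


# (family income, age, patient) from the question's first table.
applicants = [(15000, 18, True), (28000, 25, False), (30000, 1, True)]
results = [matching_programs(*applicant, programs) for applicant in applicants]
print(results)
```

Because both bounds are inclusive, an applicant sitting exactly on a shared boundary (for example an income of 12,000) matches two programs, which is why the question's real table is better served by a half-open test (>= minimum and < maximum). Note also that the query in the question compares AgeAtApp against MinAge on both sides of the age test, a condition no row can satisfy, so it returns nothing regardless of the data.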