MS Access monthly average for two criteria - ms-access

Still new to using Access. I searched, but could only find averages for one criterion, like a Car average. I am trying to get monthly averages for an item with two options. Here is my example data
Here is the output I am trying to go for. It is the average for each Item for X and Z for the month of data.
Here is what I have, but I'm getting "Syntax error FROM clause". I used similar information that I received from a previous request so I may be way off base on how to get this information.
Select Item, [X or Z], Date, Value FROM MyTable
INNER JOIN (
SELECT Item, Date, AVG(Value) As Average FROM Mytable
GROUP BY Item, [X or Z]
AS t ON (t.[X or Z] = [X or Z])
AND (t.Item = Item)
AND (t.Average = Value)
I have been doing this in SQL view, but would this be something I could achieve in Design view?

You can do this in a single query, without joins:
SELECT Item, [X or Z], Format([Date], "MMMM") AS [Month], Avg([Value]) AS AvgOfValue
FROM MyTable
GROUP BY Item, [X or Z], Format([Date], "MMMM")
This query can be represented in design view.
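To see the grouping in action outside Access, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented, the column is named XorZ instead of [X or Z], and SQLite's strftime('%Y-%m', ...) stands in for Access's Format([Date], "MMMM"); it also keeps the year, so the same month in different years is not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Item TEXT, XorZ TEXT, Date TEXT, Value REAL)")
# Hypothetical sample data, not the asker's table.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        ("A", "X", "2016-01-05", 10.0),
        ("A", "X", "2016-01-20", 20.0),
        ("A", "Z", "2016-01-11", 4.0),
        ("A", "X", "2016-02-02", 30.0),
        ("B", "Z", "2016-01-09", 6.0),
        ("B", "Z", "2016-01-30", 8.0),
    ],
)

# One row per (Item, XorZ, month): no join is needed, only GROUP BY.
rows = conn.execute(
    """
    SELECT Item, XorZ, strftime('%Y-%m', Date) AS Month, AVG(Value) AS AvgOfValue
    FROM MyTable
    GROUP BY Item, XorZ, strftime('%Y-%m', Date)
    ORDER BY Item, XorZ, Month
    """
).fetchall()

for row in rows:
    print(row)
```

Each distinct combination of Item, X/Z and month gets its own average, which is the shape of output the question asks for.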

Related

SSRS Sorting by Year group total

I have a tablix that has Customer as the row group and Month and Year as Column Groups. Sales amount is in the data area. I would like to sort the customers in descending order by the Year total sales.
I tried the following (pseudocode):
SELECT
CONCAT(YEAR([Date]), MONTH([Date])) AS Period,
SUM(Amount) AS Amount,
Company
FROM [tables]
GROUP BY CONCAT(YEAR([Date]), MONTH([Date])), Company
ORDER BY SUM(Amount) DESC
I did it this way thinking that if I sorted in the query it would come through in the order I want, but obviously it's showing the customer with the highest single month sales first, not the highest year.
Thinking more about it, if I want the report to be able to span multiple years, then I have to figure out which Year to total on, but I'd be happy to restrict the report to a single Year (identified by a parameter).
When I try to sort the tablix or customer group on Sum(Fields!Amount.value, "xYear") I get the error that aggregates can include groups.
I switched from Tablix to Matrix and now sorting the Customer Group by SUM(Fields!Amount.Value) works.... kind of.
It sorts by the grand total as opposed to a given year, but I can live with that for now. Maybe I'll add a parameter that defaults to the current year and try to figure out how to use that to enforce the sort. I'm thinking I may have to get the total YTD sales by customer in a separate dataset (that doesn't display in the report).
You could do it two ways.. (not tested... it's midnight here...) assuming you have a parameter to select the sort year and the Period is a date - adjust to suit...
You could sort by an expression something like
=SUM(
IIF(
YEAR(Fields!Period.Value) = Parameters!pSortYear.Value,
Fields!Amount.Value,
0),
"myDataSetName")
Note: The dataset name must match your dataset name exactly (case sensitive) and be enclosed in double quotes.
Or.. what I normally do is do it in SQL
SELECT Period, Company, SUM(Amount) AS Amount
INTO #data
FROM myTable
GROUP BY Period, Company
SELECT d.*, s.SortOrder
FROM #data d
JOIN (
SELECT Company, ROW_NUMBER() OVER(ORDER BY Amount DESC) as SortOrder
FROM #data
WHERE Period = @pSortYear
) s on d.Company = s.Company
Then in your report you can simply sort by SortOrder
This is done off the top of my head so there could be some basic errors but hopefully close enough for you to follow.
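A rough, runnable sketch of the "do it in SQL" idea, using Python's sqlite3 module rather than SQL Server. The table sales, its columns and the rows are all made up, and instead of ROW_NUMBER() it simply carries each company's total for the chosen year (YearTotal) on every row so the report can sort on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Company TEXT, SaleDate TEXT, Amount REAL)")
# Hypothetical data: Acme has the biggest single month and grand total,
# but the smallest total for 2016.
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Acme", "2015-03-01", 900.0),
    ("Acme", "2016-01-01", 100.0),
    ("Acme", "2016-02-01", 100.0),
    ("Bolt", "2016-01-01", 250.0),
    ("Core", "2016-01-01", 120.0),
    ("Core", "2016-02-01", 120.0),
])
sort_year = "2016"  # plays the role of the report parameter

rows = conn.execute("""
    SELECT s.Company,
           strftime('%Y-%m', s.SaleDate) AS Period,
           SUM(s.Amount) AS Amount,
           t.YearTotal
    FROM sales AS s
    JOIN (SELECT Company, SUM(Amount) AS YearTotal
          FROM sales
          WHERE strftime('%Y', SaleDate) = ?
          GROUP BY Company) AS t ON t.Company = s.Company
    GROUP BY s.Company, strftime('%Y-%m', s.SaleDate), t.YearTotal
    ORDER BY t.YearTotal DESC, s.Company, Period
""", (sort_year,)).fetchall()

# Companies in the order the report would show them.
company_order = list(dict.fromkeys(r[0] for r in rows))
print(company_order)
```

Note that the inner join drops any company with no sales in the sort year; a LEFT JOIN with a default of 0 would keep them at the bottom instead.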

Grouping COUNT by date and id from foreign table

I need to get the count of reports made by id_type and by day in the same result set.
My current query displays the total reports for each type, but doesn't separate the reports by day as well.
SELECT DATE(report.date_insert) AS date_insert, type.name, count(report.id_type) as number_of_orders
from type
left join report
on (type.id_type = report.id_type)
group by type.id_type
As you can see, the only difference between them is that I've changed the value for type.id_type = XX, but this is not an effective way to achieve my requirement.
Another important requirement is that, if there are no reports from an id_type in a day where at least another id_type does have reports, there should be a result with the count of zero.
I've created a fiddle with the structure and some sample data, where id_type=1 should have 0 reports, id_type=2 should have 8 reports, and id_type=3 should have 5 reports.
http://sqlfiddle.com/#!9/6ceb48/2
Thanks!
You need to join with a subquery that gets all the different dates, and then add the date to the grouping.
SELECT alldates.date_insert, type.name, IFNULL(COUNT(report.id_type), 0) AS number_of_orders
FROM (
SELECT DISTINCT DATE(date_insert) AS date_insert
FROM report) AS alldates
CROSS JOIN type
LEFT JOIN report ON type.id_type = report.id_type AND alldates.date_insert = DATE(report.date_insert)
GROUP BY alldates.date_insert, type.id_type
ORDER BY alldates.date_insert, type.name
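Here is a small self-contained check of that cross join pattern, run through Python's sqlite3 module instead of MySQL. The type names and report rows are invented (they are not the fiddle data), and IFNULL is left out because COUNT already returns 0 when nothing matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE type (id_type INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE report (id INTEGER PRIMARY KEY, id_type INTEGER, date_insert TEXT);
    INSERT INTO type VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO report (id_type, date_insert) VALUES
        (2, '2017-05-01 09:00:00'),
        (2, '2017-05-01 10:30:00'),
        (3, '2017-05-01 11:00:00'),
        (3, '2017-05-02 08:15:00');
""")

# Every (date, type) pair is generated first, then reports are attached.
rows = conn.execute("""
    SELECT alldates.date_insert, type.name,
           COUNT(report.id_type) AS number_of_orders
    FROM (SELECT DISTINCT DATE(date_insert) AS date_insert FROM report) AS alldates
    CROSS JOIN type
    LEFT JOIN report
           ON type.id_type = report.id_type
          AND alldates.date_insert = DATE(report.date_insert)
    GROUP BY alldates.date_insert, type.id_type
    ORDER BY alldates.date_insert, type.name
""").fetchall()

for row in rows:
    print(row)
```

Types with no reports on a given day still appear, with a count of 0, because the date/type pairs come from the cross join and not from the report rows.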

Select average value X of SQL table column while not grouping by X

For the purposes of my question, I have a database in a MySQL server with info on many taxi rides (it is comprised of two tables, history_trips and trip_info).
In history_trips, each row's useful data is comprised of a unique alphanumeric ID, ride_id, the name of the rider, rider, and the time the ride ended, finishTime as a Y-m-d string.
In trip_info, each row's useful data similarly contains ride_id and rider, but also contains an integer, value (calculated in the back end from other data).
What I need to do is create a query that can find the average of all the maximum 'values' from all riders in a given time period. The riders included in this average are only considered if they completed fewer than X (let's say 3) rides within the aforementioned time period.
So far, I have a query that creates a grouped table containing the name of the rider, the finishTime of their highest 'value' ride, the value of said ride, and the number of rides, num_rides, they have taken in that time period. The AVG(b.value) column, however, gives me the same values as b.value, which is unexpected. I would like to find some way to return the average of the b.value column.
SELECT a.rider, a.finishTime, b.value, AVG(b.value), COUNT(a.rider) as num_rides
FROM history_trips as a, trip_info as b
WHERE a.finishTime > 'arbitrary_start_date_str' and a.ride_id = b.ride_id
and b.value = (SELECT MAX(value)
from trip_info where rider = b.rider and ride_id = b.ride_id)
GROUP BY a.rider
HAVING COUNT(a.rider) < 3
I am a novice in SQL but have read on some other forums that when using the AVG function on a value you must also GROUP BY that value. I was wondering if there is a way around that or if I am thinking of this problem incorrectly. Thanks in advance for any advice / solutions you might have!
The following worked for me:
SELECT AVG(ridergroups.maxvalues) avgmaxvalues FROM
(SELECT MAX(trip_info.value) maxvalues FROM trip_info
INNER JOIN history_trips
ON trip_info.ride_id = history_trips.ride_id
WHERE history_trips.finishTime > '2010-06-20'
GROUP BY trip_info.rider
HAVING COUNT(trip_info.rider) < 3) ridergroups;
The subquery groups the maximum values by rider after filtering by date and rider count. The containing query calculates the average of the maximum values.
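For anyone who wants to try the two-level aggregation, this is a minimal sketch run against SQLite through Python's sqlite3 module. The rider names, ride IDs and values are made up, and both tables are assumed to use ride_id as the join column, as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history_trips (ride_id TEXT PRIMARY KEY, rider TEXT, finishTime TEXT);
    CREATE TABLE trip_info (ride_id TEXT PRIMARY KEY, rider TEXT, value INTEGER);
    INSERT INTO history_trips VALUES
        ('r1', 'ann', '2010-07-01'), ('r2', 'ann', '2010-07-05'),
        ('r3', 'bob', '2010-07-02'), ('r4', 'bob', '2010-07-03'),
        ('r5', 'bob', '2010-07-04'),
        ('r6', 'cat', '2010-08-01'), ('r7', 'cat', '2010-01-15');
    INSERT INTO trip_info VALUES
        ('r1', 'ann', 10), ('r2', 'ann', 30),
        ('r3', 'bob', 50), ('r4', 'bob', 60), ('r5', 'bob', 70),
        ('r6', 'cat', 20), ('r7', 'cat', 99);
""")

# Inner query: one max per rider, only riders with fewer than 3 rides
# after the start date. Outer query: average of those maxima.
(avg_max,) = conn.execute("""
    SELECT AVG(ridergroups.maxvalues) AS avgmaxvalues
    FROM (SELECT MAX(trip_info.value) AS maxvalues
          FROM trip_info
          INNER JOIN history_trips
                  ON trip_info.ride_id = history_trips.ride_id
          WHERE history_trips.finishTime > '2010-06-20'
          GROUP BY trip_info.rider
          HAVING COUNT(trip_info.rider) < 3) AS ridergroups
""").fetchone()

print(avg_max)
```

In this sample bob is excluded (3 rides in the period) and cat's old ride with value 99 is filtered out by the date, so the average is taken over ann's 30 and cat's 20.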

Access 2010 Sum wrong result

This is my issue.
First step.
I sum the column HH (alias SUM_Original_values) and I get 419. This result is correct. (see the pic below)
Second step.
I want to take only the INT values of the HH column, and I get 417. This result is correct. (see the pic below)
Third step.
I want to create a column Global_Int_Sum_HH (416), but this value is different from Int_Sum_HH (417)
Why are the results different?
This is the query
SELECT
Year,
Month,
Customer,
User,
Int(Sum(HH)) AS Int_Sum_HH,
(
SELECT (int(sum(int(HH)))) AS Global_Int_Sum_HH
FROM T_Att
HAVING (((Year)="2016") AND ((month)="03") AND ((Customer)="FC"));
) AS Global_Int_Sum_HH,
Customer + Str(Global_Int_Sum_HH) AS [KEY]
FROM T_Att
GROUP BY Year, Month, Customer, User
HAVING (((Year)="2016") AND ((Month)="03") AND ((Customer)="FC"));
It looks to me like there's an inconsistency in your order of operations.
In one instance you int the sum, and in the second instance you sum the int.
SELECT
Year,
Month,
Customer,
User,
Sum(Int(HH)) AS Int_Sum_HH,
-- ^ changed order of events to match sub-query
(
-- v removed redundant int()
SELECT sum(int(HH)) AS Global_Int_Sum_HH
FROM T_Att
HAVING (((Year)="2016") AND ((month)="03") AND ((Customer)="FC"));
) AS Global_Int_Sum_HH,
Customer + Str(Global_Int_Sum_HH) AS [KEY]
FROM T_Att
GROUP BY Year, Month, Customer, User
HAVING (((Year)="2016") AND ((Month)="03") AND ((Customer)="FC"));
The above adjustment will make the "right" answer = 416 for both values. If you were to change your order of operations to both be Int(Sum(HH)), then the Global_Int_Sum_HH value would equal 419 and your Int_Sum_HH column would be 417 instead.
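The order-of-operations effect is easy to reproduce with a few numbers. This is only an illustration in Python with invented hours per user (not the asker's data), using math.floor as a stand-in for Access's Int().

```python
import math

# Hypothetical HH values, grouped by user.
hh_by_user = {
    "u1": [1.5, 2.75],
    "u2": [0.5, 3.25],
}
all_hh = [h for values in hh_by_user.values() for h in values]

# Plain total, nothing truncated.
total = sum(all_hh)

# Int(Sum(HH)) per group, then added up: fractions are lost once per group.
int_of_group_sums = sum(math.floor(sum(values)) for values in hh_by_user.values())

# Sum(Int(HH)): fractions are lost on every single row.
sum_of_row_ints = sum(math.floor(h) for h in all_hh)

print(total, int_of_group_sums, sum_of_row_ints)
```

The three numbers differ for the same reason as 419, 417 and 416 in the question: the earlier the truncation happens, the more fractional parts are thrown away.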

Calculating the Median with Mysql

I'm having trouble with calculating the median of a list of values, not the average.
I found this article
Simple way to calculate median with MySQL
It has a reference to the following query which I don't understand properly.
SELECT x.val from data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2
If I have a time column and I want to calculate the median value, what do the x and y columns refer to?
I propose a faster way.
Get the row count:
SELECT CEIL(COUNT(*)/2) FROM data;
Then take the middle value in a sorted subquery:
SELECT max(val) FROM (SELECT val FROM data ORDER BY val limit @middlevalue) x;
I tested this with a 5x10e6 dataset of random numbers and it will find the median in under 10 seconds.
This will find an arbitrary percentile by replacing the COUNT(*)/2 with COUNT(*)*n where n is the percentile (.5 for median, .75 for 75th percentile, etc).
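A minimal sketch of this two-step approach, using Python's sqlite3 module so the row count can be passed into LIMIT as a bound parameter. The table contents are made up, and the helper name percentile is hypothetical.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (val REAL)")
conn.executemany("INSERT INTO data VALUES (?)",
                 [(v,) for v in [9, 0, 11, 3, 7, 1, 10]])

def percentile(conn, n):
    """Return the value at fraction n (0.5 = median) of the sorted column."""
    # Step 1: how many rows to keep from the sorted list.
    (count,) = conn.execute("SELECT COUNT(*) FROM data").fetchone()
    k = math.ceil(count * n)
    # Step 2: the largest of the first k sorted values is the k-th value.
    (value,) = conn.execute(
        "SELECT MAX(val) FROM (SELECT val FROM data ORDER BY val LIMIT ?)",
        (k,),
    ).fetchone()
    return value

median = percentile(conn, 0.5)
p75 = percentile(conn, 0.75)
print(median, p75)
```

With an even number of rows this picks the lower of the two middle values rather than their average, so it is a slightly different definition of the median.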
val is your time column, x and y are two references to the data table (you can write data AS x, data AS y).
EDIT:
To avoid computing your sums twice, you can store the intermediate results.
CREATE TEMPORARY TABLE average_user_total_time
(SELECT SUM(time) AS time_taken
FROM scores
WHERE created_at >= '2010-10-10'
and created_at <= '2010-11-11'
GROUP BY user_id);
Then you can compute median over these values which are in a named table.
EDIT: A temporary table won't work here. You could try using a regular table with the "MEMORY" table type. Or just have your subquery that computes the values for the median twice in your query. Apart from this, I don't see another solution. This doesn't mean there isn't a better way; maybe somebody else will come up with an idea.
First try to understand what the median is: it is the middle value in the sorted list of values.
Once you understand that, the approach is two steps:
1. sort the values in either order
2. pick the middle value (if the number of values is even, pick the average of the two middle values)
Example:
Median of 0 1 3 7 9 10: 5 (because (7+3)/2=5)
Median of 0 1 3 7 9 10 11: 7 (because 7 is the middle value)
So, to sort dates you need a numerical value; you can get their time stamp (as seconds elapsed from epoch) and use the definition of median.
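The definition above can be sketched in a few lines of Python (outside the database, just to make the two steps concrete). The function name median is my own, and the date example assumes UTC timestamps.

```python
from datetime import datetime, timezone

def median(values):
    """Sort, then take the middle value (or the mean of the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

even_result = median([0, 1, 3, 7, 9, 10])      # two middle values: 3 and 7
odd_result = median([0, 1, 3, 7, 9, 10, 11])   # single middle value

# Dates: convert to seconds since the epoch, take the median, convert back.
dates = [
    datetime(2020, 1, 3, tzinfo=timezone.utc),
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2020, 1, 2, tzinfo=timezone.utc),
]
median_ts = median([d.timestamp() for d in dates])
median_date = datetime.fromtimestamp(median_ts, tz=timezone.utc)

print(even_result, odd_result, median_date)
```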
Finding median in mysql using group_concat
Query:
SELECT
IF(count%2=1,
SUBSTRING_INDEX(substring_index(data_str,",",pos),",",-1),
(SUBSTRING_INDEX(substring_index(data_str,",",pos),",",-1)
+ SUBSTRING_INDEX(substring_index(data_str,",",pos+1),",",-1))/2)
as median
FROM (SELECT group_concat(val order by val) data_str,
CEILING(count(*)/2) pos,
count(*) as count from data)temp;
Explanation:
Sorting is done using order by inside group_concat function
Position(pos) and Total number of elements (count) is identified. CEILING to identify position helps us to use substring_index function in the below steps.
Based on count, even or odd number of values is decided.
Odd values: Directly choose the element belonging to the pos using substring_index.
Even values: Find the element belonging to the pos and pos+1, then add them and divide by 2 to get the median.
Finally the median is calculated.
If you have a table R with a column named A, and you want the median of A, you can do as follows:
SELECT A FROM R R1
WHERE ( SELECT COUNT(A) FROM R R2 WHERE R2.A < R1.A ) = ( SELECT COUNT(A) FROM R R3 WHERE R3.A > R1.A )
Note: This will only work if there are no duplicated values in A. Also, null values are not allowed.
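A quick way to see what this query does is to run it on tiny tables. The sketch below uses Python's sqlite3 module with invented values; the helper median_rows is hypothetical. It also shows that, with an even number of distinct values, no row has as many values below it as above it, so nothing comes back.

```python
import sqlite3

MEDIAN_SQL = """
    SELECT A FROM R AS R1
    WHERE (SELECT COUNT(A) FROM R AS R2 WHERE R2.A < R1.A)
        = (SELECT COUNT(A) FROM R AS R3 WHERE R3.A > R1.A)
"""

def median_rows(values):
    """Load values into a fresh table R(A) and run the median query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE R (A INTEGER)")
    conn.executemany("INSERT INTO R VALUES (?)", [(v,) for v in values])
    return [row[0] for row in conn.execute(MEDIAN_SQL)]

odd_case = median_rows([5, 1, 9, 3, 7])   # 2 values below 5, 2 above
even_case = median_rows([1, 2, 3, 4])     # no value splits the rest evenly

print(odd_case, even_case)
```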
Simplest ways me and my friend have found out... ENJOY!!
SELECT count(*) INTO @c from station;
select ROUND((@c+1)/2) into @final;
SELECT round(lat_n,4) from station a where @final-1=(select count(lat_n) from station b where b.lat_n > a.lat_n);
Here is a solution that is easy to understand. Just replace Your_Column and Your_Table as per your requirement.
SET @r = 0;
SELECT AVG(Your_Column)
FROM (SELECT (@r := @r + 1) AS r, Your_Column FROM Your_Table ORDER BY Your_Column) Temp
WHERE
r = (SELECT CEIL(COUNT(*) / 2) FROM Your_Table) OR
r = (SELECT FLOOR((COUNT(*) / 2) + 1) FROM Your_Table)
Originally adapted from this thread.