How to get weighted average for CTR in MySQL? - mysql

Here is what my data looks like:
I would like to see a roll up of the total impressions, total clicks and average ctr for a given query. E.g.,
Except this is just an average of the ctr column values, whereas I need a weighted average.
I think the SO answer "Calculate Weighted Average in sql for duplicate items" takes me most of the way there, but since this calculation involves two metrics (i.e., impressions and clicks), it's a bit different.
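One common way to get a weighted CTR is to divide total clicks by total impressions per query, rather than averaging the per-row ctr values. The sketch below shows this using Python's built-in sqlite3 so it can be run anywhere. The table name `search_stats` and the columns `query`, `impressions` and `clicks` are assumptions, since the original data is not shown. The `1.0 *` is only there because SQLite does integer division on integers; in MySQL the `/` operator already returns a decimal result.

```python
import sqlite3

# Hypothetical table and column names; the original data is not shown.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_stats (query TEXT, impressions INTEGER, clicks INTEGER)"
)
conn.executemany(
    "INSERT INTO search_stats VALUES (?, ?, ?)",
    [("shoes", 1000, 10), ("shoes", 10, 5)],  # made-up sample rows
)

# Weighted CTR = total clicks / total impressions,
# which is not the same as AVG() over the per-row CTR values.
weighted_ctr_sql = """
    SELECT query,
           SUM(impressions) AS total_impressions,
           SUM(clicks)      AS total_clicks,
           1.0 * SUM(clicks) / SUM(impressions) AS weighted_ctr
    FROM search_stats
    GROUP BY query
"""
rows = conn.execute(weighted_ctr_sql).fetchall()
```

With these sample rows, a plain average of the two per-row CTRs (1% and 50%) would be 25.5%, while the weighted figure is 15 clicks over 1010 impressions, roughly 1.5%.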

Related

How to exclude a value while calculating total based on other column in SSRS Calculation Expression

For more detail please go through the image
If Untrended MOM is 0,
Skip that record in the calculation. We basically treat both MOM and Total Cost as null.
Untrended YOC * Total Cost for each row. Sum all the results from that multiplication, then divide that result by the grand total project cost.
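The calculation described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not an SSRS expression. The function name and the tuple layout are hypothetical, and it assumes that the "grand total project cost" also excludes the skipped rows, which is how "treat both MOM and Total Cost as null" reads.

```python
def weighted_yoc(rows):
    """rows: iterable of (untrended_mom, untrended_yoc, total_cost) tuples."""
    # Skip rows whose Untrended MOM is 0: they contribute to neither sum.
    kept = [(yoc, cost) for mom, yoc, cost in rows if mom != 0]
    total_cost = sum(cost for _, cost in kept)
    if total_cost == 0:
        return None  # nothing left to average
    # Sum of (YOC * Total Cost), divided by the total cost of the kept rows.
    return sum(yoc * cost for yoc, cost in kept) / total_cost


# Made-up sample rows: the middle one has MOM = 0 and is ignored.
sample = [(1, 2.0, 100), (0, 9.0, 1000), (2, 4.0, 300)]
result = weighted_yoc(sample)
```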
Example: (image not included)

How to get the right result for average from averages in ssrs?

I have a report with totals.
In the end I have average values from route points.
I want to get the average value per route points.
results
for example average for 2041 is 00:00:12
average for 2042 is 00:00:04
I want to get the average from 2041 & 2042.
I received 00:00:12, which is not correct.
For the average and the average of averages I used the same expression:
=Format(
TimeSerial(0,0,
Round(
IIf(sum(Fields!N_ANSWERED.Value)=0,
0,
sum(Fields!T_ANSWERED.Value) / iif(sum(Fields!N_ANSWERED.Value)=0,1,sum(Fields!N_ANSWERED.Value))
)
)
),
"HH:mm:ss")
I expected ~ 00:00:08 as result.
An average of averages is rarely right.
For example, in group 2041 you appear to have higher call volume at lunch time (12:00 to 13:00), as the calls take longer to answer, and lower call volume first thing in the morning (8:00). Let's say the average time to answer at 13:00 was 00:00:24 because 50 calls came in, but at 8:00 there was only one call, which took 00:00:02 to answer. The average of those two hours isn't (00:00:24 + 00:00:02) / 2 = 00:00:13, because the number of calls is very different in the two samples making up the average.
The real average weights each group's average by that group's share of the total calls: (00:00:24 x (50/51)) + (00:00:02 x (1/51)) = 00:00:23.57
If you are rounding to a precision of zero decimal places, that is still 00:00:24.
This is called the weighted average as each group's average influences the outcome depending on how many results are in the original calculation of the average for that group.
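The arithmetic above can be checked with a short Python sketch. The function name is hypothetical, and the figures are the ones from the lunch-time example (averages in seconds, paired with call counts).

```python
def weighted_average(groups):
    """groups: list of (average_seconds, call_count) pairs."""
    total_calls = sum(count for _, count in groups)
    # Each group's average is weighted by how many calls it represents.
    return sum(avg * count for avg, count in groups) / total_calls


hours = [(24, 50), (2, 1)]  # 13:00: 24 s over 50 calls; 8:00: 2 s over 1 call

naive = sum(avg for avg, _ in hours) / len(hours)  # average of averages
weighted = weighted_average(hours)                 # (24*50 + 2*1) / 51
```

Here `naive` is 13.0 seconds, while `weighted` is about 23.57 seconds, matching the figures above.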
This is why your averages of 00:00:12 and 00:00:04 probably won't combine to 00:00:08: the result varies depending on how many calls are in each group. If there are exactly the same number of calls in each group, then the average of averages will be the same as the weighted average (this is the only case where you would get 00:00:08).
The closer the number of calls in each group, the closer to the right result the average of averages will be, but it is an unreliable calculation. Conversely, the more the number of results in each group varies, the more the weighted average will skew towards the average of the group that is more highly represented in the results.
Now, if there are a lot of results in the 2041 group and very few in the 2042 group then the 00:00:04 average result for 2042 will hardly influence the overall average, which may lead to the outcome where the result for 2041 overwhelms the result for 2042 and the overall average is the same as the average for 2041 within your level of precision and rounding, as per the example above.
The fact that there are several missing hours in the 2042 result set makes me think this is the case.
So your calculation looks correct: the sum of the time taken to answer divided by the number of calls will give you the average for the groups and for the overall average. It is just that the average of averages won't be the same result, because the groups aren't equally represented in the data used to calculate the overall average.
Based on your expression, your overall average looks accurate at 00:00:12.
By referencing the rendered cell rather than the dataset field, you can do this quite simply.
1. Get the name of the cell containing the detailed average you have already calculated, let's assume this is called textbox1.
Then your expression is simply
=AVG(ReportItems!textbox1.Value)

spotfire multiple over statements in one custom expression

I have a table of travel expenses for analysis.
I would like to create a calculated column with a value for the maximum count of records with a certain category for each employee on any given day.
For example, if the category being reviewed is "dinner", we would like to know what is the maximum number of dinner transactions charged on any given day.
The following custom expression was able to count how many dinner expenses per employee:
count(If([Expense Type]="Dinner",[Expense Type],null)) over ([Employee])
But when trying to get the max count over days, I can't seem to get it to work. Here is the expression used:
Max(count(If([Expense Type]="Dinner",[Expense Type],null)) over ([Employee])) over (Intersect([Employee],[Transaction Date]))
This seems to provide the same answer as the first expression. Any idea on how to get this code to identify the value on the date with the most expenses for each employee?
If I understand your question and comments correctly, you should be able to use Intersect.
count(If([Expense Type]="Dinner",[Expense Type],null)) over (Intersect([Transaction Date],[Employee]))
You may need to cast [Transaction Date] as a date if it is an actual DateTime. Otherwise you'd get one count for each unique DateTime value.
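The two-step aggregation being asked for (count per employee per day, then take the maximum of those counts per employee) can be sketched in Python to make the intended result concrete. This is not Spotfire expression syntax. The function name and record layout are hypothetical, and dates are assumed to be date-only values, for the same reason as the cast mentioned above.

```python
from collections import Counter


def max_daily_count(records, expense_type="Dinner"):
    """records: iterable of (employee, transaction_date, expense_type)."""
    # Step 1: count matching expenses per (employee, date) pair.
    per_day = Counter(
        (emp, day) for emp, day, kind in records if kind == expense_type
    )
    # Step 2: keep the largest daily count for each employee.
    best = {}
    for (emp, _day), n in per_day.items():
        best[emp] = max(best.get(emp, 0), n)
    return best


# Made-up sample records.
sample = [
    ("ann", "2024-01-01", "Dinner"),
    ("ann", "2024-01-01", "Dinner"),
    ("ann", "2024-01-02", "Dinner"),
    ("ann", "2024-01-02", "Taxi"),
    ("bob", "2024-01-01", "Dinner"),
]
result = max_daily_count(sample)
```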

Get number of entries grouped by X points in time in MySQL

I need to build the backend for a chart, which needs to have a fixed number of data points, let's assume 10 for this example. I need to get all entries in a table, have them split into 10 chunks (by their respective date column) and show how many entries there were between each date interval.
I have managed to do kind of the opposite (I can get the entries for a fixed interval, and variable number of data points), but now I need a fixed number of data points and variable date interval.
What I was thinking (which didn't work) is to get the difference between the min and max date from the table, divide it by 10 (number of data points) and have each row's date column divided by that result and also grouped by it. I either screwed up the query somewhere or my logic is faulty, because it didn't work.
Something along these lines:
SELECT (UNIX_TIMESTAMP(created_at) DIV (SELECT (MAX(UNIX_TIMESTAMP(created_at)) - MIN(UNIX_TIMESTAMP(created_at))) / 10 FROM user)) x FROM user GROUP BY x;
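One likely issue with the query above is that it divides the raw timestamp by the bucket width without first subtracting the minimum, so the buckets are aligned to the Unix epoch instead of to the data range and can come out as 11 groups rather than 10. The following Python sketch shows the bucketing logic with the offset applied. The function name is hypothetical, and it assumes integer Unix timestamps and a non-empty input.

```python
def bucket_counts(timestamps, n_buckets=10):
    """Count entries per bucket over [min, max], using n_buckets equal intervals."""
    lo, hi = min(timestamps), max(timestamps)
    span = hi - lo
    counts = [0] * n_buckets
    for ts in timestamps:
        if span == 0:
            idx = 0  # all entries share one timestamp
        else:
            # Offset from the minimum first, then scale to a bucket index.
            # The maximum timestamp would land in bucket n_buckets, so clamp it.
            idx = min((ts - lo) * n_buckets // span, n_buckets - 1)
        counts[idx] += 1
    return counts


# Made-up sample timestamps.
sample_counts = bucket_counts([0, 5, 10, 50, 99, 100])
```

The same idea carries over to SQL by subtracting the minimum timestamp inside the expression before dividing by the bucket width, and capping the result at 9.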

Multiple Subqueries within a Query

When trying to manipulate and display data from ONE table, I am having difficulty coding it correctly.
From the same table, I need to find the number of Services done per day (already done, based on the Count of ServiceId). I then need to compute the OverallCharge for each service (BasicCharge + AdditionalPartsCharge + AdditionalLabourCharge, also done) and find the min, max and avg of these OverallCharge values per day.
I need to display these charges per ServiceDate in the table
My draft is the following, but I get an error saying that ServiceId is not part of an aggregate function.
SELECT Service.ServiceDate, Service.NumServices , Min(OverallCharge) AS MinOverallCharge, Max(OverallCharge) AS MaxOverallCharge, Avg(OverallCharge) AS AverageOverallCharge
FROM (SELECT Service.ServiceId, Sum([BasicCharges]+[AdditionalLabourCharges]+[AdditionalPartCharges]) AS OverallCharge, Service.ServiceDate, Count (Service.ServiceId) AS NumServices
FROM Service
GROUP BY Service.ServiceDate, NumServices, MinOverallCharge, MaxOverallCharge, AvgerageOverallCharge);
Thanks
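One possible restructuring is to compute OverallCharge per service row in the subquery with no aggregation, and then group only by ServiceDate in the outer query. In the draft, the GROUP BY lists output aliases and the subquery has no alias or GROUP BY of its own. The sketch below runs this against SQLite from Python, so it is an approximation of the idea rather than a tested query for the original database. The column names follow the draft, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Service (ServiceId INTEGER PRIMARY KEY, ServiceDate TEXT, "
    "BasicCharges REAL, AdditionalLabourCharges REAL, AdditionalPartCharges REAL)"
)
conn.executemany(
    "INSERT INTO Service VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", 100, 20, 30),  # made-up sample rows
        (2, "2024-01-01", 50, 0, 0),
        (3, "2024-01-02", 80, 10, 10),
    ],
)

# Inner query: one row per service with its overall charge (no aggregation).
# Outer query: aggregate those rows per ServiceDate.
per_day_sql = """
    SELECT s.ServiceDate,
           COUNT(s.ServiceId)   AS NumServices,
           MIN(s.OverallCharge) AS MinOverallCharge,
           MAX(s.OverallCharge) AS MaxOverallCharge,
           AVG(s.OverallCharge) AS AverageOverallCharge
    FROM (SELECT ServiceId, ServiceDate,
                 BasicCharges + AdditionalLabourCharges + AdditionalPartCharges
                     AS OverallCharge
          FROM Service) AS s
    GROUP BY s.ServiceDate
    ORDER BY s.ServiceDate
"""
rows = conn.execute(per_day_sql).fetchall()
```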