Averaging the result of a subquery in MySQL

I'm building a query that searches through a Medicare database listing how much doctors charge for various procedures.
Ideally, this query would:
Return every record, meaning every procedure for every doctor. (I'll add filtering WHERE clauses later)
Return the average amount doctors charge for each procedure
Return the percentage difference between the average cost and what each individual doctor charges
Return the average of all those percentage differences for each doctor, generating a meta cost-differential score.
With the query below, I've been able to achieve everything but the last goal.
SELECT medicare.*,
peerAverage.average AS charge_average,
( medicare.average_submitted_chrg_amt - peerAverage.average ) /
peerAverage.average * 100 AS difference_from_average,
Avg( ( medicare.average_submitted_chrg_amt - peerAverage.average ) /
peerAverage.average * 100 ) as total_difference_from_average
FROM medicare
JOIN (SELECT Avg(average_submitted_chrg_amt) AS average,
procedure_code
FROM medicare
GROUP BY procedure_code) AS peerAverage
ON medicare.procedure_code = peerAverage.procedure_code
ORDER BY procedure_code ASC,
difference_from_average DESC
When I add the final SELECT condition (Avg( ( medicare.average_submitted_chrg_amt - peerAverage.average ) / peerAverage.average * 100 ) as total_difference_from_average), the query only returns one record.
Delete that condition and the query returns the correct number of records. What am I doing wrong?

An aggregate function raises the level of aggregation. Until you specify grouping conditions for AVG(), it always returns a single row aggregated over all values of the expression, which is why adding it to the SELECT list collapses the result to one record.
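One way to keep every detail row and still get a per-doctor score is to compute that average in its own grouped derived table and join it back. The sketch below runs the idea from Python against an in-memory SQLite database as a stand-in for MySQL. The doctor_id column and the sample rows are assumptions, since the question does not show how a doctor is identified in the medicare table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE medicare (
        doctor_id INTEGER,               -- hypothetical doctor key (assumption)
        procedure_code TEXT,
        average_submitted_chrg_amt REAL
    )
""")
conn.executemany(
    "INSERT INTO medicare VALUES (?, ?, ?)",
    [
        (1, "A", 100.0), (2, "A", 300.0),
        (1, "B", 50.0), (2, "B", 150.0),
    ],
)

QUERY = """
SELECT m.doctor_id,
       m.procedure_code,
       peer.average AS charge_average,
       (m.average_submitted_chrg_amt - peer.average) / peer.average * 100
           AS difference_from_average,
       doc.total_difference_from_average
FROM medicare AS m
JOIN (SELECT procedure_code, AVG(average_submitted_chrg_amt) AS average
      FROM medicare
      GROUP BY procedure_code) AS peer
  ON peer.procedure_code = m.procedure_code
-- Second derived table: one row per doctor, averaging the percentage differences
JOIN (SELECT m2.doctor_id,
             AVG((m2.average_submitted_chrg_amt - p2.average) / p2.average * 100)
                 AS total_difference_from_average
      FROM medicare AS m2
      JOIN (SELECT procedure_code, AVG(average_submitted_chrg_amt) AS average
            FROM medicare
            GROUP BY procedure_code) AS p2
        ON p2.procedure_code = m2.procedure_code
      GROUP BY m2.doctor_id) AS doc
  ON doc.doctor_id = m.doctor_id
ORDER BY m.procedure_code, difference_from_average DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Every detail row is kept, and each row carries its doctor's overall score in the last column. In MySQL 8.0 and later, a window function with PARTITION BY on the doctor column should be able to replace the second derived table.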

Related

Create a MySQL view to Get Average for Count Column

I need to create a view to get the sum of the count column and display the average as a new column. I used the below code.
select
count(`t`.`surg_priority`) AS `Surgery_Count`,
`t`.`surg_priority` AS `Surgery_Type`
from
`DataBase`.`booking` t
group by
`t`.`surg_priority`
That gives the count per surgery type. I need a new column called average that shows each surgery count as a percentage of the total:
Surgery Average = (Surgery Count / Sum of Surgery Count) * 100
I also tried
select
count(`t`.`surg_priority`) AS `Surgery_Count`,
`t`.`surg_priority` AS `Surgery_Type`,
(
(count(`t`.`surg_priority`)/(sum(Surgery_Count))
)* 100 AS `Surgery_AVG`
from
`DataBase`.`orbkn_booking` t
group by
`t`.`surg_priority`
This didn't work either. Note that this has to be a view, so I can't use variables or cumulative functions.

You can compute the total count in a subquery and divide the Surgery_Count by that:
select
count(`t`.`surg_priority`) AS `Surgery_Count`,
`t`.`surg_priority` AS `Surgery_Type`,
100.0 * count(`t`.`surg_priority`) /
(select count(`surg_priority`) from `DataBase`.`booking`) AS `Surgery_Avg`
from
`DataBase`.`booking` t
group by
`t`.`surg_priority`
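As a rough check of the subquery approach inside a view, the sketch below builds the view over a few sample rows. It uses Python's sqlite3 module as a stand-in for MySQL; the view name surgery_share and the sample priorities are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE booking (surg_priority TEXT)")
conn.executemany(
    "INSERT INTO booking VALUES (?)",
    [("Elective",)] * 6 + [("Urgent",)] * 3 + [("Emergency",)] * 1,
)

# surgery_share is a hypothetical view name; the scalar subquery supplies the total
conn.execute("""
    CREATE VIEW surgery_share AS
    SELECT surg_priority AS Surgery_Type,
           COUNT(surg_priority) AS Surgery_Count,
           100.0 * COUNT(surg_priority)
               / (SELECT COUNT(surg_priority) FROM booking) AS Surgery_Avg
    FROM booking
    GROUP BY surg_priority
""")

shares = {
    surgery_type: (count, pct)
    for surgery_type, count, pct in conn.execute("SELECT * FROM surgery_share")
}
print(shares)
```

With ten sample bookings the three percentages come out as 60, 30 and 10, which sum to 100 as expected.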

Use window functions:
select b.surg_priority as Surgery_Type, count(*) as surgery_count,
count(*) * 100.0 / sum(count(*)) over () as ratio
from DataBase.booking b
group by b.surg_priority;
Window functions have been available since MySQL 8.0.
Also, don't clutter your queries with backticks. They just make queries harder to write and read.

MySQL - group by and count - best query

We have a statistics database of which we would like to group some results. Every entry has a timestamp 'tstarted'.
We would like to group by every quarter of the day. For each quarter, we would like to know the day count where we have > 0 results (for that quarter).
We could resolve this by using a subquery:
select quarter, sum(q), count(quarter), sum(q) / count(quarter) as average
from (
select SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) as quarter, sum(qdelivered) as q
from statistics
where stat_field = 1
group by SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900), date(tstarted)
order by SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) asc
) as sub
group by quarter
My question: is there a more efficient way to retrieve this result (e.g. join or other way)?

Efficiency could be improved by eliminating the inline view (derived table aliased as sub), and doing all the work in a single query. (This is because of the way that MySQL processes the inline view, creating and populating a temporary MyISAM table.)
At first I didn't see why the expression date(tstarted) needs to be included in the GROUP BY clause. Its effect is that the inline view returns one row per quarter per day, so COUNT(quarter) in the outer query counts the days that have rows for that quarter.
I think this query returns the same result as the original:
SELECT SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900) AS `quarter`
, SUM(s.qdelivered) AS `q`
, COUNT(DISTINCT DATE(s.tstarted)) AS `day_count`
, SUM(s.qdelivered) / COUNT(DISTINCT DATE(s.tstarted)) AS `average`
FROM statistics s
WHERE s.stat_field = 1
GROUP BY SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900)
This should be more efficient since it avoids materializing an intermediate derived table.
Your question said you wanted a "day count"; that sounds like a count of the days that had at least one row within a particular quarter hour.
That is what this aggregate expression in the SELECT list provides:
, COUNT(DISTINCT DATE(s.tstarted)) AS `day_count`
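To check that the single query really matches the nested one, the sketch below runs both against a handful of made-up rows. It uses Python's sqlite3 module as a stand-in for MySQL, so the quarter bucket is expressed in seconds since midnight with SQLite's strftime rather than SEC_TO_TIME/TIME_TO_SEC. The sample data and the BUCKET helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statistics (tstarted TEXT, qdelivered INTEGER, stat_field INTEGER)"
)
conn.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?)",
    [
        ("2014-06-29 10:03:00", 4, 1),
        ("2014-06-29 10:10:00", 6, 1),
        ("2014-06-30 10:05:00", 10, 1),
        ("2014-06-30 10:20:00", 8, 1),
        ("2014-06-30 10:21:00", 2, 0),  # excluded by stat_field = 1
    ],
)

# SQLite stand-in for (TIME_TO_SEC(tstarted) DIV 900) * 900
BUCKET = "(CAST(strftime('%s', tstarted) AS INTEGER) % 86400) / 900 * 900"

# Original shape: group per quarter and day first, then per quarter
nested = conn.execute(f"""
    SELECT quarter, SUM(q), COUNT(quarter), 1.0 * SUM(q) / COUNT(quarter)
    FROM (SELECT {BUCKET} AS quarter, SUM(qdelivered) AS q
          FROM statistics
          WHERE stat_field = 1
          GROUP BY {BUCKET}, DATE(tstarted)) AS sub
    GROUP BY quarter
    ORDER BY quarter
""").fetchall()

# Single-level shape: COUNT(DISTINCT DATE(...)) replaces the inner grouping
single = conn.execute(f"""
    SELECT {BUCKET} AS quarter,
           SUM(qdelivered),
           COUNT(DISTINCT DATE(tstarted)),
           1.0 * SUM(qdelivered) / COUNT(DISTINCT DATE(tstarted))
    FROM statistics
    WHERE stat_field = 1
    GROUP BY {BUCKET}
    ORDER BY quarter
""").fetchall()

print(nested)
print(single)
```

On this sample both queries return the same two quarters (36000 and 36900 seconds, i.e. 10:00 and 10:15). That is consistent with the claim above, though it is a spot check rather than a proof.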

I would be tempted to set up a table of quarters in the day, then use this table and LEFT JOIN your statistics table to it.
CREATE TABLE quarters
(
id INT,
start_qtr INT,
end_qtr INT
);
INSERT INTO quarters (id, start_qtr, end_qtr) VALUES
(1,0,899),
(2,900,1799),
(3,1800,2699),
(4,2700,3599),
(5,3600,4499),
(6,4500,5399),
(7,5400,6299),
(8,6300,7199);
-- etc., continuing in the same pattern up to (96,85500,86399)
Your query can then be:-
SELECT SEC_TO_TIME(quarters.start_qtr) AS quarter,
sum(statistics.qdelivered),
count(statistics.qdelivered),
sum(statistics.qdelivered) / count(statistics.qdelivered) as average
FROM quarters
LEFT OUTER JOIN statistics
ON TIME_TO_SEC(statistics.tstarted) BETWEEN quarters.start_qtr AND quarters.end_qtr
AND statistics.stat_field = 1
AND DATE(statistics.tstarted) = '2014-06-30'
GROUP BY quarter
ORDER BY quarter;
Advantage of this is that it will give you entries with a count of 0 (and an average of NULL) for quarters where there are no statistics, and it saves some of the calculations.
You could save more calculations by adding time columns to the quarters table:-
CREATE TABLE quarters
(
id INT,
start_qtr INT,
end_qtr INT,
start_qtr_time TIME,
end_qtr_time TIME
);
INSERT INTO quarters (id, start_qtr, end_qtr, start_qtr_time, end_qtr_time) VALUES
(1,0,899, '00:00:00', '00:14:59'),
(2,900,1799, '00:15:00', '00:29:59'),
(3,1800,2699, '00:30:00', '00:44:59'),
(4,2700,3599, '00:45:00', '00:59:59'),
(5,3600,4499, '01:00:00', '01:14:59'),
(6,4500,5399, '01:15:00', '01:29:59'),
(7,5400,6299, '01:30:00', '01:44:59'),
(8,6300,7199, '01:45:00', '01:59:59');
-- etc., continuing in the same pattern up to (96,85500,86399, '23:45:00', '23:59:59')
Then this saves the use of a function on the JOIN:-
SELECT start_qtr_time AS quarter,
sum(statistics.qdelivered),
count(statistics.qdelivered),
sum(statistics.qdelivered) / count(statistics.qdelivered) as average
FROM quarters
LEFT OUTER JOIN statistics
ON TIME(statistics.tstarted) BETWEEN quarters.start_qtr_time AND quarters.end_qtr_time
AND statistics.stat_field = 1
AND DATE(statistics.tstarted) = '2014-06-30'
GROUP BY quarter
ORDER BY quarter;
These both assume you are interested in a particular day.

MS ACCESS Combining two result sets

SELECT c.siteno, a.sitename, a.location, Count(a.status) AS ChargeablePermit
FROM (PermitStatus AS a LEFT JOIN states AS b ON a.status = b.statusheading)
LEFT JOIN Sitedetails AS c ON a.zone = c.compexzone
WHERE b.statusheading like "Chargeable" and a.loaded_date between
(select monthstart from ChargeDate) and (select Monthend from ChargeDate)
GROUP BY a.sitename, c.siteno, a.location;
This query returns the count of chargeable permits by site, for example:
Mar 14
Siteno (1) Sitename (site1) Location (location1) ChargeablePermit (30)
These counts are based on the period determined by the two subselects (here, the month of March 14).
I was wondering if I could change the date range covered by the subselects (e.g. to April 14), subtract one count of chargeable permits from the other, and have the result displayed in one table.
For instance, if April 14 was:
April
Siteno (1) Sitename (Site1) Location (Location1) ChargeablePermit (40) Difference (10)

Not in the way you seem to be proposing. Instead, you would double up your SQL within a UNION query to return the data sets for the two periods, and then aggregate the results:
SELECT SUM(CP) FROM (
SELECT (ChargeablePermit * -1) AS CP FROM ... WHERE dates = Date1
UNION ALL
SELECT ChargeablePermit AS CP FROM ... WHERE dates = Date2
)
Depending on how many records you're dealing with, a UNION like this could be quite slow however. So the other approach would be to turn your SQL into an Append query which inserts the output into a temp table. You would run the query for each period, before running a 2nd query to aggregate the results from the temp table.
Also you should consider using joins to filter your results rather than subqueries.
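The UNION ALL idea above can be sketched on toy data: negate the counts for the first period, leave the second period as is, and sum per site. The example uses Python's sqlite3 module rather than Access, and the permit_counts table (with per-period counts already computed) is a hypothetical stand-in for the two copies of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# permit_counts is a hypothetical table holding the per-period counts
conn.execute(
    "CREATE TABLE permit_counts (period TEXT, siteno INTEGER, ChargeablePermit INTEGER)"
)
conn.executemany(
    "INSERT INTO permit_counts VALUES (?, ?, ?)",
    [
        ("2014-03", 1, 30), ("2014-04", 1, 40),
        ("2014-03", 2, 12), ("2014-04", 2, 9),
    ],
)

differences = conn.execute("""
    SELECT siteno, SUM(CP) AS Difference
    FROM (SELECT siteno, -ChargeablePermit AS CP
          FROM permit_counts
          WHERE period = '2014-03'
          UNION ALL
          SELECT siteno, ChargeablePermit AS CP
          FROM permit_counts
          WHERE period = '2014-04') AS both_periods
    GROUP BY siteno
    ORDER BY siteno
""").fetchall()

print(differences)
```

Site 1 goes from 30 to 40 permits, so its difference is 10, matching the figure in the question; site 2 drops from 12 to 9 and shows -3.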

MySQL Calculating a percentage from two counts in the same table

I'm trying to calculate a percentage of customers that live within a specified range (<=10miles) from the total number of customers, the data comes from the same table.
Customer_Loc (table name)
Cust_ID | Rd_dist (column names)
I've tried the query below which returns syntax errors.
select count(Rd_dist) / count (Cust_ID)
from customer_loc
where Rd_Dist <=10 *100 as percentage
I realise the solution to this may be fairly easy but I'm new to SQL and it's had me stuck for ages.

The problem with your query is that you are filtering out all the customers who are more than 10 miles away. You need conditional aggregation, and this is very easy in MySQL:
select (sum(Rd_Dist <= 10) / count(*)) * 100 as percentage
from customer_loc;
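To see why the WHERE clause breaks the calculation, the sketch below compares the filtered query with conditional aggregation on five made-up customers. It runs on Python's sqlite3 module as a stand-in for MySQL; because SQLite divides integers as integers, the 100.0 factor is moved to the front, a detail that MySQL's / operator does not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_loc (Cust_ID INTEGER, Rd_dist REAL)")
conn.executemany(
    "INSERT INTO customer_loc VALUES (?, ?)",
    [(1, 3.0), (2, 8.0), (3, 10.0), (4, 15.0), (5, 40.0)],
)

# Filtering first removes the far-away customers from both counts
filtered = conn.execute(
    "SELECT 100.0 * COUNT(Rd_dist) / COUNT(Cust_ID) "
    "FROM customer_loc WHERE Rd_dist <= 10"
).fetchone()[0]

# Conditional aggregation: the comparison yields 1 or 0, so SUM counts matches
conditional = conn.execute(
    "SELECT 100.0 * SUM(Rd_dist <= 10) / COUNT(*) FROM customer_loc"
).fetchone()[0]

print(filtered, conditional)
```

The filtered version always reports 100 percent, while the conditional version reports 60 percent for this sample (three of five customers within 10 miles).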

Difference between rows Mysql Query

I have one table with four fields:
trip_paramid, creation_time, fuel_content, vehicle_id
I want to find the difference between two rows. The table has a fuel_content field; every two minutes I receive a packet and insert it into the database. From this I want to find the total refuel quantity. If the fuel content rises by more than 2 between two packets, I treat the increase as a refuel. Multiple refuels may happen on the same day, so I want the total refuel quantity per day for a vehicle. I created the table schema and sample data in sqlfiddle. Can anyone help me find a solution? Here is the link to the table schema: http://www.sqlfiddle.com/#!2/4cf36

Here is a query that does this.
The parameters vehicle_id=13 and date='2012-11-08' are hard-coded in the query; replace them with your own values.
Note that I have chosen an expression using creation_time>=... and creation_time<... instead of DATE(creation_time)='...'. This is because the first form can use an index on creation_time while the second cannot.
SELECT
SUM(fuel_content-prev_content) AS refuel_tot
, COUNT(*) AS refuel_nbr
FROM (
SELECT
p.trip_paramid
, fuel_content
, creation_time
, (
SELECT ps.fuel_content
FROM trip_parameters AS ps
WHERE (ps.vehicle_id=p.vehicle_id)
AND (ps.trip_paramid<p.trip_paramid)
ORDER BY trip_paramid DESC
LIMIT 1
) AS prev_content
FROM trip_parameters AS p
WHERE (p.vehicle_id=13)
AND (creation_time>='2012-11-08')
AND (creation_time<DATE_ADD('2012-11-08', INTERVAL 1 DAY))
ORDER BY p.trip_paramid
) AS log
WHERE (fuel_content-prev_content)>2
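The previous-row lookup can be tried on a small made-up series of readings. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and replaces DATE_ADD with a literal next-day date; the sample fuel values are assumptions, not the data from the sqlfiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trip_parameters (
        trip_paramid INTEGER PRIMARY KEY,
        creation_time TEXT,
        fuel_content REAL,
        vehicle_id INTEGER
    )
""")
# One packet every two minutes; two refuels (+14 and +3) and one small rise (+1.5)
readings = [50.0, 48.0, 46.0, 60.0, 58.0, 61.0, 60.0, 61.5]
conn.executemany(
    "INSERT INTO trip_parameters VALUES (?, ?, ?, 13)",
    [
        (i + 1, f"2012-11-08 08:{2 * i:02d}:00", fuel)
        for i, fuel in enumerate(readings)
    ],
)

refuel_tot, refuel_nbr = conn.execute("""
    SELECT SUM(fuel_content - prev_content), COUNT(*)
    FROM (SELECT p.fuel_content,
                 -- fuel level of the closest earlier packet for the same vehicle
                 (SELECT ps.fuel_content
                  FROM trip_parameters AS ps
                  WHERE ps.vehicle_id = p.vehicle_id
                    AND ps.trip_paramid < p.trip_paramid
                  ORDER BY ps.trip_paramid DESC
                  LIMIT 1) AS prev_content
          FROM trip_parameters AS p
          WHERE p.vehicle_id = 13
            AND p.creation_time >= '2012-11-08'
            AND p.creation_time < '2012-11-09') AS log
    WHERE fuel_content - prev_content > 2
""").fetchone()

print(refuel_tot, refuel_nbr)
```

Only the two jumps larger than 2 are counted, giving a total of 17 over 2 refuels. The first packet has no predecessor, so its prev_content is NULL and the row drops out of the filter.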

Test it:
select sum(t2.fuel_content-t1.fuel_content) TotalFuel,t1.vehicle_id,t1.trip_paramid as rowIdA,
t2.trip_paramid as rowIdB,
t1.creation_time as timeA,
t2.creation_time as timeB,
t2.fuel_content fuel2,
t1.fuel_content fuel1,
(t2.fuel_content-t1.fuel_content) diffFuel
from trip_parameters t1, trip_parameters t2
where t1.trip_paramid<t2.trip_paramid
and t1.vehicle_id=t2.vehicle_id
and t1.vehicle_id=13
and t2.fuel_content-t1.fuel_content>2
order by rowIdA,rowIdB
Here (rowIdA, rowIdB) are all possible pairs of rows without repetition, diffFuel is the difference in fuel quantity between them, and TotalFuel is the sum of those differences.
The query compares fuel content differences for the same vehicle (in this example, the vehicle with id=13) and only sums a difference when it is greater than 2.
Note that it pairs every earlier row with every later row, not only consecutive packets, so the sum can count the same refuel more than once.
Regards.
