sum of differences between two rows in mysql - mysql

Above is a table, and I need to get the total distance covered by a tyre.
I am looking for a way to sum the differences to get the total distance covered.
The total distance is the sum of the differences between each "removal" action and its matching "insert" action.
The end result should be 1100 + 300 = 1400.

If there is always exactly one 'insert' and one 'removal' row per tyre and position, you can use conditional aggregation to compute the distance covered per (tyre, position) tuple, and then add another level of aggregation at the tyre level:
select tyreId, sum(distance_covered) distance_covered
from (
    select
        tyreId,
        position,
        sum(case action when 'removal' then distance else -distance end) distance_covered
    from mytable
    where action in ('insert', 'removal')
    group by tyreId, position
) t
group by tyreId
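As a rough check of the query above, here is a minimal sketch that runs it on SQLite through Python's sqlite3 module rather than on MySQL. The original table was not preserved, so the rows, the position labels, and the extra 'inspection' action are all hypothetical, chosen only to reproduce the 1100 + 300 = 1400 figure from the question.

```python
import sqlite3

# Hypothetical rows: invented to reproduce 1100 + 300 = 1400.
rows = [
    (1, "front-left", "insert", 1000),
    (1, "front-left", "removal", 2100),     # 2100 - 1000 = 1100
    (1, "rear-right", "insert", 5000),
    (1, "rear-right", "removal", 5300),     # 5300 - 5000 = 300
    (1, "rear-right", "inspection", 5100),  # dropped by the WHERE filter
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (tyreId INTEGER, position TEXT, action TEXT, distance INTEGER)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

query = """
SELECT tyreId, SUM(distance_covered) AS distance_covered
FROM (
    -- removal counts as +distance, insert as -distance, per tyre and position
    SELECT tyreId, position,
           SUM(CASE action WHEN 'removal' THEN distance ELSE -distance END) AS distance_covered
    FROM mytable
    WHERE action IN ('insert', 'removal')
    GROUP BY tyreId, position
) AS t
GROUP BY tyreId
"""
totals = conn.execute(query).fetchall()
print(totals)
```

If a tyre could be inserted and removed more than once at the same position, the inner sum would still net out correctly as long as every 'insert' has a matching 'removal'.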

How to use a MySQL case statement in a window frame to calculate running total of a column?

I have two columns, date and sales, and the objective is to use a CASE statement to create another column that shows the cumulative sum of sales for each date.
Here's the sales table:
date        sales
2019-04-01  50
2019-04-02  100
2019-04-03  100
What would be the best way to write a case statement in order to meet the requirements below?
Desired output:
date        sales  cumulative
2019-04-01  50     50
2019-04-02  100    150
2019-04-03  100    250
You don't need a CASE expression; just use SUM() as a window function:
SELECT date, sales, SUM(sales) OVER (ORDER BY date) AS cumulative
FROM yourTable
ORDER BY date;
There's no need for a case statement here; you just need the SUM window function:
select date, sales, sum(sales) over (order by date, id)
from sales
(If date is unique, ordering by date is enough. If it is not, it is best practice to specify additional columns to order by to produce a non-arbitrary result, such as you would get if sometimes it considered rows with the same date in one order and sometimes in another.)
When you use the SUM window function with no ORDER BY clause, the default window frame is the entire partition (equivalent to RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), so it sums the expression over all rows being returned. When you specify an ORDER BY, the default frame becomes RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which sums the expression for all rows up through the current row in the given order (including any rows that tie with the current row on the ORDER BY columns), producing a cumulative total. Because this is the default, there is no need to specify it; if you do specify it, it goes after the ORDER BY in the window specification.
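The default-frame behaviour described above can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption; it needs an SQLite build with window functions, 3.25 or later) and the three rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2019-04-01", 50), ("2019-04-02", 100), ("2019-04-03", 100)],
)

# ORDER BY in the window, frame left implicit: running total.
implicit = conn.execute(
    "SELECT date, sales, SUM(sales) OVER (ORDER BY date) FROM sales ORDER BY date"
).fetchall()

# Same window with the default frame spelled out after the ORDER BY.
explicit = conn.execute(
    """
    SELECT date, sales,
           SUM(sales) OVER (ORDER BY date
                            RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM sales ORDER BY date
    """
).fetchall()

# No ORDER BY in the window: every row sees the grand total.
no_order = conn.execute(
    "SELECT date, sales, SUM(sales) OVER () FROM sales ORDER BY date"
).fetchall()

print(implicit)
print(explicit)
print(no_order)
```

The first two result sets are identical, which is why the explicit frame clause can be omitted.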

Calculate median - mysql

Can somebody explain to me how this query below works?
It's a query to calculate the median latitude from the table. I have worn myself out trying to understand it, and I still can't.
SELECT *
FROM station AS st
WHERE (
    SELECT COUNT(LAT_N)
    FROM station
    WHERE LAT_N < st.LAT_N
) = (
    SELECT COUNT(LAT_N)
    FROM station
    WHERE LAT_N > st.LAT_N
);
The median is the middle value in a collection, which means there are as many values above it as below.
So for each row in the table, the first subquery counts the number of rows where LAT_N is lower than in the current row, and the second counts the number of rows where it's higher. Then it only returns the rows where these counts are the same.
Note that this won't work in many situations. The simplest example is when there are an even number of distinct values:
1
2
3
4
The median should be 2.5 (the mean of the two middle values), which doesn't exist in the table.
Another case is when there are duplicate values:
1
1
2
The median should be 1. But for either row with value 1, the count of lower values is 0 while the count of higher values is 1, so the counts are never equal and no row is returned.
Oh that's some clever code! It's got a bug, but here's what it's trying to do.
We define median as the value(s) that have the same number of values greater and less than them. So here's the query in pseudocode:
for each station in st:
    compute number of stations with latitude greater than the current station's latitude
    compute number of stations with latitude less than the current station's latitude
    if these two values are equal, include the station in the result
Bug:
For tables with an even number of distinct values, the median should be defined as the mean of the two middle values. This code doesn't handle that.
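To make the failure cases concrete, the sketch below runs the counting query on the example inputs from both answers and compares it with one possible alternative that ranks the rows and averages the middle one or two. The alternative is not from the original thread; it is an assumption written for SQLite (3.25 or later) via Python's sqlite3, and it relies on SQLite's integer division, so on MySQL the `/` would need to be `DIV`.

```python
import sqlite3

# The counting query from the question, returning only LAT_N.
COUNTING_QUERY = """
SELECT LAT_N
FROM station AS st
WHERE (SELECT COUNT(LAT_N) FROM station WHERE LAT_N < st.LAT_N)
    = (SELECT COUNT(LAT_N) FROM station WHERE LAT_N > st.LAT_N)
"""

# Assumed alternative: number the sorted rows, then average the middle row
# (odd count) or the two middle rows (even count).
RANKED_QUERY = """
SELECT AVG(LAT_N)
FROM (
    SELECT LAT_N,
           ROW_NUMBER() OVER (ORDER BY LAT_N) AS rn,
           COUNT(*) OVER () AS n
    FROM station
) AS ranked
WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
"""

def medians(values):
    """Return (rows found by the counting query, median from the ranked query)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE station (LAT_N REAL)")
    conn.executemany("INSERT INTO station VALUES (?)", [(v,) for v in values])
    counting = [row[0] for row in conn.execute(COUNTING_QUERY)]
    ranked = conn.execute(RANKED_QUERY).fetchone()[0]
    conn.close()
    return counting, ranked

print(medians([1, 2, 3]))     # odd count, distinct values: both agree
print(medians([1, 2, 3, 4]))  # even count: counting query finds nothing
print(medians([1, 1, 2]))     # duplicates: counting query finds nothing
```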

How to get the top records and have a different condition in sql?

I need to get the top 'RouteID' record, the one that also has the highest total 'EconomyPrice'.
I already tried this code:
SELECT RouteID, COUNT(RouteID)
FROM [Session6].[dbo].[Schedules]
GROUP BY RouteID
ORDER BY COUNT(RouteID) DESC
But the result is this:
Result
The result shows only the routes with the highest count of 'RouteID', not those with the highest sum of 'EconomyPrice'.
This is the table:
Table
Then why not simply sum up EconomyPrice instead of counting the RouteIDs?
SELECT
RouteID,
SUM(EconomyPrice) AS TotalEconomyPrice
FROM
[Session6].[dbo].[Schedules]
GROUP BY
RouteID
ORDER BY
SUM(EconomyPrice) DESC
You cannot have the routes with the highest count and the routes with the highest total economy price at the top at the same time, since these conditions might be met by different route IDs. This is not a limitation of SQL; the two orderings simply need not agree.
You could also use the average, AVG(EconomyPrice), as the measure.
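A small sketch of why the two orderings can disagree, using SQLite through Python's sqlite3 instead of SQL Server. The four rows are hypothetical, since the original table was only posted as an image: route 1 has the most schedules, route 2 has the highest total price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Schedules (RouteID INTEGER, EconomyPrice INTEGER)")
# Hypothetical data: route 1 appears three times, route 2 once but at a high price.
conn.executemany(
    "INSERT INTO Schedules VALUES (?, ?)",
    [(1, 100), (1, 100), (1, 100), (2, 500)],
)

by_count = conn.execute(
    """
    SELECT RouteID, COUNT(RouteID)
    FROM Schedules
    GROUP BY RouteID
    ORDER BY COUNT(RouteID) DESC
    """
).fetchall()

by_sum = conn.execute(
    """
    SELECT RouteID, SUM(EconomyPrice) AS TotalEconomyPrice
    FROM Schedules
    GROUP BY RouteID
    ORDER BY SUM(EconomyPrice) DESC
    """
).fetchall()

print(by_count)  # route 1 first
print(by_sum)    # route 2 first
```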

Access Query for TopN by Group

I've reviewed quite a few sites (e.g. Allen Browne's) on creating a query that produces the top 5 (or N) values by group. I think I am getting hung up on the creation of the subquery because I'm referencing a previous query, not a table.
I have a query started which counts by month the number of PIs (qryPICountbyMonth). Currently the below gives a data mismatch expression error:
SELECT qryPI.EventMonth, qryPI.PI_Issue, Count(qryPI.PI_Issue) AS CountOfPI_Issue
FROM qryPI
GROUP BY qryPI.EventMonth, qryPI.PI_Issue
HAVING (((Count(qryPI.PI_Issue)) In (Select Top 5 [PI_Issue] From [qryPI]
    Where [EventMonth]=[qryPI].[EventMonth] Order By [PI_Issue] Desc)))
ORDER BY qryPI.EventMonth DESC, Count(qryPI.PI_Issue) DESC;
It is built off a separate query, qryPI:
SELECT tblPI.EventDate, Format([EventDate],'yyyy-mm',1,1) AS EventMonth, tblPI.PI_Issue
FROM tblPI
WHERE (((tblPI.EventDate) >= #4/1/2016# And (tblPI.EventDate) <= #5/31/2016#))
GROUP BY tblPI.EventDate, Format([EventDate],'yyyy-mm',1,1), tblPI.PI_Issue;
I'm hoping to have it generate the top 5 counts of PI_Issue by EventMonth. If I haven't provided enough info let me know.
The problem (or at least a problem) is with [EventMonth]=[qryPI].[EventMonth]. Both your primary source and your lookup are called qryPI. You have to alias at least one of them.
You can't do this:
HAVING (((Count(qryPI.PI_Issue)) In (Select Top 5 [PI_Issue] From [qryPI]
Count(field) returns an integer, not the set of values you're counting, so comparing it with IN against a list of PI_Issue values causes the type mismatch.
I thought you could specify TopN in an Access query (it's in the properties), but you have to specify an order by clause, so it knows how to determine the TOP.
Have you tried:
SELECT top 5
tblPI.EventDate, Format([EventDate],'yyyy-mm',1,1) AS EventMonth, tblPI.PI_Issue
FROM tblPI
WHERE (((tblPI.EventDate) >= #4/1/2016# And (tblPI.EventDate) <= #5/31/2016#))
GROUP BY tblPI.EventDate, Format([EventDate],'yyyy-mm',1,1), tblPI.PI_Issue
order by PI_Issue
I'm also not sure why you're using GROUP BY in your inner query, as you're not returning any aggregate functions. Do you just need DISTINCT instead?
Try:
SELECT distinct top 5
tblPI.EventDate, Format([EventDate],'yyyy-mm',1,1) AS EventMonth, tblPI.PI_Issue
FROM tblPI
WHERE (((tblPI.EventDate) >= #4/1/2016# And (tblPI.EventDate) <= #5/31/2016#))
order by PI_Issue
Actually, if I understand what you want, you need that GROUP BY instead of DISTINCT, but you also need to return the COUNT(*):
SELECT
Year([eventDate]) AS yr,
Month([eventDate]) AS mo,
tblPI.PI_issue,
Min(tblPI.eventDate) AS MinOfeventDate,
Max(tblPI.eventDate) AS MaxOfeventDate,
Count(tblPI.PI_issue) AS CountOfPI_issue
FROM tblPI
WHERE
(((tblPI.EventDate)>=#4/1/2016# And
(tblPI.EventDate)<#6/1/2016#))
GROUP BY
Year([eventDate]),
Month([eventDate]),
tblPI.PI_issue;
Then you want to apply TOP N to CountOfPI_issue in an outer query (assuming the query above is saved as qry_inner):
SELECT TOP 5 *
FROM qry_inner
ORDER BY CountOfPI_issue DESC;
Except that TOP 5 applies to all the query results, not the results grouped by year and month, which is what I'm assuming you want, so try this:
SELECT TOP 5
qry_inner.yr,
qry_inner.mo,
qry_inner.CountOfPI_issue,
qry_inner.PI_issue,
qry_inner.MinOfeventDate,
qry_inner.MaxOfeventDate
FROM qry_inner
ORDER BY qry_inner.CountOfPI_issue DESC;
As far as I know, Access doesn't allow you to select the top number of rows within a group, so you'll need to limit your outer query results to one month, then apply the TOP function.
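For comparison, one workaround often used where there is no per-group TOP is a correlated subquery that keeps a row only if fewer than N rows in the same group have a higher count. The sketch below is an assumption, not part of the original answer: it runs on SQLite through Python's sqlite3, has not been tried in Access, uses hypothetical data, and shows top 2 rather than top 5 to keep the data small. Ties at the cutoff would return more than N rows per month.

```python
import sqlite3

TOP_N = 2  # top 2 per month instead of top 5, to keep the sample small

# Hypothetical events: (month, issue, number of occurrences).
spec = [
    ("2016-04", "A", 3), ("2016-04", "B", 2), ("2016-04", "C", 1),
    ("2016-05", "C", 3), ("2016-05", "A", 2), ("2016-05", "B", 1),
]
rows = []
for month, issue, n in spec:
    rows += [(f"{month}-{day:02d}", issue) for day in range(1, n + 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblPI (EventDate TEXT, PI_issue TEXT)")
conn.executemany("INSERT INTO tblPI VALUES (?, ?)", rows)

# Stand-in for the saved inner query: counts per month and issue.
conn.execute("""
CREATE VIEW qry_inner AS
SELECT strftime('%Y-%m', EventDate) AS EventMonth,
       PI_issue,
       COUNT(*) AS CountOfPI_issue
FROM tblPI
GROUP BY strftime('%Y-%m', EventDate), PI_issue
""")

# Keep a row only if fewer than TOP_N rows in the same month have a higher count.
top_per_month = conn.execute(
    """
    SELECT q.EventMonth, q.PI_issue, q.CountOfPI_issue
    FROM qry_inner AS q
    WHERE (SELECT COUNT(*)
           FROM qry_inner AS higher
           WHERE higher.EventMonth = q.EventMonth
             AND higher.CountOfPI_issue > q.CountOfPI_issue) < ?
    ORDER BY q.EventMonth, q.CountOfPI_issue DESC
    """,
    (TOP_N,),
).fetchall()
print(top_per_month)
```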

Select average value X of SQL table column while not grouping by X

For the purposes of my question, I have a database in a MySQL server with info on many taxi rides (it is comprised of two tables, history_trips and trip_info).
In history_trips, each row's useful data is comprised of a unique alphanumeric ID, ride_id, the name of the rider, rider, and the time the ride ended, finishTime as a Y-m-d string.
In trip_info, each row's useful data similarly contains ride_id and rider, but also contains an integer, value (calculated in the back end from other data).
What I need to do is create a query that finds the average of the maximum 'value' of each rider in a given time period. A rider is included in this average only if they completed fewer than X (let's say 3) rides within that time period.
So far, I have a query that creates a grouped table containing the name of the rider, the finishTime of their highest 'value' ride, the value of said ride, and the number of rides, num_rides, they have taken in that time period. The AVG(b.value) column, however, gives me the same values as b.value, which is unexpected. I would like to find some way to return the average of the b.value column.
SELECT a.rider, a.finishTime, b.value, AVG(b.value), COUNT(a.rider) AS num_rides
FROM history_trips AS a, trip_info AS b
WHERE a.finishTime > 'arbitrary_start_date_str'
  AND a.ride_id = b.ride_id
  AND b.value = (SELECT MAX(value)
                 FROM trip_info
                 WHERE rider = b.rider AND ride_id = b.ride_id)
GROUP BY a.rider
HAVING COUNT(a.rider) < 3
I am a novice in SQL but have read on some other forums that when using the AVG function on a value you must also GROUP BY that value. I was wondering if there is a way around that or if I am thinking of this problem incorrectly. Thanks in advance for any advice / solutions you might have!
The following worked for me:
SELECT AVG(ridergroups.maxvalues) avgmaxvalues
FROM (
    SELECT MAX(trip_info.value) maxvalues
    FROM trip_info
    INNER JOIN history_trips
        ON trip_info.ride_id = history_trips.ride_id
    WHERE history_trips.finishTime > '2010-06-20'
    GROUP BY trip_info.rider
    HAVING COUNT(trip_info.rider) < 3
) ridergroups;
The subquery filters the rides by date, groups them by rider, keeps only the riders with fewer than 3 rides in that period, and returns each remaining rider's maximum value. The containing query then calculates the average of those maximum values.
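A minimal sketch of that two-level aggregation, run on SQLite through Python's sqlite3 rather than MySQL. The riders, ride IDs, dates, and values are hypothetical: alice and bob qualify, carol has three rides in the period and is dropped by HAVING, and dave's only ride is before the cutoff date.

```python
import sqlite3

# Hypothetical data, invented for illustration.
history = [
    ("r1", "alice", "2010-07-01"),
    ("r2", "alice", "2010-07-02"),
    ("r3", "bob", "2010-07-03"),
    ("r4", "carol", "2010-07-01"),
    ("r5", "carol", "2010-07-02"),
    ("r6", "carol", "2010-07-03"),
    ("r7", "dave", "2010-01-01"),   # before the cutoff date
]
info = [
    ("r1", "alice", 10), ("r2", "alice", 30),   # alice's max is 30
    ("r3", "bob", 20),                          # bob's max is 20
    ("r4", "carol", 100), ("r5", "carol", 100), ("r6", "carol", 100),
    ("r7", "dave", 999),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history_trips (ride_id TEXT, rider TEXT, finishTime TEXT)")
conn.execute("CREATE TABLE trip_info (ride_id TEXT, rider TEXT, value INTEGER)")
conn.executemany("INSERT INTO history_trips VALUES (?, ?, ?)", history)
conn.executemany("INSERT INTO trip_info VALUES (?, ?, ?)", info)

query = """
SELECT AVG(ridergroups.maxvalues) AS avgmaxvalues
FROM (
    -- one row per qualifying rider: that rider's maximum value
    SELECT MAX(trip_info.value) AS maxvalues
    FROM trip_info
    INNER JOIN history_trips
        ON trip_info.ride_id = history_trips.ride_id
    WHERE history_trips.finishTime > '2010-06-20'
    GROUP BY trip_info.rider
    HAVING COUNT(trip_info.rider) < 3
) AS ridergroups
"""
avg_of_max = conn.execute(query).fetchone()[0]
print(avg_of_max)  # average of 30 and 20
```

Comparing finishTime as a string works here only because the dates are stored in Y-m-d form, which sorts the same way as the dates themselves.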