current average for each row of data - mysql

I have a table `output` that looks like this:
Date Output
1-Jan 20
2-Jan 40
3-Jan 30
4-Jan 100
5-Jan 120
6-Jan 10
7-Jan 90
8-Jan 80
9-Jan 60
... till
31-Dec 120
I need to query the average for each date, where the average is the cumulative average of the values from the first date to the current date, as below:
Date Output Average
1-Jan 20 20
2-Jan 40 30
3-Jan 30 30
4-Jan 100 47.5
5-Jan 120 62
6-Jan 10 53.33
Can anyone help, please?

SELECT `date`, `output`,
(SELECT avg(`output`) from Table1 where Table1.`date` <= b.`date`)
as `average` FROM Table1 b

Axel's answer works. Alternatively, you can do it in a single query with user variables:
set @count := 0;
set @total := 0;
select case when ((@count := @count + 1) and ((@total := @total + output) or 1))
         then @total / @count
       end rolling_average,
       `date`,
       `output`
from data
order by `date` asc;
http://sqlfiddle.com/#!9/2e006/14
This avoids the dependent subquery, which, depending on the size of your data, may result in better performance.

Pala's idea is a good one. In addition to lacking the ORDER BY, it also fails if the cumulative sum is ever zero or if output is ever NULL. This can easily be fixed:
select `date`, `output`,
       if((@count := @count + 1) is not null,
          if((@total := @total + coalesce(output, 0)) is not null,
             @total/@count, 0
          ), 0
       ) as running_average
from data cross join
     (select @count := 0, @total := 0) init
order by date;

Here's another way, although Pala's method scales better...
SELECT x.*
, AVG(y.output) avg
FROM output x
JOIN output y
ON y.date <= x.date
GROUP
BY x.date
ORDER
BY x.date;
The ORDER BY clause is apparently necessary after version 5.5/5.6.
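Where window functions are available (MySQL 8.0 and later), the cumulative average can also be sketched without user variables or a self-join. The example below is not from the answers above: it runs the query against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained, and the table name daily_output and the 2024 dates are made up for illustration.

```python
import sqlite3

def cumulative_averages(rows):
    """Return (day, output, running_average) tuples ordered by day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_output (day TEXT PRIMARY KEY, output REAL)")
    conn.executemany("INSERT INTO daily_output VALUES (?, ?)", rows)
    # AVG(...) OVER (ORDER BY day) averages every row from the first day
    # up to and including the current one.
    query = """
        SELECT day, output,
               AVG(output) OVER (ORDER BY day) AS running_average
        FROM daily_output
        ORDER BY day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample mirroring the first six rows of the question.
sample = [("2024-01-01", 20), ("2024-01-02", 40), ("2024-01-03", 30),
          ("2024-01-04", 100), ("2024-01-05", 120), ("2024-01-06", 10)]

for day, output, avg in cumulative_averages(sample):
    print(day, output, round(avg, 2))
```

The same AVG(...) OVER (ORDER BY ...) expression should carry over to MySQL 8.0 unchanged; only the Python and SQLite scaffolding is specific to this sketch.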

Related

How to eliminate only continuous duplicates but not all duplicates in a select query (MySQL)?

I have a table like this:
01-Jul-17 100
02-Jul-17 100
03-Jul-17 300
04-Jul-17 300
05-Jul-17 500
06-Jul-17 500
07-Jul-17 300
08-Jul-17 400
09-Jul-17 100
10-Jul-17 100
What I want to output (in this order), eliminating consecutive duplicates but not all duplicates, is:
100
300
500
300
400
100
I cannot use SELECT DISTINCT, as it would eliminate the second occurrences of 300 and 100. Is there a way to achieve this result in MySQL?
Thanks!
You want to get the previous value. If the dates really have no gaps or duplicates, just do:
select t.*
from t left join
t tprev
on t.col1 = date_add(tprev.col1, interval 1 day)
where tprev.col2 is null or tprev.col2 <> t.col2;
EDIT:
If the dates don't meet these conditions, then you can use variables:
select t.*
from (select t.*,
             (@rn := if(@v = col2, @rn + 1,
                        if(@v := col2, 1, 1)
                       )
             ) as rn
      from t cross join
           (select @v := 0, @rn := 0) params
      order by t.col1
     ) t
where rn = 1;
Note that MySQL does not guarantee the order of evaluation of expressions in the SELECT. So variables should not be assigned in one expression and then used in another -- they should be assigned in a single expression.
One way to handle this problem is by using session variables to track the changes of the values as ordered by your date column. In the query below, we keep track of the value, ordered by date, and assign a row number to each group of identical value. Then, only the first value in each group is retained. Note that this approach is robust to any number of duplicates. It is also robust with respect to there being gaps in your dates, so long as each record can be ordered by date.
SET @rn = 1;
SET @val = NULL;
SELECT t.val
FROM
(
    SELECT
        @rn:=CASE WHEN @val = val THEN @rn+1 ELSE 1 END rn,
        @val:=val AS val,
        dt
    FROM yourTable
    ORDER BY dt
) t
WHERE t.rn = 1
ORDER BY t.dt;
You can make use of the LAG and LEAD window functions (available in MySQL 8.0 and later). Note that the derived table needs an alias:
select y from (select y, lag(y, 1, 0) over (order by x) as prev_y from t1) t where y <> prev_y;
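As a rough check of the LAG idea against the sample data from the question, here is a minimal sketch that runs the query on an in-memory SQLite database via Python's sqlite3 module. The table name readings, the column names dt and val, and the ISO-formatted dates are assumptions made for the example. It compares against a NULL previous value instead of a default of 0, so a first row whose value is 0 would still be kept.

```python
import sqlite3

def collapse_consecutive(rows):
    """Return values ordered by date with consecutive repeats removed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (dt TEXT PRIMARY KEY, val INTEGER)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    # LAG(val) is NULL on the first row, so that row is always kept;
    # later rows are kept only when they differ from the previous row.
    query = """
        SELECT val
        FROM (SELECT dt, val, LAG(val) OVER (ORDER BY dt) AS prev_val
              FROM readings) AS t
        WHERE prev_val IS NULL OR prev_val <> val
        ORDER BY dt
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Sample values from the question, keyed by hypothetical ISO dates.
values = [100, 100, 300, 300, 500, 500, 300, 400, 100, 100]
sample = [("2017-07-%02d" % (i + 1), v) for i, v in enumerate(values)]

print(collapse_consecutive(sample))
```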

Incremental Count(*) using YEARWEEK

I have a bunch of data stored in a table, each row ends with a ts column, type TIMESTAMP.
I want to get incremental counts up until a point, so for instance I have the following query:
SELECT YEARWEEK(ts), DATE(ts), COUNT(*) FROM `order` WHERE DATE(ts) >= '01/12/13' GROUP BY YEARWEEK(ts)
Which produces something like:
201346 20/11/2013 59
201347 24/11/2013 44
201348 01/12/2013 21
However I need a column that adds up the COUNTS up until that point, so I'd need something like:
201346 20/11/2013 59 59
201347 24/11/2013 44 103
201348 01/12/2013 21 124
How can I achieve this with mysql?? It's for a line graph, so I need to show that the numbers go up each week and I can't do that with the current SQL statement.
SET @SUM:=0;
SELECT YEARWEEK(ts), DATE(ts), COUNT(*),(@SUM := @SUM+COUNT(*)) as CSUM
FROM orders WHERE DATE(ts) >= '01/12/13' GROUP BY YEARWEEK(ts)
courtesy this answer from Andomar
You can use user variables to get the running count:
set @total := 0;
select YEARWEEK(ts),
       date(ts),
       COUNT(*),
       @total := @total + COUNT(*) as running_count
FROM `order`
WHERE date(ts) >= '01/12/13'
group by YEARWEEK(ts)
order by YEARWEEK(ts);
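For comparison, a running count per week can be sketched with a window function instead of a user variable. This is not taken from the answers above: the query runs on an in-memory SQLite database via Python's sqlite3 module, SQLite's strftime('%Y-%W', ts) stands in for MySQL's YEARWEEK(ts), and the orders table and its timestamps are invented for the example.

```python
import sqlite3

def weekly_running_counts(timestamps):
    """Return (week, count, running_count) tuples ordered by week."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (ts TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?)", [(t,) for t in timestamps])
    # The inner query counts rows per week; the outer SUM(...) OVER
    # accumulates those weekly counts in week order.
    query = """
        SELECT wk, cnt, SUM(cnt) OVER (ORDER BY wk) AS running_count
        FROM (SELECT strftime('%Y-%W', ts) AS wk, COUNT(*) AS cnt
              FROM orders
              GROUP BY strftime('%Y-%W', ts)) AS weekly
        ORDER BY wk
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical timestamps falling in three consecutive weeks.
sample = ["2013-11-18 09:00:00", "2013-11-19 10:30:00", "2013-11-20 12:00:00",
          "2013-11-25 08:15:00", "2013-11-26 17:45:00",
          "2013-12-02 11:00:00"]

for week, count, running in weekly_running_counts(sample):
    print(week, count, running)
```

Because the accumulation is expressed in the query itself, the result does not depend on the order in which the server happens to evaluate variable assignments. The SUM(...) OVER (ORDER BY ...) form is available in MySQL 8.0 and later.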

Mysql query get SUM() specific row?

Is it possible to get a specific row in a query using something like SUM?
Example:
id tickets
1 10 1-10 10=10
2 35 11-45 10+35=45
3 45 46-90 10+35+45=90
4 110 91-200 10+35+45+110=200
Total: 200 tickets (as a SUM). I need to get the ID of the row that contains a given ticket number, such as 23. (The output would be ID 2, because ID 2 covers tickets 11-45 in the running sum.)
You can do it by defining a user variable in your SELECT query (in the FROM clause), e.g.:
select id, @total := @total + tickets as seats
from test, (select @total := 0) t
You seem to want the row where "23" fits in. I think this does the trick:
select t.*
from (select t.*, (@total := @total + tickets) as running_total
      from t cross join
           (select @total := 0) params
order by id
) t
where 23 > running_total - tickets and 23 <= running_total;
SELECT
d.id
,d.tickets
,CONCAT(
TRIM(CAST(d.RunningTotal - d.tickets + 1 AS CHAR(10)))
,'-'
,TRIM(CAST(d.RunningTotal AS CHAR(10)))
) as TicketRange
,d.RunningTotal
FROM
(
SELECT
id
,tickets
,@total := @total + tickets as RunningTotal
FROM
test
CROSS JOIN (select @total := 0) var
ORDER BY
id
) d
This is similar to Darshan's answer, but there are a few key differences:
You shouldn't use implicit join syntax. Explicit join syntax offers more functionality in the long run and has been standard for more than 20 years.
ORDER BY makes a huge difference to a running total calculated with a variable. If you change the order, the total is calculated differently, so decide how the running total should be ordered (by date, by id, or by something else) and put that in the query.
Finally, I also calculated the ticket range.
And here is how you can do it without using variables:
SELECT
d.id
,d.tickets
,CONCAT(
TRIM(d.LowRange)
,'-'
,TRIM(
CAST(RunningTotal AS CHAR(10))
)
) as TicketRange
,d.RunningTotal
FROM
(
SELECT
t.id
,t.tickets
,CAST(COALESCE(SUM(t2.tickets),0) + 1 AS CHAR(10)) as LowRange
,t.tickets + COALESCE(SUM(t2.tickets),0) as RunningTotal
FROM
test t
LEFT JOIN test t2
ON t.id > t2.id
GROUP BY
t.id
,t.tickets
) d
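To show the lookup itself (which row holds ticket number 23), here is a minimal sketch using a windowed running total. It is an illustration under assumptions, not part of the answers above: the query runs on an in-memory SQLite database via Python's sqlite3 module, the table is named test as in the thread, and the column aliases low_ticket and high_ticket are made up for the example.

```python
import sqlite3

def find_ticket_owner(rows, ticket):
    """Return (id, low_ticket, high_ticket) for the row covering `ticket`, or None."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, tickets INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?)", rows)
    # The running total up to each row is the last ticket number of that row;
    # subtracting the row's own tickets and adding 1 gives its first ticket.
    query = """
        SELECT id, low_ticket, high_ticket
        FROM (SELECT id,
                     SUM(tickets) OVER (ORDER BY id) - tickets + 1 AS low_ticket,
                     SUM(tickets) OVER (ORDER BY id) AS high_ticket
              FROM test) AS ranges
        WHERE ? BETWEEN low_ticket AND high_ticket
    """
    result = conn.execute(query, (ticket,)).fetchone()
    conn.close()
    return result

# Sample rows from the question.
sample = [(1, 10), (2, 35), (3, 45), (4, 110)]

print(find_ticket_owner(sample, 23))
```

As with the variable-based versions, the ORDER BY inside the window determines which tickets each row covers, so it should match the order in which tickets were actually allocated.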

calculate the differences between two rows in SQL

I have a SQL table in which each row is the revenue for a specific day. I want to add a new column whose value is the incremental revenue (which could be positive or negative) between that day and the previous day. How can I implement this in SQL?
Here is an example,
original table,
...
Day1 100
Day2 200
Day3 150
...
new table (with the incremental column added at the end; for the first row, zero could be assigned),
Day1 100 0
Day2 200 100
Day3 150 -50
I am using MySQL/MySQL Workbench.
thanks in advance,
Lin
SELECT a.day, a.revenue, a.revenue - COALESCE(b.revenue, 0) AS previous_day_rev
FROM DailyRevenue a
LEFT JOIN DailyRevenue b ON b.day = a.day - 1
The query assumes that each day has exactly one row in the table. If there could be more than one row per day, you need to create a view that sums the revenue grouped by day.
If you're okay with re-ordering the columns slightly, something like this is pretty simple to understand:
SET @prev := 0;
SELECT day, revenue - @prev AS diff, @prev := revenue AS revenue
FROM revenue ORDER BY day ASC;
The trick is that we calculate the difference to the previous first, then set the previous to the current and display it as the current in one step.
Note, this depends on the order being correct since the calculations are done during the returning of the rows, so you need to make sure you have an ORDER BY clause that returns the days in the correct order.
Try:
select
t.date_col, t.val_col,
case when t1.val_col is null then 0
else t.val_col - t1.val_col end diff
from (
select t.* , @r := @r + 1 lev
from tbl t,
(select @r := 0) r
order by t.date_col
) t
left join (
select t.* , @r1 := @r1 + 1 lev
from tbl t,
(select @r1 := 1) r
order by t.date_col
) t1
on t.lev = t1.lev
This will calculate the difference in value even if a date is missing.
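The day-over-day difference can also be sketched with LAG, which reads the previous row in date order and therefore tolerates gaps between days. This is an illustration, not one of the answers above: it runs on an in-memory SQLite database via Python's sqlite3 module, and it assumes a DailyRevenue table with an integer day column. LAG is available in MySQL 8.0 and later.

```python
import sqlite3

def revenue_differences(rows):
    """Return (day, revenue, diff) tuples; diff is 0 for the first row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DailyRevenue (day INTEGER PRIMARY KEY, revenue INTEGER)")
    conn.executemany("INSERT INTO DailyRevenue VALUES (?, ?)", rows)
    # LAG(revenue) is NULL on the first row, so the subtraction yields NULL
    # there and COALESCE turns it into 0.
    query = """
        SELECT day, revenue,
               COALESCE(revenue - LAG(revenue) OVER (ORDER BY day), 0) AS diff
        FROM DailyRevenue
        ORDER BY day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question (Day1, Day2, Day3).
sample = [(1, 100), (2, 200), (3, 150)]

for day, revenue, diff in revenue_differences(sample):
    print(day, revenue, diff)
```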

Limit rows of every month?

I need to get data from December 2012 to November 2014.
Each month I only need 1500 rows.
For example:
SELECT * FROM data WHERE YEAR(submit_date) = 2012 AND MONTH(submit_date) = 12 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 1 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 2 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 3 limit 1500;
and so on until Nov 2014.
Is there a way to write a shorter SQL query?
There are some options listed here: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
IMHO one of the best is using a row-counter:
set @num := 0, @type := '';
select id, name, submit_date,
       @num := if(@type = CONCAT(YEAR(submit_date), MONTH(submit_date)), @num + 1, 1) as row_number,
       @type := CONCAT(YEAR(submit_date), MONTH(submit_date)) as dummy
from data force index(IX_submit_date)
group by id, name, submit_date
having row_number <= 2;
You can test it here: http://sqlfiddle.com/#!2/e829c/13 (I do a cut for 2 elements, not for 1500)
I think you're looking for a GROUP BY clause. I would need to know a bit more to give you a definitive answer, but the following pseudo-query might guide you in the right direction.
SELECT *, SUM(some_field)
FROM data
GROUP BY MONTH(submit_date)
Or, if you only need 1500 rows in total, select the top 1500 ordered by the date (SQL Server syntax):
SELECT TOP(1500) *
FROM data
WHERE submit_date > '12-01-2012' AND submit_date < '11-01-2014'
ORDER BY MONTH(submit_date)
With MySQL you can use LIMIT
SELECT *
FROM data
WHERE submit_date > '12-01-2012' AND submit_date < '11-01-2014'
ORDER BY MONTH(submit_date)
LIMIT 0,1500;
You can do it almost like you have it, just add a UNION between your queries. But you still have to create 1 query per month.
Otherwise you need to enumerate the rows that are returned. You need to first order and enumerate your records, then you can do a select on that select to get only the top X. Not sure if you want to include the last month or not.
SET @prev_month := '', @incr := 0;
SELECT * FROM (
    -- restart the counter whenever the year-month changes
    SELECT IF(@prev_month = DATE_FORMAT(submit_date, '%Y-%m'), @incr := @incr + 1, @incr := 1) AS row_num,
           data.*,
           (@prev_month := DATE_FORMAT(submit_date, '%Y-%m')) AS set_prev_month
    FROM data WHERE submit_date BETWEEN '2012-12-01' AND '2014-11-30'
    ORDER BY submit_date
) tmp WHERE row_num <= 1500;
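The "enumerate, then filter" idea above can also be sketched with ROW_NUMBER() partitioned by month, which avoids relying on the evaluation order of user variables. This is a sketch under assumptions, not part of the answers above: it runs on an in-memory SQLite database via Python's sqlite3 module, strftime('%Y-%m', ...) stands in for MySQL's DATE_FORMAT(submit_date, '%Y-%m'), and the sample rows and the per-month limit of 2 are invented to keep the output small. ROW_NUMBER() is available in MySQL 8.0 and later.

```python
import sqlite3

def first_rows_per_month(rows, per_month):
    """Return up to `per_month` (id, submit_date) rows for each calendar month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, submit_date TEXT)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
    # ROW_NUMBER() restarts at 1 for every year-month partition, so filtering
    # on it keeps only the earliest rows of each month.
    query = """
        SELECT id, submit_date
        FROM (SELECT id, submit_date,
                     ROW_NUMBER() OVER (
                         PARTITION BY strftime('%Y-%m', submit_date)
                         ORDER BY submit_date, id
                     ) AS row_num
              FROM data
              WHERE submit_date BETWEEN '2012-12-01' AND '2014-11-30') AS numbered
        WHERE row_num <= ?
        ORDER BY submit_date, id
    """
    result = conn.execute(query, (per_month,)).fetchall()
    conn.close()
    return result

# Hypothetical rows: three in Dec 2012, one in Jan 2013, two in Feb 2013,
# and one outside the requested date range.
sample = [(1, "2012-12-03"), (2, "2012-12-10"), (3, "2012-12-20"),
          (4, "2013-01-05"), (5, "2013-02-01"), (6, "2013-02-15"),
          (7, "2014-12-01")]

print(first_rows_per_month(sample, 2))
```

For the original requirement, the limit would be 1500 instead of 2.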