I have a bunch of data stored in a table, each row ends with a ts column, type TIMESTAMP.
I want to get incremental counts up until a point, so for instance I have the following query:
SELECT YEARWEEK(ts), DATE(ts), COUNT(*) FROM `order` WHERE DATE(ts) >= '01/12/13' GROUP BY YEARWEEK(ts)
Which produces something like:
201346 20/11/2013 59
201347 24/11/2013 44
201348 01/12/2013 21
However I need a column that adds up the COUNTS up until that point, so I'd need something like:
201346 20/11/2013 59 59
201347 24/11/2013 44 103
201348 01/12/2013 21 124
How can I achieve this with MySQL? It's for a line graph, so I need to show that the numbers go up each week, and I can't do that with the current SQL statement.
SET @SUM:=0;
SELECT YEARWEEK(ts), DATE(ts), COUNT(*),(@SUM := @SUM+COUNT(*)) as CSUM
FROM orders WHERE DATE(ts) >= '01/12/13' GROUP BY YEARWEEK(ts)
courtesy this answer from Andomar
You can use user variables to get the running count:
set @total := 0;
select YEARWEEK(ts),
date(ts),
COUNT(*),
@total := @total + COUNT(*) as running_count
FROM `order`
WHERE date(ts) >= '01/12/13'
group by YEARWEEK(ts)
order by YEARWEEK(ts);
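On MySQL 8.0 or later, a window function can produce the same running count without user variables. The following is only a sketch under assumptions: it reuses the `order` table and `ts` column from the question, the ISO-format cutoff date is a placeholder, and the aliases (yw, first_day, weekly_count, weekly) are made up for illustration.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
-- Inner query: one row per week. Outer query: running total over those rows.
SELECT yw,
       first_day,
       weekly_count,
       SUM(weekly_count) OVER (ORDER BY yw) AS running_count
FROM (
    SELECT YEARWEEK(ts)  AS yw,
           MIN(DATE(ts)) AS first_day,     -- earliest date seen in that week
           COUNT(*)      AS weekly_count
    FROM `order`
    WHERE ts >= '2013-11-01'               -- assumed cutoff; adjust to your data
    GROUP BY YEARWEEK(ts)
) AS weekly
ORDER BY yw;
```

Because the running total is computed by the window function rather than by assignments inside the SELECT list, it does not depend on the order in which MySQL happens to evaluate expressions.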
I have a SQL table in which each row holds the revenue for a specific day. I want to add a new column whose value is the change in revenue (positive or negative) between that day and the previous day. How can I implement this in SQL?
Here is an example,
original table,
...
Day1 100
Day2 200
Day3 150
...
new table (the incremental column is added at the end; the first row can be assigned zero),
Day1 100 0
Day2 200 100
Day3 150 -50
I am using MySQL/MySQL Workbench.
thanks in advance,
Lin
-- assumes day is an integer day number; for a DATE column use a.day - INTERVAL 1 DAY
SELECT a.day, a.revenue, COALESCE(a.revenue - b.revenue, 0) AS diff_to_previous_day
FROM DailyRevenue a
LEFT JOIN DailyRevenue b ON b.day = a.day - 1
This query assumes that each day has exactly one row in the table. If there can be more than one row per day, first create a view that sums the revenue grouped by day, and run the query against that view.
If you're okay with re-ordering the columns slightly, something like this is pretty simple to understand:
SET @prev := 0;
SELECT day, revenue - @prev AS diff, @prev := revenue AS revenue
FROM revenue ORDER BY day ASC;
The trick is that we calculate the difference to the previous first, then set the previous to the current and display it as the current in one step.
Note, this depends on the order being correct since the calculations are done during the returning of the rows, so you need to make sure you have an ORDER BY clause that returns the days in the correct order.
Try;
select
t.date_col, t.val_col,
case when t1.val_col is null then 0
else t.val_col - t1.val_col end diff
from (
select t.* , @r := @r + 1 lev
from tbl t,
(select @r := 0) r
order by t.date_col
) t
left join (
select t.* , @r1 := @r1 + 1 lev
from tbl t,
(select @r1 := 1) r
order by t.date_col
) t1
on t.lev = t1.lev
This calculates the difference to the previous row even if a date is missing.
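For readers on MySQL 8.0 or later, the same day-over-day difference can be sketched with the LAG() window function. This assumes the DailyRevenue(day, revenue) table used in the join-based answer above; the alias diff is only illustrative.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT day,
       revenue,
       -- LAG(revenue) is NULL on the first row, so COALESCE turns that first difference into 0
       COALESCE(revenue - LAG(revenue) OVER (ORDER BY day), 0) AS diff
FROM DailyRevenue
ORDER BY day;
```

Like the row-numbering approach, this compares each row with the previous existing row, so a gap in the dates does not break it.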
I have a table output as follows:
Date Output
1-Jan 20
2-Jan 40
3-Jan 30
4-Jan 100
5-Jan 120
6-Jan 10
7-Jan 90
8-Jan 80
9-Jan 60
... till
31-Dec 120
I need to query the average for each date, where the average is the cumulative average of the values from the first date to the current date, as below:
Date Output Average
1-Jan 20 20
2-Jan 40 30
3-Jan 30 30
4-Jan 100 47.5
5-Jan 120 62
6-Jan 10 53.5
Can anyone help, please?
SELECT `date`, `output`,
(SELECT avg(`output`) from Table1 where Table1.`date` <= b.`date`)
as `average` FROM Table1 b
sqlfiddle here
Axel's answer works, alternatively, you can do it in a single query, with variables:
set @count := 0;
set @total := 0;
select case when ((@count := @count + 1) and ((@total := @total + output) or 1))
then @total / @count
end rolling_average,
`date`,
`output`
from data
order by `date` asc
http://sqlfiddle.com/#!9/2e006/14
This avoids the dependent subquery, which depending on the size of your data may result in better performance.
Pala's idea is a good one. However, in addition to lacking the order by, it fails if the cumulative sum is ever zero or if output is ever NULL. This can easily be fixed:
select `date`, `output`,
if((@count := @count + 1) is not null,
if((@total := @total + coalesce(output, 0)) is not null,
@total/@count, 0
), 0
) as running_average
from data cross join
(select @count := 0, @total := 0) init
order by date;
Here's another way, although Pala's method scales better...
SELECT x.*
, AVG(y.output) avg
FROM output x
JOIN output y
ON y.date <= x.date
GROUP
BY x.date
ORDER
BY x.date;
The order by clause is apparently necessary post version 5.5/5.6
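On MySQL 8.0 or later, the cumulative average can also be sketched with a window function instead of variables or a self-join. This assumes the Table1 table with `date` and `output` columns from the first answer, and that `date` is a real DATE column so that it sorts chronologically.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT `date`,
       `output`,
       -- average of all rows from the first date up to and including the current row
       AVG(`output`) OVER (ORDER BY `date`
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_average
FROM Table1
ORDER BY `date`;
```

Note that AVG() skips NULL values entirely, whereas the variable-based version with coalesce(output, 0) counts a NULL row as zero, so the two can differ when output contains NULLs.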
How can I get the date for the latest value change in one column with one SQL query?
Possible database situation:
Date State
2012-11-25 state one
2012-11-26 state one
2012-11-27 state two
2012-11-28 state two
2012-11-29 state one
2012-11-30 state one
So result should return 2012-11-29 as latest change state. If I group by State value, I will get the date for first time I have that state in database.
The query will group the table on state and show the state and in the date field the latest date created of that state.
From the given input the output would be
Date State
2012-11-30 state one
2012-11-28 state two
This will get you the last state:
-- Query 1
SELECT state
FROM tableX
ORDER BY date DESC
LIMIT 1 ;
Encapsulating the above, we can use it to get the date just before the last change:
-- Query 2
SELECT t.date
FROM tableX AS t
JOIN
( SELECT state
FROM tableX
ORDER BY date DESC
LIMIT 1
) AS last
ON last.state <> t.state
ORDER BY t.date DESC
LIMIT 1 ;
And then use that to find the date (or the whole row) where the last change occurred:
-- Query 3
SELECT a.date -- can also be used: a.*
FROM tableX AS a
JOIN
( SELECT t.date
FROM tableX AS t
JOIN
( SELECT state
FROM tableX
ORDER BY date DESC
LIMIT 1
) AS last
ON last.state <> t.state
ORDER BY t.date DESC
LIMIT 1
) AS b
ON a.date > b.date
ORDER BY a.date
LIMIT 1 ;
Tested in SQL-Fiddle
And a solution that uses MySQL variables:
-- Query 4
SELECT date
FROM
( SELECT t.date
, @r := (@s <> state) AS result
, @s := state AS prev_state
FROM tableX AS t
CROSS JOIN
( SELECT @r := 0, @s := ''
) AS dummy
ORDER BY t.date ASC
) AS tmp
WHERE result = 1
ORDER BY date DESC
LIMIT 1 ;
I believe this is the answer:
SELECT
DISTINCT State AS State, `Date`
FROM
Table_1 t1
WHERE t1.`Date`=(SELECT MAX(`Date`) FROM Table_1 WHERE State=t1.State)
...and the test:
http://sqlfiddle.com/#!2/8b0d8/5
If you add another column 'changed' of type DATETIME, you can fill it using an update trigger that sets it to NOW(). If you then query your table ordered by the changed column in descending order, the most recently changed row will end up first.
DELIMITER $$
CREATE TRIGGER `trigger` BEFORE UPDATE ON `table`
FOR EACH ROW
BEGIN
SET NEW.changed = NOW();
END$$
DELIMITER ;
Try this ::
Select
MAX(`Date`), state from mytable
group by state
If you had been using PostgreSQL, you could compare different rows in the same table using "LEAD .. OVER". I have not managed to find the same functionality in MySQL.
A bit hairy, but I think this will do:
select min(t1.date) from table_1 t1 where
(select count(distinct state) from table_1 where table_1.date>=t1.date)=1
Basically, this asks for the earliest date after which no further change in state is found. Be warned: this query may scale terribly for large data sets.
I think your best choice here are analytical functions. Try this - it should be OK performance-wise:
SELECT *
FROM test
WHERE my_date = (SELECT MAX (my_date)
FROM (SELECT MY_DATE
FROM ( SELECT MY_DATE,
STATE,
LAG (state) OVER (ORDER BY MY_DATE)
lag_val
FROM test
ORDER BY MY_DATE) a
WHERE state != lag_val))
In the inner select, the LAG function gets the previous value in the STATE column and in the outer select I mark the date of a change - those with lag value different than the current state value. And outside, I'm getting the latest date from those dates of a change... I hope that this is what you needed.
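The same LAG-based idea can be sketched in MySQL syntax for version 8.0 or later, where window functions are available. This assumes the tableX table with `date` and state columns used in the earlier queries; the aliases prev_state, last_change and t are illustrative.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT MAX(`date`) AS last_change
FROM (
    SELECT `date`,
           state,
           LAG(state) OVER (ORDER BY `date`) AS prev_state  -- state of the previous row
    FROM tableX
) AS t
-- keep only rows where the state differs from the previous row;
-- the first row has prev_state = NULL and is filtered out by this comparison
WHERE state <> prev_state;
```

With the sample data from the question, the rows that survive the filter are 2012-11-27 and 2012-11-29, so the query returns 2012-11-29.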
SELECT MAX(DATE) FROM YOUR_TABLE
Above answer doesn't seem to satisfy what OP needs.
UPDATED ANSWER WITH AFTER INSERT/UPDATE TRIGGER
-- User variables need no DECLARE in MySQL; initialise them with SET.
SET @latestState = NULL;
SET @latestDate = NULL;
DELIMITER $$
CREATE TRIGGER latestInsertTrigger AFTER INSERT ON myTable
FOR EACH ROW
BEGIN
-- OLD is not available in an INSERT trigger, so every new row is recorded.
SET @latestState = NEW.state;
SET @latestDate = NEW.date;
END$$
CREATE TRIGGER latestUpdateTrigger AFTER UPDATE ON myTable
FOR EACH ROW
BEGIN
IF OLD.date = NEW.date AND OLD.state <> NEW.state THEN
SET @latestState = NEW.state;
SET @latestDate = NEW.date;
END IF;
END$$
DELIMITER ;
You may use the following query to get the latest record added/updated:
SELECT DATE, STATE FROM myTable
WHERE STATE = @latestState
OR DATE = @latestDate
ORDER BY DATE DESC
;
Results:
DATE STATE
November, 30 2012 00:00:00+0000 state one
November, 28 2012 00:00:00+0000 state two
November, 27 2012 00:00:00+0000 state two
The above query results need to be limited to 2, 3 or n rows, depending on what you need.
Frankly, judging by the sample data you have given, it seems you want the max of both columns, assuming that the state only increases with the date. I only wish the state were an integer :D
Then a union of two max subqueries, one per column, would have solved it easily. A string-manipulation regex could still find the max in the state column. Finally, this approach needs LIMIT x, and it still has a loophole. Anyway, it took me some time to figure out what you need :$
I need to get data from December 2012 to November 2014.
Each month I only need 1500 rows.
For example:
SELECT * FROM data WHERE YEAR(submit_date) = 2012 AND MONTH(submit_date) = 12 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 1 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 2 limit 1500;
SELECT * FROM data WHERE YEAR(submit_date) = 2013 AND MONTH(submit_date) = 3 limit 1500;
and until Nov 2014
Is there a way to write this as a shorter SQL query?
There are some options listed here: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
IMHO one of the best is using a row-counter:
set @num := 0, @type := '';
select id, name, submit_date,
@num := if(@type = CONCAT(YEAR(submit_date), MONTH(submit_date)), @num + 1, 1) as row_number,
@type := CONCAT(YEAR(submit_date), MONTH(submit_date)) as dummy
from data force index(IX_submit_date)
group by id, name, submit_date
having row_number <= 2;
You can test it here: http://sqlfiddle.com/#!2/e829c/13 (I do a cut for 2 elements, not for 1500)
I think you're looking for a GROUP BY clause. I would need to know a bit more to give you a definitive answer, but the following pseudo-query might guide you in the right direction.
SELECT *, SUM(some_field)
FROM data
GROUP BY MONTH(submit_date)
Or if you only need 1500 rows, select the top 1500 ordered by the date
SELECT TOP(1500) *
FROM data
WHERE submit_date > '12-01-2012' AND submit_date < '11-01-2014'
ORDER BY MONTH(submit_date)
With MySQL you can use LIMIT
SELECT *
FROM data
WHERE submit_date > '12-01-2012' AND submit_date < '11-01-2014'
ORDER BY MONTH(submit_date)
LIMIT 0,1500;
You can do it almost like you have it, just add a UNION between your queries. But you still have to create 1 query per month.
Otherwise you need to enumerate the rows that are returned. You need to first order and enumerate your records, then you can do a select on that select to get only the top X. Not sure if you want to include the last month or not.
SET @prev_month := '', @incr := 0;
SELECT * FROM (
-- compare by month so that the counter restarts once per month
SELECT IF(@prev_month = DATE_FORMAT(submit_date, '%Y-%m'), @incr := @incr+1, @incr := 1) AS row_num,
data.*,
(@prev_month := DATE_FORMAT(submit_date, '%Y-%m')) AS set_prev_month
FROM data WHERE submit_date BETWEEN "2012-12-01" AND "2014-11-30"
ORDER BY submit_date
) tmp WHERE row_num <= 1500;
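On MySQL 8.0 or later, the per-month row limit can be sketched with ROW_NUMBER() partitioned by month, which avoids user variables altogether. This assumes the data table and submit_date column from the question; the aliases d, rn and numbered are illustrative, and which 1500 rows are kept within each month is decided by the ORDER BY inside the window.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT *
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (
               PARTITION BY YEAR(submit_date), MONTH(submit_date)  -- numbering restarts each month
               ORDER BY submit_date
           ) AS rn
    FROM data AS d
    WHERE submit_date >= '2012-12-01'
      AND submit_date <  '2014-12-01'   -- half-open range covers all of November 2014
) AS numbered
WHERE rn <= 1500;
```

The half-open date range is a deliberate choice: it also includes rows late on the last day of November if submit_date carries a time component.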
I wanted to get the latest 4 dates for each symbolid. I adapted the code here as follows:
set @num := 0, @symbolid := '';
select symbolid, date,
@num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
@symbolid := symbolid as dummy
from projections
group by symbolid, date desc
having row_number < 5
and get the following results:
symbolid date row_number dummy
1 '2011-09-01 00:00:00' 1 1
1 '2011-08-31 00:00:00' 3 1
1 '2011-08-30 00:00:00' 5 1
2 '2011-09-01 00:00:00' 1 2
2 '2011-08-31 00:00:00' 3 2
2 '2011-08-30 00:00:00' 5 2
3 '2011-09-01 00:00:00' 1 3
3 '2011-08-31 00:00:00' 3 3
3 '2011-08-30 00:00:00' 5 3
4 '2011-09-01 00:00:00' 1 4
...
The obvious question is, why did I only get 3 rows per symbolid, and why are they numbered 1,3,5? A few details:
I tried both forcing an index and not (as seen here), and got the same results both ways.
The dates are correct, i.e., the listing correctly shows the top 3 dates per symbolid, but the row_number value is off
When I don't use the "having" statement, the row numbers are correct, i.e., the most recent date is 1, the next most recent is 2, etc
Obviously the row_number computed field is being affected by the "having" clause, but I don't know how to fix it.
I realize that I could just change the "having" to "having row_number < 7" (6 gives the same as 5), but it's very ugly and would like to know what to do to make it "behave".
I'm not 100% sure why it behaves this way (maybe it's because logically SELECT is processed prior to ORDER BY), but it should work as expected:
SELECT *
FROM
(
select symbolid, date,
@num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
@symbolid := symbolid as dummy
from projections
INNER JOIN (SELECT @symbolid:=0)c
INNER JOIN (SELECT @num:=0)d
group by symbolid, date desc
) a
WHERE row_number < 5
User-defined variables do not work reliably here (refer here):
As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.
Here is my proposal
select symbolid,
substring_index(group_concat(date order by date desc), ',', 4) as last_4_dates
from projections
group by symbolid
The drawback of this approach is that it collapses the dates into a single comma-separated string per symbolid,
and you need to split that string before you can actually use the dates.
Final code:
set @num := 0, @symbolid := '';
select d.* from
(
select symbolid, date,
@num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
@symbolid := symbolid as dummy
from projections
order by symbolid, date desc
) d
where d.row_number < 5
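Given the documented warning about reading and assigning a user variable in the same statement, a variable-free alternative may be worth considering on MySQL 8.0 or later. The following is a sketch that assumes the projections table with symbolid and `date` columns from the question. The alias rn is used instead of row_number because ROW_NUMBER is a reserved word in MySQL 8.0.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT symbolid, `date`
FROM (
    SELECT symbolid,
           `date`,
           -- 1 = most recent date for this symbolid, 2 = next most recent, and so on
           ROW_NUMBER() OVER (PARTITION BY symbolid ORDER BY `date` DESC) AS rn
    FROM projections
) AS ranked
WHERE rn <= 4
ORDER BY symbolid, `date` DESC;
```

Here the numbering is defined by the window's own PARTITION BY and ORDER BY, so adding or removing a HAVING, GROUP BY or outer ORDER BY does not change the row numbers.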