SQL Query Conditional accumulation - mysql

Is it possible to display accumulated data, resetting the count based on a condition?
I would like a query that increments a counter while the number column holds the value 1 and restarts the count on any other value, as shown in the cumulative_with_condition column below.
+----+------------+--------+
| id | release | number |
+----+------------+--------+
| 1 | 2016-07-08 | 4 |
| 2 | 2016-07-09 | 1 |
| 3 | 2016-07-10 | 1 |
| 4 | 2016-07-12 | 2 |
| 5 | 2016-07-13 | 1 |
| 6 | 2016-07-14 | 1 |
| 7 | 2016-07-15 | 1 |
| 8 | 2016-07-16 | 2-3 |
| 9 | 2016-07-17 | 3 |
| 10 | 2016-07-18 | 1 |
+----+------------+--------+
select * from version where id > 1 and id < 9;
+----+------------+--------+---------------------------+
| id | release | number | cumulative_with_condition |
+----+------------+--------+---------------------------+
| 2 | 2016-07-09 | 1 | 1 |
| 3 | 2016-07-10 | 1 | 2 |
| 4 | 2016-07-12 | 2 | 0 |
| 5 | 2016-07-13 | 1 | 1 |
| 6 | 2016-07-14 | 1 | 2 |
| 7 | 2016-07-15 | 1 | 3 |
| 8 | 2016-07-16 | 2-3 | 0 |
+----+------------+--------+---------------------------+

You want something like row_number() (not exactly, but like that). You can do that using variables:
select t.*,
(@rn := if(number = 1, @rn + 1,
if(@n := number, 0, 0)
)
) as cumulative_with_condition
from t cross join
(select @n := '', @rn := 0) params
order by t.id;

As an alternative to using user variables, as demonstrated by Gordon Linoff, in this case it's also possible to self-join, group and count:
SELECT t.id, t.release, t.number, COUNT(version.id) AS cumulative_with_condition
FROM version RIGHT JOIN (
SELECT highs.*, MAX(lows.id) min
FROM version lows RIGHT JOIN version highs ON lows.id <= highs.id
WHERE lows.number <> '1'
GROUP BY highs.id
) t ON version.id > t.min AND version.id <= t.id
WHERE t.id > 1 AND t.id < 9
GROUP BY t.id
See it on sqlfiddle.
But, frankly, neither approach is particularly elegant—as I commented previously, you're probably best off implementing this within your application code.
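For the application-code route, a minimal Python sketch might look like the following. It assumes the rows have already been fetched in id order as dictionaries with a string-valued number field; the function name and the row shape are illustrative, not part of the original question.

```python
def cumulative_with_condition(rows):
    """Yield (row, count): count grows while number == '1', resets to 0 otherwise."""
    count = 0
    for row in rows:
        # Assumption: number is stored as text, since the data contains '2-3'.
        count = count + 1 if row["number"] == "1" else 0
        yield row, count


# Rows with 1 < id < 9 from the question, already ordered by id.
rows = [
    {"id": 2, "number": "1"},
    {"id": 3, "number": "1"},
    {"id": 4, "number": "2"},
    {"id": 5, "number": "1"},
    {"id": 6, "number": "1"},
    {"id": 7, "number": "1"},
    {"id": 8, "number": "2-3"},
]
counts = [count for _, count in cumulative_with_condition(rows)]
print(counts)
```

The counter lives in a single local variable, so no self-join or user variable is needed; the only requirement is that the rows arrive in the intended order.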

Related

Adding a moving average column to a table using values from previous 2 entries

I currently have the following simplified tables in my database. The points table contains rows of points awarded to each user for every bid form they have voted in.
I would like to add a column to this table that shows, for each row, the AVERAGE of the previous TWO points awarded to THAT user.
Users
+----+----------------------+
| id | name |
+----+----------------------+
| 1 | Flossie Schamberger |
| 2 | Lawson Graham |
| 3 | Hadley Reilly |
+----+----------------------+
Bid Forms
+----+-----------------+
| id | name |
+----+-----------------+
| 1 | Summer 2017 |
| 2 | Winter 2017 |
| 3 | Summer 2018 |
| 4 | Winter 2019 |
| 5 | Summer 2019 |
+----+-----------------+
Points
+-----+---------+--------------------+------------+------------+
| id | user_id | leave_bid_forms_id | bid_points | date |
+-----+---------+--------------------+------------+------------+
| 1 | 1 | 1 | 6 | 2016-06-19 |
| 2 | 2 | 1 | 8 | 2016-06-19 |
| 3 | 3 | 1 | 10 | 2016-06-19 |
| 4 | 1 | 2 | 4 | 2016-12-18 |
| 5 | 2 | 2 | 8 | 2016-12-18 |
| 6 | 3 | 2 | 4 | 2016-12-18 |
| 7 | 1 | 3 | 10 | 2017-06-18 |
| 8 | 2 | 3 | 12 | 2017-06-18 |
| 9 | 3 | 3 | 4 | 2017-06-18 |
| 10 | 1 | 4 | 4 | 2017-12-17 |
| 11 | 2 | 4 | 4 | 2017-12-17 |
| 12 | 3 | 4 | 2 | 2017-12-17 |
| 13 | 1 | 5 | 16 | 2018-06-17 |
| 14 | 2 | 5 | 12 | 2018-06-17 |
| 15 | 3 | 5 | 10 | 2018-06-17 |
+-----+---------+--------------------+------------+------------+
For each row in the points table I would like an average_points column to be calculated as follows.
The average_points column is the average of that user's PREVIOUS 2 points. So for the first entry in the table for each user, the average is 0 because no previous points were awarded to them.
The previous 2 points for each user should be determined using the date column.
The table below is what I would like to have as the final output.
For clarity, to the side of the table, I have added the calculation and numbers used to arrive at the value in the averaged_points column.
+-----+---------+--------------------+------------+-----------------+
| id | user_id | leave_bid_forms_id | date | averaged_points |
+-----+---------+--------------------+------------+-----------------+
| 1 | 1 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 2 | 2 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 3 | 3 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 4 | 1 | 2 | 2016-12-18 | 3 | ( 6 + 0 ) / 2
| 5 | 2 | 2 | 2016-12-18 | 4 | ( 8 + 0 ) / 2
| 6 | 3 | 2 | 2016-12-18 | 5 | ( 10 + 0) / 2
| 7 | 1 | 3 | 2017-06-18 | 5 | ( 4 + 6 ) / 2
| 8 | 2 | 3 | 2017-06-18 | 8 | ( 8 + 8 ) / 2
| 9 | 3 | 3 | 2017-06-18 | 7 | ( 4 + 10) / 2
| 10 | 1 | 4 | 2017-12-17 | 7 | ( 10 + 4) / 2
| 11 | 2 | 4 | 2017-12-17 | 10 | ( 12 + 8) / 2
| 12 | 3 | 4 | 2017-12-17 | 4 | ( 4 + 4 ) / 2
| 13 | 1 | 5 | 2018-06-17 | 7 | ( 4 + 10) / 2
| 14 | 2 | 5 | 2018-06-17 | 8 | ( 4 + 12) / 2
| 15 | 3 | 5 | 2018-06-17 | 3 | ( 2 + 4 ) / 2
+-----+---------+--------------------+------------+-----------------+
I've been trying to use subqueries to solve this issue as AVG doesn't seem to be affected by any LIMIT clause I have.
So far I have come up with
select id, user_id, leave_bid_forms_id, `date`,
(
SELECT
AVG(bid_points)
FROM (
Select `bid_points`
FROM points as p2
ORDER BY p2.date DESC
Limit 2
) as thing
) AS average_points
from points as p1
This is in this sqlfiddle but to be honest I'm out of my depth here.
Am I on the right path? Wondering if someone would be able to show me where I need to tweak things please!
Thanks.
EDIT
Using the answer below as a basis I was able to tweak the SQL to work with the tables provided in the original sqlfiddle.
I have added that to this sqlfiddle to show it working
The corrected sql to match the code above is
select p.*,
IFNULL(( (coalesce(points_1, 0) + coalesce(points_2, 0)) /
( (points_1 is not null) + (points_2 is not null) )
),0) as prev_2_avg
from (select p.*,
(select p2.bid_points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1
) as points_1,
(select p2.bid_points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1, 1
) as points_2
from points as p
) p;
I am about to ask another question, though, about the best way to make the number of previous points that are averaged dynamic.
You can use window functions, which were introduced in MySQL 8.
select p.*,
avg(points) over (partition by user_id
order by date
rows between 2 preceding and 1 preceding
) as prev_2_avg
from p;
In earlier versions, this is a real pain, because MySQL does not support nested correlation clauses. One method is with a separate column for each one:
select p.*,
( (coalesce(points_1, 0) + coalesce(points_2, 0)) /
( (points_1 is not null) + (points_2 is not null) )
) as prev_2_avg
from (select p.*,
(select p2.points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1
) as points_1,
(select p2.points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1, 1
) as points_2
from p
) p;
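If the number of previous entries needs to be configurable, one option is to compute the column outside the database. The sketch below is only an illustration: it assumes the rows are available as dictionaries, and it follows the convention of the question's expected output, where missing previous entries count as 0 and the sum is always divided by n (the SQL above instead divides by the number of previous rows that actually exist). The function name previous_n_average is hypothetical.

```python
from collections import defaultdict, deque


def previous_n_average(points, n=2):
    """Return (id, average of that user's previous n bid_points), padding with zeros."""
    # One bounded history per user; deque(maxlen=n) drops the oldest entry itself.
    history = defaultdict(lambda: deque(maxlen=n))
    result = []
    # ISO dates sort correctly as strings; id breaks ties on the same date.
    for row in sorted(points, key=lambda r: (r["date"], r["id"])):
        previous = history[row["user_id"]]
        result.append((row["id"], sum(previous) / n))
        previous.append(row["bid_points"])
    return result


# A subset of the question's points table (users 1 and 2, first three bid forms).
points = [
    {"id": 1, "user_id": 1, "bid_points": 6, "date": "2016-06-19"},
    {"id": 2, "user_id": 2, "bid_points": 8, "date": "2016-06-19"},
    {"id": 4, "user_id": 1, "bid_points": 4, "date": "2016-12-18"},
    {"id": 5, "user_id": 2, "bid_points": 8, "date": "2016-12-18"},
    {"id": 7, "user_id": 1, "bid_points": 10, "date": "2017-06-18"},
    {"id": 8, "user_id": 2, "bid_points": 12, "date": "2017-06-18"},
]
averages = previous_n_average(points)
print(averages)
```

Changing n widens or narrows the window without touching the rest of the logic.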

mysql sequence number by value column (query UPDATE)

example:
I have a table with the columns
______________________
|field_id|Code|seq_num|
| 1 | a | 1 |
| 1 | a | 2 |
| 1 | a | 3 |
| 2 | a | 4 |
| 2 | a | 5 |
| 3 | a | 6 |
| 3 | a | 7 |
| 3 | a | 8 |
How can I query it so that the sequence number looks like this?
_____________________
|field_id|Code|seq_num|
| 1 | a | 1 |
| 1 | a | 2 |
| 1 | a | 3 |
| 2 | a | 1 |
| 2 | a | 2 |
| 3 | a | 1 |
| 3 | a | 2 |
| 3 | a | 3 |
please help!!
One method is to get the minimum sequence for the field:
select t.field_id, t.code,
(t.seq_num - f.min_seq_num + 1) as seqnum
from t join
(select field_id, min(seq_num) as min_seq_num
from t
group by field_id
) f
on t.field_id = f.field_id;
You can also do this using variables, if you don't trust the current sequence numbers to have no gaps:
select . . .,
(@rn := if(@f = field_id, @rn + 1,
if(@f := field_id, 1, 1)
)
) as seq_no
from (select t.*
from t
order by field_id, seq_num
) t cross join
(select @f := '', @rn := 0) params;
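The same renumbering can be sketched in application code, which avoids relying on the evaluation order of user variables. This is a hedged illustration: the tuple layout (field_id, code, seq_num) and the function name renumber are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def renumber(rows):
    """Restart the sequence at 1 for every field_id, keeping the old relative order."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # by field_id, then old seq_num
    out = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        # enumerate ignores gaps in the old numbering.
        for seq, (field_id, code, _old_seq) in enumerate(group, start=1):
            out.append((field_id, code, seq))
    return out


rows = [
    (1, "a", 1), (1, "a", 2), (1, "a", 3),
    (2, "a", 4), (2, "a", 5),
    (3, "a", 6), (3, "a", 7), (3, "a", 8),
]
renumbered = renumber(rows)
print(renumbered)
```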

Row counter per Column

Say I have a table like so
| id | user_id | event_id | created_at |
|----|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 2 | 6 | 7 | 2015-01-02 |
| 3 | 3 | 8 | 2015-01-01 |
| 4 | 5 | 9 | 2015-01-04 |
| 5 | 5 | 10 | 2015-01-02 |
| 6 | 6 | 1 | 2015-01-01 |
I want to be able to generate a counter of events per user. So my result would be:
| counter | user_id | event_id | created_at |
|---------|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 1 | 6 | 7 | 2015-01-02 |
| 1 | 3 | 8 | 2015-01-01 |
| 2 | 5 | 9 | 2015-01-04 |
| 3 | 5 | 10 | 2015-01-02 |
| 2 | 6 | 1 | 2015-01-01 |
One idea is to self-join the table and group by to replicate the row_number() over (...) function available in other RDBMSs.
Check this Rextester Demo and see second query, to understand how inner join works in this case.
select t1.user_id,
t1.event_id,
t1.created_at,
count(*) as counter
from your_table t1
inner join your_table t2
on t1.user_id=t2.user_id
and t1.id>=t2.id
group by t1.user_id,
t1.event_id,
t1.created_at
order by t1.user_id,t1.event_id;
Output:
+---------+----------+------------+---------+
| user_id | event_id | created_at | counter |
+---------+----------+------------+---------+
| 3 | 8 | 01-01-2015 | 1 |
| 5 | 10 | 01-01-2015 | 1 |
| 5 | 10 | 02-01-2015 | 3 |
| 5 | 9 | 04-01-2015 | 2 |
| 6 | 1 | 01-01-2015 | 2 |
| 6 | 7 | 02-01-2015 | 1 |
+---------+----------+------------+---------+
Try the following:
select counter,
xx.user_id,
xx.event_id,
xx.created_at
from xx
join (select a.id,
a.user_id,
count(*) as counter
from xx as a
join xx as b
on a.user_id=b.user_id
and b.id<=a.id
group by 1,2) as counts
on xx.id=counts.id
Use a join to generate rows for each id with all the other lower ids for that user below it and count them.
Try this one:
A correlated subquery will help to get this result.
select (select count(*) from user_event iue where iue.user_id = oue.user_id and iue.id <= oue.id) as counter,
oue.user_id,
oue.event_id,
oue.created_at
from user_event oue
You could try to use a variable as a table, cross join it with the source table and reset whenever user id changes.
SELECT @counter := CASE
WHEN @user = user_id THEN @counter + 1
ELSE 1
END AS counter,
@user := user_id AS user_id,
event_id,
created_at
FROM your_table m,
(SELECT @counter := 0,
@user := '') AS t
ORDER BY user_id;
I've created a demo here
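For comparison, the per-user counter is a few lines in application code. The sketch below assumes the rows are dictionaries and that the counter should follow id order, which matches the expected result in the question; the function name with_counter is illustrative.

```python
from collections import Counter


def with_counter(rows):
    """Attach a running per-user event counter, processing rows in id order."""
    seen = Counter()
    out = []
    for row in sorted(rows, key=lambda r: r["id"]):
        seen[row["user_id"]] += 1
        out.append({"counter": seen[row["user_id"]], **row})
    return out


rows = [
    {"id": 1, "user_id": 5, "event_id": 10, "created_at": "2015-01-01"},
    {"id": 2, "user_id": 6, "event_id": 7, "created_at": "2015-01-02"},
    {"id": 3, "user_id": 3, "event_id": 8, "created_at": "2015-01-01"},
    {"id": 4, "user_id": 5, "event_id": 9, "created_at": "2015-01-04"},
    {"id": 5, "user_id": 5, "event_id": 10, "created_at": "2015-01-02"},
    {"id": 6, "user_id": 6, "event_id": 1, "created_at": "2015-01-01"},
]
counted = with_counter(rows)
print([r["counter"] for r in counted])
```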

select sum of max values from multiple incrementing sequences

I want to calculate a sum of the max values from sequence of increment values.
for this data set:
time_stamp count
1467820429 6 *
1467820428 5
1467820427 4
1467820426 3
1467820416 2
1467820415 1
1467820413 0
1467820412 3 *
1467820411 2
1467820409 1
1467820408 0
1467820405 1 *
1467820404 0
1467820400 5 *
answer = 6 + 3 + 1 + 5 = 15
How can I write a MySQL-compatible SQL statement to achieve this?
As I mentioned in the comments, there is no efficient way to do this in MySQL, at least to my knowledge.
Try this
SELECT Sum(CASE
WHEN `count` >= prev_cnt THEN `count`
ELSE 0
END)
FROM (SELECT *,
IFnull((SELECT `count`
FROM yourtable b
WHERE a.`time_stamp` < b.`time_stamp`
ORDER BY `time_stamp` LIMIT 1), `count`) AS prev_cnt
FROM yourtable a) c
You can get it with the following method:
mysql> select time_stamp,count,if (count=0,@curRank :=0,@curRank := @curRank + 1) as rank from ff,(SELECT @curRank := 0) r;
+------------+-------+------+
| time_stamp | count | rank |
+------------+-------+------+
| 1467820429 | 6 | 1 |
| 1467820428 | 5 | 2 |
| 1467820427 | 4 | 3 |
| 1467820426 | 3 | 4 |
| 1467820415 | 2 | 5 |
| 1467820415 | 1 | 6 |
| 1467820413 | 0 | 0 |
| 1467820412 | 3 | 1 |
| 1467820411 | 2 | 2 |
| 1467820409 | 1 | 3 |
| 1467820408 | 0 | 0 |
| 1467820405 | 1 | 1 |
| 1467820404 | 0 | 0 |
| 1467820408 | 5 | 1 |
+------------+-------+------+
14 rows in set (0.00 sec)
mysql> SELECT * FROM (select time_stamp,count,if (count=0,@curRank :=0,@curRank := @curRank + 1) as rank from ff,(SELECT @curRank := 0) r) t WHERE rank=1;
+------------+-------+------+
| time_stamp | count | rank |
+------------+-------+------+
| 1467820408 | 5 | 1 |
| 1467820412 | 3 | 1 |
| 1467820429 | 6 | 1 |
| 1467820405 | 1 | 1 |
+------------+-------+------+
4 rows in set (0.00 sec)
mysql> SELECT sum(count) as total FROM
(select time_stamp,count,if (count=0,@curRank :=0,@curRank := @curRank + 1) as rank from ff,
(SELECT @curRank := 0) r) t WHERE rank=1;
+-------+
| total |
+-------+
| 15 |
+-------+
1 row in set (0.00 sec)
You can get it with a simple inner query:
SELECT SUM(a.cnt)
FROM
( SELECT x.*
, MIN(y.time_stamp) next
FROM my_table x
LEFT
JOIN my_table y
ON y.time_stamp > x.time_stamp
GROUP
BY x.time_stamp
) a
LEFT
JOIN my_table b
ON b.time_stamp = a.next
AND b.cnt > a.cnt
WHERE b.cnt IS NULL;
You need to identify when a value shifts. One way to get the previous value uses variables:
select sum(count)
from (select t.*,
(if((@old_c := @c) is null, 0, -- never happens
if((@c := count) is not null, @old_c, @old_c)
)
) as prev_count
from t cross join
(select @c := -1) params
order by time_stamp
) t
where prev_count >= count;
The expression for getting the previous count is a bit complicated. MySQL does not guarantee the order of evaluation of expressions, so the assignment of the new value of count and returning the old value needs to be in a single expression.
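The idea of detecting where the value shifts can also be sketched outside SQL. The Python below is an illustration rather than a translation of the query above: it assumes the samples are (time_stamp, count) tuples and treats a row as the end of a sequence when the next row in time has a lower count, or when it is the last row. The function name sum_of_peaks is hypothetical.

```python
def sum_of_peaks(samples):
    """Sum the final (maximum) count of each increasing run, ordered by time_stamp."""
    ordered = sorted(samples)  # tuples sort by time_stamp first
    total = 0
    # Pair every row with its successor; the last row is paired with None.
    for (_, count), following in zip(ordered, ordered[1:] + [None]):
        if following is None or following[1] < count:
            total += count
    return total


# The data set from the question.
samples = [
    (1467820429, 6), (1467820428, 5), (1467820427, 4), (1467820426, 3),
    (1467820416, 2), (1467820415, 1), (1467820413, 0), (1467820412, 3),
    (1467820411, 2), (1467820409, 1), (1467820408, 0), (1467820405, 1),
    (1467820404, 0), (1467820400, 5),
]
total = sum_of_peaks(samples)
print(total)  # 15
```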
You need GROUP BY and HAVING, like this:
select sum ( count )
from table
group by time_stamp
having count = max(count)
A very simple solution:
1- take the lag of the time_stamp column
2- take the difference between the original time_stamp column and the lag column
3- sum the values of count after filtering out the records where the difference is -1
+------------+-------+------+-----------------+-------+
| a | b | c | d | a-d |
| time_stamp | count | flag | lag_time_stamp | diff |
| 1467820429 | 6 | * | null | null |
| 1467820428 | 5 | | 1467820429 | -1 |
| 1467820427 | 4 | | 1467820428 | -1 |
| 1467820426 | 3 | | 1467820427 | -1 |
| 1467820416 | 2 | * | 1467820426 | -10 |
| 1467820415 | 1 | | 1467820416 | -1 |
| 1467820413 | 3 | * | 1467820415 | -2 |
| 1467820412 | 3 | | 1467820413 | -1 |
| 1467820411 | 2 | | 1467820412 | -1 |
| 1467820409 | 1 | * | 1467820411 | -2 |
| 1467820408 | 0 | | 1467820409 | -1 |
| 1467820405 | 1 | * | 1467820408 | -3 |
| 1467820404 | 0 | | 1467820405 | -1 |
| 1467820400 | 5 | * | 1467820404 | -4 |
+------------+-------+------+-----------------+-------+
--sum the values of the table that we got after filtering the records for -1
+------------+-------+
| time_stamp | count |
+------------+-------+
| 1467820429 | 6 |
| 1467820416 | 2 |
| 1467820413 | 3 |
| 1467820409 | 1 |
| 1467820405 | 1 |
| 1467820400 | 5 |
+------------+-------+

Select most recent MAX() and MIN() - WebSQL

I'm building an exercises web app and I'm working with two tables like this:
Table 1: weekly_stats
| id | code | type | date | time |
|----|--------------|--------------------|------------|----------|
| 1 | CC | 1 | 2015-02-04 | 19:15:00 |
| 2 | CC | 2 | 2015-01-28 | 19:15:00 |
| 3 | CPC | 1 | 2015-01-26 | 19:15:00 |
| 4 | CPC | 1 | 2015-01-25 | 19:15:00 |
| 5 | CP | 1 | 2015-01-24 | 19:15:00 |
| 6 | CC | 1 | 2015-01-23 | 19:15:00 |
| .. | ... | ... | ... | ... |
Table 2: global_stats
| id | exercise_number |correct | wrong |
|----|-----------------|--------|-----------|
| 1 | 138 | 1 | 0 |
| 2 | 246 | 1 | 0 |
| 3 | 988 | 1 | 10 |
| 4 | 13 | 5 | 0 |
| 5 | 5 | 4 | 7 |
| 6 | 5 | 4 | 7 |
| .. | ... | ... | ... |
What I would like is to get MAX(correct-wrong) and MIN(correct-wrong), and I'm currently working with this query:
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
UNION
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats);
This query works well, except for one thing: when "WHERE correct - wrong = (SELECT MIN(correct - wrong)[...]" matches more than one row, the row returned is the first one, but I would like the most recent one returned (in other words, ordered by datetime(date, time)). Is it possible?
Thanks!
I think you can solve it like this:
SELECT * FROM (
SELECT
1 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1 ) as a
UNION
SELECT * FROM (
SELECT
2 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1) as b
ORDER BY sort_column;
Here is the documentation about how UNION works.
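The tie-breaking rule (highest or lowest score first, most recent date and time among equals) can be illustrated in a few lines of Python. This is a sketch under assumptions: the rows are the two tables already joined on id, dates and times are ISO-formatted strings so they compare correctly as text, and the row with id 7 is hypothetical, added only to create a tie for the minimum.

```python
def most_recent_extremes(rows):
    """Return the most recent row with the max score and with the min score."""
    def score(row):
        return row["correct"] - row["wrong"]

    # Tuples compare element by element, so date and time only break ties.
    best = max(rows, key=lambda r: (score(r), r["date"], r["time"]))
    worst = max(rows, key=lambda r: (-score(r), r["date"], r["time"]))
    return best, worst


rows = [
    {"id": 1, "exercise_number": 138, "correct": 1, "wrong": 0, "date": "2015-02-04", "time": "19:15:00"},
    {"id": 2, "exercise_number": 246, "correct": 1, "wrong": 0, "date": "2015-01-28", "time": "19:15:00"},
    {"id": 3, "exercise_number": 988, "correct": 1, "wrong": 10, "date": "2015-01-26", "time": "19:15:00"},
    {"id": 4, "exercise_number": 13, "correct": 5, "wrong": 0, "date": "2015-01-25", "time": "19:15:00"},
    {"id": 5, "exercise_number": 5, "correct": 4, "wrong": 7, "date": "2015-01-24", "time": "19:15:00"},
    {"id": 6, "exercise_number": 5, "correct": 4, "wrong": 7, "date": "2015-01-23", "time": "19:15:00"},
    # Hypothetical row, not in the question: ties with id 3 for the minimum.
    {"id": 7, "exercise_number": 77, "correct": 0, "wrong": 9, "date": "2015-02-01", "time": "19:15:00"},
]
best, worst = most_recent_extremes(rows)
print(best["exercise_number"], worst["exercise_number"])
```

Negating the score for the minimum lets both lookups share the same "larger is better" comparison, so the most recent row wins a tie in either direction.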