Say I have a table like this:
| id | user_id | event_id | created_at |
|----|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 2 | 6 | 7 | 2015-01-02 |
| 3 | 3 | 8 | 2015-01-01 |
| 4 | 5 | 9 | 2015-01-04 |
| 5 | 5 | 10 | 2015-01-02 |
| 6 | 6 | 1 | 2015-01-01 |
I want to be able to generate a counter of events per user. So my result would be:
| counter | user_id | event_id | created_at |
|---------|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 1 | 6 | 7 | 2015-01-02 |
| 1 | 3 | 8 | 2015-01-01 |
| 2 | 5 | 9 | 2015-01-04 |
| 3 | 5 | 10 | 2015-01-02 |
| 2 | 6 | 1 | 2015-01-01 |
One idea is to self-join the table and group the result, which emulates the row_number() over (...) window function available in other RDBMSs.
See the Rextester demo; its second query shows how the inner join works in this case.
select t1.user_id,
t1.event_id,
t1.created_at,
count(*) as counter
from your_table t1
inner join your_table t2
on t1.user_id=t2.user_id
and t1.id>=t2.id
group by t1.user_id,
t1.event_id,
t1.created_at
order by t1.user_id,t1.event_id;
Output:
+---------+----------+------------+---------+
| user_id | event_id | created_at | counter |
+---------+----------+------------+---------+
| 3 | 8 | 01-01-2015 | 1 |
| 5 | 10 | 01-01-2015 | 1 |
| 5 | 10 | 02-01-2015 | 3 |
| 5 | 9 | 04-01-2015 | 2 |
| 6 | 1 | 01-01-2015 | 2 |
| 6 | 7 | 02-01-2015 | 1 |
+---------+----------+------------+---------+
Try the following:
select counter,
xx.user_id,
xx.event_id,
xx.created_at
from xx
join (select a.id,
a.user_id,
count(*) as counter
from xx as a
join xx as b
on a.user_id=b.user_id
and b.id<=a.id
group by 1,2) as counts
on xx.id=counts.id
The join pairs each row with every row of the same user that has an equal or lower id; counting those pairs gives the counter.
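To make the self-join count concrete, here is a minimal sketch that runs the same idea against the sample rows from the question. It assumes an in-memory SQLite database as a stand-in for MySQL, and the table name xx is taken from the query above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the join logic is the same.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xx (id INTEGER PRIMARY KEY, user_id INT, event_id INT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO xx VALUES (?, ?, ?, ?)",
    [
        (1, 5, 10, "2015-01-01"),
        (2, 6, 7, "2015-01-02"),
        (3, 3, 8, "2015-01-01"),
        (4, 5, 9, "2015-01-04"),
        (5, 5, 10, "2015-01-02"),
        (6, 6, 1, "2015-01-01"),
    ],
)

# Row a is paired with every row b of the same user whose id is <= a.id;
# the number of such pairs is the per-user running counter for row a.
self_join_query = """
SELECT COUNT(*) AS counter, a.user_id, a.event_id, a.created_at
FROM xx AS a
JOIN xx AS b ON a.user_id = b.user_id AND b.id <= a.id
GROUP BY a.id, a.user_id, a.event_id, a.created_at
ORDER BY a.id
"""
counter_rows = conn.execute(self_join_query).fetchall()
for row in counter_rows:
    print(row)
```

With the six sample rows, the counters come out in id order as 1, 1, 1, 2, 3, 2, which matches the result the question asks for.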
Try this one:
A correlated subquery will produce this result.
select (select count(*) from user_event iue where iue.user_id = oue.user_id and iue.id <= oue.id) as counter,
oue.user_id,
oue.event_id,
oue.created_at
from user_event oue;
You could use user variables: cross join a subquery that initialises them with the source table, and reset the counter whenever user_id changes.
SELECT @counter := CASE
WHEN @user = user_id THEN @counter + 1
ELSE 1
END AS counter,
@user := user_id AS user_id,
event_id,
created_at
FROM your_table m,
(SELECT @counter := 0,
@user := '') AS t
ORDER BY user_id;
I've created a demo here
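If window functions are available, the variable trick is not needed. The sketch below assumes SQLite 3.25 or later as a stand-in for MySQL 8.0 or later (both support ROW_NUMBER()); the table name your_table comes from the query above, and ordering by id within each user is my assumption about what "counter" should follow.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window functions) stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER PRIMARY KEY, user_id INT, event_id INT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, 5, 10, "2015-01-01"),
        (2, 6, 7, "2015-01-02"),
        (3, 3, 8, "2015-01-01"),
        (4, 5, 9, "2015-01-04"),
        (5, 5, 10, "2015-01-02"),
        (6, 6, 1, "2015-01-01"),
    ],
)

# ROW_NUMBER() restarts at 1 for every user_id partition and follows id order.
window_query = """
SELECT ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS counter,
       user_id, event_id, created_at
FROM your_table
ORDER BY id
"""
window_rows = conn.execute(window_query).fetchall()
for row in window_rows:
    print(row)
```

Unlike the variable version, the result does not depend on the order in which the engine happens to evaluate the select list.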
Given the following table, where the series number and the date should increase together:
+----+--------+------------+
| id | series | date |
+----+--------+------------+
| 1 | 10 | 2020-08-13 |
| 2 | 9 | 2020-08-02 |
| 3 | 8 | 2020-06-23 |
| 4 | 7 | 2020-06-08 |
| 5 | 6 | 2020-05-20 |
| 6 | 5 | 2020-05-05 |
| 7 | 4 | 2020-05-01 |
+----+--------+------------+
Is there a way to check whether any records break this pattern?
For example, in the table below, row 2 has a bigger series number than row 3, but its date is before row 3's:
+----+--------+------------+
| id | series | date |
+----+--------+------------+
| 1 | 10 | 2020-08-13 |
| 2 | 9 | 2020-06-02 |
| 3 | 8 | 2020-07-23 |
| 4 | 7 | 2020-06-08 |
| 5 | 6 | 2020-05-20 |
| 6 | 5 | 2020-05-05 |
| 7 | 4 | 2020-05-01 |
+----+--------+------------+
You can use window functions:
select *
from (
select t.*, lead(date) over(order by series) lead_date
from mytable t
) t
where date > lead_date
Alternatively:
select *
from (
select t.*, lead(series) over(order by date) lead_series
from mytable t
) t
where series > lead_series
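As a quick check of the first lead() query, here is a sketch that runs it on the out-of-order sample from the question. It assumes SQLite 3.25 or later in place of MySQL 8.0 or later, and keeps the table name mytable from the answer.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window functions) stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, series INT, date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 10, "2020-08-13"),
        (2, 9, "2020-06-02"),
        (3, 8, "2020-07-23"),
        (4, 7, "2020-06-08"),
        (5, 6, "2020-05-20"),
        (6, 5, "2020-05-05"),
        (7, 4, "2020-05-01"),
    ],
)

# For each row, lead_date is the date of the row with the next higher series.
# A row is flagged when its own date is later than that next row's date.
lead_query = """
SELECT id, series, date, lead_date
FROM (
    SELECT t.*, LEAD(date) OVER (ORDER BY series) AS lead_date
    FROM mytable AS t
) AS t
WHERE date > lead_date
"""
flagged = conn.execute(lead_query).fetchall()
print(flagged)
```

Note that this flags the row with series 8 (id 3), because its date is later than the date of series 9. It reports one side of each adjacent conflict, not both rows.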
You can use lag():
select t.*
from (select t.*,
lag(id) over (order by series) as prev_id_series,
lag(id) over (order by date) as prev_id_date
from t
) t
where prev_id_series <> prev_id_date;
You can fetch problematic rows and their corresponding conflicting rows using SELF JOIN like this (assuming your table is called "series"):
SELECT s1.id AS row_id, s1.series AS row_series, s1.date AS row_date,
s2.id AS conflict_id, s2.series AS conflict_series, s2.date AS conflict_date
FROM series AS s1
JOIN series AS s2
ON s1.series > s2.series AND s1.date < s2.date;
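The self-join can be tried on the same out-of-order sample. This is a sketch assuming in-memory SQLite instead of MySQL; the ORDER BY is my addition so the output is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (id INTEGER PRIMARY KEY, series INT, date TEXT)")
conn.executemany(
    "INSERT INTO series VALUES (?, ?, ?)",
    [
        (1, 10, "2020-08-13"),
        (2, 9, "2020-06-02"),
        (3, 8, "2020-07-23"),
        (4, 7, "2020-06-08"),
        (5, 6, "2020-05-20"),
        (6, 5, "2020-05-05"),
        (7, 4, "2020-05-01"),
    ],
)

# Every pair (s1, s2) where s1 has the higher series but the earlier date.
conflict_query = """
SELECT s1.id AS row_id, s1.series AS row_series, s1.date AS row_date,
       s2.id AS conflict_id, s2.series AS conflict_series, s2.date AS conflict_date
FROM series AS s1
JOIN series AS s2
  ON s1.series > s2.series AND s1.date < s2.date
ORDER BY s1.id, s2.id
"""
conflicts = conn.execute(conflict_query).fetchall()
for row in conflicts:
    print(row)
```

Because it compares all pairs rather than only neighbours, row 2 is reported twice here: once against row 3 and once against row 4. The comparison is quadratic in the number of rows, which may matter on large tables.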
I have the following table:
+------------+--------+-----+
| reg_dat | status | id |
+------------+--------+-----+
| 2016-01-31 | 10 | 1 |
| 2017-06-31 | 12 | 1 |
| 2015-01-31 | 12 | 4 |
| 2017-01-25 | 5 | 4 |
| 2017-01-11 | 3 | 2 |
+------------+--------+-----+
I would like to write a MySQL query that groups the rows by id and keeps only the most recent date, so the output should be the following:
+------------+--------+-----+
| reg_dat | status | id |
+------------+--------+-----+
| 2017-06-31 | 12 | 1 |
| 2017-01-25 | 5 | 4 |
| 2017-01-11 | 3 | 2 |
+------------+--------+-----+
Unfortunately my code doesn't work...
select *
from table
group by id
order by id, reg_dat DESC
Do you have any suggestions?
You can do that using a JOIN and a subquery
SELECT t.reg_dat, t.status, t.id
FROM table t
JOIN (SELECT max(reg_dat) max_date, id FROM table GROUP BY id) t1
ON t.reg_dat = t1.max_date AND t.id = t1.id
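Here is a small sketch of that join on the sample data. It assumes in-memory SQLite instead of MySQL, and it uses the hypothetical table name registrations because table is a reserved word. The dates are stored as text exactly as written in the question, including 2017-06-31, which is not a real calendar date.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; "registrations" is a
# hypothetical table name (the question's placeholder "table" is reserved).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrations (reg_dat TEXT, status INT, id INT)")
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [
        ("2016-01-31", 10, 1),
        ("2017-06-31", 12, 1),
        ("2015-01-31", 12, 4),
        ("2017-01-25", 5, 4),
        ("2017-01-11", 3, 2),
    ],
)

# The derived table holds the latest reg_dat per id; joining back on both
# columns keeps only the full row that carries that latest date.
latest_query = """
SELECT t.reg_dat, t.status, t.id
FROM registrations AS t
JOIN (SELECT MAX(reg_dat) AS max_date, id FROM registrations GROUP BY id) AS t1
  ON t.reg_dat = t1.max_date AND t.id = t1.id
ORDER BY t.id
"""
latest_rows = conn.execute(latest_query).fetchall()
for row in latest_rows:
    print(row)
```

If two rows of the same id share the same latest date, both are returned, so a tie-breaker column would be needed in that case.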
How does one query the time difference between consecutive rows in hierarchical data? For example, I'd like to go from the following table:
+----------+----------+---------------------+
| group_id | event | event_time |
+----------+----------+---------------------+
| 1 | alarm | 2016-12-01 17:53:12 |
| 1 | alarm | 2016-12-01 17:59:43 |
| 2 | purchase | 2016-11-29 09:49:47 |
| 2 | purchase | 2016-11-29 09:53:51 |
| 2 | purchase | 2016-11-29 09:57:59 |
| 2 | alarm | 2016-11-29 10:01:02 |
| 2 | alarm | 2016-11-29 10:13:27 |
| 2 | purchase | 2016-11-29 10:15:00 |
| 2 | purchase | 2016-11-29 10:16:24 |
+----------+----------+---------------------+
to:
+----------+----------+---------------------+------------+
| group_id | event | event_time | time_delta |
+----------+----------+---------------------+------------+
| 1 | alarm | 2016-12-01 17:53:12 | 0 |
| 1 | alarm | 2016-12-01 17:59:43 | 00:06:31 |
| 2 | purchase | 2016-11-29 09:49:47 | 0 |
| 2 | purchase | 2016-11-29 09:53:51 | 00:04:04 |
| 2 | purchase | 2016-11-29 09:57:59 | 00:04:08 |
| 2 | alarm | 2016-11-29 10:01:02 | 0 |
| 2 | alarm | 2016-11-29 10:13:27 | 00:12:25 |
| 2 | purchase | 2016-11-29 10:15:00 | 0 |
| 2 | purchase | 2016-11-29 10:16:24 | 00:01:24 |
+----------+----------+---------------------+------------+
The data above is illustrative; my data actually has many groups and many events. So basically, I'd like to calculate the time difference whenever the group_id and the event are the same in consecutive rows.
You can get the previous time for a given group and event by doing:
select t.*,
(select t2.event_time
from t t2
where t2.group_id = t.group_id and
t2.event = t.event and
t2.event_time < t.event_time
order by t2.event_time desc
limit 1
) as prev_event_time
from t;
You can then get the time difference in a variety of ways, such as:
select t.*, timediff(event_time, prev_event_time)
from (select t.*,
(select t2.event_time
from t t2
where t2.group_id = t.group_id and
t2.event = t.event and
t2.event_time < t.event_time
order by t2.event_time desc
limit 1
) as prev_event_time
from t
) t
Try this using user-defined variables:
SELECT
group_id, event, event_time, diff time_delta
FROM
(SELECT
t1.*,
CASE
WHEN @event = event AND @group = group_id THEN TIME_FORMAT(TIMEDIFF(event_time, @et), '%H:%i:%s')
ELSE 0
END diff,
@event:=event,
@group:=group_id,
@et:=event_time
FROM
(SELECT
*
FROM
your_table
ORDER BY group_id , event_time) t1
CROSS JOIN (SELECT @event:='', @group:=-1, @et:='') t2) t;
The @et variable stores the previous row's event_time; it is used only when the previous row has the same group_id and event.
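On engines with window functions, the same "compare with the previous row" logic can be written with LAG(). The sketch below assumes SQLite 3.25 or later in place of MySQL 8.0 or later, uses the hypothetical table name events, and returns the delta in seconds rather than as HH:MM:SS to keep the arithmetic simple.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for MySQL 8.0+; "events" is a
# hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (group_id INT, event TEXT, event_time TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "alarm", "2016-12-01 17:53:12"),
        (1, "alarm", "2016-12-01 17:59:43"),
        (2, "purchase", "2016-11-29 09:49:47"),
        (2, "purchase", "2016-11-29 09:53:51"),
        (2, "purchase", "2016-11-29 09:57:59"),
        (2, "alarm", "2016-11-29 10:01:02"),
        (2, "alarm", "2016-11-29 10:13:27"),
        (2, "purchase", "2016-11-29 10:15:00"),
        (2, "purchase", "2016-11-29 10:16:24"),
    ],
)

# LAG() looks at the previous row of the same group_id in time order.
# The delta is computed only when that previous row has the same event;
# otherwise (including the first row of a group) it is 0.
delta_query = """
SELECT group_id, event, event_time,
       CASE
         WHEN LAG(event) OVER (PARTITION BY group_id ORDER BY event_time) = event
         THEN CAST(strftime('%s', event_time) AS INTEGER)
              - CAST(strftime('%s', LAG(event_time) OVER (PARTITION BY group_id ORDER BY event_time)) AS INTEGER)
         ELSE 0
       END AS delta_seconds
FROM events
ORDER BY group_id, event_time
"""
delta_rows = conn.execute(delta_query).fetchall()
for row in delta_rows:
    print(row)
```

For example, the second alarm in group 1 gets 391 seconds, which is the 00:06:31 shown in the desired output. In MySQL 8.0 the seconds arithmetic would likely be replaced by TIMEDIFF, as in the answer above.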
Is it possible to display accumulated data, resetting the count based on a condition?
I would like to write a query that keeps counting while the number column contains the value 1 and restarts the count on any other value, as shown in the cumulative_with_condition column below.
+----+------------+--------+
| id | release | number |
+----+------------+--------+
| 1 | 2016-07-08 | 4 |
| 2 | 2016-07-09 | 1 |
| 3 | 2016-07-10 | 1 |
| 4 | 2016-07-12 | 2 |
| 5 | 2016-07-13 | 1 |
| 6 | 2016-07-14 | 1 |
| 7 | 2016-07-15 | 1 |
| 8 | 2016-07-16 | 2-3 |
| 9 | 2016-07-17 | 3 |
| 10 | 2016-07-18 | 1 |
+----+------------+--------+
select * from version where id > 1 and id < 9;
+----+------------+--------+---------------------------+
| id | release | number | cumulative_with_condition |
+----+------------+--------+---------------------------+
| 2 | 2016-07-09 | 1 | 1 |
| 3 | 2016-07-10 | 1 | 2 |
| 4 | 2016-07-12 | 2 | 0 |
| 5 | 2016-07-13 | 1 | 1 |
| 6 | 2016-07-14 | 1 | 2 |
| 7 | 2016-07-15 | 1 | 3 |
| 8 | 2016-07-16 | 2-3 | 0 |
+----+------------+--------+---------------------------+
You want something like row_number() (not exactly, but close). You can do that using variables:
select t.*,
(@rn := if(number = 1, @rn + 1,
if(@n := number, 0, 0)
)
) as cumulative_with_condition
from t cross join
(select @n := '', @rn := 0) params
order by t.id;
As an alternative to using user variables, as demonstrated by Gordon Linoff, in this case it's also possible to self-join, group and count:
SELECT t.id, t.release, t.number, COUNT(version.id) AS cumulative_with_condition
FROM version RIGHT JOIN (
SELECT highs.*, MAX(lows.id) min
FROM version lows RIGHT JOIN version highs ON lows.id <= highs.id
WHERE lows.number <> '1'
GROUP BY highs.id
) t ON version.id > t.min AND version.id <= t.id
WHERE t.id > 1 AND t.id < 9
GROUP BY t.id
See it on sqlfiddle.
But, frankly, neither approach is particularly elegant. As I commented previously, you're probably best off implementing this within your application code.
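For what the application-code route might look like, here is a minimal Python sketch. The function name is hypothetical, and it assumes the number column arrives as strings (it has to, given values like '2-3') already ordered by id.

```python
def cumulative_with_condition(numbers):
    """Running count of consecutive "1" values; any other value resets to 0."""
    run = 0
    result = []
    for number in numbers:
        # Extend the current streak on "1", otherwise start over at 0.
        run = run + 1 if number == "1" else 0
        result.append(run)
    return result


# The "number" column from the question, ordered by id 1..10.
numbers_by_id = ["4", "1", "1", "2", "1", "1", "1", "2-3", "3", "1"]
cumulative = cumulative_with_condition(numbers_by_id)

# Rows with id > 1 and id < 9, as in the question's example query.
print(cumulative[1:8])
```

The slice for ids 2 to 8 gives 1, 2, 0, 1, 2, 3, 0, matching the cumulative_with_condition column in the question. Note that the count is computed over all rows before filtering, so a streak that starts before the filtered range is still counted in full.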
I'm building an exercise web app and I'm working with two tables like these:
Table 1: weekly_stats
| id | code | type | date | time |
|----|--------------|--------------------|------------|----------|
| 1 | CC | 1 | 2015-02-04 | 19:15:00 |
| 2 | CC | 2 | 2015-01-28 | 19:15:00 |
| 3 | CPC | 1 | 2015-01-26 | 19:15:00 |
| 4 | CPC | 1 | 2015-01-25 | 19:15:00 |
| 5 | CP | 1 | 2015-01-24 | 19:15:00 |
| 6 | CC | 1 | 2015-01-23 | 19:15:00 |
| .. | ... | ... | ... | ... |
Table 2: global_stats
| id | exercise_number |correct | wrong |
|----|-----------------|--------|-----------|
| 1 | 138 | 1 | 0 |
| 2 | 246 | 1 | 0 |
| 3 | 988 | 1 | 10 |
| 4 | 13 | 5 | 0 |
| 5 | 5 | 4 | 7 |
| 6 | 5 | 4 | 7 |
| .. | ... | ... | ... |
What I would like is to get the rows with MAX(correct-wrong) and MIN(correct-wrong); at the moment I'm working with this query:
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
UNION
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats);
This query works well, except for one thing: when "WHERE correct - wrong = (SELECT MIN(correct - wrong)[...]" matches more than one row, the row returned is the first one, but I would like the most recent one instead (in other words, ordered by datetime(date, time)). Is that possible?
Thanks!
I think you can solve it like this:
SELECT * FROM (
SELECT
1 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1 ) as a
UNION
SELECT * FROM (
SELECT
2 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1) as b
ORDER BY sort_column;
Here is the documentation about how UNION works.
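To see the tie-breaking in action, here is a sketch that runs the query above on the sample rows. It assumes in-memory SQLite instead of MySQL. The sample data has no tie, so one extra row (id 7) is added to both tables; that row is hypothetical and exists only to create a tie on the minimum.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE weekly_stats (id INTEGER PRIMARY KEY, code TEXT, type INT, date TEXT, time TEXT);
    CREATE TABLE global_stats (id INTEGER PRIMARY KEY, exercise_number INT, correct INT, wrong INT);
    """
)
conn.executemany(
    "INSERT INTO weekly_stats VALUES (?, ?, ?, ?, ?)",
    [
        (1, "CC", 1, "2015-02-04", "19:15:00"),
        (2, "CC", 2, "2015-01-28", "19:15:00"),
        (3, "CPC", 1, "2015-01-26", "19:15:00"),
        (4, "CPC", 1, "2015-01-25", "19:15:00"),
        (5, "CP", 1, "2015-01-24", "19:15:00"),
        (6, "CC", 1, "2015-01-23", "19:15:00"),
        (7, "CP", 1, "2015-02-10", "19:15:00"),  # hypothetical row, not in the question
    ],
)
conn.executemany(
    "INSERT INTO global_stats VALUES (?, ?, ?, ?)",
    [
        (1, 138, 1, 0),
        (2, 246, 1, 0),
        (3, 988, 1, 10),
        (4, 13, 5, 0),
        (5, 5, 4, 7),
        (6, 5, 4, 7),
        (7, 77, 0, 9),  # hypothetical row: ties with id 3 at correct - wrong = -9
    ],
)

# Each branch sorts its own candidates by recency and keeps one row;
# sort_column then fixes the order of the two branches in the final result.
extremes_query = """
SELECT * FROM (
    SELECT 1 AS sort_column, exercise_number, date, time
    FROM weekly_stats AS w JOIN global_stats AS g ON w.id = g.id
    WHERE correct - wrong = (SELECT MAX(correct - wrong) FROM global_stats)
    ORDER BY date DESC, time DESC
    LIMIT 1) AS a
UNION
SELECT * FROM (
    SELECT 2 AS sort_column, exercise_number, date, time
    FROM weekly_stats AS w JOIN global_stats AS g ON w.id = g.id
    WHERE correct - wrong = (SELECT MIN(correct - wrong) FROM global_stats)
    ORDER BY date DESC, time DESC
    LIMIT 1) AS b
ORDER BY sort_column
"""
extremes = conn.execute(extremes_query).fetchall()
for row in extremes:
    print(row)
```

Of the two rows tied at -9, the one dated 2015-02-10 is returned, because the ORDER BY and LIMIT are applied inside each branch before the UNION combines them.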