I have a table like this:
id | status | data | date
----|---------|--------|-------
1 | START | a4c | Jan 1
2 | WORKING | 2w3 | Dec 29
3 | WORKING | 2d3 | Dec 29
4 | WORKING | 3ew | Dec 26
5 | WORKING | 5r5 | Dec 23
6 | START | 2q3 | Dec 22
7 | WORKING | 32w | Dec 20
8 | WORKING | 9k5 | Dec 10
and so on...
What I am trying to do is get the number of 'WORKING' rows between two 'START' rows, i.e.:
id | status | count | date
----|---------|--------|-------
1 | START | 4 | Jan 1
6 | START | 2 | Dec 22
and so on ...
I am using MySQL 5.7.28.
Highly appreciate any help/suggestion!
The date column is unusable in the example; try using id as the ordering column instead:
select id, status,
(select count(*)
from mytable t2
where t2.id > t.id and t2.status='WORKING'
and not exists (select 1
from mytable t3
where t3.id > t.id and t3.id < t2.id and status='START')
) count,
date
from mytable t
where status='START';
Assuming id is a safe ordering column, you can do this by finding the next START id for each block (substituting a dummy value when there is none) and then grouping by that next id:
drop table if exists t;
create table t
(id int,status varchar(20), data varchar(3),date varchar(10));
insert into t values
( 1 , 'START' , 'a4c' , 'Jan 1'),
( 2 , 'WORKING' , '2w3' , 'Dec 29'),
( 3 , 'WORKING' , '2d3' , 'Dec 29'),
( 4 , 'WORKING' , '3ew' , 'Dec 26'),
( 5 , 'WORKING' , '5r5' , 'Dec 23'),
( 6 , 'START' , '2q3' , 'Dec 22'),
( 7 , 'WORKING' , '32w' , 'Dec 20'),
( 8 , 'WORKING' , '9k5' , 'Dec 10');
SELECT MIN(ID) ID,
'START' STATUS,
SUM(CASE WHEN STATUS <> 'START' THEN 1 ELSE 0 END) AS OBS,
Max(DATE) DATE
FROM
(
select t.*,
CASE WHEN STATUS = 'START' THEN DATE ELSE '' END AS DT,
COALESCE(
(select t1.id from t t1 where t1.STATUS = 'START' and t1.id > t.id ORDER BY T1.ID limit 1)
,99999) NEXTID
from t
) S
GROUP BY NEXTID;
+------+--------+------+--------+
| ID | STATUS | OBS | DATE |
+------+--------+------+--------+
| 1 | START | 4 | Jan 1 |
| 6 | START | 2 | Dec 22 |
+------+--------+------+--------+
2 rows in set (0.00 sec)
This is a form of gaps-and-islands problem -- which is simpler in MySQL 8+ using window functions.
In older versions, probably the most efficient method is to accumulate a count of starts to define groupings for the rows. You can do this using variables and then aggregate:
select min(id) as id, 'START' as status, sum(status = 'WORKING') as num_working, max(date) as date
from (select t.*, (@s := @s + (t.status = 'START')) as grp
from (select t.* from t order by id asc) t cross join
(select @s := 0) params
) t
group by grp
order by min(id);
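To check the grouping idea outside the database, here is a minimal Python sketch of the same logic: a running count of START rows assigns a group number to every row, and the WORKING rows are then counted per group. The function name `count_working_per_start` and the in-memory `rows` list are illustrative assumptions, not part of the original answer.

```python
from itertools import accumulate

# (id, status) pairs copied from the sample table in the question.
rows = [
    (1, "START"), (2, "WORKING"), (3, "WORKING"), (4, "WORKING"),
    (5, "WORKING"), (6, "START"), (7, "WORKING"), (8, "WORKING"),
]


def count_working_per_start(rows):
    """Return [(start_id, working_count)] for each START block."""
    rows = sorted(rows)  # plays the role of ORDER BY id
    # Running count of START rows: each row gets the group number of the
    # most recent START at or before it (the @s variable in the query).
    groups = list(accumulate(1 if status == "START" else 0 for _, status in rows))
    result = {}
    for (row_id, status), grp in zip(rows, groups):
        start_id, working = result.get(grp, (row_id, 0))
        result[grp] = (min(start_id, row_id), working + (status == "WORKING"))
    return [result[grp] for grp in sorted(result)]


print(count_working_per_start(rows))  # [(1, 4), (6, 2)]
```

As with the SQL version, any WORKING rows that appear before the first START would end up in their own group 0.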
SELECT id, status, `count`, `date`
FROM ( SELECT @count `count`,
id,
status,
`date`,
@count:=(@status=status)*@count+1,
@status:=status
FROM test,
( SELECT @count:=0, @status:='' ) init_vars
ORDER BY id DESC
) calculations
WHERE status='START'
ORDER BY id
> Since I am still in design/development I can move to MySQL 8 if that makes it easier for this logic? Any idea how this could be done with Windows functions? – N0000B
WITH cte AS ( SELECT id,
status,
`date`,
SUM(status='WORKING') OVER (ORDER BY id DESC) workings
FROM test
ORDER BY id )
SELECT id,
status,
workings - COALESCE(LEAD(workings) OVER (ORDER BY id), 0) `count`,
`date`
FROM cte
WHERE status='START'
ORDER BY id
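The window-function query works by taking a running total of WORKING rows from the bottom of the table upwards and then subtracting the total seen at the next START row. A small Python sketch of that arithmetic, with the hypothetical helper name `working_counts_by_suffix_sum` and the sample rows typed in by hand, might look like this:

```python
# (id, status) pairs copied from the sample table in the question.
rows = [
    (1, "START"), (2, "WORKING"), (3, "WORKING"), (4, "WORKING"),
    (5, "WORKING"), (6, "START"), (7, "WORKING"), (8, "WORKING"),
]


def working_counts_by_suffix_sum(rows):
    """Return [(start_id, working_count)] using a reverse running total."""
    rows = sorted(rows)
    # Equivalent of SUM(status='WORKING') OVER (ORDER BY id DESC):
    # for each id, the number of WORKING rows with an id >= this one.
    running = 0
    suffix = {}
    for row_id, status in reversed(rows):
        running += status == "WORKING"
        suffix[row_id] = running
    starts = [row_id for row_id, status in rows if status == "START"]
    # Equivalent of COALESCE(LEAD(workings) OVER (ORDER BY id), 0),
    # evaluated over the START rows only.
    next_totals = [suffix[s] for s in starts[1:]] + [0]
    return [(s, suffix[s] - nxt) for s, nxt in zip(starts, next_totals)]


print(working_counts_by_suffix_sum(rows))  # [(1, 4), (6, 2)]
```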
I'd like to get the Date and ID of the rows with the lowest and the highest time of day, i.e. the extreme rows in the table below, with IDs 5 and 4 respectively.
Please note the following:
Dates are stored as values in ms
The ID reflects the Order By Date ASC
Below I have split out the time to make it clear
* indicates the two rows to return.
Values should be returned as columns, i.e.: SELECT minID, minDate, maxID, maxDate FROM myTable
| ID | Date | TimeOnly |
|----|---------------------|-----------|
| 5 | 14/11/2019 10:01:29 | 10:01:29* |
| 10 | 15/11/2019 10:01:29 | 10:01:29 |
| 6 | 14/11/2019 10:03:41 | 10:03:41 |
| 7 | 14/11/2019 10:07:09 | 10:07:09 |
| 11 | 15/11/2019 12:01:43 | 12:01:43 |
| 8 | 14/11/2019 14:37:16 | 14:37:16 |
| 1 | 12/11/2019 15:04:50 | 15:04:50 |
| 9 | 14/11/2019 15:04:50 | 15:04:50 |
| 2 | 13/11/2019 18:10:41 | 18:10:41 |
| 3 | 13/11/2019 18:10:56 | 18:10:56 |
| 4 | 13/11/2019 18:11:03 | 18:11:03* |
In earlier versions of MySQL, you can use a couple of inline queries. This is a straightforward option that could be quite efficient here:
select
(select ID from mytable order by TimeOnly limit 1) minID,
(select Date from mytable order by TimeOnly limit 1) minDate,
(select ID from mytable order by TimeOnly desc limit 1) maxID,
(select Date from mytable order by TimeOnly desc limit 1) maxDate
One option for MySQL 8+, using ROW_NUMBER with pivoting logic:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY TimeOnly) rn_min,
ROW_NUMBER() OVER (ORDER BY TimeOnly DESC) rn_max
FROM yourTable
)
SELECT
MAX(CASE WHEN rn_min = 1 THEN ID END) AS minID,
MAX(CASE WHEN rn_min = 1 THEN Date END) AS minDate,
MAX(CASE WHEN rn_max = 1 THEN ID END) AS maxID,
MAX(CASE WHEN rn_max = 1 THEN Date END) AS maxDate
FROM cte;
Here is an option for MySQL 5.7 or earlier:
SELECT
MAX(CASE WHEN pos = 1 THEN ID END) AS minID,
MAX(CASE WHEN pos = 1 THEN Date END) AS minDate,
MAX(CASE WHEN pos = 2 THEN ID END) AS maxID,
MAX(CASE WHEN pos = 2 THEN Date END) AS maxDate
FROM
(
SELECT ID, Date, 1 AS pos FROM yourTable
WHERE TimeOnly = (SELECT MIN(TimeOnly) FROM yourTable)
UNION ALL
SELECT ID, Date, 2 FROM yourTable
WHERE TimeOnly = (SELECT MAX(TimeOnly) FROM yourTable)
) t;
This second 5.7 option uses similar pivoting logic, but instead of ROW_NUMBER it uses subqueries to identify the min and max records. These records are brought together using a union, along with an identifier to keep track of which record is the min and which is the max.
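The selection both queries perform can be sketched in Python: pick the row with the smallest time of day and the row with the largest, then return them side by side. This is only an illustration under some assumptions: the dates are typed in as formatted strings rather than the millisecond values the question mentions, ties on the time are broken by the lower ID, and the name `extreme_rows` is made up for the example.

```python
from datetime import datetime

# (ID, Date) pairs copied from the sample table in the question.
rows = [
    (5, "14/11/2019 10:01:29"), (10, "15/11/2019 10:01:29"),
    (6, "14/11/2019 10:03:41"), (7, "14/11/2019 10:07:09"),
    (11, "15/11/2019 12:01:43"), (8, "14/11/2019 14:37:16"),
    (1, "12/11/2019 15:04:50"), (9, "14/11/2019 15:04:50"),
    (2, "13/11/2019 18:10:41"), (3, "13/11/2019 18:10:56"),
    (4, "13/11/2019 18:11:03"),
]


def extreme_rows(rows):
    """Return (minID, minDate, maxID, maxDate) ranked by time of day."""
    parsed = [
        (row_id, datetime.strptime(text, "%d/%m/%Y %H:%M:%S"))
        for row_id, text in rows
    ]
    # Rank by the time part only; the ID is an assumed tie-breaker so the
    # result does not depend on the input order.
    by_time = lambda row: (row[1].time(), row[0])
    earliest = min(parsed, key=by_time)
    latest = max(parsed, key=by_time)
    return earliest[0], earliest[1], latest[0], latest[1]


min_id, min_date, max_id, max_date = extreme_rows(rows)
print(min_id, min_date, max_id, max_date)
```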
You could simply do this:
SELECT minval.ID AS minID, minval.Date AS minDate, maxval.ID AS maxID, maxval.Date AS maxDate
FROM (
SELECT ID, Date
FROM t
ORDER BY CAST(Date AS TIME)
LIMIT 1
) AS minval
CROSS JOIN (
SELECT ID, Date
FROM t
ORDER BY CAST(Date AS TIME) DESC
LIMIT 1
) AS maxval
If you want two rows then change CROSS JOIN query to a UNION ALL query.
I have a report I'm trying to figure out, and I would like to do it all within a single SQL statement instead of iterating over a bunch of data in a script.
I have a table that is structured like:
CREATE TABLE `batch_item` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`record_id` int(11) DEFAULT NULL,
`created` DATE NOT NULL,
PRIMARY KEY (`id`),
KEY `record_id` (`record_id`)
);
The Date field is always YEAR-MONTH-01. Data looks something like:
+------+-----------+------------+
| id | record_id | created |
+------+-----------+------------+
| 1 | 1 | 2019-01-01 |
| 2 | 2 | 2019-01-01 |
| 3 | 3 | 2019-01-01 |
| 4 | 1 | 2019-02-01 |
| 5 | 2 | 2019-02-01 |
| 6 | 1 | 2019-03-01 |
| 7 | 3 | 2019-03-01 |
| 8 | 1 | 2019-04-01 |
| 9 | 2 | 2019-04-01 |
+------+-----------+------------+
So what I'm trying to do, without having to write a looping script, is find the average number of consecutive months for each record. With the data above, the result would be:
Record_id 1 would have an average of 4 months.
Record_id 2 would be 1.5
Record_id 3 would be 1
I can write a script to iterate through all the records. I just would rather avoid that.
This is a gaps-and-islands problem. You simply need an enumeration of the rows for this to work. In MySQL 8+, you would use row_number() but you can use a global enumeration here:
select record_id, min(created) as min_created, max(created) as max_created, count(*) as num_months
from (select bi.*, (created - interval n month) as grp
from (select bi.*, (@rn := @rn + 1) as n -- generate some numbers
from batch_item bi cross join
(select @rn := 0) params
order by bi.record_id, bi.created
) bi
) bi
group by record_id, grp;
Note that when using row_number(), you would normally partition by record_id. However that is not necessary, if the numbers are created in the correct sequence.
The above query gets the islands. For your final results, you need one more level of aggregation:
select record_id, avg(num_months)
from (select record_id, min(created) as min_created, max(created) as max_created, count(*) as num_months
from (select bi.*, (created - interval n month) as grp
from (select bi.*, (@rn := @rn + 1) as n -- generate some numbers
from batch_item bi cross join
(select @rn := 0) params
order by bi.record_id, bi.created
) bi
) bi
group by record_id, grp
) bi
group by record_id;
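The trick in this answer is that, within a run of consecutive months, the month minus the row number stays constant, so that difference can serve as the island key. A short Python sketch of the same idea, using the sample data from the question (the function name `average_island_length` is an assumption made for illustration):

```python
from collections import defaultdict
from datetime import date

# (record_id, created) pairs copied from the sample data in the question.
batch_items = [
    (1, date(2019, 1, 1)), (2, date(2019, 1, 1)), (3, date(2019, 1, 1)),
    (1, date(2019, 2, 1)), (2, date(2019, 2, 1)),
    (1, date(2019, 3, 1)), (3, date(2019, 3, 1)),
    (1, date(2019, 4, 1)), (2, date(2019, 4, 1)),
]


def average_island_length(items):
    """Return {record_id: average number of consecutive months}."""
    island_sizes = defaultdict(int)
    # Enumerate rows in (record_id, created) order, like the @rn counter.
    for n, (record_id, created) in enumerate(sorted(items), start=1):
        # Absolute month number minus the row number is constant inside a
        # run of consecutive months, so it identifies the island.
        month_index = created.year * 12 + created.month
        island_sizes[(record_id, month_index - n)] += 1
    sizes_by_record = defaultdict(list)
    for (record_id, _), size in island_sizes.items():
        sizes_by_record[record_id].append(size)
    return {rid: sum(s) / len(s) for rid, s in sizes_by_record.items()}


print(average_island_length(batch_items))  # {1: 4.0, 2: 1.5, 3: 1.0}
```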
This is not a tested solution, but it should work in MySQL 8.x. Window functions cannot appear in a WHERE clause, so LEAD() is computed in a derived table first:
with
a as ( -- the last row of each island
select *
from (
select bi.*,
lead(created) over(partition by record_id order by created) as next_created
from batch_item bi
) x
where next_created is null
or next_created > created + interval 1 month
),
e as ( -- each row, now with the last row of its island
select b.id, b.record_id, min(a.created) as end_created
from batch_item b
join a on b.record_id = a.record_id and b.created <= a.created
group by b.id, b.record_id
),
m as ( -- each island with the number of months it has
select
record_id, end_created, count(*) as months
from e
group by record_id, end_created
)
select -- the average length of islands for each record_id
record_id, avg(months) as avg_months
from m
group by record_id
I have a table called map_item_group in MySQL that looks like this example:
item_serial | group_code | start_date | end_date
===================================================
item1 | group1 | 2015-01-01 | 2016-01-01
item1 | group2 | 2016-02-01 | 2016-03-15
item2 | group1 | 2015-06-01 | 2016-06-30
item1 | group2 | 2016-05-18 | 2016-06-30
I want to create a MySQL view called group_info that looks like this:
group_code | start_date | end_date | items_string
=======================================================
group1 | 2015-01-01 | 2015-06-01 | item1
group1 | 2015-06-01 | 2016-01-01 | item1,item2
group1 | 2016-01-01 | 2016-06-30 | item2
group2 | 2016-02-01 | 2016-03-15 | item1
group2 | 2016-05-18 | 2016-06-30 | item1
In other words, I want one row for each group showing the items in that group over each time span.
Simply grouping by group_code, start_date and end_date (i.e. SELECT group_code, start_date, end_date, GROUP_CONCAT(item_serial) FROM map_item_group GROUP BY group_code, start_date, end_date) does not give the desired output.
I can imagine ways to do this with subqueries, but subqueries aren't allowed in MySQL views. I can create other views in place of subqueries as a workaround, but I'd rather avoid adding a bunch of extra views to my schema. What's the cleanest way to do this?
First, I build a list of all dates (start and end) per group_code using UNION. I called this subquery T1, though a more descriptive name would be better.
Then I use variables to assign a row number to each date (subqueries T1 and T2).
The code has to be duplicated so the result can be joined to itself to create the ranges (subquery R).
You could simplify this by moving that part into a separate view.
Now that I have the ranges, I join back to the original table to see whether each item belongs to a given range.
SELECT R.`group_code`, R.`start_date`, R.`end_date`, GROUP_CONCAT(T.item_serial SEPARATOR ', ') items
FROM (
SELECT T1.`group_code`, T1.range_date as start_date, T2.range_date as end_date
FROM (
SELECT `group_code`, range_date,
@rn := IF( @grpCode = `group_code`, @rn + 1 , IF(@grpCode := `group_code`, 1, 1)) as rn
FROM (
SELECT `group_code`, `start_date` as range_date
FROM Table1
UNION
SELECT `group_code`, `end_date` as range_date
FROM Table1
ORDER BY 1, 2
) as T1,
(SELECT @rn := 0, @grpCode := '') r
) T1
JOIN (
SELECT `group_code`, range_date,
@rn := IF( @grpCode = `group_code`, @rn + 1 , IF(@grpCode := `group_code`, 1, 1)) as rn
FROM (
SELECT `group_code`, `start_date` as range_date
FROM Table1
UNION
SELECT `group_code`, `end_date` as range_date
FROM Table1
ORDER BY 1, 2
) as T1,
(SELECT @rn := 0, @grpCode := '') r
) T2
ON T1.rn = T2.rn -1
AND T1.group_code = T2.group_code
) R
JOIN Table1 T
ON R.start_date < T.end_date
AND R.end_date > T.start_date
AND R.group_code = T.group_code
GROUP BY R.`group_code`, R.`start_date`, R.`end_date`
ORDER BY 1,2, 4
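The steps of this answer (collect every boundary date per group, pair adjacent dates into ranges, then attach the items whose interval overlaps each range) can be sketched in Python to check the expected output. The name `group_info` mirrors the view name from the question; the in-memory `mappings` list is an assumption for illustration.

```python
from collections import defaultdict
from datetime import date

# (item_serial, group_code, start_date, end_date) from the question.
mappings = [
    ("item1", "group1", date(2015, 1, 1), date(2016, 1, 1)),
    ("item1", "group2", date(2016, 2, 1), date(2016, 3, 15)),
    ("item2", "group1", date(2015, 6, 1), date(2016, 6, 30)),
    ("item1", "group2", date(2016, 5, 18), date(2016, 6, 30)),
]


def group_info(mappings):
    """Return [(group_code, start_date, end_date, items_string)]."""
    # Step 1: all distinct boundary dates per group (the UNION subquery).
    boundaries = defaultdict(set)
    for _, group, start, end in mappings:
        boundaries[group].update((start, end))
    result = []
    for group in sorted(boundaries):
        dates = sorted(boundaries[group])
        # Step 2: adjacent dates form the ranges (the rn = rn - 1 self-join).
        for range_start, range_end in zip(dates, dates[1:]):
            # Step 3: keep items whose interval overlaps the range.
            items = sorted(
                item for item, g, start, end in mappings
                if g == group and range_start < end and range_end > start
            )
            if items:  # like the inner join, drop ranges no item covers
                result.append((group, range_start, range_end, ",".join(items)))
    return result


for row in group_info(mappings):
    print(row)
```

With the sample data this yields the five rows listed in the question; the gap between 2016-03-15 and 2016-05-18 in group2 is dropped because no item covers it.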
I have a table, called session for example,
with the columns Id, sessionDate and user_id.
What I need:
Delta should be a new calculated column
Id | sessionDate | user_id | Delta in days
------------------------------------------------------
1 | 2011-02-20 00:00:00 | 2 | NULL
2 | 2011-03-21 00:00:00 | 2 | NULL
3 | 2011-04-22 00:00:00 | 2 | NULL
4 | 2011-02-20 00:00:00 | 4 | NULL
5 | 2011-03-21 00:00:00 | 4 | NULL
6 | 2011-04-22 00:00:00 | 4 | NULL
Delta is the difference between the timestamps.
What I want is the delta (in days) between the previous row and the current row, grouped by user_id.
This should be the result:
Id | sessionDate | user_id | Delta in Days
------------------------------------------------------
1 | 2011-02-20 00:00:00 | 2 | NULL
2 | 2011-02-21 00:00:00 | 2 | 1
3 | 2011-02-22 00:00:00 | 2 | 1
4 | 2011-02-20 00:00:00 | 4 | NULL
5 | 2011-02-23 00:00:00 | 4 | 3
6 | 2011-02-25 00:00:00 | 4 | 2
I already have a solution for a specific user_id:
SELECT user_id, sessionDate,
abs(DATEDIFF((SELECT MAX(sessionDate) FROM session WHERE sessionDate < t.sessionDate and user_id = 1), sessionDate)) as Delta_in_days
FROM session AS t
WHERE t.user_id = 1 order by sessionDate asc
But I couldn't find a solution that works for multiple user_ids.
Hope somebody can help me.
Try this:
drop table if exists a;
create table a( id integer not null primary key, d datetime, user_id integer );
insert into a values (1,now() + interval 0 day, 1 );
insert into a values (2,now() + interval 1 day, 1 );
insert into a values (3,now() + interval 2 day, 1 );
insert into a values (4,now() + interval 0 day, 2 );
insert into a values (5,now() + interval 1 day, 2 );
insert into a values (6,now() + interval 2 day, 2 );
select t1.user_id, t1.d, t2.d, datediff(t2.d,t1.d)
from a t1, a t2
where t1.user_id=t2.user_id
and t2.d = (select min(d) from a t3 where t1.user_id=t3.user_id and t3.d > t1.d)
Which means: join your table to itself on user_ids and adjacent datetime entries and compute the difference.
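As a cross-check of the expected result, the same "difference to the previous row per user" logic can be sketched in Python. The data below is taken from the result table in the question; the function name `deltas_in_days` is hypothetical.

```python
from datetime import datetime
from itertools import groupby

# (Id, sessionDate, user_id) from the expected-result table in the question.
sessions = [
    (1, datetime(2011, 2, 20), 2), (2, datetime(2011, 2, 21), 2),
    (3, datetime(2011, 2, 22), 2), (4, datetime(2011, 2, 20), 4),
    (5, datetime(2011, 2, 23), 4), (6, datetime(2011, 2, 25), 4),
]


def deltas_in_days(sessions):
    """Return [(Id, user_id, delta_in_days)]; the first row per user is None."""
    result = []
    # Sort by user, then by date, so each user's rows are adjacent.
    ordered = sorted(sessions, key=lambda s: (s[2], s[1]))
    for _, user_rows in groupby(ordered, key=lambda s: s[2]):
        previous = None
        for session_id, session_date, user_id in user_rows:
            delta = None if previous is None else (session_date - previous).days
            result.append((session_id, user_id, delta))
            previous = session_date
    return result


for row in deltas_in_days(sessions):
    print(row)
```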
If id is really sequential (as in your sample data), the following should be quite efficient:
select t.id, t.sessionDate, t.user_id, datediff(t.sessionDate, tprev.sessionDate) as delta
from session t left outer join
session tprev
on t.user_id = tprev.user_id and
t.id = tprev.id + 1;
There is also another efficient method using variables. Something like this should work:
select t.id, t.sessionDate, t.user_id, datediff(sessionDate, prevsessiondate) as delta
from (select t.*,
if(@user_id = user_id, @prev, NULL) as prevsessiondate,
@prev := sessionDate,
@user_id := user_id
from session t cross join
(select @user_id := 0, @prev := 0) vars
order by user_id, id
) t;
(There is a small issue with these queries where the variables in the select clause may not be evaluated in the order we expect them to. This is possible to fix, but it complicates the query and this will usually work.)
Although you have already chosen an answer, here is another way of achieving it:
SELECT
t1.Id,
t1.sessionDate,
t1.user_id,
TIMESTAMPDIFF(DAY,t2.sessionDate,t1.sessionDate) as delta
from myTable t1
left join myTable t2
on t1.user_id = t2.user_id
AND t2.Id = (
select max(Id) from myTable t3
where t1.Id > t3.Id AND t1.user_id = t3.user_id
);
I have a table
id | event_id | stat_count | month | year
1 | 1 | 12 | 01 | 2070
2 | 1 | 11 | 02 | 2070
3 | 2 | 14 | 01 | 2070
4 | 2 | 15 | 04 | 2070
and so on.
I need to fetch the data grouped by event_id. Months range from 01 to 12, but a stat_count has not necessarily been inserted in the table for every month. The dates provided are not in English date format.
I need to have result as follows
event data
Fevent 12,21,0,3,0,9,0,0,12,23,34,0
Fevent2 12,0,3,4,3,0,3,2,3,4,0,0
and so on.
Update:
Here is the query I have tried using group_concat too:
select
(select event_name from vital_events where id=VEC.vital_event_id) name,
group_concat(
CASE when VEC.month='01' then VEC.stat_count
when VEC.month='02' then VEC.stat_count
when VEC.month='03' then VEC.stat_count
when VEC.month='04' then VEC.stat_count
when VEC.month='05' then VEC.stat_count
when VEC.month='06' then VEC.stat_count
when VEC.month='07' then VEC.stat_count
when VEC.month='08' then VEC.stat_count
when VEC.month='09' then VEC.stat_count
when VEC.month='10' then VEC.stat_count
when VEC.month='11' then VEC.stat_count
when VEC.month='12' then VEC.stat_count
else 0 end order by VEC.month
)as data
from vital_event_counts as VEC
group by VEC.vital_event_id
Another version uses a CROSS JOIN to find all months that should be included, and then a LEFT JOIN against the table with the existing values:
SELECT x.event_id,
GROUP_CONCAT(COALESCE(stat_count,0) ORDER BY x.month) stat_count
FROM
(SELECT DISTINCT event_id, m.month FROM Table1 CROSS JOIN
(SELECT 1 month UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL
SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12) m
) x
LEFT JOIN Table1
ON Table1.month = x.month AND Table1.event_id = x.event_id
AND year = 2070
GROUP BY x.event_id
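The idea here is to generate every (event, month) pair first and only then look up the stored value, falling back to 0 when there is none. A minimal Python sketch of that logic, using the four sample rows from the question (the name `monthly_series` is an assumption for illustration):

```python
# (event_id, stat_count, month, year) from the sample table in the question.
stats = [
    (1, 12, 1, 2070),
    (1, 11, 2, 2070),
    (2, 14, 1, 2070),
    (2, 15, 4, 2070),
]


def monthly_series(stats, year):
    """Return {event_id: 'c1,c2,...,c12'} with 0 for months without a row."""
    # Every known event gets a full set of months (the CROSS JOIN).
    event_ids = sorted({event_id for event_id, _, _, _ in stats})
    # Stored values for the requested year (the LEFT JOIN condition).
    stored = {
        (event_id, month): stat_count
        for event_id, stat_count, month, row_year in stats
        if row_year == year
    }
    return {
        event_id: ",".join(
            str(stored.get((event_id, month), 0)) for month in range(1, 13)
        )
        for event_id in event_ids
    }


print(monthly_series(stats, 2070))
```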
Not pretty, but…
SELECT
event_id,
CONCAT(
(SELECT COALESCE(SUM(stat_count), 0) FROM tbl WHERE event_id = t.event_id AND month = 1), ',',
(SELECT COALESCE(SUM(stat_count), 0) FROM tbl WHERE event_id = t.event_id AND month = 2), ',',
(SELECT COALESCE(SUM(stat_count), 0) FROM tbl WHERE event_id = t.event_id AND month = 3), ',',
⋮
(SELECT COALESCE(SUM(stat_count), 0) FROM tbl WHERE event_id = t.event_id AND month = 12)
) AS data
FROM
tbl AS t
GROUP BY
event_id