MySQL counting the number of groups of rows containing a certain value - mysql

How can I get the number of "groups" of rows with status == 0, excluding groups that start the table and groups that span <= 1 hour? (If the time constraint is too difficult, we can instead exclude groups with counts <= 40, since a row is logged about every 1:30 minutes.)
For example, WITHOUT the time constraint, the following SAMPLE table would produce 3 when counting groups with status == 0.
+------+--------+----------+
| id   | status | time     |
+------+--------+----------+
| 0001 | 1      | 11:32:48 |
| 0002 | 0      | 11:30:26 |
| 0003 | 0      | 11:28:54 |
| 0004 | 1      | 11:27:23 |
| 0005 | 0      | 11:25:52 |
| 0006 | 1      | 11:24:20 |
| 0007 | 1      | 11:22:48 |
| 0008 | 0      | 11:21:17 |
| 0009 | 0      | 11:19:45 |
| 0010 | 0      | 11:18:14 |
| 0011 | 0      | 11:16:43 |
| 0012 | 0      | 11:15:11 |
| 0013 | 0      | 11:13:39 |
| 0002 | 0      | 11:12:08 |
| 0014 | 1      | 11:10:37 |
| 0015 | 1      | 11:09:05 |
| 0016 | 1      | 11:07:33 |
| 0017 | 0      | 11:06:02 |
+------+--------+----------+
One solution I can think of would be to grab the entire table and produce the result with Java, but I am afraid this would be too inefficient given that the table can have millions of entries.

select sum(is_different_from_previous), status
from (
    select status,
           (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
           @prevStatus := status
    from myTable t1
    cross join (select @prevStatus := -1) t2
    order by t1.time
) t1
group by status
For a specific status:
select * from (
    select sum(is_different_from_previous), status
    from (
        select status,
               (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
               @prevStatus := status
        from myTable t1
        cross join (select @prevStatus := -1) t2
        order by t1.time
    ) t1
    group by status
) t1
where status = 0
Edit
To count only groups with more than a certain number of 0s:
select count(*) from (
    -- select only the grouped column so the query also runs with ONLY_FULL_GROUP_BY enabled
    select groupNumber from (
        select status,
               (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
               if(@prevStatus <> status and @prevStatus <> -1, @groupNumber := @groupNumber + 1, @groupNumber) groupNumber,
               @prevStatus := status
        from myTable t1
        cross join (select @prevStatus := -1, @groupNumber := 0) t2
        order by t1.id
    ) t1
    where status = 0
    group by groupNumber
    having count(*) > 4
) t1
http://sqlfiddle.com/#!9/e4a49/23

Try the following modified query. It is more efficient than the earlier one because it eliminates another table scan and restricts the data to the last hour. The first group is also not counted.
EDIT: I changed the JOIN condition back to st2.id = st1.id+1 to satisfy the requirements.
select
st1.status,
count(st1.id)
from sampletable st1
inner join sampletable st2
on (st2.id = st1.id+1 and st2.status <> st1.status)
where st1.status = 0 AND st1.time >= DATE_SUB(NOW(), INTERVAL 1 hour)
group by st1.status;
Updated SQL Fiddle demo with same id, status data:
SQL Fiddle demo
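For comparison, here is a minimal sketch of the same count done with window functions instead of user variables. It runs the SQL through Python's built-in sqlite3 module so it can be tried without a server (this needs SQLite 3.25 or newer; MySQL 8.0+ has the same LAG and SUM ... OVER functions). The table name log, the column seq (the row position when ordered by time ascending) and the min_rows parameter are made up for the example, and I am assuming that "the group that starts the table" means the earliest run, which is what gives 3 for the sample data.

```python
import sqlite3

# Statuses from the sample table in the question, oldest row first (time ascending).
STATUSES = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]


def count_zero_groups(statuses, min_rows=1):
    """Count runs of status 0 that do not start the table and have at least min_rows rows."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table: seq stands in for the time ordering of the real table.
    con.execute("CREATE TABLE log (seq INTEGER PRIMARY KEY, status INTEGER)")
    con.executemany("INSERT INTO log (seq, status) VALUES (?, ?)",
                    list(enumerate(statuses)))
    query = """
        WITH marked AS (
            -- is_start = 1 on the first row of every run of equal statuses
            SELECT seq, status,
                   CASE WHEN LAG(status) OVER (ORDER BY seq) = status
                        THEN 0 ELSE 1 END AS is_start
            FROM log
        ),
        numbered AS (
            -- a running sum of is_start gives every run its own number
            SELECT status, SUM(is_start) OVER (ORDER BY seq) AS grp
            FROM marked
        )
        SELECT COUNT(*) FROM (
            SELECT grp FROM numbered
            WHERE status = 0 AND grp > 1   -- grp 1 is the run that starts the table
            GROUP BY grp
            HAVING COUNT(*) >= ?
        )
    """
    (count,) = con.execute(query, (min_rows,)).fetchone()
    con.close()
    return count


if __name__ == "__main__":
    print(count_zero_groups(STATUSES))               # no size constraint
    print(count_zero_groups(STATUSES, min_rows=2))   # only runs of 2+ rows
```

The row-count threshold plays the role of the "counts <= 40" alternative from the question; a time-span filter would need the min and max time per grp instead of COUNT(*).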

Related

SQL query to get latest non null value of columns grouping by a column

I have a MySQL table which looks like this:
+--------------------------+------------+-----------------+---------+------+-------+-------------+-------------------------+----------------------------+-------+-------+
| deviceID | date | timestamp | counter | rssi | vavId | nvo_airflow | nvo_air_damper_position | nvo_temperature_sensor_pps | block | floor |
+--------------------------+------------+-----------------+---------+------+-------+-------------+-------------------------+----------------------------+-------+-------+
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:40:01.513066 | 805 | 91 | 7 | NULL | NULL | 26.49 | NULL | ABCD |
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:41:01.542272 | 807 | 94 | 5 | 50 | 64 | 26.37 | NULL | ABCD |
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:43:01.699023 | 811 | 90 | 7 | 50 | NULL | NULL | NULL | ABCD |
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:46:01.412259 | 817 | 64 | 26 | NULL | NULL | 25.85 | NULL | ABCD |
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:48:01.576133 | 821 | 91 | 26 | 55 | 42 | NULL | NULL | ABCD |
| fd00::212:4b00:1957:d616 | 2020-02-29 | 12:49:01.529593 | 823 | 91 | 7 | 45 | 72 | NULL | NULL | ABCD |
I want to get the latest non-null value of 3 columns (nvo_airflow, nvo_air_damper_position, nvo_temperature_sensor_pps) for each vavId.
My result should look something like
vavId,nvo_airflow,nvo_air_damper_position,nvo_temperature_sensor_pps
5,50,64,26.37
7,45,72,26.49
26,55,42,25.85
I have written an SQL query for this:
SELECT airflow_table.nvo_airflow,damper_position_table.nvo_air_damper_position,temperature_sensor_table.nvo_temperature_sensor_pps,temperature_sensor_table.vavId
FROM(
((SELECT t1.date,t1.timestamp,t1.nvo_airflow,t1.vavId
FROM
(SELECT * FROM vavDataOptimized where date='2020-02-29')t1
INNER JOIN
(SELECT max(timestamp) as recent_timestamp,vavId FROM vavDataOptimized where date='2020-02-29' and `nvo_airflow` is not null GROUP BY vavId)t2
ON (t1.timestamp = t2.recent_timestamp and t1.vavId = t2.vavId)
ORDER BY vavId) airflow_table
inner join
(SELECT t1.date,t1.timestamp,t1.nvo_air_damper_position,t1.vavId
FROM
(SELECT * FROM vavDataOptimized where date='2020-02-29')t1
INNER JOIN
(SELECT max(timestamp) as recent_timestamp,vavId FROM vavDataOptimized where date='2020-02-29' and `nvo_air_damper_position` is not null GROUP BY vavId)t2
ON (t1.timestamp = t2.recent_timestamp and t1.vavId = t2.vavId)
ORDER BY vavId) damper_position_table ON airflow_table.vavId = damper_position_table.vavId)
inner join
(SELECT t1.date,t1.timestamp,t1.nvo_temperature_sensor_pps,t1.vavId
FROM
(SELECT * FROM vavDataOptimized where date='2020-02-29')t1
INNER JOIN
(SELECT max(timestamp) as recent_timestamp,vavId FROM vavDataOptimized where date='2020-02-29' and `nvo_temperature_sensor_pps` is not null GROUP BY vavId)t2
ON (t1.timestamp = t2.recent_timestamp and t1.vavId = t2.vavId)
ORDER BY vavId) temperature_sensor_table on airflow_table.vavId = temperature_sensor_table.vavId);
What I am trying to do is get the latest value of each of nvo_airflow, nvo_air_damper_position and nvo_temperature_sensor_pps for each vavId as three intermediate tables, and then do an inner join on those tables.
This query takes a very long time and never finishes executing. I am not sure if I am doing it in an optimized way. Am I doing something wrong, or is there a better way of doing this?
Here is one option. I first rank the records by latest timestamp, separately for each attribute column, partitioning on whether the value is null (e.g. airflow_flg=1 means the ranking covers non-null values only).
After that, a union of the three top-ranked sets gets the values you are looking for.
with data
as (
select *
,case when nvo_airflow is null then 0 else 1 end as airflow_flg
,case when nvo_air_damper_position is null then 0 else 1 end as damper_flg
,case when nvo_temperature_sensor_pps is null then 0 else 1 end as sensor_flg
,row_number() over(partition by case when nvo_airflow is null then 0 else 1 end,deviceid,vavid order by timestamp1 desc) as rnk_airflow
,row_number() over(partition by case when nvo_air_damper_position is null then 0 else 1 end,deviceid,vavid order by timestamp1 desc) as rnk_damper
,row_number() over(partition by case when nvo_temperature_sensor_pps is null then 0 else 1 end,deviceid,vavid order by timestamp1 desc) as rnk_sensor
from t
)
,concat_data
as (
select deviceid
,vavid
,nvo_airflow as val
,'nvo_airflow' as txt
from data
where airflow_flg=1
and rnk_airflow=1
union all
select deviceid
,vavid
,nvo_air_damper_position as val
,'nvo_air_damper_position' as txt
from data
where damper_flg=1
and rnk_damper=1
union all
select deviceid
,vavid
,nvo_temperature_sensor_pps as val
,'nvo_temperature_sensor_pps' as txt
from data
where sensor_flg=1
and rnk_sensor=1
)
select deviceid
,vavid
,max(case when txt='nvo_airflow' then val end) as nvo_airflow
,max(case when txt='nvo_air_damper_position' then val end) as nvo_air_damper_position
,max(case when txt='nvo_temperature_sensor_pps' then val end) as nvo_temperature_sensor_pps
from concat_data
group by deviceid
,vavid
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=5440a6e55e7b9a82596f57840dc38083
Does something like this fit your needs?
select vavId,nvo_airflow,nvo_air_damper_position,nvo_temperature_sensor_pps
from
(select vavId, @rownum1 := @rownum1 + 1 as rownum1 from
    (select vavId
     from vavDataOptimized
     where vavId is not NULL
     ORDER BY date1,timestamp1 DESC LIMIT 3) a
 CROSS JOIN (SELECT @rownum1 := 0) v) a,
(select nvo_airflow, @rownum2 := @rownum2 + 1 as rownum2 from
    (select nvo_airflow
     from vavDataOptimized
     where nvo_airflow is not NULL
     ORDER BY date1,timestamp1 DESC LIMIT 3) b
 CROSS JOIN (SELECT @rownum2 := 0) v) b,
(select nvo_air_damper_position, @rownum3 := @rownum3 + 1 as rownum3 from
    (select nvo_air_damper_position
     from vavDataOptimized
     where nvo_air_damper_position is not NULL
     ORDER BY date1,timestamp1 DESC LIMIT 3) c
 CROSS JOIN (SELECT @rownum3 := 0) v) c,
(select nvo_temperature_sensor_pps, @rownum4 := @rownum4 + 1 as rownum4 from
    (select nvo_temperature_sensor_pps
     from vavDataOptimized
     where nvo_temperature_sensor_pps is not NULL
     ORDER BY date1,timestamp1 DESC LIMIT 3) d
 CROSS JOIN (SELECT @rownum4 := 0) v) d
where rownum1 = rownum2
and rownum1 = rownum3
and rownum1 = rownum4
Here is the fiddle : https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=57539e363b668038547df037b15f0dee
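Another way to read the requirement is "one lookup per column": for every vavId, take the newest row where that particular column is not null. Below is a small sketch of that idea using one correlated subquery per column, run through Python's sqlite3 module on the six sample rows from the question. The table name vav and the column name ts are invented for the example, and it assumes all rows fall on a single date so the time string alone gives the ordering.

```python
import sqlite3

# (ts, vavId, nvo_airflow, nvo_air_damper_position, nvo_temperature_sensor_pps)
ROWS = [
    ("12:40:01.513066", 7, None, None, 26.49),
    ("12:41:01.542272", 5, 50, 64, 26.37),
    ("12:43:01.699023", 7, 50, None, None),
    ("12:46:01.412259", 26, None, None, 25.85),
    ("12:48:01.576133", 26, 55, 42, None),
    ("12:49:01.529593", 7, 45, 72, None),
]


def latest_non_null(rows):
    """Return (vavId, airflow, damper, temperature) with the newest non-null value per column."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified table holding only the columns needed here.
    con.execute("""CREATE TABLE vav (
                       ts TEXT, vavId INTEGER, nvo_airflow REAL,
                       nvo_air_damper_position REAL,
                       nvo_temperature_sensor_pps REAL)""")
    con.executemany("INSERT INTO vav VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT v.vavId,
               -- each subquery picks the newest non-null value of one column
               (SELECT a.nvo_airflow FROM vav AS a
                 WHERE a.vavId = v.vavId AND a.nvo_airflow IS NOT NULL
                 ORDER BY a.ts DESC LIMIT 1) AS nvo_airflow,
               (SELECT d.nvo_air_damper_position FROM vav AS d
                 WHERE d.vavId = v.vavId AND d.nvo_air_damper_position IS NOT NULL
                 ORDER BY d.ts DESC LIMIT 1) AS nvo_air_damper_position,
               (SELECT s.nvo_temperature_sensor_pps FROM vav AS s
                 WHERE s.vavId = v.vavId AND s.nvo_temperature_sensor_pps IS NOT NULL
                 ORDER BY s.ts DESC LIMIT 1) AS nvo_temperature_sensor_pps
        FROM (SELECT DISTINCT vavId FROM vav) AS v
        ORDER BY v.vavId
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in latest_non_null(ROWS):
        print(row)
```

On a large table this form would likely want an index covering vavId and the timestamp so each subquery is a short index lookup; I have not measured it against the window-function answer.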

Subtract the current value from the next and store the result as the data of the current row

The kilometer run (kmr) is computed like this:
current value = next value - current value
I have a table that looks like this.
My question is: how can I compute the kmr based on the odometer value? I will then replace the value in the kmr column with the computed kmr value.
You can use variables to store last values.
create table tbl (code varchar(10), vdate date, kmr int);
insert into tbl values
('Person1', '20180101', 71883),
('Person1', '20180102', 71893),
('Person1', '20180103', 71903),
('Person2', '20180101', 71800),
('Person2', '20180102', 71815),
('Person2', '20180103', 71820);
select code, vdate, kmr, current_kmr
from
(
    select t1.code, t1.vdate, t1.kmr,
           t1.kmr - if(coalesce(@last_code, t1.code) = t1.code, coalesce(@last_kmr, t1.kmr), t1.kmr) as current_kmr,
           @last_kmr := t1.kmr,
           @last_code := t1.code
    from tbl t1,
         (select @last_kmr := null, @last_code := null) t2
    order by t1.code, t1.vdate
) t
code | vdate | kmr | current_kmr
:------ | :--------- | ----: | ----------:
Person1 | 2018-01-01 | 71883 | 0
Person1 | 2018-01-02 | 71893 | 10
Person1 | 2018-01-03 | 71903 | 10
Person2 | 2018-01-01 | 71800 | 0
Person2 | 2018-01-02 | 71815 | 15
Person2 | 2018-01-03 | 71820 | 5
dbfiddle here
This works using a rank, and for more than one person:
http://sqlfiddle.com/#!9/4c054e/1
Select m.`code`, m.vdate, (n.kmr - m.kmr) as new_kmr
From
    (Select t1.*, @rnk := @rnk + 1 as rnk
     From tbl t1, (select @rnk := 0) t
     Order by t1.`code`, t1.vdate) m left join
    (Select t2.*, @rnk1 := @rnk1 + 1 as rnk
     From tbl t2, (select @rnk1 := 0) t
     Order by t2.`code`, t2.vdate) n
    On m.`code` = n.`code`
    And m.rnk + 1 = n.rnk
Order by m.`code`, m.vdate
Output:
code     | vdate      | new_kmr
person 1 | 2018-03-01 | 10
person 1 | 2018-03-02 | 10
person 1 | 2018-03-03 | (null)
person 2 | 2018-03-01 | 5
person 2 | 2018-03-02 | (null)
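If window functions are available (MySQL 8.0+, or SQLite 3.25+ as used here), the "next value - current value" rule from the question maps directly onto LEAD. The following is only a sketch: it reuses the tbl data from the first answer, but stores the reading in a column I have called odometer to keep it apart from the computed kmr, and it runs through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

# Same sample readings as in the first answer: (code, vdate, odometer reading).
ROWS = [
    ("Person1", "2018-01-01", 71883),
    ("Person1", "2018-01-02", 71893),
    ("Person1", "2018-01-03", 71903),
    ("Person2", "2018-01-01", 71800),
    ("Person2", "2018-01-02", 71815),
    ("Person2", "2018-01-03", 71820),
]


def km_run(rows):
    """Return (code, vdate, kmr) where kmr = next reading - current reading per code."""
    con = sqlite3.connect(":memory:")
    # 'odometer' is a hypothetical column name for the raw reading.
    con.execute("CREATE TABLE tbl (code TEXT, vdate TEXT, odometer INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    query = """
        SELECT code, vdate,
               -- LEAD looks one row ahead within the same code, ordered by date;
               -- the last row of each code has no next row, so kmr is NULL there
               LEAD(odometer) OVER (PARTITION BY code ORDER BY vdate) - odometer AS kmr
        FROM tbl
        ORDER BY code, vdate
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in km_run(ROWS):
        print(row)
```

Swapping LEAD for LAG (and reversing the subtraction) gives the "current - previous" variant that the variable-based answer computes.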

SQL update column3 after group by column1 and compare value in column2

I have a table T that looks like this, where id is the unique auto-increment primary key and the Difference column defaults to 0. In each id_str group, I want to UPDATE only the row with the largest id, setting Difference to its Value minus the Value of the row with the second-largest id, while the rest remains unchanged.
id_str id Value Difference
2380 1 21.01 0
2380 3 22.04 0
2380 5 22.65 0
2380 8 23.11 0
2380 10 35.21 0
20100 2 37.07 0
20100 4 38.17 0
20100 6 38.97 0
20103 7 57.98 0
20103 9 60.83 0
The result I want is:
id_str id Value Difference
2380 1 21.01 0
2380 3 22.04 0
2380 5 22.65 0
2380 8 23.11 0
2380 10 35.21 12.1
20100 2 37.07 0
20100 4 38.17 0
20100 6 38.97 0.8
20103 7 57.98 0
20103 9 60.83 2.85
How can I write the query?
This should do the trick in MySQL.
CREATE TABLE SomeTable
( id_str VARCHAR(10),
id INTEGER,
value_ DECIMAL(7,5),
difference DECIMAL(7,5)
);
INSERT INTO SomeTable VALUES(2380,1,21.01,0);
INSERT INTO SomeTable VALUES(2380,3,22.04,0);
INSERT INTO SomeTable VALUES(2380,5,22.65,0);
INSERT INTO SomeTable VALUES(2380,8,23.11,0);
INSERT INTO SomeTable VALUES(2380,10,35.21,0);
INSERT INTO SomeTable VALUES(20100,2,37.07,0);
INSERT INTO SomeTable VALUES(20100,4,38.17,0);
INSERT INTO SomeTable VALUES(20100,6,38.97,0);
INSERT INTO SomeTable VALUES(20103,7,57.98,0);
INSERT INTO SomeTable VALUES(20103,9,60.83,0);
UPDATE SomeTable,
       (SELECT T1.id AS id_updt,
               T1.value_ - T2.value_ AS diff_updt
        FROM (SELECT id_str,
                     id,
                     value_,
                     (
                       CASE id_str
                         WHEN @curStr THEN @curRow := @curRow + 1
                         ELSE @curRow := 1
                              AND @curStr := id_str
                       END
                     ) AS rnk
              FROM SomeTable,
                   (SELECT @curRow := 0, @curStr := '') r
              ORDER
                 BY id_str DESC,
                    id DESC
             ) AS T1
        INNER
         JOIN (SELECT id_str,
                      id,
                      value_,
                      (
                        CASE id_str
                          WHEN @curStr THEN @curRow := @curRow + 1
                          ELSE @curRow := 1
                               AND @curStr := id_str
                        END
                      ) AS rnk
               FROM SomeTable,
                    (SELECT @curRow := 0, @curStr := '') r
               ORDER
                  BY id_str DESC,
                     id DESC
              ) AS T2
           ON T1.id_str = T2.id_str
          AND T1.rnk = 1
          AND T2.rnk = 2
       ) AS UPDT
   SET SomeTable.difference = UPDT.diff_updt
 WHERE SomeTable.id = UPDT.id_updt;
Deprecated solution - This will work for a DBMS that supports the rank function.
UPDATE SomeTable
  FROM ( SELECT RNK1.id AS id_updt,
                RNK1.value_ - RNK2.value_ AS diff_updt
         FROM (SELECT id_str,
                      id,        -- needed by the outer select
                      value_,    -- needed by the outer select
                      RANK() OVER
                      ( PARTITION BY id_str
                        ORDER BY id DESC
                      ) AS id_rnk
               FROM SomeTable
              ) AS RNK1
         INNER
          JOIN (SELECT id_str,
                       value_,   -- needed by the outer select
                       RANK() OVER
                       ( PARTITION BY id_str
                         ORDER BY id DESC
                       ) - 1 AS id_rnk_decrement
                FROM SomeTable
               ) AS RNK2
            ON RNK1.id_str = RNK2.id_str
           AND RNK1.id_rnk = RNK2.id_rnk_decrement
         WHERE RNK1.id_rnk = 1
       ) AS UPDT
   SET SomeTable.difference = UPDT.diff_updt
 WHERE SomeTable.id = UPDT.id_updt;
You can find the two greatest ids per group with the following query:
select t1.id_str, max(t1.id) as id1, (
select max(t2.id)
from mytable t2
where t2.id_str = t1.id_str
and t2.id < max(t1.id)
) as id2
from mytable t1
group by t1.id_str;
Result:
| id_str | id1 | id2 |
|--------|-----|-----|
| 2380 | 10 | 8 |
| 20100 | 6 | 4 |
| 20103 | 9 | 7 |
Use it as subquery in your update statement:
update mytable u
join (
select t1.id_str, max(t1.id) as id1, (
select max(t2.id)
from mytable t2
where t2.id_str = t1.id_str
and t2.id < max(t1.id)
) as id2
from mytable t1
group by t1.id_str
) t on t.id1 = u.id
join mytable t1 on t1.id = t.id1
join mytable t2 on t2.id = t.id2
set u.Difference = t1.Value - t2.Value;
The table will now contain:
| id_str | id | Value | Difference |
|--------|----|-------|------------|
| 2380 | 1 | 21.01 | 0 |
| 2380 | 3 | 22.04 | 0 |
| 2380 | 5 | 22.65 | 0 |
| 2380 | 8 | 23.11 | 0 |
| 2380 | 10 | 35.21 | 12.1 |
| 20100 | 2 | 37.07 | 0 |
| 20100 | 4 | 38.17 | 0 |
| 20100 | 6 | 38.97 | 0.8 |
| 20103 | 7 | 57.98 | 0 |
| 20103 | 9 | 60.83 | 2.85 |
http://rextester.com/CCO40873
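To check the expected numbers quickly without a MySQL server, here is a small sketch of the same update run through Python's sqlite3 module. It is not a drop-in replacement for the queries above: SQLite allows a correlated subquery on the table being updated, whereas MySQL generally rejects that with error 1093, so in MySQL the join against a derived table shown above is the form to use. The table name readings is made up, and the ROUND(..., 2) is my addition to hide floating-point noise in the REAL columns.

```python
import sqlite3

# (id_str, id, value) rows from the question.
ROWS = [
    ("2380", 1, 21.01), ("20100", 2, 37.07), ("2380", 3, 22.04),
    ("20100", 4, 38.17), ("2380", 5, 22.65), ("20100", 6, 38.97),
    ("20103", 7, 57.98), ("2380", 8, 23.11), ("20103", 9, 60.83),
    ("2380", 10, 35.21),
]


def update_differences(rows):
    """Set difference on the largest-id row of each id_str group; return (id, difference)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table mirroring the question's columns.
    con.execute("""CREATE TABLE readings (
                       id_str TEXT, id INTEGER PRIMARY KEY,
                       value REAL, difference REAL DEFAULT 0)""")
    con.executemany("INSERT INTO readings (id_str, id, value) VALUES (?, ?, ?)", rows)
    con.execute("""
        UPDATE readings
        SET difference = ROUND(value - (
                -- value of the row with the second-largest id in the same group
                SELECT p.value FROM readings AS p
                WHERE p.id_str = readings.id_str AND p.id < readings.id
                ORDER BY p.id DESC LIMIT 1), 2)
        WHERE id IN (SELECT MAX(id) FROM readings GROUP BY id_str)
          -- skip groups that have only one row (no second-largest id)
          AND EXISTS (SELECT 1 FROM readings AS p
                      WHERE p.id_str = readings.id_str AND p.id < readings.id)
    """)
    result = con.execute("SELECT id, difference FROM readings ORDER BY id").fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in update_differences(ROWS):
        print(row)
```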

Count total flow inventory in and out php mysql

For the record, I'm not using a different table. I calculate using the same table, but with one more column called stock.
I have a record table:
Table A
=======================================================
**id** | **code** | **status** | **total** | **date** |
1 | B01 | IN | 500 |2013-01-15|
2 | B01 | OUT | 100 |2013-01-20|
3 | B01 | OUT | 200 |2013-02-01|
4 | B01 | IN | 300 |2013-02-05|
The output that I want using select mysql is like this:
Table A
==================================================================
**id** | **code** | **status** | **total** | **date** | **stock**
1 | B01 | IN | 500 |2013-01-15| 500
2 | B01 | OUT | 100 |2013-01-20| 400
3 | B01 | IN | 200 |2013-02-01| 600
4 | B01 | OUT | 300 |2013-02-05| 300
As you can see, I added the stock column to table A. So my question is: how can I achieve that using MySQL?
UPDATE
I've been saved by @Juergen D's answer, so I'm using his method:
select t.*, @stock := @stock + case when status = 'IN'
                                    then total
                                    else -total
                               end as stock
from your_table t
cross join (select @stock := 0) s
order by t.id
in case you have the same problem as me :)
select t.*, @stock := @stock + case when status = 'IN'
                                    then total
                                    else -total
                               end as stock
from your_table t
cross join (select @stock := 0) s
order by t.id
Use a CASE expression inside SUM to determine whether to add or subtract.
SELECT
t1.id,
t1.code,
t1.status,
t1.date,
SUM(
CASE
WHEN t2.status = 'IN'
THEN t2.total
WHEN t2.status = 'OUT'
THEN (t2.total * -1)
END
) stock
FROM your_table t1
JOIN your_table t2
ON t2.date <= t1.date
AND t2.code = t1.code
GROUP BY t1.id
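With window functions (MySQL 8.0+, or SQLite 3.25+ as in this sketch) the running stock can also be written as a windowed SUM, without a self-join or user variables. The snippet below runs the query through Python's sqlite3 module on the four input rows from the question. The table name movements is invented for the example, and I am assuming that rows should be accumulated per code in date order with id as a tie-breaker.

```python
import sqlite3

# (id, code, status, total, date) rows from the question's input table.
ROWS = [
    (1, "B01", "IN", 500, "2013-01-15"),
    (2, "B01", "OUT", 100, "2013-01-20"),
    (3, "B01", "OUT", 200, "2013-02-01"),
    (4, "B01", "IN", 300, "2013-02-05"),
]


def running_stock(rows):
    """Return (id, status, total, stock) with stock as the running balance per code."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table name; columns follow the question.
    con.execute("""CREATE TABLE movements (
                       id INTEGER PRIMARY KEY, code TEXT, status TEXT,
                       total INTEGER, date TEXT)""")
    con.executemany("INSERT INTO movements VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT id, status, total,
               -- IN adds to the balance, anything else subtracts;
               -- the window accumulates from the first row of the code to the current row
               SUM(CASE WHEN status = 'IN' THEN total ELSE -total END)
                   OVER (PARTITION BY code ORDER BY date, id) AS stock
        FROM movements
        ORDER BY code, date, id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in running_stock(ROWS):
        print(row)
```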

Calculate delta(difference of current and previous row) mysql group by specific column

I have a table like this (session is the name of the table, for example),
with columns: Id, sessionDate, user_id
What I need:
Delta should be a new calculated column
Id | sessionDate | user_id | Delta in days
------------------------------------------------------
1 | 2011-02-20 00:00:00 | 2 | NULL
2 | 2011-03-21 00:00:00 | 2 | NULL
3 | 2011-04-22 00:00:00 | 2 | NULL
4 | 2011-02-20 00:00:00 | 4 | NULL
5 | 2011-03-21 00:00:00 | 4 | NULL
6 | 2011-04-22 00:00:00 | 4 | NULL
Delta is the difference between the timestamps.
What I want is the delta (in days) between the previous row and the current row, grouped by user_id.
This should be the result:
Id | sessionDate | user_id | Delta in Days
------------------------------------------------------
1 | 2011-02-20 00:00:00 | 2 | NULL
2 | 2011-02-21 00:00:00 | 2 | 1
3 | 2011-02-22 00:00:00 | 2 | 1
4 | 2011-02-20 00:00:00 | 4 | NULL
5 | 2011-02-23 00:00:00 | 4 | 3
6 | 2011-02-25 00:00:00 | 4 | 2
I already have a solution for a specific user_id:
SELECT user_id, sessionDate,
abs(DATEDIFF((SELECT MAX(sessionDate) FROM session WHERE sessionDate < t.sessionDate and user_id = 1), sessionDate)) as Delta_in_days
FROM session AS t
WHERE t.user_id = 1 order by sessionDate asc
But I haven't found a solution for multiple user_ids.
Hope somebody can help me.
Try this:
drop table if exists a;
create table a( id integer not null primary key, d datetime, user_id integer );
insert into a values (1,now() + interval 0 day, 1 );
insert into a values (2,now() + interval 1 day, 1 );
insert into a values (3,now() + interval 2 day, 1 );
insert into a values (4,now() + interval 0 day, 2 );
insert into a values (5,now() + interval 1 day, 2 );
insert into a values (6,now() + interval 2 day, 2 );
select t1.user_id, t1.d, t2.d, datediff(t2.d,t1.d)
from a t1, a t2
where t1.user_id=t2.user_id
and t2.d = (select min(d) from a t3 where t1.user_id=t3.user_id and t3.d > t1.d)
Which means: join your table to itself on user_ids and adjacent datetime entries and compute the difference.
If id is really sequential (as in your sample data), the following should be quite efficient:
select t.id, t.sessionDate, t.user_id, datediff(t.sessionDate, tprev.sessionDate)
from session t left outer join
     session tprev
     on t.user_id = tprev.user_id and
        t.id = tprev.id + 1;
There is also another efficient method using variables. Something like this should work:
select t.id, t.sessionDate, t.user_id, datediff(sessionDate, prevsessiondate)
from (select t.*,
             if(@user_id = user_id, @prev, NULL) as prevsessiondate,
             @prev := sessionDate,
             @user_id := user_id
      from session t cross join
           (select @user_id := 0, @prev := 0) vars
      order by user_id, id
     ) t;
(There is a small issue with these queries where the variables in the select clause may not be evaluated in the order we expect them to. This is possible to fix, but it complicates the query and this will usually work.)
Although you have already chosen an answer, here is another way of achieving it:
SELECT
t1.Id,
t1.sessionDate,
t1.user_id,
TIMESTAMPDIFF(DAY,t2.sessionDate,t1.sessionDate) as delta
from myTable t1
left join myTable t2
on t1.user_id = t2.user_id
AND t2.Id = (
select max(Id) from myTable t3
where t1.Id > t3.Id AND t1.user_id = t3.user_id
);
DEMO
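For completeness, on servers with window functions the previous row per user can be fetched with LAG instead of a self-join or variables. The sketch below runs through Python's sqlite3 module on the dates from the expected result above. SQLite has no DATEDIFF, so the day difference is computed with julianday here; in MySQL 8.0+ the equivalent expression would presumably be DATEDIFF(sessionDate, LAG(sessionDate) OVER (...)). The function name session_deltas is made up for the example.

```python
import sqlite3

# (Id, sessionDate, user_id) rows taken from the expected result in the question.
ROWS = [
    (1, "2011-02-20 00:00:00", 2),
    (2, "2011-02-21 00:00:00", 2),
    (3, "2011-02-22 00:00:00", 2),
    (4, "2011-02-20 00:00:00", 4),
    (5, "2011-02-23 00:00:00", 4),
    (6, "2011-02-25 00:00:00", 4),
]


def session_deltas(rows):
    """Return (id, user_id, delta_days); delta is NULL for each user's first session."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE session (
                       id INTEGER PRIMARY KEY, sessionDate TEXT, user_id INTEGER)""")
    con.executemany("INSERT INTO session VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, user_id,
               -- LAG fetches the previous session of the same user;
               -- julianday turns both timestamps into day numbers
               CAST(ROUND(julianday(sessionDate)
                          - julianday(LAG(sessionDate) OVER (
                                PARTITION BY user_id ORDER BY sessionDate, id)))
                    AS INTEGER) AS delta_days
        FROM session
        ORDER BY user_id, sessionDate, id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in session_deltas(ROWS):
        print(row)
```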