I have a table that holds records with tasks, statuses, and the time each was triggered:
Table tblwork:
+-------------+------------+---------------------+-----+
| task | status | stime | id |
+-------------+------------+---------------------+-----+
| A | 1 | 2018-03-07 20:00:00 | 1 |
| A | 2 | 2018-03-07 20:30:00 | 2 |
| A | 1 | 2018-03-07 21:00:00 | 3 |
| A | 3 | 2018-03-07 21:30:00 | 4 |
| B | 1 | 2018-03-07 22:30:00 | 5 |
| B | 3 | 2018-03-07 23:30:00 | 6 |
+-------------+------------+---------------------+-----+
Status 1 means start, 2 - pause, 3 - end
Then I need to calculate how much time is spent for each task excluding pause (status = 2). This is how I do it:
SELECT t1.id, t1.task,
SUM(timestampdiff(second,IFNULL(
(SELECT MAX(t2.stime) FROM tblwork t2 WHERE t2.task='B' AND t2.stime< t1.stime) ,t1.stime),t1.stime)) myTimeDiffSeconds
FROM tblwork t1
WHERE t1.task='B' and (t1.status = 1 or t1.status = 3);
Now I want to get a table for all tasks:
SELECT t1.id, t1.task,
SUM(timestampdiff(second,IFNULL(
(SELECT MAX(t2.stime) FROM tblwork t2 WHERE t2.stime< t1.stime) ,t1.stime),t1.stime)) myTimeDiffSeconds
FROM tblwork t1
WHERE (t1.status = 1 or t1.status = 3) GROUP BY t1.task
I get this result:
+-------------+------------+---------------------+
| task | id | mytimedifference |
+-------------+------------+---------------------+
| A | 1 | 3600 |
| B | 3 | 2421217 |
+-------------+------------+---------------------+
The calculation for A is correct, but B is wrong: it should be 3600 seconds, and I don't understand why it isn't.
Assuming there is always a start for each pause and end, wouldn't something like this be more direct?
SELECT t.task
, SUM(TO_SECONDS(t.stime)
* CASE WHEN t.status IN (1) THEN -1
WHEN t.status IN (2, 3) THEN 1
ELSE 0
END
) AS totalTimeSecs
FROM tblwork AS t
GROUP BY t.task
I'm not quite sure offhand how big the values that come out of TO_SECONDS() are for current timestamps, but if they are an issue when being summed, the expression could be changed to
, SUM((TO_SECONDS(t.stime) - some_constant_just_before_or_at_your_earliest_seconds)
* CASE WHEN t.status IN (1) THEN -1
WHEN t.status IN (2, 3) THEN 1
ELSE 0
END
) AS totalTimeSecs
You can detect "abnormal" data by adding the following to the select expression list
, CASE WHEN SUM(CASE
WHEN t.status IN (1) THEN -1
WHEN t.status IN (2, 3) THEN 1
ELSE 0 END
) = 0
THEN 'OK'
ELSE 'ABNORMAL'
END AS integrityCheck
Note: any "unclosed" intervals will be marked as abnormal; without much more complicated and expensive start and end checking for intervals to differentiate "open" from "invalid", it's probably the best that can be done. The sum used for the additional "integrityCheck" equaling -1 might hint at an open-ended interval, but could also indicate an erroneous double start.
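As a minimal sketch of the signed-sum idea above, the following Python script loads the sample rows into an in-memory SQLite database and runs the same CASE-weighted sum together with the integrity check. SQLite is only a stand-in here (an assumption, not the asker's MySQL setup), so strftime('%s', ...) replaces TO_SECONDS(); the CASE logic itself is unchanged.

```python
import sqlite3

# Sample rows from the question: (task, status, stime, id)
rows = [
    ("A", 1, "2018-03-07 20:00:00", 1),
    ("A", 2, "2018-03-07 20:30:00", 2),
    ("A", 1, "2018-03-07 21:00:00", 3),
    ("A", 3, "2018-03-07 21:30:00", 4),
    ("B", 1, "2018-03-07 22:30:00", 5),
    ("B", 3, "2018-03-07 23:30:00", 6),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblwork (task TEXT, status INTEGER, stime TEXT, id INTEGER)"
)
conn.executemany("INSERT INTO tblwork VALUES (?, ?, ?, ?)", rows)

# Starts count negatively, pauses and ends count positively, so each
# start/stop pair contributes (stop - start) seconds to the sum.
# strftime('%s', ...) is the SQLite substitute for MySQL's TO_SECONDS().
query = """
SELECT t.task,
       SUM(CAST(strftime('%s', t.stime) AS INTEGER)
           * CASE WHEN t.status IN (1)    THEN -1
                  WHEN t.status IN (2, 3) THEN  1
                  ELSE 0
             END) AS totalTimeSecs,
       CASE WHEN SUM(CASE WHEN t.status IN (1)    THEN -1
                          WHEN t.status IN (2, 3) THEN  1
                          ELSE 0 END) = 0
            THEN 'OK' ELSE 'ABNORMAL'
       END AS integrityCheck
FROM tblwork AS t
GROUP BY t.task
ORDER BY t.task
"""
totals = conn.execute(query).fetchall()
print(totals)  # [('A', 3600, 'OK'), ('B', 3600, 'OK')]
```

Because the sum is grouped by task, no row from task A can leak into the total for task B.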
Related
I'm trying to use MySQL to group based on two columns, choose the most recent date, and create two binary columns that record status. Here's an example table:
_______________________________________________________________________
Letters | Numbers| dates | score | random_status |
_______________________________________________________________________
A | 2 | 2021-09-29 0:00:00 | 0.3 | Sent |
A | 2 | 2021-10-01 0:00:00 | 1.4 | Received |
A | 5 | 2021-10-04 0:00:00 | 0.8 | Sent |
A | 7 | 2021-10-20 0:00:00 | 0.9 | Sent |
A | 7 | 2021-10-20 0:20:00 | 0.5 | Sent |
R | 7 | 2021-09-09 0:20:54 | 0.2 | Sent |
R | 7 | 2021-10-14 0:00:00 | 2.5 | Received |
R | 2 | 2021-10-07 0:00:00 | 0.7 | Received |
R | 2 | 2021-09-14 0:00:00 | 1.7 | Sent |
C | 5 | 2021-10-07 0:00:00 | 2.1 | Sent |
C | 5 | 2021-10-25 0:00:00 | 3.5 | Sent |
C | 7 | 2021-08-18 0:00:00 | 1.9 | Sent |
C | 7 | 2021-08-29 0:00:00 | 0.6 | Received |
C | 2 | 2021-02-01 0:00:00 | 1.8 | Sent |
I want to group based on the Letters and Numbers columns, and I want the latest date together with the latest score. I also want to create two new columns based on the status column that say whether a letter and number combination was ever in Sent or Received status.
Something that looks like this:
Letters| Numbers | latest_date |latest_score| Has_sent| Has_received|
A | 2 |2021-10-01 0:00:00 | 1.4 | 1 | 1 |
A | 5 |2021-10-04 0:00:00 | 0.8 | 1 | 0 |
A | 7 |2021-10-20 0:20:00 | 0.5 | 1 | 0 |
C | 2 |2021-02-01 0:00:00 | 1.8 | 1 | 0 |
C | 5 |2021-10-25 0:00:00 | 3.5 | 1 | 0 |
C | 7 |2021-08-29 0:00:00 | 0.6 | 1 | 1 |
R | 2 |2021-10-07 0:00:00 | 0.7 | 1 | 1 |
R | 7 |2021-10-14 0:00:00 | 2.5 | 1 | 1 |
I used the following query
SELECT t1.Letters, t1.Numbers, MAX(t1.dates) as latest_date, t1.score as latest_score,
case when random_status = "Sent" then 1 else 0 end AS Has_sent,
case when random_status = "Received" then 1 else 0 end AS Has_received
FROM dummy_data t1
WHERE
t1.dates IN (SELECT MAX(t2.dates) FROM dummy_data t2
WHERE t1.Letters = t2.Letters AND t1.Numbers = t2.Numbers)
GROUP BY t1.Letters, t1.Numbers;
The last two columns, Has_sent and Has_received, are not showing as expected. Instead, I get values based only on the row with the max date. Is it possible to make them binary flags that show whether that status ever existed for each Letters and Numbers combination?
Try:
select tbl1.Letters,
tbl1.Numbers,
tbl1.latest_date,
tbl1.latest_score score,
tbl2.Has_sent,
tbl2.Has_received
from (
select Letters,
Numbers,
max(dates) as `latest_date`,
score as `latest_score`
from dummy_data
where dates in ( select max(dates)
from dummy_data
group by Letters, Numbers )
group by Letters, Numbers
) as tbl1
inner join
(
select Letters,
Numbers ,
max(case when random_status = "Sent" then 1 else 0 end) AS Has_sent,
max(case when random_status = "Received" then 1 else 0 end) AS Has_received
from dummy_data
group by Letters, Numbers
) as tbl2 on tbl1.Letters=tbl2.Letters and tbl1.Numbers=tbl2.Numbers;
Demo: https://www.db-fiddle.com/f/usu3XK7Gn8gGqQnusmCiLk/4
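The following is a hedged sketch of the same conditional-aggregation idea, run against an in-memory SQLite database from Python rather than MySQL (an assumption made only so the example is self-contained). It is a variation on the query above, not a copy of it: the flags and the latest date are computed in one grouped subquery, and the score is fetched by joining back on that latest date. Only a subset of the sample rows is loaded, with zero-padded hours so that string comparison orders the dates correctly.

```python
import sqlite3

# Subset of the sample data: (Letters, Numbers, dates, score, random_status)
rows = [
    ("A", 2, "2021-09-29 00:00:00", 0.3, "Sent"),
    ("A", 2, "2021-10-01 00:00:00", 1.4, "Received"),
    ("A", 5, "2021-10-04 00:00:00", 0.8, "Sent"),
    ("A", 7, "2021-10-20 00:00:00", 0.9, "Sent"),
    ("A", 7, "2021-10-20 00:20:00", 0.5, "Sent"),
    ("C", 7, "2021-08-18 00:00:00", 1.9, "Sent"),
    ("C", 7, "2021-08-29 00:00:00", 0.6, "Received"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dummy_data (Letters TEXT, Numbers INTEGER, dates TEXT, "
    "score REAL, random_status TEXT)"
)
conn.executemany("INSERT INTO dummy_data VALUES (?, ?, ?, ?, ?)", rows)

# MAX(CASE ...) yields 1 if any row in the group has that status.
# Joining back on the latest date picks the score of the newest row.
query = """
SELECT agg.Letters, agg.Numbers, agg.latest_date,
       d.score AS latest_score, agg.Has_sent, agg.Has_received
FROM (
    SELECT Letters, Numbers,
           MAX(dates) AS latest_date,
           MAX(CASE WHEN random_status = 'Sent'     THEN 1 ELSE 0 END) AS Has_sent,
           MAX(CASE WHEN random_status = 'Received' THEN 1 ELSE 0 END) AS Has_received
    FROM dummy_data
    GROUP BY Letters, Numbers
) AS agg
JOIN dummy_data AS d
  ON d.Letters = agg.Letters
 AND d.Numbers = agg.Numbers
 AND d.dates   = agg.latest_date
ORDER BY agg.Letters, agg.Numbers
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For the pair A/2 this prints the 2021-10-01 row with score 1.4 and both flags set to 1, matching the expected output in the question.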
I've never used case, but from what I'm reading it appears to operate row-wise? If so, it won't work as written in your query, because the group by collapses each group to a single row.
SELECT
t1.Letters,
t1.Numbers,
MAX(t1.dates) AS latest_date,
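(
SELECT d.score
FROM dummy_data AS d
WHERE
d.Letters = t1.Letters AND
d.Numbers = t1.Numbers
ORDER BY d.dates DESC
LIMIT 1
) AS latest_score, /* LAST_VALUE() is a window function and needs OVER (...), so the newest row is picked with a subquery */
EXISTS(
SELECT 1
FROM dummy_data
WHERE
Letters = t1.Letters AND
Numbers = t1.Numbers AND
random_status = "Sent"
) AS has_sent,
EXISTS(
SELECT 1
FROM dummy_data
WHERE
Letters = t1.Letters AND
Numbers = t1.Numbers AND
random_status = "Received"
) AS has_received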
FROM dummy_data AS t1
GROUP BY t1.Letters, t1.Numbers
;
If your tables are large, as many time series are, those subqueries won't be sustainable. From the business logic I see in your example tables:
A Letter/Number pair must have a sent status.
A Letter/Number pair can only have a received status if it has a historical sent status.
A Letter/Number pair can have no more than one sent row and one received row. (unique constraint on Letters,Numbers,random_status)
Here is an alternative assuming those points.
SELECT
d_sent.Letters AS Letters,
d_sent.Numbers AS Numbers,
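GREATEST(d_sent.dates, COALESCE(d_rcvd.dates, d_sent.dates)) AS latest_date, /* MAX() takes one argument; GREATEST() returns NULL if any argument is NULL */
(
CASE
WHEN isnull(d_rcvd.score) THEN d_sent.score
ELSE d_rcvd.score
END
) AS latest_score,
1 AS is_sent,
cast(d_rcvd.score IS NOT NULL AS SIGNED INTEGER) AS is_received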
FROM (
SELECT * FROM dummy_data WHERE random_status="Sent"
) AS d_sent
LEFT JOIN (
SELECT * FROM dummy_data WHERE random_status="Received"
) AS d_rcvd
ON
d_sent.Letters = d_rcvd.Letters AND
d_sent.Numbers = d_rcvd.Numbers
;
I'm sure there are some syntactical tweaks that must be made. Let me know how it goes.
Checking if left join is null
MySQL case function
Edit: It appears case is indeed row-wise.
I have a logs table in which each row records a user's opened (1) or closed (2) status. My problem is that I need to get the user and registration_id that have an open status but no closed status.
Logs Table
id | registration_id | user_id | status | created_at
1 | 1 | 1 | 1 | 2021-02-22 8:00:00
2 | 1 | 1 | 2 | 2021-02-22 8:30:00
3 | 2 | 1 | 1 | 2021-02-22 8:30:00
4 | 2 | 1 | 2 | 2021-02-22 9:00:00
5 | 3 | 1 | 1 | 2021-02-22 9:00:00
6 | 4 | 2 | 1 | 2021-02-22 8:00:00
7 | 4 | 2 | 2 | 2021-02-22 8:30:00
Expected Output
id | registration_id | user_id | status | created_at
5 | 3 | 1 | 1 | 2021-02-22 9:00:00
This is because registration_id = 3 with user_id = 1 doesn't have a closed status. Also, there are a lot of logs between open and closed; I just simplified it in my question. So if you're planning to simply check whether the count per registration_id equals 1, that doesn't work.
What I've tried is subtracting the open created_at from the closed created_at; if the total is less than or equal to 0, there is no closed status. I know there must be a better way to get what I want, because my current query is very slow.
SELECT
user_id,
registration_id,
date,
SUM(timestampdiff(minute, openTime, closedTime)) AS total
FROM (
SELECT
user_id,
date(created_at) as `date`,
created_at as openTime,
registration_id,
coalesce(
(
SELECT created_at FROM logs t2
WHERE t1.registration_id = t2.registration_id
AND t1.created_at < t2.created_at
AND t1.user_id = t2.user_id
AND status = 2
ORDER BY t1.created_at
LIMIT 1
),
date_add(t1.created_at, interval -1 minute)
) AS closedTime
FROM logs t1
WHERE status = 1
) a
GROUP BY a.user_id, registration_id
HAVING total <= 0;
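One possible approach, offered only as a sketch and not taken from the thread, is to select the open rows for which no closed row exists for the same user and registration (an anti-join with NOT EXISTS). The script below tries it on the sample rows using Python's sqlite3 module as a stand-in for MySQL; the NOT EXISTS syntax is the same in both. Whether it is faster on the real table depends on indexing, which is not shown in the question.

```python
import sqlite3

# Sample rows: (id, registration_id, user_id, status, created_at)
rows = [
    (1, 1, 1, 1, "2021-02-22 08:00:00"),
    (2, 1, 1, 2, "2021-02-22 08:30:00"),
    (3, 2, 1, 1, "2021-02-22 08:30:00"),
    (4, 2, 1, 2, "2021-02-22 09:00:00"),
    (5, 3, 1, 1, "2021-02-22 09:00:00"),
    (6, 4, 2, 1, "2021-02-22 08:00:00"),
    (7, 4, 2, 2, "2021-02-22 08:30:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER, registration_id INTEGER, user_id INTEGER, "
    "status INTEGER, created_at TEXT)"
)
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)

# Keep each "open" row only if no "closed" row exists for the same
# user_id and registration_id.
query = """
SELECT o.id, o.registration_id, o.user_id, o.status, o.created_at
FROM logs AS o
WHERE o.status = 1
  AND NOT EXISTS (
      SELECT 1
      FROM logs AS c
      WHERE c.registration_id = o.registration_id
        AND c.user_id = o.user_id
        AND c.status = 2
  )
"""
open_only = conn.execute(query).fetchall()
print(open_only)  # [(5, 3, 1, 1, '2021-02-22 09:00:00')]
```

Other statuses logged between open and closed do not affect the result, because only rows with status 1 and status 2 are inspected.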
I have a kind of log table. It holds records with tasks, statuses, and the time each was triggered:
Table tblwork:
+-------------+------------+---------------------+-----+
| task | status | stime | id |
+-------------+------------+---------------------+-----+
| A | 1 | 2018-03-07 20:00:00 | 1 |
| A | 2 | 2018-03-07 20:30:00 | 2 |
| A | 1 | 2018-03-07 21:00:00 | 3 |
| A | 3 | 2018-03-07 21:30:00 | 4 |
+-------------+------------+---------------------+-----+
Status 1 means start, 2 - pause, 3 - end.
So far I have tried something like this:
SELECT x1.stime, SUM(TIMEDIFF(x2.stime, x1.stime))
FROM tblwork AS x1
LEFT JOIN tblwork AS x2
ON x1.id = x2.id + 1
WHERE x1.`status` = 1 OR x1.`status` = 3
But this gave the result -6.000?!
I need to calculate the total time spent on a task, excluding pauses, so the final result should be 01:00:00. Is it possible to do it this way, or should I change the table and the logic?
UPDATE: SOLUTION
I think I found the right way to do exactly what I want:
SELECT id, stime,
SUM(TIMESTAMPDIFF(SECOND,
(SELECT MAX(stime) FROM tblwork WHERE stime < t.stime),
stime
)) AS TotalTime
FROM tblwork as t
where (t.status = 1 or t.status = 3)
Looking at your data, you should select only the rows with x1.status = 1 and left join each of them to the row that follows it, where x2.status is 2 or 3:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(x2.stime, x1.stime)))) AS total_time
FROM tblwork AS x1
LEFT JOIN tblwork AS x2 ON x2.id = x1.id + 1
AND (x2.status = 2 OR x2.status = 3)
WHERE x1.`status` = 1
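As a rough check of the pairing idea, the sketch below joins each start row to the row with the next id (x2.id = x1.id + 1) and sums the differences in seconds. It uses Python's sqlite3 module as a stand-in for MySQL, so strftime('%s', ...) replaces the MySQL time functions, and it assumes that ids are consecutive within a task, as they are in the sample data.

```python
import sqlite3

# Sample rows: (task, status, stime, id)
rows = [
    ("A", 1, "2018-03-07 20:00:00", 1),
    ("A", 2, "2018-03-07 20:30:00", 2),
    ("A", 1, "2018-03-07 21:00:00", 3),
    ("A", 3, "2018-03-07 21:30:00", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblwork (task TEXT, status INTEGER, stime TEXT, id INTEGER)"
)
conn.executemany("INSERT INTO tblwork VALUES (?, ?, ?, ?)", rows)

# Each start row (status 1) is paired with the following row, which must
# be a pause (2) or an end (3); the working time is the gap between them.
query = """
SELECT SUM(CAST(strftime('%s', x2.stime) AS INTEGER)
         - CAST(strftime('%s', x1.stime) AS INTEGER)) AS total_secs
FROM tblwork AS x1
JOIN tblwork AS x2
  ON x2.id = x1.id + 1
 AND x2.status IN (2, 3)
WHERE x1.status = 1
"""
total_secs = conn.execute(query).fetchone()[0]

# Format the seconds as HH:MM:SS for comparison with the expected result.
hours, remainder = divmod(total_secs, 3600)
minutes, seconds = divmod(remainder, 60)
formatted = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
print(total_secs, formatted)  # 3600 01:00:00
```

The two pairs (20:00 to 20:30 and 21:00 to 21:30) contribute 1800 seconds each, which gives the 01:00:00 the question asks for.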
I have a record table, and I need to compute statistics for every month.
Here is a test table:
+----+------+----------+----------+------+
| id | name | grade1 | grade2 | time |
+----+------+----------+----------+------+
| 1 | a | 1 | 1 | 1 |
| 2 | a | 0 | 1 | 1 |
| 3 | a | 1 | 2 | 2 |
| 4 | b | 1 | 2 | 2 |
| 5 | a | 1 | 1 | 2 |
+----+------+----------+----------+------+
5 rows in set (0.01 sec)
The time column means the month (the actual column is a timestamp).
For every month, I need to count the rows where grade1 >= 1 and the rows where grade2 >= 1.
So I want to get a result like this:
+----+------+----------+----------+----------+----------+------+
| id | name | grade1_m1| grade2_m1| grade1_m2| grade2_m2| time |
+----+------+----------+----------+----------+----------+------+
| 13 | a | 1 | 2 | null | null | 1 |
| 14 | a | null | null | 2 | 2 | 2 |
| 15 | b | null | null | 1 | 1 | 2 |
+----+------+----------+----------+----------+----------+------+
3 rows in set (0.00 sec)
Pseudo-SQL for what I want looks like this:
select
count(grade1 where time=1 and grade1 >= 1) as grade1_m1,
count(grade2 where time=1 and grade2 >= 1) as grade2_m1,
count(grade1 where time=2 and grade1 >= 1) as grade1_m2,
count(grade2 where time=2 and grade2 >= 1) as grade2_m2,
-- ... 12 months' statistics
from test
group by name
In fact, I got it working, but with temporary tables, as follows:
select
count(if(m1.grade1>=1, 1, null)) as grade1_m1,
count(if(m1.grade2>=1, 1, null)) as grade2_m1,
count(if(m2.grade1>=1, 1, null)) as grade1_m2,
count(if(m2.grade2>=1, 1, null)) as grade2_m2,
-- ...
from test
left join
(select * from test where time = 1) as m1
on m1.id = test.id
left join
(select * from test where time = 2) as m2
on m2.id = test.id
-- ...
group by name
But this SQL is far too long. The test table is just a simplified version; in the real situation, I printed my SQL and it took up two screens in Chrome. So I am looking for a simpler way to do it.
Your original version is almost there. You need case, and sum() is more appropriate:
select name,
sum(case when time=1 and grade1 >= 1 then grade1 end) as grade1_m1,
sum(case when time=1 and grade2 >= 1 then grade2 end) as grade2_m1,
sum(case when time=2 and grade1 >= 1 then grade1 end) as grade1_m2,
sum(case when time=2 and grade2 >= 1 then grade2 end) as grade2_m2,
-- ... 12 months' statistics
from test
group by name
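The sketch below tries this conditional-aggregation pattern on the test table, using Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example runs on its own). One detail differs from the query above and is a deliberate choice here: the CASE returns 1 rather than the grade value, so the sum counts qualifying rows, which appears to be what the expected output in the question shows. Summing the grade itself would give 2 instead of 1 for b's grade2 in month 2.

```python
import sqlite3

# Test table from the question: (id, name, grade1, grade2, time)
rows = [
    (1, "a", 1, 1, 1),
    (2, "a", 0, 1, 1),
    (3, "a", 1, 2, 2),
    (4, "b", 1, 2, 2),
    (5, "a", 1, 1, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, name TEXT, grade1 INTEGER, "
    "grade2 INTEGER, time INTEGER)"
)
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", rows)

# One CASE per month and grade column; rows that do not match yield NULL
# and are ignored by SUM, so a month with no matching rows stays NULL.
query = """
SELECT name,
       SUM(CASE WHEN time = 1 AND grade1 >= 1 THEN 1 END) AS grade1_m1,
       SUM(CASE WHEN time = 1 AND grade2 >= 1 THEN 1 END) AS grade2_m1,
       SUM(CASE WHEN time = 2 AND grade1 >= 1 THEN 1 END) AS grade1_m2,
       SUM(CASE WHEN time = 2 AND grade2 >= 1 THEN 1 END) AS grade2_m2
FROM test
GROUP BY name
ORDER BY name
"""
pivot = conn.execute(query).fetchall()
print(pivot)  # [('a', 1, 2, 2, 2), ('b', None, None, 1, 1)]
```

Unlike the expected output in the question, this returns one row per name, with all months side by side.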
I have two tables. Here is the sqlfiddle: http://sqlfiddle.com/#!9/5a51734/5
1) table:data_aoc
| # | aoc_id | aoc_name | aoc_type | client_id |
|---|--------|----------|----------|-----------|
| 1 | MA01   | sensor_1 | 4        | 1         |
| 2 | MA02   | sensor_2 | 15       | 1         |
2) table:data_log
| # | log_id | log_aoc_id | trans_type | trans_value | trans_date |
|---|--------|------------|------------|-------------|------------|
| 1 | x1a1   | MA01       | 0          | 90          | 2017-10-20 |
| 2 | afaf   | MA01       | 0          | 90          | 2017-10-21 |
| 3 | va12   | MA02       | 0          | 10          | 2017-10-21 |
| 4 | gag2   | MA02       | 0          | 10          | 2017-11-25 |
Expected Result
Total value for MA02 should be 10 but it shows 20
My query is as follows:
SELECT
(CASE
WHEN a.aoc_type IN ('4')
THEN IFNULL((SUM(b.trans_value * case b.trans_type when '0' then -1 else 1 end)),0)
WHEN a.aoc_type IN ('15')
THEN IFNULL((SUM(b.trans_value * case when b.trans_type='0' AND DATE(b.trans_date) <= DATE(NOW()) then -1 else 1 end)),0)
END) as total
FROM data_aoc a
LEFT JOIN data_log b ON b.log_aoc_id = a.aoc_id
WHERE a.client_id='1'
GROUP BY a.aoc_name
ORDER BY a.aoc_id asc
I am expecting that when aoc_type is 15, it will sum only the values that satisfy the date condition
DATE(b.trans_date) <= DATE(NOW())
but when I execute the query, the result does not respect the date condition. (Some timestamps are generated in advance, beyond the NOW() date and time.)
The desired result should be:
| Total |
|-------|
| -180 |
| 10 |
But i get
| Total |
|-------|
| -180 |
| 0 | <----- it should be 10 because of the date condition i put
thank you!
As a follow-up to the same findings from Don, and to your clarification that rows after the current date should not be counted, I came up with this query. It checks the date first and, if the date is later than today, multiplies by zero; otherwise it applies the +/- 1 factor.
SELECT
SUM( b.trans_value *
CASE when ( a.aoc_type = '15'
AND b.trans_type = '0'
AND DATE(b.trans_date) > DATE(NOW()) )
then 0
when ( a.aoc_type = '4'
AND b.trans_type = '0' )
OR ( a.aoc_type = '15'
AND b.trans_type = '0'
AND DATE(b.trans_date) <= DATE(NOW()) )
then -1 else 1 end ) as total
FROM
data_aoc a
LEFT JOIN data_log b
ON a.aoc_id = b.log_aoc_id
WHERE
a.client_id='1'
GROUP BY
a.aoc_name
ORDER BY
a.aoc_id asc
Also posted on SQL Fiddle
It seems to be working exactly as it should.
With the date clause I get:
Sensor 1 = -180
Sensor 2 = 0
If you break down the summing you get two rows to be summed for sensor #2
10 on 10-21 (before the date restriction so it's multiplied by -1)
and
10 on 11-25 (after the date restriction so it's multiplied by 1)
10 * -1 + 10 * 1 = 0
The sensor #2 reading is correctly 0.
I do not understand why you think it should be anything else.
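The breakdown above can be reproduced with a small script. This is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL, and it replaces NOW() with a fixed reference date of 2017-11-01, which is an assumption chosen to fall between the two sensor 2 readings, as the breakdown implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_aoc (aoc_id TEXT, aoc_name TEXT, aoc_type TEXT, client_id TEXT)"
)
conn.execute(
    "CREATE TABLE data_log (log_id TEXT, log_aoc_id TEXT, trans_type TEXT, "
    "trans_value INTEGER, trans_date TEXT)"
)
conn.executemany(
    "INSERT INTO data_aoc VALUES (?, ?, ?, ?)",
    [("MA01", "sensor_1", "4", "1"), ("MA02", "sensor_2", "15", "1")],
)
conn.executemany(
    "INSERT INTO data_log VALUES (?, ?, ?, ?, ?)",
    [
        ("x1a1", "MA01", "0", 90, "2017-10-20"),
        ("afaf", "MA01", "0", 90, "2017-10-21"),
        ("va12", "MA02", "0", 10, "2017-10-21"),
        ("gag2", "MA02", "0", 10, "2017-11-25"),
    ],
)

# Assumed stand-in for NOW(): a date between the two sensor 2 readings.
today = "2017-11-01"

# Same CASE logic as the asker's query: for type 15, rows on or before
# "today" get factor -1, and every other row falls through to ELSE 1.
query = """
SELECT a.aoc_name,
       CASE
         WHEN a.aoc_type IN ('4') THEN
           IFNULL(SUM(b.trans_value *
                      CASE b.trans_type WHEN '0' THEN -1 ELSE 1 END), 0)
         WHEN a.aoc_type IN ('15') THEN
           IFNULL(SUM(b.trans_value *
                      CASE WHEN b.trans_type = '0'
                            AND DATE(b.trans_date) <= DATE(?)
                           THEN -1 ELSE 1 END), 0)
       END AS total
FROM data_aoc a
LEFT JOIN data_log b ON b.log_aoc_id = a.aoc_id
WHERE a.client_id = '1'
GROUP BY a.aoc_id, a.aoc_name, a.aoc_type
ORDER BY a.aoc_id
"""
totals = conn.execute(query, (today,)).fetchall()
print(totals)  # [('sensor_1', -180), ('sensor_2', 0)]
```

The later reading is not excluded by the date test; it simply lands in the ELSE branch and is added with factor 1, which cancels the earlier reading.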