MySQL query to select min datetime grouped by 30 day intervals

Here's some sample data:
CREATE TABLE `customer` (
`approve_datetime` datetime DEFAULT NULL,
`created_date` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `customer` (`approve_datetime`, `created_date`)
VALUES
('2015-08-20 04:43:00','2015-08-20'),
(NULL,'2015-09-03'),
('2015-09-17 02:17:00','2015-09-17'),
(NULL,'2015-09-29'),
('2015-09-29 12:44:00','2015-09-29'),
('2015-10-08 03:09:00','2015-10-08'),
('2016-01-20 08:59:00','2016-01-19'),
('2016-05-03 09:38:00','2016-05-02'),
('2016-07-15 11:06:00','2016-07-15'),
(NULL,'2016-08-30'),
('2016-10-18 12:55:00','2016-10-18'),
(NULL,'2017-01-08'),
(NULL,'2017-02-02'),
('2017-02-13 02:58:00','2017-02-13');
Here is my current query, which doesn't handle the 30-day groupings correctly:
SELECT a.*
FROM customer a
WHERE a.approve_datetime IN (
SELECT MIN(b.approve_datetime)
FROM customer b
WHERE b.created_date BETWEEN a.created_date
AND DATE_ADD(a.created_date, INTERVAL 30 DAY)
)
Which gives me the following.
+---------------------+--------------+
| approve_datetime | created_date |
+---------------------+--------------+
| 2015-08-20 04:43:00 | 2015-08-20 |
| 2015-09-17 02:17:00 | 2015-09-17 |
| 2015-09-29 12:44:00 | 2015-09-29 |
| 2015-10-08 03:09:00 | 2015-10-08 |
| 2016-01-20 08:59:00 | 2016-01-19 |
| 2016-05-03 09:38:00 | 2016-05-02 |
| 2016-07-15 11:06:00 | 2016-07-15 |
| 2016-10-18 12:55:00 | 2016-10-18 |
| 2017-02-13 02:58:00 | 2017-02-13 |
+---------------------+--------------+
Can the query be altered to achieve the following results?
+---------------------+--------------+
| approve_datetime | created_date |
+---------------------+--------------+
| 2015-08-20 04:43:00 | 2015-08-20 |
| 2015-09-29 12:44:00 | 2015-09-29 |
| 2016-01-20 08:59:00 | 2016-01-19 |
| 2016-05-03 09:38:00 | 2016-05-02 |
| 2016-07-15 11:06:00 | 2016-07-15 |
| 2016-10-18 12:55:00 | 2016-10-18 |
| 2017-02-13 02:58:00 | 2017-02-13 |
+---------------------+--------------+
Notice that the records with created_date 2015-09-17 and 2015-10-08 have been removed because each falls within 30 days of the previous kept record, which is the minimum date for its group. For example, 2015-08-20 + 30 days defines the first group, with 2015-08-20 being the min date for that group.
I hope what I'm trying to achieve makes sense.
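The grouping described above can be sketched outside SQL as a single pass over the rows. This is a minimal Python sketch, not a MySQL solution. It assumes that rows with a NULL approve_datetime are ignored, that a window covers its start date plus 30 days inclusive (as BETWEEN does), and that the earliest created_date in a window is the row to keep. The function name `first_per_30_day_window` is made up for illustration.

```python
from datetime import date, datetime, timedelta

# (approve_datetime, created_date) pairs copied from the sample data.
RAW = [
    ("2015-08-20 04:43:00", "2015-08-20"), (None, "2015-09-03"),
    ("2015-09-17 02:17:00", "2015-09-17"), (None, "2015-09-29"),
    ("2015-09-29 12:44:00", "2015-09-29"), ("2015-10-08 03:09:00", "2015-10-08"),
    ("2016-01-20 08:59:00", "2016-01-19"), ("2016-05-03 09:38:00", "2016-05-02"),
    ("2016-07-15 11:06:00", "2016-07-15"), (None, "2016-08-30"),
    ("2016-10-18 12:55:00", "2016-10-18"), (None, "2017-01-08"),
    (None, "2017-02-02"), ("2017-02-13 02:58:00", "2017-02-13"),
]


def first_per_30_day_window(raw, days=30):
    """Keep the first approved row of each window.

    A new window starts at the first row that is not covered by the
    previous window (assumption: NULL approvals never start a window).
    """
    approved = sorted(
        (date.fromisoformat(created), datetime.fromisoformat(approve))
        for approve, created in raw
        if approve is not None
    )
    kept = []
    window_end = None
    for created, approve in approved:
        if window_end is None or created > window_end:
            kept.append((approve, created))
            window_end = created + timedelta(days=days)
    return kept


result = first_per_30_day_window(RAW)
for approve, created in result:
    print(approve, created)
```

On the sample data this keeps the seven rows listed in the desired result. The point of the sketch is that each window is anchored at the previous kept row, so the groups cannot be computed from fixed calendar offsets alone.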

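Take a look at this. The result differs from yours, so check whether it is correct. It groups the rows into fixed 30-day buckets counted from the earliest created_date; columns 3 and 4 are only there to show how the bucketing works.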
SELECT
min(b.approve_datetime) AS approve_datetime
, min(b.created_date) AS created_date
, DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 AS dayd30
, FLOOR( DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 ) AS dayd30floorint
FROM customer b
GROUP BY FLOOR( DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 )
ORDER BY b.created_date ;
Sample:
MariaDB [testdb]> SELECT
-> min(b.approve_datetime) AS approve_datetime
-> , min(b.created_date) AS created_date
-> , DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 AS dayd30
-> , FLOOR( DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 ) AS dayd30floorint
-> FROM customer b
-> GROUP BY FLOOR( DATEDIFF(b.created_date,(SELECT min(created_date) FROM customer)) / 30 )
-> ORDER BY b.created_date ;
+---------------------+--------------+---------+----------------+
| approve_datetime | created_date | dayd30 | dayd30floorint |
+---------------------+--------------+---------+----------------+
| 2015-08-20 04:43:00 | 2015-08-20 | 0.0000 | 0 |
| 2015-09-29 12:44:00 | 2015-09-29 | 1.3333 | 1 |
| 2016-01-20 08:59:00 | 2016-01-19 | 5.0667 | 5 |
| 2016-05-03 09:38:00 | 2016-05-02 | 8.5333 | 8 |
| 2016-07-15 11:06:00 | 2016-07-15 | 11.0000 | 11 |
| NULL | 2016-08-30 | 12.5333 | 12 |
| 2016-10-18 12:55:00 | 2016-10-18 | 14.1667 | 14 |
| NULL | 2017-01-08 | 16.9000 | 16 |
| NULL | 2017-02-02 | 17.7333 | 17 |
| 2017-02-13 02:58:00 | 2017-02-13 | 18.1000 | 18 |
+---------------------+--------------+---------+----------------+
10 rows in set (0.00 sec)
MariaDB [testdb]>
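To see why this answer returns ten rows rather than the seven requested, here is a small Python sketch of the same fixed-bucket arithmetic: days since the earliest created_date, integer-divided by 30. It is only an illustration of the bucketing, with the rows copied from the question, and the dictionary layout is an arbitrary choice.

```python
from datetime import date, datetime

# (approve_datetime, created_date) pairs copied from the sample data.
RAW = [
    ("2015-08-20 04:43:00", "2015-08-20"), (None, "2015-09-03"),
    ("2015-09-17 02:17:00", "2015-09-17"), (None, "2015-09-29"),
    ("2015-09-29 12:44:00", "2015-09-29"), ("2015-10-08 03:09:00", "2015-10-08"),
    ("2016-01-20 08:59:00", "2016-01-19"), ("2016-05-03 09:38:00", "2016-05-02"),
    ("2016-07-15 11:06:00", "2016-07-15"), (None, "2016-08-30"),
    ("2016-10-18 12:55:00", "2016-10-18"), (None, "2017-01-08"),
    (None, "2017-02-02"), ("2017-02-13 02:58:00", "2017-02-13"),
]

rows = [
    (datetime.fromisoformat(a) if a else None, date.fromisoformat(c))
    for a, c in RAW
]
origin = min(created for _, created in rows)  # SELECT min(created_date)

buckets = {}
for approve, created in rows:
    key = (created - origin).days // 30  # FLOOR(DATEDIFF(...) / 30)
    group = buckets.setdefault(key, {"approve": None, "created": created})
    group["created"] = min(group["created"], created)
    # MIN() skips NULLs, so only real datetimes compete here.
    if approve is not None and (group["approve"] is None or approve < group["approve"]):
        group["approve"] = approve

for key in sorted(buckets):
    print(key, buckets[key]["approve"], buckets[key]["created"])
```

The bucket boundaries are fixed multiples of 30 days from 2015-08-20, and buckets that contain only NULL approvals still produce a row. Both points differ from the rolling windows the question describes.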

Related

Calculate the amount of time for each status

I have the following table below.
The timeStamp is the moment that the status began.
Some rows don't add new information because the status did not change (like the second row), and they can be ignored.
I would like to calculate (using MySQL 5.7) the total amount of time for each status.
| timeStamp | status |
|------------------------------|
| 2019-12-10 14:00:00 | 1 |
| 2019-12-10 14:10:00 | 1 | // this row could be ignored
| 2019-12-11 14:00:00 | 2 | // more 24 hours in status 1
| 2019-12-11 14:10:00 | 2 |
| 2019-12-12 14:00:00 | 1 | // more 24 hours in status 2
| 2019-12-14 14:00:00 | 2 | // more 48 hours in status 1
| 2019-12-16 14:10:00 | 2 |
| 2019-12-17 14:20:00 | 2 |
| 2019-12-18 14:00:00 | 3 | // more 96 hours in status 2
| 2019-12-19 14:00:00 | 1 | // more 24 hours in status 3
I would like to see a result table like the one below.
| status | amount_of_time |
|-------------------------|
| 1 | 72 hours |
| 2 | 120 hours |
| 3 | 24 hours |
What complicates this is that the statuses don't stay in order: the sequence is not 1, 2, 3.
In the example above it is 1, 2, 1, 2, 3, 1, so I can't use MIN.
Get the timestamp of the following row in a subquery and calculate the difference to the timestamp of the current row:
select t1.status, timestampdiff(second,
t1.timeStamp,
(
select min(t2.timeStamp)
from mytable t2
where t2.timeStamp > t1.timeStamp
)
) as diff
from mytable t1;
This will return:
| status | diff |
| ------ | ------ |
| 1 | 600 |
| 1 | 86400 |
| 2 | 600 |
| 2 | 85800 |
| 1 | 172800 |
| 2 | 173400 |
| 2 | 87000 |
| 2 | 85200 |
| 3 | 86400 |
| 1 | NULL |
From here it's just a matter of GROUP BY and SUM:
select status, sum(diff) as duration_in_seconds
from (
select t1.status, timestampdiff(second,
t1.timeStamp,
(
select min(t2.timeStamp)
from mytable t2
where t2.timeStamp > t1.timeStamp
)
) as diff
from mytable t1
) x
group by status;
Result:
| status | duration_in_seconds |
| ------ | ------------------- |
| 1 | 259800 |
| 2 | 432000 |
| 3 | 86400 |
If you want the time in hours, change the first line to
select status, round(sum(diff)/3600) as duration_in_hours
and you will get:
| status | duration_in_hours |
| ------ | ----------------- |
| 1 | 72 |
| 2 | 120 |
| 3 | 24 |
You may want to use floor() instead of round(), though; your question doesn't make clear which you need.
In MySQL 8 you could use the LEAD() window function to get the timestamp of the next row:
select status, sum(diff) as duration_in_seconds
from (
select
status,
timestampdiff(second, timeStamp, lead(timeStamp) over (order by timeStamp)) as diff
from mytable
) x
group by status;
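The "difference to the next row, then sum per status" idea can be checked by hand with a short Python sketch. It mirrors what LEAD() does, using the timestamps from the question's table. One assumption is made: the third timestamp is taken as 2019-12-11 14:00:00, the value its "24 hours in status 1" comment implies. The function name `seconds_per_status` is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# (timeStamp, status) rows from the question's table.
ROWS = [
    ("2019-12-10 14:00:00", 1),
    ("2019-12-10 14:10:00", 1),
    ("2019-12-11 14:00:00", 2),  # assumed date, see the note above
    ("2019-12-11 14:10:00", 2),
    ("2019-12-12 14:00:00", 1),
    ("2019-12-14 14:00:00", 2),
    ("2019-12-16 14:10:00", 2),
    ("2019-12-17 14:20:00", 2),
    ("2019-12-18 14:00:00", 3),
    ("2019-12-19 14:00:00", 1),
]


def seconds_per_status(rows):
    """Sum, per status, the time from each row to the next row in time order."""
    ordered = sorted((datetime.fromisoformat(ts), status) for ts, status in rows)
    totals = defaultdict(int)
    # Pair every row with its successor; the last row has no successor
    # (the NULL diff in the SQL version) and contributes nothing.
    for (start, status), (end, _next_status) in zip(ordered, ordered[1:]):
        totals[status] += int((end - start).total_seconds())
    return dict(totals)


totals = seconds_per_status(ROWS)
hours = {status: secs // 3600 for status, secs in sorted(totals.items())}
print(hours)
```

Because each row is only compared with the row that follows it, repeated rows for the same status simply add up, so they do not need to be filtered out first.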

Displaying conditional mysql values in additional column

In a sample project I want to display data so that, based on dates, the records for the same student appear in additional columns.
mysql> desc sch_student;
+----------------+--------------+
| Field | Type |
+----------------+--------------+
| s_first_name | varchar(128) |
| s_last_name | varchar(128) |
| rollcode | int(8) |
| regnum | int(8) |
| in_time | datetime |
| out_time | datetime |
| total_time | int(8) |
+----------------+--------------+
For the query below I get the sample output shown; I have been unable to get my expected output. I tried a sample join but it didn't work.
mysql> select * from sch_student;
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+
| s_first_name | s_last_name | rollcode | regnum | in_time | out_time | total_time |
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+
| Suzan | Matsuo | 8900 | 2897 | 2017-12-02 22:30:11 | 2017-12-02 22:30:11 | 00:17:00 |
| Scottie | Ogletree | 5624 | 5627 | 2017-12-02 16:40:01 | 2017-12-02 16:40:05 | 00:26:04 |
| Cynthia | Zimmerman | 3107 | 6348 | 2017-12-02 16:35:01 | 2017-12-02 16:35:01 | 00:59:89 |
| Ricardo | Shurtliff | 3072 | 261 | 2017-12-02 15:33:01 | 2017-12-02 15:33:01 | 00:16:55 |
| Elizabeth | Milligan | 4722 | 3233 | 2017-12-02 15:06:00 | 2017-12-02 15:10:33 | 00:14:33 |
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+
Expected output is something like below
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+--------------+-----------+---------------------+---------------------+----------------+
| s_first_name | s_last_name | Today's Meeting | Day Before Yesterday's Meeting |
| | rollcode | regnum | in_time | out_time | total_time | rollcode | regnum | in_time | out_time | total_time |
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+--------------+-----------+---------------------+---------------------+----------------+
| Suzan | Matsuo | 8900 | 2897 | 2017-12-02 22:30:11 | 2017-12-02 22:30:11 | 00:17:00 | 8900 | 2897 | 2017-11-30 12:30:11 | 2017-11-30 12:50:11 | 00:17:00 |
| Scottie | Ogletree | 5624 | 5627 | 2017-12-02 16:40:01 | 2017-12-02 16:40:05 | 00:26:04 | 5624 | 5627 | 2017-11-30 18:40:01 | 2017-11-30 19:33:05 | 00:26:04 |
| Cynthia | Zimmerman | 3107 | 6348 | 2017-12-02 16:35:01 | 2017-12-02 16:35:01 | 00:59:89 | 3107 | 6348 | 2017-11-30 13:35:01 | 2017-11-30 14:15:01 | 00:59:89 |
| Ricardo | Shurtliff | 3072 | 261 | 2017-12-02 15:33:01 | 2017-12-02 15:33:01 | 00:16:55 | 3072 | 261 | 2017-11-30 19:33:01 | 2017-11-30 20:33:01 | 00:16:55 |
| Elizabeth | Milligan | 4722 | 3233 | 2017-12-02 15:06:00 | 2017-12-02 15:10:33 | 00:14:33 | 4722 | 3233 | 2017-11-30 18:06:00 | 2017-11-30 19:10:33 | 00:14:33 |
+-------------------+---------------+--------------+-----------+---------------------+---------------------+----------------+--------------+-----------+---------------------+---------------------+----------------+
I tried the join below, but it does not return the expected output. Is it possible to display conditional columns from a table?
select * from
(
(select s_first_name,s_last_name,rollcode,regnum,in_time from sch_student where sch_student.in_time BETWEEN CURDATE()- INTERVAL 1 DAY AND CURDATE() ) As TD,
(select s_first_name,s_last_name,rollcode,regnum,in_time from sch_student where sch_student.in_time BETWEEN CURDATE()- INTERVAL 3 DAY AND CURDATE() ) As DBYS
) ;
I think this is what you need, though I haven't tested it. The query takes today's data and LEFT JOINs it to the day-before-yesterday's data. I assumed regnum and rollcode together form your primary key; change the join condition if that isn't the case.
SELECT TD.* , DBYS.*
FROM (
SELECT s_first_name
,s_last_name
,rollcode
,regnum
,in_time
FROM sch_student
WHERE sch_student.in_time BETWEEN CURDATE() - INTERVAL 1 DAY
AND CURDATE()) AS TD
LEFT JOIN (
SELECT s_first_name
,s_last_name
,rollcode
,regnum
,in_time
FROM sch_student
WHERE sch_student.in_time BETWEEN CURDATE() - INTERVAL 3 DAY
AND CURDATE() - INTERVAL 2 DAY) AS DBYS
ON (TD.regnum = DBYS.regnum AND
TD.rollcode = DBYS.rollcode);
If you want to get info for today's meeting and the "day-before-yesterday's" meeting, try using a LEFT JOIN instead:
SELECT sch_today.s_first_name, sch_today.s_last_name, sch_today.rollcode, sch_today.regnum,
sch_today.in_time, sch_daybeforeyesterday.in_time AS in_time_daybeforeyesterday
FROM sch_student AS sch_today
LEFT JOIN sch_student AS sch_daybeforeyesterday ON
sch_today.<PK_FIELD> = sch_daybeforeyesterday.<PK_FIELD> AND
sch_daybeforeyesterday.in_time BETWEEN CURDATE() - INTERVAL 3 DAY AND CURDATE() - INTERVAL 2 DAY
WHERE sch_today.in_time BETWEEN CURDATE() - INTERVAL 1 DAY AND CURDATE()
This will give you all rows with "in_time" within the 0-24 hours before today's midnight (CURDATE()). For each of those rows, it will also return any corresponding rows with "in_time" within the 48-72 hours before that midnight.
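The self LEFT JOIN can be tried on a small in-memory table. The following is a sketch using Python's sqlite3 module rather than MySQL, so the date bounds are computed in Python and passed as parameters instead of using CURDATE(). It assumes regnum plus rollcode identify a student, fixes "today" at 2017-12-03, and uses a reduced sample in which only Suzan has an earlier row.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sch_student ("
    "s_first_name TEXT, s_last_name TEXT, rollcode INTEGER, "
    "regnum INTEGER, in_time TEXT)"
)
conn.executemany(
    "INSERT INTO sch_student VALUES (?, ?, ?, ?, ?)",
    [
        ("Suzan", "Matsuo", 8900, 2897, "2017-12-02 22:30:11"),
        ("Suzan", "Matsuo", 8900, 2897, "2017-11-30 12:30:11"),
        ("Scottie", "Ogletree", 5624, 5627, "2017-12-02 16:40:01"),
    ],
)

today = datetime(2017, 12, 3)  # stand-in for CURDATE(), fixed for repeatability
fmt = "%Y-%m-%d %H:%M:%S"
bounds = {
    "td_from": (today - timedelta(days=1)).strftime(fmt),
    "td_to": today.strftime(fmt),
    "dbys_from": (today - timedelta(days=3)).strftime(fmt),
    "dbys_to": (today - timedelta(days=2)).strftime(fmt),
}

# The earlier-day filter lives in the ON clause, not in WHERE, so students
# without an earlier row are still returned (with NULL in the joined column).
query = """
SELECT td.s_first_name, td.in_time, dbys.in_time
FROM sch_student AS td
LEFT JOIN sch_student AS dbys
  ON dbys.regnum = td.regnum
 AND dbys.rollcode = td.rollcode
 AND dbys.in_time BETWEEN :dbys_from AND :dbys_to
WHERE td.in_time BETWEEN :td_from AND :td_to
ORDER BY td.s_first_name
"""
rows = conn.execute(query, bounds).fetchall()
for row in rows:
    print(row)
```

Moving the earlier-day condition into WHERE would turn the LEFT JOIN into an inner join in effect, and Scottie's row would disappear.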

TIMESTAMPDIFF Sum Case error

+-----------+-----------+--------+
| punchtime | punchdate | emp_id |
+-----------+-----------+--------+
| 9:51:00 | 4/1/2016 | 2 |
| 12:59:00 | 4/1/2016 | 2 |
| 10:28:00 | 4/1/2016 | 5 |
| 14:13:00 | 4/1/2016 | 5 |
| 9:56:00 | 4/1/2016 | 10 |
| 15:31:00 | 4/1/2016 | 10 |
| 10:08:00 | 5/1/2016 | 2 |
| 18:09:00 | 5/1/2016 | 2 |
| 10:15:00 | 5/1/2016 | 5 |
| 18:32:00 | 5/1/2016 | 5 |
| 10:11:00 | 6/1/2016 | 2 |
| 18:11:00 | 6/1/2016 | 2 |
| 10:25:00 | 6/1/2016 | 5 |
| 18:28:00 | 6/1/2016 | 5 |
| 10:19:00 | 6/1/2016 | 10 |
| 18:26:00 | 6/1/2016 | 10 |
+-----------+-----------+--------+
For each emp_id, I need to count the days on which the time between the first and last punch is less than 4 hours. I am trying the code below, but it's not working.
SELECT
a.emp_id,
sum( case when TIMESTAMPDIFF(hour, min(a.punchtime),
max(a.punchtime))< 4 then 1 else 0 end ) as 'Half Day'
FROM machinedata a
GROUP BY
a.emp_id
I am getting an error: #1111 - Invalid use of group function
Desired output -
+--------+----------+
| emp_id | Half Day |
+--------+----------+
| 2      | 1        |
| 8      | 0        |
| 10     | 0        |
+--------+----------+
Your data set and desired result do not accord, so I'm going to ignore it...
Instead consider the following...
Note both the way in which I have presented the problem, and the construction of the solution.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(employee_id INT NOT NULL
,punchtime DATETIME NOT NULL
,PRIMARY KEY(employee_id,punchtime)
);
INSERT INTO my_table VALUES
( 2,'2016/01/04 09:51:00'),
( 2,'2016/01/04 12:59:00'),
( 5,'2016/01/04 10:28:00'),
( 5,'2016/01/04 14:13:00'),
(10,'2016/01/04 09:56:00'),
(10,'2016/01/04 15:31:00'),
( 2,'2016/01/05 10:08:00'),
( 2,'2016/01/05 18:09:00'),
( 5,'2016/01/05 10:15:00'),
( 5,'2016/01/05 18:32:00'),
( 2,'2016/01/06 10:11:00'),
( 2,'2016/01/06 18:11:00'),
( 5,'2016/01/06 10:25:00'),
( 5,'2016/01/06 18:28:00'),
(10,'2016/01/06 10:19:00'),
(10,'2016/01/06 18:26:00');
SELECT employee_id
, SUM(diff < 14400 ) half
FROM
( SELECT x.employee_id
, DATE(x.punchtime) dt
, TIME_TO_SEC(MAX(y.punchtime)) - TIME_TO_SEC(MIN(x.punchtime)) diff
FROM my_table x
JOIN my_table y
ON y.employee_id = x.employee_id
AND DATE(y.punchtime) = DATE(x.punchtime)
GROUP
BY x.employee_id
, dt
) n
GROUP
BY employee_id;
+-------------+------+
| employee_id | half |
+-------------+------+
| 2 | 1 |
| 5 | 1 |
| 10 | 0 |
+-------------+------+
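The same two-level aggregation (first the span per employee and day, then a count per employee) can be checked with a short Python sketch over the answer's data. It is only an illustration of the logic, assuming a "half day" means the last punch minus the first punch on that day is under 4 hours. The function name `half_days` is made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (employee_id, punchtime) rows from the answer's INSERT statement.
PUNCHES = [
    (2, "2016-01-04 09:51:00"), (2, "2016-01-04 12:59:00"),
    (5, "2016-01-04 10:28:00"), (5, "2016-01-04 14:13:00"),
    (10, "2016-01-04 09:56:00"), (10, "2016-01-04 15:31:00"),
    (2, "2016-01-05 10:08:00"), (2, "2016-01-05 18:09:00"),
    (5, "2016-01-05 10:15:00"), (5, "2016-01-05 18:32:00"),
    (2, "2016-01-06 10:11:00"), (2, "2016-01-06 18:11:00"),
    (5, "2016-01-06 10:25:00"), (5, "2016-01-06 18:28:00"),
    (10, "2016-01-06 10:19:00"), (10, "2016-01-06 18:26:00"),
]


def half_days(punches, limit=timedelta(hours=4)):
    """Count, per employee, the days whose first-to-last punch span is under limit."""
    # Step 1: first and last punch per (employee, day), like the inner query.
    spans = {}
    for emp, ts in punches:
        t = datetime.fromisoformat(ts)
        key = (emp, t.date())
        first, last = spans.get(key, (t, t))
        spans[key] = (min(first, t), max(last, t))
    # Step 2: count the short days per employee, like SUM(diff < 14400).
    counts = defaultdict(int)
    for (emp, _day), (first, last) in spans.items():
        counts[emp] += (last - first) < limit  # True counts as 1
    return dict(counts)


result = half_days(PUNCHES)
print(result)
```

The original error #1111 comes from nesting MIN()/MAX() inside SUM() in a single query level; splitting the work into two steps, as here and in the derived table above, avoids that.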

Get first and last record number in every date exists in table

I am trying to show invoices for every single day, so I used GROUP BY on the created date and SUM on subtotal. This is how I did it:
SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
It's working, but I also want to get the first and last invoice number (Invoice # from and Invoice # to) for every date. Is it possible to do this with a single query?
EDIT :
Table Structure :
------------------------------------------------
| id | invoice_no | created_at | subtotal
| 1 | 34 | 2015-03-17 05:55:27 | 5
| 2 | 35 | 2015-03-17 12:35:00 | 7
| 3 | 36 | 2015-03-20 01:40:00 | 3
| 4 | 37 | 2015-03-20 07:05:13 | 6
| 5 | 38 | 2015-03-20 10:25:23 | 1
| 6 | 39 | 2015-03-24 12:00:00 | 6
------------------------------------------------
Output
---------------------------------------------------------------
| id | invoice_no | created_at | subtotal | total_sales
| 2 | 35 | 2015-03-17 12:35:00 | 7 | 12
| 5 | 38 | 2015-03-20 10:25:23 | 1 | 10
| 6 | 39 | 2015-03-24 12:00:00 | 6 | 6
-----------------------------------------------------------------
What I Expect
---------------------------------------------------------------
| id | invoice_no | created_at | subtotal | total_sales | in_from | in_to
| 2 | 35 | 2015-03-17 12:35:00 | 7 | 12 | 34 | 35
| 5 | 38 | 2015-03-20 10:25:23 | 1 | 10 | 36 | 38
| 6 | 39 | 2015-03-24 12:00:00 | 6 | 6 | 39 | 39
-----------------------------------------------------------------
If your invoice number is an INTEGER, the query below will give you the result you want:
SELECT DATE_FORMAT(A.created_at, "%m-%y") AS InvoiceDate,
MIN(A.invoice_no) AS FromInvoiceNo,
MAX(A.invoice_no) AS ToInvoiceNo,
SUM(A.subtotal) AS total_sales
FROM sales_invoice AS A
GROUP BY InvoiceDate;
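The MIN/MAX/SUM grouping can be tried on the question's six rows. This sketch uses Python's sqlite3 module instead of MySQL and groups by calendar day with SQLite's date() function, since the expected output in the question is per day. The column name invoice_no is an assumed spelling of the question's invoice-number column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_invoice ("
    "id INTEGER PRIMARY KEY, invoice_no INTEGER, created_at TEXT, subtotal INTEGER)"
)
# Rows copied from the table structure shown in the question.
conn.executemany(
    "INSERT INTO sales_invoice VALUES (?, ?, ?, ?)",
    [
        (1, 34, "2015-03-17 05:55:27", 5),
        (2, 35, "2015-03-17 12:35:00", 7),
        (3, 36, "2015-03-20 01:40:00", 3),
        (4, 37, "2015-03-20 07:05:13", 6),
        (5, 38, "2015-03-20 10:25:23", 1),
        (6, 39, "2015-03-24 12:00:00", 6),
    ],
)

rows = conn.execute(
    """
    SELECT date(created_at)  AS invoice_date,
           MIN(invoice_no)   AS in_from,
           MAX(invoice_no)   AS in_to,
           SUM(subtotal)     AS total_sales
    FROM sales_invoice
    GROUP BY date(created_at)
    ORDER BY invoice_date
    """
).fetchall()
for row in rows:
    print(row)
```

Every selected column is either the grouping expression or an aggregate, so the result does not depend on which row of a group the database happens to pick.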
I assume salesid is the primary key of the sales_invoice table.
select * from (
(SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
order by main_table.salesid limit 1)
union all
(SELECT
`main_table`.*,
SUM(subtotal) AS `total_sales`
FROM
`sales_invoice` AS `main_table`
GROUP BY
DATE_FORMAT(created_at, "%m-%y")
order by main_table.salesid desc limit 1)
) a

MySQL: Getting the most counted same-value entry (statistical mode) per hour within a datetime range

I have a table like this:
+--------+---------+---------------------+-------------+--------+
| iddata | value_r | date_r              | idparameter | idnode |
+--------+---------+---------------------+-------------+--------+
| 54620  | 66.6627 | 2014-10-16 12:01:09 | 46          | 9      |
| 54621  | 19.4953 | 2014-10-16 12:01:09 | 40          | 9      |
| 54622  | 19.9384 | 2014-10-16 12:01:09 | 47          | 9      |
| 54623  | 163.849 | 2014-10-16 12:01:09 | 43          | 9      |
| 54624  | 67.9257 | 2014-10-16 12:02:09 | 44          | 9      |
| 54625  | 315     | 2014-10-16 12:02:09 | 42          | 9      |
| 54626  | 0.699   | 2014-10-16 12:02:09 | 41          | 9      |
| 54627  | 67.9257 | 2014-10-16 12:03:09 | 46          | 9      |
| 54628  | 19.2308 | 2014-10-16 12:03:09 | 40          | 9      |
| 54629  | 11.207  | 2014-10-16 12:03:09 | 47          | 9      |
| 54630  | 118.743 | 2014-10-16 12:03:09 | 43          | 9      |
| 54631  | 292.5   | 2014-10-16 12:03:09 | 42          | 9      |
+--------+---------+---------------------+-------------+--------+
I need to get the statistical mode, i.e. the value_r that repeats the most, for a given idparameter and idnode, for each hour within a given datetime interval. I have managed to get the mode when I set the datetime range to 1 hour manually. However, when I try to group by hour or by time difference it doesn't work: I end up with the mode of the whole start-end range instead of one mode per hour.
So far this is my code:
select value_r , date_r , max(counter_v) from
(SELECT iddata, value_r,date_r ,count( value_r ) counter_v
FROM wsnca.data dat
where dat.idnode=9 and dat.idparameter=42 and
( dat.date_r between ('2014-10-16 12:00:00') and ('2014-10-16 13:00:00') )
group by value_r
order by counter_v DESC) T;
Result:
+---------+---------------------+----------------+
| value_r | date_r              | max(counter_v) |
+---------+---------------------+----------------+
| 270     | 2014-10-16 12:03:09 | 7              |
+---------+---------------------+----------------+
However, the result I'm looking for would be like this:
+---------+---------------------+----------------+
| value_r | date_r              | max(counter_v) |
+---------+---------------------+----------------+
| 270     | 2014-10-16 12:00:00 | 7              |
| 90      | 2014-10-16 13:00:00 | 4              |
| 45      | 2014-10-16 14:00:00 | 9              |
| 180     | 2014-10-16 15:00:00 | 8              |
+---------+---------------------+----------------+
As I said before, I don't know how to group by one-hour intervals and have the query return the datetime rounded down to the hour, as in the desired table.
I know I could do it in PHP by running several queries, but I would prefer to do it in one query.
You can number the count for each value_r per hour starting with #1 for the highest count, #2 for the 2nd highest and so on and then only keep #1 rows, which will be the modes for each hour.
select date_hour, value_r, cnt from (
select * ,
@rowNum := IF(date_hour = @prevDateHour,@rowNum+1,1) rowNum,
@prevDateHour := date_hour
from (
select value_r, hour(date_r) date_hour, count(*) cnt
from wsnca.data dat
where dat.idnode=9 and dat.idparameter=42
group by value_r, hour(date_r)
) t1 order by date_hour, cnt desc
) t1 where rowNum = 1
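The "count per value per hour, then keep the top count" idea can be illustrated with a short Python sketch. The readings below are hypothetical values for a single idparameter and idnode; they are not taken from the real table. Unlike hour(date_r), the sketch buckets by the full date and hour, so the same hour on different days stays separate. The function name `mode_per_hour` is made up.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (date_r, value_r) readings for one idparameter and idnode.
READINGS = [
    ("2014-10-16 12:02:09", 315.0),
    ("2014-10-16 12:03:09", 292.5),
    ("2014-10-16 12:10:09", 315.0),
    ("2014-10-16 13:01:09", 90.0),
    ("2014-10-16 13:05:09", 90.0),
    ("2014-10-16 13:40:09", 270.0),
]


def mode_per_hour(readings):
    """Return {hour_start: (most_frequent_value, count)} for each hour bucket."""
    per_hour = defaultdict(Counter)
    for ts, value in readings:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        per_hour[hour][value] += 1
    # most_common(1) plays the role of "rowNum = 1" in the query above.
    return {hour: counts.most_common(1)[0] for hour, counts in sorted(per_hour.items())}


modes = mode_per_hour(READINGS)
for hour, (value, count) in modes.items():
    print(hour, value, count)
```

If two values tie for the highest count within an hour, only one of them is reported, just as the row-number filter keeps a single row per hour.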
Change group by value_r into group by value_r, date_r; I think that should do it.
EDIT: A better response for what you want to achieve:
select value_r , DATE_FORMAT(date_r, '%Y-%m-%d %H') as formatted_date, max(counter_v) from
(SELECT iddata, value_r,date_r ,count( value_r ) counter_v
FROM wsnca.data dat
where dat.idnode=9 and dat.idparameter=42 and
( dat.date_r between ('2014-10-16 12:00:00') and ('2014-10-16 13:00:00') )
group by value_r, formatted_date
order by counter_v DESC) T