How to determine daily accumulated values in MySQL for each sample? - mysql

I've got a mysql table that has a running total:
+---------------------+--------+
| Timestamp | Total |
+---------------------+--------+
| 2012-07-04 05:35:00 | 1.280 | 1.280-1.280 = 0
| 2012-07-04 09:25:00 | 2.173 | 2.173-1.280 = 0.893
| 2012-07-04 09:30:00 | 2.219 | 2.219-1.280 = 0.939
| 2012-07-04 15:00:00 | 7.778 | 7.778-1.280 = 6.498
| 2012-07-04 21:05:00 | 13.032 | 13.032-1.280 = 11.752
| 2012-07-04 22:00:00 | 13.033 | 13.033-1.280 = 11.753
| 2012-07-05 05:20:00 | 13.033 | 13.033-13.033 = 0
| 2012-07-05 07:10:00 | 13.140 | 13.140-13.033 = 0.107
| 2012-07-05 10:15:00 | 14.993 | 14.993-13.033 = 1.960
| 2012-07-05 11:35:00 | 16.870 | 16.870-13.033 = 3.837
+---------------------+--------+
What I'm looking for is a query that determines the aggregated daily increase for each interval.
I've tried to show the desired outcome as well as the calculation behind each row. I've already tried several approaches with a join, but I can't work out how to determine the starting value for each day.
Thanks.

I can't vouch for the efficiency of this query, but it does get you the results you are looking for:
SELECT t1.`Timestamp`, t1.`Total`,
CASE WHEN t1.`timestamp` =
(SELECT MIN(t2.`Timestamp`)
FROM myTable t2
WHERE DATE(t2.`Timestamp`)=DATE(t1.`Timestamp`))
THEN 0
ELSE t1.`Total` - (SELECT MIN(t3.`Total`)
FROM myTable t3
WHERE DATE(t3.`Timestamp`)=DATE(t1.`Timestamp`))
END AS Diff
FROM myTable t1
ORDER BY `Timestamp`
Alternate Solution (more efficient I think)
SELECT t1.`Timestamp`, t1.`Total`, (t1.`Total` - d1.MinVal) diff
FROM myTable t1
INNER JOIN
(SELECT DATE(`Timestamp`) ts_date,
MIN(`Total`) AS MinVal
FROM myTable
GROUP BY ts_date) d1
ON DATE(t1.`Timestamp`) = d1.ts_date
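Both queries above subtract the smallest Total of the day, which matches the first sample only while the running total never decreases. A minimal sketch of a variant that subtracts the day's first sample explicitly, using a window function, is shown below. It runs against SQLite from Python purely so it can be executed without a server; the table name `samples` and the column names `ts` and `running_total` are made up for the example, and `FIRST_VALUE() OVER (...)` is assumed to be available (MySQL 8.0 or SQLite 3.25 and later).

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts TEXT PRIMARY KEY, running_total REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [
        ("2012-07-04 05:35:00", 1.280),
        ("2012-07-04 09:25:00", 2.173),
        ("2012-07-04 22:00:00", 13.033),
        ("2012-07-05 05:20:00", 13.033),
        ("2012-07-05 07:10:00", 13.140),
        ("2012-07-05 11:35:00", 16.870),
    ],
)

# For every row, subtract the first running_total recorded on the same day.
query = """
SELECT ts,
       running_total,
       running_total - FIRST_VALUE(running_total) OVER (
           PARTITION BY DATE(ts) ORDER BY ts
       ) AS diff
FROM samples
ORDER BY ts
"""

daily_increase = [(ts, round(diff, 3)) for ts, _total, diff in conn.execute(query)]
for row in daily_increase:
    print(row)
```

The first row of each day comes out as 0, and later rows show the increase since that first sample.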

Related

Optimize a query for calculating datetime difference

I have a SQL table:
+---------+----------+---------------------+---------------------+---------+
| id | party_id | begintime | endtime | to_meas |
+---------+----------+---------------------+---------------------+---------+
| 1395035 | 9255 | 2010-09-26 00:34:02 | 2010-09-26 03:56:20 | 0 |
| 1395036 | 8974 | 2009-07-10 11:00:00 | 2009-07-10 21:30:00 | 0 |
| 1395037 | 8974 | 2009-07-10 23:14:00 | 2009-07-11 08:48:00 | 0 |
| 1395038 | 8975 | 2009-07-10 11:00:00 | 2009-07-10 21:30:00 | 0 |
| 1395039 | 8975 | 2009-07-10 23:14:00 | 2009-07-11 08:48:00 | 0 |
| 1395040 | 8974 | 2009-07-11 10:08:31 | 2009-07-12 18:49:51 | 0 |
| 1395041 | 8975 | 2009-07-11 10:08:31 | 2009-07-12 18:49:51 | 0 |
| 1395042 | 8974 | 2009-07-12 20:38:27 | 2009-07-13 20:33:21 | 0 |
| 1395043 | 8975 | 2009-07-12 20:38:27 | 2009-07-13 20:33:21 | 0 |
| 1395044 | 8974 | 2009-07-13 21:57:37 | 2009-07-15 08:25:45 | 0 |
| 1395045 | 8975 | 2009-07-13 21:57:37 | 2009-07-15 08:25:45 | 0 |
| 1395046 | 8974 | 2009-07-15 08:51:25 | 2009-07-16 10:29:13 | 0 |
| 1395047 | 8975 | 2009-07-15 08:51:25 | 2009-07-16 10:29:13 | 0 |
| 1395048 | 8974 | 2009-07-16 12:22:22 | 2009-07-17 14:39:10 | 0 |
| 1395049 | 8975 | 2009-07-16 12:22:22 | 2009-07-17 14:39:10 | 0 |
| 1395050 | 8976 | 2009-07-24 16:53:48 | 2009-07-25 08:47:29 | 0 |
| 1395051 | 8977 | 2009-07-24 16:53:48 | 2009-07-25 08:47:29 | 0 |
| 1395052 | 8978 | 2009-07-24 16:53:48 | 2009-07-25 08:47:29 | 0 |
| 1395053 | 8979 | 2009-07-24 16:53:48 | 2009-07-25 08:47:29 | 0 |
| 1395054 | 8976 | 2009-07-25 10:47:14 | 2009-07-26 09:41:44 | 0 |
+---------+----------+---------------------+---------------------+---------+
...
I need to calculate time between begintime and previous endtime and set to_meas to 1 if this difference is > 30 minutes. Here is my attempt to do it in MySQL:
update doses d set to_meas=1 where d.id in
(select a.id from party join (select * from doses) a
on party_id=a.party_id
left join (select * from doses) b
on party.id=b.party_id
and b.begintime=(select min(begintime)
from (select * from doses) c
where c.begintime > a.endtime)
and timestampdiff(minute, a.endtime, b.begintime) > 30
group by party.id);
This command runs (quasi-) forever. I've tried to do it in python's pandas:
conn = engine.connect()
sql = '''
select doses.id, party_id, party.ml, begintime, endtime
from doses join party on party.id=doses.party_id
'''
df = pd.read_sql(con=conn, sql=sql)
measure = df.groupby('party_id', as_index=False).apply(
lambda x: x[pd.to_datetime(x['begintime']) -
pd.to_datetime(x.shift()['endtime']) > pd.to_timedelta('30 minutes')])
measure_ids = measure['id'].to_list()
measure_list = ','.join([str(x) for x in measure_ids])
conn.execute(
'update doses set to_meas=true where id in(%s)' % measure_list)
The last statement runs in about 10 seconds. Is there a way to optimize the SQL code so that it runs as fast as the pandas version?
In MySQL 8.0, you can select the result you want with window functions, like so:
select d.*,
(begintime > lag(endtime) over(partition by party_id order by endtime) + interval 30 minute) as to_meas
from doses d
In earlier versions:
select d.*,
(
begintime > (
select max(endtime) + interval 30 minute
from doses d1
where d1.party_id = d.party_id and d1.endtime < d.endtime
)
) as to_meas
from doses d
I would not recommend storing such derived information. You can use the query, or create a view. But if you really insist on an update:
update doses d
inner join (
select id,
(
begintime > (
select max(endtime) + interval 30 minute
from doses d1
where d1.party_id = d.party_id and d1.endtime < d.endtime
)
) as to_meas
from doses d
) d1 on d1.id = d.id
set d.to_meas = d1.to_meas
You can update your data using a correlated subquery as follows:
Update doses d
Set to_meas = 1
Where begintime > (select max(dd.endtime) + interval '30' minute
From doses dd where dd.begintime < d.begintime
And dd.party_id = d.party_id)
If you want to update the data, you can use window functions in the update:
update doses d join
(select d.*,
lag(d.endtime) over (partition by d.party_id order by d.endtime) as prev_endtime
from doses d
) dd
on d.id = dd.id and
d.begintime > dd.prev_endtime + interval 30 minute
set to_meas = 1;
Then, for this query, you want an index on doses(party_id, endtime). I assume that id is already declared as a primary key.
Note: With this index, you might find it faster simply to calculate the value on the fly rather than storing it in the table.
EDIT:
In older versions of MySQL, you can phrase this as:
update doses d join
(select d.*,
(select max(d2.endtime)
from doses d2
where d2.party_id = d.party_id and
d2.endtime < d.endtime
) as prev_endtime
from doses d
) dd
on d.id = dd.id and
d.begintime > dd.prev_endtime + interval 30 minute
set to_meas = 1;
You have relatively few rows per party_id so a correlated query seems reasonable. This also needs an index on (party_id, endtime).
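The rule the answers above implement (compare each begintime with the previous endtime of the same party_id, and flag gaps longer than 30 minutes) can also be sketched in plain Python, which may help when checking the SQL against a few rows. This is only an illustration: the helper name `ids_to_measure` is made up, and the rows are a hand-picked subset of the table in the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# (id, party_id, begintime, endtime): a subset of the rows shown in the question.
doses = [
    (1395035, 9255, "2010-09-26 00:34:02", "2010-09-26 03:56:20"),
    (1395036, 8974, "2009-07-10 11:00:00", "2009-07-10 21:30:00"),
    (1395037, 8974, "2009-07-10 23:14:00", "2009-07-11 08:48:00"),
    (1395044, 8974, "2009-07-13 21:57:37", "2009-07-15 08:25:45"),
    (1395046, 8974, "2009-07-15 08:51:25", "2009-07-16 10:29:13"),
]


def ids_to_measure(rows, gap=timedelta(minutes=30)):
    """Return ids whose begintime is more than `gap` after the previous
    endtime of the same party (hypothetical helper, not from the thread)."""
    flagged = []
    prev_end = {}  # party_id -> endtime of the previous row for that party
    # Sorting by (party_id, endtime) mirrors "partition by party_id order by endtime".
    for dose_id, party_id, begin, end in sorted(rows, key=lambda r: (r[1], r[3])):
        begin_dt = datetime.strptime(begin, FMT)
        if party_id in prev_end and begin_dt - prev_end[party_id] > gap:
            flagged.append(dose_id)
        prev_end[party_id] = datetime.strptime(end, FMT)
    return flagged


measure_ids = ids_to_measure(doses)
print(measure_ids)
```

The first row of each party is never flagged, which corresponds to the NULL that lag() returns when there is no previous row.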

How can I subtract two rows within the same column on the same date?

I have a query. I want to subtract the first row from the last row within the same day. I wrote this query, but I am not sure about its performance. Is there an alternative way to solve this problem?
| imei | date | km |
|-----------------------------------------|
| 123 | 2019-01-15 00:00:01 | 15 |
| 123 | 2019-01-15 12:12:08 | 8 |
| 123 | 2019-01-15 23:00:59 | 30 |
| 456 | 2019-01-15 00:03:12 | 232 |
| 456 | 2019-01-15 07:04:00 | 123 |
| 456 | 2019-01-15 23:16:18 | 464 |
My query:
SELECT
gg.imei,
DATE_FORMAT(gg.datee, '%Y-%m-%d'),
gg.km - (SELECT
g.km
FROM
gps g
WHERE
g.datee LIKE '2019-01-15%'
AND g.datee = (SELECT
MIN(t.datee)
FROM
gps t
WHERE
t.datee LIKE '2019-01-15%'
AND t.imei = g.imei)
AND g.imei = gg.imei
GROUP BY g.imei) AS km
FROM
gps gg
WHERE
gg.datee LIKE '2019-01-15%'
AND gg.datee = (SELECT
MAX(ts.datee)
FROM
gps ts
WHERE
ts.datee LIKE '2019-01-15%'
AND gg.imei = ts.imei)
The result is correct:
| imei | date | km |
|------------------------------|
| 123 | 2019-01-15 | 15 |
| 456 | 2019-01-15 | 232 |
But the query is too complicated.
Edit: There are 3 million records in the table.
You can find first and last datetime for each imei-date pair in a sub query then join with it:
SELECT agg.imei, agg.date_date, gps_last.km - gps_frst.km AS diff
FROM (
SELECT imei, DATE(date) AS date_date, MIN(date) AS date_frst, MAX(date) AS date_last
FROM gps
GROUP BY imei, DATE(date)
) AS agg
JOIN gps AS gps_frst ON agg.imei = gps_frst.imei AND agg.date_frst = gps_frst.date
JOIN gps AS gps_last ON agg.imei = gps_last.imei AND agg.date_last = gps_last.date
You need appropriate indexes on your table though. The DATE(date) part in particular will be slow, so you might want to consider adding another column for storing the date part only.
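A small runnable sketch of this approach, with a composite index on (imei, datee), is given below. SQLite is used from Python only as a stand-in for MySQL, the index name is invented, the column is called `datee` as in the question's own query, and the two rows for 2019-01-16 are made-up data added to show that each day is handled separately.

```python
import sqlite3

# SQLite stands in for MySQL; the SQL uses only DATE(), MIN(), MAX() and joins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gps (imei INTEGER, datee TEXT, km INTEGER)")
# Hypothetical index name; it supports the lookups of the first/last row per imei.
conn.execute("CREATE INDEX idx_gps_imei_datee ON gps (imei, datee)")
conn.executemany(
    "INSERT INTO gps VALUES (?, ?, ?)",
    [
        (123, "2019-01-15 00:00:01", 15),
        (123, "2019-01-15 12:12:08", 8),
        (123, "2019-01-15 23:00:59", 30),
        (456, "2019-01-15 00:03:12", 232),
        (456, "2019-01-15 07:04:00", 123),
        (456, "2019-01-15 23:16:18", 464),
        (123, "2019-01-16 08:00:00", 40),  # made-up rows for a second day
        (123, "2019-01-16 20:00:00", 55),
    ],
)

query = """
SELECT agg.imei, agg.day_date, gps_last.km - gps_frst.km AS diff
FROM (
    SELECT imei, DATE(datee) AS day_date,
           MIN(datee) AS date_frst, MAX(datee) AS date_last
    FROM gps
    GROUP BY imei, DATE(datee)
) AS agg
JOIN gps AS gps_frst ON gps_frst.imei = agg.imei AND gps_frst.datee = agg.date_frst
JOIN gps AS gps_last ON gps_last.imei = agg.imei AND gps_last.datee = agg.date_last
ORDER BY agg.imei, agg.day_date
"""

km_per_day = conn.execute(query).fetchall()
for row in km_per_day:
    print(row)
```

For the rows from the question this gives 15 for imei 123 and 232 for imei 456, the same values as the original query.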

MySQL query giving wrong result

Select * from YogaTimeTable;
Delete
from YogaTimeTable
Where RoomNum IN (select tt.RoomNum
from YogaRooms r,
YogaTypes t,
YogaTimeTable tt
where r.RoomNum = tt.roomNum
and ((r.RoomCapacity * t.ClassPrice) - (r.CostPerHour * tt.duration / 60)) < 200);
Select * from YogaTimeTable;
The goal is to delete any classes from the timetable that can make less than $200 profit. To calculate the profitability of each class, multiply the roomcapacity by the classprice and then subtract the cost of the room. To calculate the cost of the room multiply the costperhour by the duration divided by 60.
But it isn't giving the right result. Can someone tell me where I made my mistake? Thanks. The tables are attached.
To me it looks like you have two problems.
A cross join between t and tt exists and should be resolved.
You're attempting to delete based on an incomplete or partial key of YogaTimeTable. The Unique Key of YogaTimeTable appears to be YogaID, StartTime,Day and RoomNum. I say this because the same yoga type could be in the same room at the same time on a different day, or in the same room on the same day at different start times. Thus I think the unique key for YogaTimeTable is a composite key of those 4 fields. So when deleting you need to use the complete key, not a partial key.
So this would result in:
DELETE FROM YogaTimeTable
WHERE exists
(SELECT 1
FROM YogaRooms r
INNER JOIN YogaTimeTable tt
on r.RoomNum = tt.roomNum
INNER JOIN YogaTypes t
on tt.YogaID = t.YogaID
WHERE YogaTimeTable.YogaID = tt.YogaID
and YogaTimeTable.RoomNum = tt.RoomNum
and YogaTimeTable.StartTime = tt.StartTime
and YogaTimeTable.Day = tt.Day
and ((r.RoomCapacity * t.ClassPrice) - (r.CostPerHour * tt.duration / 60)) < 200);
According to this bug report, I can use a correlated subquery in a DELETE; I just can't alias the target table: https://bugs.mysql.com/bug.php?id=2920
Profitability of all classes...
select ytt.YogaID,
ytt.Day,
ytt.StartTime,
ytt.RoomNum,
yt.ClassPrice,
ifnull(ytt.Duration,0) as Duration,
ifnull(yr.CostPerHour,0) as CostPerHour,
ifnull(yr.RoomCapacity,0) as RoomCapacity,
round( ifnull(yr.RoomCapacity,0)*yt.ClassPrice
- (ifnull(yr.CostPerHour,0)*ifnull(ytt.Duration,0)/60)
, 2) as Profitability
from YogaTypes yt
left join YogaTimeTable ytt on (ytt.YogaID=yt.YogaID)
left join YogaRooms yr on (yr.RoomNum=ytt.RoomNum);
+--------+-----------+-----------+---------+------------+----------+-------------+--------------+---------------+
| YogaID | Day | StartTime | RoomNum | ClassPrice | Duration | CostPerHour | RoomCapacity | Profitability |
+--------+-----------+-----------+---------+------------+----------+-------------+--------------+---------------+
| DRU | Wednesday | 10:30:00 | 1 | 18.50 | 60.00 | 100.00 | 20 | 270.00 |
| DRU | Tuesday | 17:00:00 | 2 | 18.50 | 90.00 | 50.00 | 10 | 110.00 |
| SUN | Monday | 07:30:00 | 3 | 18.00 | 60.00 | 150.00 | 25 | 300.00 |
| HAT | Tuesday | 07:30:00 | 4 | 20.00 | 90.00 | 70.00 | 15 | 195.00 |
| HAT | Monday | 18:30:00 | 4 | 20.00 | 60.00 | 70.00 | 15 | 230.00 |
| NULL | NULL | NULL | NULL | 17.00 | 0.00 | 0.00 | 0 | 0.00 |
+--------+-----------+-----------+---------+------------+----------+-------------+--------------+---------------+
6 rows in set (0.00 sec)
The classes with profitability less than desired...
select ytt.YogaID,
ytt.Day,
ytt.StartTime,
ytt.RoomNum
from YogaTypes yt
left join YogaTimeTable ytt on (ytt.YogaID=yt.YogaID)
left join YogaRooms yr on (yr.RoomNum=ytt.RoomNum)
where ifnull(yr.RoomCapacity,0)*yt.ClassPrice
- (ifnull(yr.CostPerHour,0)*ifnull(ytt.Duration,0)/60) < 200;
+--------+---------+-----------+---------+
| YogaID | Day | StartTime | RoomNum |
+--------+---------+-----------+---------+
| DRU | Tuesday | 17:00:00 | 2 |
| HAT | Tuesday | 07:30:00 | 4 |
| NULL | NULL | NULL | NULL |
+--------+---------+-----------+---------+
3 rows in set (0.00 sec)
Now to delete the undesirable sessions...
delete tt.*
from YogaTimeTable tt,
(select ytt.YogaID,
ytt.Day,
ytt.StartTime,
ytt.RoomNum
from YogaTypes yt
left join YogaTimeTable ytt on (ytt.YogaID=yt.YogaID)
left join YogaRooms yr on (yr.RoomNum=ytt.RoomNum)
where ifnull(yr.RoomCapacity,0)*yt.ClassPrice
- (ifnull(yr.CostPerHour,0)*ifnull(ytt.Duration,0)/60) < 200
) as unprof
where tt.YogaID=unprof.YogaID
and tt.RoomNum=unprof.RoomNum
and tt.Day=unprof.Day
and tt.StartTime=unprof.StartTime;
Query OK, 2 rows affected (0.00 sec)
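For comparison, here is a minimal sketch of a correlated DELETE that joins only YogaRooms and YogaTypes in the subquery and refers to the row being deleted directly, so the timetable itself is not selected from again. It runs on SQLite from Python as a stand-in for MySQL. The column types are assumptions, and the data is reconstructed from the profitability output above (the fourth yoga type, which has no timetable entry, is left out).

```python
import sqlite3

# SQLite stands in for MySQL; schema types are assumed, data is taken from
# the profitability listing in the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE YogaTypes (YogaID TEXT PRIMARY KEY, ClassPrice REAL);
CREATE TABLE YogaRooms (RoomNum INTEGER PRIMARY KEY, RoomCapacity REAL, CostPerHour REAL);
CREATE TABLE YogaTimeTable (YogaID TEXT, Day TEXT, StartTime TEXT, RoomNum INTEGER, Duration REAL);
INSERT INTO YogaTypes VALUES ('DRU', 18.50), ('SUN', 18.00), ('HAT', 20.00);
INSERT INTO YogaRooms VALUES (1, 20, 100), (2, 10, 50), (3, 25, 150), (4, 15, 70);
INSERT INTO YogaTimeTable VALUES
    ('DRU', 'Wednesday', '10:30:00', 1, 60),
    ('DRU', 'Tuesday',   '17:00:00', 2, 90),
    ('SUN', 'Monday',    '07:30:00', 3, 60),
    ('HAT', 'Tuesday',   '07:30:00', 4, 90),
    ('HAT', 'Monday',    '18:30:00', 4, 60);
""")

# Profit = capacity * price - hourly room cost * duration in hours.
# Each timetable row is matched to exactly one room and one yoga type.
delete_sql = """
DELETE FROM YogaTimeTable
WHERE EXISTS (
    SELECT 1
    FROM YogaRooms r
    JOIN YogaTypes t ON t.YogaID = YogaTimeTable.YogaID
    WHERE r.RoomNum = YogaTimeTable.RoomNum
      AND r.RoomCapacity * t.ClassPrice
          - r.CostPerHour * YogaTimeTable.Duration / 60 < 200
)
"""
deleted = conn.execute(delete_sql).rowcount

remaining = conn.execute(
    "SELECT YogaID, Day FROM YogaTimeTable ORDER BY YogaID, Day"
).fetchall()
print(deleted, remaining)
```

The two Tuesday classes (profit 110 and 195) are removed, and the three classes above 200 stay.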

Using left join with min

I am trying to connect two tables with left join and a date.
My SQL Query
SELECT
ord.`ordernumber` bestellnummer,
his.`change_date` zahldatum
FROM
`s_order` ord
LEFT JOIN
`s_order_history` his ON ((ord.`id`=his.`orderID`) AND (ord.`cleared`=his.`payment_status_id`)) #AND MIN(his.`change_date`)
WHERE
ord.`ordertime` >= \''.$dateSTART.'\' AND ord.`ordertime` <= \''.$dateSTOP.'\'' ;
s_order
+----+---------------------+---------+-------------+
| id | ordertime | cleared | ordernumber |
+----+---------------------+---------+-------------+
| 1 | 2014-08-11 19:53:43 | 2 | 123 |
| 2 | 2014-08-15 18:33:34 | 2 | 125 |
+----+---------------------+---------+-------------+
s_order_history
+----+-------------------+-----------------+---------+---------------------+
| id | payment_status_id | order_status_id | orderID | change_date |
+----+-------------------+-----------------+---------+---------------------+
| 1 | 1 | 5 | 1 | 2014-08-11 20:53:43 |
| 2 | 2 | 5 | 1 | 2014-08-11 22:53:43 |
| 3 | 2 | 7 | 1 | 2014-08-12 19:53:43 |
| 4 | 1 | 5 | 2 | 2014-08-15 18:33:34 |
| 5 | 1 | 6 | 2 | 2014-08-16 18:33:34 |
| 6 | 2 | 6 | 2 | 2014-08-17 18:33:34 |
+----+-------------------+-----------------+---------+---------------------+
Wanted result:
+-------------+---------------------+
| ordernumber | change_date |
+-------------+---------------------+
| 123 | 2014-08-11 22:53:43 |
| 125 | 2014-08-17 18:33:34 |
+-------------+---------------------+
The problem I have is getting only the date on which the cleared/payment_status_id value was changed in s_order. I currently get all dates where the payment_status_id matches the current cleared value, but I only need the one where it happened first.
This is only an excerpt of the actual query, since the original is a lot longer (mostly more left joins and a lot more tables).
You can group data by ordernumber
SELECT
ord.`ordernumber` bestellnummer,
MIN(his.`change_date`) as zahldatum
FROM
`s_order` ord
LEFT JOIN
`s_order_history` his ON ((ord.`id`=his.`orderID`) AND (ord.`cleared`=his.`payment_status_id`)) #AND MIN(his.`change_date`)
WHERE
ord.`ordertime` >= \''.$dateSTART.'\' AND ord.`ordertime` <= \''.$dateSTOP.'\''
GROUP BY
ord.`ordernumber`;
or you can group data in a subquery:
SELECT
ord.`ordernumber` bestellnummer,
his.`min_change_date` zahldatum
FROM
`s_order` ord
LEFT JOIN (
SELECT
orderID, payment_status_id, MIN(change_date) as min_change_date
FROM
s_order_history
GROUP BY
orderID, payment_status_id
) his ON (ord.`id` = his.`orderID` AND ord.`cleared` = his.`payment_status_id`)
WHERE
ord.`ordertime` >= \''.$dateSTART.'\' AND ord.`ordertime` <= \''.$dateSTOP.'\'';
Try this:
select s_order.ordernumber, min(s_order_history.change_date)
from s_order left join s_order_history
on s_order.id = s_order_history.orderID
and s_order.cleared = s_order_history.payment_status_id
group by s_order.id, s_order.ordernumber
SELECT ord.`ordernumber` bestellnummer,
MIN( his.`change_date` ) zahldatum
...
GROUP BY ord.`ordernumber`
MIN is an aggregate function so you can't use it in a JOIN straight up like you've tried above. You also are not comparing it to a value in your JOIN.
You'll want to do something like:
his.`change_date` = (SELECT MIN(h2.`change_date`) FROM s_order_history h2 WHERE h2.`orderID` = ord.`id` AND h2.`payment_status_id` = ord.`cleared`)
in your JOIN.
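To check the grouped-subquery variant against the sample data from the question, a small sketch follows. It uses SQLite from Python as a stand-in for MySQL, and it passes the date range as bound parameters instead of concatenating it into the SQL string. The column types, the alias `first_change`, and the date range values are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL; column types are assumed from the sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE s_order (id INTEGER PRIMARY KEY, ordertime TEXT, cleared INTEGER, ordernumber INTEGER);
CREATE TABLE s_order_history (id INTEGER PRIMARY KEY, payment_status_id INTEGER,
                              order_status_id INTEGER, orderID INTEGER, change_date TEXT);
INSERT INTO s_order VALUES
    (1, '2014-08-11 19:53:43', 2, 123),
    (2, '2014-08-15 18:33:34', 2, 125);
INSERT INTO s_order_history VALUES
    (1, 1, 5, 1, '2014-08-11 20:53:43'),
    (2, 2, 5, 1, '2014-08-11 22:53:43'),
    (3, 2, 7, 1, '2014-08-12 19:53:43'),
    (4, 1, 5, 2, '2014-08-15 18:33:34'),
    (5, 1, 6, 2, '2014-08-16 18:33:34'),
    (6, 2, 6, 2, '2014-08-17 18:33:34');
""")

# Earliest change_date per (order, payment status), joined on the current status.
query = """
SELECT ord.ordernumber, his.first_change AS zahldatum
FROM s_order AS ord
LEFT JOIN (
    SELECT orderID, payment_status_id, MIN(change_date) AS first_change
    FROM s_order_history
    GROUP BY orderID, payment_status_id
) AS his
  ON his.orderID = ord.id AND his.payment_status_id = ord.cleared
WHERE ord.ordertime >= ? AND ord.ordertime <= ?
ORDER BY ord.ordernumber
"""

# Hypothetical date range; bound parameters replace the string concatenation.
first_paid = conn.execute(
    query, ("2014-08-01 00:00:00", "2014-08-31 23:59:59")
).fetchall()
for row in first_paid:
    print(row)
```

With the sample rows this yields the two dates listed under "Wanted result" in the question.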

MySQL date difference between column3 of first row and column2 of second row

I searched for the above topic and only found queries for Oracle, which use certain Oracle-specific keywords.
+----------+------------+--------------------+
| Agent_id | valid_from | last_modified_date |
+----------+------------+--------------------+
| 13002 | 2010-12-25 | 2011-01-03 |
| 13002 | 2011-01-03 | 2011-08-25 |
| 13002 | 2011-08-26 | 2012-12-30 |
| 13002 | 2013-01-01 | 2013-01-01 |
| 12110 | 2014-02-27 | 2014-03-03 |
| 12110 | 2014-03-25 | 2014-12-25 |
+----------+------------+--------------------+
I have the above table values and want to retrieve difference between last_modified_date of 1st row and valid_from date of 2nd row and likewise for the same agent(agent id here).
Result table:
+----------+------------+--------------------+-----------+
| Agent_id | valid_from | last_modified_date | datediff |
+----------+------------+--------------------+-----------+
| 13002 | 2010-12-25 | 2011-01-03 | 0 |
| 13002 | 2011-01-03 | 2011-08-25 | 0 |
| 13002 | 2011-08-26 | 2012-12-30 | 1 |
| 13002 | 2013-01-01 | 2013-01-01 | 1 |
| 12110 | 2014-02-27 | 2014-03-03 | 0 |
| 12110 | 2014-03-25 | 2014-12-25 | 22 |
+----------+------------+--------------------+-----------+
If there is no date for comparison on first row diff should be 0.
These are set of dates where the status gets changed from Y to D and to find when the agent is without any activity.
please help!!
Use DATEDIFF function.
Example for MySQL:
SELECT
DATEDIFF(valid_from,last_modified_date) AS 'days'
FROM
table
Working example on SQLFiddle.
This will return the difference in days.
Same for SQL Server 2005-2012:
SELECT
DATEDIFF(day,valid_from,last_modified_date ) AS 'days'
FROM
table
This will return the difference in days.
Difference in days.
SELECT *,MIN(COALESCE(DATEDIFF(t1.valid_from, t2.last_modified_date),0))
FROM agents t1
LEFT JOIN agents t2 ON t1.agent_id=t2.agent_id AND t1.valid_from >= t2.last_modified_date
GROUP BY t1.agent_id, t1.valid_from
http://sqlfiddle.com/#!2/8b6ee/4
Try this query. Change the order by clause accordingly.
I have assumed the order based on your result.
select agent_id,
max(valid_from) as valid_from,max(last_modified_date) as last_modified_date,
ifnull(datediff(max(valid_from),max(last_modified_date)),0)as difference
from
(
select @a:=@a+1,
case when (@a+1)%2 = 0 then @b:=@a-2 else @b end as b , agent_id,
case when @a%2=0 then valid_from else 0 end as valid_from,
case when @a%2<>0 then last_modified_date else 0 end as last_modified_date
from table a ,(select @a:=0,@b:=0) b
order by agent_id desc ,valid_from
) a
group by agent_id,b
The easiest way to do this in MySQL is using variables:
select t.*
from (select t.*,
if(agent_id = @agent_id, datediff(valid_from, @last_modified_date), 0) as datediff,
@last_modified_date := last_modified_date,
@agent_id := agent_id
from table t cross join
(select @agent_id := 0, @last_modified_date := 0) const
order by agent_id, valid_from
) t;
You can also calculate the previous date using correlated subqueries.
By the way, those keywords that you would use in Oracle are not Oracle-specific. They are ANSI-standard functionality that MySQL did not support before version 8.0.
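A sketch of the correlated-subquery idea mentioned above: for each row, look up the latest last_modified_date among the same agent's earlier rows, then take the day difference. SQLite is used from Python as a stand-in for MySQL, and the table name `agents` is an assumption. The day difference is computed in Python here; in MySQL it could instead be done in SQL with DATEDIFF(valid_from, prev_modified).

```python
import sqlite3
from datetime import date

# SQLite stands in for MySQL; the table name "agents" is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agents (agent_id INTEGER, valid_from TEXT, last_modified_date TEXT)"
)
conn.executemany(
    "INSERT INTO agents VALUES (?, ?, ?)",
    [
        (13002, "2010-12-25", "2011-01-03"),
        (13002, "2011-01-03", "2011-08-25"),
        (13002, "2011-08-26", "2012-12-30"),
        (13002, "2013-01-01", "2013-01-01"),
        (12110, "2014-02-27", "2014-03-03"),
        (12110, "2014-03-25", "2014-12-25"),
    ],
)

# prev_modified is NULL for the first row of each agent.
query = """
SELECT a.agent_id, a.valid_from,
       (SELECT MAX(p.last_modified_date)
        FROM agents AS p
        WHERE p.agent_id = a.agent_id
          AND p.valid_from < a.valid_from) AS prev_modified
FROM agents AS a
ORDER BY a.agent_id, a.valid_from
"""

gaps = []
for agent_id, valid_from, prev_modified in conn.execute(query):
    if prev_modified is None:
        days = 0  # no earlier row to compare with
    else:
        days = (date.fromisoformat(valid_from) - date.fromisoformat(prev_modified)).days
    gaps.append((agent_id, valid_from, days))

for row in gaps:
    print(row)
```

Note that for the row starting 2013-01-01 this computes 2 days, since the previous last_modified_date is 2012-12-30.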
just change the order of fields in result a little bit
mysqli_multi_query('
set @i='';
select
Agent_id, if(@i='', 0,datediff(valid_from,@i)) as datediff, valid_from, (@i:=last_modified_date) as last_modified_date
from
your_table'
);