Counting changes in timeline with MySQL - mysql

I am new to MySQL and I need your help. I have a table with similar data
---------------------------------------------------
|RobotPosX|RobotPosY|RobotPosDir|RobotShortestPath|
---------------------------------------------------
|0.1 | 0.2 | 15 | 1456 |
|0.2 | 0.3 | 30 | 1456 |
|0.54 | 0.67 | 15 | 1456 |
|0.68 | 0.98 | 22 | 1234 |
|0.36 | 0.65 | 45 | 1234 |
|0.65 | 0.57 | 68 | 1456 |
|0.65 | 0.57 | 68 | 2556 |
|0.79 | 0.86 | 90 | 1456 |
---------------------------------------------------
As you can see, there are repeated values in the column RobotShortestPath, but they are important. Each number represents a specific task. If the number repeats consecutively (e.g. 1456), the robot is still performing that task; when the number changes (e.g. to 1234), it has switched to another task. If an earlier number (e.g. 1456) appears again, that also means the robot is performing a new task (1456) after finishing the previous one (1234).
Where I am stuck is that I am unable to get the number of tasks performed. I have tried several things from my limited knowledge, like COUNT and GROUP BY, but nothing seems to work.
Here the number of tasks performed is actually 5, but whatever I do I only get 3 as the result.

SET @last_task = 0;
SELECT SUM(new_task) AS tasks_performed
FROM (
    SELECT
        -- 1 when the value differs from the previous row, 0 otherwise
        IF(@last_task = RobotShortestPath, 0, 1) AS new_task,
        @last_task := RobotShortestPath
    FROM `table`
    ORDER BY ??
) AS tmp
Update for multiple tables
From a database normalization point of view, you are better off with one table that has a field identifying which robot each row belongs to. If that is not possible for some reason, you can get the same result with a union of the tables:
SET @last_task = 0;
SELECT robot_id, SUM(new_task) AS tasks_performed
FROM (
    SELECT
        robot_id,  -- needed by the outer GROUP BY
        IF(@last_task = RobotShortestPath, 0, 1) AS new_task,
        @last_task := RobotShortestPath
    FROM (
        SELECT 1 AS robot_id, robot_log_1.* FROM robot_log_1
        UNION ALL SELECT 2, robot_log_2.* FROM robot_log_2
        UNION ALL SELECT 3, robot_log_3.* FROM robot_log_3
        UNION ALL SELECT 4, robot_log_4.* FROM robot_log_4
        UNION ALL SELECT 5, robot_log_5.* FROM robot_log_5
    ) AS robot_log
    ORDER BY robot_id, robot_log_id
) AS robot_log_history
GROUP BY robot_id
ORDER BY tasks_performed DESC;

As I understand it, you need to track when RobotShortestPath changes to another value. To achieve this you can use a trigger like this:
delimiter |
CREATE TRIGGER track AFTER UPDATE ON yourtable
FOR EACH ROW BEGIN
IF NEW.RobotShortestPath != OLD.RobotShortestPath THEN
UPDATE tracktable SET counter=counter+1 WHERE tracker=1;
END IF;
END;
|
delimiter ;

set @n:=0, @i:=0;
select max(sno) from
(
    select @n:=case when @i=RobotShortestPath then @n else @n+1 end as sno,
    @i:=RobotShortestPath as dno
    from `table`
) as t;

Try the following query:
SET @cnt = 0, @r = -1;
SELECT IF(RobotShortestPath <> @r, @cnt := @cnt + 1, 0), @r := RobotShortestPath, @cnt FROM `table`;
SELECT @cnt AS count;
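As a sanity check on the expected answer, the same run-counting logic can be sketched outside the database. This is only an illustrative Python sketch, not part of any answer above: the function name count_task_runs is made up, and the list simply holds the RobotShortestPath values from the question in row order.

```python
from itertools import groupby

def count_task_runs(paths):
    """Count runs of consecutive equal values; each run is one task."""
    return sum(1 for _ in groupby(paths))

# RobotShortestPath values from the question, in row order
paths = [1456, 1456, 1456, 1234, 1234, 1456, 2556, 1456]

print(count_task_runs(paths))  # number of runs (tasks performed)
print(len(set(paths)))         # number of distinct values, what a plain GROUP BY sees
```

On the sample data the first number is 5 and the second is 3, which matches the gap the question describes between the expected count and the GROUP BY result.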

Related

MySQL: Want to use a stored procedure to sort table by one column, then use that ordering to set values of another column

Pretty new to SQL, so I apologize if this is obvious.
I have a table of the number of games sold and their corresponding rank in the bestseller list (1, 2, 3, etc.). This table is called ranking(rank: bigint, global_sales: double).
What I want to do is make a stored procedure that I can call whenever the sales update, and then this procedure can be called to update the rankings. Here's what I have so far, and I'm afraid it's probably very incorrect:
delimiter $$
drop procedure if exists updateRank;
create procedure updateRank()
begin
select *
from ranking
order by global_sales desc;
declare r bigint default 1;
loop1: loop
GameRank = r;
set r=r+1;
end loop loop1;
end $$
delimiter ;
From what I could find here and on Google, I couldn't find anything similar, although this is probably a fairly-common query. Any insight would be greatly appreciated.
Edit: I'm using MySQL Workbench version 8.0 CE
You can do it in a loop, but you can also do it in a single query.
Of course I don't know the layout of your tables, but this shows you how you would do the update:
CREATE TABLE ranking (GameRank INT, global_sales INT);
INSERT INTO `ranking` VALUES (0,200),(0,300),(0,250),(0,125);
SELECT * FROM ranking;
GameRank | global_sales
-------: | -----------:
0 | 200
0 | 300
0 | 250
0 | 125
MySQL 5.x
SET @ranking = 0;
UPDATE ranking r
INNER JOIN (SELECT @ranking := @ranking + 1 AS _rank, global_sales FROM ranking ORDER BY global_sales DESC) t ON r.global_sales = t.global_sales
SET GameRank = _rank;
SELECT * FROM ranking ORDER BY global_sales DESC;
GameRank | global_sales
-------: | -----------:
1 | 300
2 | 250
3 | 200
4 | 125
MySQL 8
UPDATE ranking r
INNER JOIN (SELECT global_sales, RANK() OVER (ORDER BY global_sales DESC) my_rank FROM ranking) t
ON r.global_sales = t.global_sales
SET GameRank = my_rank;
SELECT * FROM ranking ORDER BY global_sales DESC;
GameRank | global_sales
-------: | -----------:
1 | 300
2 | 250
3 | 200
4 | 125
db<>fiddle here
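To double-check what the ranking update should produce, the ranking rule can be sketched in plain Python. This is an illustration only: rank_by_sales is a made-up helper, and it assumes RANK()-style behaviour where tied sales values share a rank.

```python
def rank_by_sales(sales):
    """Map each sales value to its rank; rank 1 is the highest value.

    Ties share the rank of their first position, like SQL RANK().
    """
    ranks = {}
    for position, value in enumerate(sorted(sales, reverse=True), start=1):
        ranks.setdefault(value, position)  # keep the first (best) position
    return ranks

# global_sales values from the example table
sales = [200, 300, 250, 125]
for value, rank in rank_by_sales(sales).items():
    print(rank, value)
```

For the four sample rows this gives ranks 1 to 4 for 300, 250, 200 and 125, the same as both result tables above.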

mysql for percentage between rows

I have some sql that looks like this:
SELECT
stageName,
count(*) as `count`
FROM x2production.contact_stages
WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
AND (stageName = 'DI-Whatever' OR stageName = 'DI-Quote' or stageName = 'DI-Meeting')
Group by stageName
Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Whatever')
This produces a table that looks like:
+-------------+-------+
| stageName | count |
+-------------+-------+
| DI-quote | 1230 |
| DI-Meeting | 985 |
| DI-Whatever | 325 |
+-------------+-------+
Question:
I would like a percentage from one row to the next. For example the percentage of DI-Meeting to DI-quote. The math would be 100*985/1230 = 80.0%
So in the end the table would look like so:
+-------------+-------+------+
| stageName | count | perc |
+-------------+-------+------+
| DI-quote | 1230 | 0 |
| DI-Meeting | 985 | 80.0 |
| DI-Whatever | 325 | 32.9 |
+-------------+-------+------+
Is there any way to do this in mysql?
Here is an SQL fiddle to mess w/ the data: http://sqlfiddle.com/#!9/61398/1
The query
select stageName,count,if(rownum=1,0,round(count/toDivideBy*100,3)) as percent
from
(   select stageName,count,greatest(@rn:=@rn+1,0) as rownum,
    coalesce(if(@rn=1,count,@prev),null) as toDivideBy,
    @prev:=count as dummy2
    from
    (   SELECT
        stageName,
        count(*) as `count`
        FROM Table1
        WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
        AND (stageName = 'DI-Underwriting' OR stageName = 'DI-Quote' or stageName = 'DI-Meeting')
        Group by stageName
        Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
    ) xDerived1
    cross join (select @rn:=0,@prev:=-1) as xParams1
) xDerived2;
Results
+-----------------+-------+---------+
| stageName | count | percent |
+-----------------+-------+---------+
| DI-Quote | 16 | 0 |
| DI-Meeting | 13 | 81.250 |
| DI-Underwriting | 4 | 30.769 |
+-----------------+-------+---------+
Note, you want a 0 as the percent for the first row. That is easily changed to 100.
The cross join brings in the variables for use and initializes them. The greatest and coalesce are used for safety in variable use as spelled out well in this article, and clues from the MySQL Manual Page Operator Precedence. The derived tables names are just that: every derived table needs a name.
If you do not adhere to the principles in those referenced articles, then the use of variables is unsafe. I am not saying I nailed it, but that safety is always my focus.
The assignment of variables needs to follow a safe form, such as the @rn variable being set inside a function like greatest or least. We know that @rn is always greater than 0, so we use the greatest function to force our will on the query. The same trick applies with coalesce: null will never happen, and := has lower precedence in the column that follows it, that is, the last one: @prev:=, which follows the coalesce.
That way, a variable is set before other columns in that select row attempt to use its value.
So, just getting the expected results does not mean you did it safely and that it will work with your real data.
What you need is a LAG function. Since MySQL versions before 8.0 don't support it, you have to mimic it this way:
select stageName,
       cnt,
       IF(valBefore is null,0,((100*cnt)/valBefore)) as perc
from (SELECT tb.stageName,
             tb.cnt,
             @ct AS valBefore,
             (@ct := cnt)
      FROM (SELECT stageName,
                   count(*) as cnt
            FROM Table1,
                 (SELECT @_stage := NULL,
                         @ct := NULL) vars
            WHERE FROM_UNIXTIME(createDate) between '2016-05-01'
                  AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
              AND stageName in ('DI-Underwriting', 'DI-Quote', 'DI-Meeting')
            Group by stageName
            Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
           ) tb
      WHERE (CASE WHEN @_stage IS NULL OR @_stage <> tb.stageName
                  THEN @ct := NULL
                  ELSE NULL END IS NULL)
     ) as final
See it working here: http://sqlfiddle.com/#!9/61398/35
EDIT I've actually edited it to remove an unnecessary step (subquery)
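The row-to-previous-row percentage that both answers compute can be checked with a few lines of Python. This is a sketch under assumptions: percent_of_previous is a hypothetical helper, the counts are the ones from the fiddle results above (16, 13, 4), and every previous count is assumed to be non-zero.

```python
def percent_of_previous(counts):
    """Return each count as a percentage of the count on the previous row.

    The first row has no previous row, so it gets 0.0 as in the question.
    Assumes no previous count is zero.
    """
    result = []
    previous = None
    for count in counts:
        if previous is None:
            result.append(0.0)
        else:
            result.append(round(100 * count / previous, 3))
        previous = count  # remember this row for the next iteration
    return result

# counts for DI-Quote, DI-Meeting, DI-Underwriting from the results table
print(percent_of_previous([16, 13, 4]))
```

The loop carries the previous value forward one row at a time, which is exactly the job the user variable does in the queries.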

MySQL dividing current row with current row + 1

I have a problem with MySQL query. I have two tables, table currency and table currency_detail. Table currency contains currency code, such as USD, EUR, IDR, etc. Table currency_detail contains date and rate. After joining tables to get all rates USD in 1 year, I have data which look like this :
Date | Rate
-------------------
2015-10-20 | 14463
2015-10-19 | 14452
2015-10-18 | 14442
2015-10-15 | 14371
2015-10-14 | 14322
2015-10-10 | 14306
2015-10-08 | 14322
I need to count every current row with current row + 1. Is it possible to get results that look like this ?
Date | Rate | PX
------------------------------
2015-10-20 | 14463 | 0.000761 -> LN(14463/14452)
2015-10-19 | 14452 | 0.000692 -> LN(14452/14442)
2015-10-18 | 14442 | 0.004928 -> LN(14442/14371)
2015-10-15 | 14371 | 0.003415 -> LN(14371/14322)
2015-10-14 | 14322 | 0.001118 -> LN(14322/14306)
2015-10-10 | 14306 | -0.00112 -> LN(14306/14322)
2015-10-08 | 14322 | 0 -> 0 (because no data after this row)
I have tried many ways, but still cant find the solutions. Anyone can help with the query ? Thanks before..
In standard SQL you would simply use LAG to read the value from the previous record. In MySQL you need a workaround. The easiest way might be to select all rows twice and number them on-the-fly; then you can join by row number:
select
    this.rdate, this.rate,
    ln(this.rate / prev.rate) as px
from
(
    select @rownum1 := @rownum1 + 1 as rownum, rates.*
    from rates
    cross join (select @rownum1 := 0) init
    order by rdate
) this
left join
(
    select @rownum2 := @rownum2 + 1 as rownum, rates.*
    from rates
    cross join (select @rownum2 := 0) init
    order by rdate
) prev on prev.rownum = this.rownum - 1
order by this.rdate desc;
I had to use different rownum variable names in the two subqueries, by the way, as MySQL got confused otherwise. I consider this a flaw, but I must admit MySQL's variables-in-SQL thing is still kind of alien to me :-)
SQL fiddle: http://www.sqlfiddle.com/#!9/341c4/7
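To verify the PX column independently of the database, the same calculation can be sketched in Python. The helper log_returns is made up for illustration; it assumes the rows are already sorted newest first, as in the question, and it returns 0 for the oldest row because there is nothing to compare it with.

```python
import math

def log_returns(rows):
    """rows: (date, rate) tuples sorted newest first.

    Returns (date, rate, px) where px = ln(rate / rate of the next older row).
    """
    out = []
    for i, (date, rate) in enumerate(rows):
        if i + 1 < len(rows):
            older_rate = rows[i + 1][1]
            px = math.log(rate / older_rate)
        else:
            px = 0.0  # oldest row: no older row to compare with
        out.append((date, rate, px))
    return out

rates = [
    ("2015-10-20", 14463), ("2015-10-19", 14452), ("2015-10-18", 14442),
    ("2015-10-15", 14371), ("2015-10-14", 14322), ("2015-10-10", 14306),
    ("2015-10-08", 14322),
]
for date, rate, px in log_returns(rates):
    print(date, rate, f"{px:.6f}")
```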

How can I make an SQL query that returns time differences between checkins and checkouts?

I'm using mysql and I've got a table similar to this one:
id | user | task | time | checkout
----+-------+------+-----------------------+---------
1 | 1 | 1 | 2014-11-25 17:00:00 | 0
2 | 2 | 2 | 2014-11-25 17:00:00 | 0
3 | 1 | 1 | 2014-11-25 18:00:00 | 1
4 | 1 | 2 | 2014-11-25 19:00:00 | 0
5 | 2 | 2 | 2014-11-25 20:00:00 | 1
6 | 1 | 2 | 2014-11-25 21:00:00 | 1
7 | 1 | 1 | 2014-11-25 21:00:00 | 0
8 | 1 | 1 | 2014-11-25 22:00:00 | 1
id is just an autogenerated primary key, and checkout is 0 if that row registered a user checking in and 1 if the user was checking out from the task.
I would like to know how to make a query that returns how much time has a user spent at each task, that is to say, I want to know the sum of the time differences between the checkout=0 time and the nearest checkout=1 time for each user and task.
Edit: to make things clearer, the results I'd expect from my query would be:
user | task | SUM(timedifference)
------+------+-----------------
1 | 1 | 02:00:00
1 | 2 | 02:00:00
2 | 2 | 03:00:00
I have tried using SUM(UNIX_TIMESTAMP(time) - UNIX_TIMESTAMP(time)), while grouping by user and task to figure out how much time had elapsed, but I don't know how to make the query only sum the differences between the particular times I want instead of all of them.
Can anybody help? Is this at all possible?
As the comments tell you, your current table structure is not ideal. However, it's still possible to pair check-ins with check-outs. This is a SQL Server implementation, but I am sure you can translate it to MySQL:
SELECT id
, user_id
, task
, minutes_per_each_task_instance = DATEDIFF(minute, time, (
    SELECT TOP 1 time
    FROM test AS checkout
    WHERE checkin.user_id = checkout.user_id
    AND checkin.task = checkout.task
    AND checkin.id < checkout.id
    AND checkout.checkout = 1
    ORDER BY checkout.id -- make TOP 1 pick the nearest following check-out
))
FROM test AS checkin
WHERE checkin.checkout = 0
The above code works but will become slower and slower as your table grows. After a couple of hundred thousand rows it will become noticeable.
I suggest renaming the time column to checkin and, instead of having a boolean checkout field, making it a datetime that you update when the user checks out. That way you will have half the number of records and no complex logic for reading or querying.
You can determine with a ranking method what are the matching check in/ check out records, and calculate time differences between them
In my example new_table is the name of your table
SELECT n.user, n.task, n.time, n.checkout,
    CASE WHEN @prev_user = n.user
        AND @prev_task = n.task
        AND @prev_checkout = 0
        AND n.checkout = 1
        AND @prev_time IS NOT NULL
    THEN HOUR(TIMEDIFF(n.time, @prev_time)) END AS timediff,
    @prev_time := n.time,
    @prev_user := n.user,
    @prev_task := n.task,
    @prev_checkout := n.checkout
FROM new_table n,
    (SELECT @prev_user := 0, @prev_task := 0, @prev_checkout := 0, @prev_time := NULL) a
ORDER BY user, task, `time`
Then sum the time differences (timediff) by wrapping it in another select
SELECT x.user, x.task, sum(x.timediff) as total
FROM (
    SELECT n.user, n.task, n.time, n.checkout,
        CASE WHEN @prev_user = n.user
            AND @prev_task = n.task
            AND @prev_checkout = 0
            AND n.checkout = 1
            AND @prev_time IS NOT NULL
        THEN HOUR(TIMEDIFF(n.time, @prev_time)) END AS timediff,
        @prev_time := n.time,
        @prev_user := n.user,
        @prev_task := n.task,
        @prev_checkout := n.checkout
    FROM new_table n,
        (SELECT @prev_user := 0, @prev_task := 0, @prev_checkout := 0, @prev_time := NULL) a
    ORDER BY user, task, `time`
) x
GROUP BY x.user, x.task
It would probably be easier to understand by changing the table structure though. If that is at all possible. Then the SQL wouldn't have to be so complicated and would be more efficient. But to answer your question it is possible. :)
In the above examples, names prefixed with '@' are MySQL user variables; you can use ':=' to set a variable to a value. Cool stuff ay?
Select MAX of checkouts and checkins independently, map them based on user and task and calculate the time difference
-- note: this pairs only the latest check-in with the latest check-out per user and task
select checkin.user, checkin.task,
    SUM(UNIX_TIMESTAMP(checkout.time) - UNIX_TIMESTAMP(checkin.time)) as seconds_spent
from
    (select user, task, MAX(time) as time
    from checkouts
    where checkout = 0
    group by user, task) checkin
inner join
    (select user, task, MAX(time) as time
    from checkouts
    where checkout = 1
    group by user, task) checkout
on (checkout.time > checkin.time
    and checkin.user = checkout.user
    and checkin.task = checkout.task)
group by checkin.user, checkin.task
This should work. Join on the tables and select the minimum times
SELECT
    `user`,
    `task`,
    SUM(
        UNIX_TIMESTAMP(checkout) - UNIX_TIMESTAMP(checkin)
    )
FROM
    (SELECT
        so1.`user`,
        so1.`task`,
        MIN(so1.`time`) AS checkin,
        MIN(so2.`time`) AS checkout
    FROM
        so so1
        INNER JOIN so so2
            -- pair each check-in row with every later check-out row
            -- of the same user and task (they are different rows, so no id match)
            ON (
                so1.`user` = so2.`user`
                AND so1.`task` = so2.`task`
                AND so1.`checkout` = 0
                AND so2.`checkout` = 1
                AND so1.`time` < so2.`time`
            )
    GROUP BY so1.`user`,
        so1.`task`,
        so1.`time`) a
GROUP BY `user`,
    `task` ;
As others have suggested though, This will not scale too well as it is, you would need to adjust it if it starts handling more data
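The pairing rule itself (each check-in is matched with the next check-out for the same user and task) can be sketched in Python to confirm the totals the question expects. This is an illustrative sketch, not a translation of any query above; time_per_task and the helper t are made-up names, and the rows are the eight sample rows from the question.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def time_per_task(rows):
    """rows: (id, user, task, time, checkout) tuples.

    Pairs every check-in (checkout == 0) with the next check-out
    (checkout == 1) of the same user and task, and sums the durations.
    """
    totals = defaultdict(timedelta)
    open_checkin = {}  # (user, task) -> check-in time still waiting for a check-out
    for _id, user, task, time, checkout in sorted(rows, key=lambda r: (r[3], r[0])):
        key = (user, task)
        if checkout == 0:
            open_checkin[key] = time
        elif key in open_checkin:
            totals[key] += time - open_checkin.pop(key)
    return dict(totals)

def t(hour):
    return datetime(2014, 11, 25, hour)

rows = [
    (1, 1, 1, t(17), 0), (2, 2, 2, t(17), 0), (3, 1, 1, t(18), 1),
    (4, 1, 2, t(19), 0), (5, 2, 2, t(20), 1), (6, 1, 2, t(21), 1),
    (7, 1, 1, t(21), 0), (8, 1, 1, t(22), 1),
]
for (user, task), total in sorted(time_per_task(rows).items()):
    print(user, task, total)
```

On the sample rows this yields two hours for user 1 on each task and three hours for user 2 on task 2, which is the result table given in the question.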

Need help writing this update query with if condition and autoincrement in mysql

Background before we begin...
Table schema:
UserId | ActivityDate | Time_diff
where "ActivityDate" is timestamp of activity by user
"Time_diff" is timestampdiff between the next activity and current activity in seconds
in general, but for the last recorded activity of user, since there is no next activity I set the Time_diff to -999
Ex:
UserId | ActivityDate | Time_diff
| 1 | 2012-11-10 11:19:04 | 12 |
| 1 | 2012-11-10 11:19:16 | 11 |
| 1 | 2012-11-10 11:19:27 | 3 |
| 1 | 2012-11-10 11:19:30 | 236774 |
| 1 | 2012-11-13 05:05:44 | 39 |
| 1 | 2012-11-13 05:06:23 | 77342 |
| 1 | 2012-11-14 02:35:25 | 585888 |
| 1 | 2012-11-20 21:20:13 | 1506130 |
...
| 1 | 2013-06-13 06:32:48 | 1616134 |
| 1 | 2013-07-01 23:28:22 | 5778459 |
| 1 | 2013-09-06 20:36:01 | -999 |
| 2 | 2008-08-01 04:59:33 | 622 |
| 2 | 2008-08-01 05:09:55 | 38225 |
| 2 | 2008-08-01 15:47:00 | 31108 |
| 2 | 2008-08-02 00:25:28 | 28599 |
| 2 | 2008-08-02 08:22:07 | 163789 |
| 2 | 2008-08-04 05:51:56 | 1522915 |
| 2 | 2008-08-21 20:53:51 | 694678 |
| 2 | 2008-08-29 21:51:49 | 2945291 |
| 2 | 2008-10-03 00:00:00 | 172800 |
| 2 | 2008-10-05 00:00:00 | 776768 |
| 2 | 2008-10-13 23:46:08 | 3742999 |
I have just added the field "session_id"
alter table so_time_diff add column session_id int(11) not null;
My actual question...
I would like to update this field for each of the above records based on the following logic:
for first record: set session_id = 1
from second record:
if previous_record.UserId == this_record.UserId AND previous_record.time_diff <=3600
set this_record.session_id = previous_record.session_id
else if previous_record.UserId == this_record.UserId AND previous_record.time_diff >3600
set this_record.session_id = previous_record.session_id + 1
else if previous_record.UserId <> this_record.UserId
set session_id = 1 ## for a different user, restart
In simple words,
if two records of the same user are within a time_interval of 3600 seconds, assign the same sessionid, if not increment the sessionid, if its a different user, restart the sessionid count.
I've never written logic in an update query before. Is this possible? Any guidance is greatly appreciated!
Yes, this is possible. It would be easier if the time_diff was on the later record, rather than the previous record, but we can make it work. (We don't really need the stored time_diff.)
The "trick" to getting this to work is really writing a SELECT statement. If you've got a SELECT statement that returns the key of the row to be updated, and the values to be assigned, making that into an UPDATE is trivial.
The "trick" to getting a SELECT statement is to make use of MySQL user variables, and is dependent on non-guaranteed behavior of MySQL.
This is the skeleton of the statement:
SELECT @prev_userid AS prev_userid
     , @prev_activitydate AS prev_activitydate
     , @sessionid AS sessionid
     , @prev_userid := t.userid AS userid
     , @prev_activitydate := t.activitydate AS activitydate
FROM (SELECT @prev_userid := NULL, @prev_activitydate := NULL, @sessionid := 1) i
JOIN so_time_diff t
ORDER BY t.userid, t.activitydate
(We hope there's an index ON mytable (userid, activitydate), so the query can be satisfied from the index, without a need for an expensive "Using filesort" operation.)
Let's unpack that a bit. Firstly, the three MySQL user variables get initialized by the inline view aliased as i. We don't really care about what that returns, we only really care that it initializes the user variables. Because we're using it in a JOIN operation, we also care that it returns exactly one row.
When the first row is processed, we have the values that were previously assigned to the user variable, and we assign the values from the current row to them. When the next row is processed, the values from the previous row are in the user variables, and we assign the current row values to them, and so on.
The "ORDER BY" on the query is important; it's vital that we process the rows in the correct order.
But that's just a start.
The next step is comparing the userid and activitydate values of the current and previous rows, and deciding whether we're in the same sessionid, or whether its a different session, and we need to increment the sessionid by 1.
SELECT @sessionid := @sessionid +
       IF( t.userid = @prev_userid AND
           TIMESTAMPDIFF(SECOND,@prev_activitydate,t.activitydate) <= 3600
         ,0,1) AS sessionid
     , @prev_userid := t.userid AS userid
     , @prev_activitydate := t.activitydate AS activitydate
FROM (SELECT @prev_userid := NULL, @prev_activitydate := NULL, @sessionid := 1) i
JOIN so_time_diff t
ORDER BY t.userid, t.activitydate
You could make use of the value stored in the existing time_diff column, but you need the value from the previous row when checking the current row, so that would just be another MySQL user variable: a check of @prev_time_diff, rather than calculating the timestamp difference (as in my example above). We can add other expressions to the select list to make debugging/verification easier:
     , @prev_userid=t.userid
     , TIMESTAMPDIFF(SECOND,@prev_activitydate,t.activitydate)
N.B. The ORDER of the expressions in the SELECT list is important; the expressions are evaluated in the order they appear... this wouldn't work if we were to assign the userid value from the current row to the user variable BEFORE we checked it... that's why those assignments come last in the SELECT list.
Once we have a query that looks good, that's returning a "sessionid" value that we want to assign to the row with a matching userid and activitydate, we can use that in a multitable update statement.
UPDATE (
    -- query that generates sessionid for userid, activitydate goes here
) s
JOIN so_time_diff t
ON t.userid = s.userid
AND t.activitydate = s.activitydate
SET t.session_id = s.sessionid
(If there's a lot of rows, this could crank a very long time. With versions of MySQL prior to 5.6, I believe the derived table (aliased as s) won't have any indexes created on it. Hopefully, MySQL will use the derived table s as the driving table for the JOIN operation, and do index lookups to the target table.)
FOLLOWUP
I entirely missed the requirement to restart sessionid at 1 for each user. To do that, I'd modify the expression that's assigned to @sessionid and split the condition tests of userid and activitydate. If the userid is different from the previous row, return 1. Otherwise, based on the comparison of activitydate, return either the current value of @sessionid or the current value incremented by 1.
Like this:
SELECT @sessionid :=
       IF( t.userid = @prev_userid
         , IF( TIMESTAMPDIFF(SECOND,@prev_activitydate,t.activitydate) <= 3600
             , @sessionid
             , @sessionid + 1 )
         , 1 )
       AS sessionid
     , @prev_userid := t.userid AS userid
     , @prev_activitydate := t.activitydate AS activitydate
FROM (SELECT @prev_userid := NULL, @prev_activitydate := NULL, @sessionid := 1) i
JOIN so_time_diff t
ORDER BY t.userid, t.activitydate
N.B. None of these statements is tested, these statements have only been desk checked; I've successfully used this pattern innumerable times.
Here is what I wrote, and this worked!!!
SELECT @sessionid := @sessionid +
       CASE WHEN @prev_userid IS NULL THEN 0
            WHEN t.UserId <> @prev_userid THEN 1-@sessionid
            WHEN t.UserId = @prev_userid AND
                 TIMESTAMPDIFF(SECOND,@prev_activitydate,t.ActivityDate) <= 3600
            THEN 0 ELSE 1
       END
       AS sessionid
     , @prev_userid := t.UserId AS UserId
     , @prev_activitydate := t.ActivityDate AS ActivityDate,
       time_diff
FROM (SELECT @prev_userid := NULL, @prev_activitydate := NULL, @sessionid := 1) i
JOIN example t
ORDER BY t.UserId, t.ActivityDate;
Thanks again to @spencer7593 for the very descriptive answer that pointed me in the right direction!
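For anyone who wants to check the session numbers without relying on user-variable evaluation order, the same rule can be sketched in Python. This is only an illustration: assign_sessions is a made-up helper, the rows must already be sorted by user and then by activity time, and the sample rows are a subset of the data in the question.

```python
from datetime import datetime

def assign_sessions(rows, gap_seconds=3600):
    """rows: (user_id, activity_time) tuples sorted by user, then time.

    A new user restarts at session 1; a gap of more than gap_seconds
    since the same user's previous activity starts the next session.
    """
    result = []
    prev_user = prev_time = None
    session = 0
    for user, when in rows:
        if user != prev_user:
            session = 1
        elif (when - prev_time).total_seconds() > gap_seconds:
            session += 1
        result.append((user, when, session))
        prev_user, prev_time = user, when
    return result

sample = [
    (1, "2012-11-10 11:19:04"), (1, "2012-11-10 11:19:16"),
    (1, "2012-11-10 11:19:27"), (1, "2012-11-10 11:19:30"),
    (1, "2012-11-13 05:05:44"), (1, "2012-11-13 05:06:23"),
    (1, "2012-11-14 02:35:25"),
    (2, "2008-08-01 04:59:33"), (2, "2008-08-01 05:09:55"),
    (2, "2008-08-01 15:47:00"),
]
rows = [(user, datetime.fromisoformat(ts)) for user, ts in sample]
for user, when, session in assign_sessions(rows):
    print(user, when, session)
```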