MySQL dividing current row with current row + 1 - mysql

I have a problem with a MySQL query. I have two tables, currency and currency_detail. Table currency contains currency codes, such as USD, EUR, IDR, etc. Table currency_detail contains the date and rate. After joining the tables to get all USD rates for one year, I have data that looks like this:
Date | Rate
-------------------
2015-10-20 | 14463
2015-10-19 | 14452
2015-10-18 | 14442
2015-10-15 | 14371
2015-10-14 | 14322
2015-10-10 | 14306
2015-10-08 | 14322
I need to compute a value for every row from that row and the next one (current row + 1). Is it possible to get results that look like this?
Date | Rate | PX
------------------------------
2015-10-20 | 14463 | 0.000761 -> LN(14463/14452)
2015-10-19 | 14452 | 0.000692 -> LN(14452/14442)
2015-10-18 | 14442 | 0.004928 -> LN(14442/14371)
2015-10-15 | 14371 | 0.003415 -> LN(14371/14322)
2015-10-14 | 14322 | 0.001118 -> LN(14322/14306)
2015-10-10 | 14306 | -0.00112 -> LN(14306/14322)
2015-10-08 | 14322 | 0 -> 0 (because no data after this row)
I have tried many ways, but still can't find a solution. Can anyone help with the query? Thanks in advance.

In standard SQL you would simply use LAG to read the value from the previous record. In MySQL versions before 8.0, which have no window functions, you need a workaround. The easiest way might be to select all rows twice and number them on the fly; then you can join by row number:
select
  this.rdate, this.rate,
  ln(this.rate / prev.rate) as px
from
(
  select @rownum1 := @rownum1 + 1 as rownum, rates.*
  from rates
  cross join (select @rownum1 := 0) init
  order by rdate
) this
left join
(
  select @rownum2 := @rownum2 + 1 as rownum, rates.*
  from rates
  cross join (select @rownum2 := 0) init
  order by rdate
) prev on prev.rownum = this.rownum - 1
order by this.rdate desc;
I had to use different rownum variable names in the two subqueries, by the way, as MySQL got confused otherwise. I consider this a flaw, but I must admit MySQL's variables-in-SQL thing is still kind of alien to me :-)
SQL fiddle: http://www.sqlfiddle.com/#!9/341c4/7
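On MySQL 8.0 or later, where window functions are available, the same result can probably be written directly with LAG. This is only a sketch: it assumes the same `rates` table with `rdate` and `rate` columns used in the workaround above, and it uses COALESCE to return 0 for the oldest row, as the question asks.

```sql
-- Sketch for MySQL 8.0+, assuming table rates(rdate, rate) as above.
-- LAG(rate) reads the rate of the previous date; the oldest row has no
-- predecessor, so LN(NULL) is NULL and COALESCE turns it into 0.
SELECT rdate,
       rate,
       COALESCE(LN(rate / LAG(rate) OVER (ORDER BY rdate)), 0) AS px
FROM rates
ORDER BY rdate DESC;
```

Because the window is ordered by `rdate` ascending while the output is sorted descending, each row is divided by the row displayed below it.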

Related

If a value occurs for the first time mark 1 else 0, and count those in group

My data looks like this:
CreateTime | mobile
-----------+--------
2017/01/01 | 111
2017/01/01 | 222
2017/01/05 | 111
2017/01/08 | 333
2017/03/09 | 111
What I am trying to do is add a flag that marks whether this is the first time the mobile number occurred:
CreateTime | mobile | FirstTime
-----------+--------+----------
2017/01/01 | 111 | 1
2017/01/01 | 222 | 1
2017/01/05 | 111 | 0
2017/01/08 | 333 | 1
2017/03/09 | 111 | 0
2017/03/15 | 222 | 0
2017/03/18 | 444 | 1
Basically we need to add a "true/false" column if it is the first time (based on createtime (and some other fields) which may or may not be sorted) that this specific mobile number occurred.
Ideally, this adjusted table will then be able to give me the following results when queried:
Select Month(createtime) as month,
count(mobile) as received,
sum(Firsttime) as Firsttimers
from ABC
Group by month(createtime)
Result:
Month | Received | FirstTimers
--------+----------+------------
2017/01 | 4 | 3
2017/03 | 3 | 1
If I can get to the RESULTS without needing to create the additional step, then that will be even better.
I do, however, need the query to run fast, hence my idea of creating the intermediate table, but I am open to correction.
This is my current code and it works but it is not as fast as I'd like nor is it elegant.
SELECT Month(InF1.createtime) as 'Month',
Count(InF1.GUID) AS Received,
Sum(coalesce(Unique_lead,1)) As FirstTimers
FROM MYDATA_TABLE as InF1
Left Join
( SELECT createtime, mobile, GUID, 0 as Unique_lead
FROM MYDATA_TABLE as InF2
WHERE createtime = (SELECT min(createtime)
FROM MYDATA_TABLE as InF3
WHERE InF2.mobile=InF3.mobile
)
) as InF_unique
On Inf1.GUID = InF_unique.GUID
group by month(createtime)
(Apologies if the question is incorrectly posted; it is my first post.)
You could use a subquery to get the first date per mobile, outer join it on the actual mobile and date, and count the matches. Make sure to count distinct mobile numbers so that the same number is not counted twice when it occurs twice on the same date:
select substr(createtime, 1, 7) month,
count(*) received,
count(distinct grp.mobile) firsttimers
from abc
left join (
select mobile,
min(createtime) firsttime
from abc
group by mobile
) grp
on abc.mobile = grp.mobile
and abc.createtime = grp.firsttime
group by month
Here is an alternative using variables, which can give you a row number:
select substr(createtime, 1, 7) month,
       count(*) received,
       sum(rn = 1) firsttimers
from (
    select createtime,
           @rn := if(@mob = mobile, @rn + 1, 1) rn,
           @mob := mobile mobile
    from (select * from abc order by mobile, createtime) ordered,
         (select @rn := 1, @mob := null) init
    order by mobile, createtime
) numbered
group by month;
NB: If you have MySQL 8+, then use window functions.
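A window-function version might look like the following. It is a sketch under the same assumptions as the queries above (table `abc` with `createtime` and `mobile`), and it needs MySQL 8.0 or later.

```sql
-- Sketch for MySQL 8.0+: number each mobile's rows by createtime,
-- then count the rows numbered 1 as first-timers.
SELECT substr(createtime, 1, 7) AS month,
       COUNT(*) AS received,
       SUM(rn = 1) AS firsttimers
FROM (
    SELECT createtime,
           ROW_NUMBER() OVER (PARTITION BY mobile ORDER BY createtime) AS rn
    FROM abc
) numbered
GROUP BY month;
```

ROW_NUMBER gives exactly one row per mobile the number 1, so a number that appears twice on its first date is still counted once.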

mysql for percentage between rows

I have some sql that looks like this:
SELECT
stageName,
count(*) as `count`
FROM x2production.contact_stages
WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
AND (stageName = 'DI-Whatever' OR stageName = 'DI-Quote' or stageName = 'DI-Meeting')
Group by stageName
Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Whatever')
This produces a table that looks like:
+-------------+-------+
| stageName | count |
+-------------+-------+
| DI-quote | 1230 |
| DI-Meeting | 985 |
| DI-Whatever | 325 |
+-------------+-------+
Question:
I would like a percentage from one row to the next. For example the percentage of DI-Meeting to DI-quote. The math would be 100*985/1230 = 80.0%
So in the end the table would look like so:
+-------------+-------+------+
| stageName | count | perc |
+-------------+-------+------+
| DI-quote | 1230 | 0 |
| DI-Meeting | 985 | 80.0 |
| DI-Whatever | 325 | 32.9 |
+-------------+-------+------+
Is there any way to do this in mysql?
Here is an SQL fiddle to mess w/ the data: http://sqlfiddle.com/#!9/61398/1
The query
select stageName,count,if(rownum=1,0,round(count/toDivideBy*100,3)) as percent
from
(   select stageName,count,greatest(@rn:=@rn+1,0) as rownum,
    coalesce(if(@rn=1,count,@prev),null) as toDivideBy,
    @prev:=count as dummy2
    from
    (   SELECT
        stageName,
        count(*) as `count`
        FROM Table1
        WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
        AND (stageName = 'DI-Underwriting' OR stageName = 'DI-Quote' or stageName = 'DI-Meeting')
        Group by stageName
        Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
    ) xDerived1
    cross join (select @rn:=0,@prev:=-1) as xParams1
) xDerived2;
Results
+-----------------+-------+---------+
| stageName | count | percent |
+-----------------+-------+---------+
| DI-Quote | 16 | 0 |
| DI-Meeting | 13 | 81.250 |
| DI-Underwriting | 4 | 30.769 |
+-----------------+-------+---------+
Note, you want a 0 as the percent for the first row. That is easily changed to 100.
The cross join brings in the variables for use and initializes them. The greatest and coalesce are used for safety in variable use as spelled out well in this article, and clues from the MySQL Manual Page Operator Precedence. The derived tables names are just that: every derived table needs a name.
If you do not adhere to the principles in those referenced articles, then the use of variables is unsafe. I am not saying I nailed it, but that safety is always my focus.
The assignment of variables needs to follow a safe form, such as the @rn variable being set inside a function like greatest or least. We know that @rn is always greater than 0, so we use the greatest function to force our will on the query. The same trick applies to coalesce: null will never happen, and := has lower precedence in the column that follows it. That is, the last one: @prev:= which follows the coalesce.
That way, a variable is set before other columns in that select row attempt to use its value.
So, just getting the expected results does not mean you did it safely and that it will work with your real data.
What you need is a LAG function. Since MySQL doesn't support it, you have to mimic it this way:
select stageName,
       cnt,
       IF(valBefore is null,0,((100*cnt)/valBefore)) as perc
from (SELECT tb.stageName,
             tb.cnt,
             @ct AS valBefore,
             (@ct := cnt)
      FROM (SELECT stageName,
                   count(*) as cnt
            FROM Table1,
                 (SELECT @_stage := NULL,
                         @ct := NULL) vars
            WHERE FROM_UNIXTIME(createDate) between '2016-05-01'
                  AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
              AND stageName in ('DI-Underwriting', 'DI-Quote', 'DI-Meeting')
            Group by stageName
            Order by field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
           ) tb
      WHERE (CASE WHEN @_stage IS NULL OR @_stage <> tb.stageName
                  THEN @ct := NULL
                  ELSE NULL END IS NULL)
     ) as final
See it working here: http://sqlfiddle.com/#!9/61398/35
EDIT I've actually edited it to remove an unnecessary step (subquery)
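For readers on MySQL 8.0 or later, LAG is available natively, so the emulation is probably unnecessary. The following is a sketch only; it reuses `Table1`, `createDate`, and the stage names from the answers above, and the one-decimal rounding is an assumption taken from the question's expected output.

```sql
-- Sketch for MySQL 8.0+: divide each stage count by the count of the
-- stage that precedes it in the FIELD() ordering.
SELECT stageName,
       cnt,
       COALESCE(
           ROUND(100 * cnt / LAG(cnt) OVER (
               ORDER BY FIELD(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
           ), 1),
           0  -- the first stage has no predecessor, so report 0
       ) AS perc
FROM (
    SELECT stageName, COUNT(*) AS cnt
    FROM Table1
    WHERE FROM_UNIXTIME(createDate) BETWEEN '2016-05-01'
          AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
      AND stageName IN ('DI-Underwriting', 'DI-Quote', 'DI-Meeting')
    GROUP BY stageName
) counts
ORDER BY FIELD(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting');
```

The ordering lives in the window definition, so the result does not depend on the evaluation order of user variables.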

How can I make an SQL query that returns time differences between checkins and checkouts?

I'm using mysql and I've got a table similar to this one:
id | user | task | time | checkout
----+-------+------+-----------------------+---------
1 | 1 | 1 | 2014-11-25 17:00:00 | 0
2 | 2 | 2 | 2014-11-25 17:00:00 | 0
3 | 1 | 1 | 2014-11-25 18:00:00 | 1
4 | 1 | 2 | 2014-11-25 19:00:00 | 0
5 | 2 | 2 | 2014-11-25 20:00:00 | 1
6 | 1 | 2 | 2014-11-25 21:00:00 | 1
7 | 1 | 1 | 2014-11-25 21:00:00 | 0
8 | 1 | 1 | 2014-11-25 22:00:00 | 1
id is just an autogenerated primary key, and checkout is 0 if that row registered a user checking in and 1 if the user was checking out from the task.
I would like to know how to make a query that returns how much time has a user spent at each task, that is to say, I want to know the sum of the time differences between the checkout=0 time and the nearest checkout=1 time for each user and task.
Edit: to make things clearer, the results I'd expect from my query would be:
user | task | SUM(timedifference)
------+------+-----------------
1 | 1 | 02:00:00
1 | 2 | 02:00:00
2 | 2 | 03:00:00
I have tried using SUM(UNIX_TIMESTAMP(time) - UNIX_TIMESTAMP(time)), while grouping by user and task to figure out how much time had elapsed, but I don't know how to make the query only sum the differences between the particular times I want instead of all of them.
Can anybody help? Is this at all possible?
As all the comments tell you, your current table structure is not ideal. However, it's still possible to pair check-ins with check-outs. This is a SQL Server implementation, but I am sure you can translate it to MySQL:
SELECT id
    , user_id
    , task
    , minutes_per_each_task_instance = DATEDIFF(minute, time, (
        SELECT TOP 1 time
        FROM test AS checkout
        WHERE checkin.user_id = checkout.user_id
            AND checkin.task = checkout.task
            AND checkin.id < checkout.id
            AND checkout.checkout = 1
        ORDER BY checkout.id -- nearest later check-out; TOP 1 without ORDER BY is not deterministic
        ))
FROM test AS checkin
WHERE checkin.checkout = 0
The above code works but will become slower and slower as your table grows. After a couple of hundred thousand rows it will become noticeable.
I suggest renaming the time column to checkin and, instead of having a boolean checkout field, making checkout a datetime that is updated when the user checks out. That way you will have half the number of records and no complex logic for reading or querying.
You can determine with a ranking method what are the matching check in/ check out records, and calculate time differences between them
In my example new_table is the name of your table
SELECT n.user, n.task, n.time, n.checkout,
       CASE WHEN @prev_user = n.user
             AND @prev_task = n.task
             AND @prev_checkout = 0
             AND n.checkout = 1
             AND @prev_time IS NOT NULL
            THEN HOUR(TIMEDIFF(n.time, @prev_time)) END AS timediff,
       @prev_time := n.time,
       @prev_user := n.user,
       @prev_task := n.task,
       @prev_checkout := n.checkout
FROM new_table n,
     (SELECT @prev_user := 0, @prev_task := 0, @prev_checkout := 0, @prev_time := NULL) a
ORDER BY user, task, `time`
Then sum the time differences (timediff) by wrapping it in another select
SELECT x.user, x.task, sum(x.timediff) as total
FROM (
    SELECT n.user, n.task, n.time, n.checkout,
           CASE WHEN @prev_user = n.user
                 AND @prev_task = n.task
                 AND @prev_checkout = 0
                 AND n.checkout = 1
                 AND @prev_time IS NOT NULL
                THEN HOUR(TIMEDIFF(n.time, @prev_time)) END AS timediff,
           @prev_time := n.time,
           @prev_user := n.user,
           @prev_task := n.task,
           @prev_checkout := n.checkout
    FROM new_table n,
         (SELECT @prev_user := 0, @prev_task := 0, @prev_checkout := 0, @prev_time := NULL) a
    ORDER BY user, task, `time`
) x
GROUP BY x.user, x.task
It would probably be easier to understand by changing the table structure though. If that is at all possible. Then the SQL wouldn't have to be so complicated and would be more efficient. But to answer your question it is possible. :)
In the above examples, names prefixed with '@' are MySQL variables; you can use ':=' to set a variable to a value. Cool stuff ay?
Select the MAX of check-ins and check-outs independently, join them on user and task, and calculate the time difference. Note that this only measures the latest check-in/check-out pair per user and task, not the sum over all pairs:
select checkin.user, checkin.task,
       UNIX_TIMESTAMP(checkout.time) - UNIX_TIMESTAMP(checkin.time) as seconds_spent
from
    (select user, task, MAX(time) as time
     from checkouts
     where checkout = 0
     group by user, task) checkin
inner join
    (select user, task, MAX(time) as time
     from checkouts
     where checkout = 1
     group by user, task) checkout
on (checkout.time > checkin.time
    and checkin.user = checkout.user
    and checkin.task = checkout.task)
This should work. Self-join the table and, for each check-in, select the earliest later check-out:
SELECT
  `user`,
  `task`,
  SUM(
    UNIX_TIMESTAMP(checkout) - UNIX_TIMESTAMP(checkin)
  )
FROM
  (SELECT
    so1.`user`,
    so1.`task`,
    MIN(so1.`time`) AS checkin,
    MIN(so2.`time`) AS checkout
  FROM
    so so1
    INNER JOIN so so2
      ON (
        -- no join on id: a check-in and its check-out are different rows
        so1.`user` = so2.`user`
        AND so1.`task` = so2.`task`
        AND so1.`checkout` = 0
        AND so2.`checkout` = 1
        AND so1.`time` < so2.`time`
      )
  GROUP BY so1.`user`,
    so1.`task`,
    so1.`time`) a
GROUP BY `user`,
  `task` ;
As others have suggested, though, this will not scale too well as it is; you would need to adjust it if it starts handling more data.
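If MySQL 8.0 or later is available, pairing each check-in with the row that follows it can probably be done with LEAD instead of variables or self-joins. The sketch below assumes the question's columns (`id`, `user`, `task`, `time`, `checkout`) in a table called `new_table`, the placeholder name used in one of the answers above.

```sql
-- Sketch for MySQL 8.0+: look at the next row of the same user and task.
-- A check-in (checkout = 0) immediately followed by a check-out
-- (checkout = 1) forms one pair; unmatched rows are ignored.
SELECT `user`,
       task,
       SEC_TO_TIME(SUM(TIMESTAMPDIFF(SECOND, `time`, next_time))) AS time_spent
FROM (
    SELECT `user`, task, `time`, checkout,
           LEAD(`time`)   OVER w AS next_time,
           LEAD(checkout) OVER w AS next_checkout
    FROM new_table
    WINDOW w AS (PARTITION BY `user`, task ORDER BY `time`, id)
) paired
WHERE checkout = 0
  AND next_checkout = 1
GROUP BY `user`, task;
```

On the sample data in the question this pairs 17:00 with 18:00 and 21:00 with 22:00 for user 1, task 1, which adds up to the expected 02:00:00.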

MySQL - Count Values occurring between other values

I'd like to count how many occurrences of a value happen before a specific value
Below is my starting table
+-----------------+--------------+------------+
| Id | Activity | Time |
+-----------------+--------------+------------+
| 1 | Click | 1392263852 |
| 2 | Error | 1392263853 |
| 3 | Finish | 1392263862 |
| 4 | Click | 1392263883 |
| 5 | Click | 1392263888 |
| 6 | Finish | 1392263952 |
+-----------------+--------------+------------+
I'd like to count how many clicks happen before a finish happens.
I've got a very roundabout way of doing it where I write a function to find the last
finished activity and query the clicks between the finishes.
Also repeat this for Error.
What I'd like to achieve is the below table
+-----------------+--------------+------------+--------------+------------+
| Id | Activity | Time | Clicks | Error |
+-----------------+--------------+------------+--------------+------------+
| 3 | Finish | 1392263862 | 1 | 1 |
| 6 | Finish | 1392263952 | 2 | 0 |
+-----------------+--------------+------------+--------------+------------+
This table is very long so I'm looking for an efficient solution.
If anyone has any ideas.
Thanks heaps!
This is a complicated problem. Here is an approach to solving it. The groups between the "finish" records need to be identified as being the same, by assigning a group identifier to them. This identifier can be calculated by counting the number of "finish" records with a larger id.
Once this is assigned, your results can be calculated using an aggregation.
The group identifier can be calculated using a correlated subquery:
select max(id) as id, 'Finish' as Activity, max(time) as Time,
       sum(Activity = 'Click') as Clicks, sum(activity = 'Error') as Error
from (select s.*,
             (select sum(s2.activity = 'Finish')
              from `starting` s2
              where s2.id >= s.id
             ) as FinishCount
      from `starting` s
     ) s
group by FinishCount;
A version that leverages user(session) variables
SELECT MAX(id) id,
       MAX(activity) activity,
       MAX(time) time,
       SUM(activity = 'Click') clicks,
       SUM(activity = 'Error') error
FROM
(
  SELECT t.*, @g := IF(activity <> 'Finish' AND @a = 'Finish', @g + 1, @g) g, @a := activity
  FROM table1 t CROSS JOIN (SELECT @g := 0, @a := NULL) i
  ORDER BY time
) q
GROUP BY g
Output:
| ID | ACTIVITY | TIME | CLICKS | ERROR |
|----|----------|------------|--------|-------|
| 3 | Finish | 1392263862 | 1 | 1 |
| 6 | Finish | 1392263952 | 2 | 0 |
Here is SQLFiddle demo
Try:
select x.id
, x.activity
, x.time
, sum(case when y.activity = 'Click' then 1 else 0 end) as clicks
, sum(case when y.activity = 'Error' then 1 else 0 end) as errors
from tbl x, tbl y
where x.activity = 'Finish'
and y.time < x.time
and (y.time > (select max(z.time) from tbl z where z.activity = 'Finish' and z.time < x.time)
or x.time = (select min(z.time) from tbl z where z.activity = 'Finish'))
group by x.id
, x.activity
, x.time
order by x.id
Here's another method of using variables, which is somewhat different to @peterm's:
SELECT
  Id,
  Activity,
  Time,
  Clicks,
  Errors
FROM (
  SELECT
    t.*,
    @clicks := @clicks + (activity = 'Click') AS Clicks,
    @errors := @errors + (activity = 'Error') AS Errors,
    @clicks := @clicks * (activity <> 'Finish'),
    @errors := @errors * (activity <> 'Finish')
  FROM
    `starting` t
  CROSS JOIN
    (SELECT @clicks := 0, @errors := 0) i
  ORDER BY
    time
) AS s
WHERE Activity = 'Finish'
;
What's similar to Peter's query is that this one uses a subquery that's returning all the rows, setting some variables along the way and returning the variables' values as columns. That may be common to most methods that use variables, though, and that's where the similarity between these two queries ends.
The difference is in how the accumulated results are calculated. Here all the accumulation is done in the subquery, and the main query merely filters the derived dataset on Activity = 'Finish' to return the final result set. In contrast, the other query uses grouping and aggregation at the outer level to get the accumulated results, which may make it slower than mine in comparison.
At the same time Peter's suggestion is more easily scalable in terms of coding. If you happen to have to extend the number of activities to account for, his query would only need expansion in the form of adding one SUM(activity = '...') AS ... per new activity to the outer SELECT, whereas in my query you would need to add a variable and several expressions, as well as a column in the outer SELECT, per every new activity, which would bloat the resulting code much more quickly.
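On MySQL 8.0 or later, the group identifier described in the first answer (the number of "Finish" rows at or after the current row) can likely be computed with a running window sum rather than a correlated subquery or variables. A sketch, assuming the table is named `starting` as in the answers above and that `id` increases with time:

```sql
-- Sketch for MySQL 8.0+: finish_count is the number of 'Finish' rows
-- with an id greater than or equal to the current row's id, so all rows
-- leading up to the same 'Finish' share one value.
SELECT MAX(id) AS id,
       'Finish' AS activity,
       MAX(`time`) AS `time`,
       SUM(grouped.activity = 'Click') AS clicks,
       SUM(grouped.activity = 'Error') AS errors
FROM (
    SELECT s.*,
           SUM(activity = 'Finish') OVER (ORDER BY id DESC) AS finish_count
    FROM `starting` s
) grouped
GROUP BY finish_count
-- drop a trailing group of rows that has not reached a 'Finish' yet
HAVING SUM(grouped.activity = 'Finish') > 0
ORDER BY id;
```

Adding another activity only needs one more SUM(...) line in the outer SELECT, as with the grouping query discussed above.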

Counting changes in timeline with MySQL

I am new to MySQL and I need your help. I have a table with similar data
---------------------------------------------------
|RobotPosX|RobotPosY|RobotPosDir|RobotShortestPath|
---------------------------------------------------
|0.1 | 0.2 | 15 | 1456 |
|0.2 | 0.3 | 30 | 1456 |
|0.54 | 0.67 | 15 | 1456 |
|0.68 | 0.98 | 22 | 1234 |
|0.36 | 0.65 | 45 | 1234 |
|0.65 | 0.57 | 68 | 1456 |
|0.65 | 0.57 | 68 | 2556 |
|0.79 | 0.86 | 90 | 1456 |
---------------------------------------------------
As you can see, there are repeated values in the column RobotShortestPath, but they are important. Each number represents a specific task. If the number repeats consecutively (e.g. 1456), the robot is still performing that task; when the number changes (e.g. to 1234), it has switched to another task. If an earlier number (e.g. 1456) appears again, the robot is performing a new task (1456) after finishing the previous one (1234).
Where I am stuck is that I cannot get the number of tasks performed. I have tried several things within my limited knowledge, such as COUNT and GROUP BY, but nothing seems to work.
Here the number of tasks performed is actually 5, but whatever I do I get only 3 as the result.
SET @last_task = 0;
SELECT SUM(new_task) AS tasks_performed
FROM (
  SELECT
    IF(@last_task = RobotShortestPath, 0, 1) AS new_task,
    @last_task := RobotShortestPath
  FROM table
  ORDER BY ??
) AS tmp
Update for multiple tables
From a database structure normalization point of view, you are better off with one table and a field identifying which row belongs to which robot. If that is not possible for some reason, you can get the same result with a union of the tables:
SET @last_task = 0;
SELECT robot_id, SUM(new_task) AS tasks_performed
FROM (
  SELECT
    robot_id, -- needed by the outer SELECT and GROUP BY
    IF(@last_task = RobotShortestPath, 0, 1) AS new_task,
    @last_task := RobotShortestPath
  FROM (
    SELECT 1 AS robot_id, robot_log_1.* FROM robot_log_1
    UNION SELECT 2, robot_log_2.* FROM robot_log_2
    UNION SELECT 3, robot_log_3.* FROM robot_log_3
    UNION SELECT 4, robot_log_4.* FROM robot_log_4
    UNION SELECT 5, robot_log_5.* FROM robot_log_5
  ) as robot_log
  ORDER BY robot_id, robot_log_id
) AS robot_log_history
GROUP BY robot_id
ORDER BY tasks_performed DESC
As I understood, you need to track when RobotShortestPath is changed to another value. To achieve this you can use trigger like this:
delimiter |
CREATE TRIGGER track AFTER UPDATE ON yourtable
FOR EACH ROW BEGIN
  IF NEW.RobotShortestPath != OLD.RobotShortestPath THEN
    UPDATE tracktable SET counter=counter+1 WHERE tracker=1;
  END IF;
END;
|
delimiter ;
set @n:=0, @i:=0;
select max(sno) from
(
  select @n:=case when @i=RobotShortestPath then @n else @n+1 end as sno,
         @i:=RobotShortestPath as dno
  from table
) as t;
Try the following query:
SET @cnt = 0, @r = -1;
SELECT IF(RobotShortestPath <> @r, @cnt:= @cnt + 1, 0), @r:= RobotShortestPath, @cnt FROM table;
SELECT @cnt AS count;
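With MySQL 8.0 or later, counting the changes can probably be done without variables by comparing each row with the previous one through LAG. This sketch rests on two assumptions that are not in the question: the table is called `robot_log`, and it has a column `robot_log_id` that records the insertion order (the question's table shows no ordering column, and without one "previous row" is not well defined).

```sql
-- Sketch for MySQL 8.0+. robot_log and robot_log_id are hypothetical names.
-- <=> is the NULL-safe equality operator, so the first row (whose LAG
-- value is NULL) counts as the start of a new task.
SELECT SUM(is_new_task) AS tasks_performed
FROM (
    SELECT NOT (RobotShortestPath <=>
                LAG(RobotShortestPath) OVER (ORDER BY robot_log_id)) AS is_new_task
    FROM robot_log
) changes;
```

For the sample sequence 1456, 1456, 1456, 1234, 1234, 1456, 2556, 1456 the flag is set on rows 1, 4, 6, 7 and 8, which gives the 5 tasks the question expects.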