I have a Transaction table that records every amount added to or subtracted from the balance of a Customer, with the new balance:
+----+------------+------------+--------+---------+
| id | customerId | timestamp | amount | balance |
+----+------------+------------+--------+---------+
| 1 | 1 | 1000000001 | 10 | 10 |
| 2 | 1 | 1000000002 | -20 | -10 |
| 3 | 1 | 1000000003 | -10 | -20 |
| 4 | 2 | 1000000004 | -5 | -5 |
| 5 | 2 | 1000000005 | -5 | -10 |
| 6 | 2 | 1000000006 | 10 | 0 |
| 7 | 3 | 1000000007 | -5 | -5 |
| 8 | 3 | 1000000008 | 10 | 5 |
| 9 | 3 | 1000000009 | 10 | 15 |
| 10 | 4 | 1000000010 | 5 | 5 |
+----+------------+------------+--------+---------+
The Customer table stores the current balance, and looks like:
+----+---------+
| id | balance |
+----+---------+
| 1 | -20 |
| 2 | 0 |
| 3 | 15 |
| 4 | 5 |
+----+---------+
I would like to add a balanceSignSince column, that would store the timestamp at which the balance sign last changed. Transitioning to and from positive, negative, or zero counts as a balance change.
After the update, based on the above data, the Customer table should contain:
+----+---------+------------------+
| id | balance | balanceSignSince |
+----+---------+------------------+
| 1 | -20 | 1000000002 |
| 2 | 0 | 1000000006 |
| 3 | 15 | 1000000008 |
| 4 | 5 | 1000000010 |
+----+---------+------------------+
How can I write a SQL query that updates every Customer with the last time the balance sign changed, based on the Transaction table?
I suspect I can't do this without a quite complex stored procedure, but am curious to see if any clever ideas come up.
This uses a simulated rank() function.
select customerId, min(tstamp) from
(
select tstamp,
if (@cust = customerId and sign(@bal) = sign(balance), @rn := @rn,
if (@cust = customerId and sign(@bal) <> sign(balance), @rn := @rn + 1, @rn := 0)) as rn,
@cust := customerId as customerId, @bal := balance as balance
from
(select @rn := 0) x,
(select id, @cust := customerId as customerId, tstamp, amount, @bal := balance as balance
from trans order by customerId, tstamp desc) y
) z
where rn = 0
group by customerId;
Check it: http://rextester.com/XJVKK61181
This script returns a table like this:
+------------+----+------------+---------+
| tstamp | rn | customerId | balance |
+------------+----+------------+---------+
| 1000000003 | 0 | 1 | -20 |
| 1000000002 | 0 | 1 | -10 |
| 1000000001 | 1 | 1 | 10 |
| 1000000006 | 0 | 2 | 0 |
| 1000000005 | 2 | 2 | -10 |
| 1000000004 | 2 | 2 | -5 |
| 1000000009 | 0 | 3 | 15 |
| 1000000008 | 2 | 3 | 5 |
| 1000000007 | 3 | 3 | -5 |
| 1000000010 | 0 | 4 | 5 |
+------------+----+------------+---------+
Then selecting min(tstamp) of the rows where rn = 0:
+------------+-------------+
| customerId | min(tstamp) |
+------------+-------------+
| 1 | 1000000002 |
| 2 | 1000000006 |
| 3 | 1000000009 |
| 4 | 1000000010 |
+------------+-------------+
Updated answer with the restriction that this needs to work on the existing data
The following query should work for most cases, but it does not cover customers with only a single transaction or no sign change. As this is a one-time update, I would run the query below and then do a simple update for all customers that still have no timestamp set; for them, balanceSignSince is the timestamp of their first transaction:
# Find the smallest timestamp, i.e. the
# transaction which changed the sign.
SELECT
p.customerId as customerId,
MIN(t.timestamp) as balanceSignSince
FROM
transaction as t,
(
# find the latest timestamp having
# a different sign for each user.
# Here is the issue with users having
# only a single transaction or no sign
# changes.
SELECT
u.customerId as customerId,
MAX(t.timestamp) as balanceSignSince
FROM
transaction as t,
customer as c,
(
# find the timestamp of the very last
# transaction for every user.
SELECT
t.customerId as customerId,
MAX(t.timestamp) as lastTransaction
FROM
transaction as t
GROUP BY
t.customerId
) as u
WHERE
u.customerId = c.id
AND u.customerId = t.customerId
AND SIGN(c.balance) <> SIGN(t.balance)
GROUP BY
u.customerId
) as p
WHERE
p.customerId = t.customerId
AND p.balanceSignSince < t.timestamp
GROUP BY
p.customerId;
Fiddle: http://sqlfiddle.com/#!9/bd0760/13
Original Answer
This should work to get the timestamp of a sign change:
SELECT
c.id as id,
MAX(t.timestamp) as balanceSignSince
FROM
transaction as t,
customer as c
WHERE
t.customerId = c.id
AND SIGN(t.balance) <> SIGN(c.balance)
GROUP BY
c.id;
This needs to be executed before the customer table is updated with the new balance. If you have an insert trigger on the transaction table, you should probably put the above into the query updating the customer table.
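The one-time backfill described above can also be sketched as a single UPDATE with correlated subqueries, which covers customers with no sign change as well. This is only a sketch under assumptions: it runs against SQLite through Python's built-in sqlite3 module rather than MySQL, the table is named txn (a hypothetical name, because transaction is a keyword in SQLite), the function name is made up, and the sign is computed as (x > 0) - (x < 0) instead of SIGN() so that it does not depend on the SQLite build.

```python
import sqlite3

# For each customer: find the latest transaction whose balance sign differs
# from the current balance sign, then take the earliest transaction after it.
# COALESCE(..., 0) handles customers whose sign never changed.
UPDATE_SQL = """
UPDATE customer
SET balanceSignSince = (
    SELECT MIN(t.timestamp)
    FROM txn AS t
    WHERE t.customerId = customer.id
      AND t.timestamp > COALESCE((
            SELECT MAX(d.timestamp)
            FROM txn AS d
            WHERE d.customerId = customer.id
              AND ((d.balance > 0) - (d.balance < 0))
                  <> ((customer.balance > 0) - (customer.balance < 0))
          ), 0)
)
"""


def balance_sign_since():
    """Load the sample data from the question and run the backfill."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE txn (id INTEGER, customerId INTEGER, timestamp INTEGER,
                          amount INTEGER, balance INTEGER);
        INSERT INTO txn VALUES
            (1, 1, 1000000001, 10, 10), (2, 1, 1000000002, -20, -10),
            (3, 1, 1000000003, -10, -20), (4, 2, 1000000004, -5, -5),
            (5, 2, 1000000005, -5, -10), (6, 2, 1000000006, 10, 0),
            (7, 3, 1000000007, -5, -5), (8, 3, 1000000008, 10, 5),
            (9, 3, 1000000009, 10, 15), (10, 4, 1000000010, 5, 5);
        CREATE TABLE customer (id INTEGER, balance INTEGER,
                               balanceSignSince INTEGER);
        INSERT INTO customer (id, balance)
            VALUES (1, -20), (2, 0), (3, 15), (4, 5);
    """)
    conn.execute(UPDATE_SQL)
    rows = conn.execute(
        "SELECT id, balance, balanceSignSince FROM customer ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


for row in balance_sign_since():
    print(row)
```

On the sample data this reproduces the desired Customer table from the question. The UPDATE only reads from the transaction table inside the subqueries, so the same statement should port to MySQL with the original table name, though that has not been checked here.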
Related
I am building a trading system where users need to know their running account balance by date for a specific user (uid) including how much they made from trading (results table) and how much they deposited or withdrew from their accounts (adjustments table).
Here is the sqlfiddle and tables: http://sqlfiddle.com/#!9/6bc9e4/1
Adjustments table:
+-------+-----+-----+--------+------------+
| adjid | aid | uid | amount | date |
+-------+-----+-----+--------+------------+
| 1 | 1 | 1 | 20 | 2019-08-18 |
| 2 | 1 | 1 | 50 | 2019-08-21 |
| 3 | 1 | 1 | 40 | 2019-08-21 |
| 4 | 1 | 1 | 10 | 2019-08-19 |
+-------+-----+-----+--------+------------+
Results table:
+-----+-----+-----+--------+-------+------------+
| tid | uid | aid | amount | taxes | date |
+-----+-----+-----+--------+-------+------------+
| 1 | 1 | 1 | 100 | 3 | 2019-08-19 |
| 2 | 1 | 1 | -50 | 1 | 2019-08-20 |
| 3 | 1 | 1 | 100 | 2 | 2019-08-21 |
| 4 | 1 | 1 | 100 | 2 | 2019-08-21 |
+-----+-----+-----+--------+-------+------------+
How do I get the results below for uid 1?
+--------------+------------+------------------+----------------+------------+
| ResultsTotal | TaxesTotal | AdjustmentsTotal | RunningBalance | Date |
+--------------+------------+------------------+----------------+------------+
| - | - | 20 | 20 | 2019-08-18 |
| 100 | 3 | 10 | 133 | 2019-08-19 |
| -50 | 1 | - | 84 | 2019-08-20 |
| 200 | 4 | 90 | 378 | 2019-08-21 |
+--------------+------------+------------------+----------------+------------+
Where RunningBalance is the current account balance for the particular user (uid).
Based on @Gabriel's answer, I came up with something like the following, but it gives me an empty balance and duplicate records:
SELECT SUM(ResultsTotal), SUM(TaxesTotal), SUM(AdjustmentsTotal), @runningtotal:= @runningtotal+SUM(ResultsTotal)+SUM(TaxesTotal)+SUM(AdjustmentsTotal) as Balance, date
FROM (
SELECT 0 AS ResultsTotal, 0 AS TaxesTotal, adjustments.amount AS AdjustmentsTotal, adjustments.date
FROM adjustments LEFT JOIN results ON (results.uid=adjustments.uid) WHERE adjustments.uid='1'
UNION ALL
SELECT results.amount AS ResultsTotal, taxes AS TaxesTotal, 0 as AdjustmentsTotal, results.date
FROM results LEFT JOIN adjustments ON (results.uid=adjustments.uid) WHERE results.uid='1'
) unionTable
GROUP BY DATE ORDER BY date
For what you are asking, you would want to union the two tables and then group the combined rows by date; this should give the results you want. However, I recommend calculating the running balance outside of MySQL, since doing it in SQL adds some complexity to the query.
Weird things could start to happen, for example, if someone has already defined the @runningBalance variable earlier in the same session.
SELECT aggregateTable.*, @runningBalance := IFNULL(@runningBalance, 0) + TOTAL AS RunningBalance
FROM (
SELECT SUM(ResultsTotal), SUM(TaxesTotal), SUM(AdjustmentsTotal)
, SUM(ResultsTotal) + SUM(TaxesTotal) + SUM(AdjustmentsTotal) as TOTAL
, date
FROM (
SELECT 0 AS ResultsTotal, 0 AS TaxesTotal, amount AS AdjustmentsTotal, date
FROM adjustments
UNION ALL
SELECT amount AS ResultsTotal, taxes AS TaxesTotal, 0 as AdjustmentsTotal, date
FROM results
) unionTable
GROUP BY date
) aggregateTable
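The union-then-group idea can be tried without user variables by computing the running balance with a correlated subquery over the daily totals. The following is a rough sketch, assuming SQLite via Python's sqlite3 module instead of MySQL; the function name and the column aliases (res, tax, adj, and the *_total names) are made up for illustration, and days without results show 0 rather than "-".

```python
import sqlite3

RUNNING_BALANCE_SQL = """
WITH daily AS (
    -- One row per date: union both tables, then sum each kind of amount.
    SELECT date,
           SUM(res) AS res_total,
           SUM(tax) AS tax_total,
           SUM(adj) AS adj_total
    FROM (
        SELECT date, 0 AS res, 0 AS tax, amount AS adj
        FROM adjustments WHERE uid = 1
        UNION ALL
        SELECT date, amount, taxes, 0
        FROM results WHERE uid = 1
    ) AS u
    GROUP BY date
)
SELECT d.date, d.res_total, d.tax_total, d.adj_total,
       -- Running balance: sum of all daily totals up to and including d.date.
       (SELECT SUM(p.res_total + p.tax_total + p.adj_total)
        FROM daily AS p
        WHERE p.date <= d.date) AS running_balance
FROM daily AS d
ORDER BY d.date
"""


def running_balance():
    """Load the sample adjustments/results rows and return the daily report."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE adjustments (adjid INTEGER, aid INTEGER, uid INTEGER,
                                  amount INTEGER, date TEXT);
        INSERT INTO adjustments VALUES
            (1, 1, 1, 20, '2019-08-18'), (2, 1, 1, 50, '2019-08-21'),
            (3, 1, 1, 40, '2019-08-21'), (4, 1, 1, 10, '2019-08-19');
        CREATE TABLE results (tid INTEGER, uid INTEGER, aid INTEGER,
                              amount INTEGER, taxes INTEGER, date TEXT);
        INSERT INTO results VALUES
            (1, 1, 1, 100, 3, '2019-08-19'), (2, 1, 1, -50, 1, '2019-08-20'),
            (3, 1, 1, 100, 2, '2019-08-21'), (4, 1, 1, 100, 2, '2019-08-21');
    """)
    rows = conn.execute(RUNNING_BALANCE_SQL).fetchall()
    conn.close()
    return rows


for row in running_balance():
    print(row)
```

Like the expected output in the question, this adds taxes to the balance rather than subtracting them. The WITH clause needs MySQL 8.0 or later if the query is ported; on older versions the daily subquery would have to be repeated inline.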
I'm wondering how to select the second smallest value from a MySQL table, grouped on a non-numeric column. If I have a table that looks like this:
+----+----------+------------+--------+------------+
| id | customer | order_type | amount | created_dt |
+----+----------+------------+--------+------------+
| 1 | 1 | web | 5 | 2017-01-01 |
| 2 | 1 | web | 7 | 2017-01-05 |
| 3 | 2 | web | 2 | 2017-01-07 |
| 4 | 3 | web | 2 | 2017-02-01 |
| 5 | 3 | web | 3 | 2017-02-01 |
| 6 | 2 | web | 5 | 2017-03-15 |
| 7 | 1 | in_person | 7 | 2017-02-01 |
| 8 | 3 | web | 8 | 2017-01-01 |
| 9 | 2 | web | 1 | 2017-04-01 |
+----+----------+------------+--------+------------+
I want to count the number of second orders in each month/year. I also have a customer table (which is where the customer ids come from). I can find the number of customers with at least 2 orders, grouped by the customer's created date, by querying
select date(c.created_dt) as create_date, count(c.id)
from customer c
where c.id in
(select o2.identity_id
from orders o2
where
(select count(o.created_dt)
from orders o
where o2.customer = o.customer and o.order_type in ('web')
) > 1
)
group by 1;
However, that result groups customers by their created date, and I can't seem to figure out how to find the number of second orders by date.
The desired output I'd like to see, based on the data above, is:
+-------+------+---------------+
| month | year | second_orders |
+-------+------+---------------+
| 1 | 2017 | 1 |
| 2 | 2017 | 1 |
| 3 | 2017 | 1 |
+-------+------+---------------+
One way to approach this:
SELECT YEAR(created_dt) year, MONTH(created_dt) month, COUNT(*) second_orders
FROM (
SELECT created_dt,
@rn := IF(@c = customer, @rn + 1, 1) rn,
@c := customer
FROM orders CROSS JOIN (
SELECT @c := NULL, @rn := 1
) i
WHERE order_type = 'web'
ORDER BY customer, id
) q
WHERE rn = 2
GROUP BY YEAR(created_dt), MONTH(created_dt)
ORDER BY year, month
Here is a dbfiddle demo
Output:
+------+-------+---------------+
| year | month | second_orders |
+------+-------+---------------+
| 2017 | 1 | 1 |
| 2017 | 2 | 1 |
| 2017 | 3 | 1 |
+------+-------+---------------+
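The same "second order per customer" rule can be written without user variables: an order is a customer's second web order when exactly one earlier web order exists for that customer. Below is a hedged sketch of that idea, assuming SQLite through Python's sqlite3 module (so strftime replaces MySQL's YEAR() and MONTH()) and assuming, like the query above, that "earlier" means a smaller id. The function name and the aliases yr and mon are hypothetical.

```python
import sqlite3

SECOND_ORDERS_SQL = """
SELECT CAST(strftime('%Y', o.created_dt) AS INTEGER) AS yr,
       CAST(strftime('%m', o.created_dt) AS INTEGER) AS mon,
       COUNT(*) AS second_orders
FROM orders AS o
WHERE o.order_type = 'web'
  -- Exactly one earlier web order by the same customer => this is the second.
  AND (SELECT COUNT(*)
       FROM orders AS p
       WHERE p.customer = o.customer
         AND p.order_type = 'web'
         AND p.id < o.id) = 1
GROUP BY yr, mon
ORDER BY yr, mon
"""


def second_orders_per_month():
    """Load the sample orders and count second web orders per year/month."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER, customer INTEGER, order_type TEXT,
                             amount INTEGER, created_dt TEXT);
        INSERT INTO orders VALUES
            (1, 1, 'web', 5, '2017-01-01'), (2, 1, 'web', 7, '2017-01-05'),
            (3, 2, 'web', 2, '2017-01-07'), (4, 3, 'web', 2, '2017-02-01'),
            (5, 3, 'web', 3, '2017-02-01'), (6, 2, 'web', 5, '2017-03-15'),
            (7, 1, 'in_person', 7, '2017-02-01'),
            (8, 3, 'web', 8, '2017-01-01'), (9, 2, 'web', 1, '2017-04-01');
    """)
    rows = conn.execute(SECOND_ORDERS_SQL).fetchall()
    conn.close()
    return rows


for row in second_orders_per_month():
    print(row)
```

Each returned tuple is (year, month, second_orders). If "earlier" should mean an earlier created_dt instead of a smaller id, the p.id < o.id comparison is the only part that would need to change, plus a tie-breaker for orders placed on the same day.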
How does one query the time difference between consecutive rows in hierarchical data? For example, I'd like to go from the following table:
+-------+----------+---------------------+
| group_id | event | event_time |
+-------+----------+---------------------+
| 1 | alarm | 2016-12-01 17:53:12 |
| 1 | alarm | 2016-12-01 17:59:43 |
| 2 | purchase | 2016-11-29 09:49:47 |
| 2 | purchase | 2016-11-29 09:53:51 |
| 2 | purchase | 2016-11-29 09:57:59 |
| 2 | alarm | 2016-11-29 10:01:02 |
| 2 | alarm | 2016-11-29 10:13:27 |
| 2 | purchase | 2016-11-29 10:15:00 |
| 2 | purchase | 2016-11-29 10:16:24 |
+-------+----------+---------------------+
to:
+-------+----------+---------------------+------------+
| group_id | event | event_time | time_delta |
+-------+----------+---------------------+------------+
| 1 | alarm | 2016-12-01 17:53:12 | 0 |
| 1 | alarm | 2016-12-01 17:59:43 | 00:06:31 |
| 2 | purchase | 2016-11-29 09:49:47 | 0 |
| 2 | purchase | 2016-11-29 09:53:51 | 00:04:04 |
| 2 | purchase | 2016-11-29 09:57:59 | 00:04:08 |
| 2 | alarm | 2016-11-29 10:01:02 | 0 |
| 2 | alarm | 2016-11-29 10:13:27 | 00:12:25 |
| 2 | purchase | 2016-11-29 10:15:00 | 0 |
| 2 | purchase | 2016-11-29 10:16:24 | 00:01:24 |
+-------+----------+---------------------+------------+
The data above is illustrative; my data actually has many groups and many events. So basically, I'd like to calculate the time difference whenever the group_id and the event are the same in consecutive rows.
You can get the previous time for a given group by doing:
select t.*,
(select t2.event_time
from t t2
where t2.group_id = t.group_id and
t2.event = t.event and
t2.event_time < t.event_time
order by t2.event_time desc
limit 1
) as prev_event_time
from t;
You can then get the time difference in a variety of ways, such as:
select t.*, timediff(event_time, prev_event_time)
from (select t.*,
(select t2.event_time
from t t2
where t2.group_id = t.group_id and
t2.event = t.event and
t2.event_time < t.event_time
order by t2.event_time desc
limit 1
) as prev_event_time
from t
) t
Try this using user defined variables:
SELECT
group_id, event, event_time, diff time_delta
FROM
(SELECT
t1.*,
CASE
WHEN @event = event AND @group = group_id THEN TIME_FORMAT(TIMEDIFF(event_time, @et), '%H:%i:%s')
ELSE 0
END diff,
@event := event,
@group := group_id,
@et := event_time
FROM
(SELECT
*
FROM
your_table
ORDER BY group_id , event_time) t1
CROSS JOIN (SELECT @event := '', @group := -1, @et := '') t2) t;
The @et variable stores the previous event_time within each run of rows that share the same group_id and event.
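To make the role of the three variables concrete, here is the same bookkeeping written as a plain loop. It is only an illustration in Python, not part of the answer: the function name and the prev_* names are made up, and the delta is returned in whole seconds rather than as an HH:MM:SS string.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def time_deltas(rows):
    """rows: (group_id, event, event_time) tuples; returns rows plus delta in seconds.

    prev_group, prev_event and prev_time play the role of the SQL
    user variables holding the previous row's group, event and time.
    """
    prev_group = prev_event = prev_time = None
    out = []
    # Same ordering as the SQL: by group_id, then event_time.
    for group_id, event, event_time in sorted(rows, key=lambda r: (r[0], r[2])):
        current = datetime.strptime(event_time, FMT)
        if group_id == prev_group and event == prev_event:
            delta = int((current - prev_time).total_seconds())
        else:
            delta = 0  # first row of a new run of identical group/event
        out.append((group_id, event, event_time, delta))
        prev_group, prev_event, prev_time = group_id, event, current
    return out


sample = [
    (1, "alarm", "2016-12-01 17:53:12"),
    (1, "alarm", "2016-12-01 17:59:43"),
    (2, "purchase", "2016-11-29 09:49:47"),
    (2, "purchase", "2016-11-29 09:53:51"),
    (2, "purchase", "2016-11-29 09:57:59"),
    (2, "alarm", "2016-11-29 10:01:02"),
    (2, "alarm", "2016-11-29 10:13:27"),
    (2, "purchase", "2016-11-29 10:15:00"),
    (2, "purchase", "2016-11-29 10:16:24"),
]

for row in time_deltas(sample):
    print(row)
```

The deltas match the time_delta column in the question once converted, for example 391 seconds for 00:06:31 and 745 seconds for 00:12:25.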
I am working with a dataset with a similar format to the following:
Table: Account
*-----------*----------*-------------*
| id | amount | date |
*-----------*----------*-------------*
| 1 | 100 | 01/01/2016 |
| 2 | 100 | 01/02/2016 |
| 3 | 100 | 01/03/2016 |
| 4 | 200 | 01/04/2016 |
| 5 | 200 | 01/05/2016 |
| 6 | 200 | 01/06/2016 |
| 7 | 300 | 01/07/2016 |
| 8 | 300 | 01/08/2016 |
| 9 | 300 | 01/09/2016 |
| 10 | 400 | 01/10/2016 |
*-----------*----------*-------------*
I need a query that returns the most recent record for every distinct amount in the table. So, the above table would return:
*-----------*----------*-------------*
| id | amount | date |
*-----------*----------*-------------*
| 3 | 100 | 01/03/2016 |
| 6 | 200 | 01/06/2016 |
| 9 | 300 | 01/09/2016 |
| 10 | 400 | 01/10/2016 |
*-----------*----------*-------------*
I am still new to subqueries but I tried the following
SELECT a.id, a.amount, a.date FROM account a WHERE a.date IN (SELECT MAX(date) FROM account)
However, this only returns the row with the latest date overall. How can I get the latest date for every distinct value in the amount column?
If you only need amount:
SELECT amount, MAX(date) from myTable group by amount
If you need more data:
SELECT * from myTable where (amount, date) IN (
SELECT amount, MAX(date) as date from myTable group by amount
)
Or maybe this will run faster:
SELECT * from myTable A WHERE NOT EXISTS (
SELECT 1
FROM myTable B
WHERE A.date < B.date
AND A.amount = B.amount
)
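Another common way to write this, not shown in the answer above, is to join the table to its own per-amount maximum. A small sketch of that variant follows, assuming SQLite via Python's sqlite3 module and assuming the dates are stored in ISO format (YYYY-MM-DD, reading the sample as month/day/year) so that MAX() on the text column picks the latest date. The function name and the max_date alias are hypothetical.

```python
import sqlite3

LATEST_PER_AMOUNT_SQL = """
SELECT a.id, a.amount, a.date
FROM account AS a
JOIN (
    -- Latest date for each distinct amount.
    SELECT amount, MAX(date) AS max_date
    FROM account
    GROUP BY amount
) AS m
  ON m.amount = a.amount AND m.max_date = a.date
ORDER BY a.id
"""


def latest_per_amount():
    """Load the sample account rows and return the newest row per amount."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER, amount INTEGER, date TEXT)")
    sample = [
        (1, 100, "2016-01-01"), (2, 100, "2016-01-02"), (3, 100, "2016-01-03"),
        (4, 200, "2016-01-04"), (5, 200, "2016-01-05"), (6, 200, "2016-01-06"),
        (7, 300, "2016-01-07"), (8, 300, "2016-01-08"), (9, 300, "2016-01-09"),
        (10, 400, "2016-01-10"),
    ]
    conn.executemany("INSERT INTO account VALUES (?, ?, ?)", sample)
    rows = conn.execute(LATEST_PER_AMOUNT_SQL).fetchall()
    conn.close()
    return rows


for row in latest_per_amount():
    print(row)
```

If two rows share both the amount and the latest date, this variant returns both of them, the same as the IN and NOT EXISTS versions.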
I have a table with a list of agent_ids, a previous_status, a new status, and a time stamp. I'm trying to determine the time difference between each status change, by agent, in order to determine how long an agent was active in a particular status.
For example:
+------+--------------+--------------+----------------+----------------------+
| id | agent_id | old_status | new_status | date_time |
+----------------------------------------------------------------------------+
| 1 | 1 | offline | online | 2015-06-11 09:00:01 |
| 2 | 1 | online | busy | 2015-06-11 09:30:23 |
| 3 | 3 | offline | online | 2015-06-11 09:31:27 |
| 4 | 1 | busy | offline | 2015-06-11 09:31:45 |
| 5 | 3 | online | offline | 2015-06-11 09:32:10 |
+----------------------------------------------------------------------------+
The goal would be to create a new result table with a time_difference column. The time_difference for row 5, for example, should be 43 seconds, which is the difference between row 5 (the most recent status for agent_id 3) and row 3, the previous status for agent_id 3. Likewise, the time_difference for row 4 should be the difference between row 4 and row 2.
You can do something along the lines of
SELECT id, agent_id, old_status, new_status, date_time, seconds
FROM
(
SELECT id, agent_id, old_status, new_status, date_time,
IF(@a = agent_id, TIMESTAMPDIFF(SECOND, @p, date_time), NULL) seconds,
@a := agent_id, @p := date_time
FROM table1 t CROSS JOIN (SELECT @p := NULL, @a := NULL) i
ORDER BY agent_id, id
) q
Output:
+------+----------+------------+------------+---------------------+---------+
| id | agent_id | old_status | new_status | date_time | seconds |
+------+----------+------------+------------+---------------------+---------+
| 1 | 1 | offline | online | 2015-06-11 09:00:01 | NULL |
| 2 | 1 | online | busy | 2015-06-11 09:30:23 | 1822 |
| 4 | 1 | busy | offline | 2015-06-11 09:31:45 | 82 |
| 3 | 3 | offline | online | 2015-06-11 09:31:27 | NULL |
| 5 | 3 | online | offline | 2015-06-11 09:32:10 | 43 |
+------+----------+------------+------------+---------------------+---------+
Here is a SQLFiddle demo
You can approach this without variables, using a correlated subquery:
select t.*,
timestampdiff(second, t.date_time, t.next_date_time) as secs
from (select t.*,
(select t2.date_time
from table1 t2
where t2.agent_id = t.agent_id and
t2.date_time > t.date_time
order by t2.date_time
limit 1
) as next_date_time
from table1 t
) t
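The correlated-subquery idea can be turned around to look at the previous status change instead of the next one, which gives the time_difference column the question asks for. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module, the table name status_log and the function name are hypothetical, and the difference in seconds comes from strftime('%s', ...) where MySQL would use TIMESTAMPDIFF.

```python
import sqlite3

TIME_DIFF_SQL = """
SELECT s.id,
       CAST(strftime('%s', s.date_time) AS INTEGER)
       - CAST(strftime('%s', (
             -- Most recent earlier status change for the same agent.
             SELECT MAX(p.date_time)
             FROM status_log AS p
             WHERE p.agent_id = s.agent_id
               AND p.date_time < s.date_time
         )) AS INTEGER) AS seconds
FROM status_log AS s
ORDER BY s.id
"""


def status_durations():
    """Load the sample status changes and return (id, seconds) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE status_log (id INTEGER, agent_id INTEGER,
                                 old_status TEXT, new_status TEXT,
                                 date_time TEXT);
        INSERT INTO status_log VALUES
            (1, 1, 'offline', 'online',  '2015-06-11 09:00:01'),
            (2, 1, 'online',  'busy',    '2015-06-11 09:30:23'),
            (3, 3, 'offline', 'online',  '2015-06-11 09:31:27'),
            (4, 1, 'busy',    'offline', '2015-06-11 09:31:45'),
            (5, 3, 'online',  'offline', '2015-06-11 09:32:10');
    """)
    rows = conn.execute(TIME_DIFF_SQL).fetchall()
    conn.close()
    return rows


for row in status_durations():
    print(row)
```

The first row of each agent has no earlier row, so its difference is NULL (None in Python). The other values agree with the seconds column in the variable-based output above, including 43 seconds for row 5.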