Cumulative sum on time interval - mysql

The following is the users table:
---------------------------------------
| uid | reg_date |
---------------------------------------
| 1 | 2011-07-20 02:24:36 |
---------------------------------------
| 2 | 2012-10-03 07:37:43 |
---------------------------------------
| ... | ... ... ... ... ... |
---------------------------------------
| 300000 | 2015-12-19 04:13:51 |
---------------------------------------
I want to get the data for the last year, counting back from CURDATE(), on a monthly basis from this table.
I have tried the following query:
SELECT month,
@cnt := @cnt + total cum_sum
FROM (
SELECT MONTH(reg_date) month,
COUNT(*) total
FROM users
WHERE reg_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY YEAR(reg_date), MONTH(reg_date)
) n, (SELECT @cnt := 0) users_alias
However, it returns the last twelve months as if there were no data before them. I want the running total to start from the actual cumulative count at the beginning of that period. How can I achieve this? Thanks.
UPDATE
desired output
-----------------------
| month | cum_sum |
-----------------------
| 10 | 1000 |
-----------------------
| 11 | 1500 |
-----------------------
| 12 | 2550 |
-----------------------
| 1 | 9700 |
-----------------------
| 2 | 11000 |
-----------------------
| 3 | 14000 |
-----------------------
| 4 | 15700 |
-----------------------
| 5 | 20000 |
-----------------------
| 6 | 22000 |
-----------------------
| 7 | 27000 |
-----------------------
| 8 | 31000 |
-----------------------
| 9 | 35000 |
-----------------------
| 10 | 41000 |
-----------------------

You initialize the @cnt variable to 0 at the end of the SQL statement:
(SELECT @cnt := 0) users_alias
You need to change this to initialize the counter to the number of users registered before the one-year window:
(SELECT @cnt := count(*) from users
where reg_date < CURDATE() - INTERVAL 1 YEAR) users_alias
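The idea of seeding the running total with the earlier registrations can be checked on a tiny data set. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the five sample rows are made up, and the fixed `cutoff` string is a hypothetical replacement for CURDATE() - INTERVAL 1 YEAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, reg_date TEXT)")
conn.executemany(
    "INSERT INTO users (uid, reg_date) VALUES (?, ?)",
    [
        (1, "2014-03-10 08:00:00"),  # registered before the window
        (2, "2014-11-02 09:30:00"),  # registered before the window
        (3, "2015-01-15 10:00:00"),
        (4, "2015-01-20 11:00:00"),
        (5, "2015-03-05 12:00:00"),
    ],
)

# Hypothetical fixed cutoff; in MySQL this would be CURDATE() - INTERVAL 1 YEAR.
cutoff = "2015-01-01"

# Baseline: everyone who registered before the window starts.
(baseline,) = conn.execute(
    "SELECT COUNT(*) FROM users WHERE reg_date < ?", (cutoff,)
).fetchone()

# Monthly registrations inside the window, in chronological order.
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m', reg_date) AS ym, COUNT(*) AS total
    FROM users
    WHERE reg_date >= ?
    GROUP BY ym
    ORDER BY ym
    """,
    (cutoff,),
).fetchall()

# Running total that starts from the baseline instead of 0.
running = baseline
cumulative = []
for ym, total in monthly:
    running += total
    cumulative.append((ym, running))

print(cumulative)
```

The first month already includes the two earlier users, which is the behaviour the question asks for.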

Related

Select complete record with earliest timestamp on a day for each employee [duplicate]

I have a table that stores facial login data of employees based upon employee id. I need to get the earliest login for each employee on a day and all other logins to be ignored. I know how to get latest or earliest record for each employee but I am unable to figure out how to get earliest entry in each day by each employee.
+----+-----------+--------------------------------------+-------------+-----------------------+
| id | camera_id | image_name | employee_id | created_at |
+----+-----------+--------------------------------------+-------------+-----------------------+
| 10 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-11 10:40:20 |
| 11 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-11 10:40:22 |
| 14 | 2 | 3p74yrq35nfaazwdo8auguvn2h5hpugtfvvw | 2 | 2020-07-11 12:07:24 |
| 15 | 2 | hpa2am40ufke7o7q2y733hh83h7ykxxdgkof | 16 | 2020-07-11 12:09:35 |
| 16 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-11 12:09:41 |
| 17 | 2 | tapufkiuj5toxfdoikjicbe3k7tl32yj5khp | 16 | 2020-07-12 12:09:47 |
| 18 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-12 14:40:20 |
| 19 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-12 15:40:22 |
| 20 | 2 | 3p74yrq35nfaazwdo8auguvn2h5hpugtfvvw | 2 | 2020-07-12 16:07:24 |
| 21 | 2 | hpa2am40ufke7o7q2y733hh83h7ykxxdgkof | 16 | 2020-07-12 17:09:35 |
| 22 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-13 12:09:41 |
+----+-----------+--------------------------------------+-------------+-----------------------+
The result will look like below...
+----+-----------+--------------------------------------+-------------+-----------------------+
| id | camera_id | image_name | employee_id | created_at |
+----+-----------+--------------------------------------+-------------+-----------------------+
| 10 | 2 | pjcc7vf142pec6li7k8kqxuqvnmhm0tyo8ib | 16 | 2020-07-11 10:40:20 |
| 11 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-11 10:40:22 |
| 17 | 2 | tapufkiuj5toxfdoikjicbe3k7tl32yj5khp | 16 | 2020-07-12 12:09:47 |
| 19 | 2 | 9iizfdtk3m81a745ut7tzqzqh8kf9ipz2u02 | 2 | 2020-07-12 15:40:22 |
| 22 | 2 | g7adgyzloab2t4z7xx2id0a9cjqx8ojfni99 | 2 | 2020-07-13 12:09:41 |
+----+-----------+--------------------------------------+-------------+-----------------------+
You can do:
select *
from t
where (employee_id, created_at) in (
select employee_id, min(created_at)
from t
group by employee_id, date(created_at)
)
"how to get earliest entry in each day by each employee"
You can filter with a correlated subquery:
select t.*
from mytable t
where t.created_at = (
select min(t1.created_at)
from mytable t1
where
t1.employee_id = t.employee_id
and t1.created_at >= date(t.created_at)
and t1.created_at < date(t.created_at) + interval 1 day
)
This query would take advantage of an index on (employee_id, created_at).
Or, if you are running MySQL 8.0, you can use window functions:
select *
from (
select
t.*,
row_number() over(
partition by employee_id, date(created_at)
order by created_at
) rn
from mytable t
) t
where rn = 1
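To check the "minimum per employee per day" logic against the sample rows from the question, here is a small sketch that runs in Python's built-in sqlite3 module (used here as an assumed stand-in for MySQL). The table name `logins` is hypothetical, only the columns needed for the filter are kept, and the query joins to a grouped subquery instead of using a tuple IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (id INTEGER PRIMARY KEY, employee_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO logins VALUES (?, ?, ?)",
    [
        (10, 16, "2020-07-11 10:40:20"),
        (11, 2, "2020-07-11 10:40:22"),
        (14, 2, "2020-07-11 12:07:24"),
        (15, 16, "2020-07-11 12:09:35"),
        (16, 2, "2020-07-11 12:09:41"),
        (17, 16, "2020-07-12 12:09:47"),
        (18, 16, "2020-07-12 14:40:20"),
        (19, 2, "2020-07-12 15:40:22"),
        (20, 2, "2020-07-12 16:07:24"),
        (21, 16, "2020-07-12 17:09:35"),
        (22, 2, "2020-07-13 12:09:41"),
    ],
)

# For every (employee, calendar day) pair take the earliest timestamp,
# then join back to fetch the complete row that carries it.
earliest_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM logins t
        JOIN (
            SELECT employee_id, MIN(created_at) AS first_at
            FROM logins
            GROUP BY employee_id, date(created_at)
        ) f
          ON f.employee_id = t.employee_id
         AND f.first_at = t.created_at
        ORDER BY t.id
        """
    )
]

print(earliest_ids)
```

If an employee has two rows with exactly the same earliest timestamp on a day, this form returns both of them.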

Group subscription duration by month and year in mysql

I have a table of subscriptions, storing user id, subscription end date, program id. One user can be subscribed to many programs, but for the scope of the problem the latest date is considered as the end date of the subscription. The goal is to find the number of users whose subscription is ending for each month of each year. To illustrate it:
-------------------------------------------
| user_id | program_id | end_date |
-------------------------------------------
| 1 | 1 | 2015-12-10 |
| 1 | 2 | 2017-08-27 |
| 2 | 1 | 2017-09-20 |
| 3 | 2 | 2017-10-01 |
| 2 | 3 | 2017-09-18 |
| 5 | 12 | 2017-10-22 |
| 4 | 3 | 2017-10-10 |
| 3 | 8 | 2018-11-15 |
-------------------------------------------
The intermediate result shows when the subscription will end for each user (only the month is needed):
------------------------------
| user_id | enddate |
------------------------------
| 1 | 2017-08 |
| 2 | 2017-09 |
| 3 | 2018-11 |
| 4 | 2017-10 |
| 5 | 2017-10 |
------------------------------
This was achieved with the query:
Select user_id, DATE_FORMAT(max(end_date), '%Y-%m') AS enddate
From subscription
Group by user_id
Order by enddate desc;
The final result must further filter the list, showing only how many users will be left with no subscription in each month, like this:
------------------------------
| count | month, year |
------------------------------
| 1 | 2017-08 |
| 1 | 2017-09 |
| 2 | 2017-10 |
| 1 | 2018-11 |
------------------------------
This is where I am stuck with no mysql ideas. Iterating through the results and counting is out of the question.
You could try grouping the results by enddate, like this:
select count(user_id), DATE_FORMAT(max_end_date, '%Y-%m') as enddate
from (
select user_id, max(end_date) as max_end_date
From subscription
Group by user_id
) n
group by enddate
Order by enddate desc;
Try this -
Select COUNT(*), enddate
From (
Select DATE_FORMAT(MAX(end_date), '%Y-%m') AS enddate
From subscription
Group by user_id
) s
Group by enddate
Order by enddate desc;
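The two-step aggregation (latest end date per user, then a count per month) can be verified on the sample rows from the question. This is a sketch only: it uses Python's built-in sqlite3 module as an assumed stand-in for MySQL, so strftime replaces DATE_FORMAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscription (user_id INTEGER, program_id INTEGER, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO subscription VALUES (?, ?, ?)",
    [
        (1, 1, "2015-12-10"),
        (1, 2, "2017-08-27"),
        (2, 1, "2017-09-20"),
        (3, 2, "2017-10-01"),
        (2, 3, "2017-09-18"),
        (5, 12, "2017-10-22"),
        (4, 3, "2017-10-10"),
        (3, 8, "2018-11-15"),
    ],
)

# Inner query: one row per user with the latest end date.
# Outer query: count how many users end in each year-month.
ending_per_month = conn.execute(
    """
    SELECT strftime('%Y-%m', max_end) AS enddate, COUNT(*) AS users
    FROM (
        SELECT user_id, MAX(end_date) AS max_end
        FROM subscription
        GROUP BY user_id
    ) s
    GROUP BY enddate
    ORDER BY enddate
    """
).fetchall()

print(ending_per_month)
```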

Row counter per Column

Say I have a table like so
| id | user_id | event_id | created_at |
|----|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 2 | 6 | 7 | 2015-01-02 |
| 3 | 3 | 8 | 2015-01-01 |
| 4 | 5 | 9 | 2015-01-04 |
| 5 | 5 | 10 | 2015-01-02 |
| 6 | 6 | 1 | 2015-01-01 |
I want to be able to generate a counter of events per user. So my result would be:
| counter | user_id | event_id | created_at |
|---------|---------|----------|------------|
| 1 | 5 | 10 | 2015-01-01 |
| 1 | 6 | 7 | 2015-01-02 |
| 1 | 3 | 8 | 2015-01-01 |
| 2 | 5 | 9 | 2015-01-04 |
| 3 | 5 | 10 | 2015-01-02 |
| 2 | 6 | 1 | 2015-01-01 |
One idea is to self-join the table and group the result, which replicates the row_number() over (...) window function available in other RDBMSs.
Check this Rextester Demo and see the second query to understand how the inner join works in this case.
select t1.user_id,
t1.event_id,
t1.created_at,
count(*) as counter
from your_table t1
inner join your_table t2
on t1.user_id=t2.user_id
and t1.id>=t2.id
group by t1.user_id,
t1.event_id,
t1.created_at
order by t1.user_id,t1.event_id;
Output:
+---------+----------+------------+---------+
| user_id | event_id | created_at | counter |
+---------+----------+------------+---------+
| 3 | 8 | 01-01-2015 | 1 |
| 5 | 10 | 01-01-2015 | 1 |
| 5 | 10 | 02-01-2015 | 3 |
| 5 | 9 | 04-01-2015 | 2 |
| 6 | 1 | 01-01-2015 | 2 |
| 6 | 7 | 02-01-2015 | 1 |
+---------+----------+------------+---------+
Try the following:
select counter,
xx.user_id,
xx.event_id,
xx.created_at
from xx
join (select a.id,
a.user_id,
count(*) as counter
from xx as a
join xx as b
on a.user_id=b.user_id
and b.id<=a.id
group by 1,2) as counts
on xx.id=counts.id
Use a join to generate rows for each id with all the other lower ids for that user below it and count them.
Try this one:
A correlated subquery will help to get this result.
select (select count(*) from user_event iue where iue.user_id = oue.user_id and iue.id <= oue.id) as counter,
oue.user_id,
oue.event_id,
oue.created_at
from user_event oue
You could try to use a variable as a table, cross join it with the source table and reset whenever user id changes.
SELECT @counter := CASE
WHEN @user = user_id THEN @counter + 1
ELSE 1
END AS counter,
@user := user_id AS user_id,
event_id,
created_at
FROM your_table m,
(SELECT @counter := 0,
@user := '') AS t
ORDER BY user_id;
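The per-user counter can also be checked without user variables. The sketch below counts, for each row, how many rows of the same user have an id less than or equal to its own. It runs in Python's built-in sqlite3 module, which is an assumed stand-in for MySQL here, and uses the sample rows from the question with the hypothetical table name `user_event`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_event ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, event_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO user_event VALUES (?, ?, ?, ?)",
    [
        (1, 5, 10, "2015-01-01"),
        (2, 6, 7, "2015-01-02"),
        (3, 3, 8, "2015-01-01"),
        (4, 5, 9, "2015-01-04"),
        (5, 5, 10, "2015-01-02"),
        (6, 6, 1, "2015-01-01"),
    ],
)

# counter = number of rows for the same user with an id up to and
# including the current row's id, i.e. a per-user row number by id.
rows = conn.execute(
    """
    SELECT (SELECT COUNT(*)
            FROM user_event i
            WHERE i.user_id = o.user_id
              AND i.id <= o.id) AS counter,
           o.user_id,
           o.event_id,
           o.created_at
    FROM user_event o
    ORDER BY o.id
    """
).fetchall()

for row in rows:
    print(row)
```

The rows come back in id order, matching the layout of the desired result in the question.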

Select second min() or second smallest from mysql table

I'm wondering how to select the second smallest value from a mysql table, grouped on a non-numeric column. If I have a table that looks like this:
+----+----------+------------+--------+------------+
| id | customer | order_type | amount | created_dt |
+----+----------+------------+--------+------------+
| 1 | 1 | web | 5 | 2017-01-01 |
| 2 | 1 | web | 7 | 2017-01-05 |
| 3 | 2 | web | 2 | 2017-01-07 |
| 4 | 3 | web | 2 | 2017-02-01 |
| 5 | 3 | web | 3 | 2017-02-01 |
| 6 | 2 | web | 5 | 2017-03-15 |
| 7 | 1 | in_person | 7 | 2017-02-01 |
| 8 | 3 | web | 8 | 2017-01-01 |
| 9 | 2 | web | 1 | 2017-04-01 |
+----+----------+------------+--------+------------+
I want to count the number of second orders in each month/year. I also have a customer table (which is where the customer ids come from). I can find the number of customers with at least 2 orders, grouped by the customer's created date, by querying:
select date(c.created_dt) as create_date, count(c.id)
from customer c
where c.id in
(select ord.identity_id
from orders ord
where
(select count(o.created_dt)
from orders o
where ord.customer = o.customer and o.order_type in ('web')
) > 1
)
group by 1;
However, that result gives customers by their created date, and I can't seem to figure out how to find the number of second orders by date.
The desired output I'd like to see, based on the data above, is:
+-------+------+---------------+
| month | year | second_orders |
+-------+------+---------------+
| 1 | 2017 | 1 |
| 2 | 2017 | 1 |
| 3 | 2017 | 1 |
+-------+------+---------------+
One way to approach this:
SELECT YEAR(created_dt) year, MONTH(created_dt) month, COUNT(*) second_orders
FROM (
SELECT created_dt,
@rn := IF(@c = customer, @rn + 1, 1) rn,
@c := customer
FROM orders CROSS JOIN (
SELECT @c := NULL, @rn := 1
) i
WHERE order_type = 'web'
ORDER BY customer, id
) q
WHERE rn = 2
GROUP BY YEAR(created_dt), MONTH(created_dt)
ORDER BY year, month
Output:
+------+-------+---------------+
| year | month | second_orders |
+------+-------+---------------+
| 2017 | 1 | 1 |
| 2017 | 2 | 1 |
| 2017 | 3 | 1 |
+------+-------+---------------+
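The same result can be reproduced without user variables by treating a "second order" as a web order that has exactly one earlier web order (by id) from the same customer. This is a sketch under assumptions: it runs in Python's built-in sqlite3 module as a stand-in for MySQL, uses the sample rows from the question, and the aliases `yr` and `mon` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer INTEGER, order_type TEXT, "
    "amount INTEGER, created_dt TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "web", 5, "2017-01-01"),
        (2, 1, "web", 7, "2017-01-05"),
        (3, 2, "web", 2, "2017-01-07"),
        (4, 3, "web", 2, "2017-02-01"),
        (5, 3, "web", 3, "2017-02-01"),
        (6, 2, "web", 5, "2017-03-15"),
        (7, 1, "in_person", 7, "2017-02-01"),
        (8, 3, "web", 8, "2017-01-01"),
        (9, 2, "web", 1, "2017-04-01"),
    ],
)

# A row is a customer's second web order when exactly one earlier
# web order (smaller id) exists for the same customer.
second_orders = conn.execute(
    """
    SELECT CAST(strftime('%Y', o.created_dt) AS INTEGER) AS yr,
           CAST(strftime('%m', o.created_dt) AS INTEGER) AS mon,
           COUNT(*) AS second_orders
    FROM orders o
    WHERE o.order_type = 'web'
      AND (SELECT COUNT(*)
           FROM orders p
           WHERE p.customer = o.customer
             AND p.order_type = 'web'
             AND p.id < o.id) = 1
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()

print(second_orders)
```

Like the variable-based query, this orders each customer's purchases by id, not by created_dt, which matters for customer 3 in the sample data.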

Get last balance sign change in (My)SQL

I have a Transaction table that records every amount added to or subtracted from the balance of a Customer, with the new balance:
+----+------------+------------+--------+---------+
| id | customerId | timestamp | amount | balance |
+----+------------+------------+--------+---------+
| 1 | 1 | 1000000001 | 10 | 10 |
| 2 | 1 | 1000000002 | -20 | -10 |
| 3 | 1 | 1000000003 | -10 | -20 |
| 4 | 2 | 1000000004 | -5 | -5 |
| 5 | 2 | 1000000005 | -5 | -10 |
| 6 | 2 | 1000000006 | 10 | 0 |
| 7 | 3 | 1000000007 | -5 | -5 |
| 8 | 3 | 1000000008 | 10 | 5 |
| 9 | 3 | 1000000009 | 10 | 15 |
| 10 | 4 | 1000000010 | 5 | 5 |
+----+------------+------------+--------+---------+
The Customer table stores the current balance, and looks like:
+----+---------+
| id | balance |
+----+---------+
| 1 | -20 |
| 2 | 0 |
| 3 | 15 |
| 4 | 5 |
+----+---------+
I would like to add a balanceSignSince column, that would store the timestamp at which the balance sign last changed. Transitioning to and from positive, negative, or zero counts as a balance change.
After the update, based on the above data, the Customer table should contain:
+----+---------+------------------+
| id | balance | balanceSignSince |
+----+---------+------------------+
| 1 | -20 | 1000000002 |
| 2 | 0 | 1000000006 |
| 3 | 15 | 1000000008 |
| 4 | 5 | 1000000010 |
+----+---------+------------------+
How can I write a SQL query that updates every Customer with the last time the balance sign changed, based on the Transaction table?
I suspect I can't do this without a quite complex stored procedure, but am curious to see if any clever ideas come up.
This uses a simulated rank() function.
select customerId, min(tstamp) from
(
select tstamp,
if (@cust = customerId and sign(@bal) = sign(balance), @rn := @rn,
if (@cust = customerId and sign(@bal) <> sign(balance), @rn := @rn + 1, @rn := 0)) as rn,
@cust := customerId as customerId, @bal := balance as balance
from
(select @rn := 0) x,
(select id, @cust := customerId as customerId, tstamp, amount, @bal := balance as balance
from trans order by customerId, tstamp desc) y
) z
where rn = 0
group by customerId;
Check it: http://rextester.com/XJVKK61181
This script returns a table like this:
+------------+----+------------+---------+
| tstamp | rn | customerId | balance |
+------------+----+------------+---------+
| 1000000003 | 0 | 1 | -20 |
| 1000000002 | 0 | 1 | -10 |
| 1000000001 | 1 | 1 | 10 |
| 1000000006 | 0 | 2 | 0 |
| 1000000005 | 2 | 2 | -10 |
| 1000000004 | 2 | 2 | -5 |
| 1000000009 | 0 | 3 | 15 |
| 1000000008 | 2 | 3 | 5 |
| 1000000007 | 3 | 3 | -5 |
| 1000000010 | 0 | 4 | 5 |
+------------+----+------------+---------+
Then selecting min(tstamp) of the rows where rn = 0:
+------------+-------------+
| customerId | min(tstamp) |
+------------+-------------+
| 1 | 1000000002 |
+------------+-------------+
| 2 | 1000000006 |
+------------+-------------+
| 3 | 1000000009 |
+------------+-------------+
| 4 | 1000000010 |
+------------+-------------+
Updated answer with the restriction that this needs to work on the existing data
The following query should work for most cases. There is still an issue with customers that have only a single transaction or no sign change. As this is a one-time update, I would run the query below and then do a simple update for all customers that still have no timestamp set; for them, it is the timestamp of their first transaction:
# Find the smallest timestamp, e.g. the
# transaction which changed the signum.
SELECT
p.customerId as customerId,
MIN(t.timestamp) as balanceSignSince
FROM
transaction as t,
(
# find the latest timestamp having
# a different sign for each user.
# Here is the issue with users having
# only a single transaction or no sign
# changes.
SELECT
u.customerId as customerId,
MAX(t.timestamp) as balanceSignSince
FROM
transaction as t,
customer as c,
(
# find the timestamp of the very last
# transaction for every user.
SELECT
t.customerId as customerId,
MAX(t.timestamp) as lastTransaction
FROM
transaction as t
GROUP BY
t.customerId
) as u
WHERE
u.customerId = c.id
AND u.customerId = t.customerId
AND SIGN(c.balance) <> SIGN(t.balance)
GROUP BY
u.customerId
) as p
WHERE
p.customerId = t.customerId
AND p.balanceSignSince < t.timestamp
GROUP BY
p.customerId;
Fiddle: http://sqlfiddle.com/#!9/bd0760/13
Original Answer
This should work to get the timestamp of a sign change:
SELECT
c.id as id,
MAX(t.timestamp) as balanceSignSince
FROM
transaction as t,
customer as c
WHERE
t.customerId = c.id
AND SIGN(t.balance) <> SIGN(c.balance)
GROUP BY
c.id;
This needs to be executed before the customer table is updated with the new balance. If you have a trigger on inserts into the transaction table, you should probably put the above into the query updating the customer table.
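For the one-time backfill on existing data, one possible formulation is: find the latest transaction whose balance sign differs from the customer's current sign, then take the earliest transaction after it. If no transaction has a different sign, fall back to the customer's first transaction. The sketch below is an illustration under assumptions, not a tested MySQL statement: it runs in Python's built-in sqlite3 module, names the tables `trans` and `customer` with a `tstamp` column, and computes the sign as `(x > 0) - (x < 0)` instead of calling SIGN().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trans (
        id INTEGER PRIMARY KEY, customerId INTEGER,
        tstamp INTEGER, amount INTEGER, balance INTEGER
    );
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, balance INTEGER, balanceSignSince INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO trans VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1000000001, 10, 10),
        (2, 1, 1000000002, -20, -10),
        (3, 1, 1000000003, -10, -20),
        (4, 2, 1000000004, -5, -5),
        (5, 2, 1000000005, -5, -10),
        (6, 2, 1000000006, 10, 0),
        (7, 3, 1000000007, -5, -5),
        (8, 3, 1000000008, 10, 5),
        (9, 3, 1000000009, 10, 15),
        (10, 4, 1000000010, 5, 5),
    ],
)
conn.executemany(
    "INSERT INTO customer (id, balance) VALUES (?, ?)",
    [(1, -20), (2, 0), (3, 15), (4, 5)],
)

# Inner subquery: latest transaction with a different sign (0 if none).
# Outer subquery: earliest transaction after that point.
conn.execute(
    """
    UPDATE customer
    SET balanceSignSince = (
        SELECT MIN(t.tstamp)
        FROM trans t
        WHERE t.customerId = customer.id
          AND t.tstamp > COALESCE((
                SELECT MAX(d.tstamp)
                FROM trans d
                WHERE d.customerId = customer.id
                  AND ((d.balance > 0) - (d.balance < 0))
                      <> ((customer.balance > 0) - (customer.balance < 0))
              ), 0)
    )
    """
)

result = conn.execute(
    "SELECT id, balance, balanceSignSince FROM customer ORDER BY id"
).fetchall()
print(result)
```

On the sample data this reproduces the expected Customer table from the question, including customer 4, who has a single transaction.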