MySQL: Finding how many users transacted per month

I have been tasked with finding how many users performed a transaction in every month of 2020.
I know I have two tables to work with.
Table Name: Receipts|Columns: receipt_id, collection_id, user_id, amount
Table Name: Games |Columns: game_id, collection_id, game_date_time
I tried this, but I don't think it makes sense or works:
select month(games.game_date_time) AS Month, sum(receipts.id) from bills
join games on bills.collection_id = games.collection_id
WHERE YEAR(games.game_date_time) = 2020
group by receipts.user_id, month(games.game_date_time)
order by month(games.game_date_time)

Use COUNT() to get a count, not SUM(). If you want a count of users without counting the same user twice, use COUNT(DISTINCT user_id) and leave user_id out of the GROUP BY.
SELECT MONTH(g.game_date_time) AS month, COUNT(DISTINCT r.user_id) AS users
FROM receipts AS r
JOIN games AS g ON r.collection_id = g.collection_id
WHERE YEAR(g.game_date_time) = 2020
GROUP BY month
ORDER BY month
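As a side note, wrapping the column in YEAR() generally stops MySQL from using an index on game_date_time for the filter. A minimal sketch of the same count with a plain date range instead, assuming game_date_time is a DATETIME or TIMESTAMP column (the question does not state its type):

```sql
-- Same result as filtering on YEAR(g.game_date_time) = 2020, written as a
-- half-open range so an index on game_date_time (if one exists) can be used.
SELECT MONTH(g.game_date_time) AS month,
       COUNT(DISTINCT r.user_id) AS users
FROM receipts AS r
JOIN games AS g ON r.collection_id = g.collection_id
WHERE g.game_date_time >= '2020-01-01'
  AND g.game_date_time <  '2021-01-01'
GROUP BY month
ORDER BY month;
```

The upper bound is exclusive, so rows timestamped at any time on 31 December 2020 are still included.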

find how many users performed a transaction in every month in 2020
SELECT COUNT(*) AS users
FROM (
    SELECT r.user_id
    FROM receipts AS r
    JOIN games AS g USING (collection_id)
    WHERE YEAR(g.game_date_time) = 2020
    GROUP BY r.user_id
    HAVING COUNT(DISTINCT MONTH(g.game_date_time)) = MONTH(CURRENT_DATE)
) AS users_in_every_month
This query:
Selects rows for 2020 only (the current year when this was written).
For each user, counts the distinct months in which the user has payments and compares that with the current month number. If the user has payments in every month so far (including the current one!), the two values are equal.
Counts the users that match this condition.
PS. The query will give wrong results from 2021 onward. To get correct results in the future, use
HAVING COUNT(DISTINCT MONTH(g.game_date_time)) = CASE YEAR(CURRENT_DATE)
WHEN 2020
THEN MONTH(CURRENT_DATE)
ELSE 12
END

Related

I would like to count the number of users who made multiple purchases grouped by month

What I'm trying to do here is count the number of repeat users (users who made more than one order) in a period of time, be it day, month or year; in this case, months.
I'm currently running MySQL/MariaDB and I'm pretty much a beginner in MySQL. I've tried multiple subqueries, but all have failed so far.
This is what I have tried so far:
This returns the number of all users, with no condition on the order count.
Since people are asking for sample data, here is what the data looks like at the moment:
Order_Creation_Date  User_ID  Order_ID
2019-01-01           123      1
2019-01-01           123      2
2019-01-01           231      3
2019-01-01           231      4
This is the query I am using to get the result, but it keeps returning the total number of users within the month:
select month(o.created_at)month,
year(o.created_at)year,
count(distinct o.user_uuid) from orders o
group by month(o.created_at)
having count(*)>1
And this returns the number of users as 1:
select month(o.created_at)month,
year(o.created_at)year,
(select count(distinct ord.user_uuid) from orders ord
where ord.user_uuid = o.user_uuid
group by ord.user_uuid
having count(*)>1) from orders o
group by month(o.created_at)
The expected result from the sample data above is:
Month  Count of repeat users
1      2
If you want the number of users that make more than one purchase in January, then do two levels of aggregations: one by user and month and the other by month:
select yyyy, mm, sum(num_orders > 1) as num_repeat_users
from (select year(o.created_at) as yyyy, month(o.created_at) as mm,
             o.user_uuid, count(*) as num_orders
      from orders o
      group by yyyy, mm, o.user_uuid
     ) o
group by yyyy, mm;
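To see the two-level aggregation at work, here is a small sketch that loads the four sample rows from the question and runs it. The table definition is an assumption (the question only shows the data, with integer user IDs), and the column names follow the asker's queries:

```sql
-- Hypothetical table mirroring the sample rows from the question.
CREATE TABLE orders (
    order_id   INT PRIMARY KEY,
    user_uuid  INT NOT NULL,
    created_at DATE NOT NULL
);

INSERT INTO orders (order_id, user_uuid, created_at) VALUES
    (1, 123, '2019-01-01'),
    (2, 123, '2019-01-01'),
    (3, 231, '2019-01-01'),
    (4, 231, '2019-01-01');

-- Inner level: one row per user per month with that user's order count.
-- Outer level: per month, count the users whose order count exceeds 1.
SELECT yyyy, mm, SUM(num_orders > 1) AS num_repeat_users
FROM (SELECT YEAR(created_at) AS yyyy, MONTH(created_at) AS mm,
             user_uuid, COUNT(*) AS num_orders
      FROM orders
      GROUP BY yyyy, mm, user_uuid) AS per_user
GROUP BY yyyy, mm;
-- Users 123 and 231 each placed two orders in January 2019,
-- so the single result row is: 2019, 1, 2.
```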
I think you should try something like this, which returns the list of user IDs, by month and year, that ordered more than once in the period:
SELECT
o.user_uuid,
MONTH(o.created_at) AS month,
YEAR(o.created_at) AS year,
COUNT(*) AS num_orders
FROM orders o
GROUP BY
o.user_uuid, YEAR(o.created_at), MONTH(o.created_at)
HAVING COUNT(*) > 1;
Further, if you are looking for how many users placed more than one order, you can use the above query as a subquery and count the 'user_uuid' column.
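A rough sketch of that subquery approach, assuming the orders table and the created_at / user_uuid columns from the question (the derived-table alias repeat_users is just an illustrative name):

```sql
-- Inner query: one row per (year, month, user) with more than one order.
-- Outer query: count those rows per month.
SELECT yyyy, mm, COUNT(*) AS num_repeat_users
FROM (SELECT YEAR(o.created_at) AS yyyy,
             MONTH(o.created_at) AS mm,
             o.user_uuid
      FROM orders AS o
      GROUP BY yyyy, mm, o.user_uuid
      HAVING COUNT(*) > 1) AS repeat_users
GROUP BY yyyy, mm;
```

One difference from the SUM(num_orders > 1) version: a month with no repeat users produces no row at all here, rather than a row with 0.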

Calculate new users subscription amount MySQL

I have a dataset where I need to find out New subscribers revenue.
These are subscribers that are paying either weekly or monthly depending on the subscription they are on.
The unique identifier is "customer" and the data is at timestamp level, but I want it rolled up at monthly level.
Now for each month, we need to find out revenue for only NEW subscribers.
Basically, imagine customers being on monthly/weekly subscriptions and we only want their FIRST Payments to be counted here.
Here's a sample dataset:
created             customer            amount
16-Feb-18 14:03:55  cus_BwcisIF1YR1UlD  33300
16-Feb-18 14:28:13  cus_BpLsCvjuubYZAe  156250
15-Feb-18 19:19:14  cus_C3vT6uVBqJC1wz  50000
14-Feb-18 23:00:24  cus_BME5vNeXAeZSN2  162375
9-Feb-18 14:27:26   cus_BpLsCvjuubYZAe  156250
....and so on...
Here is the final desired output:
yearmonth   new_amount
Jan - 2018  100000
Feb - 2018  2000
Dec - 2017  100002
This needs to be done in MySQL.
Basically, you want to filter the data to each customer's first payment. One method of doing this involves a correlated subquery.
The rest is just aggregating by year and month. So, overall the query is not that complicated, but it does consist of two distinct parts:
select year(created) as yyyy, month(created) as mm,
count(*) as num_news,
sum(amount) as amount_news
from t
where t.created = (select min(t2.created)
from t t2
where t2.customer = t.customer
)
group by yyyy, mm
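The sample data shows created as text like '16-Feb-18 14:03:55' and the desired output uses labels like 'Feb - 2018'. If the column really is stored as a string in that format (an assumption; the question does not say), a possible variant parses it with STR_TO_DATE and builds the label with DATE_FORMAT. The aliases p, f, created_dt and first_dt are illustrative names only:

```sql
-- Sketch: keep each customer's first payment, then sum per month.
-- Assumes t.created is a string such as '16-Feb-18 14:03:55'.
SELECT DATE_FORMAT(p.created_dt, '%b - %Y') AS yearmonth,
       SUM(p.amount) AS new_amount
FROM (SELECT customer, amount,
             STR_TO_DATE(created, '%d-%b-%y %H:%i:%s') AS created_dt
      FROM t) AS p
JOIN (SELECT customer,
             MIN(STR_TO_DATE(created, '%d-%b-%y %H:%i:%s')) AS first_dt
      FROM t
      GROUP BY customer) AS f
  ON f.customer = p.customer
 AND f.first_dt = p.created_dt
GROUP BY yearmonth
ORDER BY MIN(p.created_dt);
```

If created is already a DATETIME column, the STR_TO_DATE calls can be dropped and the column used directly.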
We can use a subquery to keep only the first payment of each new customer, then sum the amount for every month and year.
The query is as follows:
SELECT month(created) as mm, year(created) as yyyy,
sum(amount) as new_amount
FROM t
WHERE t.created = (select min(t2.created) from t t2
where t2.customer = t.customer)
GROUP BY yyyy, mm

Growth for each quarter+year in SQL over my user table

I am using MYSQL and I have a User database table where my registered users are stored. I'd love to see how many users have registered on an increasing timeline for each quarter. So maybe Q1 2016 I had 1000 users total, then in Q2 2016 I had 2000 users register, in Q3 2016 4000 total users registered, etc (so I want to see the increase, not just how many registered in each quarter)
From another Stack Overflow post, I was able to create a query to see it by each day:
select u.created, count(*)
from (select distinct date(DateCreated) created from `Users`) u
join `Users` u2 on u.created >= date(u2.DateCreated)
group by u.created
and this works for each day, but I'd like to now group it by quarter and year. I tried using the QUARTER(d) function in mysql and even QUARTER(d) + YEAR(d) to concat it but I still can't get the data right (The count(*) ends up producing incredibly high values).
Would anyone be able to help me get my data grouped by quarter/year? My timestamp column is called DateCreated (it's a unix timestamp in milliseconds, so I have to divide by 1000 too)
Thanks so much
I would suggest using a correlated subquery -- this allows you to easily define each row in the result set. I think this is the logic that you want:
select dates.yyyy, dates.q,
       (select count(*)
        from Users u
        where u.DateCreated < dates.mindc + interval 3 month
       ) as cnt
from (select year(DateCreated) as yyyy, quarter(DateCreated) as q,
             min(DateCreated) as mindc
      from Users u
      group by year(DateCreated), quarter(DateCreated)
     ) dates;
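Since the question says DateCreated is a Unix timestamp in milliseconds, here is an alternative sketch that converts it with FROM_UNIXTIME and builds the running total with a window function. This assumes MySQL 8.0+ or MariaDB 10.2+ (window functions are not available in older versions); the aliases per_quarter, new_users and total_users are illustrative:

```sql
-- Inner query: registrations per calendar quarter.
-- Outer query: running total of those counts in chronological order.
SELECT yyyy, q, new_users,
       SUM(new_users) OVER (ORDER BY yyyy, q) AS total_users
FROM (SELECT YEAR(FROM_UNIXTIME(DateCreated / 1000))    AS yyyy,
             QUARTER(FROM_UNIXTIME(DateCreated / 1000)) AS q,
             COUNT(*) AS new_users
      FROM Users
      GROUP BY yyyy, q) AS per_quarter
ORDER BY yyyy, q;
```

Each user is counted once in the inner query, so the totals cannot blow up the way a self-join on a date condition can.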

Select from 2 tables only the new entries in table 2

I have 2 tables, Customer and Payment. I'm trying to select only the new customers: those that have payments in the specified month and year, and no payments in any earlier month.
table Customer
id - name
table Payment
id - id_customer - month - year - amount
SELECT * FROM customer, payment
WHERE Customer.id = Payment.id_customer
AND month = '$month'
AND year = '$year'
That gets me all the payments in a specific month and year, but I don't know how to exclude all the customers that had other previous payments.
Thank you for your time.
I don't think you can achieve this without a third table. What you can do is create a third table holding all the ids that you have already selected, and update it every time you run the select query.
Then the query below might work:
SELECT * FROM customer c, payment p WHERE c.id = p.id_customer
AND month = '$month' AND year = '$year' AND p.id NOT IN (SELECT id FROM
third_table)
Hope this answers your question.
To get the first payment month for each customer, use GROUP BY. But you will have to convert the values to something that sorts like a date first:
SELECT p.id_customer, MIN(CONCAT_WS('-', p.year, LPAD(p.month, 2, '0'))) as first_yyyymm
FROM payment p
GROUP BY p.id_customer;
You should store the payment date as a date.
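With the current schema, the "no previous payments" condition can also be written directly with NOT EXISTS. This is only a sketch: it assumes month and year are numeric columns, and the literal 2020 / 5 stand in for the '$year' / '$month' values from the question:

```sql
-- Customers with a payment in May 2020 and no payment in any earlier month.
SELECT c.id, c.name, p.amount
FROM customer AS c
JOIN payment AS p ON p.id_customer = c.id
WHERE p.year = 2020
  AND p.month = 5
  AND NOT EXISTS (
      SELECT 1
      FROM payment AS prev
      WHERE prev.id_customer = p.id_customer
        AND (prev.year < p.year
             OR (prev.year = p.year AND prev.month < p.month))
  );
```

If the payment date were stored in a single DATE column, the inner comparison would shrink to one "earlier than" test.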

Mysql time calculation with join

I have two tables: sales, actions
Sales table:
id, datetime, status
--------------------
Actions table:
id, datetime, sales_id, action
------------------------------
There's a many-to-one relationship between the actions and sales tables. For each sales record, there could be numerous actions. I am trying to determine, by each hour of the day, what the average time difference is between when sales records are first created and when the first action record associated with its respective sales record was created.
In other words, how fast (in hours) are sales agents responding to leads, based on what hour of the day the lead came in.
Here's what I tried:
SELECT
FROM_UNIXTIME(sales.datetime, '%H') as Hour,
count(actions.id) AS actions,
(MIN(actions.datetime) - sales.datetime) / 3600 as Lag
FROM
actions
INNER JOIN sales ON actions.sales_id = sales.id
group by Hour
I get what looks like reasonable hours numbers for 'Lag', but I am not convinced they're accurate:
Hour Actions Lag
00 66 11.0442
01 30 11.2758
02 50 8.2900
03 25 5.7492
.
.
.
23 77 34.4744
My question is, is this the correct way to get the values for the first action that was recorded for a given sales record? :
(MIN(actions.createDate) - sales.createDate) / 3600 as Lag
It should be:
MIN(actions.datetime - sales.datetime) / 3600 AS Lag
Your way is getting the first action from any sale within the hour and subtracting each sale's timestamp from it. You need to do the subtraction only between actions and sales that are joined by the ID.
This query has two layers, and it's helpful to crawl through them both.
The lowest layer should compute the lag time from sales.datetime to the earliest actions.datetime for each row of sales. That will probably use a MIN() function.
The next layer will compute the statistics for those lag times, worked out in the lowest layer, by hour of the day. That will use an AVG() function.
Here's the lowest layer:
SELECT s.id, s.datetime, s.status,
       TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
FROM sales AS s
JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
The second part of that ON clause makes sure that you only consider actions taken after the sales order was entered. It may be unnecessary, but I thought I'd throw it in.
Here's the second layer.
SELECT HOUR(datetime) AS hour_Sale_entered,
       COUNT(*) AS number_in_that_hour,
       AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
    SELECT s.id, s.datetime, s.status,
           TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
    FROM sales AS s
    JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
    GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
See how there are two nested aggregation (GROUP BY) operations? The inner one identifies the first action, and the outer one does the hourly averaging.
One more tidbit. If you want to include sales items that have not yet been acted on, you can do this:
SELECT HOUR(datetime) AS hour_Sale_entered,
       COUNT(*) AS number_in_that_hour,
       SUM(no_action) AS not_acted_upon_yet,
       AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
    SELECT s.id, s.datetime, s.status,
           TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds,
           IF(COUNT(a.id) = 0, 1, 0) AS no_action
    FROM sales AS s
    LEFT JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
    GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
The average of lag_seconds will still be correct, because the sales rows with no action rows will have NULL values for that, and AVG() ignores nulls.
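The question's own query calls FROM_UNIXTIME(sales.datetime, '%H'), which suggests the datetime columns hold Unix timestamps in seconds rather than DATETIME values. If that is the case (an assumption), the same two-layer idea can be sketched with plain subtraction; sale_ts and the other aliases are illustrative names:

```sql
-- Inner layer: seconds from each sale to its first action.
-- Outer layer: average of those lags per hour of day the sale came in.
SELECT HOUR(FROM_UNIXTIME(d.sale_ts)) AS hour_sale_entered,
       COUNT(*) AS number_in_that_hour,
       AVG(d.lag_seconds) / 3600.0 AS lag_to_first_action
FROM (SELECT s.id,
             s.datetime AS sale_ts,
             MIN(a.datetime) - s.datetime AS lag_seconds
      FROM sales AS s
      JOIN actions AS a
        ON a.sales_id = s.id
       AND a.datetime >= s.datetime
      GROUP BY s.id, s.datetime) AS d
GROUP BY hour_sale_entered
ORDER BY hour_sale_entered;
```

FROM_UNIXTIME converts using the session time zone, so the hour buckets follow whatever time zone the connection is set to.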