How to calculate percent? - mysql

Could you help me calculate the percentage of users who made payments?
I've got two tables:
activity
user_id login_time
201 01.01.2017
202 01.01.2017
255 04.01.2017
255 05.01.2017
256 05.01.2017
260 15.03.2017
payments
user_id payment_date
200 01.01.2017
202 01.01.2017
255 05.01.2017
I tried this query, but it calculates the wrong percentage:
SELECT activity.login_time, (select COUNT(distinct payments.user_id)
from payments where payments.payment_time between '2017-01-01' and
'2017-01-05') / COUNT(distinct activity.user_id) * 100
AS percent
FROM payments INNER JOIN activity ON
activity.user_id = payments.user_id and activity.login_time between
'2017-01-01' and '2017-01-05'
GROUP BY activity.login_time;
I need this result:
01.01.2017 100%
02.01.2017 0%
03.01.2017 0%
04.01.2017 0%
05.01.2017 50%

If you want the ratio of users who have made payments to those with activity, just summarize each table individually:
select p.cnt / a.cnt
from (select count(distinct user_id) as cnt from activity a) a cross join
(select count(distinct user_id) as cnt from payments) p;
EDIT:
You need a table with all dates in the range. That is the biggest problem.
Then I would recommend:
SELECT d.dte,
       ( ( SELECT COUNT(DISTINCT p.user_id)
           FROM payments p
           WHERE p.payment_date >= d.dte and p.payment_date < d.dte + INTERVAL 1 DAY
         ) /
         NULLIF( (SELECT COUNT(DISTINCT a.user_id)
                  FROM activity a
                  WHERE a.login_time >= d.dte and a.login_time < d.dte + INTERVAL 1 DAY
                 ), 0
               )
       ) as ratio
FROM (SELECT date('2017-01-01') dte UNION ALL
      SELECT date('2017-01-02') dte UNION ALL
      SELECT date('2017-01-03') dte UNION ALL
      SELECT date('2017-01-04') dte UNION ALL
      SELECT date('2017-01-05') dte
     ) d;
Notes:
This returns NULL on days where there is no activity. That makes more sense to me than 0.
This uses logic on the dates that works for both dates and date/time values.
The logic for dates can make use of an index, which can be important for this type of query.
I don't recommend using LEFT JOINs. That will multiply the data which can make the query expensive.
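For anyone who wants to try the correlated-subquery approach without a MySQL server, here is a minimal sketch that runs it against the question's sample data. It assumes Python's built-in sqlite3 as a stand-in for MySQL, so `date(d.dte, '+1 day')` replaces `d.dte + INTERVAL 1 DAY`, the dates are stored as ISO text, and `1.0 *` is added because SQLite would otherwise do integer division.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; dates are stored as ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (user_id INTEGER, login_time TEXT);
    CREATE TABLE payments (user_id INTEGER, payment_date TEXT);
    INSERT INTO activity VALUES
        (201, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-04'),
        (255, '2017-01-05'), (256, '2017-01-05'), (260, '2017-03-15');
    INSERT INTO payments VALUES
        (200, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-05');
""")

query = """
    SELECT d.dte,
           -- 1.0 * forces real division in SQLite
           1.0 * (SELECT COUNT(DISTINCT p.user_id)
                    FROM payments p
                   WHERE p.payment_date >= d.dte
                     AND p.payment_date <  date(d.dte, '+1 day'))
           / NULLIF((SELECT COUNT(DISTINCT a.user_id)
                       FROM activity a
                      WHERE a.login_time >= d.dte
                        AND a.login_time <  date(d.dte, '+1 day')), 0) AS ratio
      FROM (SELECT '2017-01-01' AS dte UNION ALL
            SELECT '2017-01-02' UNION ALL
            SELECT '2017-01-03' UNION ALL
            SELECT '2017-01-04' UNION ALL
            SELECT '2017-01-05') d
     ORDER BY d.dte
"""

ratios = conn.execute(query).fetchall()
for day, ratio in ratios:
    # Days without any activity come back as None (SQL NULL).
    print(day, ratio)
```

With this data the ratio is 1.0 on the 1st, 0.0 on the 4th, 0.5 on the 5th, and NULL on the two days without logins.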

First you need a table with all days in the range. Since the range is small you can build an ad hoc derived table using UNION ALL. Then left join the payments and activities. Group by the day and calculate the percentage using the count()s.
SELECT x.day,
concat(CASE count(DISTINCT a.user_id)
WHEN 0 THEN
0
ELSE
count(DISTINCT p.user_id)
/
count(DISTINCT a.user_id)
END
*
100,
'%')
FROM (SELECT cast('2017-01-01' AS date) day
UNION ALL
SELECT cast('2017-01-02' AS date) day
UNION ALL
SELECT cast('2017-01-03' AS date) day
UNION ALL
SELECT cast('2017-01-04' AS date) day
UNION ALL
SELECT cast('2017-01-05' AS date) day) x
LEFT JOIN payments p
ON p.payment_date = x.day
LEFT JOIN activity a
ON a.login_time = x.day
GROUP BY x.day;
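As a rough check of the LEFT JOIN approach, the sketch below runs it on the question's sample data and also reports how many joined rows each day produces. It assumes Python's sqlite3 instead of MySQL, returns a plain number rather than a `concat`-ed string, and returns 0 for days without activity, which matches the result the question asks for. The `joined_rows` column is only there for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; dates are stored as ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (user_id INTEGER, login_time TEXT);
    CREATE TABLE payments (user_id INTEGER, payment_date TEXT);
    INSERT INTO activity VALUES
        (201, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-04'),
        (255, '2017-01-05'), (256, '2017-01-05'), (260, '2017-03-15');
    INSERT INTO payments VALUES
        (200, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-05');
""")

query = """
    SELECT x.day,
           COUNT(*) AS joined_rows,   -- illustration only: rows after both joins
           CASE WHEN COUNT(DISTINCT a.user_id) = 0 THEN 0.0
                ELSE 100.0 * COUNT(DISTINCT p.user_id)
                           / COUNT(DISTINCT a.user_id)
           END AS percent
      FROM (SELECT '2017-01-01' AS day UNION ALL
            SELECT '2017-01-02' UNION ALL
            SELECT '2017-01-03' UNION ALL
            SELECT '2017-01-04' UNION ALL
            SELECT '2017-01-05') x
      LEFT JOIN payments p ON p.payment_date = x.day
      LEFT JOIN activity a ON a.login_time = x.day
     GROUP BY x.day
     ORDER BY x.day
"""

per_day = conn.execute(query).fetchall()
for day, joined_rows, percent in per_day:
    print(day, joined_rows, percent)
```

On the 1st, two payments joined with two logins give four intermediate rows. COUNT(DISTINCT ...) hides that multiplication in the result, but the work still grows with both tables.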

Related

Calculate Ratio of two different SQL queries result that return numbers

I have Query 1
SELECT COUNT(DISTINCT user_id) total_daily_active_user_group_month FROM (SELECT user_id , MONTHNAME(time) mon , COUNT(*) cnt FROM ACTIVITIES
WHERE MONTH(time) = MONTH(NOW() - INTERVAL 1 MONTH) GROUP by user_id, MONTH(time) ) as x
Returns 18
Query 2
SELECT COUNT(DISTINCT user_id) total_daily_active_user_group_month FROM (SELECT user_id , MONTHNAME(time) mon , COUNT(*) cnt FROM ACTIVITIES
WHERE MONTH(time) = MONTH(NOW() - INTERVAL 1 MONTH) GROUP by user_id, MONTH(time) having cnt=31) as x
Returns 6
I want the ratio of query 1 to query 2, i.e. 18/6. I am using MySQL.
If you use both queries as CTEs, then it becomes relatively simple:
WITH q1
     AS (SELECT Count(DISTINCT user_id) total_daily_active_user_group_month
         FROM (SELECT user_id,
                      Monthname(TIME) mon,
                      Count(*) cnt
               FROM activities
               WHERE Month(TIME) = Month(Now() - interval 1 month)
               GROUP BY user_id,
                        Month(TIME)) x),
     q2
     AS (SELECT Count(DISTINCT user_id) total_daily_active_user_group_month
         FROM (SELECT user_id,
                      Monthname(TIME) mon,
                      Count(*) cnt
               FROM activities
               WHERE Month(TIME) = Month(Now() - interval 1 month)
               GROUP BY user_id,
                        Month(TIME)
               HAVING cnt = 31) x)
SELECT q1.total_daily_active_user_group_month /
       q2.total_daily_active_user_group_month
       AS result
FROM q1
CROSS JOIN q2;
You commented that you got an error pointing to the WITH keyword, which suggests a MySQL version older than 8.0 (CTEs were introduced in 8.0). In that case, switch to two subqueries; simplified:
select a.value / b.value as result
from (select count(distinct user_id) value
from ... your 1st query goes here
) a,
(select count(distinct user_id) value
from ... your 2nd query goes here
) b;
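To make the two-subquery pattern concrete, here is a small sketch on made-up data, assuming Python's sqlite3 in place of MySQL. The table contents, the `ts` column, and the "active on all 3 days" condition are hypothetical stand-ins for the month-long queries in the question. `1.0 *` is only needed because SQLite does integer division, and `NULLIF` guards against dividing by zero.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data below is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (user_id INTEGER, ts TEXT);
    INSERT INTO activities VALUES
        (1, '2024-01-01'), (1, '2024-01-02'), (1, '2024-01-03'),
        (2, '2024-01-01'),
        (3, '2024-01-01'), (3, '2024-01-02'), (3, '2024-01-03'),
        (4, '2024-01-02'), (4, '2024-01-03');
""")

query = """
    SELECT a.cnt AS all_users,
           b.cnt AS every_day_users,
           1.0 * a.cnt / NULLIF(b.cnt, 0) AS result
      FROM (SELECT COUNT(DISTINCT user_id) AS cnt
              FROM activities) a
     CROSS JOIN
           (SELECT COUNT(*) AS cnt
              FROM (SELECT user_id
                      FROM activities
                     GROUP BY user_id
                    HAVING COUNT(*) = 3) x) b
"""

all_users, every_day_users, result = conn.execute(query).fetchone()
print(all_users, every_day_users, result)
```

Each derived table returns exactly one row, so the cross join also returns exactly one row, holding both counts side by side.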

How to set default value from mysql join interval yearmonth

I have a problem with my query. I have two tables and I want to join them to get results based on the primary key of the first table, but one row from the first table is missing.
This is my fiddle.
As you can see, "xx3" is missing from month 1.
I have tried switching between left and right joins, but the results are still the same.
I use coalesce(sum(b.sd_qty),0) as total so that the total defaults to 0 when there is no quantity.
You should cross join the table to the distinct dates also:
SELECT a.item_code,
COALESCE(SUM(b.sd_qty), 0) total,
DATE_FORMAT(d.sd_date, '%m-%Y') month_year
FROM item a
CROSS JOIN (
SELECT DISTINCT sd_date
FROM sales_details
WHERE sd_date >= '2020-04-01' - INTERVAL 3 MONTH AND sd_date < '2020-05-01'
) d
LEFT JOIN sales_details b
ON a.item_code = b.item_code AND b.sd_date = d.sd_date
GROUP BY month_year, a.item_code
ORDER BY month_year, a.item_code;
Or, for MySQL 8.0+, use a recursive CTE that returns the starting dates of all the months for which you want results, which can be cross joined to the table:
WITH RECURSIVE dates AS (
SELECT '2020-04-01' - INTERVAL 3 MONTH AS sd_date
UNION ALL
SELECT sd_date + INTERVAL 1 MONTH
FROM dates
WHERE sd_date + INTERVAL 1 MONTH < '2020-05-01'
)
SELECT a.item_code,
COALESCE(SUM(b.sd_qty), 0) total,
DATE_FORMAT(d.sd_date, '%m-%Y') month_year
FROM item a CROSS JOIN dates d
LEFT JOIN sales_details b
ON a.item_code = b.item_code AND DATE_FORMAT(b.sd_date, '%m-%Y') = DATE_FORMAT(d.sd_date, '%m-%Y')
GROUP BY month_year, a.item_code
ORDER BY month_year, a.item_code;
See the demo.
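If no fiddle is at hand, the recursive-CTE variant can be sketched locally as below. This assumes Python's sqlite3 rather than MySQL, so `date(..., '+1 month')` and `strftime('%Y-%m', ...)` replace `+ INTERVAL 1 MONTH` and `DATE_FORMAT`. The rows in `item` and `sales_details` are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_code TEXT PRIMARY KEY);
    CREATE TABLE sales_details (item_code TEXT, sd_qty INTEGER, sd_date TEXT);
    INSERT INTO item VALUES ('xx1'), ('xx2'), ('xx3');
    INSERT INTO sales_details VALUES
        ('xx1', 5, '2020-01-10'), ('xx2', 3, '2020-01-15'),
        ('xx1', 2, '2020-02-01'), ('xx3', 7, '2020-03-05');
""")

query = """
    WITH RECURSIVE months(month_start) AS (
        SELECT '2020-01-01'
        UNION ALL
        SELECT date(month_start, '+1 month')
          FROM months
         WHERE date(month_start, '+1 month') < '2020-05-01'
    )
    SELECT a.item_code,
           strftime('%Y-%m', m.month_start) AS ym,
           COALESCE(SUM(b.sd_qty), 0) AS total
      FROM item a
     CROSS JOIN months m          -- every item paired with every month
      LEFT JOIN sales_details b
        ON b.item_code = a.item_code
       AND strftime('%Y-%m', b.sd_date) = strftime('%Y-%m', m.month_start)
     GROUP BY a.item_code, m.month_start
     ORDER BY m.month_start, a.item_code
"""

rows = conn.execute(query).fetchall()
totals = {(code, ym): total for code, ym, total in rows}
for row in rows:
    print(row)
```

The cross join builds the full item-by-month grid first, so an item with no sales in a month still gets a row, and `COALESCE` turns its missing sum into 0.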

MySQL loop and multiple LEFT joins

I got the following code:
SELECT
COALESCE(rv.views, 0) as views
FROM
( select 0 as n
union all select 1
union all select 2
union all select 3 ) n
LEFT JOIN restaurant_views rv
on rv.date = date_add("2015-02-24", interval - n.n day)
and restaurant_id = 192
This code gives me the number of views a restaurant had over the last 4 days.
I am looking for a similar query to get the number of likes a restaurant had over the last 4 days.
This is what I got so far:
SELECT
( COUNT( DISTINCT a.restaurant_id)
+ COUNT( DISTINCT d.restaurant_id)) as num_likes
FROM
( select 0 as n
union all select 1
union all select 2
union all select 3 ) n
LEFT JOIN apple_likes a
on a.vote_date = date_add("2015-02-24", interval - n.n day)
and a.restaurant_id = 192
LEFT JOIN android_likes d
on d.vote_date = date_add("2015-02-24", interval - n.n day)
and d.restaurant_id = 192
And here is the output, which is as you can see not what I'm looking for:
What do I have to change to get the number of likes in the last query?
(I have checked that the restaurant has likes on all days, so I am positive it's something wrong with the query)
Try this one:
SELECT
( COALESCE(a.likes, 0)
+ COALESCE(d.likes, 0)) as num_likes
FROM
( select 0 as n
union all select 1
union all select 2
union all select 3 ) n
LEFT JOIN (
SELECT vote_date,COUNT(*) as likes
FROM apple_likes
WHERE restaurant_id = 192
GROUP BY restaurant_id, vote_date
) as a
on a.vote_date = date_add("2015-02-24", interval - n.n day)
LEFT JOIN (
SELECT vote_date, COUNT(*) as likes
FROM android_likes
WHERE restaurant_id = 192
GROUP BY restaurant_id, vote_date
) as d
on d.vote_date = date_add("2015-02-24", interval - n.n day)
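Here is a small sketch of the same idea (aggregate each likes table per day first, then join the two one-row-per-day results to the day series). It assumes Python's sqlite3 in place of MySQL, so `date('2015-02-24', '-N days')` replaces `date_add(...)`, and the like rows are invented. `COALESCE` keeps a day with likes on only one platform from turning into NULL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the like rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apple_likes (restaurant_id INTEGER, vote_date TEXT);
    CREATE TABLE android_likes (restaurant_id INTEGER, vote_date TEXT);
    INSERT INTO apple_likes VALUES
        (192, '2015-02-24'), (192, '2015-02-24'),
        (192, '2015-02-23'), (5, '2015-02-24');
    INSERT INTO android_likes VALUES
        (192, '2015-02-24'), (192, '2015-02-22'),
        (192, '2015-02-22'), (192, '2015-02-22');
""")

query = """
    SELECT date('2015-02-24', '-' || n.n || ' days') AS day,
           COALESCE(a.likes, 0) + COALESCE(d.likes, 0) AS num_likes
      FROM (SELECT 0 AS n UNION ALL SELECT 1
            UNION ALL SELECT 2 UNION ALL SELECT 3) n
      LEFT JOIN (SELECT vote_date, COUNT(*) AS likes
                   FROM apple_likes
                  WHERE restaurant_id = 192
                  GROUP BY vote_date) a
        ON a.vote_date = date('2015-02-24', '-' || n.n || ' days')
      LEFT JOIN (SELECT vote_date, COUNT(*) AS likes
                   FROM android_likes
                  WHERE restaurant_id = 192
                  GROUP BY vote_date) d
        ON d.vote_date = date('2015-02-24', '-' || n.n || ' days')
     ORDER BY n.n
"""

likes_per_day = conn.execute(query).fetchall()
for day, num_likes in likes_per_day:
    print(day, num_likes)
```

Because each subquery returns at most one row per day, the two joins cannot multiply each other's rows, which is what went wrong when both raw tables were joined directly.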
I can think of a couple items that might be what you are encountering...
Just because somebody VIEWS a restaurant, does that mean they actually VOTED??? And if Voted, are the only two devices that of apple or android? What if viewing from a browser and they are on a Windows machine browser-based?
Date Equality. In the restaurant views table, is the date field ALWAYS at a time of 00:00:00 (i.e. midnight, the very start of the day)? If the time-stamps of the votes are anything other than 00:00:00, then comparing a plain date against a date + time is probably failing. What you may need is a comparison of date( vote_date ) = date( date_add( ... )) so that BOTH sides ignore the time component... Now, that being said, a function applied to a date column is not going to be optimized, even if the restaurant ID is numeric and part of the index key... the lookup would only be PARTIALLY optimized. You may want to add a generic date filter such as AND vote_date >= '2015-02-20' so the engine can optimize on the restaurant and date, then apply DATE( vote_date ) for the actual qualifying of records.
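A quick way to see the date-versus-timestamp effect is the sketch below, which assumes Python's sqlite3 rather than MySQL and uses three invented vote rows. SQLite compares these values as text, so midnight is written out as '2015-02-24 00:00:00' to mimic what a MySQL DATETIME comparison against '2015-02-24' would do. The half-open range at the end is one common alternative to wrapping the column in a function.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the vote rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apple_likes (restaurant_id INTEGER, vote_date TEXT);
    INSERT INTO apple_likes VALUES
        (192, '2015-02-24 00:00:00'),
        (192, '2015-02-24 13:45:00'),
        (192, '2015-02-25 08:00:00');
""")


def count_votes(condition):
    """Count restaurant 192's votes that satisfy the given SQL condition."""
    sql = ("SELECT COUNT(*) FROM apple_likes "
           "WHERE restaurant_id = 192 AND " + condition)
    return conn.execute(sql).fetchone()[0]


# Equality only matches the vote stamped exactly at midnight.
exact = count_votes("vote_date = '2015-02-24 00:00:00'")

# Stripping the time component matches every vote on that day.
by_date = count_votes("date(vote_date) = '2015-02-24'")

# A half-open range matches the same votes without a function on the column.
by_range = count_votes(
    "vote_date >= '2015-02-24 00:00:00' AND vote_date < '2015-02-25 00:00:00'"
)

print(exact, by_date, by_range)
```

The range form leaves the column bare, so an index that includes vote_date stays usable for the date part as well.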

COALESCE issue with MySQL

I have a pretty flat table - tbl_values which has userids as well as netAmounts in a given row. In the example below, 2280 has no records in the past 30 days based on the timestamp.
I'd expect this to return 3 rows, with 2280 as "0" - but I'm only getting 2 back? Am I missing something obvious here?
SELECT userid, (COALESCE(SUM(netAmount),0)) as Sum FROM `tbl_values` where userid in (2280, 399, 2282) and date > (select DATE_SUB(NOW(), INTERVAL 30 day)) GROUP BY userid
Assuming you always want to return the user, regardless of whether they have a matching record in tbl_values, what you're looking for is an outer join:
SELECT u.userid, COALESCE(SUM(v.netAmount),0) as Sum
FROM (
SELECT 2280 userid UNION ALL
SELECT 399 UNION ALL
SELECT 2282
) u
LEFT JOIN `tbl_values` v ON u.userid = v.userid AND
v.date > DATE_SUB(NOW(), INTERVAL 30 day)
GROUP BY u.userid
If you perhaps have a Users table, then you can use it instead of the subquery.
SELECT u.userid, COALESCE(SUM(v.netAmount),0) as Sum
FROM users u
LEFT JOIN `tbl_values` v ON u.userid = v.userid AND
v.date > DATE_SUB(NOW(), INTERVAL 30 day)
WHERE u.userid in (2280, 399, 2282)
GROUP BY u.userid
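To see why the original query drops a user while the outer join keeps all three, here is a sketch on invented rows. It assumes Python's sqlite3 instead of MySQL, and a fixed, hypothetical "today" of 2024-04-01 passed in as a parameter in place of `DATE_SUB(NOW(), INTERVAL 30 day)` so the result is reproducible.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: SQLite stands in for MySQL; rows and "today" are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_values (userid INTEGER, netAmount REAL, date TEXT);
    INSERT INTO tbl_values VALUES
        (399, 10.0, '2024-03-20'), (399, 5.0, '2024-03-25'),
        (2282, 7.5, '2024-03-28'),
        (2280, 99.0, '2024-01-01');   -- older than 30 days
""")

cutoff = (date(2024, 4, 1) - timedelta(days=30)).isoformat()

# Original shape: the WHERE clause removes every row of user 2280,
# so no group is left for COALESCE to act on.
filtered = conn.execute("""
    SELECT userid, COALESCE(SUM(netAmount), 0) AS total
      FROM tbl_values
     WHERE userid IN (2280, 399, 2282) AND date > ?
     GROUP BY userid
     ORDER BY userid
""", (cutoff,)).fetchall()

# Outer join: the user list drives the result; the date test sits in ON.
outer_joined = conn.execute("""
    SELECT u.userid, COALESCE(SUM(v.netAmount), 0) AS total
      FROM (SELECT 2280 AS userid UNION ALL
            SELECT 399 UNION ALL
            SELECT 2282) u
      LEFT JOIN tbl_values v
        ON v.userid = u.userid AND v.date > ?
     GROUP BY u.userid
     ORDER BY u.userid
""", (cutoff,)).fetchall()

print(filtered)
print(outer_joined)
```

Keeping the date condition in the ON clause matters: moving it to WHERE would filter out the NULL-extended row for 2280 again.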
This is your query:
SELECT userid, (COALESCE(SUM(netAmount),0)) as Sum
FROM `tbl_values`
where userid in (2280, 399, 2282) and
date > (select DATE_SUB(NOW(), INTERVAL 30 day))
GROUP BY userid;
The filter in the where clause finds no rows that match for user id 2280. Assuming that at least one row exists somewhere, you can get what you want by moving the date comparison to a conditional aggregation:
SELECT userid,
sum(case when date > DATE_SUB(NOW(), INTERVAL 30 day)
then netAmount else 0
end) as Sum
FROM `tbl_values`
WHERE userid in (2280, 399, 2282)
GROUP BY userid;
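The "at least one row exists somewhere" caveat is easy to check with a sketch like the following, which assumes Python's sqlite3 in place of MySQL, invented rows, and a fixed cutoff date standing in for `DATE_SUB(NOW(), INTERVAL 30 day)`. User 9999 is a hypothetical id with no rows at all.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows and cutoff are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_values (userid INTEGER, netAmount REAL, date TEXT);
    INSERT INTO tbl_values VALUES
        (399, 10.0, '2024-03-20'), (399, 5.0, '2024-03-25'),
        (2282, 7.5, '2024-03-28'),
        (2280, 99.0, '2024-01-01');   -- only an old row
""")

cutoff = '2024-03-02'  # stands in for "30 days ago"

# The date test moves from WHERE into the aggregate, so old rows still
# form a group but contribute 0 to the sum.
conditional = conn.execute("""
    SELECT userid,
           SUM(CASE WHEN date > ? THEN netAmount ELSE 0 END) AS total
      FROM tbl_values
     WHERE userid IN (2280, 399, 2282, 9999)
     GROUP BY userid
     ORDER BY userid
""", (cutoff,)).fetchall()

print(conditional)
```

User 2280 now shows up with 0 because its old row still forms a group, while 9999 is absent because there is nothing to group.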
EDIT:
If you really want all three results, then use a left join:
SELECT u.userid,
coalesce(sum(netAmount), 0) as Sum
FROM (select 2280 as userid union all
select 399 union all
select 2282
) u left join
tbl_values t
on u.userid = t.userid and
t.date > DATE_SUB(NOW(), INTERVAL 30 day)
GROUP BY u.userid;

MySQL query with join or subquery

I have such a schema and queries:
http://sqlfiddle.com/#!2/7b032/3
Separately I have these queries:
SELECT COUNT(*) AS 'times', userid, name
FROM main
WHERE comedate <= DATE_SUB(CURDATE(),
INTERVAL 5 DAY)
GROUP BY userid ORDER BY times DESC LIMIT 0,2;
SELECT * FROM details WHERE 1;
I need to join them by comparing the userid columns of both tables.
I need an output having these columns:
"times, userid, name, age, location"
The ordering, grouping, and limit should also be preserved.
I would be happy if you could write one query with JOIN and one query with a subquery.
I have a table with 60k rows and I will compare the performance.
How about this:
select x.times,
x.userid,
x.name,
d.age,
d.location
from
(
SELECT COUNT(*) AS 'times', userid, name
FROM main
WHERE comedate <= DATE_SUB(CURDATE(),
INTERVAL 5 DAY)
GROUP BY userid
) x
left join details d
on x.userid = d.userid
see SQL Fiddle with Demo
edit:
select x.times,
x.userid,
x.name,
d.age,
d.location
from
(
SELECT COUNT(*) AS 'times', userid, name
FROM main
WHERE comedate <= DATE_SUB(CURDATE(),
INTERVAL 5 DAY)
GROUP BY userid
ORDER BY times DESC
LIMIT 0,2
) x
left join details d
on x.userid = d.userid
see SQL Fiddle with demo
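For reference, the edited query can be tried on a few invented rows as sketched below. This assumes Python's sqlite3 rather than MySQL, a fixed cutoff date in place of `DATE_SUB(CURDATE(), INTERVAL 5 DAY)`, and `LIMIT 2` for `LIMIT 0,2`. It also adds `name` to the GROUP BY and an outer ORDER BY, since the join itself does not promise to keep the subquery's order.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows and cutoff are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main (userid INTEGER, name TEXT, comedate TEXT);
    CREATE TABLE details (userid INTEGER, age INTEGER, location TEXT);
    INSERT INTO main VALUES
        (1, 'alice', '2024-01-01'), (1, 'alice', '2024-01-02'),
        (1, 'alice', '2024-01-03'), (1, 'alice', '2024-03-30'),
        (2, 'bob',   '2024-01-01'), (2, 'bob',   '2024-01-02'),
        (3, 'carol', '2024-01-01');
    INSERT INTO details VALUES (1, 30, 'Paris'), (3, 41, 'Oslo');
""")

cutoff = '2024-03-27'  # stands in for "5 days ago"

query = """
    SELECT x.times, x.userid, x.name, d.age, d.location
      FROM (SELECT COUNT(*) AS times, userid, name
              FROM main
             WHERE comedate <= ?
             GROUP BY userid, name
             ORDER BY times DESC
             LIMIT 2) x               -- top 2 users picked before the join
      LEFT JOIN details d
        ON d.userid = x.userid
     ORDER BY x.times DESC
"""

top_users = conn.execute(query, (cutoff,)).fetchall()
for row in top_users:
    print(row)
```

Limiting inside the derived table means only two rows are joined to details, and the LEFT JOIN keeps a top user even when no details row exists for them.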