Avg function not returning proper value - mysql

I expect this query to give me the average number of daily active users to date, grouped by month (from October to December). But the result is approximately 164K when it should be 128K. Why is AVG not working? The average should be the SUM of the daily values divided by the number of days in the month (up to today for the current month).
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1

Ok guys thanks for your help. The problem was that on the subquery I was pulling the info by month and not by day. So I should pull the info by day there and group by month in the outer query. This finally worked:
SELECT EXTRACT(year_month FROM sq.day_month) AS month_year, AVG(number)
FROM (SELECT date(date) AS day_month,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY 1
) sq
GROUP BY EXTRACT(year_month FROM sq.day_month)
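The idea here is two aggregation levels: count distinct users per day first, then average those daily counts per month. Below is a minimal sketch of that, run against SQLite through Python's sqlite3 module so it is self-contained. The sample rows are made up, and strftime('%Y%m', ...) stands in for MySQL's EXTRACT(YEAR_MONTH FROM ...).

```python
import sqlite3

# Hypothetical sample data: one row per user activity event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date TEXT, id_user INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("2020-10-01 08:00:00", 1), ("2020-10-01 09:00:00", 2),
        ("2020-10-01 10:00:00", 1),  # same user again on the same day
        ("2020-10-02 08:00:00", 1), ("2020-10-02 08:30:00", 2),
        ("2020-10-02 09:00:00", 3), ("2020-10-02 09:30:00", 4),
        ("2020-11-01 08:00:00", 1),
        ("2020-11-02 08:00:00", 5), ("2020-11-02 09:00:00", 6),
        ("2020-11-02 10:00:00", 7),
    ],
)

query = """
SELECT strftime('%Y%m', sq.day) AS month_year, AVG(sq.number) AS avg_dau
FROM (SELECT date(date) AS day, COUNT(DISTINCT id_user) AS number
      FROM table1
      WHERE date >= '2020-10-01' AND date < '2021-01-01'
      GROUP BY date(date)) sq          -- inner level: one row per day
GROUP BY strftime('%Y%m', sq.day)      -- outer level: one row per month
ORDER BY month_year
"""
monthly_avg = conn.execute(query).fetchall()
print(monthly_avg)
```

Note that days with no activity produce no row in the inner query, so they are not counted in the denominator. If every calendar day should count, the daily counts would have to be joined against a list of all days first.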

Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error (although it works in older versions of MySQL).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.

Related

Can I combine separate month and year column for this query?

I currently am trying to track the number of messages sent by month as well as the volume's percent change in comparison to one year prior.
Here is my current query:
Select
a.mo,
a.ye,
a.Messages,
((a.Messages - b.Messages) / b.Messages) as "% Change"
from(
select
MONTH(post_date) as mo,
count(*) as "Messages",
YEAR(post_date) as ye
from
pm_messages
WHERE
post_date > "2018-01-01 00:00:00"
group by
year(post_date),
month(post_date)
) a
left join (
select
MONTH(post_date) as mo,
YEAR(post_date) as ye,
count(*) as "Messages"
from
pm_messages
group by
year(post_date),
month(post_date)
) b on a.mo = b.mo
and a.ye -1 = b.ye
This works great; however, it places month and year in separate columns, which has been messing up the graphs I am working with. When I try to pull month and year into one column, as I've done in other queries on the same table, i.e. using:
SELECT DATE_FORMAT(`post_date`,'%M %Y')
My query does not work.
Does anyone know how I can change my current query so that it still calculates the change from a year prior but shows month and year as one column, as opposed to (Month | Year | Messages | % Change)?
Thanks!!
You can use EXTRACT instead of the separate YEAR() and MONTH() functions:
EXTRACT(YEAR_MONTH from post_date)
Of course, you then have to group by this expression instead of by year and month. For example:
select
EXTRACT(YEAR_MONTH from post_date) yearmonth,
count(*) as "Messages"
from
pm_messages
group by
EXTRACT(YEAR_MONTH from post_date)
If you have data for every month, you can use lag():
select year(post_date) as ye, month(post_date) as mo,
count(*) as Messages,
lag(count(*)) over (partition by month(post_date) order by year(post_date)) as prev_year
from pm_messages
where post_date >= '2018-01-01'
group by year(post_date), month(post_date)
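To tie this thread together: one option is to aggregate once per month, carry a single combined label along, and self-join on the same month of the previous year. Below is a rough sketch using SQLite through Python's sqlite3 so it can be run standalone. The sample dates are invented, and the 'YYYY-MM' label from strftime stands in for what DATE_FORMAT(post_date, '%M %Y') would give in MySQL. The CTE syntax assumes MySQL 8.0 or later if ported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pm_messages (post_date TEXT)")
# Hypothetical data: 2 messages in Jan 2018, 3 in Jan 2019, 1 in Feb 2019.
dates = ["2018-01-05"] * 2 + ["2019-01-10"] * 3 + ["2019-02-01"]
conn.executemany("INSERT INTO pm_messages VALUES (?)", [(d,) for d in dates])

query = """
WITH monthly AS (
    SELECT CAST(strftime('%Y', post_date) AS INTEGER) AS ye,
           CAST(strftime('%m', post_date) AS INTEGER) AS mo,
           strftime('%Y-%m', post_date)               AS month_year,
           COUNT(*)                                   AS messages
    FROM pm_messages
    GROUP BY 1, 2, 3
)
SELECT a.month_year,
       a.messages,
       (a.messages - b.messages) * 1.0 / b.messages AS pct_change
FROM monthly a
LEFT JOIN monthly b            -- same month, one year earlier
       ON a.mo = b.mo AND a.ye - 1 = b.ye
ORDER BY a.ye, a.mo
"""
yoy = conn.execute(query).fetchall()
print(yoy)
```

Months without a counterpart one year earlier come back with a NULL (None) change, which is what the LEFT JOIN in the question's query does as well.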

mySQL query that is a bit tricky

Hi there I want to design this query in mySQL.
Statement: For all the customers that transacted during 2017, what % made another transaction within 30 days?
Can you tell me how such a query can be designed?
This is the picture of the table to perform this query on:
Table name is: transactions
Just use lead() to get the next date. Then aggregate at the customer level to determine if any transaction in the time period has another within 30 days for that customer.
Finally, aggregate again:
select avg(case when mindiff < 30 then 1.0 else 0 end) as within_30days
from (select customerid, min(datediff(next_date, date)) as mindiff
from (select t.*, lead(date) over (partition by customerid order by date) as next_date
from transactions t
) t
where date >= '2017-01-01' and date < '2018-01-01'
group by customerid
) c
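A small runnable sketch of the same three steps (next date per transaction, minimum gap per customer, share of customers under the threshold), using SQLite via Python's sqlite3. The rows are invented for illustration, julianday() differences stand in for MySQL's DATEDIFF(), and LEAD() needs SQLite 3.25 or later (MySQL 8.0 or later on the MySQL side).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customerid INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1, "2017-01-01"), (1, "2017-01-15"),  # 14 days apart: counts
        (2, "2017-03-01"), (2, "2017-06-01"),  # 92 days apart: does not count
        (3, "2017-05-01"),                     # single transaction: does not count
        (4, "2017-12-20"), (4, "2018-01-05"),  # follow-up falls in 2018: counts
        (5, "2016-01-01"),                     # no 2017 activity: excluded
    ],
)

query = """
SELECT AVG(CASE WHEN mindiff < 30 THEN 1.0 ELSE 0 END) AS within_30days
FROM (SELECT customerid,
             MIN(julianday(next_date) - julianday(date)) AS mindiff
      FROM (SELECT tx.*,
                   LEAD(date) OVER (PARTITION BY customerid ORDER BY date) AS next_date
            FROM transactions tx) t
      WHERE date >= '2017-01-01' AND date < '2018-01-01'
      GROUP BY customerid) c
"""
share = conn.execute(query).fetchone()[0]
print(share)
```

Because next_date is computed before the 2017 filter is applied, a follow-up transaction in early 2018 still counts for a customer whose first transaction was late in 2017.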

How to count a field per day and then GROUP BY YEARWEEK

If I have a table with two columns, date and account, and I want to first count accounts per day and then group by week, how wrong is my code, and how should I do it?
I edited my code a little bit; I was not thinking about it correctly at first. I want the sum to be 9 for week 48.
SELECT date, account,
(SELECT date, COUNT(DISTINCT account)
FROM t1
GROUP BY date
) AS sum
FROM t1
GROUP BY YEARWEEK(date)
You seem to be looking for a simple aggregate query with count(distinct ...):
select yearweek(date) year_week, count(distinct account) cnt_account
from t1
group by yearweek(date)
order by year_week
Note: yearweek() gives you the year and week; this is better than week(), if your data spreads over several years.
EDIT
From the comments, you need two levels of aggregation:
select yearweek(dy) year_week, sum(cnt) cnt_account
from (
select date(t1.date) dy, count(distinct t1.account) cnt
from t1
group by date(t1.date)
) t
group by yearweek(dy)
order by year_week
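To see why the two answers give different numbers, here is a small sketch that runs both the single-level and the two-level query on made-up data, using SQLite via Python's sqlite3. strftime('%Y-%W', ...) is only a stand-in for MySQL's yearweek(), and the two can number weeks differently, so the sketch compares counts rather than week labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (date TEXT, account TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [
        # Monday to Wednesday of one week (hypothetical data)
        ("2020-11-30", "a"), ("2020-11-30", "b"), ("2020-11-30", "c"),
        ("2020-11-30", "a"),                       # duplicate within the day
        ("2020-12-01", "a"), ("2020-12-01", "b"),
        ("2020-12-01", "c"), ("2020-12-01", "d"),
        ("2020-12-02", "a"), ("2020-12-02", "b"),
        # Monday of the following week
        ("2020-12-07", "a"),
    ],
)

# Single level: distinct accounts per week.
one_level = conn.execute("""
    SELECT strftime('%Y-%W', date) AS year_week, COUNT(DISTINCT account)
    FROM t1 GROUP BY 1 ORDER BY 1
""").fetchall()

# Two levels: distinct accounts per day, then summed per week.
two_level = conn.execute("""
    SELECT strftime('%Y-%W', dy) AS year_week, SUM(cnt)
    FROM (SELECT date(date) AS dy, COUNT(DISTINCT account) AS cnt
          FROM t1 GROUP BY date(date)) t
    GROUP BY 1 ORDER BY 1
""").fetchall()

weekly_distinct = [n for _, n in one_level]
weekly_sum_of_daily = [n for _, n in two_level]
print(weekly_distinct, weekly_sum_of_daily)
```

In the first week the same four accounts show up on several days, so the weekly distinct count is 4 while the sum of the daily counts is 3 + 4 + 2 = 9.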

Can I get every month of the year even if there is not data for that month in DB

I want to get statistics for every month of the years I have in the DB
SELECT monthname(created_at) AS month, YEAR(created_at) AS year, count(*) AS number
FROM tableName
WHERE type_of_user = "someType"
GROUP BY year, month(created_at)
ORDER BY created_at DESC
Right now it gives me only the months that I have data for, but I need statistics for every month, even if I don't have any data stored for that month.
Create a calendar table. This will need one entry per month, for every year that you intend to use.
Then select from the calendar table, and join in the values that you get from your current query. Use COALESCE() to put a zero-value where the entry is NULL (e.g. when there are no records in the tableName for that month and year).
SELECT MONTHNAME(C.date) AS month,
YEAR(C.date) AS year,
COALESCE(T.number, 0) AS number
FROM calendar AS C
LEFT JOIN (
SELECT YEAR(created_at) AS yr, MONTH(created_at) AS mo, COUNT(*) AS number
FROM tableName
WHERE type_of_user = 'someType'
GROUP BY YEAR(created_at), MONTH(created_at)
) AS T
ON T.mo = MONTH(C.date) AND T.yr = YEAR(C.date)
ORDER BY MONTH(C.date), YEAR(C.date)
SQL fiddle at http://sqlfiddle.com/#!9/e0a4dc/
SELECT x.i AS month
FROM tableName
RIGHT JOIN (select row_number() over (order by 1) as i
from someTableWithMoreThan12Records limit 12) x
ON x.i=month(created_at)
ORDER BY x.i;
Joining against a row source that contains all twelve month numbers gives you a row for every month, even months with no data. Note that row_number() requires MySQL 8.0 or later.
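The calendar-table approach from the first answer can be tried end to end with a small sketch like the one below. It uses SQLite through Python's sqlite3 so it is self-contained. The calendar contents and sample rows are assumptions for illustration (one calendar row per month of 2021), and strftime('%Y-%m', ...) replaces the YEAR()/MONTH() pair from the MySQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Calendar table: one row per month, here the first day of each month of 2021.
conn.execute("CREATE TABLE calendar (date TEXT)")
conn.executemany(
    "INSERT INTO calendar VALUES (?)",
    [("2021-%02d-01" % m,) for m in range(1, 13)],
)

conn.execute("CREATE TABLE tableName (created_at TEXT, type_of_user TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [
        ("2021-01-15", "someType"), ("2021-01-20", "someType"),
        ("2021-03-02", "someType"),
        ("2021-03-05", "otherType"),  # filtered out by the WHERE clause
    ],
)

query = """
SELECT strftime('%Y-%m', c.date) AS month,
       COALESCE(t.number, 0)     AS number   -- 0 where no rows matched
FROM calendar c
LEFT JOIN (SELECT strftime('%Y-%m', created_at) AS ym, COUNT(*) AS number
           FROM tableName
           WHERE type_of_user = 'someType'
           GROUP BY strftime('%Y-%m', created_at)) t
       ON t.ym = strftime('%Y-%m', c.date)
ORDER BY c.date
"""
per_month = conn.execute(query).fetchall()
print(per_month)
```

The filter on type_of_user sits inside the subquery on purpose: putting it in an outer WHERE would discard the calendar rows that have no match, which are exactly the months this is meant to keep.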

How to find which year do values tend to increase in ? in SQL

Basically I have a table like this:
Table Time:
ID.......Date
1......08/26/2016
1......08/26/2016
2......05/29/2016
3......06/22/2016
4......08/26/2015
5......05/23/2015
5......05/23/2015
6......08/26/2014
7......04/26/2014
8......08/26/2013
9......03/26/2013
The query should return like this
Year........CountNum
2016........4
2015........3
I want to find out in which years the value tends to increase; that is, I want to display the years that have more values (rows, in this case) than the previous year.
What I've done so far
SELECT Year, count(*) as CountNum
FROM Time
GROUP BY Year
ORDER BY CountNum DESC;
I don't know how to get the year from date format. I tried year(Date) function, but I got Null data.
Please help!
It should work fine.
select year(date), count(*) as countNum
from time
group by year(date)
order by countNum
Join the grouped data to itself with 1 year offset:
select
a.*
from
(
select year(`Date`) as _year, count(*) as _n
from time group by 1
) a
left join
(
select year(`Date`) as _year, count(*) as _n
from time group by 1
) b
on a._year = b._year + 1
where a._n > b._n
order by 1
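Here is a small sketch of the "compare each year with the previous one" join on the question's own sample rows, run in SQLite through Python's sqlite3. It assumes the Date column is stored as text in MM/DD/YYYY form. The question does not say so, but that would explain year(Date) returning NULL in MySQL, where STR_TO_DATE(Date, '%m/%d/%Y') would then be needed first. In the sketch, substr() pulls the year out of the string instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Time" (ID INTEGER, "Date" TEXT)')
conn.executemany(
    'INSERT INTO "Time" VALUES (?, ?)',
    [
        (1, "08/26/2016"), (1, "08/26/2016"), (2, "05/29/2016"),
        (3, "06/22/2016"), (4, "08/26/2015"), (5, "05/23/2015"),
        (5, "05/23/2015"), (6, "08/26/2014"), (7, "04/26/2014"),
        (8, "08/26/2013"), (9, "03/26/2013"),
    ],
)

query = """
WITH yearly AS (
    -- characters 7..10 of 'MM/DD/YYYY' are the year (assumed text format)
    SELECT CAST(substr("Date", 7, 4) AS INTEGER) AS yr, COUNT(*) AS n
    FROM "Time"
    GROUP BY 1
)
SELECT a.yr, a.n
FROM yearly a
JOIN yearly b ON b.yr = a.yr - 1   -- b is the year before a
WHERE a.n > b.n
ORDER BY a.yr DESC
"""
increasing_years = conn.execute(query).fetchall()
print(increasing_years)
```

2014 is left out because it ties with 2013 (2 rows each), and 2013 has no earlier year to compare against.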