Summing by month in MySQL

I've been trying for a couple of days to summarise transactions by month (I'm not interested in the year, just the month).
The result I get differs from the figures I see in Excel and Tableau, and I don't understand why.
I used the following two queries. They agree with each other, but not with Tableau and Excel, which both show different figures:
SELECT extract(MONTH from TRAN_D) as month, sum(TRAN_A) as total_value
FROM royal_bank_aus.tran
GROUP BY month;
SELECT year(TRAN_D),month(TRAN_D),sum(TRAN_A)
FROM royal_bank_aus.tran
GROUP BY year(TRAN_D),month(TRAN_D)
ORDER BY year(TRAN_D),month(TRAN_D);

Related

Getting average value based on grouped data

I'm trying to find the average of net total for a given month, based on previous years to help show things like seasonal trends in sales.
I have a table called "Invoice" which looks similar to the below (slimmed down for the purpose of this post):
ID - int
IssueDate - DATE
NetTotal - Decimal
Status - Enum
The data I'm trying to get, for example would be similar to this:
(sum of invoices in June 2018 + sum of invoices in June 2019 + sum of invoices in June 2020) divided by number of years covered (3) = Overall average for June
But, doing this for the full 12 months of the year based on all the data (not just 2018 through to 2020).
I'm a bit stumped on how to pull this data. I've tried subqueries and even tried using a SUM within an AVG select, but the query either fails or returns incorrect data.
An example of what I've tried:
SELECT MONTHNAME(`Invoice`.`IssueDate`) AS `CalendarMonth`, AVG(`subtotal`)
FROM (SELECT SUM(`Invoice`.`NetTotal`) AS `subtotal`
FROM `Invoice`
GROUP BY EXTRACT(YEAR_MONTH FROM `Invoice`.`IssueDate`)) AS `sub`, `Invoice`
GROUP BY MONTH(`Invoice`.`IssueDate`)
which returns:
I see two parts to this query, but unsure how to structure it:
A sum and count of all data based on the month
An average based on the number of years
I'm not sure where to go from here and would appreciate any pointers.
Ideally, I'd want to get the totals from rows where "Status" = "Paid", but trying to crack the first part first. Walk before running as they say!
Any guidance greatly appreciated!
Basically you want two levels of aggregation:
SELECT mm, AVG(month_total)
FROM (SELECT YEAR(i.IssueDate) AS yyyy, MONTH(i.IssueDate) AS mm,
             SUM(i.`NetTotal`) AS month_total
      FROM Invoice i
      GROUP BY yyyy, mm
     ) ym
GROUP BY mm;
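The question also asks for only the rows where "Status" = "Paid". A minimal sketch of the same two-level aggregation with that filter added, assuming the enum value is spelled exactly 'Paid' (the column aliases are illustrative):

```sql
-- Inner query: one total per (year, month), paid invoices only.
-- Outer query: average those totals per calendar month across years.
SELECT ym.mm,
       AVG(ym.month_total) AS avg_month_total
FROM (
    SELECT YEAR(i.IssueDate)  AS yyyy,
           MONTH(i.IssueDate) AS mm,
           SUM(i.NetTotal)    AS month_total
    FROM Invoice i
    WHERE i.Status = 'Paid'          -- assumed spelling of the enum value
    GROUP BY YEAR(i.IssueDate), MONTH(i.IssueDate)
) ym
GROUP BY ym.mm
ORDER BY ym.mm;
```

Note that AVG only sees the (year, month) pairs that actually have rows, so a year with no paid invoices in June does not count as a zero in the June average.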
For just the average part, you could select a given month's rows across all years with a query like:
SELECT Date FROM Your_Table WHERE Date LIKE '20__-06-%'
You can then sort the result in ascending or descending order.

MySQL Group By Order and Count(Distinct)

What is the best way to think about the Group By function in MySQL?
I am writing a MySQL query to pull data through an ODBC connection in a pivot table in Excel so that users can easily access the data.
For example, I have:
Select
statistic_date,
week(statistic_date,4),
year(statistic_date),
Emp_ID,
count(distinct Emp_ID),
Site,
Cost_Center
I'm trying to count the number of unique employees we have by site by week. The problem I'm running into is around year end: the calendar years don't always match up, so it is important to have them by date so that I can manually filter down to the correct dates using a pivot table (in 2013/2014 there was a week where we had to add week 53 + week 1).
I'm experimenting by using different group by statements but I'm not sure how the order matters and what changes when I switch them around.
i.e.
Group by week(statistic_date,4), Site, Cost_Center, Emp_ID
vs
Group by Site, Cost_Center, week(statistic_date,4), Emp_ID
Other things to note:
-Employees can work any number of days. Some are working 4 x 10's, others 5 x 8's with possibly a 6th day if they sign up for OT. If I sum the counts by week, I get anywhere between 3-7 per Emp_ID. I'm hoping to get 1 for the week.
-There are different pay codes per employee, so the distinct count helps when we are looking by day (VTO = Voluntary Time Off, OT = Over Time, LOA = Leave of Absence, etc). The distinct count will show me 1, where I will often have 2-3 rows for the same employee in the same day (hits 40 hours and starts accruing OT, then takes VTO or uses personal time in the same day).
I'm starting with a query I wrote to understand our paid hours by week. I'm trying to adapt it for this application. Actual code is below:
SELECT
dkh.STATISTIC_DATE AS 'Date'
,week(dkh.STATISTIC_DATE,4) as 'Week'
,month(dkh.STATISTIC_DATE) as 'Month'
,year(dkh.STATISTIC_DATE) as 'Year'
,dkh.SITE AS 'Site ID Short'
,aep.LOC_DESCR as 'Site Name'
,dkh.EMPLOYEE_ID AS 'Employee ID'
,count(distinct dkh.EMPLOYEE_ID) AS 'Distinct Employee ID'
,aep.NAME AS 'Employee Name'
,aep.BUSINESS_TITLE AS 'Business_Title'
,aep.SPRVSR_NAME AS 'Manager'
,SUBSTR(aep.DEPTID,1,4) AS 'Cost_Center'
,dkh.PAY_CODE
,dkh.PAY_CODE_SHORT
,dkh.HOURS
FROM metrics.DAT_KRONOS_HOURS dkh
JOIN metrics.EMPLOYEES_PUBLIC aep
ON aep.SNAPSHOT_DATE = SUBDATE(dkh.STATISTIC_DATE, DAYOFWEEK(dkh.STATISTIC_DATE) + 1)
AND aep.EMPLID = dkh.EMPLOYEE_ID
WHERE dkh.STATISTIC_DATE BETWEEN adddate(now(), interval -1 year) AND DATE(now())
group by dkh.SITE, SUBSTR(aep.DEPTID,1,4), week(dkh.STATISTIC_DATE,4), dkh.STATISTIC_DATE, dkh.EMPLOYEE_ID
The order you use in group by doesn't matter. Each unique combination of the values gets a group of its own. Selecting columns you don't group by gives you somewhat arbitrary results; you'd probably want to use some aggregation function on them, such as SUM to get the group total.
Grouping by values you derive from other values that you already use in group by, like below, isn't very useful.
week(dkh.STATISTIC_DATE,4), dkh.STATISTIC_DATE
If two rows have different weeks, they'll also have different dates, right?
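As a rough sketch of what that implies for the stated goal (one count per employee per site per week), the grouping could be reduced to just the week and the site, leaving the date, the employee ID, and the pay code out of both the SELECT list and the GROUP BY. The table and column names are taken from the question; the use of YEARWEEK() and the output aliases are assumptions added here, not something the question uses:

```sql
-- One row per (year-week, site). COUNT(DISTINCT ...) collapses multiple
-- days and pay codes for the same employee into a single count.
SELECT YEARWEEK(dkh.STATISTIC_DATE, 4)  AS year_week,
       dkh.SITE                         AS site_id,
       COUNT(DISTINCT dkh.EMPLOYEE_ID)  AS distinct_employees
FROM metrics.DAT_KRONOS_HOURS dkh
WHERE dkh.STATISTIC_DATE BETWEEN ADDDATE(NOW(), INTERVAL -1 YEAR) AND DATE(NOW())
GROUP BY YEARWEEK(dkh.STATISTIC_DATE, 4), dkh.SITE;
```

YEARWEEK() returns the year and week as one number, and the year it reports can differ from the calendar year of the date for the first and last week, so a week that straddles the year boundary may no longer need to be stitched together by hand. Distinct counts are not additive, so the weekly figure has to be computed at the weekly grain in SQL rather than summed from daily rows in the pivot table.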

Trying to figure out SQL query for monthly user churn based on an activity threshold

I have a table (we're on InfoBright columnar storage and I use MySQL Workbench as my interface) that essentially tracks users and a count of activities with a datestamp. It's a daily aggregate table. Schema is essentially
userid (int)
activity_count (int)
date (date)
What I'm trying to find is how many of my users are churning from month to month, with a basis of an active user defined as one with a monthly activity count that sums up to > 10
To find how many users are active in a given month I am currently using
select year, month, count(distinct user) as users
from
(
select YEAR(date) as year, MONTH(date) as month, userid as user, sum(activity_count) as activity
from table
group by YEAR(date), MONTH(date), userid
having activity > 10
order by YEAR(date), MONTH(date)
) t1
group by year, month
Not being a SQL expert, I am sure this can be improved and would appreciate the input on that.
My bigger goal though is to figure out from month to month, how many of the users who are in this count are new or repeat from the previous month. I don't know how to do that without what feels like ugly nesting or joining, and I feel like it should be fairly simple.
Thanks in advance.
I think that further nesting is the best way to achieve this. I would look to do something like selecting the user for the min concatenated Year & Month as a middle layer to the above (i.e. between outer and inner queries) so that you can establish the first month that the user became active. You can then add a where clause to the outer query to filter so that only the months you require are showing. Let me know if you need help with the syntax.
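A minimal sketch of that idea, assuming the table is called user_activity (a hypothetical name, since the question only shows a placeholder) and that the engine accepts MySQL syntax such as EXTRACT(YEAR_MONTH ...) and summing boolean expressions; whether InfoBright supports all of this is not confirmed here:

```sql
-- a: every (user, month) pair where the user was active (> 10 activities).
-- f: the first such month for each user.
-- A user is "new" in their first active month and "repeat" in any later one.
SELECT a.ym,
       COUNT(*)                  AS active_users,
       SUM(a.ym = f.first_ym)    AS new_users,
       SUM(a.ym > f.first_ym)    AS repeat_users
FROM (
    SELECT userid, EXTRACT(YEAR_MONTH FROM `date`) AS ym
    FROM user_activity
    GROUP BY userid, EXTRACT(YEAR_MONTH FROM `date`)
    HAVING SUM(activity_count) > 10
) a
JOIN (
    SELECT m.userid, MIN(m.ym) AS first_ym
    FROM (
        SELECT userid, EXTRACT(YEAR_MONTH FROM `date`) AS ym
        FROM user_activity
        GROUP BY userid, EXTRACT(YEAR_MONTH FROM `date`)
        HAVING SUM(activity_count) > 10
    ) m
    GROUP BY m.userid
) f ON f.userid = a.userid
GROUP BY a.ym
ORDER BY a.ym;
```

In this sketch "repeat" means active in any earlier month, not strictly the immediately preceding one; comparing against the previous month only would need a self-join on consecutive months instead.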

Select count datediff in SQL Server

I'm having issues getting the desired results from my database. The join_service_date and dropped_service_date columns have dates. The rejects have an r in the column if it is rejected.
I want to be able to count each agent's sales, rejects, dropped sales, and how many of those sales dropped our service within 0-30 days, 31-60 days or 61-90 days. I got the results I needed by running several small queries, but I would like to learn how to gather the information in just one query, or in as few as possible.
Also, how would I restrict this to a specific month, such as March or April?
select agentid,
count(join_service_date),
count(dropped_service_date),
count(rejects),
datediff(day, join_service_date, dropped_service_date)
from dbtable
group by agentid
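One common way to get all of these counts in a single pass is conditional aggregation, where each bucket is a SUM over a CASE expression. The following is only a sketch in SQL Server syntax: it assumes a rejected row holds the value 'r' in rejects, that a non-null join_service_date means a sale, and it uses March 2024 purely as a hypothetical month filter; the output aliases are illustrative.

```sql
-- Each SUM(CASE ...) counts only the rows that fall into its bucket.
-- DATEDIFF returns NULL when dropped_service_date is NULL, so rows that
-- never dropped fall through to ELSE 0.
SELECT agentid,
       COUNT(join_service_date)                             AS sales,
       COUNT(dropped_service_date)                          AS dropped,
       SUM(CASE WHEN rejects = 'r' THEN 1 ELSE 0 END)       AS rejected,
       SUM(CASE WHEN DATEDIFF(day, join_service_date, dropped_service_date)
                     BETWEEN 0 AND 30 THEN 1 ELSE 0 END)    AS dropped_0_30,
       SUM(CASE WHEN DATEDIFF(day, join_service_date, dropped_service_date)
                     BETWEEN 31 AND 60 THEN 1 ELSE 0 END)   AS dropped_31_60,
       SUM(CASE WHEN DATEDIFF(day, join_service_date, dropped_service_date)
                     BETWEEN 61 AND 90 THEN 1 ELSE 0 END)   AS dropped_61_90
FROM dbtable
WHERE join_service_date >= '20240301'   -- hypothetical month: March 2024
  AND join_service_date <  '20240401'
GROUP BY agentid;
```

Filtering with a half-open date range, rather than wrapping the column in MONTH(), keeps the condition usable with an index on join_service_date if one exists.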

How to get month using date in MySQL [duplicate]

Possible Duplicate:
how do I get month from date in mysql
I want to get the month from a date. For example, given 2011-04-02, I want the month April. How can I get this in MySQL?
Use SELECT MONTHNAME(date) AS monthName to get January, February...
Use SELECT MONTH(date) AS monthName to get 1, 2...
SELECT MONTHNAME(`date`) AS month_name FROM table_name;
You can use MONTHNAME() to get the month name. If you want the month number, consider using MONTH().
You can have a much more elegant solution to this if you use a second table as a date dimension table, and join your date field to it, in order to extract more useful information. This table can contain dates, month names, financial quarters, years, days of week, weekends, etc.
It is a really tiny table, only 365(ish) rows per year of data you have... And you can easily write some code to populate this table with as much data as you require. I did mine in Excel, exported as a CSV file and then imported the data into a blank table.
It also gives lots of benefits. For example, imagine a monthly data table with the following fields (and any others you can think of!) fully populated for all the months in a given range:
Date (E.g. 2009-04-01)
Day (E.g. 1)
Day of Week (E.g. Wednesday)
Month (E.g. 4)
Year (E.g. 2009)
Financial Year (E.g. 2009/10)
Financial Quarter (E.g. 2009Q1)
Calendar Quarter (E.g. 2009Q2)
Then combine this with your own table as follows:
SELECT `DT`.`monthName`
FROM `your_table`
INNER JOIN `dateTable` AS DT
ON `your_table`.`your_date_field` = `DT`.`theDate`
There are many other nice outputs that you can get from this data.
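If you would rather build the table in SQL than import a CSV, a recursive common table expression can generate the rows. This is only a sketch, assuming MySQL 8.0 or later (earlier versions have no WITH RECURSIVE); the table name dateTable and the columns theDate and monthName follow the join above, while the remaining column names and the 2009 date range are illustrative assumptions.

```sql
-- Hypothetical date dimension table with one row per calendar day.
CREATE TABLE dateTable (
    theDate         DATE PRIMARY KEY,
    dayOfMonth      TINYINT,
    dayName         VARCHAR(9),
    monthNumber     TINYINT,
    monthName       VARCHAR(9),
    yearNumber      SMALLINT,
    calendarQuarter CHAR(6)
);

-- Generate every day of 2009 and derive the descriptive columns from it.
INSERT INTO dateTable
WITH RECURSIVE d (theDate) AS (
    SELECT DATE('2009-01-01')
    UNION ALL
    SELECT DATE_ADD(theDate, INTERVAL 1 DAY)
    FROM d
    WHERE theDate < '2009-12-31'
)
SELECT theDate,
       DAY(theDate),
       DAYNAME(theDate),
       MONTH(theDate),
       MONTHNAME(theDate),
       YEAR(theDate),
       CONCAT(YEAR(theDate), 'Q', QUARTER(theDate))
FROM d;
```

For ranges longer than about 1000 days, the session variable cte_max_recursion_depth would need to be raised first. Financial year and financial quarter are left out because they depend on where your financial year starts.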
Hope that helps!