Cumulative monthly reporting - mysql

I have a MySQL table of photovoltaic electricity generation data (pvdata) from which I need to produce a monthly summary table. A simplified table is shown:
id  date        time   pvdata
1   2012-01-01  10:00  50
1   2012-01-31  12:00  60
1   2012-02-10  13:00  70
2   2012-02-08  10:00  12
2   2012-03-20  10:00  17
The monthly summary table needs to show the cumulative generation for all systems in the database, regardless of whether I have received data for that month. For example, month 3 below includes the total generation from id = 1, whose latest data was received in month 2.
There may also be more than one data point for an id in the same month, so the report must use max(pvdata) for the month.
year month cum_data
2012 1 60
2012 2 82
2012 3 87
I am pretty new to this, so have struggled for a while. The best I can come up with shows the cumulative total for the month, but without including the cumulative total for ids for which there is no data in the current month:
CREATE TEMPORARY TABLE intermed_gen_report
SELECT year(date) AS year, month(date) AS month, id, max(pvdata) AS maxpvdata
FROM pvdata
GROUP BY id, year(date), month(date)
ORDER BY year(date), month(date);
SELECT year, month, SUM(maxpvdata) AS cum_data
FROM intermed_gen_report
GROUP BY year, month
ORDER BY year, month;
Giving:
year month cum_data
2012 1 60
2012 2 82
2012 3 17

I think the problem is similar to this one: http://www.richnetapps.com/using-mysql-generate-daily-sales-reports-filled-gaps/ - you will want to create a table (possibly temporary) of dates (or year/month values). However, that example leaves zeros where there is no data, so I think you will want to join on a subselect that returns the most recent data before that date (or year/month value).

I think I agree with what Aerik suggests. You will want to join your data to what is usually called a 'date dimension table'. You can find lots of examples of how to populate such a table. This is a common technique in data warehousing.
You can also do what you need in one select using sub selects. Take a look at some of the previous threads like: generate days from date range
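As a rough sketch of the approach both answers describe, the query below joins a small table of months to every reading taken up to the end of each month. The helper table report_months and its contents are hypothetical, and the sketch assumes pvdata is a cumulative meter reading that never decreases, so that the maximum reading to date is also the latest one.

```sql
-- Hypothetical helper table: one row per reporting month (first day of the month).
CREATE TEMPORARY TABLE report_months (month_start DATE PRIMARY KEY);
INSERT INTO report_months VALUES ('2012-01-01'), ('2012-02-01'), ('2012-03-01');

SELECT YEAR(per_id.month_start)  AS year,
       MONTH(per_id.month_start) AS month,
       SUM(per_id.max_to_date)   AS cum_data
FROM (
    -- For every month and every id, the highest reading received
    -- up to the end of that month (not only within that month).
    SELECT m.month_start, p.id, MAX(p.pvdata) AS max_to_date
    FROM report_months AS m
    JOIN pvdata AS p
      ON p.date < m.month_start + INTERVAL 1 MONTH
    GROUP BY m.month_start, p.id
) AS per_id
GROUP BY year, month
ORDER BY year, month;
```

With the five sample rows from the question this should give 60, 82 and 87 for months 1 to 3, because id = 1 still contributes its February reading of 70 in March.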

Related

Dynamic SQL query to select Quarters and perform aggregation on the data

sl.  Country  Channel  Type  Clicks  Spend   Impressions  Date
1.   india    Social   a     14      $25     1,331        2/11/2021
2.   india    Search   b     1,748   $1,801  1,166,140    2/11/2021
3.   india    Display  c     28,615  $3,901  8,279,595    2/11/2021
4.   india    Display  a     1,500   $1,000  1,233        7/10/2020
5.   india    Display  a     11,500  $500    5,133        10/1/2020
6.   india    Display  a     599     $200    6570         1/1/2020
So, what I need to do is compute clicks/impressions for every quarter based on the historical data. The historical data has the same structure; only the dates differ. To compute clicks/impressions for this quarter, the query will be
select sum(clicks)/sum(impressions)
from table_name
where ________
group by country, channel, type
I need help with the where clause. When generating clicks/impressions for any quarter, I need to select the data from only some specific quarters, and the selection logic is:
Sum(clicks(q3 2020,q4 2020, q1 2020))/Sum(impressions(q3 2020,q4 2020, q1 2020))
where we need to find the quarters and the year from the date column.
By dynamically I mean if we move to the next quarter then I need to compare the average of the last 2 quarters and the same quarter previous year.
I wrote the code to find the quarters and year from the date, but how to proceed?
CASE
WHEN EXTRACT(MONTH FROM day) BETWEEN 7 AND 9 THEN 'Q1'
WHEN EXTRACT(MONTH FROM day) BETWEEN 10 AND 12 THEN 'Q2'
WHEN EXTRACT(MONTH FROM day) BETWEEN 1 AND 3 THEN 'Q3'
WHEN EXTRACT(MONTH FROM day) BETWEEN 4 AND 6 THEN 'Q4'
END AS quarter,
EXTRACT(Year FROM day) AS Year
Desired output
slno. quarter clicks/impressions
1. Q1 2021 0.53
2. Q4 2020 1.35
.......
When you're trying to filter data, don't do complex maths on the data and filter the results. That requires processing the whole table, then throwing away the rows not required.
Instead, do the maths in the filter parameters, which will allow the database to fulfil the filtering by checking indexes and only loading the rows it needs.
For example...
SELECT
SUM(clicks) / SUM(impressions)
FROM
yourTable
WHERE
(Date >= '2020-07-01' AND Date < '2021-01-01')
OR (Date >= '2020-01-01' AND Date < '2020-04-01')
That ensures you only process the data for the last two quarters of 2020, plus the first quarter of 2020.
The question then becomes, how to work out those dates based on today's date.
The start of the current quarter can be calculated as described in this thread: How do I get the first date of a quarter in MySQL?
If you store the result of that calculation in a user variable named @CurrentQuarterStart, the WHERE clause becomes this...
WHERE
(Date >= @CurrentQuarterStart - INTERVAL 2 QUARTER AND Date < @CurrentQuarterStart)
OR (Date >= @CurrentQuarterStart - INTERVAL 4 QUARTER AND Date < @CurrentQuarterStart - INTERVAL 3 QUARTER)
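Putting the pieces together, a minimal sketch might look like the following. It keeps the quarter start in a MySQL user variable (written with @) and assumes that Date is a DATE column and that clicks and impressions are numeric; the sample rows in the question show formatted values, so that may not hold. The MAKEDATE expression is just one way to get the first day of the current calendar quarter. The fiscal quarter labels in the question (July = Q1) start on the same dates as calendar quarters, so the boundaries are unaffected.

```sql
-- First day of the calendar quarter that contains today.
SET @CurrentQuarterStart =
    MAKEDATE(YEAR(CURDATE()), 1) + INTERVAL (QUARTER(CURDATE()) - 1) QUARTER;

SELECT country, channel, type,
       SUM(clicks) / SUM(impressions) AS clicks_per_impression
FROM yourTable
WHERE (`Date` >= @CurrentQuarterStart - INTERVAL 2 QUARTER   -- previous two quarters
       AND `Date` < @CurrentQuarterStart)
   OR (`Date` >= @CurrentQuarterStart - INTERVAL 4 QUARTER   -- same quarter last year
       AND `Date` < @CurrentQuarterStart - INTERVAL 3 QUARTER)
GROUP BY country, channel, type;
```

Because the date column is compared directly against computed boundaries, an index on it can still be used for the range checks.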

grouping data in SQL based on date manipulations

I've sample data here (date: yyyy-mm-dd format)
Headquarter date Sales monthyear
1 2020-10-30 1000 202010
1 2020-10-31 500 202010
1 2020-11-01 1000 202011
1 2020-11-02 2000 202011
1 2020-11-03 3000 202011
1 2020-11-04 1000 202011
1 2020-11-05 1000 202011
I have to sum all the sales values from the 2nd of the current month to the 2nd of the next month, grouping by headquarter and monthyear. So when I run the query, based on current_date(), each sales value should be added to the corresponding month.
To clarify, here's the desired result
headquarter sales monthyear
1 4500 202010
1 5000 202011
So here the 30th and 31st of Oct and the 1st and 2nd of Nov meet my condition, and their values are summed into October, but the values for the 3rd and 4th of Nov are summed into November. The same happens for every month.
If I run the SQL query today, the 1st of Nov (IST), today's value should be added to the previous month.
I'm looking for some help in making the SQL query here.
GROUP BY DATE_FORMAT(`date` - INTERVAL 1 DAY, '%Y%m')
MySQL offers the year_month specifier for extract(). I assume you want everything from the 2nd of one month to the first of the next month. If so, subtract one day:
select headquarter,
extract(year_month from date - interval 1 day) as yyyymm,
sum(sales)
from sample
group by headquarter, yyyymm;
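Note that the desired result in the question counts 2 November in October (1000 + 500 + 1000 + 2000 = 4500), which a one-day shift does not reproduce. A sketch that matches that output, under the assumption that each period runs from the 3rd of one month through the 2nd of the next, shifts by two days instead. The aliases total_sales and sales_period are made up here; sales_period is deliberately not called monthyear, because MySQL resolves names in GROUP BY against table columns before select aliases, and the table already has a monthyear column.

```sql
SELECT headquarter,
       SUM(sales) AS total_sales,
       -- Shifting back two days moves the 1st and 2nd into the previous month.
       EXTRACT(YEAR_MONTH FROM `date` - INTERVAL 2 DAY) AS sales_period
FROM sample
GROUP BY headquarter, sales_period;
```

With the seven sample rows this should return 4500 for 202010 and 5000 for 202011.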

MySQL-Get latest data of 1st 4 days of current month

I have a table in my database which stores each meter's energy value on the 1st of every month. If the meter is offline, it stores the value of the next day instead.
Below is my case
I have a record of a meter of past 2 months February and March. The February data is of 2019-02-01 00:00:00 but there are 4 rows for the month March. See the below image
In the above image the 1st,2nd and 3rd of March have a null value of FA but the 4th March contains some value.
What I have done?
I am able to select the rows having values of FA.
What I want to do?
I want to get only the current month data i.e. Current month is March so it should get only march record and then next month it should get only April record and so on.
The query should not exceed the days limit more than 4 i.e. It should only check record for 1st four days of every month.
Here is my DB-Fiddle
Any help would be highly appreciated.
One way to solve this is to turn each requirement into a condition.
"I want to get only the current month data", i.e. in March it should return only the March record, next month only the April record, and so on, means month(TV) = month(now()), plus year(TV) = year(now()) so that the same month of an earlier year does not match.
"The query should only check records for the first four days of every month" means day(TV) <= 4.
So finally your query is:
select * from `biz_pub_data_f_energy_m` a
where a.`DATA_ID` = '1b9716122dd5408691a063227316ac0a'
and a.`FA` is NOT NULL
and year(TV) = year(now()) and month(TV) = month(now())
and day(TV) <= 4
You can try the query below:
DEMO
select * from `biz_pub_data_f_energy_m` a
where a.`DATA_ID` = '1b9716122dd5408691a063227316ac0a'
and a.`FA` is NOT NULL and tv>=date(DATE_SUB(now(),INTERVAL DAYOFMONTH(now())-1 DAY))
and tv<DATE_ADD(date(DATE_SUB(now(),INTERVAL DAYOFMONTH(now())-1 DAY)), INTERVAL 4 DAY)
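A compact variant of the same range check is sketched below. It computes the first day of the current month once, uses a half-open range so that midnight on the 5th is excluded, and returns only the earliest row that has a value. The variable name @month_start is made up for this sketch, and the ORDER BY ... LIMIT 1 step rests on the assumption that only the first available reading of the month is wanted.

```sql
-- First day of the current month.
SET @month_start = CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY;

SELECT *
FROM `biz_pub_data_f_energy_m` AS a
WHERE a.`DATA_ID` = '1b9716122dd5408691a063227316ac0a'
  AND a.`FA` IS NOT NULL
  AND a.`TV` >= @month_start                   -- from the 1st at 00:00:00
  AND a.`TV` <  @month_start + INTERVAL 4 DAY  -- up to, but excluding, the 5th
ORDER BY a.`TV`
LIMIT 1;
```

Comparing TV against a range rather than wrapping it in month() or day() also lets MySQL use an index on TV if one exists.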

MySQL - sum data by day intervals

I have such data in my table:
I need to sum the "Paid" field for each UserID that occurs more than once within a 7-day interval. In this example I will SUM(Paid) for UserID "01" because it occurs 2 times within a 7-day interval.
I can calculate it programmatically, but only for fixed date intervals (2016-01-01 - 2016-01-07; 2016-01-07 - 2016-01-13; etc.).
Is there some way to perform this calculation at the MySQL level for any 7-day interval? For example: 2016-01-01 - 2016-01-07; 2016-01-02 - 2016-01-08; 2016-10-10 - 2016-10-16; etc.
I believe the function WEEK() returns the week number within the calendar year, so its week boundaries are fixed to the calendar: for 2016 in mode 1 (weeks starting on Monday), 1st Jan, 2nd Jan and 3rd Jan would return 0, but 4th Jan would return 1. To my understanding that does not fit the requirements.
I would suggest:
SELECT `UserID`, SUM(`Paid`) FROM `table` GROUP BY DATEDIFF(`Date`, (SELECT MIN(`Date`) FROM `table`)) DIV 7, `UserID`
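The query above groups rows into fixed 7-day buckets counted from the earliest date in the table. If the window should be able to start on any date, one possible sketch is a self-join that treats every row as the start of its own 7-day window. The column names are taken from the query above; the sketch assumes at most one row per UserID and Date, since duplicate rows would be counted more than once.

```sql
SELECT a.`UserID`,
       a.`Date`      AS window_start,
       COUNT(*)      AS payments_in_window,
       SUM(b.`Paid`) AS paid_in_window
FROM `table` AS a
JOIN `table` AS b
  ON  b.`UserID` = a.`UserID`
  AND b.`Date` >= a.`Date`
  AND b.`Date` <  a.`Date` + INTERVAL 7 DAY   -- 7-day window starting at a.Date
GROUP BY a.`UserID`, a.`Date`
HAVING COUNT(*) > 1;                           -- keep windows with a repeat payment
```

Overlapping windows for the same user will each appear as a separate row, so the result may need further filtering depending on how repeats should be reported.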

SQL Syntax run operations in between records?

I have a MySQL database with a table in it. One column of this table holds month names, January to May, so five months. The adjacent column holds "counts", an integer value for each month. Bear in mind that a month can appear more than once. So, for example, a snippet of the table could read
January | 5
January | 10
February | 1
March | 20
April | 23
April | 34
April | 43
May | 9
There are a lot more records (160). The average for each month could be computed with an SQL command like
select month, avg(count) from tablename group by month;
However, this divides the sum of counts for each month by the number of records. A true average would divide the sum of the counts by the number of days in each month. So I have the following statements:
select month, sum(count)/31 from trendsummary.traffictype where month like 'January';
select month, sum(count)/28 from trendsummary.traffictype where month like 'February';
select month, sum(count)/31 from trendsummary.traffictype where month like 'March';
select month, sum(count)/30 from trendsummary.traffictype where month like 'April';
select month, sum(count)/31 from trendsummary.traffictype where month like 'May';
This gives me the averages of the counts for each month. So the question is: what would the syntax be if I wanted an average of the averages for January through April? That is, I want statements that compute the average (based on the number of days in the month) for each month, and then take the average of those averages for January, February, March, and April and output that value. How would one go about this? Thanks!
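You can try this:
select sum(count)/(31+28+31+30)
from trendsummary.traffictype
where month in ( 'January' , 'February','March','April' );
Note that this is the overall daily average (total count divided by total days), which is not quite the same as the average of the four monthly averages.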
Union the selects, enclose them in parentheses, and treat that as your data source, as in this example:
select avg(average) from (
select month, sum(count)/31 as average from ...
union select ...
union select ...
) as monthly_averages
Remember that most SQL engines require you to name the computed expression column, as I did (as average), at least in the first select of the union; MySQL also requires an alias for the derived table.
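As an alternative to writing one select per month, the per-month daily averages can be produced with a single GROUP BY and then averaged in an outer query. This is only a sketch: the day counts are hard-coded for a non-leap year, as in the question, and the aliases daily_average, average_of_averages and monthly are made up for illustration.

```sql
SELECT AVG(daily_average) AS average_of_averages
FROM (
    -- One row per month: total count divided by the days in that month.
    SELECT `month`,
           SUM(`count`) / CASE `month`
                              WHEN 'January'  THEN 31
                              WHEN 'February' THEN 28
                              WHEN 'March'    THEN 31
                              WHEN 'April'    THEN 30
                          END AS daily_average
    FROM trendsummary.traffictype
    WHERE `month` IN ('January', 'February', 'March', 'April')
    GROUP BY `month`
) AS monthly;
```

A month with no rows at all is simply left out of the outer average rather than being counted as zero.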