Insert blank rows in MySQL SELECT statement

I have a query like this:
SELECT COUNT(id), MONTH(date), YEAR(date)
FROM activity
GROUP BY YEAR(date) DESC, MONTH(date) DESC
ORDER BY YEAR(date) DESC, MONTH(date) DESC
This groups and orders records by month and year. Is there any way I can insert a blank row if a certain month doesn't have a record?
So, instead of this return:
c | M% | Y%
4 | 01 | 2014 # 4 records for Jan 2014
3 | 11 | 2013 # 3 records for Nov 2013
7 | 10 | 2013 # 7 records for Oct 2013
I want to insert months for which no records could be found (e.g. Dec 2013 with count = 0), so I can have a neat visualisation of monthly activities:
c | M% | Y%
4 | 01 | 2014
0 | 12 | 2013 # <<<< no records for Dec 2013, but I still want it in array
3 | 11 | 2013
7 | 10 | 2013

Contrary to what others say, you can make up rows in a MySQL query by using user-defined variables (@variables). My inner query sets a baseline date one month ahead of whatever NOW() is. It is cross-joined to the activity table purely to get rows to work with. For each row, the variable is set to one month before its previous value, and that value becomes the GrpDate column. In this example I use a limit of 6, so it only goes back 6 months, but you can change that to however many months you care about, as long as the table has at least that many records (it could be any table with enough records to create the placeholder rows). This creates a result set that I alias as "DynamicCalendar". I then use it as the basis of a LEFT JOIN to the activity table, joined on month and year:
SELECT
MONTH( DynamicCalendar.GrpDate ) AS Mth,
YEAR( DynamicCalendar.GrpDate ) AS Yr,
COUNT( activity.id ) AS Entries
FROM
-- one row per month, counting back from the current month
( SELECT
@BaseDate := DATE( DATE_SUB( @BaseDate, INTERVAL 1 MONTH ) ) AS GrpDate
FROM
( SELECT @BaseDate := DATE_ADD( NOW(), INTERVAL 1 MONTH ) ) sqlvars,
activity
LIMIT 6 ) DynamicCalendar
-- months without activity keep their row and get a count of 0
LEFT JOIN activity
ON MONTH( DynamicCalendar.GrpDate ) = MONTH( activity.date )
AND YEAR( DynamicCalendar.GrpDate ) = YEAR( activity.date )
GROUP BY
YEAR( DynamicCalendar.GrpDate ),
MONTH( DynamicCalendar.GrpDate )
ORDER BY
YEAR( DynamicCalendar.GrpDate ) DESC,
MONTH( DynamicCalendar.GrpDate ) DESC
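The calendar-then-left-join idea can also be sketched outside SQL. The following is a minimal Python illustration, not the answer's method: the function name `monthly_counts` and the sample dates are made up for the example, and it assumes the activity dates are already loaded as `date` objects.

```python
from collections import Counter
from datetime import date

def monthly_counts(dates, newest, months_back):
    """Return (count, month, year) rows, newest month first, including empty months."""
    counts = Counter((d.year, d.month) for d in dates)
    rows = []
    year, month = newest.year, newest.month
    for _ in range(months_back):
        # A month with no records still gets a row, with a count of 0.
        rows.append((counts.get((year, month), 0), month, year))
        # Step back one month, wrapping January to December of the prior year.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return rows

# Hypothetical sample data shaped like the question's example.
sample = [date(2014, 1, 5)] * 4 + [date(2013, 11, 2)] * 3 + [date(2013, 10, 9)] * 7
rows = monthly_counts(sample, date(2014, 1, 31), 4)
for row in rows:
    print(row)
```

The loop plays the role of the DynamicCalendar derived table, and the `counts.get(..., 0)` lookup plays the role of the LEFT JOIN with COUNT.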

Related

MySQL query for records that existed at any point each week

I have a table with created_at and deleted_at timestamps. I need to know, for each week, how many records existed at any point that week:
week | records
2022-01 | 4
2022-02 | 5
... | ...
Essentially, records that were created before the end of the week and deleted after the beginning of the week.
I've tried various variations of the following but it's under-reporting and I can't work out why:
SELECT
DATE_FORMAT(created_at, '%Y-%U') AS week,
COUNT(*)
FROM records
WHERE
deleted_at > DATE_SUB(deleted_at, INTERVAL (WEEKDAY(deleted_at)+1) DAY)
AND created_at < DATE_ADD(created_at, INTERVAL 7 - WEEKDAY(created_at) DAY)
GROUP BY week
ORDER BY week
Any help would be massively appreciated!
I would create a table wk that looks like so (for the last 5 weeks of 2022):
yrweek | wkstart    | wkend
-------+------------+------------
202248 | 2022-11-27 | 2022-12-03
202249 | 2022-12-04 | 2022-12-10
202250 | 2022-12-11 | 2022-12-17
202251 | 2022-12-18 | 2022-12-24
202252 | 2022-12-25 | 2022-12-31
To get there, find a way to create 365 consecutive integers, make all the dates of 2022 out of that, and group them by year-week.
This is an example:
CREATE TABLE wk AS
WITH units(units) AS (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION
SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,tens AS(SELECT units * 10 AS tens FROM units )
,hundreds AS(SELECT tens * 10 AS hundreds FROM tens )
,
i(i) AS (
SELECT hundreds +tens +units
FROM units
CROSS JOIN tens
CROSS JOIN hundreds
)
,
dt(dt) AS (
SELECT
DATE_ADD(DATE '2022-01-01', INTERVAL i DAY)
FROM i
WHERE i < 365
)
SELECT
YEAR(dt)*100 + WEEK(dt) AS yrweek
, MIN(dt) AS wkstart
, MAX(dt) AS wkend
FROM dt
GROUP BY yrweek
ORDER BY yrweek;
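For comparison, here is a rough Python sketch of the same week table. It assumes Sunday-based week numbering in which days before the first Sunday fall in week 0 (Python's `%U`), which should correspond to WEEK() in its default mode 0. The function name `build_week_table` is invented for this illustration.

```python
from datetime import date, timedelta

def build_week_table(year):
    """Map yrweek -> (wkstart, wkend) for every date in the given year."""
    weeks = {}
    day = date(year, 1, 1)
    while day.year == year:
        # %U: week of the year with Sunday as the first day of the week;
        # days before the first Sunday are in week 0.
        yrweek = day.year * 100 + int(day.strftime("%U"))
        wkstart, wkend = weeks.get(yrweek, (day, day))
        weeks[yrweek] = (min(wkstart, day), max(wkend, day))
        day += timedelta(days=1)
    return weeks

wk = build_week_table(2022)
for yrweek in sorted(wk)[-5:]:
    print(yrweek, wk[yrweek][0], wk[yrweek][1])
```

Note that the first and last weeks of a year can be shorter than seven days, because the grouping never crosses the year boundary.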
With that table, go:
SELECT
yrweek
, COUNT(*) AS records
FROM wk
JOIN input_table ON wk.wkstart < input_table.deleted_at
AND wk.wkend > input_table.created_at
GROUP BY
yrweek
;
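The join condition is an interval-overlap test: a record counts for a week if it was created before the week ended and deleted after the week began. A small Python sketch of that test follows. It makes two assumptions that are not in the answer: comparisons are inclusive, and a `deleted_at` of `None` means the record still exists. The names are hypothetical.

```python
from datetime import date, timedelta

def records_per_week(records, first_week_start, n_weeks):
    """Count records that existed at any point in each 7-day week."""
    result = []
    for k in range(n_weeks):
        wkstart = first_week_start + timedelta(weeks=k)
        wkend = wkstart + timedelta(days=6)
        alive = sum(
            1
            for created, deleted in records
            # Overlap test: created by the end of the week and
            # not deleted before the week started.
            if created <= wkend and (deleted is None or deleted >= wkstart)
        )
        result.append((wkstart, alive))
    return result

# Hypothetical (created_at, deleted_at) pairs.
records = [
    (date(2022, 1, 3), date(2022, 1, 12)),
    (date(2022, 1, 10), None),
    (date(2022, 1, 20), date(2022, 1, 21)),
]
weekly = records_per_week(records, date(2022, 1, 2), 3)
for wkstart, alive in weekly:
    print(wkstart, alive)
```

Grouping by the week of `created_at` alone, as in the question, only counts each record once, in the week it was created, which is one plausible reason for the under-reporting.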
I first build a list with the records, their open count, and the closed count
SELECT
created_at,
deleted_at,
(SELECT COUNT(*)
from records r2
where r2.created_at <= r1.created_at ) as new,
(SELECT COUNT(*)
from records r2
where r2.deleted_at <= r1.created_at) as closed
FROM records r1
ORDER BY r1.created_at;
After that it's just adding a GROUP BY:
SELECT
date_format(created_at,'%Y-%U') as week,
MAX((SELECT COUNT(*)
from records r2
where r2.created_at <= r1.created_at )) as new,
MAX((SELECT COUNT(*)
from records r2
where r2.deleted_at <= r1.created_at)) as closed
FROM records r1
GROUP BY week
ORDER BY week;
see: DBFIDDLE
NOTE: Because I use random times, the results will change when re-run. A sample output is:
week | new | closed
2022-00 | 31 | 0
2022-01 | 298 | 64
2022-02 | 570 | 212
2022-03 | 800 | 421

Calculating aggregated number of days in each month in sql

I've got a table with multiple columns and two of the columns are start_date and end_date.
I need to calculate the number of days in each month. Let's assume I have following data in my table
id | start_date | end_date
1 | 04.01.2016 | 15.02.2016
2 | 07.01.2016 | 22.01.2016
3 | 16.05.2016 | 11.07.2016
I want an output as follows
Month | numberOfTravelDays
January | 51
February | 15
May | 15
June | 31
July | 11
This output I am expecting is the number of total travel days each month has been utilized. I am having trouble constructing the sql query for this. Can someone assist me on this?
This is what I have for now, and it's not doing the job. The query below also filters to only this year's records (but ignore that).
select MONTH(start_date) as month,
COUNT(DATEDIFF(start_date, end_date)) as numberOfTravelDays
from travel
where YEAR(start_date) = YEAR(CURDATE())
group by MONTH(start_date),
MONTH(end_date)
Use a derived table:
select monstart,
sum(datediff(least(m.monend, t.end_date) + interval 1 day,
greatest(m.monstart, t.start_date)
)
) as days_worked
from travel t join
(select date('2016-01-01') as monstart, date('2016-01-31') as monend union all
select date('2016-02-01') as monstart, date('2016-02-29') as monend union all
. . .
) m
on t.end_date >= m.monstart and t.start_date <= m.monend
group by monstart;
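The core of the query is clipping each trip to the month boundaries with least() and greatest(), then adding one day so both endpoints count. A minimal Python sketch of the same clipping, assuming inclusive start and end dates (the function name is made up for this example):

```python
import calendar
from collections import defaultdict
from datetime import date

def travel_days_per_month(trips):
    """Sum the days of each (start, end) trip that fall in each (year, month)."""
    totals = defaultdict(int)
    for start, end in trips:
        year, month = start.year, start.month
        while (year, month) <= (end.year, end.month):
            monstart = date(year, month, 1)
            monend = date(year, month, calendar.monthrange(year, month)[1])
            # Clip the trip to this month; +1 makes both endpoints count.
            days = (min(monend, end) - max(monstart, start)).days + 1
            totals[(year, month)] += days
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return dict(totals)

# The three rows from the question.
trips = [
    (date(2016, 1, 4), date(2016, 2, 15)),
    (date(2016, 1, 7), date(2016, 1, 22)),
    (date(2016, 5, 16), date(2016, 7, 11)),
]
totals = travel_days_per_month(trips)
for key in sorted(totals):
    print(key, totals[key])
```

With inclusive counting, the totals for this data do not all match the figures listed in the question, so the counting convention intended there may differ.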

Combining and merging data on different MySQL tables with the same columns into unique rows and running query to it

Here is the code that I run to analyse server logs on MySQL database:
SELECT YEAR(datetime), MONTH( datetime ), MIN(DATE(datetime)), MAX(DATE(datetime)), COUNT(DISTINCT (ip)), COUNT(ip), (COUNT(ip) / COUNT(DISTINCT (ip))) AS Ratio
FROM `server_log_1`
WHERE `state` LIKE 'action'
AND `user_id` LIKE '9'
GROUP BY MONTH( datetime )
UNION
SELECT YEAR(datetime), MONTH( datetime ), MIN(DATE(datetime)), MAX(DATE(datetime)), COUNT(DISTINCT (ip)), COUNT(ip), (COUNT(ip) / COUNT(DISTINCT (ip))) AS Ratio
FROM `server_log_2`
WHERE `state` LIKE 'action'
AND `user_id` LIKE '9'
GROUP BY MONTH( datetime )
UNION
SELECT YEAR(datetime), MONTH( datetime ), MIN(DATE(datetime)), MAX(DATE(datetime)), COUNT(DISTINCT (ip)), COUNT(ip), (COUNT(ip) / COUNT(DISTINCT (ip))) AS Ratio
FROM `server_log_3`
WHERE `state` LIKE 'action'
AND `user_id` LIKE '9'
GROUP BY MONTH( datetime )
This gives me the result:
YEAR(datetime) | MONTH( datetime ) | MIN(DATE(datetime)) | MAX(DATE(datetime)) | COUNT(DISTINCT (ip)) | COUNT(ip) | Ratio
2015 | 12 | 2015-12-14 | 2015-12-30 | 16 | 20 | 1.2500
2016 | 1 | 2016-01-05 | 2016-01-27 | 15 | 20 | 1.3333
2016 | 2 | 2016-02-02 | 2016-02-29 | 27 | 36 | 1.3333
2016 | 3 | 2016-03-04 | 2016-03-29 | 24 | 32 | 1.3333
2016 | 4 | 2016-04-01 | 2016-04-08 | 5 | 8 | 1.6000
2016 | 4 | 2016-04-09 | 2016-04-29 | 19 | 27 | 1.4211
2016 | 5 | 2016-05-02 | 2016-05-28 | 21 | 31 | 1.4762
2016 | 6 | 2016-06-01 | 2016-06-30 | 28 | 34 | 1.2143
2016 | 7 | 2016-07-01 | 2016-07-20 | 14 | 16 | 1.1429
2016 | 7 | 2016-07-21 | 2016-07-21 | 1 | 1 | 1.0000
These results are accurate for each table. However, when a month is split across two tables (like 2016-4 and 2016-7), two separate rows are generated for that month.
I want these rows to be generated as a single row that has the combined values for the corresponding month (only one row per month).
Also, simplify the query if possible.
And I'll be in trouble after 2016-12 where grouping by month will merge data from 2015-12 and 2016-12. How can I avoid that problem as well?
Could you write the correct SQL statement, please?
How about doing the union all before the group by:
SELECT YEAR(datetime), MONTH(datetime), MIN(DATE(datetime)), MAX(DATE(datetime)), COUNT(DISTINCT (ip)), COUNT(ip), (COUNT(ip) / COUNT(DISTINCT (ip))) AS Ratio
FROM (
(SELECT datetime, ip FROM server_log_1 WHERE state = 'action' AND user_id = 9) UNION ALL
(SELECT datetime, ip FROM server_log_2 WHERE state = 'action' AND user_id = 9) UNION ALL
(SELECT datetime, ip FROM server_log_3 WHERE state = 'action' AND user_id = 9)
) AS table_all
GROUP BY YEAR(datetime), MONTH(datetime);
In terms of performance, you want an index for each table on state, user_id (and perhaps adding datetime and ip).
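To see why the rows have to be combined before aggregating, note that distinct counts from separate tables cannot simply be added: an IP that appears in two tables would be counted twice. A small Python sketch of the combine-then-group idea, with made-up sample rows and a hypothetical function name:

```python
from collections import defaultdict

def monthly_ip_stats(*logs):
    """Combine all logs first, then compute per-(year, month) statistics."""
    hits = defaultdict(list)
    for log in logs:  # plays the role of UNION ALL
        for year, month, ip in log:
            # Keying on (year, month) keeps 2015-12 and 2016-12 apart.
            hits[(year, month)].append(ip)
    stats = {}
    for key in sorted(hits):
        ips = hits[key]
        distinct = len(set(ips))
        stats[key] = (distinct, len(ips), len(ips) / distinct)
    return stats

# Hypothetical rows: April 2016 is split across two logs and shares IP "a".
log_1 = [(2015, 12, "a"), (2016, 4, "a"), (2016, 4, "b")]
log_2 = [(2016, 4, "a"), (2016, 4, "c")]
stats = monthly_ip_stats(log_1, log_2)
for key, value in stats.items():
    print(key, value)
```

Here April 2016 has 3 distinct IPs over 4 hits, whereas adding the per-log distinct counts would give 4.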

Given a table with time periods, query for a list of sum per day

Let's say I have a table that says how many items of something are valid between two dates.
Additionally, there may be multiple such periods.
For example, given a table:
itemtype | count | start | end
A | 10 | 2014-01-01 | 2014-01-10
A | 10 | 2014-01-05 | 2014-01-08
This means that there are 10 items of type A valid 2014-01-01 - 2014-01-10 and additionally, there are 10 valid 2014-01-05 - 2014-01-08.
So for example, the sum of valid items at 2014-01-06 are 20.
How can I query the table to get the sum per day? I would like a result such as
2014-01-01 10
2014-01-02 10
2014-01-03 10
2014-01-04 10
2014-01-05 20
2014-01-06 20
2014-01-07 20
2014-01-08 20
2014-01-09 10
2014-01-10 10
Can this be done with SQL? Either Oracle or MySQL would be fine
The basic syntax you are looking for is shown below. For the example, I've defined a new table called DateTimePeriods with StartDate and EndDate columns, both of type DATE.
SELECT
SUM(NumericColumnName)
, DateTimePeriods.StartDate
, DateTimePeriods.EndDate
FROM
TableName
INNER JOIN DateTimePeriods ON TableName.dateColumnName BETWEEN DateTimePeriods.StartDate and DateTimePeriods.EndDate
GROUP BY
DateTimePeriods.StartDate
, DateTimePeriods.EndDate
Obviously the above code won't work on your database but should give you a reasonable place to start. You should look into GROUP BY and Aggregate Functions. I'm also not certain of how universal BETWEEN is for each database type, but you could do it using other comparisons such as <= and >=.
There are several ways to go about this. First, you need a dense list of dates to query. A row generator statement can provide that (Oracle syntax):
select date '2014-01-01' + level -1 d
from dual
connect by level <= 15;
Then for each date, select the sum of inventory:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual),
periods as (select date '2014-01-01' + level -1 d from dual connect by level <= 15)
select
periods.d,
(select sum(item_count) from sample_data where periods.d between start_date and end_date) available
from periods
where periods.d = date '2014-01-06';
You would need to dynamically set the number of date rows to generate.
If you only needed a single row, then a query like this would work:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual)
select sum(item_count)
from sample_data
where date '2014-01-06' between start_date and end_date;
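The same two steps (generate a dense list of dates, then sum the periods covering each date) can be sketched in Python. This is only an illustration using the question's sample rows; the function name `valid_items_per_day` is an assumption of the example.

```python
from datetime import date, timedelta

def valid_items_per_day(periods, first_day, last_day):
    """For each day in the range, sum the counts of all periods covering it."""
    out = []
    day = first_day
    while day <= last_day:
        # Same test as "d BETWEEN start_date AND end_date" (inclusive).
        total = sum(count for count, start, end in periods if start <= day <= end)
        out.append((day, total))
        day += timedelta(days=1)
    return out

# (count, start, end) rows from the question.
periods = [
    (10, date(2014, 1, 1), date(2014, 1, 10)),
    (10, date(2014, 1, 5), date(2014, 1, 8)),
]
per_day = valid_items_per_day(periods, date(2014, 1, 1), date(2014, 1, 10))
for day, total in per_day:
    print(day, total)
```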

MySQL calculate gain, loss and net gain over a period of time

I have a table something like this:
id | Customer | date
-----------------------------------------
1 | Customer2 | 2013-08-01 00:00:00
-----------------------------------------
2 | Customer1 | 2013-07-15 00:00:00
-----------------------------------------
3 | Customer1 | 2013-07-01 00:00:00
-----------------------------------------
. | ... | ...
-----------------------------------------
n | CustomerN | 2012-03-01 00:00:00
I want to calculate the "gained" customers for each month, the "lost" customers for each month and the Net Gain for each month, even if done in separate tables / views.
How can I do that?
EDIT
Ok, let me demonstrate what I've done so far.
To select Gained customers for any month, I've tried to select customers from Bookings table where the following not exist:
select Customer
from Bookings
where not exists
(select Customer
from Bookings
where
(Bookings.date BETWEEN
DATE_FORMAT(DATE_SUB(Bookings.date, INTERVAL 1 MONTH), '%Y-%m-01 00:00:00')
AND DATE_FORMAT(Bookings.date, '%Y-%m-01 00:00:00'
)
) AND Bookings.date >= STR_TO_DATE('2010-11-01 00:00:00', '%Y-%m-%d 00:00:00'))
This supposedly gets the customers that existed in the "selected" month but not in the previous one. "2010-11-01" is the date of the start of bookings + 1 month.
To select Lost customers for any month, I've tried to select customers from Bookings table where the following not exist:
select Customer
from Booking
where not exists
(select Customer
from Bookings
where
(Bookings.date BETWEEN
DATE_FORMAT(Bookings.date, '%Y-%m-01 00:00:00')
AND Bookings.date
)
AND Bookings.date >= STR_TO_DATE('2010-11-01 00:00:00', '%Y-%m-%d 00:00:00'
)
)
This supposedly gets the customers that existed in a previous month but not in the "selected" one.
For the "Loss" SQL query I got empty result! For the "Gain" I got thousands of rows but not sure if that's accurate.
You can use COUNT DISTINCT to count your customers, and WHERE YEAR(Date) = [year] AND MONTH(Date) = [month] to get the month.
The total number of customers in Sept 2013:
SELECT COUNT(DISTINCT Customer) AS MonthTotalCustomers FROM Bookings
WHERE YEAR(date) = 2013 AND MONTH(date) = 9
The customers gained in Sept 2013:
SELECT COUNT(DISTINCT Customer) AS MonthGainedCustomers FROM Bookings
WHERE YEAR(date) = 2013 AND MONTH(date) = 9
AND Customer NOT IN
(SELECT Customer FROM Bookings
WHERE date < '2013-09-01')
Figuring out the lost customers is more difficult. I would need to know by what criteria you consider them to be 'lost.' If you just mean that they were around in August 2013 but they were not around in September 2013:
SELECT COUNT(DISTINCT Customer) AS MonthLostCustomers FROM Bookings
WHERE YEAR(date) = 2013 AND MONTH(date) = 8
AND Customer NOT IN
(SELECT Customer FROM Bookings
WHERE YEAR(date) = 2013 AND MONTH(date) = 9)
I hope from these examples you can extrapolate what you're looking for.
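The three queries amount to set differences, which a short Python sketch can make explicit. It uses the answer's working definitions (gained: booked this month and never before; lost: booked last month but not this month). The function name and sample bookings are hypothetical, and the dates are assumed to be `date` objects rather than datetimes.

```python
from datetime import date

def month_gain_loss(bookings, year, month):
    """Return (gained, lost, net) customer counts for the given month."""
    month_start = date(year, month, 1)
    prev = (year - 1, 12) if month == 1 else (year, month - 1)
    this_month = {c for c, d in bookings if (d.year, d.month) == (year, month)}
    prev_month = {c for c, d in bookings if (d.year, d.month) == prev}
    seen_before = {c for c, d in bookings if d < month_start}
    gained = this_month - seen_before  # like the NOT IN (... date < month start)
    lost = prev_month - this_month     # like the NOT IN (... this month)
    return len(gained), len(lost), len(gained) - len(lost)

# Hypothetical (Customer, date) rows.
bookings = [
    ("Customer1", date(2013, 7, 1)),
    ("Customer1", date(2013, 8, 3)),
    ("Customer2", date(2013, 8, 20)),
    ("Customer2", date(2013, 9, 5)),
    ("Customer3", date(2013, 9, 9)),
]
result = month_gain_loss(bookings, 2013, 9)
print(result)
```

Calling the function for each month in turn would give the per-month gain, loss, and net gain the question asks for.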