SQL Find date range gaps in Table - mysql

Good day.
I seem to be struggling with what seems like a simple problem.
I have a table that has a value connected to a date (Monthly) for a finite number of ID's
ie. Table1
ID | Date ---| Value
01 | 2015-01 | val1
01 | 2015-02 | val2
02 | 2015-01 | val1
02 | 2015-03 | val2
So ID: 02 does not have a value for date 2015-02.
I would like to return all ID's and Dates that do not have a value.
Date range is: select distinct date from Table1
I can't seem to think outside the realms of selecting and joining on the same table.
I need to include the ID in my select so that I can somehow select the date range that exists for each ID and compare it to the entire date range, to get all the dates for each ID that are missing from the "entire" date range.
Please advise.
Thank you

Your last two sentences are not very clear to me, but you can experiment with the following query using different values of @max_days and @min_date:
-- DROP TABLE table1;
CREATE TABLE table1(ID int not null, `date` date not null, value varchar(64) not null);
INSERT table1(ID,`date`,value)
VALUES (1,'2015-01-01','v1'),(1,'2015-01-02','v2'),(2,'2015-01-01','v1'),(2,'2015-01-03','v2'),(4,'2015-01-01','v1'),(4,'2015-01-04','v2');
SELECT * FROM table1;
-- @day, @max_days and @min_date are MySQL user-defined variables
SET @day=0;
SET @max_days=5;
SET @min_date='2015-01-01';
SELECT i.ID,d.`date`
FROM (SELECT DISTINCT ID FROM table1) i
CROSS JOIN (
SELECT TIMESTAMPADD(DAY,@day,@min_date) AS `date`,@day:=@day+1 AS day_num
FROM table1 WHERE @day<@max_days) d
LEFT JOIN table1 t
ON t.ID=i.ID
AND t.`date`=d.`date`
WHERE t.`date` IS NULL
ORDER BY i.ID,d.`date`;

I now understand your requirement of dates being taken from the table; you want to find any gaps in the date ranges for each id.
This does what you need, but can probably be improved. Explanation below and you can view a working example.
DROP TABLE IF EXISTS Table1;
DROP TABLE IF EXISTS Year_Month_Calendar;
CREATE TABLE Table1 (
id INTEGER
,date CHAR(7)
,value CHAR(4)
);
INSERT INTO Table1
VALUES
(1,'2015-01','val1')
,(1,'2015-02','val2')
,(2,'2015-01','val1')
,(2,'2015-03','val1');
CREATE TABLE Year_Month_Calendar (
date CHAR(10)
);
INSERT INTO Year_Month_Calendar
VALUES
('2015-01')
,('2015-02')
,('2015-03');
SELECT ID_Year_Month.id, ID_Year_Month.date, Table1.id, Table1.date
FROM (
SELECT Distinct_ID.id, Year_Month_Calendar.date
FROM Year_Month_Calendar
CROSS JOIN
( SELECT DISTINCT id FROM Table1 ) AS Distinct_ID
WHERE Year_Month_Calendar.date >= (SELECT MIN(date) FROM Table1 WHERE id=Distinct_ID.ID)
AND Year_Month_Calendar.date <= (SELECT MAX(date) FROM Table1 WHERE id=Distinct_ID.ID)
) AS ID_Year_Month
LEFT JOIN Table1
ON ID_Year_Month.id = Table1.id AND ID_Year_Month.date = Table1.date
-- WHERE Table1.id IS NULL
ORDER BY ID_Year_Month.id, ID_Year_Month.date
Explanation
You need a calendar table which contains all dates (year/months) to cover the data you are querying.
CREATE TABLE Year_Month_Calendar (
date CHAR(10)
);
INSERT INTO Year_Month_Calendar
VALUES
('2015-01')
,('2015-02')
,('2015-03');
The inner select creates a table with all dates between the min and max date for each id.
SELECT Distinct_ID.id, Year_Month_Calendar.date
FROM Year_Month_Calendar
CROSS JOIN
( SELECT DISTINCT id FROM Table1 ) AS Distinct_ID
WHERE Year_Month_Calendar.date >= (SELECT MIN(date) FROM Table1 WHERE id=Distinct_ID.ID)
AND Year_Month_Calendar.date <= (SELECT MAX(date) FROM Table1 WHERE id=Distinct_ID.ID)
This is then LEFT JOINED to the original table to find the missing rows.
If you only want to return the missing rows (my query displays the whole table to show how it works), add a WHERE clause, such as the commented-out WHERE Table1.id IS NULL, to restrict the output to rows where no id and date are returned from Table1.
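As a minimal sketch of the filtered version, the query above can be run against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here (an assumption made so the example is self-contained), and the function name find_gaps is hypothetical.

```python
import sqlite3


def find_gaps():
    """Return (id, date) pairs missing between each id's first and last month."""
    # In-memory SQLite database standing in for the MySQL tables (assumption).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (id INTEGER, date CHAR(7), value CHAR(4));
        INSERT INTO Table1 VALUES
            (1, '2015-01', 'val1'), (1, '2015-02', 'val2'),
            (2, '2015-01', 'val1'), (2, '2015-03', 'val1');
        CREATE TABLE Year_Month_Calendar (date CHAR(7));
        INSERT INTO Year_Month_Calendar VALUES ('2015-01'), ('2015-02'), ('2015-03');
    """)
    rows = conn.execute("""
        SELECT ID_Year_Month.id, ID_Year_Month.date
        FROM (
            SELECT Distinct_ID.id, Year_Month_Calendar.date
            FROM Year_Month_Calendar
            CROSS JOIN (SELECT DISTINCT id FROM Table1) AS Distinct_ID
            WHERE Year_Month_Calendar.date >=
                  (SELECT MIN(date) FROM Table1 WHERE id = Distinct_ID.id)
              AND Year_Month_Calendar.date <=
                  (SELECT MAX(date) FROM Table1 WHERE id = Distinct_ID.id)
        ) AS ID_Year_Month
        LEFT JOIN Table1
          ON ID_Year_Month.id = Table1.id AND ID_Year_Month.date = Table1.date
        WHERE Table1.id IS NULL  -- keep only calendar rows with no match
        ORDER BY ID_Year_Month.id, ID_Year_Month.date
    """).fetchall()
    conn.close()
    return rows


gaps = find_gaps()
print(gaps)
```

With the sample data, only id 2 has a hole between its first and last month, so the single row returned is (2, '2015-02').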
Original answer before comments
You can do this without a tally table, since you say
Date range is: select distinct date from Table1
I've slightly changed the field names to avoid reserved words in SQL.
SELECT id_table.ID, date_table.`year_month`, table1.val
FROM (SELECT DISTINCT ID FROM table1) AS id_table
CROSS JOIN
(SELECT DISTINCT `year_month` FROM table1) AS date_table
LEFT JOIN table1
ON table1.ID=id_table.ID AND table1.`year_month` = date_table.`year_month`
ORDER BY id_table.ID
I've not filtered the results, in order to show how the query works. To return only the rows where a date is missing, add WHERE table1.`year_month` IS NULL to the outer query.
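A small sketch of the filtered query, again using Python's sqlite3 module as a stand-in for MySQL (an assumption), with the renamed year_month and val columns. The function name missing_pairs is hypothetical.

```python
import sqlite3


def missing_pairs():
    """Return (ID, year_month) pairs that never occur in table1."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.executescript("""
        CREATE TABLE table1 (ID INTEGER, year_month CHAR(7), val CHAR(4));
        INSERT INTO table1 VALUES
            (1, '2015-01', 'val1'), (1, '2015-02', 'val2'),
            (2, '2015-01', 'val1'), (2, '2015-03', 'val2');
    """)
    rows = conn.execute("""
        SELECT id_table.ID, date_table.year_month
        FROM (SELECT DISTINCT ID FROM table1) AS id_table
        CROSS JOIN (SELECT DISTINCT year_month FROM table1) AS date_table
        LEFT JOIN table1
          ON table1.ID = id_table.ID
         AND table1.year_month = date_table.year_month
        WHERE table1.year_month IS NULL  -- only combinations with no row
        ORDER BY id_table.ID, date_table.year_month
    """).fetchall()
    conn.close()
    return rows


missing = missing_pairs()
print(missing)
```

Because the date range is every distinct date in the table, this also reports (1, '2015-03'): ID 1 has no row for a month that only ID 2 uses. Whether that counts as a gap depends on the requirement.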

You will need a tally table (or month and year tables) so that you can generate all of the potential combinations you want to test against. Exactly how to use it depends on details your example leaves out, such as whether you want the last 12 months, the last 3 months, etc., but here is an example that might help you understand what you are looking for:
http://rextester.com/ZDQS5259
CREATE TABLE IF NOT EXISTS Tbl (
ID INTEGER
,Date VARCHAR(10)
,Value VARCHAR(10)
);
INSERT INTO Tbl VALUES
(1,'2015-01','val1')
,(1,'2015-02','val2')
,(2,'2015-01','val1')
,(2,'2015-03','val1');
SELECT yr.YearNumber, mn.MonthNumber, i.Id
FROM
(
SELECT 2016 as YearNumber
UNION SELECT 2015
) yr
CROSS JOIN (
SELECT 1 MonthNumber
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9
UNION SELECT 10
UNION SELECT 11
UNION SELECT 12
) mn
CROSS JOIN (
SELECT DISTINCT ID
FROM
Tbl
) i
LEFT JOIN Tbl t
ON yr.YearNumber = CAST(LEFT(t.Date,4) as UNSIGNED)
AND mn.MonthNumber = CAST(RIGHT(t.Date,2) AS UNSIGNED)
AND i.ID = t.ID
WHERE
t.ID IS NULL
The basic idea for finding what is missing is to generate all possible combinations of what could exist, e.g. Year x Month x DISTINCT Id, and then join back to the table to figure out which ones are absent.

Probably not the prettiest but this should work.
select distinct c.ID, c.Date, d.Value
from (select a.ID, b.Date
from (select distinct ID from Table1) as a, (select distinct Date from Table1) as b) as c
left outer join Table1 d on (c.ID = d.ID and c.Date = d.Date)
where d.Value is NULL

Related

How join statements execute in sql

I'm trying to select from the user table so that every row shows a non-null date. If a row's date is null, it should show the date of the row above it that has the same id.
Can this be done without updating the table rows, using only a SELECT statement?
Here is the table
NAME, DATE, ID
A, 2021-01-21, 1
B, null, 1
C, null, 1
D, 2021-01-18, 2
D, null, 2
It should be viewed like
A, 2021-01-21, 1
B, 2021-01-21, 1
C, 2021-01-21, 1
D, 2021-01-18, 2
D, 2021-01-18, 2
Now the query I think is =>
select t1.name, t2.date ,t1.id from user t1
left join (select id ,date from user where id=1) t2
on t1.id=t2.id;
But this query doesn't work like I thought.
Can anyone explain how the join query above works, and how I can improve it to get the required result?
To test the query, use these statements:
create table user(
name varchar(20),
date date,
id integer
);
insert into user values("A",'2021-01-21',1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("B",'2021-01-20',2);
select t1.name, t2.date ,t1.id from user t1
left join (select id ,date from user where id=1) t2
on t1.id=t2.id;
The first problem is that you are joining a table with itself on the condition t1.id = t2.id. So if you have 4 rows with id=1 and 3 rows with id=2 just as an example, you will end up with a result that had 4 * 4 + 3 * 3 = 25 rows. In your specific case you will end up with 6 * 6 + 1 * 1 = 37 rows.
The second problem is that you have hard-coded id=1 in your subquery:
(select id ,date from user where id=1) t2
This can't be the appropriate value for all possible rows.
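The row count described above can be checked with a short sketch. It loads the test data from the question into an in-memory SQLite database (a stand-in for MySQL, which is an assumption) and runs the original join unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE user (name VARCHAR(20), date DATE, id INTEGER);
    INSERT INTO user VALUES
        ('A', '2021-01-21', 1),
        ('', NULL, 1), ('', NULL, 1), ('', NULL, 1), ('', NULL, 1), ('', NULL, 1),
        ('B', '2021-01-20', 2);
""")

# The join from the question: every id=1 row in t1 matches all six id=1
# rows in t2, and the id=2 row matches nothing, so it appears once with NULL.
joined = conn.execute("""
    SELECT t1.name, t2.date, t1.id
    FROM user t1
    LEFT JOIN (SELECT id, date FROM user WHERE id = 1) t2
      ON t1.id = t2.id
""").fetchall()

row_count = len(joined)
id2_rows = [row for row in joined if row[2] == 2]
print(row_count)
print(id2_rows)
conn.close()
```

The count is 6 * 6 + 1 = 37, and the only id=2 row carries a NULL date because the subquery never returns a row with id=2.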
You could try the obvious:
select
t1.name,
ifnull(t1.date, (select t2.date from user t2 where t2.date is not null and t2.id = t1.id limit 1)) as date,
t1.id
from user t1
;
name | id | date
A | 1 | 2021-01-21
  | 1 | 2021-01-21
  | 1 | 2021-01-21
  | 1 | 2021-01-21
  | 1 | 2021-01-21
  | 1 | 2021-01-21
B | 2 | 2021-01-20
But better would be to use a join:
select u.name, ifnull(u.date, sq.date) as date, u.id
from user u join (
select id, min(date) as date from user group by id
) sq on u.id = sq.id
;
I would expect the second version using a join to be more efficient because the first version has a dependent subquery that has to get executed for every row that has a null date.
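A minimal sketch of the join version, run on the sample rows from the question. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption); IFNULL and MIN behave the same way in both for this data. The ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE user (name VARCHAR(20), date DATE, id INTEGER);
    INSERT INTO user VALUES
        ('A', '2021-01-21', 1), ('B', NULL, 1), ('C', NULL, 1),
        ('D', '2021-01-18', 2), ('D', NULL, 2);
""")

# One derived row per id holds the non-null date (MIN ignores NULLs);
# IFNULL falls back to it whenever the row's own date is NULL.
filled = conn.execute("""
    SELECT u.name, IFNULL(u.date, sq.date) AS date, u.id
    FROM user u
    JOIN (SELECT id, MIN(date) AS date FROM user GROUP BY id) sq
      ON u.id = sq.id
    ORDER BY u.id, u.name
""").fetchall()

for row in filled:
    print(row)
conn.close()
```

The derived table is computed once per id rather than once per NULL row, which is the efficiency argument made above.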
You don't need a join. Just use a window function:
select name,
max(date) over (partition by id) as date,
id
from user;
Note that your sample data doesn't match the data in the question. That data suggests:
select max(name) over (partition by id) as name,
max(date) over (partition by id) as date,
id
from user;
Here is a db<>fiddle.

left join result of two union

I'm trying to join these two pieces of information and create a new table. My goal is to create a year-and-month table.
For example
t1 is
Year
1990
1991
1992
t2 is
Jan
Feb
Mar
I want all the possible combinations in one table.
1990-Jan
1990-Feb
1990-Mar
1991-Jan
1992-Feb...
That's why I didn't put any conditions in my code.
SELECT b.year
FROM
(
SELECT year from barcelona.births
UNION
SELECT year from barcelona.immigrants_by_age
)
AS b
left join a.month,
FROM
(
SELECT month
from barcelona.unemployment
UNION
SELECT month from barcelona.accidents
)
AS a
;
You can use the following solution:
INSERT INTO new_table_name (create_year, create_month)
SELECT t1.create_year, t2.create_month FROM t1, t2
In case you want to concat the values together you can use the following:
INSERT INTO new_table_name (create_year_month)
SELECT CONCAT_WS('-', t1.create_year, t2.create_month) FROM t1, t2
Using a JOIN without a condition (without ON), or a comma as in the solution above, creates every possible combination of the rows of the two tables. With INSERT INTO ... SELECT ... you can select the values and insert them into the new table in one query.
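A small sketch of the unconditioned join, using the three years and three months from the question. It runs in SQLite through Python's sqlite3 module (a stand-in for MySQL, which is an assumption) and writes CROSS JOIN explicitly instead of the comma. The final labels are formatted in Python rather than with CONCAT_WS, which older SQLite versions may lack.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE t1 (create_year INTEGER);
    CREATE TABLE t2 (create_month TEXT);
    CREATE TABLE new_table_name (create_year INTEGER, create_month TEXT);
    INSERT INTO t1 VALUES (1990), (1991), (1992);
    INSERT INTO t2 VALUES ('Jan'), ('Feb'), ('Mar');

    -- No ON clause: every year is paired with every month.
    INSERT INTO new_table_name (create_year, create_month)
    SELECT t1.create_year, t2.create_month FROM t1 CROSS JOIN t2;
""")

combos = conn.execute(
    "SELECT create_year, create_month FROM new_table_name"
).fetchall()
labels = sorted(f"{year}-{month}" for year, month in combos)
print(len(combos))
print(labels)
conn.close()
```

Three years times three months gives nine rows, one for each combination such as 1990-Jan or 1992-Mar.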
demo on dbfiddle.uk
Your query should look like the following:
SELECT b.`year`, a.`month`
FROM (
SELECT `year` FROM barcelona.births
UNION
SELECT `year` FROM barcelona.immigrants_by_age
) AS b, (
SELECT `month` FROM barcelona.unemployment
UNION
SELECT `month` from barcelona.accidents
) AS a

MySQL join two tables with group by in each

I have two MySQL tables with part numbers and quantities. I want to sum each table's qty (sum(qty) ... group by partNumber) and then join the two results on the part number.
Sometimes table A will have part numbers that table b does not and vice versa. Below is an image of what I am expecting.
I've tried something like this, but this returns a row for each table and I want it to return 1 combined row
SELECT *, null as macroQty, sum(qty) as cardinalQty
FROM parts.cardinal where fileinfoid IN
(select cardinalFiles from parts.reports where fileinfoid = 418)
GROUP BY partNumber UNION ALL
SELECT *, sum(qty) as macroQty, null as cardinalQty
FROM parts.macro where fileinfoid IN
(select macroFiles from parts.reports where fileinfoid = 418 )
GROUP BY partNumber
I also tried wrapping it in an outer select and grouping by the part number from the outer select like this, but this results in the second inner select being null always
SELECT * FROM (
SELECT *, null as macroQty, sum(qty) as cardinalQty
FROM parts.cardinal where fileinfoid IN
(select cardinalFiles from parts.reports where fileinfoid = 418)
GROUP BY partNumber UNION ALL
SELECT *, sum(qty) as macroQty, null as cardinalQty
FROM parts.macro where fileinfoid IN
(select macroFiles from parts.reports where fileinfoid = 418 )
GROUP BY partNumber
) combined GROUP BY combined.partNumber
One approach is to identify the unique part numbers across the two tables (using a UNION, which removes duplicates implicitly) and then use correlated subqueries to get the sums. For example:
drop table if exists a,b;
create table a(id int,val int);
create table b(id int,val int);
insert into a values(1,10),(1,10),(3,10),(4,10);
insert into b values (2,10),(4,10),(4,10);
select (select sum(a.val) from a where a.id = s.id) aval,
(select sum(b.val) from b where b.id = s.id) bval,
s.id partno
from
(
select id from a
union select id from b
) s
order by s.id;
+------+------+--------+
| aval | bval | partno |
+------+------+--------+
| 20 | NULL | 1 |
| NULL | 10 | 2 |
| 10 | NULL | 3 |
| 10 | 20 | 4 |
+------+------+--------+
4 rows in set (0.00 sec)
I would phrase this as a join between two subqueries which each find the sum in their respective tables. However, since each table does not necessarily contain all part numbers, and in fact there may be part numbers unique to each table, we will have to use a full outer join approach. MySQL has no FULL OUTER JOIN, so it is emulated here with a LEFT JOIN and a RIGHT JOIN combined by UNION ALL.
SELECT
t1.partNumber,
t1.cardinalQty,
COALESCE(t2.macroQty, 0) AS macroQty
FROM
(
SELECT partNumber, SUM(qty) AS cardinalQty
FROM cardinal
GROUP BY partNumber
) t1
LEFT JOIN
(
SELECT partNumber, SUM(qty) AS macroQty
FROM macro
GROUP BY partNumber
) t2
ON t1.partNumber = t2.partNumber
UNION ALL
SELECT
t2.partNumber,
0 AS cardinalQty,
t2.macroQty
FROM
(
SELECT partNumber, SUM(qty) AS cardinalQty
FROM cardinal
GROUP BY partNumber
) t1
RIGHT JOIN
(
SELECT partNumber, SUM(qty) AS macroQty
FROM macro
GROUP BY partNumber
) t2
ON t1.partNumber = t2.partNumber
WHERE t1.partNumber IS NULL;
Keep in mind that under normal conditions, in a well designed database, you should rarely encounter a situation which requires using a full outer join. Actually, a full outer join screams out that there is a design problem. In this case, you don't have a single parts table containing all part numbers. That table should exist, so unless you enjoy big ugly queries, you should create a parts table where the partNumber is a primary key.
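To illustrate the suggestion, here is a sketch of how the query might look once a parts table exists. The parts table, its contents, and the sample quantities are all assumptions made for the example (the quantities mirror the small a/b tables used earlier), and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE cardinal (partNumber INTEGER, qty INTEGER);
    CREATE TABLE macro (partNumber INTEGER, qty INTEGER);
    INSERT INTO cardinal VALUES (1, 10), (1, 10), (3, 10), (4, 10);
    INSERT INTO macro VALUES (2, 10), (4, 10), (4, 10);

    -- Hypothetical master table: one row per part number.
    CREATE TABLE parts (partNumber INTEGER PRIMARY KEY);
    INSERT INTO parts
    SELECT partNumber FROM cardinal UNION SELECT partNumber FROM macro;
""")

# With parts as the driving table, two plain LEFT JOINs replace the
# emulated full outer join; COALESCE turns missing sums into 0.
totals = conn.execute("""
    SELECT p.partNumber,
           COALESCE(c.cardinalQty, 0) AS cardinalQty,
           COALESCE(m.macroQty, 0) AS macroQty
    FROM parts p
    LEFT JOIN (SELECT partNumber, SUM(qty) AS cardinalQty
               FROM cardinal GROUP BY partNumber) c
      ON c.partNumber = p.partNumber
    LEFT JOIN (SELECT partNumber, SUM(qty) AS macroQty
               FROM macro GROUP BY partNumber) m
      ON m.partNumber = p.partNumber
    ORDER BY p.partNumber
""").fetchall()

for row in totals:
    print(row)
conn.close()
```

Each part number appears exactly once, with a zero on whichever side has no rows for it.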

Select how many rows up to a date

Given a list of people like:
name date_of_birth
john 1987-09-08
maria 1987-09-08
samuel 1987-09-09
claire 1987-09-10
jane 1987-09-10
rose 1987-09-12
...
How can I use SQL to get a result showing how many people were born up to each date? The output for that table should be:
date count
1987-09-08 2
1987-09-09 3
1987-09-10 5
1987-09-11 5
1987-09-12 6
...
Thanks!
Here is another way, in addition to Gordon's answer. It uses joins:
SELECT
t1.date_of_birth,
COUNT(*) AS count
FROM (SELECT DISTINCT date_of_birth FROM yourTable) t1
INNER JOIN yourTable t2
ON t1.date_of_birth >= t2.date_of_birth
GROUP BY
t1.date_of_birth;
Note: I left out a step. Apparently you also want to report missing dates. If so, then you may replace what I aliased as t1 with a calendar table. For the sake of demonstration, you can inline all the dates:
SELECT
t1.date_of_birth,
COUNT(t2.date_of_birth) AS count -- counts only matched rows, so a date before the first birth gives 0
FROM
(
SELECT '1987-09-08' AS date_of_birth UNION ALL
SELECT '1987-09-09' UNION ALL
SELECT '1987-09-10' UNION ALL
SELECT '1987-09-11' UNION ALL
SELECT '1987-09-12'
) t1
LEFT JOIN yourTable t2
ON t1.date_of_birth >= t2.date_of_birth
GROUP BY
t1.date_of_birth;
In practice, your calendar table would be a bona fide table which just contains all the dates you want to appear in your result set.
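A sketch of that setup with a real calendar table, using the people from the question. The table names people and calendar are hypothetical, the extra date 1987-09-07 is added only to show a zero count, and SQLite via Python's sqlite3 module stands in for MySQL (an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE people (name TEXT, date_of_birth TEXT);
    INSERT INTO people VALUES
        ('john', '1987-09-08'), ('maria', '1987-09-08'),
        ('samuel', '1987-09-09'), ('claire', '1987-09-10'),
        ('jane', '1987-09-10'), ('rose', '1987-09-12');

    -- Hypothetical calendar table: one row per date to report.
    CREATE TABLE calendar (d TEXT PRIMARY KEY);
    INSERT INTO calendar VALUES
        ('1987-09-07'), ('1987-09-08'), ('1987-09-09'),
        ('1987-09-10'), ('1987-09-11'), ('1987-09-12');
""")

# COUNT(p.date_of_birth) counts matched people only, so a calendar date
# with nobody born on or before it yields 0 instead of 1.
running = conn.execute("""
    SELECT c.d, COUNT(p.date_of_birth) AS cnt
    FROM calendar c
    LEFT JOIN people p ON c.d >= p.date_of_birth
    GROUP BY c.d
    ORDER BY c.d
""").fetchall()

for row in running:
    print(row)
conn.close()
```

The ISO date strings compare correctly as text, which is why a plain >= works here without date functions.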
One method is a correlated subquery:
select dob.date_of_birth,
(select count(*) from t where t.date_of_birth <= dob.date_of_birth) as running_count
from (select distinct date_of_birth from t) dob;
This is not particularly efficient. If your data has any size, variables are better (or window functions if you are using MySQL 8.0):
select date_of_birth,
(@x := @x + cnt) as running_count
from (select date_of_birth, count(*) as cnt
from t
group by date_of_birth
order by date_of_birth
) dob cross join
(select @x := 0) params;
Use a correlated subquery:
select date_of_birth, (select count(*)
from your_table
where date_of_birth <= t.date_of_birth
) as count
from your_table t
group by date_of_birth;

Query to select duplicates in column 2 based on column 1 in MySQL

Let's say I have two columns: id and date.
I want to give it an id and have it find all rows whose date duplicates the date of that id.
Example:
id |date
1 |2013-09-16
2 |2013-09-16
3 |2013-09-23
4 |2013-09-23
I want to give it id 1 (without specifying the date) and get back a two-column table listing the rows that share id 1's date.
Thanks in advance!
select * from your_table
where `date` in
(
select `date`
from your_table
where id = 1
)
Or, if you prefer a join:
select t.*
from your_table t
inner join
(
select `date`
from your_table
where id = 1
) x on x.date = t.date
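A minimal sketch of the first query wrapped in a helper, so the id can be supplied at call time. The function name same_date_rows and the parameter placeholder are additions for illustration, and SQLite via Python's sqlite3 module stands in for MySQL (an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, date TEXT);
    INSERT INTO your_table VALUES
        (1, '2013-09-16'), (2, '2013-09-16'),
        (3, '2013-09-23'), (4, '2013-09-23');
""")


def same_date_rows(connection, target_id):
    """Return every row whose date equals the date stored for target_id."""
    return connection.execute(
        """
        SELECT t.id, t.date
        FROM your_table t
        WHERE t.date IN (SELECT date FROM your_table WHERE id = ?)
        ORDER BY t.id
        """,
        (target_id,),
    ).fetchall()


rows_for_1 = same_date_rows(conn, 1)
rows_for_3 = same_date_rows(conn, 3)
print(rows_for_1)
print(rows_for_3)
```

The result includes the row for the given id itself. Adding AND t.id <> ? (with the id passed a second time) would leave only the other rows.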