I have a table that contains two columns
ID | Name
----------------
1 | John
2 | Sam
3 | Peter
6 | Mike
It has missed IDs. In this case these are 4 and 5.
How do I find and insert them together with random names into this table?
Update: cursors and temp tables are not allowed. The random name should be 'Name_'+ some random number. Maybe it would be the specified value like 'Abby'. So it doesn't matter.
Using a recursive CTE you can determine the missing IDs as follows
DECLARE #Table TABLE(
ID INT,
Name VARCHAR(10)
)
INSERT INTO #Table VALUES (1, 'John'),(2, 'Sam'),(3,'Peter'),(6, 'Mike')
DECLARE #StartID INT,
#EndID INT
SELECT #StartID = MIN(ID),
#EndID = MAX(ID)
FROM #Table
;WITH IDS AS (
SELECT #StartID IDEntry
UNION ALL
SELECT IDEntry + 1
FROM IDS
WHERE IDEntry + 1 <= #EndID
)
SELECT IDS.IDEntry [ID]
FROM IDS LEFT JOIN
#Table t ON IDS.IDEntry = t.ID
WHERE t.ID IS NULL
OPTION (MAXRECURSION 0)
The option MAXRECURSION 0 will allow the code to avoid the recursion limit of SQL SERVER
From Query Hints and WITH common_table_expression (Transact-SQL)
MAXRECURSION number Specifies the maximum number of recursions
allowed for this query. number is a nonnegative integer between 0 and
32767. When 0 is specified, no limit is applied. If this option is not specified, the default limit for the server is 100.
When the specified or default number for MAXRECURSION limit is reached
during query execution, the query is ended and an error is returned.
Because of this error, all effects of the statement are rolled back.
If the statement is a SELECT statement, partial results or no results
may be returned. Any partial results returned may not include all rows
on recursion levels beyond the specified maximum recursion level.
Generating the RANDOM names will largly be affected by the requirements of such a name, and the column type of such a name. What exactly does this random name entail?
You can do this using a recursive Common Table Expression CTE. Here's an example how:
DECLARE #MaxId INT
SELECT #MaxId = MAX(ID) from MyTable
;WITH Numbers(Number) AS
(
SELECT 1
UNION ALL
SELECT Number + 1 FROM Numbers WHERE Number < #MaxId
)
SELECT n.Number, 'Random Name'
FROM Numbers n
LEFT OUTER JOIN MyTable t ON n.Number=t.ID
WHERE t.ID IS NULL
Here are a couple of articles about CTEs that will be helpful to Using Common Table Expressions and Recursive Queries Using Common Table Expressions
Start by selecting the highest number in the table (select top 1 id desc), or select max(id), then run a while loop to iterate from 1...max.
See this article about looping.
For each iteration, see if the row exists, and if not, insert into table, with that ID.
I think recursive CTE is a better solution, because it's going to be faster, but here is what worked for me:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TestTable]') AND type in (N'U'))
DROP TABLE [dbo].[TestTable]
GO
CREATE TABLE [dbo].[TestTable](
[Id] [int] NOT NULL,
[Name] [varchar](50) NOT NULL,
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
(
[Id] ASC
))
GO
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (1, 'John')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (2, 'Sam')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (3, 'Peter')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (6, 'Mike')
GO
declare #mod int
select #mod = MAX(number)+1 from master..spt_values where [type] = 'P'
INSERT INTO [dbo].[TestTable]
SELECT y.Id,'Name_' + cast(newid() as varchar(45)) Name from
(
SELECT TOP (select MAX(Id) from [dbo].[TestTable]) x.Id from
(
SELECT
t1.number*#mod + t2.number Id
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
WHERE t1.[type] = 'P' and t2.[type] = 'P'
) x
WHERE x.Id > 0
ORDER BY x.Id
) y
LEFT JOIN [dbo].[TestTable] on [TestTable].Id = y.Id
where [TestTable].Id IS NULL
GO
select * from [dbo].[TestTable]
order by Id
GO
http://www.sqlfiddle.com/#!3/46c7b/18
It's actually very simple :
Create a table called #All_numbers which should contain all the natural number in the range that you are looking for.
#list is a table containing your data
select a.num as missing_number ,
'Random_Name' + convert(varchar, a.num)
from #All_numbers a left outer join #list l on a.num = l.Id
where l.id is null
Related
I'd like to find the first "gap" in a counter column in an SQL table. For example, if there are values 1,2,4 and 5 I'd like to find out 3.
I can of course get the values in order and go through it manually, but I'd like to know if there would be a way to do it in SQL.
In addition, it should be quite standard SQL, working with different DBMSes.
In MySQL and PostgreSQL:
SELECT id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1
In SQL Server:
SELECT TOP 1
id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
In Oracle:
SELECT *
FROM (
SELECT id + 1 AS gap
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
)
WHERE rownum = 1
ANSI (works everywhere, least efficient):
SELECT MIN(id) + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
Systems supporting sliding window functions:
SELECT -- TOP 1
-- Uncomment above for SQL Server 2012+
previd
FROM (
SELECT id,
LAG(id) OVER (ORDER BY id) previd
FROM mytable
) q
WHERE previd <> id - 1
ORDER BY
id
-- LIMIT 1
-- Uncomment above for PostgreSQL
Your answers all work fine if you have a first value id = 1, otherwise this gap will not be detected. For instance if your table id values are 3,4,5, your queries will return 6.
I did something like this
SELECT MIN(ID+1) FROM (
SELECT 0 AS ID UNION ALL
SELECT
MIN(ID + 1)
FROM
TableX) AS T1
WHERE
ID+1 NOT IN (SELECT ID FROM TableX)
There isn't really an extremely standard SQL way to do this, but with some form of limiting clause you can do
SELECT `table`.`num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
LIMIT 1
(MySQL, PostgreSQL)
or
SELECT TOP 1 `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
(SQL Server)
or
SELECT `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
AND ROWNUM = 1
(Oracle)
The first thing that came into my head. Not sure if it's a good idea to go this way at all, but should work. Suppose the table is t and the column is c:
SELECT
t1.c + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
ORDER BY gap ASC
LIMIT 1
Edit: This one may be a tick faster (and shorter!):
SELECT
min(t1.c) + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
This works in SQL Server - can't test it in other systems but it seems standard...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1))
You could also add a starting point to the where clause...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1)) AND ID > 2000
So if you had 2000, 2001, 2002, and 2005 where 2003 and 2004 didn't exist, it would return 2003.
The following solution:
provides test data;
an inner query that produces other gaps; and
it works in SQL Server 2012.
Numbers the ordered rows sequentially in the "with" clause and then reuses the result twice with an inner join on the row number, but offset by 1 so as to compare the row before with the row after, looking for IDs with a gap greater than 1. More than asked for but more widely applicable.
create table #ID ( id integer );
insert into #ID values (1),(2), (4),(5),(6),(7),(8), (12),(13),(14),(15);
with Source as (
select
row_number()over ( order by A.id ) as seq
,A.id as id
from #ID as A WITH(NOLOCK)
)
Select top 1 gap_start from (
Select
(J.id+1) as gap_start
,(K.id-1) as gap_end
from Source as J
inner join Source as K
on (J.seq+1) = K.seq
where (J.id - (K.id-1)) <> 0
) as G
The inner query produces:
gap_start gap_end
3 3
9 11
The outer query produces:
gap_start
3
Inner join to a view or sequence that has a all possible values.
No table? Make a table. I always keep a dummy table around just for this.
create table artificial_range(
id int not null primary key auto_increment,
name varchar( 20 ) null ) ;
-- or whatever your database requires for an auto increment column
insert into artificial_range( name ) values ( null )
-- create one row.
insert into artificial_range( name ) select name from artificial_range;
-- you now have two rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have four rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have eight rows
--etc.
insert into artificial_range( name ) select name from artificial_range;
-- you now have 1024 rows, with ids 1-1024
Then,
select a.id from artificial_range a
where not exists ( select * from your_table b
where b.counter = a.id) ;
This one accounts for everything mentioned so far. It includes 0 as a starting point, which it will default to if no values exist as well. I also added the appropriate locations for the other parts of a multi-value key. This has only been tested on SQL Server.
select
MIN(ID)
from (
select
0 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
For PostgreSQL
An example that makes use of recursive query.
This might be useful if you want to find a gap in a specific range
(it will work even if the table is empty, whereas the other examples will not)
WITH
RECURSIVE a(id) AS (VALUES (1) UNION ALL SELECT id + 1 FROM a WHERE id < 100), -- range 1..100
b AS (SELECT id FROM my_table) -- your table ID list
SELECT a.id -- find numbers from the range that do not exist in main table
FROM a
LEFT JOIN b ON b.id = a.id
WHERE b.id IS NULL
-- LIMIT 1 -- uncomment if only the first value is needed
My guess:
SELECT MIN(p1.field) + 1 as gap
FROM table1 AS p1
INNER JOIN table1 as p3 ON (p1.field = p3.field + 2)
LEFT OUTER JOIN table1 AS p2 ON (p1.field = p2.field + 1)
WHERE p2.field is null;
I wrote up a quick way of doing it. Not sure this is the most efficient, but gets the job done. Note that it does not tell you the gap, but tells you the id before and after the gap (keep in mind the gap could be multiple values, so for example 1,2,4,7,11 etc)
I'm using sqlite as an example
If this is your table structure
create table sequential(id int not null, name varchar(10) null);
and these are your rows
id|name
1|one
2|two
4|four
5|five
9|nine
The query is
select a.* from sequential a left join sequential b on a.id = b.id + 1 where b.id is null and a.id <> (select min(id) from sequential)
union
select a.* from sequential a left join sequential b on a.id = b.id - 1 where b.id is null and a.id <> (select max(id) from sequential);
https://gist.github.com/wkimeria/7787ffe84d1c54216f1b320996b17b7e
Here is an alternative to show the range of all possible gap values in portable and more compact way :
Assume your table schema looks like this :
> SELECT id FROM your_table;
+-----+
| id |
+-----+
| 90 |
| 103 |
| 104 |
| 118 |
| 119 |
| 120 |
| 121 |
| 161 |
| 162 |
| 163 |
| 185 |
+-----+
To fetch the ranges of all possible gap values, you have the following query :
The subquery lists pairs of ids, each of which has the lowerbound column being smaller than upperbound column, then use GROUP BY and MIN(m2.id) to reduce number of useless records.
The outer query further removes the records where lowerbound is exactly upperbound - 1
My query doesn't (explicitly) output the 2 records (YOUR_MIN_ID_VALUE, 89) and (186, YOUR_MAX_ID_VALUE) at both ends, that implicitly means any number in both of the ranges hasn't been used in your_table so far.
> SELECT m3.lowerbound + 1, m3.upperbound - 1 FROM
(
SELECT m1.id as lowerbound, MIN(m2.id) as upperbound FROM
your_table m1 INNER JOIN your_table
AS m2 ON m1.id < m2.id GROUP BY m1.id
)
m3 WHERE m3.lowerbound < m3.upperbound - 1;
+-------------------+-------------------+
| m3.lowerbound + 1 | m3.upperbound - 1 |
+-------------------+-------------------+
| 91 | 102 |
| 105 | 117 |
| 122 | 160 |
| 164 | 184 |
+-------------------+-------------------+
select min([ColumnName]) from [TableName]
where [ColumnName]-1 not in (select [ColumnName] from [TableName])
and [ColumnName] <> (select min([ColumnName]) from [TableName])
Here is standard a SQL solution that runs on all database servers with no change:
select min(counter + 1) FIRST_GAP
from my_table a
where not exists (select 'x' from my_table b where b.counter = a.counter + 1)
and a.counter <> (select max(c.counter) from my_table c);
See in action for;
PL/SQL via Oracle's livesql,
MySQL via sqlfiddle,
PostgreSQL via sqlfiddle
MS Sql via sqlfiddle
It works for empty tables or with negatives values as well. Just tested in SQL Server 2012
select min(n) from (
select case when lead(i,1,0) over(order by i)>i+1 then i+1 else null end n from MyTable) w
If You use Firebird 3 this is most elegant and simple:
select RowID
from (
select `ID_Column`, Row_Number() over(order by `ID_Column`) as RowID
from `Your_Table`
order by `ID_Column`)
where `ID_Column` <> RowID
rows 1
-- PUT THE TABLE NAME AND COLUMN NAME BELOW
-- IN MY EXAMPLE, THE TABLE NAME IS = SHOW_GAPS AND COLUMN NAME IS = ID
-- PUT THESE TWO VALUES AND EXECUTE THE QUERY
DECLARE #TABLE_NAME VARCHAR(100) = 'SHOW_GAPS'
DECLARE #COLUMN_NAME VARCHAR(100) = 'ID'
DECLARE #SQL VARCHAR(MAX)
SET #SQL =
'SELECT TOP 1
'+#COLUMN_NAME+' + 1
FROM '+#TABLE_NAME+' mo
WHERE NOT EXISTS
(
SELECT NULL
FROM '+#TABLE_NAME+' mi
WHERE mi.'+#COLUMN_NAME+' = mo.'+#COLUMN_NAME+' + 1
)
ORDER BY
'+#COLUMN_NAME
-- SELECT #SQL
DECLARE #MISSING_ID TABLE (ID INT)
INSERT INTO #MISSING_ID
EXEC (#SQL)
--select * from #MISSING_ID
declare #var_for_cursor int
DECLARE #LOW INT
DECLARE #HIGH INT
DECLARE #FINAL_RANGE TABLE (LOWER_MISSING_RANGE INT, HIGHER_MISSING_RANGE INT)
DECLARE IdentityGapCursor CURSOR FOR
select * from #MISSING_ID
ORDER BY 1;
open IdentityGapCursor
fetch next from IdentityGapCursor
into #var_for_cursor
WHILE ##FETCH_STATUS = 0
BEGIN
SET #SQL = '
DECLARE #LOW INT
SELECT #LOW = MAX('+#COLUMN_NAME+') + 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' < ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + '
DECLARE #HIGH INT
SELECT #HIGH = MIN('+#COLUMN_NAME+') - 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' > ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + 'SELECT #LOW,#HIGH'
INSERT INTO #FINAL_RANGE
EXEC( #SQL)
fetch next from IdentityGapCursor
into #var_for_cursor
END
CLOSE IdentityGapCursor;
DEALLOCATE IdentityGapCursor;
SELECT ROW_NUMBER() OVER(ORDER BY LOWER_MISSING_RANGE) AS 'Gap Number',* FROM #FINAL_RANGE
Found most of approaches run very, very slow in mysql. Here is my solution for mysql < 8.0. Tested on 1M records with a gap near the end ~ 1sec to finish. Not sure if it fits other SQL flavours.
SELECT cardNumber - 1
FROM
(SELECT #row_number := 0) as t,
(
SELECT (#row_number:=#row_number+1), cardNumber, cardNumber-#row_number AS diff
FROM cards
ORDER BY cardNumber
) as x
WHERE diff >= 1
LIMIT 0,1
I assume that sequence starts from `1`.
If your counter is starting from 1 and you want to generate first number of sequence (1) when empty, here is the corrected piece of code from first answer valid for Oracle:
SELECT
NVL(MIN(id + 1),1) AS gap
FROM
mytable mo
WHERE 1=1
AND NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
AND EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = 1
)
DECLARE #Table AS TABLE(
[Value] int
)
INSERT INTO #Table ([Value])
VALUES
(1),(2),(4),(5),(6),(10),(20),(21),(22),(50),(51),(52),(53),(54),(55)
--Gaps
--Start End Size
--3 3 1
--7 9 3
--11 19 9
--23 49 27
SELECT [startTable].[Value]+1 [Start]
,[EndTable].[Value]-1 [End]
,([EndTable].[Value]-1) - ([startTable].[Value]) Size
FROM
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS startTable
JOIN
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS EndTable
ON [EndTable].Record = [startTable].Record+1
WHERE [startTable].[Value]+1 <>[EndTable].[Value]
If the numbers in the column are positive integers (starting from 1) then here is how to solve it easily. (assuming ID is your column name)
SELECT TEMP.ID
FROM (SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME') AS TEMP
WHERE ID NOT IN (SELECT ID FROM 'TABLE-NAME')
ORDER BY 1 ASC LIMIT 1
SELECT ID+1 FROM table WHERE ID+1 NOT IN (SELECT ID FROM table) ORDER BY 1;
Given the following data:
name | temp
-----------
hoi | 15
hoi | 15
hoi | 16
hoi | 15
hej | 13
hoi | 13
I would like to select the data in the given two columns without duplicates, However I do want to keep duplicates that are duplicates if they where interrupted by another value:
name | temp
-----------
hoi | 15 // selected
hoi | 15 // ignored duplicate
hoi | 15 // ignored duplicate
hoi | 16 // selected
hoi | 15 // selected because while not being unique it follows a different value
hoi | 15 // ignored duplicate
hej | 13 // selected
hoi | 13 // selected
hoi | 13 // ignored duplicate
hoi | 14 // selected
hoi | 13 // selected because while not being unique it follows a different value
This question was hard to formulate for me given English is not my native tongue, Feel free to edit the question or ask for clarifications.
Edit:
There is an id field and a datetime field.
Edit 2:
I use mySQL 5.7
Since you are using MySQL 5.7, which doesn't support analytical functions, you will need to use variables to store the values of temp and name, from the previous row:
SELECT t.ID,
t.Name,
t.Temp
FROM ( SELECT t.*,
IF(#temp = t.temp AND #name = t.Name, 1, 0) AS IsDuplicate,
#temp:= t.temp,
#name:= t.Name
FROM YourTable AS t
CROSS JOIN (SELECT #temp := 0, #name := '') AS v
ORDER BY t.ID
) AS t
WHERE t.IsDuplicate = 0
ORDER BY ID;
Example on DB<>Fiddle
The key parts are (not in the order in which they appear, but in the order in which it is logical to think about it).
(1) Initialise the variables, and order by ID (or whatever field(s) you like) to ensure variables are assigned in the correct order
CROSS JOIN (SELECT #temp := 0, #name := '') AS v
ORDER BY t.ID
(2) Check if the values stored in the variables matches the current row, and flag with a 1 or a 0
IIF(#temp = t.temp AND #name = t.Name, 1, 0) AS IsDuplicate
(3) Assign the values of temp and name in the current row to the variables, so they can be checked against the next row:
#temp:= t.temp,
#name:= t.Name
(4) Remove duplicates from the final data set:
WHERE t.IsDuplicate = 0;
To go one further, you could change the IsDuplicate flag to be a group marker, and use GROUP BY, so you can find out how many records there were in total, while still not displaying duplicates:
SELECT MIN(ID) AS FirstID,
t.Name,
t.Temp,
COUNT(*) AS Records,
MAX(ID) AS LastID
FROM ( SELECT t.*,
#group:= IF(#temp = t.temp AND #name = t.Name, #group, #group + 1) AS GroupID,
#temp:= t.temp,
#name:= t.Name
FROM YourTable AS t
CROSS JOIN (SELECT #temp := 0, #name := '', #group:= 0) AS v
ORDER BY t.ID
) AS t
GROUP BY t.GroupID, t.Name, t.Temp
ORDER BY t.GroupID;
Example on DB<>Fiddle
This may be surplus to requirements, but it can be useful as you are able to extract a lot more information than when just identifying duplicate rows.
Finally if/when you upgrade to version 8.0 or newer, you will be able to use ROW_NUMBER(), or if you move to any other DBMS that supports ROW_NUMBER() (which is most nowadays), then you can use the following:
SELECT MIN(ID) AS FirstID,
t.Name,
t.Temp,
COUNT(*) AS Records,
MAX(ID) AS LastID
FROM ( SELECT t.*,
ROW_NUMBER() OVER(ORDER BY ID) -
ROW_NUMBER() OVER(PARTITION BY Temp, Name ORDER BY ID) AS GroupID
FROM YourTable AS t
ORDER BY t.ID
) AS t
GROUP BY t.GroupID, t.Name, t.Temp
ORDER BY t.GroupID;
Example on DB<>Fiddle
Generic Solution
You can use the following query to do this on any DBMS:
select nd.*
from dedup nd
inner join (
-- find the previous id for each id
select id, (select max(id) from dedup where id < o.id) prev_id
from dedup o
) id_to_prev on id_to_prev.id = nd.id
-- join with the prev row to check for dups
left join dedup d on d.id = id_to_prev.prev_id
and d.name = nd.name
and d.temp = nd.temp
where d.id is null -- if no prev row found with same name+temp, include this row
order by nd.id
SQL Fiddle: http://sqlfiddle.com/#!9/0584ca3/9
If You are using Oracle:
select name, temp from (
select id,
name,
temp,
lag(temp,1,-99999) over (order by id) as temp_prev
from table
order by id) t
where t.temp != t.temp_prev
might work for You (depending on Your Oracle version!), it uses the LAG analytics function to look into previous rows values, creates a temp table then filters it.
create table #temp (name varchar(3),temp int)
insert into #temp values ('hoi',15)
insert into #temp values ('hoi',15)
insert into #temp values ('hoi',15)
insert into #temp values ('hoi',16)
insert into #temp values ('hoi',15)
insert into #temp values ('hoi',15)
insert into #temp values ('hej',13)
insert into #temp values ('hoi',13)
insert into #temp values ('hoi',13)
insert into #temp values ('hoi',14)
insert into #temp values ('hoi',13)
;with FinalResult as (
select ROW_NUMBER()Over(partition by name,temp order by name) RowNumber,*
from #temp
)
select * from FinalResult where RowNumber =1
drop table #temp
You want to look at the previous row in order to decide whether to show a row or not. This would be easy with LAG, available as of MySQL 8. With MySQL 5.7 you need a correlated subquery with LIMIT instead to get the previous row.
select *
from mytable
where not (name, temp) <=>
(
select prev.name, prev.temp
from mytable prev
where prev.id < mytable.id
order by prev.id desc
limit 1
);
Demo: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=4c775dbee12298cd93c5087d7085982f
My database schema looks like this:
Table t1:
id
valA
valB
Table t2:
id
valA
valB
What I want to do, is, for a given set of rows in one of these tables, find rows in both tables that have the same valA or valB (comparing valA with valA and valB with valB, not valA with valB). Then, I want to look for rows with the same valA or valB as the rows in the result of the previous query, and so on.
Example data:
t1 (id, valA, valB):
1, a, B
2, b, J
3, d, E
4, d, B
5, c, G
6, h, J
t2 (id, valA, valB):
1, b, E
2, d, H
3, g, B
Example 1:
Input: Row 1 in t1
Output:
t1/4, t2/3
t1/3, t2/2
t2/1
...
Example 2:
Input: Row 6 in t1
Output:
t1/2
t2/1
I would like to have the level of the search at that the row was found in the result (e.g. in Example 1: Level 1 for t1/2 and t2/1, level 2 for t1/5, ...) A limited depth of recursion is okay. Over time, I maybe want to include more tables following the same schema into the query. It would be nice if it was easy to extend the query for that purpose.
But what matters most, is the performance. Can you tell me the fastest possible way to accomplish this?
Thanks in advance!
try this although it's not fully tested but looked like it was working :P (http://pastie.org/1140339)
drop table if exists t1;
create table t1
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;
drop table if exists t2;
create table t2
(
id int unsigned not null auto_increment primary key,
valA char(1) not null,
valB char(1) not null
)
engine=innodb;
drop view if exists t12;
create view t12 as
select 1 as tid, id, valA, valB from t1
union
select 2 as tid, id, valA, valB from t2;
insert into t1 (valA, valB) values
('a','B'),
('b','J'),
('d','E'),
('d','B'),
('c','G'),
('h','J');
insert into t2 (valA, valB) values
('b','E'),
('d','H'),
('g','B');
drop procedure if exists find_children;
delimiter #
create procedure find_children
(
in p_tid tinyint unsigned,
in p_id int unsigned
)
proc_main:begin
declare done tinyint unsigned default 0;
declare dpth smallint unsigned default 0;
create temporary table children(
tid tinyint unsigned not null,
id int unsigned not null,
valA char(1) not null,
valB char(1) not null,
depth smallint unsigned default 0,
primary key (tid, id, valA, valB)
)engine = memory;
insert into children select p_tid, t.id, t.valA, t.valB, dpth from t12 t where t.tid = p_tid and t.id = p_id;
create temporary table tmp engine=memory select * from children;
/* http://dec.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
while done <> 1 do
if exists(
select 1 from t12 t
inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth) then
insert ignore into children
select
t.tid, t.id, t.valA, t.valB, dpth+1
from t12 t
inner join tmp on tmp.valA = t.valA or tmp.valB = t.valB and tmp.depth = dpth;
set dpth = dpth + 1;
truncate table tmp;
insert into tmp select * from children where depth = dpth;
else
set done = 1;
end if;
end while;
select * from children order by depth;
drop temporary table if exists children;
drop temporary table if exists tmp;
end proc_main #
delimiter ;
call find_children(1,1);
call find_children(1,6);
You can do it with stored procedures (see listings 7 and 7a):
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
You just need to figure out a query for the step of the recursion - taking the already-found rows and finding some more rows.
If you had a database which supported SQL-99 recursive common table expressions (like PostgreSQL or Firebird, hint hint), you could take the same approach as in the above link, but using a rCTE as the framework, so avoiding the need to write a stored procedure.
EDIT: I had a go at doing this with an rCTE in PostgreSQL 8.4, and although i can find the rows, i can't find a way to label them with the depth at which they were found. First, i create a a view to unify the tables:
create view t12 (tbl, id, vala, valb) as (
(select 't1', id, vala, valb from t1)
union
(select 't2', id, vala, valb from t2)
)
Then do this query:
with recursive descendants (tbl, id, vala, valb) as (
(select *
from t12
where tbl = 't1' and id = 1) -- the query that identifies the seed rows, here just t1/1
union
(select c.*
from descendants p, t12 c
where (p.vala = c.vala or p.valb = c.valb)) -- the recursive term
)
select * from descendants;
You would imagine that capturing depth would be as simple as adding a depth column to the rCTE, set to zero in the seed query, then somehow incremented in the recursive step. However, i couldn't find any way to do that, given that you can't write subqueries against the rCTE in the recursive step (so nothing like select max(depth) + 1 from descendants in the column list), and you can't use an aggregate function in the column list (so no max(p.depth) + 1 in the column list coupled with a group by c.* on the select).
You would also need to add a restriction to the query to exclude already-selected rows; you don't need to do that in the basic version, because of the distincting effect of the union, but if you add a count column, then a row can be included in the results more than once with different counts, and you'll get a Cartesian explosion. But you can't easily prevent it, because you can't have subqueries against the rCTE, which means you can't say anything like and not exists (select * from descendants d where d.tbl = c.tbl and d.id = c.id)!
I know all this stuff about recursive queries is of no use to you, but i find it riveting, so please do excuse me.
I have a record in my table like
memberId PersonId Year
4057 1787 2
4502 1787 3
I want a result from a query like this
memberId1 MemberId2 PersonId
4057 4502 1787
How to write a query ??
Don't do this in a query, do it in the application layer.
Don't do this in SQL. At best you can try:
SELECT table1.memberId memberId1, table2.memberId MemberId2, PersonId
FROM table table1 JOIN table table2 USING (PersonId)
But it won't do what you expect if you have more than 2 Members for a person. (It will return every possible combination.)
Below an example on how to do it directly in SQL. Mind that there is plenty of room for optimisation but this version should be rather fast, esp if you have an index on PersonID and year.
SELECT DISTINCT PersonID,
memberId1 = Convert(int, NULL),
memberId2 = Convert(int, NULL)
INTO #result
FROM myTable
WHERE Year IN (2 , 3)
CREATE UNIQUE CLUSTERED INDEX uq0_result ON #result (PersonID)
UPDATE #result
SET memberId1 = t.memberId
FROM #result upd
JOIN myTable t
ON t.PersionId = upd.PersonID
AND t.Year = 2
UPDATE #result
SET memberId2 = t.memberId
FROM #result upd
JOIN myTable t
ON t.PersionId = upd.PersonID
AND t.Year = 3
SELECT * FROM #result
if you want all member ids for each person_id, you could use [for xml path] statement (great functionality)
to concat all memberId s in a string
select distinct PersonId
, (select ' '+cast(t0.MemberId as varchar)
from table t0
where t0.PersonId=t1.PersonId
for xml path('')
) [Member Ids]
from table t1
resulting in:
PersonId Members Ids
1787 ' 4057 4502'
if you really need seperate columns, with unlimited number of memberIds, consider using
a PIVOT table, but far more complex to use
I have a table with a column (registration_no varchar(9)). Here is a sample:
id registration no
1 42400065
2 483877668
3 019000702
4 837478848
5 464657588
6 19000702
7 042400065
Please take note of registration numbers like (042400065) and (42400065), they are almost the same, the difference is just the leading zero.
I want to select all registration numbers that have the same case as above and delete the ones without a leading zero i.e (42400065)
pls, also note that before i delete the ones without leading zeros (42400065), i need to be sure that there is an equivalent with leading zeros(042400065)
declare #T table
(
id int,
[registration no] varchar(9)
)
insert into #T values
(1, '42400065'),
(2, '483877668'),
(3, '019000702'),
(4, '837478848'),
(5, '464657588'),
(6, '19000702'),
(7, '042400065')
;with C as
(
select row_number() over(partition by cast([registration no] as int)
order by [registration no]) as rn
from #T
)
delete from C
where rn > 1
create table temp id int;
insert into temp select id from your_table a where left (registration_no, ) = '0' and
exists select id from your_table
where a.registration_no = concat ('0', registration_no)
delete from your_table where id in (select id from temp);
drop table temp;
I think you can do this with a single DELETE statement. The JOIN ensures that only duplicates can get deleted, and the constraint limits it further by the registration numbers that don't start with a '0'.
DELETE
r1
FROM
Registration r1
JOIN
Registration r2 ON RIGHT(r1.RegistrationNumber, 8) = r2.RegistrationNumber
WHERE
LEFT(r1.RegistrationNumber, 1) <> '0'
Your table looks like this after running the above DELETE. I tested it on a SQL Server 2008 instance.
ID RegistrationNumber
----------- ------------------
2 483877668
3 019000702
4 837478848
5 464657588
7 042400065
This solution won't depend on the registration numbers being a particular length, it just looks for the ones that are the same integer, yet not the same value (because of the leading zeroes) and selects for the entry that has a '0' as the first character.
DELETE r
FROM Registration AS r
JOIN Registration AS r1 ON r.RegistrationNo = CAST(r1.RegistrationNo AS INT)
AND r.RegistrationNo <> r1.RegistrationNo
WHERE CHARINDEX('0',r.registrationno) = 1