Recommended way to search hierarchical data MSSQL2008

I have a table with the following contents:
CategoryID
ParentID
Name
I would like a search feature that searches the whole hierarchy. For example, this is the breadcrumb of a category:
Motorcycles/Japan/Kawasaki/600cc to 800cc/1998-2004
If someone searches for "600cc Kawasaki", I would like the above category to be returned. In other words, the category path with the most matches should be returned.
So far I have come up with this:
IF ISNULL(@searchTerm, '') = ''
    SET @searchTerm = '""'
DECLARE @Result TABLE (CategoryId int)
DECLARE CategoryCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT CategoryId, ParentId, Name
    FROM Category
    WHERE FREETEXT([Name], @searchTerm)
OPEN CategoryCursor
DECLARE @CategoryId int
DECLARE @ParentId int
DECLARE @Name nvarchar(100)
FETCH NEXT FROM CategoryCursor INTO @CategoryId, @ParentId, @Name
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @FullPath nvarchar(1000)
    SET @FullPath = @Name
    WHILE @ParentId <> 0
    BEGIN
        SELECT @ParentId = ParentId, @Name = [Name]
        FROM Category
        WHERE CategoryId = @ParentId
        SET @FullPath = @Name + '\' + @FullPath
    END
    -- Check if @FullPath contains all of the search terms
    DECLARE @found bit
    DECLARE @searchWords NVARCHAR(100)
    DECLARE @searchText NVARCHAR(255)
    DECLARE @pos int
    SET @found = 1
    SET @searchWords = @searchTerm + ' '
    SET @pos = CHARINDEX(' ', @searchWords)
    WHILE @pos <> 0
    BEGIN
        SET @searchText = LEFT(@searchWords, @pos - 1)
        SET @searchWords = STUFF(@searchWords, 1, @pos, '')
        SET @pos = CHARINDEX(' ', @searchWords)
        IF @searchText = '' CONTINUE
        IF @FullPath NOT LIKE '%' + @searchText + '%'
        BEGIN
            SET @found = 0
            BREAK
        END
    END
    IF @found = 1
        INSERT INTO @Result VALUES(@CategoryId)
    FETCH NEXT FROM CategoryCursor INTO @CategoryId, @ParentId, @Name
END
CLOSE CategoryCursor
DEALLOCATE CategoryCursor
SELECT *
FROM Category
WHERE categoryID IN (SELECT categoryId FROM @Result)
This first finds all category names that contain any of the search words. The problem is that I don't want "600cc" categories for other brands to be returned, only the one related to "Kawasaki".
So next I build the breadcrumb for the current category and check whether it contains all of the search words.
It works, but I think it is inefficient, so I'm looking for a better method.
Perhaps I could store the complete path as text in a new column and search on that?

I'd suggest using the hierarchyid type, which was introduced in SQL Server 2008. You would essentially set up your hierarchy like this:
/1/ - Root Node
/1/1/ - Motorcycles
/1/1/1/ - Japan
/1/1/1/1/ - Kawasaki
/1/1/1/2/ - Honda
/1/1/2/ - US
/1/1/2/1/ - Harley.
Then you can use the hierarchyid to walk the entire tree from your 600cc Kawasaki all the way up to Motorcycles.
Here's a code sample from Programming Microsoft SQL Server 2008:
CREATE FUNCTION dbo.fnGetFullDisplayPath(@EntityNodeId hierarchyid) RETURNS varchar(max) AS
BEGIN
    DECLARE @EntityLevelDepth smallint
    DECLARE @LevelCounter smallint
    DECLARE @DisplayPath varchar(max)
    DECLARE @ParentEmployeeName varchar(max)
    -- Start with the specified node
    SELECT @EntityLevelDepth = NodeId.GetLevel(),
           @DisplayPath = EmployeeName
    FROM Employee
    WHERE NodeId = @EntityNodeId
    -- Loop through all its ancestors
    SET @LevelCounter = 0
    WHILE @LevelCounter < @EntityLevelDepth
    BEGIN
        SET @LevelCounter = @LevelCounter + 1
        SELECT @ParentEmployeeName = EmployeeName
        FROM Employee WHERE NodeId = (SELECT NodeId.GetAncestor(@LevelCounter)
                                      FROM Employee
                                      WHERE NodeId = @EntityNodeId)
        -- Prepend the ancestor name to the display path
        SET @DisplayPath = @ParentEmployeeName + ' > ' + @DisplayPath
    END
    RETURN(@DisplayPath)
END
My /1/1/2 representation is the string representation. In the database you'd actually see the hex representation (e.g. 0x79).
There are a few key functions on the hierarchyid.
declare @motorcycleAncestor hierarchyid
-- Fetch the motorcycle node itself; it is the ancestor we compare against
select @motorcycleAncestor = nodeId
from Parts
where Label = 'motorcycle'
select * from Parts
where nodeId.GetAncestor(1) = @motorcycleAncestor;
This query does a couple of things. First, it gets the hierarchyid of the node that has "motorcycle" as its label. (I assume the hierarchy column is named 'nodeId', but you can obviously call it whatever you like.)
Next, it takes this node value and finds all the immediate children of motorcycles (those whose ancestor, one level up, is the motorcycle node). You can specify any level: GetAncestor(3), for example, would be the ancestor three levels up. So in this case it would find Japan, US, Germany, etc.
There is another method, called IsDescendantOf(node). You can use it like this:
declare @motorcycleAncestor hierarchyid
select @motorcycleAncestor = nodeId
from Parts
where Label = 'motorcycle'
select * from Parts
where nodeId.IsDescendantOf(@motorcycleAncestor) = 1
This would return all items that are descendants (at any level) of motorcycles. It would also include Motorcycles itself.
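To make the difference between the two methods concrete, here is a small self-contained sketch. The table variable @Parts and its column names are assumptions made for illustration; the node paths follow the sample hierarchy listed earlier in this answer.

```sql
-- Hypothetical sample data mirroring the hierarchy shown above.
DECLARE @Parts TABLE (NodeId hierarchyid PRIMARY KEY, Label varchar(50));

INSERT INTO @Parts (NodeId, Label) VALUES
    (hierarchyid::Parse('/1/'),       'Root'),
    (hierarchyid::Parse('/1/1/'),     'Motorcycles'),
    (hierarchyid::Parse('/1/1/1/'),   'Japan'),
    (hierarchyid::Parse('/1/1/1/1/'), 'Kawasaki'),
    (hierarchyid::Parse('/1/1/1/2/'), 'Honda'),
    (hierarchyid::Parse('/1/1/2/'),   'US'),
    (hierarchyid::Parse('/1/1/2/1/'), 'Harley');

DECLARE @motorcycles hierarchyid;
SELECT @motorcycles = NodeId FROM @Parts WHERE Label = 'Motorcycles';

-- Immediate children only: nodes whose parent is the Motorcycles node.
SELECT Label, NodeId.ToString() AS NodePath
FROM @Parts
WHERE NodeId.GetAncestor(1) = @motorcycles;

-- Whole subtree: every descendant at any depth, plus Motorcycles itself.
SELECT Label, NodeId.ToString() AS NodePath
FROM @Parts
WHERE NodeId.IsDescendantOf(@motorcycles) = 1;
```

With this data the first query should list Japan and US, while the second should list everything except Root.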
You can combine these in different ways. For example, we're using them in an org chart of sorts. We have the ability to show results for a single user, or for a user and his siblings (everyone at the exact same level) and a user and all his descendants.
So I could show your information, or I could show everyone in your department, or I could show everyone in your company.
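If converting the table to hierarchyid is not an option, the breadcrumb check from the question can also be done in a set-based way. The following is only a sketch: it assumes the question's CategoryID / ParentID / Name layout with ParentID = 0 for root rows, uses a table variable with made-up rows, and hard-codes the two search words rather than splitting @searchTerm.

```sql
-- Hypothetical sample rows in the shape of the question's Category table.
DECLARE @Category TABLE (CategoryID int PRIMARY KEY, ParentID int, Name nvarchar(100));

INSERT INTO @Category (CategoryID, ParentID, Name) VALUES
    (1, 0, N'Motorcycles'),
    (2, 1, N'Japan'),
    (3, 2, N'Kawasaki'),
    (4, 3, N'600cc to 800cc'),
    (5, 4, N'1998-2004'),
    (6, 2, N'Honda'),
    (7, 6, N'600cc to 800cc');

-- Recursive CTE: start at the roots and append each child's name to its parent's path.
WITH CategoryPath AS (
    SELECT CategoryID, CAST(Name AS nvarchar(1000)) AS FullPath
    FROM @Category
    WHERE ParentID = 0
    UNION ALL
    SELECT c.CategoryID, CAST(p.FullPath + N'/' + c.Name AS nvarchar(1000))
    FROM @Category AS c
    INNER JOIN CategoryPath AS p ON c.ParentID = p.CategoryID
)
-- Keep only the paths that contain every search word.
SELECT CategoryID, FullPath
FROM CategoryPath
WHERE FullPath LIKE N'%600cc%'
  AND FullPath LIKE N'%Kawasaki%';
```

The Honda "600cc to 800cc" row is filtered out because its path never mentions Kawasaki. The same CTE could be used to populate a stored path column, which is the idea floated at the end of the question.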

Related

How to match any value of search string from a column containing multiple values separated by space in table in sql?

I have a column in a table that holds multiple values separated by spaces.
I want to return the rows that contain any of the values from a search string.
Example:
search string = 'mumbai pune'
This needs to return the rows matching the word 'mumbai' or 'pune', or both.
Declare @str nvarchar(500)
SET @str='mumbai pune'
create table #tmp
(
ID int identity(1,1),
citycsv nvarchar(500)
)
insert into #tmp(citycsv)Values
('mumbai pune'),
('mumbai'),
('nagpur')
select * from #tmp t
select * from #tmp t
where t.citycsv like '%'+@str+'%'
drop table #tmp
Required output:
ID CityCSV
1 mumbai pune
2 mumbai

You can use a splitter function to split your search string into a table containing the desired search keys. Then you can join your main table to that table of search keys using the LIKE operator.
For completeness I have included an example of a string splitter function; there are plenty of other examples here on SO.
Example string splitter function:
CREATE FUNCTION [dbo].[SplitString]
(
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
)
RETURNS @output TABLE(splitdata NVARCHAR(MAX)
)
BEGIN
    DECLARE @start INT, @end INT
    SELECT @start = 1, @end = CHARINDEX(@delimiter, @string)
    WHILE @start < LEN(@string) + 1 BEGIN
        IF @end = 0
            SET @end = LEN(@string) + 1
        INSERT INTO @output (splitdata)
        VALUES(SUBSTRING(@string, @start, @end - @start))
        SET @start = @end + 1
        SET @end = CHARINDEX(@delimiter, @string, @start)
    END
    RETURN
END
The following query demonstrates how the string splitter function can be combined with LIKE patterns to get the desired result:
SELECT DISTINCT
C.ID
,C.citycsv
FROM #tmp C
INNER JOIN (
SELECT splitdata + '[ ]%' AS MatchFirstWord -- Search pattern to match the first word in the string with the target search word.
,'%[ ]' + splitdata AS MatchLastWord -- Search pattern to match the last word in the string with the target search word.
,'%[ ]' + splitdata + '[ ]%' AS MatchMiddle -- Search pattern to match any words in the middle of the string with the target search word.
,splitdata AS MatchExact -- Search pattern for exact match.
FROM dbo.SplitString(@str, ' ')
) M ON (
(C.citycsv LIKE M.MatchFirstWord) OR
(C.citycsv LIKE M.MatchLastWord) OR
(C.citycsv LIKE M.MatchMiddle) OR
(C.citycsv LIKE M.MatchExact)
)
ORDER BY C.ID
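The four patterns above can be folded into one by padding both the column and the search word with a space on each side, so every word is guaranteed to have a space before and after it. This is a sketch of that variation rather than part of the answer's code; the @cities and @words table variables are stand-ins for #tmp and for the output of the splitter function.

```sql
-- Hypothetical stand-in for #tmp, plus one extra row to show whole-word matching.
DECLARE @cities TABLE (ID int IDENTITY(1,1), citycsv nvarchar(500));
INSERT INTO @cities (citycsv) VALUES
    (N'mumbai pune'), (N'mumbai'), (N'nagpur'), (N'navi-mumbai');

-- Search words, one per row (in practice the result of dbo.SplitString).
DECLARE @words TABLE (word nvarchar(100));
INSERT INTO @words (word) VALUES (N'mumbai'), (N'pune');

-- Pad both sides with a space so first, middle, last and exact matches
-- are all covered by a single pattern.
SELECT DISTINCT c.ID, c.citycsv
FROM @cities AS c
INNER JOIN @words AS w
    ON N' ' + c.citycsv + N' ' LIKE N'% ' + w.word + N' %'
ORDER BY c.ID;
```

Only rows 1 and 2 should come back; 'navi-mumbai' is skipped because 'mumbai' is not a whole space-delimited word there. Search words that contain LIKE wildcards such as % or _ would need escaping first.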

Another approach is to use the REPLACE function.
Its syntax is as follows:
REPLACE ( string_expression , string_pattern , string_replacement )
So we can reach the target by replacing every space that separates the values with the following pattern:
'%'' OR t.citycsv like ''%'
An example:
Declare @str nvarchar(500),
@Where nvarchar (1000),
@Query nvarchar (4000)
SET @str='mumbai pune'
create table #tmp
(
ID int identity(1,1),
citycsv nvarchar(500)
)
insert into #tmp(citycsv)Values
('mumbai pune'),
('mumbai'),
('nagpur')
select * from #tmp t
Set @Where = 'where t.citycsv like ' + '''%'+ replace (RTRIM(LTRIM(@str)), ' ', '%'' OR t.citycsv like ''%') +'%'''
Set @Query = 'select * from #tmp t ' + @Where
execute sp_executesql @Query
drop table #tmp
The result: the dynamic query returns rows 1 ('mumbai pune') and 2 ('mumbai').

T-SQL function with dynamic SELECT (not possible) - solved with procedure instead

I've managed to use EXEC sp_executesql in a one off statement to do a dynamic lookup, but am unable to adjust the code to create a function since EXEC is not allowed in functions. It works in procedures and I've managed to get output via PRINT for a single lookup by using a temporary table, but really that was just me struggling to find a workaround. Ideally I'd like to be able to create a scalar-value function.
The reason that I need a dynamic lookup is because the column name is stored in another table.
Here's a quick breakdown of the tables:
Questions:
Columns: Q_Group, Q_Nbr, Question_Desc, Data_Field
Sample data: 'R3', 5, 'Do you have any allergies?', 'TXT_04'
Responses:
Columns: Order_Nbr, Q_Group, TXT_01, TXT_02, TXT_03, TXT_04, etc.
Data: 999, 'R3', 'blah', 'blah', 'blah', 'NO'
Orders will be assigned a particular set of questions 'Q_Group' and often a particular question will be the same across various different sets of questions. The problem is that when the set/groups of questions were set up, the questions may not have been added in the same order, and thus the responses go into different columns.
So here's where I'm at...
I can get 'TXT_04' from the Data_Field column in Questions and use EXEC sp_executesql to do a lookup for a single order, but am struggling to find a way to accomplish this as a function of some sort.
DECLARE @col_name VARCHAR(6)
DECLARE @sql NVARCHAR(100) -- sp_executesql requires a Unicode string
SET @col_name = (SELECT Data_Field FROM QUESTIONS WHERE Q_Group = 'R3'
AND Question_Desc = 'Do you have any allergies?')
SET @sql = N'SELECT ' + @col_name + N' FROM RESPONSES WHERE Order_Nbr = 999'
EXEC sp_executesql @sql
I'm just at a loss as to how this could be incorporated into a function so that I could get responses for several orders in a result set. Any workarounds possible? Maybe I'm totally off base using EXEC sp_executesql?
Thanks.
Edit...
Okay, I've changed the title to reflect that I'm going to consider this solved with a procedure instead of a function, as it ended up getting the output that I wanted. Which was a table with all of the corresponding responses.
Here's the code that I settled on. I decided to use LIKE to match the Question_Desc instead of equals, and then included the Question_Desc in the results, so that it could be used a bit more broadly. Thankfully it's pretty quick to run currently. Although that could always change as the database grows!
CREATE PROCEDURE get_all_responses (@question_txt VARCHAR(255))
AS
DECLARE @response_col VARCHAR(35)
DECLARE @t TABLE (order_nbr int, question_txt VARCHAR(255), response_col VARCHAR(35), response VARCHAR(255))
DECLARE @i TABLE (id INT PRIMARY KEY IDENTITY(1,1), response_col VARCHAR(35))
DECLARE @u TABLE (order_nbr int, response VARCHAR(255))
DECLARE @sql VARCHAR(200)
INSERT @t
SELECT R.Order_Nbr, Q.Question_Desc, Q.Data_Field, NULL
FROM Responses AS R
JOIN (
    SELECT Q_Group, Question_Desc, Data_Field
    FROM Questions
    WHERE Question_Desc LIKE @question_txt
) AS Q ON R.Q_Group = Q.Q_Group -- columns qualified to avoid an ambiguous Q_Group reference
WHERE R.Q_Group <> '0'
ORDER BY Q.Data_Field, R.Order_Nbr
-- Stop if no results found and return empty result set
IF (SELECT COUNT(*) FROM @t) = 0
BEGIN
    SELECT order_nbr, question_txt, response FROM @t
    RETURN
END
INSERT @i SELECT response_col FROM @t GROUP BY response_col
DECLARE @row_nbr int
DECLARE @last_row int
SET @row_nbr = 1
SET @last_row = (SELECT COUNT(*) FROM @i)
-- Iterate through each Data_Field found
WHILE @row_nbr <= @last_row
BEGIN
    SET @response_col = (SELECT response_col FROM @i WHERE id = @row_nbr)
    SET @sql = 'SELECT Order_Nbr, ' + @response_col + ' FROM Responses WHERE NullIf(' + @response_col + ','''') IS NOT NULL'
    INSERT INTO @u
    EXEC (@sql)
    UPDATE @t
    SET response = y.response
    FROM @t AS x
    INNER JOIN @u AS y ON x.order_nbr = y.order_nbr
    SET @row_nbr = @row_nbr + 1
END
-- Remove results with no responses
DELETE FROM @t WHERE response IS NULL
SELECT order_nbr, question_txt, response FROM @t
RETURN

You will not be able to execute dynamic SQL from within a function, but you could do this with a stored procedure and capture the output.
DECLARE @col_name VARCHAR(6), @param NVARCHAR(50), @myReturnValue VARCHAR(50)
SET @param = N'@result VARCHAR(50) OUTPUT'
DECLARE @sql NVARCHAR(100) -- sp_executesql requires a Unicode string
SET @col_name = (SELECT Data_Field FROM QUESTIONS WHERE Q_Group = 'R3'
AND Question_Desc = 'Do you have any allergies?')
SET @sql = N'SELECT @result = ' + @col_name + N' FROM RESPONSES WHERE Order_Nbr = 999'
EXEC sp_executesql @sql, @param, @result = @myReturnValue OUTPUT
--manipulate value here
print @myReturnValue
You could also create a temp table and do an INSERT INTO ... EXEC sp_executesql.
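A rough sketch of that temp-table variant follows. The #Questions and #Responses tables are cut-down, hypothetical copies of the question's tables so that the batch runs on its own, and wrapping the column name in QUOTENAME is an extra precaution that is not in the original code.

```sql
-- Hypothetical, reduced versions of the question's tables.
CREATE TABLE #Questions (Q_Group varchar(10), Question_Desc varchar(255), Data_Field nvarchar(128));
CREATE TABLE #Responses (Order_Nbr int, Q_Group varchar(10), TXT_01 varchar(255), TXT_04 varchar(255));
INSERT INTO #Questions VALUES ('R3', 'Do you have any allergies?', N'TXT_04');
INSERT INTO #Responses VALUES (999, 'R3', 'blah', 'NO');

DECLARE @col_name nvarchar(128), @sql nvarchar(400);
SELECT @col_name = Data_Field
FROM #Questions
WHERE Q_Group = 'R3' AND Question_Desc = 'Do you have any allergies?';

-- QUOTENAME brackets the looked-up column name; the order number is a real parameter.
SET @sql = N'SELECT Order_Nbr, ' + QUOTENAME(@col_name)
         + N' FROM #Responses WHERE Order_Nbr = @order_nbr';

-- Capture the dynamic query's result set in a temp table.
CREATE TABLE #Result (Order_Nbr int, Response varchar(255));
INSERT INTO #Result (Order_Nbr, Response)
EXEC sp_executesql @sql, N'@order_nbr int', @order_nbr = 999;

SELECT Order_Nbr, Response FROM #Result;

DROP TABLE #Questions;
DROP TABLE #Responses;
DROP TABLE #Result;
```

Because the rows land in #Result, the rest of the procedure can join against them like any other table.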

search array element in LIKE mysql

I have a stored procedure that searches as shown below:
BEGIN
#At first, Search in name of job:
(SELECT * FROM tb_job WHERE `name` LIKE '%some%' AND `name` LIKE '%thing%')
UNION
# second, search for tags:
(SELECT * FROM tb_job WHERE id IN
(
SELECT idJob FROM
(
(SELECT 2 AS priority1, COUNT(tb_job_tag.idTag) AS priority2, idJob FROM tb_job_tag WHERE idTag IN
(SELECT tb_tag.id FROM tb_tag WHERE tag LIKE '%some%' OR tag LIKE '%thing%')
GROUP BY tb_job_tag.idJob)
UNION
(SELECT 1, COUNT(tb_job_tag.idTag), idJob FROM tb_job_tag WHERE idTag IN
(SELECT tb_tag.id FROM tb_tag WHERE tag LIKE '%some%' AND tag LIKE '%thing%')
GROUP BY tb_job_tag.idJob)
)
AS t ORDER BY priority1, priority2 DESC
)
)
END
Now I have two questions. First, how can I pass an array of words, separate them in MySQL and use them in LIKE? Second, how can I make this search better?
(I have three tables: tb_job, tb_tag, and tb_job_tag, which stores each job's id and tag's id.) Thanks for your help.

/**
* http://www.aspdotnet-suresh.com/2013/07/sql-server-split-function-example-in.html
*/
CREATE FUNCTION dbo.Array(@String nvarchar(4000), @Delimiter char(1))
RETURNS @Results TABLE ([id] [bigint] IDENTITY(1,1) NOT NULL, Items nvarchar(4000))
AS
BEGIN
    DECLARE @INDEX INT
    DECLARE @SLICE nvarchar(4000)
    -- HAVE TO SET TO 1 SO IT DOESNT EQUAL ZERO FIRST TIME IN LOOP
    SELECT @INDEX = 1
    WHILE @INDEX != 0
    BEGIN
        -- GET THE INDEX OF THE FIRST OCCURENCE OF THE SPLIT CHARACTER
        SELECT @INDEX = CHARINDEX(@Delimiter, @STRING)
        -- NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
        IF @INDEX != 0
            SELECT @SLICE = LEFT(@STRING, @INDEX - 1)
        ELSE
            SELECT @SLICE = @STRING
        -- PUT THE ITEM INTO THE RESULTS SET
        INSERT INTO @Results(Items) VALUES(@SLICE)
        -- CHOP THE ITEM REMOVED OFF THE MAIN STRING
        SELECT @STRING = RIGHT(@STRING, LEN(@STRING) - @INDEX)
        -- BREAK OUT IF WE ARE DONE
        IF LEN(@STRING) = 0 BREAK
    END
    RETURN
END
Execute this function once in your database. From then on, to access a specific value, you can write your query like this:
DECLARE @VALUE VARCHAR(100);
SELECT TOP 1 @VALUE = Items FROM [dbo].[Array] ('some,thing,like,that' , ',') where id = 2
PRINT @VALUE
All you need to change is the id in the SELECT statement. It only accepts string values for now, but you can convert a string to an int in SQL using CAST.
I created this function in a hurry; if you have any suggestions or modifications, let me know.

T-SQL: split and aggregate comma-separated values

I have the following table with each row having comma-separated values:
ID
-----------------------------------------------------------------------------
10031,10042
10064,10023,10060,10065,10003,10011,10009,10012,10027,10004,10037,10039
10009
20011,10027,10032,10063,10023,10033,20060,10012,10020,10031,10011,20036,10041
I need to get a count for each ID (a groupby).
I am just trying to avoid cursor implementation and stumped on how to do this without cursors.
Any Help would be appreciated !

You will want to use a split function:
create FUNCTION [dbo].[Split](@String varchar(MAX), @Delimiter char(1))
returns @temptable TABLE (items varchar(MAX))
as
begin
    declare @idx int
    declare @slice varchar(8000)
    select @idx = 1
    if len(@String)<1 or @String is null return
    while @idx != 0
    begin
        set @idx = charindex(@Delimiter, @String)
        if @idx != 0
            set @slice = left(@String, @idx - 1)
        else
            set @slice = @String
        if (len(@slice) > 0)
            insert into @temptable(Items) values(@slice)
        set @String = right(@String, len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end;
And then you can query the data in the following manner:
select items, count(items)
from table1 t1
cross apply dbo.split(t1.id, ',')
group by items
See SQL Fiddle With Demo
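For comparison, here is a loop-free sketch that splits through the xml type instead of a WHILE loop. It is a swapped-in technique, not the function above: the table variables are hypothetical, and it assumes the values never contain XML-special characters such as & or <, which is safe for plain numeric IDs.

```sql
-- Hypothetical sample rows in the shape of the question's table.
DECLARE @t TABLE (ID varchar(max));
INSERT INTO @t (ID) VALUES ('10031,10042'), ('10009'), ('10031,10009,10011');

-- Stage the split values first; XML methods cannot be used directly in GROUP BY.
DECLARE @items TABLE (item varchar(20));

INSERT INTO @items (item)
SELECT n.x.value('.', 'varchar(20)')
FROM @t AS t
-- Turn 'a,b,c' into '<x>a</x><x>b</x><x>c</x>' and cast it to xml.
CROSS APPLY (SELECT CAST('<x>' + REPLACE(t.ID, ',', '</x><x>') + '</x>' AS xml) AS doc) AS d
-- Shred the xml into one row per <x> element.
CROSS APPLY d.doc.nodes('/x') AS n(x);

SELECT item, COUNT(*) AS occurrences
FROM @items
GROUP BY item
ORDER BY item;
```

Whether this beats the WHILE-loop splitter depends on row counts and string lengths, so it is worth measuring on real data before switching.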

The solution I always use (there may well be a better way) is a function that splits everything. It needs no cursors, just a WHILE loop.
if OBJECT_ID('splitValueByDelimiter') is not null
begin
drop function splitValueByDelimiter
end
go
create function splitValueByDelimiter (
    @inputValue varchar(max)
    , @delimiter varchar(1)
)
returns @results table (value varchar(max))
as
begin
    declare @delimeterIndex int
        , @tempValue varchar(max)
    set @delimeterIndex = 1
    while @delimeterIndex > 0 and len(isnull(@inputValue, '')) > 0
    begin
        set @delimeterIndex = charindex(@delimiter, @inputValue)
        if @delimeterIndex > 0
            set @tempValue = left(@inputValue, @delimeterIndex - 1)
        else
            set @tempValue = @inputValue
        if (len(@tempValue) > 0)
        begin
            insert
            into @results
            select @tempValue
        end
        set @inputValue = right(@inputValue, len(@inputValue) - @delimeterIndex)
    end
    return
end
After that you can call the function like this:
if object_id('test') is not null
begin
drop table test
end
go
create table test (
Id varchar(max)
)
insert
into test
select '10031,10042'
union all select '10064,10023,10060,10065,10003,10011,10009,10012,10027,10004,10037,10039'
union all select '10009'
union all select '20011,10027,10032,10063,10023,10033,20060,10012,10020,10031,10011,20036,10041'
select value
from test
cross apply splitValueByDelimiter(Id, ',')
Hope it helps, although I am still looping through everything.

First, to reiterate the comment above: do NOT put multiple values into a single column. Use a separate child table with one value per row!
Nevertheless, one possible approach is to use a UDF to convert each delimited string to a table. Once all the values have been converted to tables, combine them into one table and do a GROUP BY on it.
Create Function dbo.ParseTextString (@S Text, @delim VarChar(5))
Returns @tOut Table
(ValNum Integer Identity Primary Key,
 sVal VarChar(8000))
As
Begin
    Declare @dlLen TinyInt        -- Length of delimiter
    Declare @wind VarChar(8000)   -- Will contain window into text string
    Declare @winLen Integer       -- Length of window
    Declare @isLastWin TinyInt    -- Boolean to indicate processing last window
    Declare @wPos Integer         -- Start position of window within text string
    Declare @roVal VarChar(8000)  -- String data to insert into output table
    Declare @BtchSiz Integer      -- Maximum size of window
    Set @BtchSiz = 7900           -- (Reset to smaller values to test routine)
    Declare @dlPos Integer        -- Position within window of next delimiter
    Declare @Strt Integer         -- Start position of each data value within window
    -- --------------------------------------------
    If @delim is Null Set @delim = '|'
    If DataLength(@S) = 0 Or
       Substring(@S, 1, @BtchSiz) = @delim Return
    -- --------------------------------------------
    Select @dlLen = DataLength(@delim),
           @Strt = 1, @wPos = 1,
           @wind = Substring(@S, 1, @BtchSiz)
    Select @winLen = DataLength(@wind),
           @isLastWin = Case When DataLength(@wind) = @BtchSiz
                             Then 0 Else 1 End,
           @dlPos = CharIndex(@delim, @wind, @Strt)
    -- --------------------------------------------
    While @Strt <= @winLen
    Begin
        If @dlPos = 0 Begin -- No more delimiters in window
            If @isLastWin = 1 Set @dlPos = @winLen + 1
            Else Begin
                Set @wPos = @wPos + @Strt - 1
                Set @wind = Substring(@S, @wPos, @BtchSiz)
                -- ----------------------------------------
                Select @winLen = DataLength(@wind), @Strt = 1,
                       @isLastWin = Case When DataLength(@wind) = @BtchSiz
                                         Then 0 Else 1 End,
                       @dlPos = CharIndex(@delim, @wind, 1)
                If @dlPos = 0 Set @dlPos = @winLen + 1
            End
        End
        -- -------------------------------
        Insert @tOut (sVal)
        Select LTrim(Substring(@wind,
                     @Strt, @dlPos - @Strt))
        -- -------------------------------
        -- Move @Strt to char after last delimiter
        Set @Strt = @dlPos + @dlLen
        Set @dlPos = CharIndex(@delim, @wind, @Strt)
    End
    Return
End
Then write (using your table schema):
Declare @AllVals VarChar(8000)
Select @AllVals = Coalesce(@AllVals + ',', '') + ID
From Table Where ID Is Not null
-- -----------------------------------------
Select sVal, Count(*)
From dbo.ParseTextString(@AllVals, ',')
Group By sVal

T-SQL strip all non-alpha and non-numeric characters

Is there a smarter way to remove all special characters rather than having a series of about 15 nested replace statements?
The following works, but only handles three characters (ampersand, blank and period).
select CustomerID, CustomerName,
Replace(Replace(Replace(CustomerName,'&',''),' ',''),'.','') as CustomerNameStripped
from Customer

One flexible-ish way;
CREATE FUNCTION [dbo].[fnRemovePatternFromString](@BUFFER VARCHAR(MAX), @PATTERN VARCHAR(128)) RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @POS INT = PATINDEX(@PATTERN, @BUFFER)
    WHILE @POS > 0 BEGIN
        SET @BUFFER = STUFF(@BUFFER, @POS, 1, '')
        SET @POS = PATINDEX(@PATTERN, @BUFFER)
    END
    RETURN @BUFFER
END
select dbo.fnRemovePatternFromString('cake & beer $3.99!?c', '%[$&.!?]%')
(No column name)
cake beer 399c
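The same loop can strip everything that is not a letter or digit if the pattern uses a negated character class instead of listing each unwanted character. The batch below is a stand-alone sketch of that idea, with no function required; note that how the ranges A-Z and a-z behave depends on the collation in effect, so accented letters may or may not survive.

```sql
DECLARE @s varchar(255);
DECLARE @pos int;

SET @s = 'cake & beer $3.99!?c';

-- [^A-Za-z0-9] matches any single character that is NOT a letter or digit.
SET @pos = PATINDEX('%[^A-Za-z0-9]%', @s);

WHILE @pos > 0
BEGIN
    -- Delete the offending character, then look for the next one.
    SET @s = STUFF(@s, @pos, 1, '');
    SET @pos = PATINDEX('%[^A-Za-z0-9]%', @s);
END

SELECT @s AS Stripped;
```

Passing '%[^A-Za-z0-9]%' as the @PATTERN argument of fnRemovePatternFromString should have the same effect.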

Create a function:
CREATE FUNCTION dbo.StripNonAlphaNumerics
(
    @s VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @p INT = 1, @n VARCHAR(255) = '';
    WHILE @p <= LEN(@s)
    BEGIN
        IF SUBSTRING(@s, @p, 1) LIKE '[A-Za-z0-9]'
        BEGIN
            SET @n += SUBSTRING(@s, @p, 1);
        END
        SET @p += 1;
    END
    RETURN(@n);
END
GO
Then:
SELECT Result = dbo.StripNonAlphaNumerics
('My Customer''s dog & #1 friend are dope, yo!');
Results:
Result
------
MyCustomersdog1friendaredopeyo
To make it more flexible, you could pass in the pattern you want to allow:
CREATE FUNCTION dbo.StripNonAlphaNumerics
(
    @s VARCHAR(255),
    @pattern VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @p INT = 1, @n VARCHAR(255) = '';
    WHILE @p <= LEN(@s)
    BEGIN
        IF SUBSTRING(@s, @p, 1) LIKE @pattern
        BEGIN
            SET @n += SUBSTRING(@s, @p, 1);
        END
        SET @p += 1;
    END
    RETURN(@n);
END
GO
Then:
SELECT r = dbo.StripNonAlphaNumerics
('Bob''s dog & #1 friend are dope, yo!', '[A-Za-z0-9]');
Results:
r
------
Bobsdog1friendaredopeyo

I faced this problem several years ago, so I wrote a SQL function to do the trick. Here is the original article (was used to scrape text out of HTML). I have since updated the function, as follows:
IF (object_id('dbo.fn_CleanString') IS NOT NULL)
BEGIN
PRINT 'Dropping: dbo.fn_CleanString'
DROP function dbo.fn_CleanString
END
GO
PRINT 'Creating: dbo.fn_CleanString'
GO
CREATE FUNCTION dbo.fn_CleanString
(
    @string varchar(8000)
)
returns varchar(8000)
AS
BEGIN
    ---------------------------------------------------------------------------------------------------
    -- Title: CleanString
    -- Date Created: March 26, 2011
    -- Author: William McEvoy
    --
    -- Description: This function removes special ascii characters from a string.
    ---------------------------------------------------------------------------------------------------
    declare @char char(1),
            @len int,
            @count int,
            @newstring varchar(8000),
            @replacement char(1)
    select @count = 1,
           @len = 0,
           @newstring = '',
           @replacement = ' '
    ---------------------------------------------------------------------------------------------------
    -- M A I N P R O C E S S I N G
    ---------------------------------------------------------------------------------------------------
    -- Remove Backspace characters
    select @string = replace(@string,char(8),@replacement)
    -- Remove Tabs
    select @string = replace(@string,char(9),@replacement)
    -- Remove line feed
    select @string = replace(@string,char(10),@replacement)
    -- Remove carriage return
    select @string = replace(@string,char(13),@replacement)
    -- Condense multiple spaces into a single space
    -- This works by changing all double spaces to be OX where O = a space, and X = a special character
    -- then all occurrences of XO are changed to O,
    -- then all occurrences of X are changed to nothing, leaving just the O which is actually a single space
    select @string = replace(replace(replace(ltrim(rtrim(@string)),' ', ' ' + char(7)),char(7)+' ',''),char(7),'')
    -- Parse each character, remove non alpha-numeric
    select @len = len(@string)
    WHILE (@count <= @len)
    BEGIN
        -- Examine the character
        select @char = substring(@string,@count,1)
        IF (@char like '[a-z]') or (@char like '[A-Z]') or (@char like '[0-9]')
            select @newstring = @newstring + @char
        ELSE
            select @newstring = @newstring + @replacement
        select @count = @count + 1
    END
    return @newstring
END
GO
IF (object_id('dbo.fn_CleanString') IS NOT NULL)
PRINT 'Function created.'
ELSE
PRINT 'Function NOT created.'
GO

I know this is an old thread, but this might still be handy for others.
Here's a quick and dirty version using a recursive CTE (I've done it inversely, stripping out non-numerics).
What makes this one nice for me is that it's an inline function, so it gets around the nasty RBAR effect of the usual scalar and multi-statement table-valued functions.
Adjust the filter as needed to include or exclude whatever character types you want.
Create Function fncV1_iStripAlphasFromData (
    @iString Varchar(max)
)
Returns
Table With Schemabinding
As
Return(
    with RawData as
    (
        Select @iString as iString
    )
    ,
    Anchor as
    (
        Select Case(IsNumeric (substring(iString, 1, 1))) when 1 then substring(iString, 1, 1) else '' End as oString, 2 as CharPos from RawData
        UNION ALL
        Select a.oString + Case(IsNumeric (substring(@iString, a.CharPos, 1))) when 1 then substring(@iString, a.CharPos, 1) else '' End, a.CharPos + 1
        from RawData r
        Inner Join Anchor a on a.CharPos <= len(rtrim(ltrim(@iString)))
    )
    Select top 1 oString from Anchor order by CharPos Desc
)
Go
select * from dbo.fncV1_iStripAlphasFromData ('00000')
select * from dbo.fncV1_iStripAlphasFromData ('00A00')
select * from dbo.fncV1_iStripAlphasFromData ('12345ABC6789!&*0')

If you can use SQL CLR, you can use .NET regular expressions for this.
There is a free third-party package that includes this and more: SQL Sharp.
