SQL Server: Split operation - sql-server-2008

How to split a string in SQL Server.
Example:
Input string: stack over flow
Result:
stack
over
flow

If you can use table-valued parameters, see "Arrays and Lists in SQL Server 2008 Using Table-Valued Parameters" by Erland Sommarskog. If you can't, there are many ways to split a string in SQL Server. This article covers the pros and cons of just about every method:
"Arrays and Lists in SQL Server 2005 and Beyond, When Table Value Parameters Do Not Cut it" by Erland Sommarskog
You need to create a split function. This is how a split function can be used:
SELECT
*
FROM YourTable y
INNER JOIN dbo.yourSplitFunction(@Parameter) s ON y.ID=s.Value
I prefer the Numbers table approach to splitting a string in T-SQL, but there are numerous ways to split strings in SQL Server; see the previous link, which explains the pros and cons of each.
For the Numbers table method to work, you need to do this one-time table setup, which creates a table Numbers that contains rows from 1 to 10,000:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn char(1)      --REQUIRED, the character to split the @List string on
    ,@List    varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
        FROM (SELECT
                  LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
                  FROM (
                           SELECT @SplitOn + @List + @SplitOn AS List2
                       ) AS dt
                      INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
                  WHERE SUBSTRING(List2, number, 1) = @SplitOn
             ) dt2
        WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a delimited string (CSV or, as here, space-separated) into a table and join on it:
select * from dbo.FN_ListToTable(' ','stack over flow')
OUTPUT:
ListValue
-------------------
stack
over
flow
(3 row(s) affected)
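The join shown earlier can be sketched with this function. Note that FN_ListToTable takes the delimiter as its first argument and returns a column named ListValue rather than Value. YourTable and its integer ID column are the placeholders from the usage example, not real objects, and the sample list is made up.

```sql
-- Assumes dbo.FN_ListToTable and the Numbers table above already exist,
-- and that YourTable (a placeholder) has an integer ID column.
DECLARE @Parameter varchar(8000);
SET @Parameter = '1,5,9';

SELECT y.*
FROM YourTable AS y
INNER JOIN dbo.FN_ListToTable(',', @Parameter) AS s
    ON y.ID = CAST(s.ListValue AS int);  -- ListValue is varchar, so convert before comparing
```

The CAST raises an error if the list contains a non-numeric element, so this only suits lists of integer keys.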

A common set-based solution to this kind of problem is to use a numbers table.
The following solution uses a simple recursive CTE to generate the numbers table on the fly - if you need to work with longer strings, this should be replaced with a static numbers table.
DECLARE @vch_string varchar(max)
DECLARE @chr_delim char(1)
SET @chr_delim = ' '
SET @vch_string = 'stack over flow'
;WITH nums_cte
AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n+1 FROM nums_cte
    WHERE n < len(@vch_string)
)
SELECT n - LEN(REPLACE(LEFT(s,n),@chr_delim,'')) + 1 AS pos
      ,SUBSTRING(s,n,CHARINDEX(@chr_delim, s + @chr_delim,n) -n) as ELEMENT
FROM (SELECT @vch_string as s) AS D
JOIN nums_cte
  ON n <= LEN(s)
 AND SUBSTRING(@chr_delim + s,n,1) = @chr_delim
OPTION (MAXRECURSION 0);

I know this question was for SQL Server 2008, but things evolve: starting with SQL Server 2016 (database compatibility level 130 or higher) you can do this:
DECLARE @string varchar(100) = 'Richard, Mike, Mark'
SELECT value FROM string_split(@string, ',')
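The sample list puts a space after each comma, so the split returns ' Mike' and ' Mark' with a leading space. A minimal sketch that trims each element, assuming SQL Server 2016 or later; the Name alias is only illustrative:

```sql
DECLARE @string varchar(100) = 'Richard, Mike, Mark';

-- STRING_SPLIT takes a single-character separator and returns one column named value.
-- LTRIM/RTRIM remove the space that follows each comma in the input.
SELECT LTRIM(RTRIM(value)) AS Name
FROM STRING_SPLIT(@string, ',');
```

STRING_SPLIT does not guarantee the order of the rows it returns, so add an explicit ORDER BY if the order matters and the values themselves can be sorted.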

CREATE FUNCTION [dbo].[Split]
(
    @List varchar(max),
    @SplitOn nvarchar(5)
)
RETURNS @RtnValue table
(
    Id int identity(1,1),
    Value nvarchar(max)
)
AS
BEGIN
    While (Charindex(@SplitOn,@List)>0)
    Begin
        Insert Into @RtnValue (value)
        Select
            Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))
        -- DATALENGTH/2 instead of LEN: LEN ignores trailing spaces, so LEN(N' ') is 0
        -- and a space delimiter would never be consumed (endless loop).
        Set @List = Substring(@List,Charindex(@SplitOn,@List)+datalength(@SplitOn)/2,len(@List))
    End
    Insert Into @RtnValue (Value)
    Select Value = ltrim(rtrim(@List))
    Return
END
Create the function above, then run the query below to get your result.
Select * From Dbo.Split('Stack Over Flow',' ')
Suggestion: a visible delimiter such as a comma is better for the values to split (for example, 'Stack,Over,Flow').

Hard. Really hard. String manipulation and SQL are a bad combination. Writing the stored procedure in C# / .NET is one way; it could return a table type (a table) with one item per row.

Related

T-SQL How can I query a column with incorrect JSON?

I've been asked to create a VIEW off a table that includes a varchar(MAX) column containing a JSON string. Unfortunately, some of the entries contain double quotes that aren't escaped.
Example (invalid in Notes):
{"Eligible":"true","Reason":"","Notes":"Left message for employee to "call me"","EDate":"08/16/2021"}
I don't have access to correct wherever this is being inserted so I just have to work with the data as is.
So in my view I need to find a way to escape those double quotes.
I'm pulling the data like so:
JSON_VALUE(JsonData, '$.Notes') as Notes
However, I get the following error:
JSON text is not properly formatted. Unexpected character '"' is found at position 102.
I can't do a simple replace on the whole field because that would create invalid JSON also.
I tried JSON_MODIFY but ran into the problem of getting the Notes field to replace itself.
JSON_MODIFY(JsonData, '$.Notes', REPLACE(JSON_VALUE(JsonData, '$.Notes'), '"', '\"'))
Maybe I'm missing something obvious, but I can't figure out how to handle this. Is there a way to escape those double quotes in my query?
This is incredibly hacky, and there are probably several inputs that would break it as written, but if you absolutely can't fix the source data or simply flag bad JSON for manual adjustment, this may be the route you need to take and flesh out further.
Based on your example and a couple of extras I have thrown in, and with the help of a custom string-splitting table-valued function that maintains sort order, you can achieve the output as follows:
Query
declare @t table (JsonData nvarchar(max));
insert into @t values('{"Eligible":true,"Reason":"","Notes":"Left message for employee to "call me"","EDate":"08/16/2021","Test": "999","Another Test":"Value with " character"}');
with q as
(
select t.JsonData
,s.rn
,case when right(trim(lag(s.item,1) over (order by s.rn)),1) in('{',':',',')
then '"'
else ''
end -- Do we need a starting double quote?
+ s.item -- Value from the split text
+ case when right(trim(lead(s.item,1) over (order by s.rn)),1) not in('}',':',',')
and right(trim(s.item),1) not in('{','}',':',',')
then '\"'
else ''
end -- Do we need an escaped double quote?
+ case when left(trim(lead(s.item,1) over (order by s.rn)),1) in('}',':',',')
then '"'
else ''
end -- Do we need an ending double quote?
as Quoted
from @t as t
cross apply dbo.fn_StringSplit4k(t.JsonData,'"',null) as s -- By splitting on " characters, we know where they all are even though they are removed, so we can add them back in as required based on the remaining text
)
,j as
(
select JsonData
,string_agg(Quoted,'') within group (order by rn) as JsonFixed
from q
group by JsonData
)
select json_value(JsonFixed, '$.Eligible') as Eligible
,json_value(JsonFixed, '$.Reason') as Reason
,json_value(JsonFixed, '$.Notes') as Notes
,json_value(JsonFixed, '$.EDate') as EDate
,json_value(JsonFixed, '$.Test') as Test
,json_value(JsonFixed, '$."Another Test"') as AnotherTest
from j;
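Output
Eligible | Reason | Notes                                  | EDate      | Test | AnotherTest
---------+--------+----------------------------------------+------------+------+-----------------------
true     |        | Left message for employee to "call me" | 08/16/2021 | 999  | Value with " character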
String Splitter
create function [dbo].[fn_StringSplit4k]
(
     @str nvarchar(4000) = ' '          -- String to split.
    ,@delimiter as nvarchar(1) = ','    -- Delimiting value to split on.
    ,@num as int = null                 -- Which value to return.
)
returns table
as
return
    -- Start tally table with 10 rows.
    with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
    -- Select the same number of rows as characters in @str as incremental row numbers.
    -- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
    ,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
    -- Return the position of every value that follows the specified delimiter.
    ,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
    -- Return the start and length of every value, to use in the SUBSTRING function.
    -- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
    ,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
    select rn
          ,item
    from(select row_number() over(order by s) as rn
               ,substring(@str,s,l) as item
         from l
        ) a
    where rn = @num
       or @num is null;
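To see what the splitter returns on its own, here is a small usage sketch. It assumes dbo.fn_StringSplit4k has been created as defined in this answer; the sample string 'a,b,c' is made up for illustration.

```sql
-- Return every element with its ordinal position (third argument NULL = all rows).
select rn, item
from dbo.fn_StringSplit4k(N'a,b,c', N',', null);
-- rn  item
-- 1   a
-- 2   b
-- 3   c

-- Return only the second element.
select item
from dbo.fn_StringSplit4k(N'a,b,c', N',', 2);
-- item
-- b
```

The rn column is what lets the query above reassemble the pieces in their original order with string_agg ... within group (order by rn).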
I would like to suggest a scalar function along these lines:
CREATE FUNCTION dbo.clearJSon(@v nvarchar(max)) RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @i AS int
    DECLARE @security int
    -- Find a double quote that is neither preceded by one of {:, nor followed by one of ,:}
    SET @i=PATINDEX('%[^{:,]"[^,:}]%',@v)
    SET @security=0 -- just to prevent an endless loop
    WHILE @i>0 and @security<100
    BEGIN
        -- @i points at the character before the stray quote; swap the quote for an apostrophe
        SET @v = LEFT(@v,@i)+''''+SUBSTRING(@v,@i+2,len(@v))
        SET @i=PATINDEX('%[^{:,]"[^,:}]%',@v)
        SET @security = @security+1
    END
    RETURN @v
END
which returns
{"Eligible":"true","Reason":"","Notes":"Left message for employee to 'call me'","EDate":"08/16/2021"} as the result of dbo.clearJSon(JsonData)
I have to admit, though, that the above code would fail if an unescaped quote were directly followed by one of ,:} or directly preceded by one of {:,
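Because of those failure cases, it may be worth checking the cleaned string before reading from it. A minimal sketch, assuming SQL Server 2016 or later (for ISJSON and JSON_VALUE) and that dbo.clearJSon exists as defined in this answer; the table variable @t is only a stand-in for the real table:

```sql
DECLARE @t TABLE (JsonData nvarchar(max));
INSERT INTO @t VALUES
(N'{"Eligible":"true","Reason":"","Notes":"Left message for employee to "call me"","EDate":"08/16/2021"}');

SELECT ISJSON(JsonData)                 AS IsValidBefore  -- 0 for the sample row
      ,ISJSON(dbo.clearJSon(JsonData))  AS IsValidAfter   -- 1 for the sample row
       -- Only read the value when the cleaned text really is valid JSON.
      ,CASE WHEN ISJSON(dbo.clearJSon(JsonData)) = 1
            THEN JSON_VALUE(dbo.clearJSon(JsonData), '$.Notes')
       END                              AS Notes
FROM @t;
```

Rows where IsValidAfter is still 0 are the ones the pattern could not repair and would need manual attention.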

Stored procedure in MSSQL to MySQL

I have a stored procedure in MSSQL that I would like to rewrite in MySQL.
Any help or suggestions, please? I cannot get the XML function to work in MySQL.
Stored proc:
ALTER PROCEDURE uspGetProductDetailsCSV (
    @sku NVARCHAR(MAX)
)
AS
BEGIN
    SELECT T.C.value('.', 'NVARCHAR(100)') AS [SKU]
    INTO #tblPersons
    FROM (SELECT CAST ('<Name>' + REPLACE (@sku, ',', '</Name><Name>')
    + '</Name>' AS XML) AS [Products]) AS A
    CROSS APPLY Products.nodes('/Name') as T(C)
    SELECT *
    FROM ProductInformation Pr
    WHERE EXISTS (SELECT Name FROM #tblPersons tmp WHERE tmp.SKU
    = case when len(tmp.SKU) = 11 then Product_No+Colour_Code+Size_Code
    when len(tmp.SKU) = 8 then Product_No+Colour_Code
    when len(tmp.sku) = 6 then Product_No end)
    DROP TABLE #tblPersons
END
Edit: I could not translate the XML part of the stored proc. When I paste the same code into MySQL, it doesn't create the stored proc.
Error: >can not cast as XML<
I don't believe XML is a valid type in MySQL. Try just leaving it as a VARCHAR.
So, just remove the cast. I also think you will have to use CONCAT instead of + and change the [] around column names to backticks.
So instead of:
FROM (SELECT CAST ('<Name>' + REPLACE (@sku, ',', '</Name><Name>')
+ '</Name>' AS XML) AS [Products]) AS A
try:
FROM (SELECT CONCAT('<Name>' , REPLACE(@sku, ',', '</Name><Name>'),
'</Name>') AS `Products`) AS A

SQL Server T-SQL breaking a string into a temp table for a join

We have a SQL Server Scalar Function and part of the process is to take one of the input values and do the following
'inputvalue'
Create a table variable and populate with the following rows
inputvalue
inputvalu
inputval
inputva
inputv
input
inpu
inp
Then this table is joined to a query, ordered by len of the inputvalue desc and returns the top 1. The actual code is here
DECLARE @Result NVARCHAR(20);
DECLARE @tempDialCodes TABLE (tempDialCode NVARCHAR(20));
DECLARE @counter INT = LEN(@PhoneNumber);
WHILE @counter > 2
BEGIN
    INSERT INTO @tempDialCodes(tempDialCode) VALUES(@PhoneNumber);
    SET @PhoneNumber = SUBSTRING(@PhoneNumber, 1, @counter - 1);
    SET @counter = @counter - 1;
END
SET @Result = (SELECT TOP 1 [DialCodeID]
               FROM DialCodes dc JOIN @tempDialCodes s
               ON dc.DialCode = s.tempDialCode
               ORDER BY LEN(DialCode) DESC);
RETURN @Result
It works, but I am asking whether there is a way to replace the WHILE loop and somehow join to the input value to get the same result. When I say it works, I mean it returns the right answer; it is too damn slow.
I'm stumped on how to break up this string without using a loop and a table variable, and my warning light tells me this approach is not efficient for running against a table with a million rows.
Are you familiar with tally tables? The speed difference can be incredible. I try to replace every loop with a tally table if possible. The only time I haven't been able to so far is when calling a proc from within a cursor. If using this solution I would recommend a permanent dbo.Tally table with a sufficiently large size rather than recreating every time in the function. You will find other uses for it!
declare @PhoneNumber nvarchar(20) = 'inputvalue';
declare @tempDialCodes table (tempDialCode nvarchar(20));
--create and populate a tally table if you don't already have a permanent one
--arbitrary 1000 rows for demo...you should figure out if that is enough
--this is a 1-based tally table - you will need to tweak if you make it 0-based
declare @Tally table (N int primary key);
insert @Tally
select top (1000) row_number() over (order by o1.object_id) from sys.columns o1, sys.columns o2 order by 1;
--select * from @Tally order by N;
insert @tempDialCodes
select substring(@PhoneNumber, 1, t.N)
from @Tally t
where t.N between 3 and len(@PhoneNumber)
order by t.N desc;
select *
from @tempDialCodes
order by len(tempDialCode) desc;
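The table variable can be skipped altogether by joining the tally rows straight to the lookup table. A sketch, assuming a permanent dbo.Tally table with an integer column N (the recommendation above; the exact name and column are hypothetical) and the DialCodes table with DialCode and DialCodeID columns from the question:

```sql
DECLARE @PhoneNumber nvarchar(20) = N'inputvalue';

-- Each tally row N between 3 and LEN(@PhoneNumber) yields one prefix of the input;
-- the longest prefix that exists in DialCodes wins.
SELECT TOP (1) dc.DialCodeID
FROM dbo.Tally AS t
INNER JOIN DialCodes AS dc
    ON dc.DialCode = SUBSTRING(@PhoneNumber, 1, t.N)
WHERE t.N BETWEEN 3 AND LEN(@PhoneNumber)
ORDER BY t.N DESC;
```

Inside the scalar function, the same SELECT can be assigned to the result variable in place of the loop and the join on the table variable.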

SQL: GROUP BY Clause for Comma Separated Values

Can anyone help me check for duplicate values among multiple comma-separated values? I have a customer table in which one can insert multiple comma-separated contact numbers, and I want to check for duplicate values based on the last five digits. For reference, see the attached screenshot; the required output is:
contact_no.     count
97359506775     2
390558073039    1
904462511251    1
I would advise you to redesign your database schema, if possible. Your current database violates First Normal Form since your attribute values are not indivisible.
Create a table where id together with a single phone number constitutes a key, this constraint enforces that no duplicates occur.
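A minimal sketch of such a table; the names dbo.CustomerPhone, CustomerId and Phone are hypothetical, since the question does not show the real schema:

```sql
-- One row per customer and phone number.
-- The composite primary key rejects the same number being stored twice for one customer.
CREATE TABLE dbo.CustomerPhone
(
    CustomerId int         NOT NULL,
    Phone      varchar(50) NOT NULL,
    CONSTRAINT PK_CustomerPhone PRIMARY KEY (CustomerId, Phone)
);
```

With this layout, counting duplicates becomes a plain GROUP BY on the Phone column with no string splitting.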
I don't remember the details, but I will try to describe the idea (it's something I used a long time ago):
Create a table-valued function that takes the id and the phone number string as input, generates a table of id and individual phone numbers, and returns it.
Use this function in a query, passing in the id and phone number string, so that for each id you get as many rows as there are phone numbers. CROSS APPLY or OUTER APPLY needs to be used.
Then you can check for the duplicates.
The function would be something like this:
CREATE FUNCTION udf_PhoneNumbers
(
     @Id INT
    ,@Phone VARCHAR(300)
) RETURNS @PhonesTable TABLE(Id INT, Phone VARCHAR(50))
BEGIN
    DECLARE @CommaIndex INT
    DECLARE @CurrentPosition INT
    DECLARE @StringLength INT
    DECLARE @PhoneNumber VARCHAR(50)
    SELECT @StringLength = LEN(@Phone)
    SELECT @CommaIndex = -1
    SELECT @CurrentPosition = 1
    --index is 1 based
    WHILE @CommaIndex < @StringLength AND @CommaIndex <> 0
    BEGIN
        SELECT @CommaIndex = CHARINDEX(',', @Phone, @CurrentPosition)
        IF @CommaIndex <> 0
            SELECT @PhoneNumber = SUBSTRING(@Phone, @CurrentPosition, @CommaIndex - @CurrentPosition)
        ELSE
            SELECT @PhoneNumber = SUBSTRING(@Phone, @CurrentPosition, @StringLength - @CurrentPosition + 1)
        SELECT @CurrentPosition = @CommaIndex + 1
        --insert into the declared return table (@PhonesTable)
        INSERT INTO @PhonesTable VALUES(@Id, @PhoneNumber)
    END
    RETURN
END
Then run a CROSS APPLY query:
SELECT
U.*
,UD.*
FROM yourtable U CROSS APPLY udf_PhoneNumbers(Userid, Phone) UD
This will give you a table with one row per phone number, on which you can run a query to find duplicates.
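The duplicate check itself can then be sketched as a GROUP BY over the CROSS APPLY result. This assumes udf_PhoneNumbers was created in the dbo schema and reuses the placeholder names yourtable, Userid and Phone from the query above; matching on the last five digits follows the question.

```sql
SELECT RIGHT(LTRIM(RTRIM(UD.Phone)), 5) AS LastFiveDigits  -- trim spaces left around the commas
      ,COUNT(*)                         AS Occurrences
FROM yourtable AS U
CROSS APPLY dbo.udf_PhoneNumbers(U.Userid, U.Phone) AS UD
GROUP BY RIGHT(LTRIM(RTRIM(UD.Phone)), 5)
HAVING COUNT(*) > 1;  -- drop this line to list every value with its count
```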
