I know you can remove HTML tags with a command such as this:
REGEXP_REPLACE(overview, '<.+?>')
But some of the text has actual HTML encoding, where the application encoded characters such as single quotes as &#39; or &rsquo;
I'm assuming these are pretty standard. Is there a way to remove them and replace them with the actual character, or am I stuck with REPLACE and listing them?
Many thanks!
Use a proper XML parser:
with t (overview) as (
SELECT '<div><p>Some entities: &amp; &#39; &lt; &gt; to be handled </p></div>' from dual UNION ALL
SELECT '<html><head><title>Test</title></head><body><p>&lt;test&gt;</p></body></html>' from dual
)
SELECT x.*
FROM t
CROSS JOIN LATERAL (
SELECT LISTAGG(value) WITHIN GROUP (ORDER BY ROWNUM) AS text
FROM XMLTABLE(
'//*'
PASSING XMLTYPE(t.overview)
COLUMNS
value CLOB PATH './text()'
)
) x
Which outputs:
TEXT
Some entities: & ' < > to be handled
Test<test>
You can use utl_i18n.unescape_reference():
utl_i18n.unescape_reference(regexp_replace(overview, '<.+?>'))
As a demo:
-- sample data
with t (overview) as (
select '<div><p>Some entities: &amp; &#39; &lt; &gt; to be handled </p></div>'
from dual
)
select REGEXP_REPLACE(overview, '<.+?>') as result1,
utl_i18n.unescape_reference(regexp_replace(overview, '<.+?>')) as result2
from t
gets
RESULT1 | RESULT2
Some entities: &amp; &#39; &lt; &gt; to be handled | Some entities: & ' < > to be handled
I'm not endorsing (or attacking) the notion of using regular expressions; that's handled and refuted and discussed elsewhere. I'm just addressing the part about encoded entities.
Specifically I need to pick the part of field label from table tmp.label between delimiters <!-- rwbl_1 --> and <!-- rwbl_2 --> where label contains <span in order to be able to update that field for records erroneously formatted with HTML tags.
The result amounts to something like strip_tags. This is, obviously, only possible due to the presence of the abovementioned (or similar) delimiters.
Here is a simple solution for the abovementioned case to extract a substring given 2 delimiters.
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(label,
'<!-- rwbl_1 -->', -1),
'<!-- rwbl_2 -->', 1) AS cutout
FROM tmp.label
WHERE label LIKE '<span%'
ORDER BY 1
LIMIT 1;
More abstract:
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(string_field, # field name
'delimiter_1', -1), # take the part to the right of delimiter_1
'delimiter_2', 1) # take the part of that result to the left of delimiter_2
AS cutout
FROM my_table
WHERE my_condition;
And the update now works like this:
UPDATE my_table
SET string_field = SUBSTRING_INDEX(
SUBSTRING_INDEX(string_field,
'delimiter_1', -1),
'delimiter_2', 1)
WHERE my_condition;
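To make the two nested calls concrete, here is a minimal sketch that runs the same expression against a string literal instead of a table. The sample markup is made up for illustration; only the two rwbl delimiters come from the question.

```sql
-- MySQL: a literal stands in for the label column (the sample markup is hypothetical).
SELECT
  SUBSTRING_INDEX(
    SUBSTRING_INDEX(
      '<span class="x"><!-- rwbl_1 -->Clean label<!-- rwbl_2 --></span>',
      '<!-- rwbl_1 -->', -1),   -- keeps: Clean label<!-- rwbl_2 --></span>
    '<!-- rwbl_2 -->', 1)       -- keeps: Clean label
  AS cutout;
```

If a delimiter is missing from a row, SUBSTRING_INDEX returns its whole input for that step, so it is probably worth restricting the UPDATE to rows that contain both delimiters.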
I'd like to use FOR JSON to build a data payload for an HTTP Post call. My Source table can be recreated with this snippet:
drop table if exists #jsonData;
drop table if exists #jsonColumns;
select
'carat' [column]
into #jsonColumns
union
select 'cut' union
select 'color' union
select 'clarity' union
select 'depth' union
select 'table' union
select 'x' union
select 'y' union
select 'z'
select
0.23 carat
,'Ideal' cut
,'E' color
,'SI2' clarity
,61.5 depth
,55.0 [table]
,3.95 x
,3.98 y
,2.43 z
into #jsonData
union
select 0.21,'Premium','E','SI1',59.8,61.0,3.89,3.84,2.31 union
select 0.29,'Premium','I','VS2',62.4,58.0,4.2,4.23,2.63 union
select 0.31,'Good','J','SI2',63.3,58.0,4.34,4.35,2.75
;
The data needs to be formatted as follows:
{
"columns":["carat","cut","color","clarity","depth","table","x","y","z"],
"data":[
[0.23,"Ideal","E","SI2",61.5,55.0,3.95,3.98,2.43],
[0.21,"Premium","E","SI1",59.8,61.0,3.89,3.84,2.31],
[0.23,"Good","E","VS1",56.9,65.0,4.05,4.07,2.31],
[0.29,"Premium","I","VS2",62.4,58.0,4.2,4.23,2.63],
[0.31,"Good","J","SI2",63.3,58.0,4.34,4.35,2.75]
]
}
My attempt thus far is as follows:
select
(select * from #jsonColumns for json path) as [columns],
(select * from #jsonData for json path) as [data]
for json path, without_array_wrapper
However this returns arrays of objects rather than values, like so:
{
"columns":[
{"column":"carat"},
{"column":"clarity"},
{"column":"color"},
{"column":"cut"},
{"column":"depth"},
{"column":"table"},
{"column":"x"},
{"column":"y"},
{"column":"z"}
]...
}
How can I limit the arrays to only showing the values?
Honestly, this seems like it's going to be easier with string aggregation rather than using the JSON functionality.
Because you're using SQL Server 2016, you don't have access to STRING_AGG or CONCAT_WS, so the code is a lot longer. You have to make use of FOR XML PATH and STUFF instead and insert all the separators manually (which is why there are so many ',' in the CONCAT expression). This results in the below:
DECLARE @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
SELECT N'{' + @CRLF +
N' "columns":[' + STUFF((SELECT ',' + QUOTENAME(c.[name],'"')
FROM tempdb.sys.columns c
JOIN tempdb.sys.tables t ON c.object_id = t.object_id
WHERE t.[name] LIKE N'#jsonData%' --Like isn't needed if not a temporary table. Use the literal name.
ORDER BY c.column_id ASC
FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,1,N'') + N'],' + @CRLF +
N' "data":[' + @CRLF +
STUFF((SELECT N',' + @CRLF +
N' ' + CONCAT('[',JD.carat,',',QUOTENAME(JD.cut,'"'),',',QUOTENAME(JD.color,'"'),',',QUOTENAME(JD.clarity,'"'),',',JD.depth,',',JD.[table],',',JD.x,',',JD.y,',',JD.z,']')
FROM #jsonData JD
ORDER BY JD.carat ASC
FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,3,N'') + @CRLF +
N' ]' + @CRLF +
N'}';
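For comparison, here is a rough sketch of what the same payload could look like on SQL Server 2017 or later, where STRING_AGG and CONCAT_WS are available. The version requirement is an assumption (the question targets 2016), and the sketch swaps QUOTENAME for STRING_ESCAPE plus explicit double quotes; it also drops the line breaks and indentation, which JSON does not need.

```sql
-- Sketch for SQL Server 2017 or later (an assumption; the question targets 2016).
-- Uses the #jsonData temp table created in the question.
SELECT CONCAT(
    N'{"columns":[',
    (SELECT STRING_AGG(CAST(N'"' + STRING_ESCAPE(c.[name], 'json') + N'"' AS nvarchar(MAX)), N',')
                WITHIN GROUP (ORDER BY c.column_id)
       FROM tempdb.sys.columns c
      WHERE c.object_id = OBJECT_ID(N'tempdb..#jsonData')),
    N'],"data":[',
    (SELECT STRING_AGG(
                CAST(CONCAT(N'[',
                            CONCAT_WS(N',',
                                      JD.carat,
                                      N'"' + STRING_ESCAPE(JD.cut, 'json') + N'"',
                                      N'"' + STRING_ESCAPE(JD.color, 'json') + N'"',
                                      N'"' + STRING_ESCAPE(JD.clarity, 'json') + N'"',
                                      JD.depth, JD.[table], JD.x, JD.y, JD.z),
                            N']') AS nvarchar(MAX)),
                N',') WITHIN GROUP (ORDER BY JD.carat)
       FROM #jsonData JD),
    N']}') AS payload;
```

Note that CONCAT_WS skips NULL arguments, so a NULL column would shift the remaining values one position to the left; wrapping nullable columns in ISNULL(..., 'null') is one way to guard against that.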
I am trying to generate a tool tip text using SQL. The text generated is passed as Title Attribute in HTML. There needs to be some newline characters generated in the tool tip.
I have used the following -
CHAR(13); CHAR(10); <br>; \n.
However in all cases, I see the character itself in HTML and not a new line.
Any idea how to achieve this?
The SQL is something like this:
(SELECT
STUFF(';' + WOR.OrderNo + ' - ' + P.ProductNo + ' - ' + CAST(CAST(ROUND(WOR.OrderQuantity , 0) as int) as varchar(20)) + '; <br/> ', 1, 1, '')
FROM
[ORDER] WOR
JOIN
PRODUCT P
ON P.ID = WOR.ProductID
JOIN
PRODUCT_GROUP PGR
ON P.ID = PGR.ProductID FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)')
And the tooltip that I see is the following:
SMU_100000021 - A-WHEL-001 - 100;<br/>SMU_100000023 - A-WHEL-001 - 90;<br/>
The CHAR(10) did the trick.
(SELECT
STUFF(';' + WOR.OrderNo + ' - ' + P.ProductNo + ' - ' + CAST(CAST(ROUND(WOR.OrderQuantity , 0) as int) as varchar(20)) +' '+ CHAR(10) + ' ', 1, 1, '')
FROM
[ORDER] WOR
JOIN
PRODUCT P
ON P.ID = WOR.ProductID
JOIN
PRODUCT_GROUP PGR
ON P.ID = PGR.ProductID FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)')
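The same idea can be tried without the ORDER and PRODUCT tables. This is a minimal sketch in which a VALUES list (hypothetical sample rows taken from the tooltip shown in the question) replaces the joins, and CHAR(10) is used as the separator that STUFF then trims from the front.

```sql
-- T-SQL: the VALUES rows are stand-ins for the joined ORDER/PRODUCT data.
SELECT STUFF((SELECT CHAR(10) + v.line
                FROM (VALUES (1, 'SMU_100000021 - A-WHEL-001 - 100;'),
                             (2, 'SMU_100000023 - A-WHEL-001 - 90;')) AS v(n, line)
               ORDER BY v.n
                 FOR XML PATH(''), TYPE).value('.', 'nvarchar(MAX)'), 1, 1, '') AS tooltip;
```

The result is a single string with a line feed between the two entries, which is what the title attribute needs in order to break the line.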
I decided to ask here after failing to find an answer to my question. I think the solution is easy.
I have a SQL query like this:
SELECT
e.*,
STUFF((SELECT ',' + '<a href=test.aspx?tab=2&name=wil&item_id=[er.id]>' + er.name + '</a>'
FROM dbo.role er
FOR XML PATH ('')), 1, 1, '') [all]
FROM
dbo.worker e
The only problem is that the HTML tags in the output come back encoded, like this:
< = &lt;
& = &amp;
> = &gt;
How can I get the normal characters?
You can use the TYPE directive to avoid XML encoding:
SELECT e.*,
Stuff((SELECT ','
+ '<a href=test.aspx?tab=2&name=wil&item_id=[er.id]>'
+ er.NAME + '</a>'
FROM dbo.role er
FOR XML PATH (''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM dbo.worker e
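The effect of the TYPE directive is easy to see on a literal. In this small sketch the anchor string is a made-up sample; the first column goes through FOR XML PATH as plain text, the second is returned as xml and read back with .value().

```sql
-- T-SQL: the same literal serialized with and without the TYPE directive.
SELECT
    (SELECT '<a href=x>A & B</a>' FOR XML PATH(''))                                   AS encoded,
    (SELECT '<a href=x>A & B</a>' FOR XML PATH(''), TYPE).value('.', 'nvarchar(MAX)') AS decoded;
-- encoded: &lt;a href=x&gt;A &amp; B&lt;/a&gt;
-- decoded: <a href=x>A & B</a>
```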