Traversing strings for multiple instances of substrings - freepascal or delphi - freepascal

Platform : Lazarus 1.1, FreePascal 2.7.1, Win 7 32-bit.
I have a string value as follows:
FileName[12345][45678][6789].jpg
By default (assume this is default behaviour 0), my program currently pulls out the last set of numbers from the last pair of square brackets to the farthest right of the filename, i.e. 6789. It does so using this code:
if chkbxOverrideUniqueID.Checked then
IDOverrideValue := StrToInt(edtToggleValue.Text);
// User is happy to find the right most unique ID
if not chkbxOverrideUniqueID.Checked then
LastSquareBracket := RPos(']', strFileName);
PreceedingSquareBracket := RPosEx('[', strFileName, LastSquareBracket) + 1;
strFileID := AnsiMidStr(strFileName, PreceedingSquareBracket, LastSquareBracket - PreceedingSquareBracket)
else // User doesn't want to find the rightmost ID.
// and now I am stuck!
However, I have now added an option for the user to specify a non-default behaviour. For example, if they enter '1', that means "look for the first ID in from the farthest-right ID", i.e. [45678], because [6789] is default behaviour 0. If they enter 2, I want it to find [12345].
My question : How do I adapt the above code to achieve this, please?

The following code will return just the numeric value between brackets:
uses
StrUtils;
function GetNumber(const Text: string; Index: Integer): string;
var
I: Integer;
OpenPos: Integer;
ClosePos: Integer;
begin
Result := '';
ClosePos := Length(Text) + 1;
for I := 0 to Index do
begin
ClosePos := RPosEx(']', Text, ClosePos - 1);
if ClosePos = 0 then
Exit;
end;
OpenPos := RPosEx('[', Text, ClosePos - 1);
if OpenPos <> 0 then
Result := Copy(Text, OpenPos + 1, ClosePos - OpenPos - 1);
end;
If you'd like that value including those brackets, replace the last line with this:
Result := Copy(Text, OpenPos, ClosePos - OpenPos + 1);
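For readers who want to check the search logic outside Pascal, here is a rough Python sketch of the same right-to-left scan. The function name get_number and the with_brackets flag are illustrative assumptions, not part of the answer above; str.rfind plays the role of RPosEx.

```python
def get_number(text: str, index: int, with_brackets: bool = False) -> str:
    """Return the bracketed value `index` groups in from the right (0 = rightmost)."""
    close_pos = len(text)
    for _ in range(index + 1):
        # rfind looks only in text[0:close_pos], i.e. left of the previous ']'
        close_pos = text.rfind("]", 0, close_pos)
        if close_pos == -1:
            return ""  # fewer bracket groups than requested
    open_pos = text.rfind("[", 0, close_pos)
    if open_pos == -1:
        return ""
    if with_brackets:
        return text[open_pos:close_pos + 1]
    return text[open_pos + 1:close_pos]


name = "FileName[12345][45678][6789].jpg"
print(get_number(name, 0))  # 6789
print(get_number(name, 1))  # 45678
print(get_number(name, 2))  # 12345
```

An empty string is returned when the file name has fewer bracket groups than the requested index, which mirrors the early Exit in the Pascal version.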

Related

How to decode BASE64 text (from JSON data) in TSQL with accents intact [duplicate]

I have a column in SQL Server with SQL_Latin1_General_CP1_CI_AS collation that holds UTF-8 encoded text. How can I convert and save the text in ISO 8859-1 encoding? I would like to do this in a query on SQL Server. Any tips?
OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito
I have written a function to repair UTF-8 text that is stored in a varchar field.
To check the fixed values you can use it like this:
CREATE TABLE #Table1 (Column1 varchar(max))
INSERT #Table1
VALUES ('OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito')
SELECT *, NewColumn1 = dbo.DecodeUTF8String(Column1)
FROM #Table1
WHERE Column1 <> dbo.DecodeUTF8String(Column1)
Output:
Column1
-------------------------------
OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito
NewColumn1
-------------------------------
Olá. Gostei do jogo. Quando "baixei" até achei que não iria curtir muito
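To see what kind of repair is involved, here is a small Python sketch of the same idea outside SQL Server. It assumes the damaged text is UTF-8 bytes that were read as code page 1252 (the code page behind SQL_Latin1_General_CP1); the function name repair_mojibake is made up for this illustration.

```python
def repair_mojibake(text: str) -> str:
    # Turn the characters back into the bytes they were wrongly decoded from,
    # then decode those bytes as UTF-8.
    return text.encode("cp1252").decode("utf-8")


# "\u00c3\u00a1" is the two-character sequence that a UTF-8 encoded "á" turns into.
damaged = "Ol\u00c3\u00a1. Gostei do jogo."
print(repair_mojibake(damaged))
```

This raises UnicodeEncodeError or UnicodeDecodeError when the text is not actually double-decoded UTF-8, so it is best applied only to rows already known to be damaged.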
The code:
CREATE FUNCTION dbo.DecodeUTF8String (@value varchar(max))
RETURNS nvarchar(max)
AS
BEGIN
-- Transforms a UTF-8 encoded varchar string into Unicode
-- By Anthony Faull 2014-07-31
DECLARE @result nvarchar(max);
-- If ASCII or null there's no work to do
IF (@value IS NULL
OR @value NOT LIKE '%[^ -~]%' COLLATE Latin1_General_BIN
)
RETURN @value;
-- Generate all integers from 1 to the length of string
WITH e0(n) AS (SELECT TOP(POWER(2,POWER(2,0))) NULL FROM (VALUES (NULL),(NULL)) e(n))
, e1(n) AS (SELECT TOP(POWER(2,POWER(2,1))) NULL FROM e0 CROSS JOIN e0 e)
, e2(n) AS (SELECT TOP(POWER(2,POWER(2,2))) NULL FROM e1 CROSS JOIN e1 e)
, e3(n) AS (SELECT TOP(POWER(2,POWER(2,3))) NULL FROM e2 CROSS JOIN e2 e)
, e4(n) AS (SELECT TOP(POWER(2,POWER(2,4))) NULL FROM e3 CROSS JOIN e3 e)
, e5(n) AS (SELECT TOP(POWER(2.,POWER(2,5)-1)-1) NULL FROM e4 CROSS JOIN e4 e)
, numbers(position) AS
(
SELECT TOP(DATALENGTH(@value)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM e5
)
-- UTF-8 Algorithm (http://en.wikipedia.org/wiki/UTF-8)
-- For each octet, count the high-order one bits, and extract the data bits.
, octets AS
(
SELECT position, highorderones, partialcodepoint
FROM numbers a
-- Split UTF8 string into rows of one octet each.
CROSS APPLY (SELECT octet = ASCII(SUBSTRING(@value, position, 1))) b
-- Count the number of leading one bits
CROSS APPLY (SELECT highorderones = 8 - FLOOR(LOG( ~CONVERT(tinyint, octet) * 2 + 1)/LOG(2))) c
CROSS APPLY (SELECT databits = 7 - highorderones) d
CROSS APPLY (SELECT partialcodepoint = octet % POWER(2, databits)) e
)
-- Compute the Unicode codepoint for each sequence of 1 to 4 bytes
, codepoints AS
(
SELECT position, codepoint
FROM
(
-- Get the starting octet for each sequence (i.e. exclude the continuation bytes)
SELECT position, highorderones, partialcodepoint
FROM octets
WHERE highorderones <> 1
) lead
CROSS APPLY (SELECT sequencelength = CASE WHEN highorderones in (1,2,3,4) THEN highorderones ELSE 1 END) b
CROSS APPLY (SELECT endposition = position + sequencelength - 1) c
CROSS APPLY
(
-- Compute the codepoint of a single UTF-8 sequence
SELECT codepoint = SUM(POWER(2, shiftleft) * partialcodepoint)
FROM octets
CROSS APPLY (SELECT shiftleft = 6 * (endposition - position)) b
WHERE position BETWEEN lead.position AND endposition
) d
)
-- Concatenate the codepoints into a Unicode string
SELECT @result = CONVERT(xml,
(
SELECT NCHAR(codepoint)
FROM codepoints
ORDER BY position
FOR XML PATH('')
)).value('.', 'nvarchar(max)');
RETURN @result;
END
GO
Jason Penny has also written an SQL function to convert UTF-8 to Unicode (MIT licence) which worked on a simple example for me:
CREATE FUNCTION dbo.UTF8_TO_NVARCHAR(@in VarChar(MAX))
RETURNS NVarChar(MAX)
AS
BEGIN
DECLARE @out NVarChar(MAX), @i int, @c int, @c2 int, @c3 int, @nc int
SELECT @i = 1, @out = ''
WHILE (@i <= Len(@in))
BEGIN
SET @c = Ascii(SubString(@in, @i, 1))
IF (@c < 128)
BEGIN
SET @nc = @c
SET @i = @i + 1
END
ELSE IF (@c > 191 AND @c < 224)
BEGIN
SET @c2 = Ascii(SubString(@in, @i + 1, 1))
SET @nc = (((@c & 31) * 64 /* << 6 */) | (@c2 & 63))
SET @i = @i + 2
END
ELSE
BEGIN
SET @c2 = Ascii(SubString(@in, @i + 1, 1))
SET @c3 = Ascii(SubString(@in, @i + 2, 1))
SET @nc = (((@c & 15) * 4096 /* << 12 */) | ((@c2 & 63) * 64 /* << 6 */) | (@c3 & 63))
SET @i = @i + 3
END
SET @out = @out + NChar(@nc)
END
RETURN @out
END
GO
The ticked answer by Anthony "looks" better to me, but if you are doing a conversion it may be worth running both and investigating any discrepancies.
We also used the very ugly code below to detect Unicode characters in the Basic Multilingual Plane that were encoded as UTF-8 and then converted from varchar to nvarchar fields, so that they can be converted to UTF-16.
LIKE (N'%[' + CONVERT(NVARCHAR,(CHAR(192))) + CONVERT(NVARCHAR,(CHAR(193))) + CONVERT(NVARCHAR,(CHAR(194))) + CONVERT(NVARCHAR,(CHAR(195))) + CONVERT(NVARCHAR,(CHAR(196))) + CONVERT(NVARCHAR,(CHAR(197))) + CONVERT(NVARCHAR,(CHAR(198))) + CONVERT(NVARCHAR,(CHAR(199))) + CONVERT(NVARCHAR,(CHAR(200))) + CONVERT(NVARCHAR,(CHAR(201))) + CONVERT(NVARCHAR,(CHAR(202))) + CONVERT(NVARCHAR,(CHAR(203))) + CONVERT(NVARCHAR,(CHAR(204))) + CONVERT(NVARCHAR,(CHAR(205))) + CONVERT(NVARCHAR,(CHAR(206))) + CONVERT(NVARCHAR,(CHAR(207))) + CONVERT(NVARCHAR,(CHAR(208))) + CONVERT(NVARCHAR,(CHAR(209))) + CONVERT(NVARCHAR,(CHAR(210))) + CONVERT(NVARCHAR,(CHAR(211))) + CONVERT(NVARCHAR,(CHAR(212))) + CONVERT(NVARCHAR,(CHAR(213))) + CONVERT(NVARCHAR,(CHAR(214))) + CONVERT(NVARCHAR,(CHAR(215))) + CONVERT(NVARCHAR,(CHAR(216))) + CONVERT(NVARCHAR,(CHAR(217))) + CONVERT(NVARCHAR,(CHAR(218))) + CONVERT(NVARCHAR,(CHAR(219))) + CONVERT(NVARCHAR,(CHAR(220))) + CONVERT(NVARCHAR,(CHAR(221))) + CONVERT(NVARCHAR,(CHAR(222))) + CONVERT(NVARCHAR,(CHAR(223))) + CONVERT(NVARCHAR,(CHAR(224))) + CONVERT(NVARCHAR,(CHAR(225))) + CONVERT(NVARCHAR,(CHAR(226))) + CONVERT(NVARCHAR,(CHAR(227))) + CONVERT(NVARCHAR,(CHAR(228))) + CONVERT(NVARCHAR,(CHAR(229))) + CONVERT(NVARCHAR,(CHAR(230))) + CONVERT(NVARCHAR,(CHAR(231))) + CONVERT(NVARCHAR,(CHAR(232))) + CONVERT(NVARCHAR,(CHAR(233))) + CONVERT(NVARCHAR,(CHAR(234))) + CONVERT(NVARCHAR,(CHAR(235))) + CONVERT(NVARCHAR,(CHAR(236))) + CONVERT(NVARCHAR,(CHAR(237))) + CONVERT(NVARCHAR,(CHAR(238))) + CONVERT(NVARCHAR,(CHAR(239)))
+ N'][' + CONVERT(NVARCHAR,(CHAR(128))) + CONVERT(NVARCHAR,(CHAR(129))) + CONVERT(NVARCHAR,(CHAR(130))) + CONVERT(NVARCHAR,(CHAR(131))) + CONVERT(NVARCHAR,(CHAR(132))) + CONVERT(NVARCHAR,(CHAR(133))) + CONVERT(NVARCHAR,(CHAR(134))) + CONVERT(NVARCHAR,(CHAR(135))) + CONVERT(NVARCHAR,(CHAR(136))) + CONVERT(NVARCHAR,(CHAR(137))) + CONVERT(NVARCHAR,(CHAR(138))) + CONVERT(NVARCHAR,(CHAR(139))) + CONVERT(NVARCHAR,(CHAR(140))) + CONVERT(NVARCHAR,(CHAR(141))) + CONVERT(NVARCHAR,(CHAR(142))) + CONVERT(NVARCHAR,(CHAR(143))) + CONVERT(NVARCHAR,(CHAR(144))) + CONVERT(NVARCHAR,(CHAR(145))) + CONVERT(NVARCHAR,(CHAR(146))) + CONVERT(NVARCHAR,(CHAR(147))) + CONVERT(NVARCHAR,(CHAR(148))) + CONVERT(NVARCHAR,(CHAR(149))) + CONVERT(NVARCHAR,(CHAR(150))) + CONVERT(NVARCHAR,(CHAR(151))) + CONVERT(NVARCHAR,(CHAR(152))) + CONVERT(NVARCHAR,(CHAR(153))) + CONVERT(NVARCHAR,(CHAR(154))) + CONVERT(NVARCHAR,(CHAR(155))) + CONVERT(NVARCHAR,(CHAR(156))) + CONVERT(NVARCHAR,(CHAR(157))) + CONVERT(NVARCHAR,(CHAR(158))) + CONVERT(NVARCHAR,(CHAR(159))) + CONVERT(NVARCHAR,(CHAR(160))) + CONVERT(NVARCHAR,(CHAR(161))) + CONVERT(NVARCHAR,(CHAR(162))) + CONVERT(NVARCHAR,(CHAR(163))) + CONVERT(NVARCHAR,(CHAR(164))) + CONVERT(NVARCHAR,(CHAR(165))) + CONVERT(NVARCHAR,(CHAR(166))) + CONVERT(NVARCHAR,(CHAR(167))) + CONVERT(NVARCHAR,(CHAR(168))) + CONVERT(NVARCHAR,(CHAR(169))) + CONVERT(NVARCHAR,(CHAR(170))) + CONVERT(NVARCHAR,(CHAR(171))) + CONVERT(NVARCHAR,(CHAR(172))) + CONVERT(NVARCHAR,(CHAR(173))) + CONVERT(NVARCHAR,(CHAR(174))) + CONVERT(NVARCHAR,(CHAR(175))) + CONVERT(NVARCHAR,(CHAR(176))) + CONVERT(NVARCHAR,(CHAR(177))) + CONVERT(NVARCHAR,(CHAR(178))) + CONVERT(NVARCHAR,(CHAR(179))) + CONVERT(NVARCHAR,(CHAR(180))) + CONVERT(NVARCHAR,(CHAR(181))) + CONVERT(NVARCHAR,(CHAR(182))) + CONVERT(NVARCHAR,(CHAR(183))) + CONVERT(NVARCHAR,(CHAR(184))) + CONVERT(NVARCHAR,(CHAR(185))) + CONVERT(NVARCHAR,(CHAR(186))) + CONVERT(NVARCHAR,(CHAR(187))) + CONVERT(NVARCHAR,(CHAR(188))) + CONVERT(NVARCHAR,(CHAR(189))) + 
CONVERT(NVARCHAR,(CHAR(190))) + CONVERT(NVARCHAR,(CHAR(191)))
+ N']%') COLLATE Latin1_General_BIN
The above:
detects multi-byte sequences encoding U+0080 to U+FFFF (U+0080 to U+07FF is encoded as 110xxxxx 10xxxxxx, U+0800 to U+FFFF is encoded as 1110xxxx 10xxxxxx 10xxxxxx)
i.e. it detects hex byte 0xC0 to 0xEF followed by hex byte 0x80 to 0xBF
ignores ASCII control characters U+0000 to U+001F
ignores characters that are already correctly encoded to unicode >= U+0100 (i.e. not UTF-8)
ignores unicode characters U+0080 to U+00FF if they don't appear to be part of a UTF-8 sequence e.g. "coöperatief".
doesn't use LIKE "%[X-Y]" for X=0x80 to Y=0xBF because of potential collation issues
uses CONVERT(NVARCHAR,CHAR(X)) instead of NCHAR(X) because we had problems with NCHAR getting converted to the wrong value (for some values).
ignores UTF characters greater than U+FFFF (4 to 6 byte sequences which have a first byte of hex 0xF0 to 0xFD)
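The detection rule described in the list above (a lead byte 0xC0 to 0xEF followed by a continuation byte 0x80 to 0xBF) can be sketched in Python as follows. This is only an illustration under a simplifying assumption: the text was decoded as ISO 8859-1, so every byte maps to the code point with the same number. The function name is hypothetical.

```python
import re

# A lead byte 0xC0-0xEF immediately followed by a continuation byte 0x80-0xBF,
# expressed on code points because of the ISO 8859-1 assumption above.
_UTF8_PAIR = re.compile(r"[\u00C0-\u00EF][\u0080-\u00BF]")


def looks_like_utf8_in_latin1(text: str) -> bool:
    """True if the text contains what looks like an undecoded UTF-8 sequence."""
    return _UTF8_PAIR.search(text) is not None


print(looks_like_utf8_in_latin1("Ol\u00c3\u00a1"))  # True: 0xC3 0xA1 is UTF-8 for "á"
print(looks_like_utf8_in_latin1("caf\u00e9 bar"))   # False: 0xE9 is followed by a space
```

Under code page 1252 the bytes 0x80 to 0x9F map to other code points (for example 0x80 becomes U+20AC), so a faithful port would need to translate those first.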
I made a solution that also handles 4 byte sequences (like emojis) by combining the answer from @robocat, some more cases with the logic taken from https://github.com/benkasminbullock/unicode-c/blob/master/unicode.c, and a solution for the problem of encoding extended unicode characters from https://dba.stackexchange.com/questions/139551/how-do-i-set-a-sql-server-unicode-nvarchar-string-to-an-emoji-or-supplementary. It's not fast or pretty, but it's working for me anyway. This particular solution includes Unicode replacement characters wherever it finds unknown bytes. It may be better just to throw an exception in these cases, or leave the bytes as they were, as future encoding could be off, but I preferred this for my use case.
-- Started with https://stackoverflow.com/questions/28168055/convert-text-value-in-sql-server-from-utf8-to-iso-8859-1
-- Modified following source in https://github.com/benkasminbullock/unicode-c/blob/master/unicode.c
-- Made characters > 65535 work using https://dba.stackexchange.com/questions/139551/how-do-i-set-a-sql-server-unicode-nvarchar-string-to-an-emoji-or-supplementary
CREATE FUNCTION dbo.UTF8_TO_NVARCHAR(@in VarChar(MAX)) RETURNS NVarChar(MAX) AS
BEGIN
DECLARE @out NVarChar(MAX), @thisOut NVARCHAR(MAX), @i int, @c int, @c2 int, @c3 int, @c4 int
SELECT @i = 1, @out = ''
WHILE (@i <= Len(@in)) BEGIN
SET @c = Ascii(SubString(@in, @i, 1))
IF @c <= 0x7F BEGIN
SET @thisOut = NCHAR(@c)
SET @i = @i + 1
END
ELSE IF @c BETWEEN 0xC2 AND 0xDF BEGIN
SET @c2 = Ascii(SubString(@in, @i + 1, 1))
IF @c2 < 0x80 OR @c2 > 0xBF BEGIN
SET @thisOut = NCHAR(0xFFFD)
SET @i = @i + 1
END
ELSE BEGIN
SET @thisOut = NCHAR(((@c & 31) * 64 /* << 6 */) | (@c2 & 63))
SET @i = @i + 2
END
END
ELSE IF @c BETWEEN 0xE0 AND 0xEF BEGIN
SET @c2 = Ascii(SubString(@in, @i + 1, 1))
SET @c3 = Ascii(SubString(@in, @i + 2, 1))
-- Both continuation bytes must lie in 0x80..0xBF
IF @c2 < 0x80 OR @c2 > 0xBF OR @c3 < 0x80 OR @c3 > 0xBF OR (@c = 0xE0 AND @c2 < 0xA0) BEGIN
SET @thisOut = NCHAR(0xFFFD)
SET @i = @i + 1
END
ELSE BEGIN
SET @thisOut = NCHAR(((@c & 15) * 4096 /* << 12 */) | ((@c2 & 63) * 64 /* << 6 */) | (@c3 & 63))
SET @i = @i + 3
END
END
ELSE IF @c BETWEEN 0xF0 AND 0xF4 BEGIN
SET @c2 = Ascii(SubString(@in, @i + 1, 1))
SET @c3 = Ascii(SubString(@in, @i + 2, 1))
SET @c4 = Ascii(SubString(@in, @i + 3, 1))
IF @c2 < 0x80 OR @c2 >= 0xC0 OR @c3 < 0x80 OR @c3 >= 0xC0 OR @c4 < 0x80 OR @c4 >= 0xC0 OR (@c = 0xF0 AND @c2 < 0x90) BEGIN
SET @thisOut = NCHAR(0xFFFD)
SET @i = @i + 1
END
ELSE BEGIN
DECLARE @nc INT = (((@c & 0x07) * 262144 /* << 18 */) | ((@c2 & 0x3F) * 4096 /* << 12 */) | ((@c3 & 0x3F) * 64) | (@c4 & 0x3F))
DECLARE @HighSurrogateInt INT = 55232 + (@nc / 1024), @LowSurrogateInt INT = 56320 + (@nc % 1024)
SET @thisOut = NCHAR(@HighSurrogateInt) + NCHAR(@LowSurrogateInt)
SET @i = @i + 4
END
END
ELSE BEGIN
SET @thisOut = NCHAR(0xFFFD)
SET @i = @i + 1
END
SET @out = @out + @thisOut
END
RETURN @out
END
GO
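The surrogate arithmetic in the four-byte branch is easy to get wrong, so here is a small Python sketch of just that step. The function name to_surrogates is an assumption for illustration; the constants are the same ones used above (55232 is 0xD7C0, 56320 is 0xDC00).

```python
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    high = 0xD7C0 + (code_point >> 10)   # same as 55232 + (nc / 1024)
    low = 0xDC00 + (code_point & 0x3FF)  # same as 56320 + (nc % 1024)
    return high, low


high, low = to_surrogates(0x1F600)  # U+1F600, an emoji
print(hex(high), hex(low))  # 0xd83d 0xde00
```

Using 0xD7C0 rather than 0xD800 folds the usual "subtract 0x10000 first" step into the constant, because 0x10000 shifted right by 10 bits is 0x40.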
I made a small modification to use the STRING_AGG aggregate function, which is available from SQL Server 2017 onwards:
SELECT @result=STRING_AGG(NCHAR([codepoint]),'') WITHIN GROUP (ORDER BY position ASC)
FROM codepoints
Replace the part that assigns the result with this one. The XML approach still works on older versions.
On SQL Server 2019, STRING_AGG runs much faster than the XML version (unsurprising, since STRING_AGG is native, so the comparison is not quite fair).
Here's my version written as an inline table-valued function (TVF) for SQL Server 2017. It is limited to 4000 byte input strings as that was more than enough for my needs. Limiting the input size and writing it as a TVF makes this version significantly faster than the scalar-valued functions posted so far. It also handles four-byte UTF-8 sequences (such as those created by emoji), which cannot be represented in UCS-2 strings, by outputting a replacement character in their place.
CREATE OR ALTER FUNCTION [dbo].[fnUTF8Decode](@UTF8 VARCHAR(4001)) RETURNS TABLE AS RETURN
/* Converts a UTF-8 encoded VARCHAR to NVARCHAR (UCS-2). Based on UTF-8 documentation on Wikipedia and the
code/discussion at https://stackoverflow.com/a/31064459/1979220.
One can quickly detect strings that need conversion using the following expression:
<FIELD> LIKE CONCAT('%[', CHAR(192), '-', CHAR(255), ']%') COLLATE Latin1_General_BIN.
Be aware, however, that this may return true for strings that this function has already converted to UCS-2.
See robocat's answer on the above referenced Stack Overflow thread for a slower but more robust expression.
Notes/Limitations
1) Written as an inline table-valued function for optimized performance.
2) Only tested on a database with SQL_Latin1_General_CP1_CI_AS collation. More specifically, this was
not designed to output Supplementary Characters and converts all such UTF-8 sequences to �.
3) Empty input strings, '', and strings with nothing but invalid UTF-8 chars are returned as NULL.
4) Assumes input is UTF-8 compliant. For example, extended ASCII characters such as en dash CHAR(150)
are not allowed unless part of a multi-byte sequence and will be skipped otherwise. In other words:
SELECT * FROM dbo.fnUTF8Decode(CHAR(150)) -> NULL
5) Input should be limited to 4000 characters to ensure that output will fit in NVARCHAR(4000), which is
what STRING_AGG outputs when fed a sequence of NVARCHAR(1) characters generated by NCHAR. However,
T-SQL silently truncates overlong parameters so we've declared our input as VARCHAR(4001) to allow
STRING_AGG to generate an error on overlong input. If we didn't do this, callers would never be
notified about truncation.
6) If we need to process more than 4000 chars in the future, we'll need to change input to VARCHAR(MAX) and
CAST the CASE WHEN expression to NVARCHAR(MAX) to force STRING_AGG to output NVARCHAR(MAX). Note that
this change will significantly degrade performance, which is why we didn't do it in the first place.
7) Due to use of STRING_AGG, this is only compatible with SQL 2017. It will probably work fine on 2019
but that version has native UTF-8 support so you're probably better off using that. For earlier versions,
replace STRING_AGG with a CLR equivalent (ms-sql-server-group-concat-sqlclr) or FOR XML PATH(''), TYPE...
*/
SELECT STRING_AGG (
CASE
WHEN A1 & 0xF0 = 0xF0 THEN --Four byte sequences (like emoji) can't be represented in UCS-2
NCHAR(0xFFFD) --Output U+FFFD (Replacement Character) instead
WHEN A1 & 0xE0 = 0xE0 THEN --Three byte sequence; get/combine relevant bits from A1-A3
NCHAR((A1 & 0x0F) * 4096 | (A2 & 0x3F) * 64 | (A3 & 0x3F))
WHEN A1 & 0xC0 = 0xC0 THEN --Two byte sequence; get/combine relevant bits from A1-A2
NCHAR((A1 & 0x3F) * 64 | (A2 & 0x3F))
ELSE NCHAR(A1) --Regular ASCII character; output as is
END
, '') UCS2
FROM dbo.fnNumbers(ISNULL(DATALENGTH(@UTF8), 0))
CROSS APPLY (SELECT ASCII(SUBSTRING(@UTF8, I, 1)) A1, ASCII(SUBSTRING(@UTF8, I + 1, 1)) A2, ASCII(SUBSTRING(@UTF8, I + 2, 1)) A3) A
WHERE A1 <= 127 OR A1 >= 192 --Output only ASCII chars and one char for each multi-byte sequence
GO
Note that the above requires a "Numbers" table or generator function. Here's the function I use:
CREATE OR ALTER FUNCTION [dbo].[fnNumbers](@MaxNumber BIGINT) RETURNS TABLE AS RETURN
/* Generates a table of numbers up to the specified @MaxNumber, limited to 4,294,967,296. Useful for special case
situations and algorithms. Copied from https://www.itprotoday.com/sql-server/virtual-auxiliary-table-numbers
with minor formatting and name changes.
*/
WITH L0 AS (
SELECT 1 I UNION ALL SELECT 1 --Generates 2 rows
), L1 AS (
SELECT 1 I FROM L0 CROSS JOIN L0 L -- 4 rows
), L2 AS (
SELECT 1 I FROM L1 CROSS JOIN L1 L -- 16 rows
), L3 AS (
SELECT 1 I FROM L2 CROSS JOIN L2 L -- 256 rows
), L4 AS (
SELECT 1 I FROM L3 CROSS JOIN L3 L -- 65,536 rows
), L5 AS (
SELECT 1 I FROM L4 CROSS JOIN L4 L -- 4,294,967,296 rows
), Numbers AS (
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) I FROM L5
)
SELECT TOP (@MaxNumber) I FROM Numbers ORDER BY I
GO
I just succeeded by creating a new field as varchar(255) and setting the new field to the old field which was nvarchar(255). This produced the 'Americanized' version of the international places.
Update WorldCities
Set admin_correct = admin_name
varchar(255) nvarchar(255)
I found the query I need to do, just not the encoding yet.
ALTER TABLE dbo.MyTable ALTER COLUMN CharCol
varchar(10)COLLATE Latin1_General_CI_AS NOT NULL;

Variables automatically changes its value

I'm doing this question with Pascal (Google Kick Start 2020 Round A - Workout) and I ran into a problem that doesn't make any sense at all. Here is a part of my program:
var N,K,i,max,max1 : longint;
M : array [1..100000] of longint;
A : array [1..99999] of longint;
begin
readln(N,K);
for i := 1 to N do
read(M[i]);
for i := 1 to N-1 do A[i] := M[i+1]-M[i];
max := 0;
for i := 1 to N-1 do
if A[i] >= max then
begin
max := A[i];
max1 := i;
end;
writeln('max = ',max); writeln('max1 = ',max1);
readln; readln;
end.
So first I type in all the input data which are:
5 6 and
9
10
20
26
30.
When I run the program, the value of max is 10 and the value of max1 is 2.
But when I change the way max gets its value, and do nothing at all to max1, the program becomes this:
uses crt;
var N,K,i,max,max1 : longint;
M : array [1..100000] of longint;
A : array [1..99999] of longint;
begin
readln(N,K);
for i := 1 to N do
read(M[i]);
for i := 1 to N-1 do A[i] := M[i+1]-M[i];
max := 0;
for i := 1 to N-1 do
if A[i] >= max then
begin
max := i;
max1 := i;
end;
writeln('max = ',max); writeln('max1 = ',max1);
readln; readln;
end.
I run the program, and suddenly both the values of max and max1 are 4. How can this happen? Should I delete Pascal?? By the way, if you can't install Pascal for some reason, go to https://www.onlinegdb.com/, select the Pascal language and paste in my program. Thanks for helping me!
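One likely explanation can be checked by tracing the loop by hand: in the second program max holds an index rather than a difference, so the test A[i] >= max compares each difference against the previous index. The Python sketch below replays both variants on the sample input; the function name and the store_index flag are invented for this illustration.

```python
def scan(diffs, store_index):
    """Replay the Pascal loop; store_index=True mimics `max := i`."""
    best = 0
    best_pos = None
    for i, d in enumerate(diffs, start=1):  # 1-based, like the Pascal arrays
        if d >= best:
            best = i if store_index else d
            best_pos = i
    return best, best_pos


heights = [9, 10, 20, 26, 30]
diffs = [b - a for a, b in zip(heights, heights[1:])]  # [1, 10, 6, 4]
print(scan(diffs, store_index=False))  # (10, 2)
print(scan(diffs, store_index=True))   # (4, 4)
```

With the differences 1, 10, 6, 4 every comparison against the previous index (0, 1, 2, 3) succeeds, so both variables end up at 4. The value of max1 did change, but only because the condition guarding its assignment depends on max.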

Functions giving different result with same parameters

I use GMLib to work with Google maps and now I have come to a point where I am very confused.
I have the functions GetDistance and GetHeading to calculate the distance and compass direction between 2 markers on my map.
When I call them from my procedure GetHeadingDistance I get the result I expect (distance and direction is correct)- aSearchCallInfo is a class containing info that needs to be updated with the values.
Now I am trying to add a function that lets the user press the right mouse button on the map and then get info about that location.
But in this case I get very wrong results. As far as I can tell from the results, it uses GMMarker.Items[1].Position as the source, even though I know that it is GMMarker.Items[0].Position that I pass as the parameter.
When I try to debug the functions by writing values to a textfile during calculation, I can see that it is the correct values it gets to work with at the correct position.
(GMMarker.Items[0].Position is the position of the user of the software)
Any ideas as what I could try to get this solved?
procedure TfrmQthMap.GMMapRightClick(Sender: TObject; LatLng: TLatLng; X, Y: Double);
var
MessageText: string;
LL: TLatLng;
Heading: double;
Distance: double;
Qra: string;
begin
if GMMarker.Count > 0 then
begin
LL := TLatLng.Create;
try
LL.Lat := LatLng.Lat;
LL.Lng := LatLng.Lng;
Heading := GetHeading(GMMarker.Items[0].Position, LL);
Distance := GetDistance(GMMarker.Items[0].Position, LL);
Qra := Maidenhead(LatLng.LngToStr, LatLng.LatToStr);
finally
FreeAndNil(LL);
end;
MessageText := 'Data for det sted du klikkede på: ' + sLineBreak + sLineBreak +
Format('Længdegrad: %s', [LatLng.LngToStr(Precision)]) + sLineBreak +
Format('Breddegrad: %s', [LatLng.LatToStr(Precision)]) + sLineBreak +
Format('Afstand: %6.1f km', [Distance]) + sLineBreak +
Format('Retning: %6.1f °', [Heading]) + sLineBreak +
Format('Lokator: %s', [Qra]);
ShowMessage(MessageText);
end;
end;
procedure TfrmQthMap.GetHeadingDistance(aSearchCallInfo: TCallInfo);
var
Heading: double;
Distance: double;
begin
if GMMarker.Count > 1 then
begin
Heading := GetHeading(GMMarker.Items[0].Position, GMMarker.Items[1].Position);
Distance := GetDistance(GMMarker.Items[0].Position, GMMarker.Items[1].Position);
barFooter.Panels[PanelDist].Text := Format('Afstand: %6.1f km', [Distance]);
barFooter.Panels[PanelDir].Text := Format('Retning: %6.1f°', [Heading]);
aSearchCallInfo.Distance := Format('%6.1f km', [Distance]);
aSearchCallInfo.Heading := Format('%6.1f °', [Heading]);
aSearchCallInfo.SaveToDatabase;
end;
end;
function TfrmQthMap.GetDistance(aOrigin, aDest: TLatLng): double;
var
Distance: double;
begin
Distance := TGeometry.ComputeDistanceBetween(GMMap, aOrigin, aDest);
Distance := Distance / 1000;
Result := Distance;
end;
function TfrmQthMap.GetHeading(aOrigin, aDest: TLatLng): double;
var
Heading: double;
begin
Heading := TGeometry.ComputeHeading(GMMap, aOrigin, aDest);
Heading := 180 + Heading;
Result := Heading;
end;

STGeomFromText error 24141: A number is expected at position 27 of the input. The input has ,

Can not find the reason why I can not pass the return value of an User Defined function directly to STGeomFromText. Please help.
declare @points nvarchar(max);
set @points = '43.6950681126962,-79.4046143496645,43.6959369175095,-79.3999794923712,43.6946181896527,-79.3994001349161,43.6778911368006,-79.3695525136617,43.6787446722787,-79.3714193302229,43.6760133178263,-79.3941859209041,43.6755011769934,-79.3969110453878,43.6906308086704,-79.4031123121585,43.6950681126962,-79.4046143496645';
/*-----------failed, return error 24141-*/
/*Msg 6522, Level 16, State 1, Line 12
A .NET Framework error occurred during execution of user-defined routine or aggregate "geometry":
System.FormatException: 24141: A number is expected at position 27 of the input. The input has ,.
System.FormatException:
at Microsoft.SqlServer.Types.OpenGisWktReader.RecognizeDouble()
at Microsoft.SqlServer.Types.OpenGisWktReader.ParseLineStringText()
at Microsoft.SqlServer.Types.OpenGisWktReader.ParsePolygonText()
at Microsoft.SqlServer.Types.OpenGisWktReader.ParseTaggedText(OpenGisType type)
at Microsoft.SqlServer.Types.OpenGisWktReader.Read(OpenGisType type, Int32 srid)
at Microsoft.SqlServer.Types.SqlGeometry.GeometryFromText(OpenGisType type, SqlChars text, Int32 srid)
*/
select COUNT(*) from t
where geometry::STGeomFromText( dbo.GeoCoordinateInBoundery(@points), 0) .STContains( geometry::STGeomFromText('Point(' + cast(t.Latitude as varchar(32)) + ' ' +
cast(t.Longitude as varchar(32)) + ')', 0)) = 1
/*----OK-------*/
declare @ss nvarchar(max)
set @ss = dbo.GeoCoordinateInBoundery(@points)
select COUNT(*) from t
where geometry::STGeomFromText( @ss, 0) .STContains( geometry::STGeomFromText('Point(' + cast(t.Latitude as varchar(32)) + ' ' +
cast(t.Longitude as varchar(32)) + ')', 0)) = 1
The valid syntax for WKT puts a space between lat and long value pairs and a comma between each coordinate.
So you should have:
set @points = '43.6950681126962 -79.4046143496645,43.6959369175095 -79.3999794923712,43.6946181896527 -79.3994001349161,43.6778911368006 -79.3695525136617,43.6787446722787 -79.3714193302229,43.6760133178263 -79.3941859209041,43.6755011769934 -79.3969110453878,43.6906308086704 -79.4031123121585,43.6950681126962 -79.4046143496645';
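If the coordinates arrive as one flat comma-separated list, the regrouping into WKT-style pairs can be done mechanically. The Python sketch below shows one way; the function name flat_to_wkt_pairs is hypothetical, and it only regroups the values (it does not add the POLYGON((...)) wrapper or validate the numbers).

```python
def flat_to_wkt_pairs(points: str) -> str:
    """Turn 'a,b,c,d' into 'a b,c d': a space inside each pair, a comma between pairs."""
    values = [v.strip() for v in points.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")
    pairs = zip(values[0::2], values[1::2])
    return ",".join(f"{first} {second}" for first, second in pairs)


print(flat_to_wkt_pairs("43.6950681126962,-79.4046143496645,43.6959369175095,-79.3999794923712"))
# 43.6950681126962 -79.4046143496645,43.6959369175095 -79.3999794923712
```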

Is there a Delphi standard function for escaping HTML?

I've got a report that's supposed to take a grid control and produce HTML output. One of the columns in the grid can display any of a number of values, or <Any>. When this gets output to HTML, of course, it ends up blank.
I could probably write up some routine to use StringReplace to turn that into &lt;Any&gt; so it would display this particular case correctly, but I figure there's probably one in the RTL somewhere that's already been tested and does it right. Anyone know where I could find it?
I am 99 % sure that such a function does not exist in the RTL (as of Delphi 2009). Of course - however - it is trivial to write such a function.
Update
HTTPUtil.HTMLEscape is what you are looking for:
function HTMLEscape(const Str: string): string;
I don't dare to publish the code here (copyright violation, probably), but the routine is very simple. It encodes "<", ">", "&", and """ to "&lt;", "&gt;", "&amp;", and "&quot;". It also replaces characters #92, #160..#255 to decimal codes, e.g. "&#92;".
This latter step is unnecessary if the file is UTF-8, and also illogical, because higher special characters, such as ∮ are left as they are, while lower special characters, such as ×, are encoded.
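As a language-neutral illustration of the four basic replacements, here is a minimal Python sketch (not the Delphi RTL code; the function name html_escape is made up for the example). The one detail that matters is the order: the ampersand has to be replaced first, otherwise the ampersands introduced by the other entities would be escaped a second time.

```python
def html_escape(text: str) -> str:
    """Escape the four characters that are special in HTML text and attributes."""
    replacements = (
        ("&", "&amp;"),  # must come first
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
    )
    for raw, entity in replacements:
        text = text.replace(raw, entity)
    return text


print(html_escape("<Any>"))  # &lt;Any&gt;
```

Python's standard library offers html.escape for the same job; the hand-written version is shown only to make the ordering visible.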
Update 2
In response to the answer by Stijn Sanders, I made a simple performance test.
program Project1;
{$APPTYPE CONSOLE}
uses
Windows, SysUtils;
var
t1, t2, t3, t4: Int64;
i: Integer;
str: string;
const
N = 100000;
function HTMLEncode(const Data: string): string;
var
i: Integer;
begin
result := '';
for i := 1 to length(Data) do
case Data[i] of
'<': result := result + '&lt;';
'>': result := result + '&gt;';
'&': result := result + '&amp;';
'"': result := result + '&quot;';
else
result := result + Data[i];
end;
end;
function HTMLEncode2(Data: string):string;
begin
Result:=
StringReplace(
StringReplace(
StringReplace(
StringReplace(
Data,
'&','&amp;',[rfReplaceAll]),
'<','&lt;',[rfReplaceAll]),
'>','&gt;',[rfReplaceAll]),
'"','&quot;',[rfReplaceAll]);
end;
begin
QueryPerformanceCounter(t1);
for i := 0 to N - 1 do
str := HTMLEncode('Testing. Is 3*4<3+4? Do you like "A & B"');
QueryPerformanceCounter(t2);
QueryPerformanceCounter(t3);
for i := 0 to N - 1 do
str := HTMLEncode2('Testing. Is 3*4<3+4? Do you like "A & B"');
QueryPerformanceCounter(t4);
Writeln(IntToStr(t2-t1));
Writeln(IntToStr(t4-t3));
Readln;
end.
The output is
532031
801969
It seems here is a small contest :) Here is a one more implementation:
function HTMLEncode3(const Data: string): string;
var
iPos, i: Integer;
procedure Encode(const AStr: String);
begin
Move(AStr[1], result[iPos], Length(AStr) * SizeOf(Char));
Inc(iPos, Length(AStr));
end;
begin
SetLength(result, Length(Data) * 6);
iPos := 1;
for i := 1 to length(Data) do
case Data[i] of
'<': Encode('&lt;');
'>': Encode('&gt;');
'&': Encode('&amp;');
'"': Encode('&quot;');
else
result[iPos] := Data[i];
Inc(iPos);
end;
SetLength(result, iPos - 1);
end;
Update 1: Updated initially provided incorrect code.
Update 2: And the times:
HTMLEncode : 2286508597
HTMLEncode2: 3577001647
HTMLEncode3: 361039770
I usually just use this code:
function HTMLEncode(Data:string):string;
begin
Result:=
StringReplace(
StringReplace(
StringReplace(
StringReplace(
StringReplace(
Data,
'&','&amp;',[rfReplaceAll]),
'<','&lt;',[rfReplaceAll]),
'>','&gt;',[rfReplaceAll]),
'"','&quot;',[rfReplaceAll]),
#13#10,'<br />'#13#10,[rfReplaceAll]);
end;
(copyright? it's open source)
Unit HTTPApp has a function called HTMLEncode. It has also other HTML/HTTP related functions.
I don't know in which Delphi version it was introduced, but the System.NetEncoding unit has the functions
TNetEncoding.HTML.Encode
TNetEncoding.HTML.Decode
You don't need external libraries for that anymore.
From unit Soap.HTTPUtil or simply HTTPUtil for older delphi versions, you can use
function HTMLEscape(const Str: string): string;
var
i: Integer;
begin
Result := '';
for i := Low(Str) to High(Str) do
begin
case Str[i] of
'<' : Result := Result + '&lt;'; { Do not localize }
'>' : Result := Result + '&gt;'; { Do not localize }
'&' : Result := Result + '&amp;'; { Do not localize }
'"' : Result := Result + '&quot;'; { Do not localize }
{$IFNDEF UNICODE}
#92, Char(160) .. #255 : Result := Result + '&#' + IntToStr(Ord(Str[ i ])) +';'; { Do not localize }
{$ELSE}
// NOTE: Not very efficient
#$0080..#$FFFF : Result := Result + '&#' + IntToStr(Ord(Str[ i ])) +';'; { Do not localize }
{$ENDIF}
else
Result := Result + Str[i];
end;
end;
end;
How about this way of replacing special characters:
function HtmlWeg(sS: String): String;
var
ix,cc: Integer;
sC, sR: String;
begin
result := sS;
ix := pos('\u00',sS);
while ix >0 do
begin
sc := copy(sS,ix+4,2) ;
cc := StrtoIntdef('$' +sC,32);
sR := '' + chr(cc);
sS := Stringreplace(sS, '\u00'+sC,sR,[rfreplaceall]) ;
ix := pos('\u00',sS);
end;
result := sS;
end;
My function combines the for-loop with a minimal reallocation of the string:
function HtmlEncode(const Value: string): string;
var
i: Integer;
begin
Result := Value;
i := 1;
while i <= Length(Result) do
begin
if Result[i] = '<' then
begin
Result[i] := '&';
Insert('lt;', Result, i + 1);
Inc(i, 4);
end
else if Result[i] = '>' then
begin
Result[i] := '&';
Insert('gt;', Result, i + 1);
Inc(i, 4);
end
else if Result[i] = '"' then
begin
Result[i] := '&';
Insert('quot;', Result, i + 1);
Inc(i, 6);
end
else if Result[i] = '&' then
begin
Insert('amp;', Result, i + 1);
Inc(i, 5);
end
else
Inc(i);
end;
end;
In Delphi, you have the function
THTMLEncoding.HTML.Encode