SQL Server finding string in Unicode HEX - sql-server-2008

I need to find part of a string that is stored in a varbinary field on SQL Server 2008. The data stored in the field is hex binary.
For example the field has 0xA0000000001A000000000000000000000000000000000000000000000000000000850000002416002F002A002000480065006C006C006F0020002A002F0007
I can ignore the first 37 characters. When I look at this in a SQL application it reads as follows after the first 37 characters
/..H.e.l.l.o../
I know that the hex binary is stored in Unicode format.
My question is: how can I search for the word 'Hello' using a SQL statement?
I tried the following, but I do not get any results:
SELECT CONVERT(varchar(max),fieldname) from tablename where fieldname like '%Hello%';
I would really appreciate any help that I can get
Thanks

If the binary value is in a Unicode encoding, you will need to cast it to NVARCHAR, and also use a Unicode string literal for the search value:
SELECT CONVERT(nvarchar(max),fieldname) from tablename where CONVERT(nvarchar(max),fieldname) like N'%Hello%';
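To see why the single-byte search finds nothing, it can help to look at the bytes outside the database. The following is a minimal Python sketch, not SQL Server code; the header bytes are a made-up stand-in for the real prefix, and it assumes the text is stored as UTF-16LE (which is what NVARCHAR uses).

```python
# Hypothetical payload: a short binary header followed by UTF-16LE text.
# The header bytes are an assumption for illustration only.
header = bytes.fromhex("A0000000001A")
payload = header + "/* Hello */".encode("utf-16-le")

# 'Hello' as single-byte text: 48 65 6C 6C 6F
ascii_needle = "Hello".encode("ascii")
# 'Hello' as UTF-16LE: 48 00 65 00 6C 00 6C 00 6F 00
utf16_needle = "Hello".encode("utf-16-le")

found_as_ascii = ascii_needle in payload
found_as_utf16 = utf16_needle in payload

print(utf16_needle.hex().upper())
print(found_as_ascii, found_as_utf16)
```

Every character is followed by a zero byte, so the contiguous single-byte sequence for 'Hello' never occurs in the data, while the two-byte form does.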

Related

Filtering special characters in SQL Server

This is a question pertaining to SQL Server 2014. I have a table xxx. There is a column col1 of type varchar. The values in this column can have alphanumeric characters like 1A324G. There can also be special characters along with the alphanumeric ones, like !2A93C or #AC934D, etc.
There can be any special character (eg: !$#^().-_) in a value for this column. I wanted to extract data with only alphanumeric values and NOT any special characters in it. I was trying to use the LIKE clause with wildcard search pattern but I am not able to weed out the ones with only alphanumeric values.
Can someone please help me and let me know how I can do it?
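It's been a while since I've worked with SQL, but something like this should work. The pattern [^a-zA-Z0-9] matches any single character that is not alphanumeric, so NOT LIKE keeps only the rows that contain no such character.
SELECT *
FROM xxx
WHERE col1 NOT LIKE '%[^a-zA-Z0-9]%';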
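The underlying idea, rejecting any value that contains a character outside the allowed set, can be checked outside the database. This is a small Python sketch using the sample values from the question plus one made-up value (ABC123); it is not T-SQL and ignores collation effects.

```python
import re

# Sample values from the question, plus one hypothetical clean value.
values = ["1A324G", "!2A93C", "#AC934D", "ABC123"]

# Keep a value only if every character is a letter or a digit.
# '*' (rather than '+') also accepts the empty string, mirroring a
# "contains no disallowed character" test.
only_alnum = [v for v in values if re.fullmatch(r"[A-Za-z0-9]*", v)]

print(only_alnum)
```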

Equivalent of MySQL HEX / UNHEX function in SQL Server?

Does SQL Server have an equivalent to MySQL's HEX and UNHEX functions?
Update 2022: This answer is outdated; read about CONVERT() with binary types and its binary styles (1 and 2).
What are you going to do?
Something like a script generation?
There might be better approaches for your issue, but you do not provide many details...
Not quite as slim but this would work
--This will show up like needed, but it will not be a string
SELECT CAST('abc' AS VARBINARY(MAX))
--this is the string equivalent
SELECT sys.fn_varbintohexstr(CAST('abc' AS VARBINARY(MAX)));
--This will turn the string equivalent back to varbinary
SELECT sys.fn_cdc_hexstrtobin('0x616263')
--And this will return the original string
SELECT CAST(sys.fn_cdc_hexstrtobin('0x616263') AS VARCHAR(MAX));
### One hint
If you deal with Unicode, you can see the difference by using N'abc' instead of 'abc'; in the final line you would then have to convert '0x610062006300' to NVARCHAR.
### Another hint
If you need this more often, you might put it into a UDF; then it is as easy as with MySQL :-)
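For comparison, here is what the round trip looks like at the byte level. This is a Python sketch rather than T-SQL; it only illustrates which hex strings to expect for 'abc' in a single-byte encoding versus UTF-16LE (the encoding behind NVARCHAR).

```python
text = "abc"

# Single-byte encoding: one byte per character.
hex_single = text.encode("ascii").hex()
# UTF-16LE: two bytes per character, low byte first.
hex_utf16 = text.encode("utf-16-le").hex()

# Turning the hex strings back into text (the UNHEX direction).
round_trip_single = bytes.fromhex(hex_single).decode("ascii")
round_trip_utf16 = bytes.fromhex(hex_utf16).decode("utf-16-le")

print(hex_single, hex_utf16)
print(round_trip_single, round_trip_utf16)
```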

how mysql treat the hex in sql statement

I know a little about SQL injection in PHP, and I have read some posts about SQLi. The posts say that when hackers want to get the column names of a table, they use
select group_concat(column_name) from information_schema.columns where table_name=0x7573657273
In this statement, 0x7573657273 is the hex encoding of "users". Surprisingly, when I execute this statement in the MySQL console, it returns exactly the columns of the table users.
Does this mean MySQL automatically converts hex to a string, e.g. converts 0x7573657273 to "users" or 0x3D to "="?
Does this mean MySQL understands what the hex means?
In string contexts, MySQL treats hexadecimal values as binary strings, where each pair of hex digits is converted to a character.
Reference: http://dev.mysql.com/doc/refman/5.0/en/hexadecimal-literals.html
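The "pair of hex digits per character" rule is easy to verify by hand. The sketch below uses Python rather than MySQL, and assumes a single-byte (ASCII) interpretation of the bytes.

```python
# Each pair of hex digits is one byte; decode the bytes as ASCII text.
table_name = bytes.fromhex("7573657273").decode("ascii")
equals_sign = bytes.fromhex("3D").decode("ascii")

print(table_name)
print(equals_sign)
```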

MySQL REGEXP not producing expected results (not multi byte safe?). Is there a work around?

I'm trying to write a MySQL query to identify first name fields that actually contain initials. The problem is that the query is picking up records that should not match.
I have tested against the POSIX ERE regex implementation in RegEx Buddy to confirm my regex string is correct, but when running in a MySQL query, the results differ.
For example, the query should identify strings such as:
'A.J.D' or 'A J D'.
But it is also matching strings like 'Ralph' or 'Terrance'.
The query:
SELECT *, firstname REGEXP '^[a-zA-z]{1}(([[:space:]]|\.)+[a-zA-z]{1})+([[:space:]]|\.)?$' FROM test_table
The 'firstname' field here is VARCHAR 255 if that's relevant.
I get the same result when running with a string literal rather than table data:
SELECT 'Ralph' REGEXP '^[a-zA-z]{1}(([[:space:]]|\.)+[a-zA-z]{1})+([[:space:]]|\.)?$'
The MySQL documentation warns about potential issues with REGEXP, but I'm unsure whether this is related to the problem I'm seeing:
Warning The REGEXP and RLIKE operators work in byte-wise fashion, so
they are not multi-byte safe and may produce unexpected results with
multi-byte character sets. In addition, these operators compare
characters by their byte values and accented characters may not
compare as equal even if a given collation treats them as equal.
Thanks in advance.
If you're testing this in the mysql client, you need to escape the backslashes: each occurrence of \. must become \\. This is necessary because MySQL processes backslash escapes in the string literal before the pattern reaches the regex engine, which turns \. into a bare . that matches any character. Doubling the backslash keeps it in the pattern.
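The effect of losing the backslash can be reproduced with any regex engine. This is a rough Python sketch, not MySQL: Python's re module has no [[:space:]] class, so \s is substituted, the groups are made non-capturing, and the range [a-zA-z] from the question is written as [a-zA-Z] (the original range A-z also covers the punctuation between 'Z' and 'a', which is probably unintended).

```python
import re

# After escape processing, "\." has collapsed to "." (any character).
collapsed = r"^[a-zA-Z](?:(?:\s|.)+[a-zA-Z])+(?:\s|.)?$"
# With the backslash preserved, "\." matches only a literal period.
preserved = r"^[a-zA-Z](?:(?:\s|\.)+[a-zA-Z])+(?:\s|\.)?$"

names = ["A.J.D", "A J D", "Ralph", "Terrance"]

collapsed_hits = [n for n in names if re.search(collapsed, n)]
preserved_hits = [n for n in names if re.search(preserved, n)]

print(collapsed_hits)
print(preserved_hits)
```

With the bare dot, every name matches, which is the behaviour described in the question; with the escaped dot, only the initials match.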

Unicode (hexadecimal) character literals in MySQL

Is there a way to specify Unicode character literals in MySQL?
I want to replace a Unicode character with an Ascii character, something like the following:
Update MyTbl Set MyFld = Replace(MyFld, "ẏ", "y")
But I'm using even more obscure characters which are not available in most fonts, so I want to be able to use Unicode character literals, something like
Update MyTbl Set MyFld = Replace(MyFld, "\u1e8f", "y")
This SQL statement is being invoked from a PHP script - the first form is not only unreadable, but it doesn't actually work!
You can specify hexadecimal literals (or even binary literals) using 0x, x'', or X'':
select 0xC2A2;
select x'C2A2';
select X'C2A2';
But be aware that the return type is a binary string, so each and every byte is considered a character. You can verify this with char_length:
select char_length(0xC2A2)
2
If you want UTF-8 strings instead, you need to use convert:
select convert(0xC2A2 using utf8mb4)
And we can see that C2 A2 is considered 1 character in UTF-8:
select char_length(convert(0xC2A2 using utf8mb4))
1
Also, you don't have to worry about invalid bytes because convert will remove them automatically:
select char_length(convert(0xC1A2 using utf8mb4))
0
As can be seen, the output is 0 because C1 A2 is an invalid UTF-8 byte sequence.
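The byte-versus-character distinction can also be checked outside MySQL. This Python sketch mirrors the three cases above; note that Python raises an error on invalid UTF-8 by default, so the errors="ignore" option is used here only to imitate the "invalid bytes are dropped" behaviour described in the answer.

```python
valid = bytes.fromhex("C2A2")
byte_count = len(valid)                  # length of the binary string
char_count = len(valid.decode("utf-8"))  # length after decoding as UTF-8

invalid = bytes.fromhex("C1A2")
try:
    invalid.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Drop undecodable bytes instead of raising.
dropped = invalid.decode("utf-8", errors="ignore")

print(byte_count, char_count, is_valid_utf8, len(dropped))
```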
Thanks for your suggestions, but I think the problem was further back in the system.
There's a lot of levels to unpick, but as far as I can tell, (on this server at least) the command
set names utf8
makes the utf-8 handling work correctly, whereas
set character set utf8
doesn't.
In my environment, these are being called from PHP using PDO, for what difference that may make.
Thanks anyway!
You can use the hex and unhex functions, e.g.:
update mytable set myfield = unhex(replace(hex(myfield),'C383','C3'))
The MySQL string syntax is specified here, as you can see, there is no provision for numeric escape sequences.
However, as you are embedding the SQL in PHP, you can compute the right bytes in PHP. Make sure the bytes you put into the SQL actually match your client character set.
There is also the CHAR() function, which does what you wanted: you give it byte values and a character set name (CHAR(... USING charset)) and it returns the corresponding string.
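Computing the bytes in the host language might look like the sketch below. The question uses PHP; Python is used here purely for illustration, and the sketch assumes the connection character set is UTF-8 (utf8mb4). The resulting hex literal is only built and printed, not sent to a server.

```python
# The character from the question, written as a Unicode escape.
char = "\u1e8f"

# Its UTF-8 bytes as hex digits; these would form a MySQL hex literal.
utf8_hex = char.encode("utf-8").hex().upper()
hex_literal = "0x" + utf8_hex

print(hex_literal)
```

Under those assumptions, the literal could then be used in the statement via convert(... using utf8mb4), as in the earlier answer.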