difference between UNHEX and X (MySQL) - mysql

What really is the difference between MySQL UNHEX and X when dealing with hexadecimal values in a database?
Eg.
SELECT * FROM test WHERE guidCol IN (UNHEX('hexadecimalstring'));
SELECT * FROM test WHERE guidCol IN (X'hexadecimalstring');
Both gives me exact result set. So is there any difference? Performance implications?
Edit: the underlying type of guidCol is binary of course

UNHEX() is a function, therefore you can do something like
SET #var = '41';
SELECT UNHEX(#var);
SELECT UNHEX(hex_column) FROM my_table;
X, on the other hand, is the syntax for a hexadecimal litteral. You cannot do this:
SET #var = '41';
SELECT X#var; -- error (string litteral expected)
SELECT X'#var'; -- error (`#` is not a hexadecimal digit)
SELECT X(#var); -- returns NULL, not too sure about the reason... [edit: but this is probably why you are inserting NULL values]
SELECT X(hex_column) FROM my_table; -- returns NULL as well
This explains why you always get better performance with X: you are using a language construct instead of a function call. X does not need to evaluate a variable, since it expects a litteral string.

Note that even in MySQL 5.6, the X'' notation has a length limit in the reference mysql client and UNHEX() does not (appear to). I do not know what the limit is for X'', as it is not officially documented but I have encountered it when trying to INSERT a BLOB. With X'' literal, mysql client threw a syntax error with a sufficiently long hex sequence while UNHEX() of the same sequence did not. Obviously, length is not an issue when it comes to an actual GUID, but I figured this is useful for anyone else using this question to answer mysql insertion of binary data in the general case.

Related

Mixing quoted and unquoted values in IN() condition - MySQL quirk or general issue?

The MySQL manual contains the following interesting note about mixing quoted and unquoted values in an IN condition:
You should never mix quoted and unquoted values in an IN() list because the comparison rules for quoted values (such as strings) and unquoted values (such as numbers) differ. Mixing types may therefore lead to inconsistent results.
However, it doesn't really explain why this is a problem. It has examples, but it doesn't show either the data being queried or the results, so they only serve as illustrations without giving any explanation about the issue.
I have two questions:
Why does this cause problems in MySQL? Ideally, provide an example where the results are wrong/inconsistent/unintuitive, to demonstrate.
Is this a MySQL-specific quirk or does this apply to other database systems? In particular, I am interested in whether this issue affects SQL Server, but would ideally like the question answered in the general case.
It depends what you consider "non-intuitive". This returns false:
'00' in ('0', '01')
However, this returns true:
'00' in (0, '01')
I think the next few lines give an unintuitive example without mixing :
mysql> SELECT 'a' IN (0), 0 IN ('b');
-> 1, 1
That you can extend :
SELECT 'a' IN (0, 1, '2'), 'a' IN ('0', '1', '2');
-> 1, 0
SELECT 0 IN (0.0, 'b'), 0 IN ('0.0', 'b');
-> 1, 1
Also there is this other question :
In MySQL, why does the following query return '----', '0', '000', 'AK3462', 'AL11111', 'C131521', 'TEST', etc.?
select varCharColumn from myTable where varCharColumn in (-1, '');
I get none of these results when I do:
select varCharColumn from myTable where varCharColumn in (-1);
select varCharColumn from myTable where varCharColumn in ('');
Everything is cast into float, most likely, according to this link :
[...] In all other cases, the arguments are compared as floating-point (real) numbers. For example, a comparison of string and numeric operands takes places as a comparison of floating-point numbers.
And string are cast as 0.0, unless they start by digits. Also from the same link, there could be problems with floating point accuracy, and queries not using index because the type is not right (it must cast everything to float, so no index usage, I guess).
I think you might get something similar but not the same with every DBMS because you have to cast things to compare them. It might not be the exact same issue in SQL Server, because the data type precedence is not the same, but you should compare data of the same data type.
According to this link that gives data type precedence for SQL Server :
user-defined data types (highest)
sql_variant
xml
datetimeoffset
datetime2
datetime
smalldatetime
date
time
float
real
decimal
money
smallmoney
bigint
int
smallint
tinyint
bit
ntext
text
image
timestamp
uniqueidentifier
nvarchar (including nvarchar(max) )
nchar
varchar (including varchar(max) )
char
varbinary (including varbinary(max) )
binary (lowest)
int and string would be cast to int (not float) for a SQL server DBMS.
Running some simple tests seems that the control between data types is done correctly, despite what is written in the MySQL manual.
SELECT 0 IN ('0','00',0,00); -> TRUE
SELECT 0 IN ('0','01',1,01); -> TRUE
SELECT 0 IN ('1','00',1,10); -> TRUE
SELECT 0 IN ('11','10',0,10); -> TRUE
SELECT 0 IN ('1','01',1,00); -> TRUE
SELECT '0' IN ('1','01',1,00); -> TRUE
SELECT '0' IN ('0','00',0,00); -> TRUE
SELECT '0' IN ('0','01',1,01); -> TRUE
SELECT '0' IN ('1','00',1,10); -> FALSE
SELECT '0' IN ('11','10',0,10); -> TRUE
SELECT '1' IN ('11','10',1,10); -> TRUE
SELECT '15.32' IN ('11','10',1,15.32); -> TRUE
SELECT 13.12 IN ('11','10',1,13.12); -> TRUE
SELECT 00 IN ('11','00',1,13.12); -> TRUE
SELECT '00' IN ('11',00,1,13.12); -> TRUE
SELECT '00.0' IN ('11',00.0,1,13.12); -> TRUE
SELECT '00.00' IN ('11',0,1,13.12); -> TRUE
SELECT '00.01' IN ('11',0.01,1,13.12); -> TRUE
The above results can be seen in this SQLFiddle
But the above tests are not even close to testing all the different data types of MySQL.
In addition we should simply just think in what cases we would use the IN () operator.
MySQL writes that mixed data types offer surprises on results sometimes, but then again is it actually needed to have different data types inside IN ()?
In short no. What will be checked against the values inside the parenthesis will be a table column having specific data type.
For example doesn't comparing a column of TEXT against IN ('Hello','World',13) seems odd? I know that one could oppose the fact that in the column having data type TEXT you may have numerical values. Good, then just write the above like this IN ('Hello','World','13') since we were speaking about a TEXT column.
In case that we did not know the data type or if somehow the data type is dynamic and could some times change, then we should convert that field to the data type that we expect the majority of results would be.
1. Why does this cause problems in MySQL?
The example below should be able to show you the inconsistency about using IN across quoted (x='1a') and unquoted types (x=1). Note for the same value of x = 1, the same IN expression yields 0 in Query 1, but yields 1 in Query 2.
SELECT
x, x IN ('1b','a1')
FROM
(
select '1a' as x
union all select 1
) q1;
SELECT
x, x IN ('1b','a1')
FROM
(
select 1 as x
) q1;
Results:
Query 1:
'1a': 0
1: 0
Query 2:
1: 1
For far I cannot observe inconsistency if I only alter the list inside IN. But I observed that pattern is like:
expr IN (...array of values)
For expr with string, against string values: compare as string
For expr without string, against string values: compare as number
For expr with string, against numeric values: compare as number
For expr without string, against numeric values: compare as number
2. Is this a MySQL-specific quirk or does this apply to other database systems?
Case by case. For MSSQL I tell you no because when comparing string with number, they give you an error message like:
Conversion failed when converting the varchar value '1a' to data type int.
1. Why does this cause problems in MySQL?
Engine needs to know how it will make comparisons.
If you compare column with integers, the column integer value will be compared with the IN list. If IN list items are strings, comparison will differ.
https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html
2. Is this a MySQL-specific quirk or does this apply to other database systems?
It is not MYSQL specific. For performance reasons (indexing) it is always better not to make casting.
Why does this cause problems in MySQL?
It's not a bug, it's a feature. 😬
Basically it's about how the database handles the field comparison. In particular, MySQL automatically converts the string value to a numeric value when comparing the numeric with string values. Since MySQL is written in C++ , somewhere in the code base, they should cast the string value to double prior to field comparison.
There is nothing special about the IN clause, I think. In the MySQL source code, I saw comments similar to this one:
`WHERE a IN (b, c)` can also be rewritten as `WHERE a = b OR a = c`
Which makes sense and IN is (probably) treated the same way in code base. So based on this, if we have let's say something like this:
... WHERE '04.2' IN ('0', 4.2);
Which means '04.2' = '0' OR '04.2' = 4.2, and will return true, because, in C/C++:
"04.2" = "0" // string value comparison -> false
cast_as_double("04.2") = 4.2 // double value comparison -> true
The same applies for other cases, which resolve as true, e.g. 42 IN ('0042', 0), '3.00' IN (3, '1'), 0 IN (3, '0.00') etc.
Is this a MySQL-specific quirk or does this apply to other database systems?
This seems to be the case with other databases as well. If you like, you can test them online
MySQL: https://www.db-fiddle.com
PostgreSQL: https://www.db-fiddle.com
MS SQL Server 2017: http://sqlfiddle.com/#!18/ff6b8/12807
Whilst there have been a lot of lot of answers and comments that provide examples of 'unintuitive' behaviour, most of these examples seem to be explained by the standard casting rules. In other words, the results were entirely consistent with what would be returned from SELECT A = B; for the given A and B.
"Because casting" doesn't seem like a particularly satisfying explanation for the paragraph I quoted in the question. That paragraph comes after a number of paragraphs explaining how type conversion affects the IN() statement, so it seems somewhat repetitive and redundant if that is all it's referring to.
My interpretation of the quoted paragraph is that it is an explicit statement that a IN(b, c) may give different results to a = b OR a = c in situations where b and c are quoted differently.
I was therefore looking to find an example where the result couldn't be explained by the usual casting rules.
I think the reason that we haven't seen a good example yet is because most answers focussed on comparing numbers, in string and non-string representations. However, by basing the test around string values instead, I have managed to construct a non-intuitive example that is not explained by simple type conversion rules and which is not equivalent to the individual comparisons ORed together; the comparison between 'test' and 23 gives different results depending on what other values are in the IN() list:
SELECT 'test' IN('fish'); --> 0
SELECT 'test' IN(23); --> 0
SELECT 'test' IN('fish', 23); --> 1 !!!
I have yet to come up with a good explanation about what is happening here - is there some rule being followed, or is it just a MySQL quirk? I also haven't got an answer to the second question, as that somewhat depends on the reason for the behaviour (e.g. if it is defined by the standard or is an artefact of an obvious optimisation, vs. just being a MySQL-specific quirk) but I guess this could be figured out by running the above test on other RDBMSs.
Any comments to help flesh this out (or answers that cover the missing elements) will be appreciated - I will update this answer with any further details that I manage to deduce and don't plan on accepting any answer (including my own) until I understand what's going on a little bit better.

SQL Query giving wrong results

The query executed should match the story_id with the provided string but when I execute the query it's giving me a wrong result. Please refer to the screenshot.
story_id column in your case is of INT (or numeric) datatype.
MySQL does automatic typecasting in this case. So, 5bff82... gets typecasted to 5 and thus you get the row corresponding to story_id = 5
Type Conversion in Expression Evaluation
When an operator is used with operands of different types, type
conversion occurs to make the operands compatible. Some conversions
occur implicitly. For example, MySQL automatically converts strings to
numbers as necessary, and vice versa.
Now, ideally your application code should be robust enough to handle this input. If you expect the input to be numeric only, then your application code can use validation operations on the data (to ensure that it is only a number, without typecasting) before sending it to MySQL server.
Another way would be to explicitly typecast story_id as string datatype and then perform the comparison. However this is not recommended approach as this would not be able to utilize Indexing.
SELECT * FROM story
WHERE (CAST story_id AS CHAR(12)) = '5bff82...'
If you run the above query, you would get no results.
you can also use smth like this:
SELECT * FROM story
WHERE regexp_like(story_id,'^[1-5]{1}(.*)$');
for any story_ids starting with any number and matching any no of charatcers after that it wont match with story_id=5;
AND if you explicitly want to match it with a string;

Strange behaviour of BIT(64) column in MySQL

Can anyone help me understand the following problem with a BIT(64) column in MySQL (5.7.19).
This simple example works fine and returns the record from the temporary table:
CREATE TEMPORARY TABLE test (v bit(64));
INSERT INTO test values (b'111');
SELECT * FROM test WHERE v = b'111';
-- Returns the record as expected
When using all the 64 bits of the column it no longer works:
CREATE TEMPORARY TABLE test (v bit(64));
INSERT INTO test values (b'1111111111111111111111111111111111111111111111111111111111111111');
SELECT * FROM test WHERE v = b'1111111111111111111111111111111111111111111111111111111111111111';
-- Does NOT return the record
This only happens when using a value with 64 bits. But I would expect that to be possible.
Can anyone explain this to me?
Please do not respond by advising me not to use BIT columns. I am working on a database tool that should be able to handle all the data types of MySQL.
The problem seems to be, that the value b'11..11' in the WHERE clause is considered to be a SIGNED BIGINT which is -1 and is compared to the value in your table which is considered to be an UNSIGNED BIGINT which is 18446744073709551615. This is always an issue when the first of 64 bits is 1. IMHO this is a bug or a design flaw, because I expect an expression in the WHERE clause to match a row if the same expression has been used in the INSERT satement (at least in this case).
One workaround would be to cast the value to UNSIGNED:
SELECT *
FROM test
WHERE v = CAST(b'1111111111111111111111111111111111111111111111111111111111111111' as UNSIGNED);
Or (if your application language supports it) convert it to something like long uint or decimal:
SELECT * FROM test WHERE v = 18446744073709551615;
Bits are returned as binary, so to display them, either add 0, or use a function such as HEX, OCT or BIN to convert them https://mariadb.com/kb/en/library/bit/ or Bit values in result sets are returned as binary values, which may not display well. To convert a bit value to printable form, use it in numeric context or use a conversion function such as BIN() or HEX(). High-order 0 digits are not displayed in the converted value. https://dev.mysql.com/doc/refman/8.0/en/bit-value-literals.html

Equivalent of MySQL HEX / UNHEX function in SQL Server?

Does SQL Server has an equivalent to the HEX and UNHEX MySQl functions?
Update 2022: This is outdated, read about CONVERT() together with binary types.
What are you going to do?
Something like a script generation?
There might be better approaches for your issue, but you do not provide many details...
Not quite as slim but this would work
--This will show up like needed, but it will not be a string
SELECT CAST('abc' AS VARBINARY(MAX))
--this is the string equivalent
SELECT sys.fn_varbintohexstr(CAST('abc' AS VARBINARY(MAX)));
--This will turn the string equivalent back to varbinary
SELECT sys.fn_cdc_hexstrtobin('0x616263')
--And this will return the original string
SELECT CAST(sys.fn_cdc_hexstrtobin('0x616263') AS VARCHAR(MAX));
###One hint
If you deal with UNICODE you can check the changes if you set N'abc' instead of 'abc' and in den final line you'd have to convert '0x610062006300' to NVARCHAR.
###Another hint
If you need this more often you might put this into an UDF, than it is as eays as with MySQL :-)

how to query a blob data

I have a table like :
------------------------------
Test_Id Test_data
(String) (blob)
------------------------------
I want a query to retrieve all the Test_Id's for a matching Test_data.
To achieve something like : select * from test_table where Test_data = blobObject;
How can we do above ??
First: there's no such thing as a string in MySQL. Only char/varchar/text.
Well you could cast it as char for comparison like this:
select * from test_table where Test_data = CAST( blobObject AS CHAR );
what's probably better is to convert your string to a binary string, but this might not give you the right comparison if you expect string comparison behaviour... well best you have a look at the char functions here:
http://dev.mysql.com/doc/refman/5.0/en/cast-functions.html
You can use a hash function such as MD5
SELECT * FROM example_table WHERE MD5(blob_column) = 'a6a7c0ce5a93f77cf3be0980da5f7da3';
MySQL has data types which can store binary data. Not only char/varchar/text, but also BINARY/VARBINARY/BLOB.
See http://dev.mysql.com/doc/refman/5.5/en/blob.html
And it's usage is as simple as normal string type. But, escaping is required. and query length is must specified because binary data may contain NULL character in their contents.
Before MySQL 3.23 (I guess), There were only mysql_query(), mysql_escape_string(). Those function has no capability specifying query length. after BLOB has been introduced in MySQL, mysql_real_query() and mysql_real_escape_string() supported.
I found some examples for you. May this links help you!
http://zetcode.com/db/mysqlc/
http://bytes.com/topic/c/answers/558973-c-client-load-binary-data-mysql