Select with join outputs not exact match [MySQL] - mysql

Can anyone explain why query:
select rma.id, history_transactions.reference
from history_transactions
join rma on history_transactions.reference = rma.id
Returns:
id | reference
100144 | 100144
102299 | 102299a
100316 | 100316AFEN1
Can't get it to show only 100% matched, so only first row. If someone can explain why it happens it would be great.

Evidently rma.id column is numeric (integer), while the reference field is textual, since it contains text as well.
As the MySQL documentation on Type Conversion in Expression Evaluation describes, if you compare text with a number, the comparison is done as floating point numbers, meaning that the reference field is converted to a number.
MySQL converts a string to a number by evaluating its characters left to right, as long as the characters can be interpreted as a number. If it encounters a character that cannot be evaluated as a number, then MySQL stops the evaluation and returns the previous characters as the numeric value.
In case of the 2nd record, the letter a is the 1st character that cannot be evaluated as number, therefore the numeric value of '102299a' string is 102299. The same logic applies to the 3rd record.
To force MySQL to return exact matches only, explicitly convert rma.id to string using cast() or convert() functions in the query. This way the comparison would be done as strings, not as floating point numbers.
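The conversion rule described above can be imitated outside the database. The following Python sketch is only an approximation of MySQL's behaviour, not its actual implementation; the names mysql_like_number and loose_equal are made up for this illustration.

```python
import re

# Approximation of the rule: optional leading whitespace, optional sign,
# digits with an optional decimal point, optional exponent.
# Everything after that leading prefix is ignored.
_NUMERIC_PREFIX = re.compile(r"\s*([+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?)")


def mysql_like_number(text: str) -> float:
    """Return the value of the leading numeric prefix, or 0.0 if there is none."""
    match = _NUMERIC_PREFIX.match(text)
    return float(match.group(1)) if match else 0.0


def loose_equal(number: int, text: str) -> bool:
    """Compare the way the join does: both sides as floating point numbers."""
    return float(number) == mysql_like_number(text)


print(loose_equal(100144, "100144"))       # True
print(loose_equal(102299, "102299a"))      # True: 'a' ends the numeric prefix
print(loose_equal(100316, "100316AFEN1"))  # True
print(str(102299) == "102299a")            # False: comparing as strings is exact
```

The last line mirrors what casting rma.id to a string achieves: once both sides are strings, only identical values match.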

The reason is most probably implicit type conversion. My guess is that the id field is of integer type, whereas the reference field is of varchar type. Hence, when comparing, MySQL converts the varchar to a number. So, e.g., the value '102299a' is converted to 102299 and is then compared to the corresponding value of the id field.

Implicit datatype conversion.
I suspect the id column is declared as numeric datatype. Most likely INT.
The reference column is declared as character type. Most likely VARCHAR.
To do the equality comparison, MySQL can't compare a "string" with a "number".
So MySQL is implicitly doing a conversion of the string value into a number, and then doing the comparison of the numeric values.
Other databases would throw an error given a string that isn't a valid number.
But MySQL allows the conversion without raising an error. (In a SELECT it at most records a warning, which can be inspected with SHOW WARNINGS.)
As a demonstration, these expressions cause MySQL to do implicit conversion of a string to a number:
SELECT '123ABC' + 0
, '4D5E6F' + 0
, 'G7H8I9' + 0
Given your values, for example
SELECT '100316AFEN1' + 0
We see that MySQL returns a numeric value of 100316 which it uses to compare.

Related

Why does a query return a result if the field is numeric and the WHERE clause is a string?

I am running a query on a db table which is returning one record when I expect it to return no records.
SELECT yeargroupID FROM tbl_yeargroup WHERE yeargroup='S' AND schoolID=2.
The yeargroup field is a tinyint field. Therefore the WHERE clause is looking for the letter 'S' in the numeric field, so it should not find anything. Yet it returns the record with yeargroup = 0 and yeargroupID = 17 (the bottom record in the table).
I'm confused as to why it is returning this record and how to avoid it.
Thanks
This logic, as you have pointed out, is comparing a number and a string:
WHERE yeargroup = 'S'
Handling such situations is an important part of most SQL compilers, and it is well documented. The solution is to implicitly convert values to "conforming" types. This is sad. My preference would be for the compiler to generate an error and force the user to use correct types. I find that implicit conversion creates more problems than it solves.
In any case, the rules in this case are pretty simple. The string is converted to an integer. But, how is a string with no digits converted? Well, the rule in MySQL is that the leading digits are converted to a number. And if there are none, the value is 0. So, this turns into:
where yeargroup = 0
You can see the results more clearly if you run:
select 'S', 'S' + 0
Note that most databases would return an error in this case (a type conversion error). But even those would accept the string if it looked like a number, so this would be allowed:
where yeargroup = '5'
What is the proper solution? Never mix types. Do not construct queries by munging constant values. Instead, queries from an application should always be using parameters.
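As a rough illustration of that advice, an application can refuse non-numeric input before it ever reaches the query and pass the value as a parameter. This is a minimal sketch under assumptions: parse_int_strict and build_yeargroup_query are hypothetical helpers, and the %s placeholder style is the one used by common Python drivers for MySQL; the actual driver call is left out.

```python
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_int_strict(raw: str) -> int:
    """Return raw as an int, or raise ValueError if it is not purely an integer."""
    if not _INTEGER.fullmatch(raw):
        raise ValueError(f"not an integer: {raw!r}")
    return int(raw)


def build_yeargroup_query(raw_yeargroup: str, school_id: int):
    """Build the statement and its parameters; the values are never spliced into the SQL."""
    sql = "SELECT yeargroupID FROM tbl_yeargroup WHERE yeargroup = %s AND schoolID = %s"
    return sql, (parse_int_strict(raw_yeargroup), school_id)


print(build_yeargroup_query("5", 2)[1])  # (5, 2)

try:
    build_yeargroup_query("S", 2)
except ValueError as exc:
    print(exc)  # not an integer: 'S'
```

With this check in place, 'S' is rejected in the application instead of being silently compared as 0.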

Strange answer behavior with SELECT

I have a simple table using a non auto inc INT as primary key.
When querying the table with condition e.g. WHERE id='2,5,6' (unintentionally!) it returns a result set!
Ok, it works, but why?
id is an integer and you compare it with a string '2,5,6'. MySQL converts the string to a number in order to compare the two.
Well, '2,5,6' isn't a number, and other DBMSs would throw an error. But MySQL uses another approach: it converts character by character until the string ends or a character is not numeric. So it sees the 2 and then the comma. A comma is not a decimal separator in MySQL (only the period is), so the conversion stops there and the string converts to 2.
Here is the documentation on implicit conversions in MySQL: https://dev.mysql.com/doc/refman/5.5/en/type-conversion.html.
The algorithm on how to convert a string to a number is not explicitly described there, but they say for instance
there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.
They also point out in that document that implicit conversion is dangerous, because strings are not converted to DECIMAL (as I would have thought), but to the approximate datatype DOUBLE. So in MySQL we should always avoid implicit conversion from string to number.
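Why the approximate DOUBLE type matters can be shown with plain floating point arithmetic. The Python sketch below is an analogy rather than MySQL itself: Python's float is the same IEEE 754 double, and the id values are invented for the illustration.

```python
# 2**53 + 1 is the first positive integer a double cannot represent exactly.
stored_id = 2**53                # 9007199254740992, fits in a BIGINT column
search = "9007199254740993"      # a different id, supplied as a string

# Compared as doubles, the two values collapse into the same number.
as_double_match = float(stored_id) == float(search)
# Compared as exact integers, they differ.
exact_match = stored_id == int(search)

print(as_double_match)  # True
print(exact_match)      # False
```

So with large ids, a comparison carried out as floating point can match a row whose id differs from the search value, even when the string contains only digits.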

Select statement returns data although given value in the where clause is false

I have a table in my MySQL db named membertable. The table consists of two fields, memberid and membername. The memberid field is of type integer and uses auto_increment starting from 2001. The membername field is of type varchar.
The membertable has two records with the same order as described above. The records look like this :
memberid : 2001
membername : john smith
memberid : 2002
membername : will smith
I found something weird when I ran a SELECT statement against the memberid field. Running the following statement :
SELECT * FROM `membertable` WHERE `memberid` = '2001somecharacter'
It returned the first data.
Why did that happen? There's no record with memberid = 2001somecharacter. It looks like MySQL only searches for the first 4 characters (2001), and when it finds matching data, which is the row returned above, it ignores the remaining characters.
How could this happen? And is there any way to turn off this behavior?
--
membertable uses innodb engine
This happens because mysql tries to convert "2001somecharacter" into a number which returns 2001.
Since you're comparing a number to a string, you should use
SELECT * FROM `membertable` WHERE CONVERT(`memberid`,CHAR) = '2001somecharacter';
to avoid this behavior.
Or, to do it properly, do NOT put your search variable in quotes, so that it has to be a number (otherwise the query fails with an error), and make sure in the front end that it is a number before passing it into the query.
Your finding is expected MySQL behaviour.
MySQL converts a varchar to an integer starting from the beginning. As long as there are numeric characters which can easily be converted, they are included in the conversion process. If there's a letter, the conversion stops and returns the integer value of the numeric string read so far.
Here's some description of this behavior on the MySQL documentation Site. Unfortunately, it's not mentioned directly in the text, but there's an example which exactly shows this behaviour.
MySQL is very liberal in converting string values to numeric values when evaluated in numeric context.
As a demonstration, adding 0 causes the string to be evaluated in a numeric context:
SELECT '2001foo' + 0 --> 2001
, '01.2-3E' + 0 --> 1.2
, 'abc567g' + 0 --> 0
When a string is evaluated in a numeric context, MySQL reads the string character by character, until it encounters a character where the string can no longer be interpreted as a numeric value, or until it reaches the end of the string.
I don't know of a way to "turn off" or disable this behavior. (There may be a setting of sql_mode that changes this behavior, but that change would likely impact other SQL statements that are currently working, which may stop working if the change is made.)
Typically, this kind of check of the arguments is done in the application.
But if you need to do this in the SELECT statement, one option would be cast/convert the column as a character string, and then do the comparison.
But that can have some significant performance consequences. If we do a cast or convert (or any function) on a column that's in a condition in the WHERE clause, MySQL will not be able to use a range scan operation on a suitable index. We're forcing MySQL to perform the cast/convert operation on every row in the table, and compare the result to the literal.
So, that's not the best pattern.
If I needed to perform a check like that within the SQL statement, I would do something like this:
WHERE t.memberid = '2001foo' + 0
AND CAST('2001foo' + 0 AS CHAR) = '2001foo'
The first line is doing the same thing as the current query. And that can take advantage of a suitable index.
The second condition is converting the same value to a numeric, then casting that back to character, and then comparing the result to the original. With the values shown here, it will evaluate to FALSE, and the query will not return any rows.
This will also not return a row if the string value has a leading space, ' 2001'. The second condition is going to evaluate as FALSE.
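The round-trip idea behind those two conditions can be sketched in Python. This is a simplified imitation that only handles integers; leading_int stands in for the '2001foo' + 0 conversion and matches_exactly is a hypothetical name for the combined WHERE conditions.

```python
import re

_LEADING_INT = re.compile(r"\s*([+-]?[0-9]+)")


def leading_int(text: str) -> int:
    """Imitate '<text>' + 0 for integers: take the leading digits, or 0 if none."""
    match = _LEADING_INT.match(text)
    return int(match.group(1)) if match else 0


def matches_exactly(member_id: int, search: str) -> bool:
    number = leading_int(search)
    # First condition: the numeric comparison (the part that can use an index).
    # Second condition: converting back to text must reproduce the original string.
    return member_id == number and str(number) == search


print(matches_exactly(2001, "2001"))     # True
print(matches_exactly(2001, "2001foo"))  # False: round trip gives '2001'
print(matches_exactly(2001, " 2001"))    # False: leading space is lost in the round trip
```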
When comparing an INT to a 'string', the string is converted to a number.
Converting a string to a number takes as many of the leading characters as it can and still be a number. So '2001character' is treated as the number 2001.
If you want non-numeric characters in member_id, make it VARCHAR.
If you want only numeric ids, then reject '200.1character'

mysql SUM of VARCHAR fields without using CAST

When SUM is used in query on field of type VARCHAR in MySql database, does SUM automatically convert it into number ?
I tried this by using
SELECT SUM(parametervalue) FROM table
and it turns out that MySQL returns the sum, although I expected it to throw an error, as the "parametervalue" field is of VARCHAR type.
MySQL does silent conversion for a string in a numeric context. Because it expects a number for the sum(), MySQL simply does the conversion using the leading "numbers" from a string. Note that this includes decimal points, minus signs, and even e representing scientific notation. So, '1e6' is interpreted as a number.
In code, I personally would make the conversion explicit by adding 0:
SELECT SUM(parametervalue + 0) FROM table
Ironically, the cast() might return an error if the string is not in a numeric format, but this doesn't return an error in that case.

mysql only compares first numerical part of string to int column in where

This is more a "why" or "any pointers to documentations" question.
Today I realised that a comparison in WHERE clause takes only the first numerical part of the string if compared to an INT column. (My own conclusion)
For example:
SELECT * FROM companies c WHERE id = '817m5'
will still return companies with id = 817.
It does make sense but I'm failing to find this documented anywhere to see if my conclusion is correct and if there are additional things to know about this exact behaviour. WHERE, INT type, comparison documentation? Where? How is it called?
Thanks!
This is the comparison:
WHERE id = '817m5'
and id is an integer.
So, MySQL has to compare an integer to a string. Well, it can't do that directly. It either has to convert the string to a number or the number to the string. MySQL converts the string to a number, which it does by converting the leading numeric characters. So, the '817m5' becomes 817.
Here is the exact quote:
To cast a string to a numeric value in numeric context, you normally
do not have to do anything other than to use the string value as
though it were a number:
mysql> SELECT 1+'1'; -> 2