Changing order of where clauses breaks the query - mysql

I ran into this case using MySQL 5.6:
This query works and returns expected results:
select *
from some_table
where a = 'b'
and metadata->>"$.country" is not null;
However, this query (the only difference is the order of where clauses) returns an error
select *
from some_table
where metadata->>"$.country" is not null
and a = 'b';
The error MySQL returns is
Invalid JSON text in argument 1 to function json_extract: "Invalid value." at position 0.
Why?

The value of the metadata column contains malformed JSON for at least one row in the table.
If we removed the a = 'b' condition entirely, we would expect to see the same error.
I suspect that the difference in behavior is due to the order in which the operations are performed. When the a = 'b' condition is evaluated first, it excludes rows before the JSON_EXTRACT(metadata) expression is evaluated. Since the row doesn't match the a = 'b' condition, MySQL takes a shortcut: it doesn't evaluate JSON_EXTRACT, because it already knows the row is going to be excluded.
When the comparisons are done in a different order, with the JSON_EXTRACT function executed first, the error is raised when the expression is evaluated for a row with invalid JSON in metadata.
Summary:
There's at least one row in the table that has malformed JSON stored in metadata column.
The difference in the observed behavior of the two queries is due to a different order of operations.
Suggestion:
Consider using the JSON_VALID function to identify rows with invalid values.
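A minimal sketch of that suggestion, assuming the table and column names from the question (some_table, metadata) and a MySQL version that provides the JSON functions. The first query lists the offending rows. The second is one possible way to make the original query independent of evaluation order, by extracting the value only when the document is valid; the CASE guard is an assumption of this sketch, not something the manual excerpt prescribes.

```sql
-- Rows whose metadata is present but is not valid JSON.
SELECT *
FROM some_table
WHERE metadata IS NOT NULL
  AND JSON_VALID(metadata) = 0;

-- Guarded version of the original query: the path is only extracted
-- for rows where JSON_VALID() reports a valid document.
SELECT *
FROM some_table
WHERE a = 'b'
  AND CASE WHEN JSON_VALID(metadata) = 1
           THEN metadata->>'$.country'
      END IS NOT NULL;
```

Repairing or nulling out the invalid rows is the more durable fix; the guard only keeps the query from failing while they exist.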
Excerpt from MySQL Reference Manual
JSON_EXTRACT
Returns data from a JSON document, selected from the parts of the document matched by the path arguments. Returns NULL if any argument is NULL or no paths locate a value in the document. An error occurs if the json_doc argument is not a valid JSON document or any path argument is not a valid path expression.
https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-extract
JSON_VALID
https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-valid

Related

Why does a query return a result if the field is numeric and the WHERE clause is a string?

I am running a query on a db table which is returning one record when I expect it to return no records.
SELECT yeargroupID FROM tbl_yeargroup WHERE yeargroup='S' AND schoolID=2.
The yeargroup field is a tinyint field. Therefore the WHERE clause is looking for the letter 'S' in a numeric field, so it should not find anything. Yet it returns the record with yeargroup = 0 and yeargroupID = 17 (the bottom record in the table).
I'm confused as to why it is returning this record and how to avoid it.
Thanks
This logic, as you have pointed out, is comparing a number and a string:
WHERE yeargroup = 'S'
Handling such situations is an important part of most SQL compilers, and it is well documented. The solution is to implicitly convert values to "conforming" types. This is sad. My preference would be for the compiler to generate an error and force the user to use correct types. I find that implicit conversion creates more problems than it solves.
In any case, the rules in this case are pretty simple. The string is converted to an integer. But, how is a string with no digits converted? Well, the rule in MySQL is that the leading digits are converted to a number. And if there are none, the value is 0. So, this turns into:
where yeargroup = 0
You can see the results more clearly if you run:
select 'S', 'S' + 0
Note that most databases would return an error in this case (a type conversion error). But even those would accept the string if it looked like a number, so this would be allowed:
where yeargroup = '5'
What is the proper solution? Never mix types. Do not construct queries by munging constant values. Instead, queries from an application should always be using parameters.
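As a rough illustration of what "using parameters" can look like, here is a sketch with MySQL's server-side prepared statements. The statement name, the user variables, and the value 7 are hypothetical; most applications would use the placeholder API of their database driver rather than PREPARE directly.

```sql
-- The query text contains placeholders only, never pasted-in values.
PREPARE find_yeargroup FROM
  'SELECT yeargroupID FROM tbl_yeargroup WHERE yeargroup = ? AND schoolID = ?';

-- The application is expected to supply numeric values for numeric columns.
SET @yeargroup = 7, @school_id = 2;

EXECUTE find_yeargroup USING @yeargroup, @school_id;
DEALLOCATE PREPARE find_yeargroup;
```

Parameters do not stop the conversion if a string is bound to a numeric column, so the application still needs to bind a value of the right type.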

SQL Query giving wrong results

The query executed should match the story_id with the provided string but when I execute the query it's giving me a wrong result. Please refer to the screenshot.
story_id column in your case is of INT (or numeric) datatype.
MySQL does automatic typecasting in this case. So, 5bff82... gets cast to 5, and thus you get the row corresponding to story_id = 5.
Type Conversion in Expression Evaluation
When an operator is used with operands of different types, type
conversion occurs to make the operands compatible. Some conversions
occur implicitly. For example, MySQL automatically converts strings to
numbers as necessary, and vice versa.
Now, ideally your application code should be robust enough to handle this input. If you expect the input to be numeric only, then your application code can use validation operations on the data (to ensure that it is only a number, without typecasting) before sending it to MySQL server.
Another way would be to explicitly cast story_id to a string datatype and then perform the comparison. However, this is not a recommended approach, as the query would not be able to use an index on story_id.
SELECT * FROM story
WHERE CAST(story_id AS CHAR(12)) = '5bff82...'
If you run the above query, you would get no results.
You can also use something like this:
SELECT * FROM story
WHERE regexp_like(story_id,'^[1-5]{1}(.*)$');
This matches any story_id that starts with a digit from 1 to 5, followed by any number of characters. Note that it tests the column value itself, so it would still match story_id = 5.
AND if you explicitly want to match it with a string;

PGSQL - No function matches the given name and argument types. You might need to add explicit type casts

This code gives an error. I have looked at similar questions and couldn't find the answer.
sum(COALESCE(((rpt.report_target_data::json->>'itemQuantity')::int)::int),0) as itemQuantity,
report_target_data is a JSON object and 'itemQuantity' is an element of that JSON. Sometimes that field contains an empty value, so when I try to get the sum it gives an error, because Postgres cannot compute the sum if a column has an empty value. What is wrong with the above code? Is there a way to work around this? Is there a way to calculate the sum even if some rows contain empty values?
Here is the error of the above code ->
No function matches the given name and argument types. You might need to add explicit type casts.
In my case, it was not a COALESCE problem but I ended up in this question.
I noticed that my column values were characters (the varchar type) so what I did is:
select sum(cast(num_suf as int)) as total from results;
Just in case someone lands in this question again :)
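For the expression in the question itself, counting the parentheses suggests a likely cause: COALESCE is closed before the ",0", so sum() ends up being called with two arguments, and no such function exists. A sketch of a corrected form follows, assuming PostgreSQL; the inline sample rows are made up for illustration, and NULLIF is added here on the assumption that "empty" means an empty string, which cannot be cast to int.

```sql
-- Hypothetical sample data standing in for the real rpt table.
WITH rpt(report_target_data) AS (
  VALUES ('{"itemQuantity": "3"}'),
         ('{"itemQuantity": ""}'),
         ('{}')
)
SELECT sum(
         COALESCE(
           -- Turn '' into NULL before the cast, then default NULL to 0.
           NULLIF(rpt.report_target_data::json->>'itemQuantity', '')::int,
           0
         )
       ) AS itemQuantity
FROM rpt;
```

Rows with an empty or missing itemQuantity contribute 0 to the total instead of raising an error.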

Unexpected behaviour of SELECT when value being looked for is number followed by letters

I have a simple table:
Entity
ID : int
Name : varchar(10)
I was looking up entities by their ID and found a result that surprised me. Let's assume that an entity with ID = 10 exists. When I run the following queries, I get the following results:
SELECT * from Entity WHERE ID = 10       -- Found Entity 10 (as expected)
SELECT * from Entity WHERE ID = '10'     -- Found Entity 10 (as expected)
SELECT * from Entity WHERE ID = A        -- Syntax error (as expected)
SELECT * from Entity WHERE ID = 'A'      -- Zero records found (as expected)
SELECT * from Entity WHERE ID = 10A      -- Syntax error (as expected)
SELECT * from Entity WHERE ID = '10A'    -- Found Entity 10 (WTF)
The final query would appear to ignore the 'A' and evaluate the query as if I had just passed in 10. This is not what I expected.
Is this standard behaviour? I cannot find any doco either way.
Yes, it's standard behavior for MySQL.
It's documented under Type Conversion for Expression Evaluation.
When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts numbers to strings as necessary, and vice versa.
Casting a string to a number results in truncating it at the first non-numeric character, or 0 if the first character is not numeric.
See also: Can I configure MySQL's typecasting to consider 0 != 'foo'?
Note also that the '10A' query (and the 'A' one also) should have thrown a warning. Run SHOW WARNINGS; after the query to see it. Your client should have alerted you to the fact that a warning was thrown. If it didn't, you should complain loudly to the vendor, because that's broken behavior.
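A quick sketch of that check, using the table from the question. The exact message text can vary by MySQL version, but it is typically a "Truncated incorrect DOUBLE value" warning that names the offending string.

```sql
-- The comparison silently converts '10A' to the number 10.
SELECT * FROM Entity WHERE ID = '10A';

-- Run immediately afterwards, in the same session, to list
-- the warnings raised by the previous statement.
SHOW WARNINGS;
```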
When casting a text value to an int, as MySQL does here with your text literals before looking up the id, its behaviour is to use all the numbers up to the first non-number.
Text values that don't have any numbers before the first non number (ie they start with a non number) get cast to 0.
I couldn't find a reference that declared this behaviour as a contract, but here's an SQLFiddle that shows it in action.

Select statement returns data although given value in the where clause is false

I have a table in my MySQL db named membertable. The table consists of two fields, memberid and membername. The memberid field is of type integer and uses auto_increment, starting from 2001. The membername field is of type varchar.
The membertable has two records with the same order as described above. The records look like this :
memberid : 2001
membername : john smith
memberid : 2002
membername : will smith
I found something weird when I ran a SELECT statement against the memberid field. Running the following statement :
SELECT * FROM `membertable` WHERE `memberid` = '2001somecharacter'
It returned the first data.
Why did that happen? There's no record with memberid = 2001somecharacter. It looks like MySQL only searches on the first 4 characters (2001), and once it has found matching data, which is the row returned above, it ignores the remaining characters.
How could this happen? And is there any way to turn off this behavior?
--
membertable uses innodb engine
This happens because mysql tries to convert "2001somecharacter" into a number which returns 2001.
Since you're comparing a number to a string, you should use
SELECT * FROM `membertable` WHERE CONVERT(`memberid`,CHAR) = '2001somecharacter';
to avoid this behavior.
Or, to do it properly, do NOT put your search variable in quotes, so that it has to be a number; otherwise the query will blow up with a syntax error. Then, in the front end, make sure the value is a number before passing it into the query.
Your finding is expected MySQL behaviour.
MySQL converts a varchar to an integer starting from the beginning. As long as there are numeric characters which can easily be converted, they are included in the conversion process. If there's a letter, the conversion stops, returning the integer value of the numeric string read so far...
Here's some description of this behavior on the MySQL documentation Site. Unfortunately, it's not mentioned directly in the text, but there's an example which exactly shows this behaviour.
MySQL is very liberal in converting string values to numeric values when evaluated in numeric context.
As a demonstration, adding 0 causes the string to be evaluated in a numeric context:
SELECT '2001foo' + 0   -- 2001
     , '01.2-3E' + 0   -- 1.2
     , 'abc567g' + 0   -- 0
When a string is evaluated in a numeric context, MySQL reads the string character by character, until it encounters a character where the string can no longer be interpreted as a numeric value, or until it reaches the end of the string.
I don't know of a way to "turn off" or disable this behavior. (There may be a setting of sql_mode that changes this behavior, but that change would likely affect other SQL statements that are currently working, and which may stop working if the change is made.)
Typically, this kind of check of the arguments is done in the application.
But if you need to do this in the SELECT statement, one option would be to cast/convert the column to a character string, and then do the comparison.
But that can have some significant performance consequences. If we do a cast or convert (or any function) on a column that's in a condition in the WHERE clause, MySQL will not be able to use a range scan operation on a suitable index. We're forcing MySQL to perform the cast/convert operation on every row in the table, and compare the result to the literal.
So, that's not the best pattern.
If I needed to perform a check like that within the SQL statement, I would do something like this:
WHERE t.memberid = '2001foo' + 0
AND CAST('2001foo' + 0 AS CHAR) = '2001foo'
The first line is doing the same thing as the current query. And that can take advantage of a suitable index.
The second condition is converting the same value to a numeric, then casting that back to character, and then comparing the result to the original. With the values shown here, it will evaluate to FALSE, and the query will not return any rows.
This will also not return a row if the string value has a leading space, ' 2001'. The second condition is going to evaluate as FALSE.
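Put together as a complete statement against the table from the question, the two conditions might look like the sketch below. The user variable @input is a hypothetical stand-in for the value supplied by the application.

```sql
-- @input holds the raw string received from the caller (hypothetical name).
SET @input = '2001somecharacter';

SELECT *
FROM membertable
WHERE memberid = @input + 0                 -- numeric comparison, can use an index
  AND CAST(@input + 0 AS CHAR) = @input;    -- round trip must reproduce the input
```

With this value the second condition is false and no row is returned; setting @input to '2001' makes both conditions true for the first record.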
When comparing an INT to a 'string', the string is converted to a number.
Converting a string to a number takes as many of the leading characters as it can and still be a number. So '2001character' is treated as the number 2001.
If you want non-numeric characters in memberid, make it VARCHAR.
If you want only numeric ids, then reject values such as '200.1character'.