Why does the MySQL equal sign match wrong entries? - mysql

I'm running a select query on two tables and searching for matching entries with an equal sign. As I understand it, MySQL should return only entries that exactly match the WHERE condition, yet it returns entries as if I had used LIKE:
Is there any explanation for why the first row would be returned as a result of the query?
EDIT:
Here's the query:
SELECT `ts`.`ticker_symbol`, `sm`.`id` AS `matchescount`, `sm`.`ticker_symbol_ids`
FROM `mk_ticker_symbols` `ts`, `mk_submissions` `sm`
WHERE `sm`.`ticker_symbol_ids` = `ts`.`id` AND `ts`.`id` = "1506"
EDIT 2:
Here's the SQL Fiddle:
http://sqlfiddle.com/#!9/5550b/1/0
EDIT 3:
Here's the SQL Fiddle with JOINs:
http://sqlfiddle.com/#!9/5550b/2/0

Piero,
The version with JOINs can be corrected: a CAST() within the JOIN fixes the issue.
INNER JOIN `mk_submissions` `sm`
ON `sm`.`ticker_symbol_ids` = CAST(`ts`.`id` AS CHAR(10))
I know you are not looking for a solution, but I'm posting it anyway.
The problem is VERY interesting.
I searched online and did some trial and error on my DB, but I have no explanation.
When I put 1506 in the second or third place of the comma-separated list, the query works fine.
So I have a feeling that, in a JOIN against a comma-separated list, the comma gets treated as an 'end of string' marker.
If you ever find an explanation, please post it here.

When evaluating this expression, MySQL converts both arguments to floating-point numbers before comparing them. One argument is a string and the other is an integer, so the final rule in the link above applies:
"In all other cases, the arguments are compared as floating-point (real) numbers."
So what is the floating point equivalent of the string "1506,..."?
Running the following on my test server:
SELECT "1506,3101,26673,26745,2277,1216,26847,26865,20711,1468,26947,233,20539,26985"+0.0
Results in:
1506
Which of course equals the floating point version of the integer 1506.
So, everything is behaving as expected. At least, assuming you expect this floating point comparison to be happening.
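The coercion described above can be approximated outside the database. What follows is a rough Python sketch, not MySQL's actual implementation: the helper name mysql_like_to_float and the regular expression are assumptions made for illustration, modelling the rule that only the leading numeric part of a string counts.

```python
import re

# Hypothetical pattern: optional whitespace, then the longest leading number.
_NUMERIC_PREFIX = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")


def mysql_like_to_float(text: str) -> float:
    """Approximate a string-to-number coercion: keep the leading numeric
    prefix and ignore the rest; a string with no numeric prefix becomes 0."""
    match = _NUMERIC_PREFIX.match(text)
    return float(match.group(1)) if match else 0.0


ids = "1506,3101,26673,26745"

# The comma ends the numeric prefix, so only 1506 takes part in the comparison.
print(mysql_like_to_float(ids) == float(1506))          # True
# With 1506 in second place the prefix is 3101, so there is no false match.
print(mysql_like_to_float("3101,1506") == float(1506))  # False
```

Under this model, the observation in an earlier answer also makes sense: the query only misbehaves when 1506 is the first item in the comma-separated list.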

I can't give a full explanation for the problem, but I have a solution.
ts.id is likely an INTEGER, so your WHERE clause should be
`ts`.`id` = 1506
(remove quotes from the number).
Also, you should use a JOIN instead of a WHERE clause to match the tables:
FROM `mk_ticker_symbols` `ts`
JOIN `mk_submissions` `sm` on sm.ticker_symbol_ids = ts.id

I found the answer to this issue. I was comparing string to integer here:
`sm`.`ticker_symbol_ids` = `ts`.`id` AND `ts`.`id` = "1506"
The problem is that this string is converted to a number internally for the comparison:
1506,3101,26673,26745,2277,1216,26847,26865,20711,1468,26947,233,20539,26985
MySQL reads only the leading numeric part of the string and stops at the first character that cannot belong to a number, here the comma, so everything after the comma is ignored for the comparison. The value becomes 1506 instead of 1506,3101,26673,26745,2277,1216,26847,26865,20711,1468,26947,233,20539,26985, and that matches the WHERE condition.
@cyadvert and @Willem_Renzema were absolutely correct.
To resolve the issue, I only needed to use:
CAST(`ts`.`id` AS CHAR)

Related

Join syntax on MYSQL

For MySQL syntax, I understand that joins take the form FROM table JOIN table ON table.column = table.column. I have come across other forms of join that seem not to follow that syntax: the two columns are not equal to each other but rather complement each other. Example below:
from coordinates as cod join
geofences as geo
on st_contains(geo.simplified_shape, cod.request_point)
For context, st_contains(A, B) is true where A contains B, so is the join satisfied whenever the request point is inside the geofence shape? I know this is valid syntax. My question is whether someone can shed light on join conditions written as a function call in parentheses rather than with the = sign, and on when they are applicable. Is it only in specific instances like this one? And is my line of thinking correct that the tables join not because the values are equal but because the st_contains condition is satisfied? For example, if you used something other than st_contains, how would that look?
This is really equivalent to:
on st_contains(geo.simplified_shape, cod.request_point) <> 0
What is happening here is that MySQL is converting the result to a "boolean". If the function returns a number, then any non-zero number is "true" and zero is "false".
If the returned value is a string, then the string is converted to a number, based on the leading digits. If there are no leading digits, the value is zero. Then this is treated as a boolean.
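To illustrate the idea that an ON clause is just a condition evaluated for each pair of rows, here is a minimal Python sketch of an inner join driven by an arbitrary predicate. Everything in it is hypothetical: the sample rows are made up, geofences are simplified to axis-aligned rectangles, and contains is only a stand-in for st_contains, not its real geometry logic.

```python
def inner_join(left, right, on):
    """Keep every (left_row, right_row) pair for which the predicate is true.
    Equality is only one possible predicate."""
    return [(l, r) for l in left for r in right if on(l, r)]


# Hypothetical rows. geofence: (name, xmin, ymin, xmax, ymax)
geofences = [("north", 0, 0, 10, 10), ("south", 0, -10, 10, 0)]
# coordinate: (id, x, y)
coordinates = [(1, 5, 5), (2, 5, -5), (3, 50, 50)]


def contains(geo, cod):
    """Stand-in for st_contains: is the point inside the rectangle?"""
    _, xmin, ymin, xmax, ymax = geo
    _, x, y = cod
    return xmin <= x <= xmax and ymin <= y <= ymax


pairs = inner_join(coordinates, geofences, lambda cod, geo: contains(geo, cod))
for cod, geo in pairs:
    print(cod[0], "is inside", geo[0])
```

Any other condition fits the same slot, for example a range test such as a.start <= b.ts AND b.ts < a.end in SQL terms. Point 3 lies in no geofence, so it drops out of the result, just as an unmatched row would in an inner join.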

MySQL different counts between "where =" and "where like"

1. select count(*) from tableX where code = "XYZ";
2. select count(*) from tableX where code like "%XYZ";
Result for query 1 is 18734. <== Not Correct
Result for query 2 is 93003. <== Correct
We know that query 2's count is correct based on independent verification.
We expect these two queries to return exactly the same count, because we know that no rows in tableX have a code that merely ends with "XYZ" after other characters, so the wildcard at the beginning shouldn't affect the query.
Why would these queries produce different counts?
We have already researched the differences between "=" comparison and "like" string comparison, but based on all our verification checks, we still don't understand why this would give us different counts
We have confirmed the following:
There are no leading or trailing characters in the "code" field
There are no hidden characters (tried all found here: How can I find non-ASCII characters in MySQL?)
The collation is "utf8_unicode_ci"
We are using MySQL version 5.5.40-0ubuntu0.12.04.1.
Try this in order to get your answer:
SELECT code
FROM tableX
WHERE code LIKE "%XYZ"
AND code <> "XYZ"
LIMIT 10
My guess is that some of your codes end with a lowercase xyz, and since LIKE is case-insensitive, it matched these where = did not.
where code = "XYZ"; gives an exact match, whereas where code LIKE "%XYZ"; includes partial matches as well. In your case, an extra space could be present, which would give the wrong count. Consider trimming before comparing, like this:
where UPPER(TRIM(code)) = 'XYZ';
We restarted the server that the database resides on, we re-ran the queries, and now they all are producing the expected, correct results...
We'll have to look into possibilities for why this "fixed" the issue.

how do I inner-join an integer substring of a URL to an integer?

I have two MySQL tables in Joomla: categories and Menu.
The field menu.link has values like index.php?option=com_content&view=category&id=175.
The number after the very last equal sign is equal to the field categories.id.
I would like to create an INNER JOIN between the two tables so that categories.id is equal to the number in menu.link.
I understand I have to remove everything before the number, but how do I do that?
It seems you are looking for a SQL expression that will extract the id value from your URL string. This is always a dicey proposition because it depends on unpredictable details of the format of the URL.
It's a doubly dicey proposition in MySQL because, before version 8.0, there aren't any regexp functions that return actual string values. They only return true/false. So you need to use non-regexp string processing functions to extract your data.
That being said, let us hack away. This expression will get that number.
CAST(SUBSTRING_INDEX(menu.link,'view=category&id=',-1) AS UNSIGNED) AS cat_id
The heart of this string-processing hack is the string 'view=category&id='. The SUBSTRING_INDEX function retrieves everything to the right of the last occurrence of that string, and the CAST operation keeps just the leading integer.
If the substring is not found, the expression returns zero. That might or might not be what you want. (I said this was dicey!)
So, to perform the join you'd do something like this:
SELECT Menu.whatever,
categories.whatever
FROM Menu
JOIN categories
ON categories.id = CAST(SUBSTRING_INDEX(Menu.link,'view=category&id=',-1) AS UNSIGNED)
This will perform poorly. But that's probably OK because you won't have tens of thousands of rows in either table.
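For readers who want to check the extraction logic outside the database, here is a small Python sketch of the same two steps. The helper names and the sample rows are hypothetical, and the helpers only approximate SUBSTRING_INDEX(..., -1) and an integer CAST. They are not MySQL's implementation.

```python
import re

MARKER = "view=category&id="


def substring_index_last(text, delim):
    """Rough analogue of SUBSTRING_INDEX(text, delim, -1): everything after
    the last delim, or the whole text when delim does not occur."""
    return text.rsplit(delim, 1)[-1]


def leading_unsigned(text):
    """Rough analogue of an integer CAST: leading digits, or 0 if none."""
    match = re.match(r"\d+", text)
    return int(match.group()) if match else 0


def extract_category_id(link):
    return leading_unsigned(substring_index_last(link, MARKER))


# Hypothetical sample data standing in for the Menu and categories tables.
menu = [
    ("Articles", "index.php?option=com_content&view=category&id=175"),
    ("Login", "index.php?option=com_users&view=login"),
]
categories = {175: "News", 176: "Sports"}

joined = []
for title, link in menu:
    cid = extract_category_id(link)
    if cid in categories:  # inner join: unmatched menu rows are dropped
        joined.append((title, categories[cid]))

print(joined)
```

The "Login" row shows the dicey case: its link has no marker, so the extracted id is 0 and the row drops out of the join unless a category with id 0 exists.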

Using REGEX to alter field data in a mysql query

I have two databases, both containing phone numbers. I need to find all instances of duplicate phone numbers, but the formats of database 1 vary wildly from the format of database 2.
I'd like to strip out all non-digit characters and just compare the two 10-digit strings to determine if it's a duplicate, something like:
SELECT b.phone as barPhone, sp.phone as SPPhone FROM bars b JOIN single_platform_bars sp ON sp.phone.REGEX = b.phone.REGEX
Is such a thing even possible in a mysql query? If so, how do I go about accomplishing this?
EDIT: Looks like it is, in fact, a thing you can do! Hooray! The following query returned exactly what I needed:
SELECT b.phone, b.id, sp.phone, sp.id
FROM bars b
JOIN single_platform_bars sp
ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(b.phone,' ',''),'-',''),'(',''),')',''),'.','')
= REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')',''),'.','')
Before version 8.0, MySQL doesn't support returning the "match" of a regular expression. The MySQL REGEXP operator returns 1 or 0, depending on whether an expression matched a regular expression test or not.
You can use the REPLACE function to replace a specific character, and you can nest those calls, but that would be unwieldy for all "non-digit" characters. For example, to remove spaces, dashes, and open and close parens:
REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')','')
One approach is to create a user-defined function that returns just the digits from a string. But if you don't want to create a user-defined function...
This can be done in native MySQL. This approach is a bit unwieldy, but it is workable for strings of "reasonable" length.
SELECT CONCAT(IF(SUBSTR(sp.phone,1,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,1,1),'')
,IF(SUBSTR(sp.phone,2,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,2,1),'')
,IF(SUBSTR(sp.phone,3,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,3,1),'')
,IF(SUBSTR(sp.phone,4,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,4,1),'')
,IF(SUBSTR(sp.phone,5,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,5,1),'')
) AS phone_digits
FROM single_platform_bars sp
To unpack that a bit... we extract a single character from the first position in the string, check if it's a digit, if it is a digit, we return the character, otherwise we return an empty string. We repeat this for the second, third, etc. characters in the string. We concatenate all of the returned characters and empty strings back into a single string.
Obviously, the expression above checks only the first five characters of the string; you would need to extend it, adding a line for each position you want to check.
And unwieldy expressions like this can be included in a predicate (in a WHERE clause). (I've just shown it in the SELECT list for convenience.)
MySQL doesn't support such string operations natively. You will either need to use a UDF like this, or else create a stored function that iterates over a string parameter concatenating to its return value every digit that it encounters.
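The iteration such a function would perform is easy to prototype before writing it in SQL. Below is a minimal Python sketch, assuming made-up sample rows. The name digits_only is hypothetical, and the list comprehension plays the role of the JOIN on the normalized phone numbers.

```python
ASCII_DIGITS = set("0123456789")


def digits_only(phone):
    """Walk the string one character at a time and keep only 0-9,
    mirroring the per-position digit check in the expression above."""
    return "".join(ch for ch in phone if ch in ASCII_DIGITS)


# Hypothetical rows: (id, phone)
bars = [(1, "(555) 123-4567"), (2, "555.987.6543")]
single_platform_bars = [(10, "555-123-4567"), (11, "555 000 1111")]

# Pair up rows whose phone numbers are equal once reduced to digits.
duplicates = [
    (b_id, sp_id)
    for b_id, b_phone in bars
    for sp_id, sp_phone in single_platform_bars
    if digits_only(b_phone) == digits_only(sp_phone)
]
print(duplicates)  # [(1, 10)]
```

Unlike the nested REPLACE calls, this handles any non-digit character without listing each one, which is the main reason to prefer a function that iterates over the string.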

What's the difference between '=' operator and LIKE when not using wildcards

I'm asking this question because I couldn't find an existing one with the same cause. When I use LIKE I get CONSISTENT RESULTS, and when I use the (=) operator I get INCONSISTENT RESULTS.
THE CASE
I have a BIG VIEW (viewX) with multiple inner joins and left joins, where some columns have null values, because the database definition allows for that.
1. When I open this VIEW I see, for example, 8 rows as the result.
2. When I run, for example, select * from viewX where column_int = 34 and type_string = 'xyz', the query shows me 100 rows that do not appear in the result of the view. [INCONSISTENT]
BUT
3. When I run select * from viewX where column_int = 34 and type_string like 'xyz', the query shows me only 4 rows, all of which appear in the view when I open it (see 1.). [CONSISTENT]
Does anyone have an idea of what is happening here?
From the documentation.....
'Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator: '
more importantly (when using LIKE):
'string comparisons are not case sensitive unless one of the operands is a binary string'
from :
http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html
Per the MySQL documentation LIKE does function differently than =, especially when you have trailing or leading spaces.
You need to post your actual query but I'm guessing it's related to the known variances.