Unexpected result in WHERE clause on AI ID field - mysql

I have a table named users in my MySQL database, and I have been using this database with a Ruby on Rails application (through its ORM) for years. The table has an id field, configured as an auto-increment (AI) BIGINT.
Example of my users table;
+----+-------+
| id | name  |
+----+-------+
| 1  | John  |
| 2  | Tommy |
| 3  | ...   |
| 4  | ...   |
| 5  | ...   |
| 6  | ...   |
+----+-------+
The problem I am facing is when I execute the following query I get unexpected rows.
SELECT * FROM users WHERE id = '1AW3F4SEFR';
This query returns exactly the same rows as the following query:
SELECT * FROM users WHERE id = 1;
I do not know why MySQL lets me use a string in a WHERE clause on an INT column. As the example shows, my DB seems to convert the string I gave to an integer using the character at position 0: I search for 1AW3F4SEFR and expect to get no result, but the statement returns the row with id = 1.
In Oracle SQL, the behavior of this exact same query is completely different. So I believe MySQL does something different here, but I am not sure what causes it.

As has been explained in the request comments, MySQL has a weird way of converting strings to numbers. It simply takes as much of a string from the left as is numeric and ignores the rest. If the string doesn't start with a number the conversion defaults to 0.
Examples: '123' => 123, '12.3' => 12.3, '.123' => 0.123, '12A3' => 12, 'A123' => 0, '.1A1.' => 0.1
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=55cd18865fad4738d03bf28082217ca8
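To make the rule above concrete, here is a rough Python sketch that mimics it. This is only an approximation for illustration, not MySQL's actual implementation; the function name mysql_like_to_number and the exact regular expression are my own assumptions.

```python
import re

# Leading numeric prefix: optional sign, digits with an optional fraction
# (or a bare fraction such as ".5"), then an optional exponent.
# Assumption: this pattern only approximates what MySQL accepts.
_NUMERIC_PREFIX = re.compile(r"\s*([+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?)")


def mysql_like_to_number(text: str) -> float:
    """Take the longest numeric prefix of text; fall back to 0 if there is none."""
    match = _NUMERIC_PREFIX.match(text)
    return float(match.group(1)) if match else 0.0


for sample in ["123", "12.3", ".123", "12A3", "A123", ".1A1.", "1AW3F4SEFR"]:
    print(repr(sample), "=>", mysql_like_to_number(sample))
```

With this sketch, '1AW3F4SEFR' comes out as 1, which is why the comparison id = '1AW3F4SEFR' behaves like id = 1.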
The fact that MySQL doesn't raise an error here, as other DBMSs do, can easily lead to undesired query results that go undetected for a long time.
The solution is easy though: Don't let this happen. Don't compare a numeric column with a string. If the ID '1AW3F4SEFR' is entered in some app, raise an error in the app or even prevent this value from being entered. When running the SQL query, make sure to pass a numeric value, so '1AW3F4SEFR' cannot even make it into the DBMS. (Look up how to use prepared statements and pass parameters of different types to the database system in your programming language.)
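As a minimal sketch of that advice (shown in Python purely for illustration; the same idea applies in Ruby or any other language), the app can refuse anything that is not a plain run of digits and pass a real integer as a bound parameter. The helper names parse_id and build_lookup are hypothetical, and the %s placeholder is an assumption about the driver; check what your driver expects.

```python
import re
from typing import Optional, Tuple

# Only ASCII digits count as a valid ID here (assumption: IDs are non-negative).
_ID_PATTERN = re.compile(r"[0-9]+")


def parse_id(raw: str) -> Optional[int]:
    """Return the ID as an int, or None when raw is not a plain run of digits."""
    if _ID_PATTERN.fullmatch(raw) is None:
        return None
    return int(raw)


def build_lookup(raw: str) -> Optional[Tuple[str, tuple]]:
    """Return (sql, params) for a parameterized lookup, or None for invalid input."""
    user_id = parse_id(raw)
    if user_id is None:
        return None  # caller should raise an error or show a validation message
    return "SELECT * FROM users WHERE id = %s", (user_id,)


print(build_lookup("1"))            # query with an integer parameter
print(build_lookup("1AW3F4SEFR"))   # None: never reaches the database
```

This way the DBMS only ever sees an integer, so its string-to-number conversion never comes into play.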
If for some reason you want to pass a string for the ID instead (I cannot think of any such reason though) and want to make your query fail-safe by not returning any row in case of an ID like '1AW3F4SEFR', check whether the ID string represents an integer value in the query. You can use REGEXP for this.
SELECT * FROM users WHERE id = @id AND @id REGEXP '^[0-9]+$';
Thus you only consider integer ID strings and still enable the DBMS to use an index when looking up the ID.
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=56f8ee902342752933c20b8762f14dbb

Related

MySQL select from INT column

I was doing some system testing and expecting empty results from MySQL(5.7.21) but got surprised to get results.
My transactions table looks like this:
Column    | Data type
----------------------------
id        | INT
fullnames | VARCHAR(40)
----------------------------
And I have some records
--------------------------------
id | fullnames
--------------------------------
20 | Mutinda Boniface
21 | Boniface M
22 | Some-other Guy
-------------------------------
My sample queries:
select * from transactions where id = "20"; -- gives me 1 record which is fine
select * from transactions where id = 20; -- gives me 1 record - FINE as well
Now it gets interesting when I try with these:
select * from transactions where id = "20xxx"; -- gives me 1 record - what is happening here?
What does MySQL do here??
MySQL plays fast and loose with type conversions. When implicitly converting a char to a number, it will take characters from the beginning of the string as long as they are digits, and ignore the rest. In your example, xxx aren't digits, so MySQL only takes the initial "20".
One way around this (which is horrible for performance, since you lose the use of any index you may have on the column) is to explicitly cast the numeric side to a character, so that the comparison is done on strings:
SELECT * FROM transactions WHERE CAST(id AS CHAR) = '20xxx'; -- string comparison, returns no rows
EDIT:
Referencing the discussion about performance from the comments - performing the cast to a number on the client-side is probably the best approach, as it will allow you to avoid sending queries to the database when you know no rows should be returned (i.e., when your input is not a valid number, such as "20x").
An alternative hack could be to cast the input to a number and back again to a string, and compare the lengths. If the lengths are the same it means the input string was fully converted into a number and no characters were omitted. This should be OK WRT performance, since this comparison is performed on an inputted string, not on a value from the column, and the column's index can still be used if the condition passes the short-circuit evaluation of the input:
SELECT *
FROM transactions
WHERE LENGTH(:input) = LENGTH(CAST(:input AS SIGNED)) AND id = :input;
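To see which inputs this length comparison lets through, here is a rough Python emulation. cast_signed is a hypothetical stand-in for CAST(... AS SIGNED) that I wrote for illustration (leading integer prefix, otherwise 0); it is not MySQL itself, and len() counts characters where LENGTH() counts bytes, which is the same for ASCII input.

```python
import re


def cast_signed(text: str) -> int:
    """Rough stand-in for CAST(text AS SIGNED): leading integer prefix, else 0."""
    match = re.match(r"[+-]?[0-9]+", text)
    return int(match.group(0)) if match else 0


def passes_length_check(text: str) -> bool:
    """Mirror LENGTH(:input) = LENGTH(CAST(:input AS SIGNED))."""
    return len(text) == len(str(cast_signed(text)))


for sample in ["20", "20xxx", "020", "-5", "x"]:
    print(repr(sample), passes_length_check(sample))
```

If the emulation is faithful, "20xxx" is rejected as intended, but so is "020", and a single non-numeric character such as "x" passes because it casts to 0. Whether that matters depends on whether a row with id 0 can exist.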

How to store and evaluate dynamic expressions in MySQL(or any other SQL)

What is the best way to store a dynamic expression in a table, one per row, for a search module?
The expression is dynamic and can have multiple fields which are being compared.
I considered creating a separate column for each type of field and flattening out complex nested logic by generating all possible combinations using DNF (disjunctive normal form) and storing them in my table. The disadvantage is that every new piece of logic or expression needs a new column, which leads to a large table with too many NULLs, and adding a column takes time and refactoring (we are talking about more than 800 columns here).
The alternative approach, which I think would work better, is shown below.
I want to discuss whether there is a better way to do this and, if not, how to improve and implement the approach suggested below.
| id | expression | diagnosis |
|------|------------------------------------------------|-------------|
| 1 |`p.age>12 and p.gender==Male` | diseaseA |
| 2 |`p.age>50 and p.bp>20` | diseaseB |
| 3 |`p.age<20 and p.bp<20` | diseaseC |
| 4 |`p.age<30 and p.age>20 and (p.bp<30 or p.bp>50)`| diseaseD |
I want to search in this table, for a patient p with certain properties (age=*something*,bp=*something*,etc).
The resulting rows should return all rows which satisfy the expression and also rows which partially match the expression(i.e the rows which are using properties not supplied in the search criteria).
For example for a search for patient p(age=22,bp=15), the search result should be
| id | disease |
|------|-------------|
| 1 | diseaseA |
| 3 | diseaseC |
| 4 | diseaseD |
Since I am new to SQL, the (newbie) way I think I can do this is
1. First get all the rows (holding them in memory would be costly; let's discuss the best way to run the step in point 2 row by row).
2. Then, row by row, transform the expression into an executable logical expression (later executed using eval) using regex matching and replacement for the search criteria, i.e. substituting the patient details (I hope there is a better way than this). In my example, for the 2nd row, the expression p.age>50 and p.bp>20 gets converted to "22>50 && 15>20".
3. Return all the rows for which executing the transformed expression gave true (or a partial match).
The language is not an issue as I would be starting this project from scratch and can use any language
I can answer for MySQL.
First of all, you'll have to write all of your SQL code inside a stored procedure.
Generally, you are interested in dynamic SQL:
https://dev.mysql.com/doc/refman/5.7/en/sql-syntax-prepared-statements.html
So a straightforward approach is to open a cursor for your table of expressions and, for each expression, replace p.age with its actual value and then execute dynamic SQL (select 22 > 50 and 15 > 20).
Another approach is to loop through the expression table (open a cursor for it) and, as you probably have the patient id (not only its field values), just generate normal SQL that selects from the patient table (select patient_id from patients where [expression_from_expression_table] and patient_id = [your_known_patient_id]).
The third one that I can imagine is generating one big query from the whole expression table:
select group_concat(concat('if(', expression, ',"', diagnosis, '", "") as ', diagnosis) separator ',') from expressions into somevar;
and then doing replace of p.* with actual values and executing second query:
set somevar = replace(somevar, 'p.age', '15');
...
set @qry = concat('select ', somevar);
PREPARE qry FROM @qry;
EXECUTE qry;
The third approach is the fastest to my mind, but it will require additional work on the client, as you will receive the diagnoses as columns, not as rows.
But hope you get the general idea.
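If you end up doing the substitution on the client instead, the first approach (replace p.age and friends with actual values, then evaluate) could look roughly like this in Python. This is only a sketch under my own assumptions: the helper names substitute and evaluate are made up, only numeric comparisons joined by and/or are supported, and partial matches (fields missing from the search criteria) are not handled. It walks the parsed expression instead of calling eval on untrusted text.

```python
import ast
import operator
import re

_COMPARE = {ast.Gt: operator.gt, ast.Lt: operator.lt, ast.GtE: operator.ge,
            ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def substitute(expression: str, patient: dict) -> str:
    """Replace each p.<field> with the patient's value; unknown fields stay as-is."""
    def repl(match):
        field = match.group(1)
        return str(patient[field]) if field in patient else match.group(0)
    return re.sub(r"\bp\.(\w+)", repl, expression)


def evaluate(expression: str) -> bool:
    """Evaluate a fully substituted expression of numeric comparisons."""
    def walk(node):
        if isinstance(node, ast.BoolOp):
            results = [walk(value) for value in node.values]
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            compare = _COMPARE.get(type(node.ops[0]))
            if compare is None:
                raise ValueError("unsupported comparison")
            return compare(walk(node.left), walk(node.comparators[0]))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return bool(walk(ast.parse(expression, mode="eval").body))


patient = {"age": 22, "bp": 15}
text = substitute("p.age>50 and p.bp>20", patient)
print(text, "->", evaluate(text))
```

For the second row of the example table this produces 22>50 and 15>20, which evaluates to false, the same check as the select 22 > 50 and 15 > 20 above.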

Get row from mysql using specific value with regexp ( json string )

I'm storing permissions in the DB as a JSON array string, and I want to select rows by a specific permission. At the moment I'm selecting them like this:
1 | Dog | [3,4]
2 | Cat | [33,4]
3 | Tiger | [5,33,4]
4 | wolf | [3,5]
SELECT * FROM `pages` WHERE access REGEXP '([^"])3([^"])'
It works, but not as it should. This query gives me all records which contain 3, but it also gives those which contain 33. My question is how I must format my regexp to get rows by a specific value in the JSON string.
P.S. I have MySQL 5.5, so as far as I know JSON functions are not supported in this version.
If you only have numbers in the fields, you can alter your regexp to only take values where the string you are looking for (here the '3') does not have another digit immediately next to it:
SELECT * FROM `pages` WHERE access REGEXP '([^"0-9])3([^"0-9])'
REGEXP '[[:<:]]3[[:>:]]'
That is, use the "word boundary" thingies.
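To check the idea against the sample rows from the question, here is a small Python sketch. Python's re module is only standing in for MySQL's REGEXP here, and the lookarounds (?<![0-9]) and (?![0-9]) are Python's way of saying "no digit on either side"; the helper name has_value is my own.

```python
import re

# Sample rows from the question: (id, name, access)
pages = [(1, "Dog", "[3,4]"), (2, "Cat", "[33,4]"),
         (3, "Tiger", "[5,33,4]"), (4, "wolf", "[3,5]")]


def has_value(access: str, value: int) -> bool:
    """True when value appears in the array text with no digit next to it."""
    pattern = rf"(?<![0-9]){value}(?![0-9])"
    return re.search(pattern, access) is not None


matching_ids = [page_id for page_id, _name, access in pages if has_value(access, 3)]
print(matching_ids)  # [1, 4]
```

Only the rows for Dog and wolf match; the rows containing 33 are skipped.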

Compare two DNA-like strings with MySQL

I'm trying to find a way to compare two DNA-like strings with MySQL; stored functions are no problem. Also, the string format may be changed, but it needs to follow the pattern [code][id]-[value], like C1-4. (The - separator may be changed as well.)
Example of the string:
C1-4,C2-5,C3-9,S5-2,S8-3,L2-4
If a value does not exist in the other string, for example S3-1, it will score 10 (max value). If the asked string has C1-4 and the given string has C1-5, the score has to be 4 - 5 = -1, and if the asked string is C1-4 and the given string has C1-2, the score has to be 4 - 2 = 2.
The reason for this is that my real-time algorithm is getting slow with 10,000 results (already optimized with stored functions, indexes, and query optimizations), because 10,000 small and quick queries add up.
And the score has to be calculated before I can order my query and get the right limit.
Thanks and if you have any questions let me know by comment.
** EDIT **
I'm thinking that it's also possible to not use a string but a table where the DNA-bits are stored as a 1-n relation table.
ID | CODE | ID | VALUE
----------------------
1. | C... | 2. | 4....
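For reference, the scoring rule described above can be written down as a short Python sketch. The names parse_dna and score are hypothetical, and I am assuming that "missing" means a code present in the asked string but absent from the given one, and that scores are reported per code rather than summed, since the rule for combining them is not stated.

```python
MISSING_SCORE = 10  # maximum score, used when a code is absent from the given string


def parse_dna(text: str) -> dict:
    """Parse 'C1-4,C2-5' into {'C1': 4, 'C2': 5}."""
    pairs = (item.split("-", 1) for item in text.split(",") if item)
    return {code: int(value) for code, value in pairs}


def score(asked: str, given: str) -> dict:
    """Per-code score: asked value minus given value, or MISSING_SCORE if absent."""
    given_values = parse_dna(given)
    return {
        code: (value - given_values[code]) if code in given_values else MISSING_SCORE
        for code, value in parse_dna(asked).items()
    }


print(score("C1-4,S3-1", "C1-5,C2-5"))  # {'C1': -1, 'S3': 10}
```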

Mysql: how to structure this data and search it

I'm new to mysql. Right now, I have this kind of structure in mysql database:
| keyID | Param | Value
| 123 | Location | Canada
| 123 | Cost | 34
| 123 | TransportMethod | Boat
...
...
I have probably like 20 params with unique values for each Key ID. I want to be able to search in mysql given the 20 params with each of the values and figure out which keyID.
Firstly, how should I even restructure mysql database? Should I have 20 param columns + keyID?
Secondly, (relates to first question), how would I do the query to find the keyID?
If your params are identical across different keys (or all params are a subset of some set of params that the objects may have), you should structure the database so that each column is a param, and the row corresponds to one KeyID and the values of its params.
|keyID|Location|Cost|TransportMethod|...|...
|123 |Canada |34 |Boat ...
|124 | ...
...
Then to query for the keyID you would use a SELECT, FROM, and WHERE statement, such as,
SELECT keyID
FROM key_table
WHERE Location='Canada'
AND Cost=34
AND TransportMethod='Boat'
...
for more info see http://www.w3schools.com/php/php_mysql_where.asp
edit: if your params change across different objects (keyIDs) this will require a different approach I think
The design you show is called Entity-Attribute-Value. It breaks many rules of relational database design, and it's very hard to use with SQL.
In a relational database, you should have a separate column for each attribute type.
CREATE TABLE MyTable (
keyID SERIAL PRIMARY KEY,
Location VARCHAR(20),
Cost NUMERIC(9,2),
TransportMethod VARCHAR(10)
);
I agree that Nick's answer is probably best, but if you really want to keep your key/value format, you could accomplish what you want with a view (this is in PostgreSQL syntax, because that's what I'm familiar with, but the concept is the same for MySQL):
CREATE OR REPLACE VIEW myview AS
SELECT keyID,
MAX(CASE WHEN Param = 'Location' THEN Value END) AS Location,
MAX(CASE WHEN Param = 'Cost' THEN Value END) AS Cost,
....
FROM mytable
GROUP BY keyID;
Performance here is likely to be dismal, but if your queries are not frequent, it could get the job done.
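Here is a small runnable sketch of the view approach, using Python's built-in sqlite3 module as a stand-in for MySQL or PostgreSQL (an assumption made only so the example is self-contained; SQLite has no CREATE OR REPLACE VIEW, so plain CREATE VIEW is used). The rows for keyID 124 are made-up sample data, and the view groups by keyID so that each key collapses into one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (keyID INTEGER, Param TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(123, "Location", "Canada"), (123, "Cost", "34"), (123, "TransportMethod", "Boat"),
     # keyID 124 is hypothetical data added for contrast
     (124, "Location", "Canada"), (124, "Cost", "50"), (124, "TransportMethod", "Truck")],
)

# Pivot the key/value rows into one row per keyID.
conn.execute("""
    CREATE VIEW myview AS
    SELECT keyID,
           MAX(CASE WHEN Param = 'Location' THEN Value END) AS Location,
           MAX(CASE WHEN Param = 'Cost' THEN Value END) AS Cost,
           MAX(CASE WHEN Param = 'TransportMethod' THEN Value END) AS TransportMethod
    FROM mytable
    GROUP BY keyID
""")

# Values are stored as text in this layout, so Cost is compared as a string.
rows = conn.execute(
    "SELECT keyID FROM myview WHERE Location = ? AND Cost = ? AND TransportMethod = ?",
    ("Canada", "34", "Boat"),
).fetchall()
print(rows)  # [(123,)]
```

Note that every value comes back as text in the key/value layout, which is one more reason the one-column-per-attribute design is easier to query.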