SQLite ORDER BY variable from form [duplicate] - mysql

The standard approach for using variable values in SQLite queries is the "question mark style", like this:
import sqlite3
with sqlite3.connect(":memory:") as connection:
    connection.execute("CREATE TABLE foo(bar)")
    connection.execute("INSERT INTO foo(bar) VALUES (?)", ("cow",))
    print(list(connection.execute("SELECT * from foo")))
    # prints [(u'cow',)]
However, this only works for substituting values into queries. It fails when used for table or column names:
import sqlite3
with sqlite3.connect(":memory:") as connection:
    connection.execute("CREATE TABLE foo(?)", ("bar",))
    # raises sqlite3.OperationalError: near "?": syntax error
Neither the sqlite3 module nor PEP 249 mention a function for escaping names or values. Presumably this is to discourage users from assembling their queries with strings, but it leaves me at a loss.
What function or technique is most appropriate for using variable names for columns or tables in SQLite? I'd strongly prefer to be able to do this without any other dependencies, since I'll be using it in my own wrapper.
I looked for but couldn't find a clear and complete description of the relevant part of SQLite's syntax, to use to write my own function. I want to be sure this will work for any identifier permitted by SQLite, so a trial-and-error solution is too uncertain for me.
SQLite uses " to quote identifiers but I'm not sure that just escaping them is sufficient. PHP's sqlite_escape_string function's documentation suggests that certain binary data may need to be escaped as well, but that may be a quirk of the PHP library.

To convert any string into a SQLite identifier:
Ensure the string can be encoded as UTF-8.
Ensure the string does not include any NUL characters.
Replace all " with "".
Wrap the entire thing in double quotes.
Implementation
import codecs
def quote_identifier(s, errors="strict"):
    encodable = s.encode("utf-8", errors).decode("utf-8")
    nul_index = encodable.find("\x00")
    if nul_index >= 0:
        error = UnicodeEncodeError("NUL-terminated utf-8", encodable,
                                   nul_index, nul_index + 1, "NUL not allowed")
        error_handler = codecs.lookup_error(errors)
        replacement, _ = error_handler(error)
        encodable = encodable.replace("\x00", replacement)
    return "\"" + encodable.replace("\"", "\"\"") + "\""
Given a single string argument, it will escape and quote it correctly or raise an exception. The second argument can be used to specify any error handler registered in the codecs module. The built-in ones are:
'strict': raise an exception in case of an encoding error
'replace': replace malformed data with a suitable replacement marker, such as '?' or '\ufffd'
'ignore': ignore malformed data and continue without further notice
'xmlcharrefreplace': replace with the appropriate XML character reference (for encoding only)
'backslashreplace': replace with backslashed escape sequences (for encoding only)
This doesn't check for reserved identifiers, so if you try to create a new SQLITE_MASTER table it won't stop you.
Example Usage
import sqlite3
def test_identifier(identifier):
    "Tests an identifier to ensure it's handled properly."
    with sqlite3.connect(":memory:") as c:
        c.execute("CREATE TABLE " + quote_identifier(identifier) + " (foo)")
        assert identifier == c.execute("SELECT name FROM SQLITE_MASTER").fetchone()[0]
test_identifier("'Héllo?'\\\n\r\t\"Hello!\" -☃") # works
test_identifier("北方话") # works
test_identifier(chr(0x20000)) # works
print(quote_identifier("Fo\x00o!", "replace")) # prints "Fo?o!"
print(quote_identifier("Fo\x00o!", "ignore")) # prints "Foo!"
print(quote_identifier("Fo\x00o!")) # raises UnicodeEncodeError
print(quote_identifier(chr(0xD800))) # raises UnicodeEncodeError
Observations and References
SQLite identifiers are TEXT, not binary.
SQLITE_MASTER schema in the FAQ
Python 2 SQLite API yelled at me when I gave it bytes it couldn't decode as text.
Python 3 SQLite API requires queries be strs, not bytes.
SQLite identifiers are quoted using double-quotes.
SQL as Understood by SQLite
Double-quotes in SQLite identifiers are escaped as two double quotes.
SQLite identifiers preserve case, but they are case-insensitive towards ASCII letters. It is possible to enable unicode-aware case-insensitivity.
SQLite FAQ Question #18
SQLite does not support the NUL character in strings or identifiers.
SQLite Ticket 57c971fc74
sqlite3 can handle any other unicode string as long as it can be properly encoded to UTF-8. Invalid strings could cause crashes between Python 3.0 and Python 3.1.2 or thereabouts. Python 2 accepted these invalid strings, but this is considered a bug.
Python Issue #12569
Modules/_sqlite/cursor.c
I tested it a bunch.

The psycopg2 documentation explicitly recommends using normal python % or {} formatting to substitute in table and column names (or other bits of dynamic syntax), and then using the parameter mechanism to substitute values into the query.
I disagree with everyone who is saying "don't ever use dynamic table/column names, you're doing something wrong if you need to". I write programs to automate stuff with databases every day, and I do it all the time. We have lots of databases with lots of tables, but they are all built on repeated patterns, so generic code to handle them is extremely useful. Hand-writing the queries every time would be far more error prone and dangerous.
It comes down to what "safe" means. The conventional wisdom is that using normal python string manipulation to put values into your queries is not "safe". This is because there are all sorts of things that can go wrong if you do that, and such data very often comes from the user and is not in your control. You need a 100% reliable way of escaping these values properly so that a user cannot inject SQL in a data value and have the database execute it. So the library writers do this job; you never should.
If, however, you're writing generic helper code to operate on things in databases, then these considerations don't apply as much. You are implicitly giving anyone who can call such code access to everything in the database; that's the point of the helper code. So now the safety concern is making sure that user-generated data can never be used in such code. This is a general security issue in coding, and is just the same problem as blindly execing a user-input string. It's a distinct issue from inserting values into your queries, because there you want to be able to safely handle user-input data.
So my recommendation is: do whatever you want to dynamically assemble your queries. Use normal python string templating to sub in table and column names, glue on where clauses and joins, all the good (and horrible to debug) stuff. But make sure you're aware that whatever values such code touches have to come from you, not your users[1]. Then you use SQLite's parameter substitution functionality to safely insert user-input values into your queries as values.
[1] If (as is the case for a lot of the code I write) your users are the people who have full access to databases anyway and the code is to simplify their work, then this consideration doesn't really apply; you probably are assembling queries on user-specified tables. But you should still use SQLite's parameter substitution to save yourself from the inevitable genuine value that eventually contains quotes or percent signs.
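As a rough illustration of that split, here is a minimal sketch, assuming the table and column names come from your own code rather than from users. The helper names quote_name and select_where are made up for this example; only the value goes through SQLite's ? placeholder.

```python
import sqlite3


def quote_name(name):
    # Hypothetical helper: double embedded double quotes, then wrap in quotes.
    return '"' + name.replace('"', '""') + '"'


def select_where(connection, table, column, value):
    # Identifiers are templated into the SQL text (trusted input only);
    # the value is passed separately through the ? placeholder.
    sql = "SELECT * FROM {} WHERE {} = ?".format(
        quote_name(table), quote_name(column))
    return connection.execute(sql, (value,)).fetchall()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE foo(bar)")
connection.execute("INSERT INTO foo(bar) VALUES (?)", ("it's 100% cow",))
rows = select_where(connection, "foo", "bar", "it's 100% cow")
print(rows)
```

The value containing a quote and a percent sign never touches the string formatting step, which is the point of keeping the two mechanisms separate.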

If you're quite certain that you need to specify column names dynamically, you should use a library that can do so safely (and complains about things that are wrong). SQLAlchemy is very good at that.
>>> import sqlalchemy
>>> from sqlalchemy import *
>>> metadata = MetaData()
>>> dynamic_column = "cow"
>>> foo_table = Table('foo', metadata,
... Column(dynamic_column, Integer))
>>>
foo_table now represents the table with the dynamic schema, but you can only use it in the context of an actual database connection (so that sqlalchemy knows the dialect, and what to do with the generated sql).
>>> metadata.bind = create_engine('sqlite:///:memory:', echo=True)
You can then issue the CREATE TABLE. With echo=True, sqlalchemy will log the generated sql, but in general, sqlalchemy goes out of its way to keep the generated sql out of your hands (lest you consider using it for evil purposes).
>>> foo_table.create()
2011-06-28 21:54:54,040 INFO sqlalchemy.engine.base.Engine.0x...2f4c
CREATE TABLE foo (
cow INTEGER
)
2011-06-28 21:54:54,040 INFO sqlalchemy.engine.base.Engine.0x...2f4c ()
2011-06-28 21:54:54,041 INFO sqlalchemy.engine.base.Engine.0x...2f4c COMMIT
>>>
and yes, sqlalchemy will take care of any column names that need special handling, like when the column name is a sql reserved word
>>> dynamic_column = "order"
>>> metadata = MetaData()
>>> foo_table = Table('foo', metadata,
... Column(dynamic_column, Integer))
>>> metadata.bind = create_engine('sqlite:///:memory:', echo=True)
>>> foo_table.create()
2011-06-28 22:00:56,267 INFO sqlalchemy.engine.base.Engine.0x...aa8c
CREATE TABLE foo (
"order" INTEGER
)
2011-06-28 22:00:56,267 INFO sqlalchemy.engine.base.Engine.0x...aa8c ()
2011-06-28 22:00:56,268 INFO sqlalchemy.engine.base.Engine.0x...aa8c COMMIT
>>>
and can save you from possible badness:
>>> dynamic_column = "); drop table users; -- the evil bobby tables!"
>>> metadata = MetaData()
>>> foo_table = Table('foo', metadata,
... Column(dynamic_column, Integer))
>>> metadata.bind = create_engine('sqlite:///:memory:', echo=True)
>>> foo_table.create()
2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec
CREATE TABLE foo (
"); drop table users; -- the evil bobby tables!" INTEGER
)
2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec ()
2011-06-28 22:04:22,051 INFO sqlalchemy.engine.base.Engine.0x...05ec COMMIT
>>>
(apparently, some strange things are perfectly legal identifiers in sqlite)

The first thing to understand is that table/column names cannot be escaped in the same sense that you can escape strings stored as database values.
The reason is that you either have to:
accept/reject the potential table/column name, i.e. it is not guaranteed that a string is an acceptable column/table name, unlike a string to be stored in some database; or,
sanitize the string, which has the same effect as creating a digest: the function used is surjective, not bijective (once again, the opposite holds for a string that is to be stored in some database); so not only can't you be certain of going from the sanitized name back to the original name, but you are at risk of unintentionally trying to create two columns or tables with the same name.
Having understood that, the second thing to understand is that how you end up "escaping" table/column names depends on your specific context, so there is more than one way to do this. Whichever way you choose, you'll need to dig in to figure out exactly what is or is not an acceptable column/table name in sqlite.
To get you started, here is one condition:
Table names that begin with "sqlite_" are reserved for internal use. It is an error to attempt to create a table with a name that starts with "sqlite_".
Even better, using certain column names can have unintended side effects:
Every row of every SQLite table has a 64-bit signed integer key that
uniquely identifies the row within its table. This integer is usually
called the "rowid". The rowid value can be accessed using one of the
special case-independent names "rowid", "oid", or "_rowid_" in place
of a column name. If a table contains a user defined column named
"rowid", "oid" or "_rowid_", then that name always refers the
explicitly declared column and cannot be used to retrieve the integer
rowid value.
Both quoted texts are from http://www.sqlite.org/lang_createtable.html

From the sqlite faq, question 24 (the formulation of the question of course does not give a clue that the answer may be useful to your question):
SQL uses double-quotes around identifiers (column or table names) that contains special characters or which are keywords. So double-quotes are a way of escaping identifier names.
If the name itself contains double quotes, escape that double quote with another one.

Placeholders are only for values. Column and table names are structural, and are akin to variable names; you can't use placeholders to fill them in.
You have three options:
Appropriately escape/quote the column name everywhere you use it. This is fragile and dangerous.
Use an ORM like SQLAlchemy, which will take care of escaping/quoting for you.
Ideally, just don't have dynamic column names. Tables and columns are for structure; anything dynamic is data and should be in the table rather than part of it.
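For the common case of a sort column picked in a form, one middle ground between the first and third options is to accept only names from a fixed set, so nothing user-typed ever reaches the SQL text. This is only a sketch: the items table, the ALLOWED_SORT_COLUMNS set and the fetch_sorted helper are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical whitelist: the only columns a form is allowed to sort by.
ALLOWED_SORT_COLUMNS = {"name", "price"}


def fetch_sorted(connection, sort_column):
    # Reject anything that is not a known column before it reaches the SQL text.
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unsupported sort column: {!r}".format(sort_column))
    sql = 'SELECT name, price FROM items ORDER BY "{}"'.format(sort_column)
    return connection.execute(sql).fetchall()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items(name TEXT, price INTEGER)")
connection.executemany("INSERT INTO items(name, price) VALUES (?, ?)",
                       [("pear", 3), ("apple", 5), ("fig", 1)])
by_name = fetch_sorted(connection, "name")
by_price = fetch_sorted(connection, "price")
```

Because the column name is compared against known constants, no escaping logic has to be trusted at all; an unexpected value simply raises ValueError.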

I did some research because I was unsatisfied with the current unsafe answers, and I would recommend using SQLite's built-in printf function for this. Its %w substitution is made to escape any identifier (table name, column name...) and make it safe for concatenation.
In python, it should be something like this (I'm not a python user, so there may be mistakes, but the logic itself works):
table = "bar"
escaped_table = connection.execute("SELECT printf('%w', ?)", (table,)).fetchone()[0]
connection.execute("CREATE TABLE \""+escaped_table+"\" (bar TEXT)")
According to the documentation of %w:
This substitution works like %q except that it doubles all double-quote characters (") instead of single-quotes, making the result suitable for using with a double-quoted identifier name in an SQL statement.
The %w substitution is an SQLite enhancements, not found in most other printf() implementations.
Which means you can alternatively do the same with single quotes using %q:
table = "bar"
escaped_table = connection.execute("SELECT printf('%q', ?)", (table,)).fetchone()[0]
connection.execute("CREATE TABLE '"+escaped_table+"' (bar TEXT)")

If you find that you need a variable entity name (either relvar or field) then you probably are doing something wrong. An alternative pattern would be to use a property map, something like:
CREATE TABLE foo_properties(
    id INTEGER NOT NULL,
    name VARCHAR NOT NULL,
    value VARCHAR,
    PRIMARY KEY(id, name)
);
Then, you just specify the name dynamically when doing an insert instead of a column.
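A minimal sketch of how that might look from Python's sqlite3 module, assuming the foo_properties schema above. The set_property and get_property helpers are hypothetical names for this example; note that the property name is bound as an ordinary value.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("""
    CREATE TABLE foo_properties(
        id INTEGER NOT NULL,
        name VARCHAR NOT NULL,
        value VARCHAR,
        PRIMARY KEY(id, name)
    )
""")


def set_property(connection, row_id, name, value):
    # The "column name" is ordinary data here, so it can be a bound parameter.
    connection.execute(
        "INSERT OR REPLACE INTO foo_properties(id, name, value) VALUES (?, ?, ?)",
        (row_id, name, value))


def get_property(connection, row_id, name):
    # Returns the stored value, or None when the property is absent.
    row = connection.execute(
        "SELECT value FROM foo_properties WHERE id = ? AND name = ?",
        (row_id, name)).fetchone()
    return None if row is None else row[0]


set_property(connection, 1, 'weird "name"; --', "cow")
set_property(connection, 1, "colour", "brown")
```

Even a name full of quotes and comment markers is harmless in this layout, because it never becomes part of the SQL text.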

Related

Using django query set values() to index into JSONField

I am using django with postgres, and have a bunch of JSON fields (some of them quite large and detailed) within my model. I'm in the process of switching from char based ones to jsonb fields, which allows me to filter on a key within the field, and I'm wondering if there is any way to get the equivalent benefit out of a call to the query set values method.
Example:
What I would like to do, given a Car model with options JSONField, is something like
qset = Car.objects.filter(options__interior__color='red')
vals = qset.values('options__interior__material')
Please excuse the lame toy problem, but hopefully it gets the idea across. Here the filter call does exactly what I want, but the call to values does not seem to be aware of the special nature of the JSON field. I get an error because values can't find the field called "interior" to join on. Is there some other syntax or option that I am missing that will make this work?
Seems like a pretty obvious extension to the existing functionality, but I have so far failed to find any reference to something similar in the docs or through stack overflow or google searches.
Edit - a workaround:
After playing around, looks like this could be fudged by inserting the following in between the two lines of code above:
qset=qset.annotate(options__interior__material=RawSQL("SELECT options->'interior'->'material'",()))
I say "fudged" because it seems like an abuse of notation and would require special treatment for integer indices.
Still hoping for a better answer.
I can suggest a bit cleaner way with using:
django's Func
https://docs.djangoproject.com/en/2.0/ref/models/expressions/#func-expressions
and postgres function jsonb_extract_path_text https://www.postgresql.org/docs/9.5/static/functions-json.html
from django.db.models import F, Func, CharField, Value
Car.objects.all().annotate(
    options__interior__material=Func(
        F('options'),
        Value('interior'),
        Value('material'),
        function='jsonb_extract_path_text'
    ),
)
Perhaps a better solution (for Django >= 1.11) is to use something like this:
from django.contrib.postgres.fields.jsonb import KeyTextTransform
Car.objects.filter(options__interior__color='red').annotate(
interior_material=KeyTextTransform('material', KeyTextTransform('interior', 'options'))
).values('interior_material')
Note that you can nest KeyTextTransform expressions to pull out the value(s) you need.
(Car.objects.extra(select={'interior_material': "options#>'{interior, material}'"})
    .filter(options__interior__color='red')
    .values('interior_material'))
You can utilize .extra() and add postgres jsonb operators
Postgres jsonb operators: https://www.postgresql.org/docs/9.5/static/functions-json.html#FUNCTIONS-JSON-OP-TABLE

Badly Formed hexadecimal uuid string error in Django fixture; json uuid conversion fails issue

File "/home/malikarumi/Projects/cannon/local/lib/python2.7/site-packages/django/db/models/fields/__init__.py", line 2390, in get_db_prep_value
value = uuid.UUID(value)
File "/usr/lib/python2.7/uuid.py", line 134, in __init__
raise ValueError('badly formed hexadecimal UUID string')
ValueError: Problem installing fixture '/home/malikarumi/Projects/cannon/jamf/essell/fixtures/test22byhand.json': badly formed hexadecimal UUID string
I've found the following links so far:
https://github.com/dcramer/django-uuidfield/issues/40
https://github.com/dcramer/django-uuidfield/commit/caae1bc4e45445a06dd11bb22da6a9f07395f78a
Django UUIDField modelfield causes error in Django admin: badly formed hexadecimal UUID string
Django Primary Key: badly formed hexadecimal UUID string
I counted my uuidfield value. It is len=36, because it has dashes in it. At least the string representation I can see is that way. So I replaced it with the same alphanumeric without dashes, as suggested as a test by the bugfix, but I still got the same result.
I checked the model, but there is no max length on any uuid field, nor on the fk link back to the uuid. There's nothing on the fk to suggest it is, or should be limited to, chars, ints, uuids, etc.
Then I found this: http://arthurpemberton.com/2015/04/fixing-uuid-is-not-json-serializable which I hacked into /python2.7/site-packages/django/core/serializers/python.py. The blogger had put it into models.py. But I got the same error, before realizing it was NOT coming from serializers/python.py, as it was yesterday, but from /usr/lib/python2.7/uuid.py, line 134, in __init__. The relevant portions of that code are:
if hex is not None:
    hex = hex.replace('urn:', '').replace('uuid:', '')
    hex = hex.strip('{}').replace('-', '')
    if len(hex) != 32:
        raise ValueError('badly formed hexadecimal UUID string')
    int = long(hex, 16)
Rather than try to hack more core code, given that the indication is the problem is json, not Python, I left this alone for now.
Finally, I looked at this:
https://code.djangoproject.com/ticket/24012
It is stated a couple of times here that Django's "UUIDField generates UUIDs in Python". Now here is some history. I created one row, a single instance of Model A into Django with a fixture that had no uuid and no datefield and had no issues. (The uuidfield is on an abstract model, so it is created when the object is created). I did that because I needed the uuid of that Model A instance for a fk field in Model B, which is the one I am struggling with now. I did that by copy pasting the Model A uuid into the fk field on Model B in a csv file which I then converted to json in order to use it as a fixture.
Is it possible that the uuid ran into problems in this copy paste maneuver, before the conversion to json?
If not, that means even though it was an acceptable Python object when it was created, going thru the json conversion messed it up, correct?
If that's the case, what is a workaround?
Can the Arthur Pemberton code be made to work somewhere else in this process?
If I leave the uuid off, I can probably make this work, but then I have to go back and put the all the fk uuid's in manually. Is there a better solution? Maybe a bulk insert of that field alone?
This may be a recurring issue for me, because I am also using Scrapy, which supports but does not require json. None of my scraped items will come with uuid, but how do I automate adding their fk's into my process in order to get them into Django?
Or is all of this a good reason to forget uuids altogether?
Thanks.
EDIT/UPDATE per @rolf:
Since I just discovered that the django shell differs more than I realized (the shell can find settings, the regular interpreter can't) I decided to run this once in each one, but the results were the same.
(cannon)malikarumi@Tetuoan2:~/Projects/cannon/jamf$ python manage.py shell
Python 2.7.10 (default, Oct 14 2015, 16:09:02)
IPython 4.0.3 -- An enhanced Interactive Python.
In [1]: uuid.UUID(a82857b6-e336-4c6c-8499-47601770b39d)
File "<ipython-input-1-e282858da374>", line 1
uuid.UUID(a82857b6-e336-4c6c-8499-47601770b39d)
^
SyntaxError: invalid syntax
In [2]: uuid.UUID(a0a69415-6627-43db-8c7a-b57d0c4cefe2)
File "<ipython-input-2-befebf1573ba>", line 1
uuid.UUID(a0a69415-6627-43db-8c7a-b57d0c4cefe2)
^
SyntaxError: invalid syntax
In [3]: uuid.UUID(e6e11b06-ea3b-4e98-a31f-9a83447ad884)
File "<ipython-input-3-a59ea095e61a>", line 1
uuid.UUID(e6e11b06-ea3b-4e98-a31f-9a83447ad884)
^
SyntaxError: invalid syntax
In [4]: uuid.UUID(bd116432-65d7-4612-abfe-9a99dcaf5cad)
File "<ipython-input-4-c4a04434aa3c>", line 1
uuid.UUID(bd116432-65d7-4612-abfe-9a99dcaf5cad)
^
SyntaxError: invalid syntax
Now that I have posted this, I notice that even Stack Overflow treats these uuid differently, i.e., the way they are colored, if that's relevant and meaningful here.
But now that we know this, what do we do with / about it?
2nd Update
This morning I thought, what about a uuid that had never been anywhere but in Django? So here's what I did:
In [5]: e.uuid
Out[5]: UUID('61877565-5fe5-4175-9f2b-d24704df0b74')
In [6]: uuid.UUID(61877565-5fe5-4175-9f2b-d24704df0b74)
File "<ipython-input-6-56137f5f4eb6>", line 1
uuid.UUID(61877565-5fe5-4175-9f2b-d24704df0b74)
^
SyntaxError: invalid syntax
In [7]: uuid.UUID('61877565-5fe5-4175-9f2b-d24704df0b74')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-3b4d3e5bd156> in <module>()
----> 1 uuid.UUID('61877565-5fe5-4175-9f2b-d24704df0b74')
NameError: name 'uuid' is not defined
This is apparently because I left the quote around the alphanumeric, but why that would generate a uuid not defined error, instead of 'string type' or some such error is beyond me.
In [8]: uuid.UUID(61877565-5fe5-4175-9f2b-d24704df0b74)
File "<ipython-input-8-56137f5f4eb6>", line 1
uuid.UUID(61877565-5fe5-4175-9f2b-d24704df0b74)
^
SyntaxError: invalid syntax
The first time I keyed in the characters by hand. I decided to repeat the test by copying and pasting, but as you can see, it made no difference. If there was something weird about the way only the 5 that the caret is pointing to was generated, we might be on to something, but if so, why do I get the same error in the same place when I typed it in by hand myself?
This no longer seems like a json issue to me, since – as far as I know – json has never touched this uuid, unless it did somehow in the internal workings of Django.
Instead, there is either
1. something wrong with the way uuid.UUID generates uuids, or
2. the way it generates them on my system, (Ubuntu 15.10, Django 1.9.1, Python 2.7.10) or
3. the way it reads and evaluates them when they come back, like in uuid.UUID() or being input outside the internal, automatic uuid generation process.
But that also means people using uuid.UUID() to generate uuids will never know there is an issue unless they do what I did, which is try to bring them in from outside. I remember reading somewhere that all uuids are supposed to be compatible. So, unless someone here has a better insight, I think we might be up for a bug report. But is it a Python bug, a Django bug, or both?
Your syntax is wrong:
uuid.UUID('61877565-5fe5-4175-9f2b-d24704df0b74') # note the quotes
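For reference, a short sketch of the calls that do parse, with the standard-library uuid module imported first (the NameError in the session above is what a missing import looks like). The variable names are only for illustration.

```python
import uuid

text = "61877565-5fe5-4175-9f2b-d24704df0b74"

# The argument must be a string; dashes are optional.
parsed = uuid.UUID(text)
without_dashes = uuid.UUID(text.replace("-", ""))
print(parsed == without_dashes)

# A string that does not contain 32 hex digits raises ValueError.
try:
    uuid.UUID("not-a-uuid")
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Without the quotes, Python tries to evaluate the dashes as subtraction between malformed literals, which is why the interpreter reports a SyntaxError before uuid.UUID is ever called.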

MUMPS can't format Number to String

I am trying to convert a large number to a string in MUMPS, but I can't.
Let me explain what I would like to do :
s A="TEST_STRING#12168013110012340000000001"
s B=$P(A,"#",2)
s TAB(B)=1
s TAB(B)=1
I would like to create an array TAB in which variable B is the primary key.
When I do ZWR I will get
A="TEST_STRING#12168013110012340000000001"
B="12168013110012340000000001"
TAB(12168013110012340000000000)=1
TAB("12168013110012340000000001")=1
As you can see, the first SET treats variable B as a number (wrongly converted) and the second SET treats variable B as a string (as I would like to see).
My question is how to write the SET command so that variable B is treated as a string instead of a number (which is wrong in my opinion).
Any advice/explanation will be helpful.
This may be a limitation of the sorting/storage mechanism built into MUMPS, and it differs between MUMPS implementations. The cause is that while variable values in MUMPS are untyped, index values are typed, and numeric indices are sorted before string ones. When converting a large string to a number, rounding errors may occur. To prevent this from happening, you need to add a space before the number in your index to explicitly treat it as a string:
s TAB(" "_B)=1
As far as I know, Intersystems Cache doesn't have this limitation -- at least your code works fine in Cache and in documentation they claim to support up to 309 digits:
http://docs.intersystems.com/cache20141/csp/docbook/DocBook.UI.Page.cls?KEY=GGBL_structure#GGBL_C12648
I've tried to recreate your scenario, but I am not seeing the issue you're experiencing.
It actually is not possible ( in my opinion ) for the same command executed immediately ( one execution after another) to produce two different results.
s TAB(B)=1
s TAB(B)=1
As long as the value of B did not change between the executions, the result should be:
TAB("12168013110012340000000001")=1
Example of what GT.M implementation of MUMPS returns in your case

Insert a Binary string into MYSQL varchar column

As we all know, we use the mysql_query API to send a query to the server, and the query is passed as a string parameter. We have to build that string before calling mysql_query, using C functions such as sprintf. For example,
sprintf(buffer, "insert into table(describe) values('%s')", strA);
mysql_query(..., buffer);
The 'describe' column is a VARCHAR(150).
In some special cases, one of our functions concatenates several C-style strings into one long buffer, keeping every terminating '\0', to form a binary blob. For example, concatenating "abc" and "efg" yields "abc\0efg\0", with the length (8 in this case) returned to the caller. However, this binary output can NEVER be used as strA in the sprintf above, because the C string functions stop at the first '\0'.
Is there anything special we can do to meet this need? We want to insert binary data into a column defined as VARCHAR. We have tried replacing every '\0' byte with the literal characters '\\0', which seems to work but costs extra time and code. Is there an easier alternative?
Thanks in advance.
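You should use mysql_real_escape_string() to escape this string. It takes the length of the source buffer as an argument, so embedded '\0' bytes are escaped rather than treated as terminators.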

MySQL integer comparison ignores trailing alpha characters

So let's just say I have a table with just an ID column that is an int. I have discovered that running:
SELECT *
FROM table
WHERE ID = '32anystring';
Returns the row where id = 32. Clearly 32 != '32anystring'.
This seems very strange. Why does it do this? Can it be turned off? What is the best workaround?
It is common behavior in most programming languages to interpret leading numerals as a number when converting a string to a number.
There are a couple of ways to handle this:
Use prepared statements, and define the placeholder where you are putting the value to be of a numeric type. This will prevent strings from being put in there at all.
Check at a higher layer of the application to validate input and make sure it is numeric.
Use the BINARY keyword in mysql (I'm just guessing that this would work, have never actually tried it as I've always just implemented a proper validation system before running a query) -
SELECT *
FROM table
WHERE BINARY ID = '32anystring';
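The second option, validating at the application layer, could look something like the following sketch. It is written in Python purely for illustration; parse_strict_int is a hypothetical helper, and the accepted format (optional sign followed by ASCII digits) is an assumption about what counts as a valid ID.

```python
import re

# Assumed format: an optional sign followed by ASCII digits, nothing else.
_STRICT_INT = re.compile(r"[+-]?[0-9]+\Z")


def parse_strict_int(text):
    # Refuse input such as '32anystring' instead of silently using the prefix.
    if not _STRICT_INT.match(text):
        raise ValueError("not an integer: {!r}".format(text))
    return int(text)


good = parse_strict_int("32")
try:
    parse_strict_int("32anystring")
    rejected = False
except ValueError:
    rejected = True
print(good, rejected)
```

The validated integer can then be passed to the query as a numeric parameter, so the database never has to convert a string at all.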
You need to read this
http://dev.mysql.com/doc/refman/5.1/en/type-conversion.html
When you work on, or compare two different types, one of them will be converted to the other.
MySQL conversion of string->number (you can do '1.23def' * 2 => 2.46) parses as much as possible of the string as long as it is still a valid number. Where the first letter cannot be part of a number, the result becomes 0