Ways MySQL/MariaDB could silently be changing values when storing

I'm searching for cases in MySQL/MariaDB where the value transmitted when storing will differ from the value that can be retrieved later on. I'm only interested in fields with non-binary string data types like VARCHAR and *TEXT.
I'd like to get a more comprehensive understanding of how much a stored value can be trusted. This would be especially interesting for cases where the output merely lacks certain characters (like with the escape character example below), as that is particularly dangerous when validating.
So, this boils down to: Can you create an input string (and/or define an environment) where this doesn't output <value> in the second statement?
INSERT INTO t SET v = <value>, id = 1; -- success
SELECT v FROM t WHERE id = 1;
Things I can think of:
strings containing escaping (\a → a)
truncated if too long
character encoding of the table not supporting the input
Whether something fails silently probably also depends on how strict the SQL mode is (as with the last two examples).
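To make those cases concrete, here is a rough sketch, assuming a non-strict sql_mode, a latin1 column and a fresh table (the exact outcome depends on server version and connection character set):

SET sql_mode = '';
CREATE TABLE t (id INT PRIMARY KEY, v VARCHAR(5) CHARACTER SET latin1);
-- Escape processing: \a is not a recognized escape sequence, so the backslash is dropped.
INSERT INTO t SET v = '\a', id = 1;        -- stores 'a'
-- Truncation: a value longer than VARCHAR(5) is cut off with only a warning.
INSERT INTO t SET v = 'abcdefgh', id = 2;  -- stores 'abcde'
-- Character set: a character latin1 cannot represent may come back as '?'.
INSERT INTO t SET v = '漢', id = 3;
SELECT id, v FROM t;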
Thanks a lot in advance for your input!

You can trust that all databases do what the standards prescribe. With strings and integers it is simple, because the server stores the binary representation of that number or character in your chosen character set.
DECIMAL, DOUBLE and single-precision FLOAT values are different: FLOAT and DOUBLE cannot represent every decimal fraction exactly, and DECIMAL rounds anything beyond its declared scale, so rounding can occur (see how decimal and floating-point values are represented).
That also follows the standards, but you have to account for it.
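A minimal illustration of that rounding, assuming a fresh table with made-up column names: FLOAT only has a 24-bit mantissa, so the integer 16777217 (2^24 + 1) is silently rounded, while DECIMAL keeps it exactly within its declared precision.

CREATE TABLE nums (f FLOAT, d DECIMAL(12,2));
INSERT INTO nums VALUES (16777217, 16777217);
SELECT f, d FROM nums;  -- f typically comes back as 16777216, d as 16777217.00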

Related

How to prevent hexadecimal value being converted to scientific notation in MySQL

When saving certain hexadecimal values to our database, they are being converted to what I think is scientific (E) notation. e.g. 558E74 becomes 5.6e76.
I can understand that the number 558E74 is also represented as 5.6e76, but the value is not intended to be a number, so the conversion is not what we're after!
We're currently using:
MySQL 5.7
Column type is char with a max length of 6
How can we prevent the values from being converted? Is there something we can change about the way they are saved? Or should we be using a different column type?
I feel like I might be missing something fairly obvious as this seems like quite a basic question (apologies if so!), but having searched around I was struggling to find any answers.
If it's a character destination, quote the value ('558E74'); if you actually want a number, use a hexadecimal literal like 0x558E74.
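For illustration, a sketch assuming a CHAR(6) column like the one described: the unquoted form is parsed as a floating-point literal before it ever reaches the column, while the quoted form is stored as the six characters you typed.

CREATE TABLE codes (code CHAR(6));
INSERT INTO codes VALUES ('558E74');  -- stored literally as the string '558E74'
-- Unquoted, 558E74 is read as the number 5.58e76; its string form does not fit
-- into CHAR(6), so it is truncated or rejected depending on sql_mode.
INSERT INTO codes VALUES (558E74);
SELECT code FROM codes;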

Is char or varchar better for a database with utf-8 encoding?

I am making a table of users where I will store all their info: username, password, etc. My question is: Is it better to store usernames in VARCHAR with a utf-8 encoded table or in CHAR. I am asking because char is only 1 byte and utf-8 encodes up to 3 bytes for some characters and I do not know whether I might lose data. Is it even possible to use CHAR in that case or do I have to use VARCHAR?
In general, the rule is to use CHAR under the following circumstances:
You have short codes that are all the same length (think state abbreviations).
Sometimes when you have short codes that might differ in length, but you can count the characters on one hand.
When powers-that-be say you have to use CHAR.
When you want to demonstrate how padding with spaces at the end of the string causes unexpected behavior.
In other cases, use VARCHAR(). In practice, users of the database don't expect a bunch of spaces at the end of strings.
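As a sketch of that padding behaviour (table and column names are made up, and the default sql_mode without PAD_CHAR_TO_FULL_LENGTH is assumed): CHAR pads values with spaces and strips trailing spaces on retrieval, while VARCHAR gives back exactly what was stored. Neither type loses multi-byte UTF-8 characters, because the declared length is counted in characters, not bytes.

CREATE TABLE u (c CHAR(10), v VARCHAR(10)) CHARACTER SET utf8mb4;
INSERT INTO u VALUES ('abc  ', 'abc  ');
SELECT LENGTH(c), LENGTH(v) FROM u;  -- typically 3 and 5: CHAR drops the trailing spaces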

mysql single quote in arithmetic functions

In mysql, if I do something like
round((amount * '0.75'),2)
it seems to work just fine, the same as without single quotes around 0.75. Is there a difference in how MySQL processes this?
In the hope of closing out this question, here's a link that explains type conversion in expression evaluation: https://dev.mysql.com/doc/refman/5.5/en/type-conversion.html
When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MySQL automatically converts numbers to strings as necessary, and vice versa.
mysql> SELECT 1+'1';
-> 2
In your case, MySQL sees arithmetic and performs implicit conversion on any string contained in the expression. There is going to be some overhead in converting a string to a number, but it's negligible. My preference is to explicitly write out a number instead of quoting it; that has helped me with code clarity and maintainability.
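A quick way to see the implicit conversion in action (the numbers are arbitrary): both forms evaluate to the same result, because the string operand is converted to a number before the multiplication.

SELECT ROUND(100 * '0.75', 2), ROUND(100 * 0.75, 2);  -- both evaluate to 75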

Unsigned Integer to signed Integer in SSIS

I have an SSIS Package which should take data from a Flat File (txt).
One of the fields should be an unsigned integer and I should load it into an SQL table.
In the "Flat File Connection Manager Editor" I set the "Format" of the flat file to Fixed width (there are no delimiters, only a spec file with column lengths).
The field I am talking about should be 4 chars long (according to the definition),
but in some values I get the "}" sign as the 4th char, for example: "010}".
I trusted the definition and tried to load this value into an unsigned integer with no luck.
Does anyone recognize such a formatting?
If you do, how can I load it into the proper data type?
Thank you in advance.
Oren.
There are several things that could be going wrong on your import. First you have to know the encoding of your original file:
How can I detect the encoding/codepage of a text file
The encoding will determine the actual size in bytes of your char and, more importantly, how each character is stored. You see, a Unicode string of four chars can be anywhere from four bytes to 16 bytes (maybe more if you have compound characters) depending on the encoding. An int is usually four bytes (DT_I4), but SSIS offers you up to 32 (I think). So when you're loading your unknown number of bytes into the predetermined unsigned int, some stuff might get truncated and you end up with garbage values.
If you don't know or can't find the encoding, I would assume it is UTF-8, but that's really not good practice. Here is a little bit about it: http://en.wikipedia.org/wiki/UTF-8
You can also take a look at the character tables for different encodings (UTF-8, UTF-16, ...) and look for the "}" character and its matching value. That might give you a hint as to why it is showing up.
Then your flat file source should match that encoding: check (or uncheck) the Unicode check box, or pick a "Code Page". Load the value of that column into a string (of the right encoding), not an unsigned int.
Finally, when you know you have the right value, you can use a "Data Conversion" to cast it to an unsigned int, or anything else really.
EDIT: The "Data Conversion" will, as per its name, convert your imported value. That may not work, depending on how the original file was written. The "derived column" cast will be your other option, which won't change the actual value, just tell the compiler to interpret those bits as another type (unsigned int).
If I understand your question right:
One of the ways is to use a "Derived Column" transformation. Choose to add a new column in it.
If the data you are fetching is of the DT_WSTR data type, you can use the following expression to replace '}' with '', then cast it according to the field you want to load (here I am using DT_I4):
(DT_I4) REPLACE(character_expression, searchstring, replacementstring)
Then map the new column to the destination.
Hope it helps.

SQL storing MD5 in char column

I have a column of type char(32) where I want to store an MD5 hash key. The problem is I've used SQL to update the existing records using the HashBytes() function, which creates values like
:›=k! ©úw"5Ýâ‘<\
but when I do the insert via .NET it comes through as
3A9B3D6B2120A9FA772235DDE2913C5C
What do I need to do to get these to match up? Is it the encoding?
HashKey isn't a SQL function, did you mean HASHBYTES? Some actual code would help. SQL appears to be computing the raw binary hash and displaying it as ASCII characters.
.NET is computing the hash, then converting it to hexadecimal (or so it appears). CHAR(32) isn't a good way to store raw binary data, you would want to use the BINARY type.
An Example in SQL:
SELECT SUBSTRING(sys.fn_varbintohexstr(HASHBYTES('MD5',0x2040)),3, 32)
And an Example in .NET:
// Requires: using System; using System.Security.Cryptography;
using (MD5 md5 = MD5.Create())
{
    // Hash the sample bytes 0x20 0x40, then format the digest as hex.
    var data = new byte[] { 0x20, 0x40 };
    var hashed = md5.ComputeHash(data);
    var hexHash = BitConverter.ToString(hashed).Replace("-", "");
    Console.Out.WriteLine("hexHash = {0}", hexHash);
}
These will both produce the same value. (Where 0x2040 is sample data).
You can either store the hexadecimal data as CHAR(32), or as BINARY(16). Storing the binary data is twice as space-efficient as storing it as hex. What you should not be doing is storing the binary data as CHAR(16).
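As a sketch of that approach (SQL Server syntax; the table and column names are made up): store the raw 16-byte digest in BINARY(16), and when you need to compare it against a hex string produced elsewhere, convert the string back to binary (conversion style 2 means a hex string without the 0x prefix).

CREATE TABLE hashes (id INT PRIMARY KEY, md5_raw BINARY(16));
INSERT INTO hashes (id, md5_raw) VALUES (1, HASHBYTES('MD5', 'some data'));
SELECT id FROM hashes
WHERE md5_raw = CONVERT(BINARY(16), '3A9B3D6B2120A9FA772235DDE2913C5C', 2);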
It's not clear what you mean by "when I do the insert via .NET" - but you shouldn't be storing binary data just in its raw form, as it looks like you're doing using HashKey(). (Do you definitely mean HashKey, by the way? I can't find a reference for it, but there's HashBytes...)
Two common options are to encode the raw binary data as hex - which it looks like you're doing in the second case - or to use base64. Either way should be easy from .NET (Base64 marginally easier, using Convert.ToBase64String) and you probably just need to find the equivalent SQL Server function.
MD5 is typically stored in hex encoding. I'd guess that your HashKey() SQL function is not hex-encoding the MD5 hash, rather it's just returning the ASCII characters representing the hash, while your .NET method is hex-encoding. If you store your MD5 hashes consistently as hex (or not - up to you, but they are usually stored as hex), then the results between the two should always be consistent.
For example, the : symbol from your SQL hash is the first character returned from HashKey(). In the .NET method, the first 2 characters are 3A. Hex 3A is 58 in decimal, and ASCII code 58 is the colon (:) character. Similarly, you can work your way through each other character and do the hex conversion.
See any ASCII codes table for reference, e.g. http://www.asciitable.com/