Changing nvarchar(MAX) to nvarchar(n) in database - mysql

I just changed a nvarchar(MAX) field in a table to nvarchar(250).
Could someone please tell me what happens to the data if there was an entry larger than 250 characters?
My concern is not with the visible data, but what happens behind the scenes:
What is done to the data which overshoots the limit of that container
of data?
I read in a few places that the table has to be deleted and re-created. Is this true, and why? I didn't see any of the errors that others received.
Is there a way to recover the truncated data after making this change? (I don't want to do it, but I'm curious.)

If you altered the column from nvarchar(MAX) to nvarchar(250) and did not receive an error, it means that none of the rows contains more than 250 characters. That is why SQL Server changed the column length successfully, and your data is accurate and complete.
If any row contains more than 250 characters, SQL Server raises an error and the ALTER statement fails, so the column length is not changed:
Msg 8152, Level 16, State 13, Line 12
String or binary data would be truncated.
The statement has been terminated.
If SET ANSI_WARNINGS is OFF while you alter the column length, SQL Server changes the length without any warning and truncates the extra data.
By default, ANSI_WARNINGS is ON, so the user is warned.
I think that once data is truncated, it can't be recovered later.
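Before shrinking a column, it can help to count the rows that would not fit. The sketch below uses Python's built-in sqlite3 module purely as a stand-in database so that it runs anywhere; the table name notes, the column body, and the helper count_oversized are hypothetical. On SQL Server the length function is LEN(...) rather than LENGTH(...).

```python
import sqlite3

def count_oversized(conn, table, column, new_limit):
    """Return how many rows hold a value longer than new_limit characters."""
    # Identifiers cannot be bound as parameters, so this sketch assumes
    # table and column come from trusted code, not from user input.
    sql = f"SELECT COUNT(*) FROM {table} WHERE LENGTH({column}) > ?"
    return conn.execute(sql, (new_limit,)).fetchone()[0]

# Stand-in data: one short value and one 300-character value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)", [("short",), ("x" * 300,)])

oversized = count_oversized(conn, "notes", "body", 250)
print(oversized)  # 1: only the 300-character row exceeds the limit
```

A result of zero suggests the column can be shortened without losing anything.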

The system should prevent you or at least warn you of possible data loss when changing column length if any row exceeds the new length.
Depending on DBMS and version, you may even not be able to change column length.
However, if you don't have any rows exceeding 250, as you said, then there should be no problem.
There is no way to recover truncated data unless you have access to a database backup taken just before the change.
On a side note, regardless of what you intend to do with that change, I would suggest avoiding variable-length columns.
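In some contexts MySQL reserves the maximum possible length for a variable-length column, regardless of whether a row holds 15 characters or 45 or 250; MEMORY tables, for example, use a fixed-length row format. Ordinary on-disk VARCHAR values take only the space they need plus a length prefix.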
This, as you can imagine, eventually leads to bottlenecks in the system.
(Maybe you don't have a database large enough for this to show effects, but my motto is "forewarned is forearmed" )

Related

Best methods to avoid MySQL 1406 errors on VARCHAR

My server is using a MySQL DB, connecting to it via the C++ connector. I'm nearing production and I've been spending some time trying to break things as part of hardening the server.
One action item I had was to see what would happen if I execute a statement with a string that is longer than VARCHAR. For example, if I have a column defined as VARCHAR(4) and then set it to the string "hello".
This of course throws an exception with the error code 1406 (Data too long for column).
What I was wondering was if there was a good or standard way to defend against this? Obviously one thing is to check against the string length and truncate manually. I can do this, however there are many tables and several columns with VARCHAR. So my worry is updating server code if one of the columns using VARCHAR has its length increased (i.e. code maintainability)
Note that the server does do some validation up front. I'm just trying to defend against a subtle bug or corner case that lets something slip through.
A couple of other options on the table are to disable strict so it will give a warning and truncate or to convert VARCHAR to TEXT.
I was wondering a few things.
Is there a recommended method to handle this situation?
What are the disadvantages of disabling strict?
Is it worth (and is it possible) to query the DB at runtime for the VARCHAR lengths? Note that I'm using the C++ connector. I suppose I could also write a tool that is run before compiling which would extract the VARCHAR lengths from the SQL code used to generate the tables. But that then makes me wonder if I'm over-engineering this.
I'm just sorting through the possible approaches now and thought I'd seek advice from those with more experience with MySQL.
As an experienced database engineer, I would recommend a combination of the following two strategies:
1) If you know that there is a chance, however small, that data for your varchar(4) could be longer than 4, then make the varchar field larger than 4. For example, if you expect that the field can go as high as 8, then set the field to varchar(10). The beauty of using a varchar field instead of a char is that a varchar only uses the storage it needs.
2) If there is a real issue with data constantly being larger than the varchar field length, then you should write your own exception handler to trap the 1406 error. For the handler to work properly you will need a strategy for exactly how you want to handle the exception. For example, you could send an error to the user and ask them to fix the problem, you could accept the data but truncate it so it fits into the field, or you could send the error to a log file to be fixed at a later time.
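The two handling strategies in point 2 (report or truncate) can be sketched as below. Python is used here only for brevity; the helper names find_too_long and truncate_to_fit and the limits dictionary are hypothetical. In a real server the limits might be read once at startup from INFORMATION_SCHEMA.COLUMNS (the CHARACTER_MAXIMUM_LENGTH column), so that changing a column length does not require a code change.

```python
def find_too_long(row, limits):
    """Return the names of columns whose value exceeds its character limit."""
    return [
        col for col, value in row.items()
        if col in limits and value is not None and len(value) > limits[col]
    ]

def truncate_to_fit(row, limits):
    """Return a copy of row with over-long strings cut down to their limit."""
    return {
        col: value[:limits[col]] if col in limits and value is not None else value
        for col, value in row.items()
    }

# Hypothetical limits; assumed to be loaded from the schema in practice.
limits = {"code": 4, "name": 20}
row = {"code": "hello", "name": "Ada"}

print(find_too_long(row, limits))    # ['code']
print(truncate_to_fit(row, limits))  # {'code': 'hell', 'name': 'Ada'}
```

Reporting the offending columns keeps the decision (reject, truncate, or log) in one place instead of scattering length checks across the code base.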

Can I check if data was truncated after query?

If I have a table with some varchar columns, whose lengths will obviously be limited, then I would have to show on the front-end whenever insertion of too large values fails. For example, if the limit on the name column is 20, but someone enters a name that is 30 characters long, I should notify them of the error.
This gets to be a lot of work when the application becomes big.
What I would like, to make life a bit easier, and skip taking care of individual limits for every step of the users' journey, is to just carry on with the normal functioning of the application, but show them a warning that their data was not saved in entirety because it was too long. So if MySQL would provide some method that would allow me to ask if all data was saved in its entirety, or some strings were truncated due to their respective varchar fields being shorter (or maybe a property on the MySQLi object that I can check), then my main method for saving data in the database could always check that after any inserts or updates have been executed and just issue a warning on the next page load.
Does MySQL provide such functionality?
Sure you can. MySQL raises a warning when data is truncated.
You can check whether any warning occurred by reading @@warning_count:
SELECT @@warning_count;
Or
SHOW COUNT(*) WARNINGS;
To see which warnings occurred:
SHOW WARNINGS [LIMIT [offset,] row_count]
More info:
http://dev.mysql.com/doc/refman/5.0/en/show-warnings.html
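A central save method could run SHOW WARNINGS right after each insert or update and keep only the truncation messages. The sketch below is written in Python against a generic DB-API style cursor; the helper truncation_warnings and the FakeCursor class are hypothetical, and FakeCursor exists only so the example runs without a server. It assumes truncation is reported with warning code 1265 ("Data truncated for column ...").

```python
TRUNCATION_CODE = 1265  # assumed MySQL code for "Data truncated for column ..."

def truncation_warnings(cursor):
    """Return the messages of truncation warnings from the previous statement."""
    cursor.execute("SHOW WARNINGS")
    # Each row of SHOW WARNINGS is (Level, Code, Message).
    return [msg for _level, code, msg in cursor.fetchall() if code == TRUNCATION_CODE]

class FakeCursor:
    """Stand-in for a real cursor; returns one canned truncation warning."""
    def execute(self, sql):
        self._rows = [("Warning", 1265, "Data truncated for column 'name' at row 1")]

    def fetchall(self):
        return self._rows

messages = truncation_warnings(FakeCursor())
print(messages)
```

The warnings have to be read on the same connection and before the next statement that generates messages, since each such statement replaces the list.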

maximum character size that a mysql query can handle

I am trying to run this query, but no values are being retrieved. I tried to find the length up to which values are returned; it was 76 characters.
Any suggestions?
SELECT tokenid FROM tokeninfo where tokenNumber = 'tUyXl/Z2Kpua1AvIjcY5tMG+KlEhnt+V/YfnszF5m1+q8ngYvw%L3ZKrq2Kmtz5B8z7fH5BGQXTWAoqFNY8buAhTzjyLFUS64juuvVVzI7Af5UAVOj79JcjKgdNV4KncdcqaijPQAmy9fP1w9ITj7NA==%';
The problem is not the length of the string you select with, but the characters that are stored in the database field itself. Check the tokenNumber field in your database schema: whether it is varchar, blob, or some other type, what its length is, and so on.
You can insert and select far more than 76 characters in any database, but you can also get fewer than 76, as in your case; it depends on how the field they are stored in is defined.
A quick way to see the tokeninfo table properties is to run this query:
SHOW COLUMNS FROM tokeninfo;
If the data types differ from what you expect them to be based on a CREATE TABLE statement, note that MySQL sometimes changes data types when you create or alter a table. The conditions under which this occurs are described in Section 13.1.10.3, Silent Column Specification Changes.
The maximum size is limited by the max_allowed_packet variable.
So, if you run
show variables like 'max_allowed_packet'
it will show you the limit. By default, it is set to 1047552 bytes.
If you want to increase it, add a line to the [mysqld] section of the server's my.cnf file:
max_allowed_packet=2M
and restart the MySQL server.

Trying to put too much data into mysql TEXT data type

Let's say that I have a html form (actually I have an editor - TinyMCE) which through PHP inserts a bunch of text into Mysql table.
I want to know the following:
If I have a TINYTEXT column in MySQL, what happens if the user tries to put more than 255 bytes of text into the table?
Does the application save the first 255 bytes and cut off the rest? Or does nothing get saved and MySQL issues a warning? Or none of the above?
Actually, what I want and intend to do is the following:
Limit the size of user form input by setting the column data type in Mysql to TEXT data type, which can hold maximum of 64 KB of text. I want to limit the amount of text that gets passed from user to database, so that user can't put too much data to the server at once.
So, basically, I want to know what happens, if the user puts more text through TinyMCE editor than 65535 bytes, assuming TEXT data type in mysql table.
In non-strict SQL mode, which was the default before MySQL 5.7, MySQL truncates the data if it's too long and sends a warning.
SHOW WARNINGS;
Data truncated for foo ..
Just to be clear: the data will be saved, but you will be missing the part that was too large.
The default MySQL configuration truncates the data if the value is larger than the maximum size of the column, and produces a non-blocking warning.
If you want a blocking error, you have to set sql_mode to STRICT_ALL_TABLES:
dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html#sqlmode_strict_all_tables
IMHO the best way is to handle this error in the application software.
Hope this helps
If you enter too much data to a TEXT field in MySQL it will insert the row anyway but with that field truncated to the maximum length, and issue a warning.
Even if MySQL did prevent the row from being added it would not be a good way of limiting the length of data that a user can enter. You should check the length of the POSTed string in PHP, and not run the query at all if it is too long - and perhaps tell the user why their data wasn't entered.
As well as this you can prevent the user from entering too many characters at the client side (although you should always do the check server side as well because someone could bypass the client side limit). It appears that there is no built-in way of doing this in TinyMCE, but it is possible by writing a callback: Limit the number of character in tinyMCE
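One detail worth keeping in mind for the server-side check: the TEXT limit of 65535 is measured in bytes, not characters, so multi-byte text hits it sooner. The question uses PHP; the sketch below shows the same idea in Python, with a hypothetical helper fits_in_text and the assumption that the column stores UTF-8.

```python
TEXT_MAX_BYTES = 65535  # capacity of a MySQL TEXT column, in bytes

def fits_in_text(value, encoding="utf-8"):
    """True if value, once encoded, fits in a TEXT column."""
    return len(value.encode(encoding)) <= TEXT_MAX_BYTES

print(fits_in_text("a" * 65535))  # True: 65535 one-byte characters
print(fits_in_text("é" * 40000))  # False: each 'é' takes two bytes in UTF-8
```

A check like this can run before the query is issued, so the user gets a clear message instead of silently shortened text.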

What will happen to existing data if I change the collation of a column in MySQL?

I am running a production application with a MySQL database server. I forgot to set the column's collation from latin to utf8_unicode, which results in strange data when saving multi-language data to the column.
My question is, what will happen with my existing data if I change my collation to utf8_unicode now? Will it destroy or corrupt the existing data or will the data remain, but the new data will be saved as utf8 as it should?
I will change with phpMyAdmin web client.
The article http://mysqldump.azundris.com/archives/60-Handling-character-sets.html discusses this at length and also shows what will happen.
Please note that you are mixing up a CHARACTER SET (actually an encoding) with a COLLATION.
A character set defines the physical representation of a string in bytes on disk. You can make this visible, using the HEX() function, for example SELECT HEX(str) FROM t WHERE id = 1 to see how MySQL stores the bytes of your string. What MySQL delivers to you may be different, depending on the character set of your connection, defined with SET NAMES .....
A collation is a sort order. It is dependent on the character set. For example, your data may be in the latin1 character set, but it may be ordered according to either of the two german sort orders latin1_german1_ci or latin1_german2_ci. Depending on your choice, Umlauts such as ö will either sort as oe or as o.
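The effect of the two German sort orders can be imitated with two sort keys. This is only a rough Python sketch: the key functions german1_key and german2_key are hypothetical, handle nothing but the letter ö, and ignore every other rule of the real collations.

```python
words = ["Ofen", "Öl", "Oder"]

def german1_key(word):
    # Dictionary-style order: treat ö like o.
    return word.lower().replace("ö", "o")

def german2_key(word):
    # Phone-book-style order: treat ö like oe.
    return word.lower().replace("ö", "oe")

print(sorted(words, key=german1_key))  # ['Oder', 'Ofen', 'Öl']
print(sorted(words, key=german2_key))  # ['Oder', 'Öl', 'Ofen']
```

The stored strings are identical in both cases; only the comparison rule changes, which is why a collation change alone does not rewrite the data.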
When you are changing a character set, the data in your table needs to be rewritten. MySQL will read all data and all indexes in the table, make a hidden copy of the table which temporarily takes up disk space, then moves the old table into a hidden location, moves the hidden table into place and then drops the old data, freeing up disk space. For some time inbetween, you will need two times the storage for that.
When you are changing a collation, the sort order of the data changes but not the data itself. If the column you are changing is not part of an index, nothing needs to be done besides rewriting the frm file, and sufficiently recent versions of MySQL should not do more.
When you are changing a collation of a column that is part of an index, the index needs to be rewritten, as an index is a sorted excerpt of a table. This will again trigger the ALTER TABLE table copy logic outlined above.
MySQL tries to preserve data while doing this: as long as the data you have can be represented in the target character set, the conversion will not be lossy. Warnings will be printed if data truncation occurs, and data which cannot be represented in the target character set will be replaced by ?.
Running a quick test in MySQL 5.1 with a VARCHAR column set to latin1_bin I inserted some non-latin chars
INSERT INTO Test VALUES ('英國華僑');
I select them and get rubbish (as expected).
SELECT text from Test;
gives
text
????
I then changed the collation of the column to utf8_unicode and re-ran the SELECT and it shows the same result
text
????
This is what I would expect - It will keep the data and the data will remain rubbish, because when the data was inserted the column lost the extra character information and just inserted a ? for each non-latin character and there is no way for the ???? to again become 英國華僑.
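The loss seen in the test above can be reproduced outside the database. This Python sketch uses the standard latin-1 codec as an approximation of MySQL's latin1 (an assumption; MySQL's latin1 is closer to cp1252) and shows that each unencodable character is stored as a single ? byte.

```python
original = "英國華僑"

# errors="replace" substitutes '?' for every character latin-1 cannot encode,
# which mirrors the ???? result observed in the test above.
stored = original.encode("latin-1", errors="replace")
print(stored)        # b'????'
print(stored.hex())  # 3f3f3f3f

# Reinterpreting the stored bytes as UTF-8 cannot bring the text back.
recovered = stored.decode("utf-8")
print(recovered == original)  # False
```

Four different characters all became the same byte 0x3F, so nothing in the stored value records which characters were there originally.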
Your data will stay in place but it won't be fixed.
Valid data will be properly converted:
When you change a data type using CHANGE or MODIFY, MySQL tries to convert existing column values to the new type as well as possible. Warning: This conversion may result in alteration of data.
http://dev.mysql.com/doc/refman/5.5/en/alter-table.html
... and more specifically:
To convert a binary or nonbinary string column to use a particular character set, use ALTER TABLE. For successful conversion to occur, one of the following conditions must apply: [...] If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.
http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html
So your problem is invalid data, e.g., data encoded in a different character set. I've tried the tip suggested by the documentation and it basically ruined my data, but the reason is that my data was already lost: running SELECT column, HEX(column) FROM table showed that multibyte chars had been inserted as 0x3F (i.e., the ? symbol in Latin1). My MySQL stack had been smart enough to detect that input data was not Latin1 and convert it into something "compatible". And once data is gone, you can't get it back.
To sum up:
Use HEX() to find out if you still have your data.
Make your tests in a copy of your table.
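As a reference for reading HEX() output, the sketch below prints the bytes one character would have in a few situations. It is written in Python with a hypothetical helper hex_of, and again uses Python's latin-1 codec as a stand-in for MySQL's latin1.

```python
def hex_of(text, encoding, errors="strict"):
    """Return the encoded bytes of text as uppercase hex, like HEX() prints."""
    return text.encode(encoding, errors).hex().upper()

print(hex_of("ö", "latin-1"))           # F6: intact single-byte latin-1
print(hex_of("ö", "utf-8"))             # C3B6: intact two-byte UTF-8
print(hex_of("ö", "ascii", "replace"))  # 3F: replaced by '?', original lost
```

If every non-ASCII character in a column shows up as 3F, the data is already gone and no conversion will restore it.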
My question is, what will happen with my existing data if I change my collation to utf8_unicode now?
Answer: If you change to utf8_unicode_ci, nothing will happen to your existing data (which is already corrupt and will remain corrupt until you modify it).
Will it destroy or corrupt the existing data or will the data remain, but the new data will be saved as utf8 as it should?
Answer: After you change to utf8_unicode_ci, existing data will not be destroyed. It will remain the same as before (something like ????). However, if you insert new data containing Unicode characters, it will be stored correctly.
I will change with phpMyAdmin web client.
Answer: Sure, you can change collation with phpMyAdmin by going to Operations > Table options
CAUTION! Some problems are solved via
ALTER TABLE ... CONVERT TO ...
Some are solved via a 2-step process
ALTER TABLE ... MODIFY ... VARBINARY...
ALTER TABLE ... MODIFY ... VARCHAR...
If you do the wrong one, you will have a worse mess!
Do SELECT HEX(col), col ... to see what you really have.
Study this to see what case you have: Trouble with utf8 characters; what I see is not what I stored
Perform the correct fix, based on these cases: http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases
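Why the two fixes are not interchangeable can be illustrated at the byte level. In this Python sketch, decoding and re-encoding stands in for a real conversion (as CONVERT TO does), and leaving the bytes untouched stands in for the detour through VARBINARY, which only changes how the bytes are labelled. Python's latin-1 codec is used as an approximation of MySQL's latin1.

```python
# Case 1: the bytes are valid latin-1 and the column is labelled latin-1.
# A real conversion decodes the text and re-encodes it in the new charset.
latin1_bytes = "ö".encode("latin-1")
converted = latin1_bytes.decode("latin-1").encode("utf-8")
print(converted.hex())  # c3b6

# Case 2: the bytes are already UTF-8 but the column is labelled latin-1.
# Only the label should change; the bytes themselves stay as they are.
utf8_bytes = "ö".encode("utf-8")
relabelled = utf8_bytes
print(relabelled.decode("utf-8"))  # ö

# Applying the case 1 fix to case 2 data double-encodes it.
wrong = utf8_bytes.decode("latin-1").encode("utf-8")
print(wrong.hex())  # c383c2b6
```

The last result is the "worse mess": the text now reads as Ã¶ and needs a further repair step.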