MySQL - refusing to run set names? - mysql

Quick question as I have never run into this before.
On a webhost I am running the query:
SET NAMES 'utf8'
This is returning the following error:
Error: Unknown system variable 'NAMES'
I haven't run across this before. I get similar errors when trying to specify CURRENT_TIMESTAMP as a default column value as well as setting the collation of a table.
The MySQL queries I am running have worked on literally hundreds of hosting accounts before this one. On contacting the host I was fobbed off saying it was probably my code.
Is it likely that this is a dodgy MySQL install? The host says they are running MySQL 5.

SET NAMES has been available since MySQL 4.1, which brought large-scale changes to character set handling and full UTF-8 support. I'm fairly sure you are dealing with a MySQL version older than 4.1. Try running
SELECT VERSION();
as a1ex07 has recommended.
Older versions of MySQL can only handle 8-bit character data. They can still store UTF-8 data as byte sequences, but they are not aware of it. There are several drawbacks to storing UTF-8 in MySQL <4.1. For example, a string can exceed a column's length limit even though its number of characters would fit, because the limit is applied to bytes. Also, the modern string comparison functions do not exist (they correctly compare special characters and different ways of writing them, e.g. "ß" vs. "ss" in German).
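To make the column-limit point concrete, here is a minimal Python sketch. Python is used only for illustration, and `fits_in_bytes` is a hypothetical helper, not a MySQL API. It assumes the length limit is applied to stored bytes, as described above for servers that are not UTF-8 aware.

```python
def fits_in_bytes(text, byte_limit):
    """Return True if the UTF-8 encoding of text fits in byte_limit bytes."""
    return len(text.encode("utf-8")) <= byte_limit


word = "Straße"
char_count = len(word)                  # number of characters
byte_count = len(word.encode("utf-8"))  # number of stored bytes; "ß" needs two

print(char_count, byte_count)
print(fits_in_bytes(word, 6))  # the limit a character-based count would suggest
```

A six-character word can therefore be rejected or cut off by a column that nominally holds six characters.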

Related

MySQL to postgres migration issue

I want to migrate my project from MySQL to PostgreSQL. One of my MySQL tables has a column whose character set is utf8mb4. What is the PostgreSQL equivalent for setting a column's encoding?
utf8mb4 is MySQL's character set for full UTF-8, including characters that need four bytes. However, as the documentation clarifies:
Requires a maximum of four bytes per multibyte character.
So not every character is actually stored in four bytes; four is only the maximum, and UTF-8 is a variable-width encoding that never needs more than that. You should therefore be able to migrate your utf8mb4 data into a UTF-8 encoded PostgreSQL database without problems, at least in theory. Note that PostgreSQL sets the encoding per database rather than per column.
In practice you never know, so first back up your MySQL database (then you need not worry about changing it if the source data turns out to need fixing). Export the database and edit the table and column definitions so they no longer mention utf8mb4: either leave the encoding unspecified, if you can rely on the PostgreSQL database using UTF-8 as its encoding, or create the target database with UTF-8 specified explicitly. Then run the inserts. Take a few samples of data from the original database and compare them with what PostgreSQL returns for the same rows. If they match out of the box, theory and practice agree. If not, you will need to research the cause of the problem you see.

Possible to have only some fields use utf8mb4 with Laravel, Doctrine, and MySQL?

We have an application running Laravel 5.6 on MySQL 5.6. We can't upgrade those yet. We're hoping to fix an issue with accepting "special characters" in a form, but without upgrading MySQL yet.
I've changed the collation and character set of select relevant columns, and also tried updating the whole table the same way, though other columns are still MySQL's "utf8" (aka utf8mb3)... but the special characters are not persisting. We're getting mojibake ("garbled text that is the result of text being decoded using an unintended character encoding"). For example, when we should have 𝘈Ḇ𝖢𝕯٤ḞԍНǏ𝙅ƘԸⲘ𝙉০Ρ𝗤Ɍ𝓢ȚЦ𝒱Ѡ𝓧ƳȤѧᖯć𝗱ễ𝑓𝙜Ⴙ𝞲𝑗𝒌ļṃʼnо𝞎𝒒ᵲꜱ𝙩ừ𝗏ŵ𝒙𝒚ź, we instead get ????Ḇ????????٤ḞԍНǏ????ƘԸⲘ????০Ρ????Ɍ????ȚЦ????Ѡ.
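As a rough illustration of the pattern in that sample: every character that survives needs at most three bytes in UTF-8, and every character that turns into "????" needs four. The Python sketch below (illustration only; `four_byte_chars` and `simulate_three_byte_loss` are hypothetical helpers) reproduces that pattern under the assumption that some layer replaces each byte of a four-byte sequence with "?". It does not establish which layer of the stack does this.

```python
def four_byte_chars(text):
    """Return the characters of text whose UTF-8 encoding takes four bytes."""
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


def simulate_three_byte_loss(text):
    """Replace each four-byte character with one '?' per byte (assumed model)."""
    return "".join(
        "?" * 4 if len(ch.encode("utf-8")) == 4 else ch
        for ch in text
    )


sample = "𝘈Ḇ𝖢𝕯٤"  # first few characters of the example above
print(four_byte_chars(sample))
print(simulate_three_byte_loss(sample))
```

Running `four_byte_chars` over real form input is a quick way to find out which values can only be stored in a utf8mb4 column.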
The "special characters" (𝘈Ḇ𝖢𝕯٤Ḟ…) are being passed—intact—from the front-end to the backend, and then along in the backend—intact—as the stack sets the information as the value of a Doctrine Entity property—still intact. Then doctrine persists the info… and things get complicated as we move deeper into the lower levels of the ORM. I've yet to debug it that deep, but may have to because so far the database saves the mojibake, not the intact "special" characters.
I've also added character set and collation optional values to property declarations on the relevant Doctrine entity.
Laravel's database configuration has charset and collation settings too, but while I do need utf8mb4 for these few fields, the rest are still using utf8mb3, so I'm unsure how setting the Laravel config values might affect things.
I've tried various permutations of the settings, but not all of them yet. Since I'll be away for a couple of days, I wanted to post this question in the hope that somebody else has some helpful advice or links to information. I've found helpful information about converting an application or database to utf8mb4, but nothing about a "partial conversion" like the one I'm trying here.
So, my question is this: Has anybody here come across this before? Has anybody tried to set just some fields to use utf8mb4 without upgrading everything?
I don't have an easy-to-replicate example of the failure, but I can produce one in several days if need be.
Otherwise, thanks for reading.

MySQL Workbench Connection Encoding

While testing some code I stumbled on the following MySQL error:
Error Code: 1267. Illegal mix of collations (utf8_general_ci,IMPLICIT) and ( utf8mb4_general_ci,COERCIBLE) for operation '='
I was using a WHERE clause on a standard MySQL UTF-8 collation column, comparing it against a value that contained a 4-byte character. Unless I misunderstood what I read, I found the following information:
MySQL's original UTF-8 implementation was incomplete (supporting a maximum of 3 bytes per character)
The way to solve this is a new character set called utf8mb4, which is by no means a new encoding but is only used by MySQL to patch their original mistake.
On my end I see no reason to use the original MySQL UTF-8 implementation, since it's incomplete. So I made a few server-side configuration changes to make sure all defaults point to utf8mb4. Everything seems fine in my application now: I can use 🐼 characters in my form without having to worry about MySQL.
My remaining problem is that when I connect with MySQL Workbench, the encoding seems to be forced to UTF-8. So even though my application works correctly, if I want to run tests directly in MySQL Workbench, I get the "Illegal mix of collations" error unless I run this fix (in Workbench) after starting the application:
SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'
I found this old question (MySQL Workbench charset) where it seemed impossible to override the setting. Even after spending far too much time searching for the config, I cannot believe this is still the case?
For now, I'm afraid, you will have to live with that. There's a WL (worklog) for MySQL to rename that encoding to utf8 (throwing out the existing 3-byte variant). So it makes sense to keep utf8 in MySQL Workbench; otherwise we would have to use different settings for different servers, which makes things more complicated.

MySQL truncates latin-1 data when inserting to UTF-8 column. How to get an error?

OK, I had a problem when trying to insert a string that was believed to be UTF-8 but was really latin-1. The problem was that behaviour varied between servers:
Windows gives an error when confronted with a non-UTF-8 char: "Incorrect string value: '\xD1OL S....' for column [...]"
Linux quietly inserts whatever I give it until it finds the first char it doesn't like, and then truncates the rest of the string.
For once, I find that the Windows behaviour is the one I'd rather have. I've been looking and haven't found an option to make the MySQL Linux server more strict.
Do you know of one?
Thanks!
Well, that was quick. The problem was that we had a 5.6 Linux server that was not in strict mode, though our DBA had assured me that it was, which made us think it was a Linux problem.
So, just activate STRICT_ALL_TABLES. The MySQL SQL modes and how to activate them are described here:
http://dev.mysql.com/doc/refman/5.6/en/sql-mode.html
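The difference between the two behaviours can be sketched outside MySQL. The following Python snippet is only an illustration, and both function names are hypothetical: `decode_strict` rejects the latin-1 bytes the way the strict server does, while `decode_truncating` keeps the valid prefix, mimicking the silent truncation described in the question. The same strict check could also be run in the application before the data reaches the database.

```python
def decode_strict(raw):
    """Decode raw as UTF-8; raises UnicodeDecodeError on invalid bytes."""
    return raw.decode("utf-8")


def decode_truncating(raw):
    """Decode raw as UTF-8, silently dropping everything from the first bad byte."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that is not valid UTF-8
        return raw[:exc.start].decode("utf-8")


latin1_bytes = "ESPAÑOL".encode("latin-1")  # "Ñ" becomes the single byte 0xD1

print(decode_truncating(latin1_bytes))
try:
    decode_strict(latin1_bytes)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```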

run mysql without collation (utf-8 only)

I run a sqlite3 database with utf8-strings from many languages. For various reasons I want to move to mysql, but I constantly run into trouble because of the mysql-collation feature.
One problem is that I am not even able to reliably know what is in my database. (For example I get "?" for non-latin characters and "�" for latin-based characters like öé, etc. - but I have absolutely no idea whether the problem lies in the import from sqlite3 to mysql or in reading from the mysql-database.)
Is there a way to get rid of this "feature" and let mysql do what I tell it without trying to be smart? I use UTF-8 everywhere and I never need any mangling of strings: Input is always UTF-8 and output should be always UTF-8. Also I really would like to know what really is stored in the database - i.e. without a collation-feature corrupting the data during readout.
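The two symptoms described above typically come from two different kinds of mismatch, which the following Python sketch imitates. It is an illustration under assumptions, not a diagnosis of this particular setup, and the helper names are hypothetical. "�" appears when bytes that are not valid UTF-8 (for example latin-1 bytes) are displayed as UTF-8, while "?" appears when text is converted into a character set that cannot represent it.

```python
def show_as_utf8(raw):
    """Decode raw as UTF-8, substituting U+FFFD for invalid bytes."""
    return raw.decode("utf-8", errors="replace")


def squeeze_into_latin1(text):
    """Convert text to latin-1 and back, substituting '?' where latin-1 has no match."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


print(show_as_utf8("ö".encode("latin-1")))  # latin-1 bytes read as UTF-8
print(squeeze_into_latin1("я"))             # Cyrillic forced into latin-1
print("ö".encode("utf-8").hex())            # the bytes a UTF-8 "ö" should have
```

To see what is really stored, compare such expected bytes with the output of MySQL's HEX() function on the column; that bypasses any conversion done on the way out.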
You could use the MySQL VARBINARY column type, which stores a sequence of arbitrary bytes without interpreting them in any particular charset (or maybe VARCHAR BINARY, which is subtly different).
MySQL uses latin1_swedish_ci unless you specify something different explicitly. That's the opposite of smart. You have to be smart and change that default. This can be done with e.g. the --character-set-server and --collation-server command line options. See Specifying Character Sets and Collations for other means and further options.