Replace specific character MySQL search on utf8mb4_unicode_ci

I want to search my database for the character İ - "latin capital letter i with dot above (U+0130)" - and replace it with a regular I (U+0049).
For example, I want to transform "SİNG" to "SING".
The database collation is utf8mb4_unicode_ci.
I can find the characters using COLLATE utf8mb4_bin:
SELECT * FROM `benches` WHERE `inscription` LIKE '%İ%' COLLATE utf8mb4_bin
But I can't replace it.
UPDATE `benches` SET inscription = REPLACE(inscription, 'İ', 'I') WHERE INSTR(inscription, 'İ') > 0 COLLATE utf8mb4_bin
I get the error
#1253 - COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'latin1'
This is odd, because the database and column are definitely utf8mb4_unicode_ci.
So, what magic invocation do I need to search and replace a specific Unicode character from within a string?

The quick fix might be:
UPDATE `benches`
SET inscription = REPLACE(inscription, _utf8mb4'İ' COLLATE utf8mb4_bin, 'I')
WHERE INSTR(inscription, _utf8mb4'İ' COLLATE utf8mb4_bin) > 0
A better fix might be to execute this after connecting:
SET NAMES utf8mb4;
If neither of these works, please provide a test case that includes creating and populating a table, plus the UPDATE. It may take some experimentation to come up with another potential solution.
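The test case requested above could look roughly like the following sketch. The table name `benches_test` and the sample rows are made up for illustration; only the `inscription` column and the UPDATE come from the question. It assumes the client really sends UTF-8 bytes, which is what SET NAMES utf8mb4 declares to the server.

```sql
-- Declare that this connection sends and expects utf8mb4.
SET NAMES utf8mb4;

-- Hypothetical scratch table mirroring the column from the question.
CREATE TABLE benches_test (
  id INT AUTO_INCREMENT PRIMARY KEY,
  inscription VARCHAR(255)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

INSERT INTO benches_test (inscription) VALUES ('SİNG'), ('SING'), ('sing');

-- Byte-wise search: intended to select only rows containing U+0130.
SELECT id, inscription
FROM benches_test
WHERE inscription LIKE _utf8mb4'%İ%' COLLATE utf8mb4_bin;

-- The replacement from the answer, run against the scratch table.
UPDATE benches_test
SET inscription = REPLACE(inscription, _utf8mb4'İ' COLLATE utf8mb4_bin, 'I')
WHERE INSTR(inscription, _utf8mb4'İ' COLLATE utf8mb4_bin) > 0;

-- If error 1253 still appears, inspect what the session actually uses.
SHOW VARIABLES LIKE 'character_set_%';
SHOW VARIABLES LIKE 'collation_%';

DROP TABLE benches_test;
```

If the SHOW VARIABLES output lists latin1 for character_set_connection, that would be consistent with the error in the question: the string literal is treated as latin1, and utf8mb4_bin cannot be applied to it.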

I have had success with a query like:
UPDATE `benches`
SET inscription = REPLACE(inscription, 'İ', 'I')
WHERE inscription LIKE '%İ%' COLLATE utf8mb4_bin;

Related

COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4'

I am trying to support emoji like 😋 in my application. To make that work, I have to use charset: 'utf8mb4' in the database connection. But then my other search query stops working and throws an error like this:
select id, full_name, profile_pic from users where school_id = 1 and exists (select * from groupmembers where (group_id = '110') and (users.id = groupmembers.user_id)) and full_name like '%d%' COLLATE utf8_bin
ER_COLLATION_CHARSET_MISMATCH: COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4'
How can I make both work together? I am using the adonis.js framework, which uses knex for queries.

Collations are specific to a character set. If your character set is utf8mb4, then you could use collation utf8mb4_bin but not utf8_bin.
This may become more confusing in the future, because MySQL intends to make utf8 an alias for utf8mb4 (it currently refers to the older 3-byte utf8mb3). See https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html
You can check all the collations allowed by this query:
select collation_name
from information_schema.collation_character_set_applicability
where character_set_name = @@character_set_connection;
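Applied to the query from the question, the fix might look like the sketch below. It assumes the connection character set is utf8mb4, as the question states; the table and column names are taken from the question as-is.

```sql
-- Same query as in the question, with the collation switched to the
-- utf8mb4 variant so that it matches the connection character set.
SELECT id, full_name, profile_pic
FROM users
WHERE school_id = 1
  AND EXISTS (
    SELECT *
    FROM groupmembers
    WHERE group_id = '110'
      AND users.id = groupmembers.user_id
  )
  AND full_name LIKE '%d%' COLLATE utf8mb4_bin;

-- Another way to list the collations that belong to one character set.
SHOW COLLATION WHERE Charset = 'utf8mb4';
```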

Mysql where clause exact match returns not exact - "sàvko" = "savko"

The field has the collation "utf8mb4_unicode_ci".
How can I get only "savko" when I query with where field = 'savko'?

If you generally like utf8mb4_unicode_ci on your field but just want to do this one test in a different collation:
... WHERE field COLLATE utf8mb4_bin = 'savko'

I solved it by changing the field's collation to 'utf8mb4_bin'.
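The two approaches can be compared side by side. In this sketch the first statement needs no table at all; with an accent-insensitive collation such as utf8mb4_unicode_ci the first column is expected to be 1 and the second 0, matching the behaviour described in the question. The table name `my_table` and the VARCHAR(255) type in the second statement are assumptions for illustration.

```sql
-- Compare the same two strings under both collations.
SELECT
  _utf8mb4'sàvko' COLLATE utf8mb4_unicode_ci = _utf8mb4'savko' AS unicode_ci_equal,
  _utf8mb4'sàvko' COLLATE utf8mb4_bin        = _utf8mb4'savko' AS bin_equal;

-- Permanent change of the column collation (hypothetical table and type).
-- MODIFY restates the whole column definition, so repeat any NOT NULL,
-- DEFAULT, or similar attributes the real column has.
ALTER TABLE my_table
  MODIFY field VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
```

The per-query COLLATE keeps case- and accent-insensitive matching everywhere else, while the ALTER makes every comparison on that column byte-wise, including case.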

MySQL Diacritics insensitive search

I have a Romanian dictionary database. The word table has a column named Word which uses the utf8_romanian_ci collation. In this column I keep all the words. Most Romanian words have diacritics: acasă, mâine, etc.
I try to run a query which ignores the diacritics. Something like:
SELECT * FROM WordList where Word = 'acasa'
should return the word acasă
I tried:
SET NAMES utf8;
before the query, but it does not work.
I also tried
SELECT * FROM WordList where Word = 'acasa' COLLATE utf8_bin
It does not work either.
Any idea how to make this work?

Try adding COLLATE utf8_unicode_ci to the query:
SELECT *
FROM WordList
WHERE Word = _utf8 'acasa' COLLATE utf8_unicode_ci
Test on SQL Fiddle
More info:
MySQL: Unicode Character Sets
MySQL: Using COLLATE in SQL Statements

Special Characters and a simple select query

I have got a problem with a simple Select Query and special chars. I want to select the name Änne.
SELECT * FROM `names` WHERE `name` = 'Änne'
utf8_general_ci
Änne
Anne
okay, ...
utf8 general ci is a very simple collation. What it does it just
removes all accents then converts to upper case and uses the code of this sort of "base letter" result letter to compare.
http://forums.mysql.com/read.php?103,187048,188748
utf8_unicode_ci
Änne
Anne
why?
utf8_bin
Änne
utf8_bin seems to be the right choice at this point, but I have to make my search case-insensitive.
SELECT * FROM `names` WHERE `name` = 'änne'
utf8_bin
none
Is there no way to do so?
I could use PHP's ucwords() to uppercase the first letters, but I would prefer to find a DB solution.
Edit: ucwords('änne') = änne, so I can't use that either.
SELECT * FROM `names` WHERE lower(`name`) = 'änne'
is working for me, because i don't have a difference between 'Änne' and 'änne' in my DB.

What about:
SELECT * FROM `names` WHERE upper(`name`) = upper("änne")
Quoting doc:
The default character set and collation are latin1 and
latin1_swedish_ci, so nonbinary string comparisons are case
insensitive by default. This means that if you search with col_name
LIKE 'a%', you get all column values that start with A or a. To make
this search case sensitive, make sure that one of the operands has a
case sensitive or binary collation
That means you are getting case-sensitive results because you have set a binary collation. You can set the column collation to utf8_general_ci and override it in individual searches:
col_name COLLATE latin1_general_cs LIKE 'a%'

There is an error in your MySQL code:
SELECT * FROM names WHERE name = "Änne"
Remove the quotes around the table name and the field name.

Looking for case insensitive MySQL collation where "a" != "ä"

I'm looking for a MySQL collation for UTF8 which is case insensitive and distinguishes between "a" and "ä" (or more generally, between umlauted / accented characters and their "pure" form). utf8_general_ci does the former, utf8_bin the latter, but neither does both. If there is no such collation, what can I do to get as close as possible in a WHERE clause?

My recommendation would be to use utf8_bin and in your WHERE clause, force both sides of your comparison to upper or lower case.
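A minimal sketch of that recommendation, assuming a scratch table `names_bin` (a made-up name) whose column uses utf8_bin, and a connection that sends UTF-8:

```sql
-- Hypothetical table whose column uses a binary collation.
CREATE TABLE names_bin (
  name VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_bin
);

INSERT INTO names_bin VALUES ('Änne'), ('Anne'), ('änne');

-- Lowercasing both sides makes the byte-wise comparison ignore case,
-- while 'ä' and 'a' remain different bytes and so stay distinct.
SELECT name
FROM names_bin
WHERE LOWER(name) = LOWER('Änne');

DROP TABLE names_bin;
```

Under these assumptions the SELECT should match 'Änne' and 'änne' but not 'Anne'. Wrapping the column in LOWER() generally prevents an ordinary index on that column from being used for the lookup, which may matter on large tables.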

It works fine here with utf8_german2_ci as collation:
SELECT * FROM tablename WHERE fieldname LIKE "würz%" COLLATE utf8_german2_ci

I checked utf8_bin like this
CREATE TABLE tmp2 (utf8_bin VARCHAR(20) CHARACTER SET utf8 COLLATE utf8_bin);
INSERT INTO tmp2 VALUES ('nói');
select * from tmp2 where utf8_bin='noi';

You could try utf8_swedish_ci, it's both case insensitive and distinguishes between a and ä (but treats e.g. ü like y).
Collations are language-dependent, and it seems German doesn't have its own collation in MySQL. (I had a look at your profile, which says you're German.)
