MySQL: 'group by' without losing marks

I have a table like this:
name
Smith
Smith
Perez
Pérez
I would like to eliminate duplicates like Smith but preserve both Perez and Pérez (e and é).
If I use 'group by' I get two rows (Smith and one of the two Perez/Pérez), but I would like to get three rows (Smith, Perez, Pérez).
The same happens with Sjögren and Sjogren, etc.
Thanks

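1) First, check whether your table uses a utf8 character set (the collation name starts with the character set name):
select table_name, table_collation
from information_schema.tables
where table_schema = 'your_database'
2) If it does not (otherwise skip to step 3), convert your table to utf8 so that it supports special characters. CONVERT TO also converts the existing columns, whereas a plain CHARACTER SET clause only changes the table default:
ALTER TABLE your_table CONVERT TO CHARACTER SET utf8;
3) Group on a binary utf8 collation. A _ci collation such as utf8_general_ci treats e and é as equal, so it would still merge Perez and Pérez; utf8_bin compares the exact characters (which also makes the comparison case-sensitive):
select name collate utf8_bin as exact_name from your_table group by exact_name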

Try using utf8_unicode_ci rather than utf8_general_ci - it uses a more accurate comparison algorithm.
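To see why the collation decides whether Perez and Pérez collapse into one group, here is a minimal Python sketch. It only imitates the two behaviours outside MySQL: fold() is a hypothetical stand-in for an accent- and case-insensitive collation (it is not MySQL's actual algorithm), and exact string equality stands in for a _bin collation.

```python
import unicodedata


def fold(text):
    """Hypothetical stand-in for a *_ci collation: drop accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


names = ["Smith", "Smith", "Perez", "P\u00e9rez"]  # "P\u00e9rez" is Pérez

# Grouping on the folded key merges Perez and Pérez (two groups).
groups_ci = sorted({fold(n) for n in names})

# Grouping on the exact string keeps them apart (three groups).
groups_bin = sorted(set(names))

print(groups_ci)
print(groups_bin)
```

The second result is the one the question asks for: duplicates of Smith are removed, while the accented and unaccented spellings stay separate.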

Related

Special characters select issues with MySQL

Based on http://www.i18nqa.com/debug/utf8-debug.html I want to search my MySQL table to see whether I have rows with encoding problems.
If I run the following query:
select t.col1 from table t where t.col1 like '%Ãš%'
it returns every t.col1 that contains the characters 'as'.
How can I change the query so that it fetches only the rows containing '%Ãš%', and not all those that contain '%as%'?

Try this if you are using the latin1_swedish_ci collation:
select t.col1 from table t where t.col1 regexp '^[Ãš]';

With MySQL's collations, case-folding and accent-stripping go together.
If you want neither, use the ..._bin collation for the character set you are using.
WHERE foo LIKE '%Ãš%' COLLATE utf8_bin
Even faster would be to declare foo to be COLLATE utf8_bin instead of whatever you have. (Note: the default for utf8 is utf8_general_ci.)

Special Characters and a simple select query

I have a problem with a simple SELECT query and special characters. I want to select the name Änne.
SELECT * FROM `names` WHERE `name` = 'Änne'
Result with utf8_general_ci:
Änne
Anne
okay, ...
utf8 general ci is a very simple collation. What it does it just
removes all accents then converts to upper case and uses the code of this sort of "base letter" result letter to compare.
http://forums.mysql.com/read.php?103,187048,188748
Result with utf8_unicode_ci:
Änne
Anne
Why?
Result with utf8_bin:
Änne
utf8_bin seems to be the right choice at this point, but I have to do my search case-insensitively.
SELECT * FROM `names` WHERE `name` = 'änne'
Result with utf8_bin:
none
Is there no way to do this?
I could use PHP's ucwords() to uppercase the first letters, but I would prefer to find a DB solution.
Edit: ucwords('änne') = änne, so I can't use that either.
SELECT * FROM `names` WHERE lower(`name`) = 'änne'
is working for me, because I don't have a difference between 'Änne' and 'änne' in my DB.
what about:
SELECT * FROM `names` WHERE upper(`name`) = upper("änne")
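The upper()/lower() approach works because it removes the case difference while leaving the accent in place, which an exact comparison then respects. A rough Python analogy, assuming nothing about MySQL internals (same_ignoring_case is a hypothetical helper name):

```python
def same_ignoring_case(a, b):
    """Compare two strings ignoring case but not accents."""
    return a.casefold() == b.casefold()


stored = ["\u00c4nne", "Anne"]  # Änne, Anne
search = "\u00e4nne"            # änne

# Only the accented name matches; plain "Anne" is rejected.
matches = [name for name in stored if same_ignoring_case(name, search)]
print(matches)
```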
Quoting doc:
The default character set and collation are latin1 and
latin1_swedish_ci, so nonbinary string comparisons are case
insensitive by default. This means that if you search with col_name
LIKE 'a%', you get all column values that start with A or a. To make
this search case sensitive, make sure that one of the operands has a
case sensitive or binary collation
That means the case-sensitive results occur because you have set a binary collation. You can set the column's collation to utf8_general_ci and override it in searches:
col_name COLLATE latin1_general_cs LIKE 'a%'
There is an error in your MySQL code:
SELECT * FROM names WHERE name = "Änne"
Remove the quotes around the table name and the field name.

mysql multiple collation

I have a table with translations into 10 languages, table(id, language_id, translation_text), with charset utf-8. How can I query this table with different collations, so that each language is sorted according to its own rules?
You can specify a collation when you perform the query:
SELECT translation_text
FROM translation
WHERE language_id = 42
ORDER BY translation_text COLLATE utf8_german2_ci
See Using COLLATE in SQL statements.

MySQL: char_length(), wrong value for Russian

I am using char_length() to measure the size of "Русский": strangely, instead of telling me that it's 7 chars, it tells me there are 14. Interestingly, if the query is simply...
SELECT CHAR_LENGTH('Русский')
...the answer is correct. However, if I query the DB instead, the answer is 14:
SELECT CHAR_LENGTH(text) FROM locales WHERE lang = 'ru-RU' AND name = 'lang_name'
Anybody got any ideas what I might be doing wrong? I can confirm that the collation is utf8_general_ci and the table is MyISAM.
Thanks,
Adrien
EDIT: My end objective is to be able to measure the lengths of records in a table containing single- and double-byte characters (e.g. English and Russian, but not limited to these two languages).
Because each of these characters takes two bytes in UTF-8. CHAR_LENGTH() counts characters, so a result of 14 suggests that the stored bytes are not being interpreted as UTF-8, for example because the data was inserted over a connection using a different character set.
See http://dev.mysql.com/doc/refman/5.5/en/string-functions.html#function_char-length
mysql> set names utf8;
mysql> SELECT CHAR_LENGTH('Русский'); -- result: 7
mysql> SELECT CHAR_LENGTH('test'); -- result: 4
create table test123 (
text VARCHAR(255) NOT NULL DEFAULT '',
text_text TEXT) Engine=Innodb default charset=UTF8;
insert into test123 VALUES('русский','test русский');
SELECT CHAR_LENGTH(text),CHAR_LENGTH(text_text) from test123; -- result: 7 and 12
I repeated the test after set names koi8r; (create table and so on) and got an invalid result.
So the solution is to recreate the table and reinsert all the data after running set names utf8.
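The 7-versus-14 discrepancy can be reproduced outside MySQL. The Python sketch below is only an illustration, on the assumption that the stored data was written through a connection whose character set did not match the bytes actually sent. Each Cyrillic letter occupies two bytes in UTF-8, and if those bytes are then read as a single-byte character set (latin1 is used here purely as an example), every byte is counted as one character.

```python
# "Русский" written with escapes so the exact code points are unambiguous.
word = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"

utf8_bytes = word.encode("utf-8")
print(len(word))        # number of characters
print(len(utf8_bytes))  # number of bytes

# Misreading the UTF-8 bytes as a single-byte charset turns each byte
# into its own "character".
misread = utf8_bytes.decode("latin-1")
print(len(misread))

# Reversing the misreading recovers the original text.
recovered = misread.encode("latin-1").decode("utf-8")
print(recovered == word)
```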
The function bases its answer on the nearest available character set:
in the case of a column, the column definition
in the case of a literal, the connection default
Review the column charset with:
SELECT CHARACTER_SET_NAME FROM information_schema.`COLUMNS`
where table_name = 'locales'
and column_name = 'text'
Be careful: this query is not filtered by table_schema.

Illegal mix of collations error in MySql

Just got this answer from a previous question and it works a treat!
SELECT username, (SUM(rating)/COUNT(*)) as TheAverage, Count(*) as TheCount
FROM ratings WHERE month='Aug' GROUP BY username HAVING TheCount > 4
ORDER BY TheAverage DESC, TheCount DESC
But when I stick this extra bit in it gives this error:
#1267 - Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (latin1_general_ci,IMPLICIT) for operation '='
SELECT username, (SUM(rating)/COUNT(*)) as TheAverage, Count(*) as TheCount FROM
ratings WHERE month='Aug'
AND username IN (SELECT username FROM users WHERE gender =1)
GROUP BY username HAVING TheCount > 4 ORDER BY TheAverage DESC, TheCount DESC
The table is:
id, username, rating, month
Here's how to check which columns are the wrong collation:
SELECT table_schema, table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE collation_name = 'latin1_general_ci'
ORDER BY table_schema, table_name,ordinal_position;
And here's the query to fix it:
ALTER TABLE tbl_name CONVERT TO CHARACTER SET latin1 COLLATE 'latin1_swedish_ci';
Check the collation of each table, and make sure that they all have the same collation.
After that, also check the collation of each table field that you use in the operation.
I encountered the same error, and that trick worked for me.
[MySQL]
In these (very rare) cases:
two tables that really need different collation types
values not coming from a table, but from an explicit enumeration, for instance:
SELECT 1 AS numbers UNION ALL SELECT 2 UNION ALL SELECT 3
you can compare the values between the different tables by using CAST or CONVERT:
CAST('my text' AS CHAR CHARACTER SET utf8)
CONVERT('my text' USING utf8)
See CONVERT and CAST documentation on MySQL website.
I was getting this same error in phpMyAdmin and applied the solution indicated here, which worked for me:
ALTER TABLE table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci
Illegal mix of collations MySQL Error
Also, I would recommend going with general instead of swedish, since utf8_general_ci is the default collation for utf8, and not using a language-specific collation unless your application actually uses that language (here, Swedish).
I think you should convert to utf8
-- set utf8 for connection
SET collation_connection = 'utf8_general_ci';
-- change CHARACTER SET of DB to utf8
ALTER DATABASE dbName CHARACTER SET utf8 COLLATE utf8_general_ci;
-- change CHARACTER SET of table to utf8
ALTER TABLE tableName CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
I also got the same error, but in my case the main problem was in the WHERE condition: the parameter I was checking contained an unknown hidden character (+%A0).
A0 converts to 160, which was outside the range of characters the database recognized, so it could not treat it as a character. Note also that my table column is varchar.
My solution was to check for characters like that and remove them before running the SQL command,
e.g. preg_replace('/\D/', '', $myParameter); (note that this pattern removes every non-digit character).
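If the hidden character is a non-breaking space (U+00A0, which is what the byte A0 represents in latin1), a narrower clean-up than stripping every non-digit is to remove just that character before building the query. A minimal Python sketch under that assumption; clean_parameter is a hypothetical helper, not part of any library:

```python
NBSP = "\u00a0"  # non-breaking space, byte 0xA0 in latin1


def clean_parameter(value):
    """Replace non-breaking spaces with ordinary spaces and trim the ends."""
    return value.replace(NBSP, " ").strip()


raw = "Aug" + NBSP
print(repr(raw), "->", repr(clean_parameter(raw)))
```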
Check that your users.gender column is an INTEGER.
Try: alter table users convert to character set latin1 collate latin1_swedish_ci;
You need to change each column Collation from latin1_general_ci to latin1_swedish_ci
I got this same error inside a stored procedure, in the WHERE clause. I discovered that the problem occurred with a locally declared variable that had previously been loaded from the same table/column.
I resolved it by casting the data to a single char type.
In short, this error is caused by MySQL trying to do an operation on two things which have different collation settings. If you make the settings match, the error will go away. Of course, you need to choose the right setting for your database, depending on what it is going to be used for.
Here's some good advice on choosing between two very common utf8 collations: What's the difference between utf8_general_ci and utf8_unicode_ci
If you are using phpMyAdmin you can do this systematically by working through the tables mentioned in your error message, and checking the collation type for each column. First you should check which is the overall collation setting for your database - phpMyAdmin can tell you this and change it if necessary. But each column in each table can have its own setting. Normally you will want all these to match.
In a small database this is easy enough to do by hand, and in any case if you read the error message in full it will usually point you to the right place. Don't forget to look at the 'structure' settings for columns with subtables in as well. When you find a collation that does not match you can change it using phpMyAdmin directly, no need to use the query window. Then try your operation again. If the error persists, keep looking!
Mainly, the fix here is just to cast the field, like this: cast(field as char) or cast(field as date).
I had this problem not because I'm storing in different collations, but because my column type is JSON, which is binary.
Fixed it like this:
select table.field COLLATE utf8mb4_0900_ai_ci AS fieldName
Use ascii_bin wherever possible; it will match up with almost any collation.
A username seldom accepts special characters anyway.
If you want to avoid changing syntax to solve this problem, try this:
Update your MySQL to version 5.5 or greater.
This resolved the problem for me.
I had the same problem, with a collation warning for a field that is updated from 0 to 1. All the columns' collations were the same. We tried changing the collations again, but nothing fixed the issue.
In the end we updated the field to NULL and then to 1, and this got around the collation problem.
I was getting Illegal mix of collations while creating a category in Bagisto. Running these commands (thank you @Quy Le) solved the issue for me:
-- set utf8 for connection
SET collation_connection = 'utf8_general_ci';
-- change CHARACTER SET of DB to utf8
ALTER DATABASE dbName CHARACTER SET utf8 COLLATE utf8_general_ci;
-- change category tables
ALTER TABLE categories CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE category_translations CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
In my case it was something strange. I read an API key from a file and then sent it to the server, where a SQL query is made. The problem was the BOM character that Windows Notepad left in the file; it was causing this error:
SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
I just removed it and everything worked like a charm
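A byte order mark like this can be dropped when the file content is decoded. A minimal Python sketch, assuming the key file is UTF-8 (the key text here is made up for illustration): the utf-8-sig codec skips a leading BOM if one is present and otherwise behaves like plain UTF-8.

```python
import codecs

# Bytes as Notepad might save them: UTF-8 BOM followed by the key text.
raw = codecs.BOM_UTF8 + b"my-api-key\n"

with_bom = raw.decode("utf-8").strip()         # BOM survives as U+FEFF
without_bom = raw.decode("utf-8-sig").strip()  # BOM removed

print(repr(with_bom))
print(repr(without_bom))
```

Note that strip() alone does not help here, because U+FEFF is not treated as whitespace.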
You need to set 'utf8' for all parameters in each Function. It's my case:
SELECT username, AVG(rating) as TheAverage, COUNT(*) as TheCount
FROM ratings
WHERE month='Aug'
AND username COLLATE latin1_general_ci IN
(
SELECT username
FROM users
WHERE gender = 1
)
GROUP BY
username
HAVING
TheCount > 4
ORDER BY
TheAverage DESC, TheCount DESC;
Make sure your version of MySQL supports subqueries (4.1+). Next, you could try rewriting your query to something like this:
SELECT ratings.username, (SUM(rating)/COUNT(*)) as TheAverage, Count(*) as TheCount FROM ratings, users
WHERE ratings.month='Aug' and ratings.username = users.username
AND users.gender = 1
GROUP BY ratings.username
HAVING TheCount > 4 ORDER BY TheAverage DESC, TheCount DESC