Microsoft Access query search for words with special character

I have a table in Microsoft Access with words that contain Romanian characters. I want to query for words that contain ț (Latin small letter T with cedilla).
The following query returns all the entries of my table "words", just as if the character were another wildcard:
"SELECT words.[word]
FROM words
WHERE (((words.[word]) Like "*ț*"));
Any idea how to search for words that contain that character?
By the way, if I search for words with "ă" (Latin small letter a with breve) it works as expected.

The reason seems to be that Like uses ASCII and turns every character it can't map into a question mark, which is itself a single-character wildcard. Try this instead:
SELECT words.[word]
FROM words
WHERE instr(words.[word],"ț")>0
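If you also need to control how the comparison is done, InStr accepts an explicit compare argument (a sketch only; 0 means a binary compare, 1 a text compare, and binary keeps the match case-sensitive):
SELECT words.[word]
FROM words
WHERE InStr(1, words.[word], "ț", 0) > 0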

Related

How to perform a multi-byte safe SQL REGEXP query?

I have the following SQL query to find the dictionary words that contain specific letters.
It's working fine in the English dictionary:
SELECT word
FROM english_dictionary
WHERE word REGEXP '[abcdef]'
But running the same query on the Slovak dictionary, which includes UTF-8 accented letters, doesn't work:
SELECT word
FROM slocak_dictionary
WHERE word REGEXP '[áäčďéóú]'
I've searched everywhere and can't find the answer to this issue. If I use LIKE, it works, but the query gets very ugly:
SELECT word
FROM slocak_dictionary
WHERE
word LIKE '%á%'
AND word LIKE '%ä%'
AND word LIKE '%č%'
AND word LIKE '%ď%'
AND word LIKE '%é%'
AND word LIKE '%ó%'
AND word LIKE '%ú%'
Because I deal with many letters that need to be excluded or included in the query, breaking it down like this is not very elegant.
Is there any way to perform a multi-byte safe SQL REGEXP query on MySQL?
MariaDB has better support for REGEXP.
In MySQL, this will test for word having any of those accented characters:
HEX(word) REGEXP '^(..)*(C3A1|C3A4|C48D|C48F|C3A9|C3B3|C3BA)'
The ^(..)* is to make sure the subsequent test is byte (2 hex chars) aligned.
You can see those utf8 encodings by doing something like
SELECT HEX('áäčďéóú');
(Your attempt with LIKE should have said OR instead of AND.)
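Putting it together for the table from the question (a sketch only; the names slocak_dictionary and word are taken from the post, and the hex pairs are the UTF-8 encodings of á ä č ď é ó ú):
-- Match any word containing one of the accented letters, tested on the hex dump
-- so that MySQL's byte-oriented REGEXP cannot misread the multi-byte characters
SELECT word
FROM slocak_dictionary
WHERE HEX(word) REGEXP '^(..)*(C3A1|C3A4|C48D|C48F|C3A9|C3B3|C3BA)'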

Regular expression in mysql to find w/2

I have the query below, which selects all rows that contain GB:2 together with the word Samsung within three words of it.
But my data sometimes contains GB-2 or GB,3, etc., so I want to change my query to something like GB* so that it matches any character or letter after GB.
How can I put this GB* in my query?
SELECT description from mobile
where content REGEXP '[[:<:]]GB:2( [^ ]+){0,3} Samsung[[:>:]]'
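One way this could be loosened (a sketch only, untested; it keeps the pre-8.0 MySQL regex syntax from the question, where [[:<:]] and [[:>:]] are word boundaries) is to replace the literal :2 with a pattern that accepts any single separator character followed by a digit:
SELECT description from mobile
-- 'GB', then any one character (:, -, comma, ...), then a digit, then Samsung within three words
where content REGEXP '[[:<:]]GB.[0-9]( [^ ]+){0,3} Samsung[[:>:]]'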

Find non-ascii spaces in mysql table

I have a large database where things like Trim and the functions I made to count words don't always work (some records still have 'spaces' and multi-word fields get a count of 1), leading me to believe I have non-ASCII spaces.
I tried this to find offending records:
SELECT * FROM TABLE WHERE FIELD NOT REGEXP '[A-Za-z0-9 ;,]'
in other words, all letters, digits, the punctuation I used, and space.
It returns an empty set.
Is there a better way to do this (i.e. one that works)?
Your regex will match any row that has at least one character from the set {A-Z, a-z, 0-9, space, semicolon, comma}, so NOT REGEXP only returns rows containing none of those characters, which is why you get an empty result.
Better to look specifically for non-printable characters using the POSIX [:cntrl:] character class:
SELECT * FROM TABLE WHERE FIELD REGEXP '[[:cntrl:]]'
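If the culprit is specifically a non-breaking space (U+00A0, C2A0 in UTF-8), the HEX() trick shown earlier can target it directly. A sketch only, assuming a UTF-8 column and keeping the placeholder names TABLE and FIELD from the question:
-- Rows whose field contains a UTF-8 non-breaking space, tested byte-aligned on the hex dump
SELECT * FROM TABLE WHERE HEX(FIELD) REGEXP '^(..)*C2A0'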

How to detect rows with chinese characters in MySQL?

How can I detect and delete rows with Chinese characters in MySQL?
Here is the table "Chinese_Test", which contains the Chinese characters, in my phpMyAdmin:
[screenshots of the table's Data and Structure views]
Notice that my collation is utf8, so let's take a look at the Chinese characters in a utf8 table:
http://www.ansell-uebersetzungen.com/gbuni.html
Notice that the Chinese characters' lead byte runs from E4 to E9, hence we use the code
select number
from Chinese_Test
where HEX(contents) REGEXP '^(..)*(E[4-9])';
and here is the result: [screenshot of the matching rows]
If all the other rows have alphanumeric values, try the following:
DELETE FROM tableName WHERE NOT columnToCheck REGEXP '[A-Za-z0-9.,-]';
Do check the results before deletion, using the following:
SELECT * FROM tableName WHERE NOT columnToCheck REGEXP '[A-Za-z0-9.,-]';
I don't have an answer, but to provide you with a starting point: Chinese characters occupy certain blocks in the Unicode character set (the CJK Unified Ideographs block, for example).
You would have to query for rows that contain characters between the first and the last point of that block. I can't think of a way to automate this though (i.e. to query for characters inside a certain range without naming each character explicitly).
Another untested idea that comes to mind is using iconv() to convert the string to a specifically Chinese encoding, using //IGNORE, and seeing whether any data is left. If anything is left, the string may contain Chinese characters, although this would probably be disrupted by any numbers inside the string.
It's an interesting problem.

MySQL won't replace words with empty space

Basically, I have a problem with the REPLACE() function in MySQL (via phpMyAdmin). One table got messed up and some special characters (plus an empty space after them) appeared inside words. So all I wanted to do was:
UPDATE myTable
SET columnName = REPLACE(columnName, 'Å house', 'house')
But MySQL returns
0 row(s) affected. ( Query took 0.0107 sec )
The same happens when I try to replace the names of foreign towns that contain special characters (Swedish towns, German towns, etc.).
Am I doing something wrong???
Å house
Is likely to actually be:
Å house
That is, with a U+00A0 non-breaking space character and not a normal space. Of course you normally cannot see the difference, but a string replace can, and it won't touch it.
This was probably originally just a single non-breaking-space character, that has been mangled through a classic UTF-8-read-as-ISO-8859-1 encoding screw-up. Other non-ASCII characters in your database are likely to have been similarly messed up.
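One way to target it explicitly (a sketch only, untested; it assumes a utf8 column, reuses the myTable/columnName names from the question, and builds the non-breaking space from its UTF-8 bytes C2 A0 so you don't have to paste an invisible character into the query):
-- Replace "Å" + non-breaking space + "house" with plain "house"
UPDATE myTable
SET columnName = REPLACE(columnName,
                         CONCAT('Å', CONVERT(UNHEX('C2A0') USING utf8), 'house'),
                         'house')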