How to detect and replace non-printable characters from table? - mysql

I have imported data from an XLS file into a table, but it contains some garbage (non-ASCII characters).
I want to remove those non-printable characters from the database.
Here is a query I found that selects the entries containing non-ASCII characters:
select * from TABLE where COLUMN regexp '[^ -~]';
But how can I remove those characters from the table using a MySQL query or procedure?
Please give suggestions.
Thanks in advance.

Since the question is about "detect and replace", I wouldn't suggest the DELETE query from @TheWitness. Instead I would do something like this:
UPDATE some_table SET some_column = REGEXP_REPLACE(some_column, '[^ -~]', '') WHERE some_column REGEXP '[^ -~]'
The query above uses a regular expression to find the offending characters, and REGEXP_REPLACE replaces each of them with an empty string. Note that REGEXP_REPLACE is only available in MySQL 8.0 and later (and in MariaDB).
More on REGEXP_REPLACE
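Before running the UPDATE on real data, it may help to check what the '[^ -~]' class actually removes. Here is a minimal sketch in Python (not MySQL) that uses the same character class; the helper names and sample strings are made up for illustration.

```python
import re

# Same character class as the SQL pattern: anything outside the
# printable ASCII range from space (0x20) to tilde (0x7E).
NON_PRINTABLE = re.compile(r"[^ -~]")


def has_non_printable(value: str) -> bool:
    """Mirror of: WHERE some_column REGEXP '[^ -~]'"""
    return NON_PRINTABLE.search(value) is not None


def strip_non_printable(value: str) -> str:
    """Mirror of: REGEXP_REPLACE(some_column, '[^ -~]', '')"""
    return NON_PRINTABLE.sub("", value)


# Hypothetical sample values, not taken from the question.
samples = ["plain text", "caf\u00e9", "tab\there", "nul\x00byte"]
for s in samples:
    print(repr(s), has_non_printable(s), repr(strip_non_printable(s)))
```

Keep in mind that this class also strips tabs, newlines and accented letters, not just garbage bytes, so it is worth running the SELECT first and looking at what would be changed.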

It's fairly simple: just change your SELECT into a DELETE. Be aware that this deletes every row containing such a character, rather than removing only the characters:
DELETE FROM TABLE WHERE COLUMN regexp '[^ -~]';

Related

Capture letters in field containing letters+numbers

I am trying to clean some data in my database.
I have a column that contains both letters and numbers.
I would like to build a query that will catch all the fields that contain more than 4 letters in a row:
1293.8093CHINA34324 -- (YES)
MY32498VN34983-294TH32498PH -- (NO)
WORLD_3244932 -- (YES)
9HEY850249.243943 -- (NO)
32484359-78049 -- (NO)
3294832.49234PROGRAMMATION -- (YES)
Thx a lot for your help.
REGEXP to the rescue
SELECT * FROM my_table WHERE my_column REGEXP '[a-zA-Z]{4}'
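My original query said [A-Z]; Tim kindly edited it so that the regex matches both cases. The original regex was based on the assumption that your MySQL database has a case-insensitive collation (which is the default). Note that {4} matches four or more consecutive letters; use {5} if you strictly need more than four.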
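To sanity-check the pattern against the sample values from the question, here is a small Python sketch (Python's re module rather than MySQL's REGEXP, but this particular pattern behaves the same way in both). The function name is made up for illustration.

```python
import re

# Same pattern as in the query: four letters in a row, either case.
FOUR_LETTERS = re.compile(r"[a-zA-Z]{4}")


def has_letter_run(value: str) -> bool:
    """True if the value contains at least four consecutive letters."""
    return FOUR_LETTERS.search(value) is not None


# Sample values and expected results, copied from the question.
samples = {
    "1293.8093CHINA34324": True,
    "MY32498VN34983-294TH32498PH": False,
    "WORLD_3244932": True,
    "9HEY850249.243943": False,
    "32484359-78049": False,
    "3294832.49234PROGRAMMATION": True,
}

for value, expected in samples.items():
    print(value, has_letter_run(value), "expected:", expected)
```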
This is a query for SQL Server, using LIKE with the character class repeated four times:
SELECT name FROM [tests].[dbo].[emplo]
WHERE name LIKE '%[a-zA-Z][a-zA-Z][a-zA-Z][a-zA-Z]%'

Postgresql unaccent() equivalent in MySQL

I have to compare two strings in a query like the following:
SELECT *
FROM MY_TABLE
WHERE column LIKE '%keyword%';
But I want to compare the unaccented values of both the column and the keyword. Is there an unaccent() function or some other way to achieve this in MySQL?
AFAIK, there is no unaccent() function in MySQL. To ignore accents, you have to set an appropriate (accent-insensitive) collation on the column you are comparing. Example: How to remove accents in MySQL?
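If changing the collation is not an option, the comparison could also be done on the application side. The following is only a sketch in Python, assuming the rows are fetched first and filtered in code; unaccent and contains_unaccented are hypothetical helper names, not MySQL or PostgreSQL functions.

```python
import unicodedata


def unaccent(text: str) -> str:
    """Drop combining accent marks after Unicode decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_unaccented(column_value: str, keyword: str) -> bool:
    """Rough equivalent of: unaccent(column) LIKE '%' || unaccent(keyword) || '%'"""
    return unaccent(keyword).casefold() in unaccent(column_value).casefold()


# Hypothetical sample value: "Crème Brûlée" written with escapes.
value = "Cr\u00e8me Br\u00fbl\u00e9e"
print(unaccent(value))
print(contains_unaccented(value, "brulee"))
```

This only handles letters that decompose into a base letter plus a combining mark; characters such as "ø" or "ß" are left unchanged.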

Replace word in mysql with same word but uppercase

For example:
I have text in my column like 'some text with word to replace' and I want to replace:
word with Word
I do:
update table set column = replace(column, 'word', 'Word');
and I get this error:
Mysql: #1442 - Can't update table 'table' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.
If you want to capitalize only the first letter:
UPDATE MyTable
SET myColumn = CONCAT(UCASE(LEFT(myColumn, 1)), SUBSTRING(myColumn, 2));
If you want to uppercase the whole column value:
UPDATE MyTable
SET myColumn = UPPER(myColumn);
If you want to replace some words, you have to use the REPLACE function:
UPDATE MyTable SET myColumn = replace(myColumn, 'word', 'Word');
Please consider accepting my answer if it works for you.
EDIT: Added a third example that searches for a word in the field and replaces it with another one.
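To make the difference between the three variants concrete, here is a rough Python sketch of what each expression does to the sample text from the question. The function names are made up; the SQL each one mirrors is noted in the comments.

```python
def capitalize_first(value: str) -> str:
    # Mirrors CONCAT(UCASE(LEFT(myColumn, 1)), SUBSTRING(myColumn, 2))
    return value[:1].upper() + value[1:]


def uppercase_all(value: str) -> str:
    # Mirrors UPPER(myColumn)
    return value.upper()


def replace_word(value: str, old: str, new: str) -> str:
    # Mirrors REPLACE(myColumn, 'word', 'Word'); replaces every occurrence
    return value.replace(old, new)


text = "some text with word to replace"
print(capitalize_first(text))
print(uppercase_all(text))
print(replace_word(text, "word", "Word"))
```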
Use the UPPER or LOWER functions in MySQL.

Replace space with underscore in table

How can I write a SQL query to replace all occurrences of space in a table with underscore and set all characters to lowercase?
To update a single column in a single table, you can use a combination of LOWER() and REPLACE():
UPDATE table_name SET column_name=LOWER(REPLACE(column_name, ' ', '_'))
To "duplicate" the existing column, and perform the updates on the duplicate (per your question in a comment), you can use MySQL's ALTER command before the UPDATE query:
ALTER TABLE table_name ADD duplicate_column_name VARCHAR(255) AFTER column_name;
UPDATE table_name SET duplicate_column_name = LOWER(REPLACE(column_name, ' ', '_'));
Just be sure to update the data-type in the ALTER command to reflect your actual data-type.
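If you want to try the two statements without touching a real database, here is a small sketch that runs them against an in-memory SQLite database from Python. This is SQLite, not MySQL, so the ALTER statement is adapted (SQLite has no AFTER clause and the column is declared as TEXT); the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_name TEXT)")
conn.executemany(
    "INSERT INTO table_name (column_name) VALUES (?)",
    [("Hello World",), ("Some Mixed Case Value",)],
)

# Add the duplicate column, then fill it with the cleaned-up value.
conn.execute("ALTER TABLE table_name ADD COLUMN duplicate_column_name TEXT")
conn.execute(
    "UPDATE table_name "
    "SET duplicate_column_name = LOWER(REPLACE(column_name, ' ', '_'))"
)

rows = conn.execute(
    "SELECT column_name, duplicate_column_name FROM table_name ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)

conn.close()
```

The original column is left untouched, which makes it easy to compare the two before dropping or renaming anything.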
When using the UPDATE statement in SQL, always remember to include a WHERE clause -- so says MySQL Workbench! :D
My Answer though:
REPLACE(string1, find_in_string, replacementValue);

How to check if a row in a table has multi-byte characters in it in a MySQL database?

I have a table with a column descr that stores string values. Some of the values in descr contain multi-byte characters, and I want to find all of those rows so I can remove the characters. How can I query the table using MySQL functions to determine which rows have multi-byte characters? I am using MySQL version 5.1.
SELECT ...
FROM yourtable
WHERE CHAR_LENGTH(descr) <> LENGTH(descr)
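CHAR_LENGTH() is multi-byte aware and returns the actual character count. LENGTH() is a pure byte count, so for a single 3-byte character CHAR_LENGTH() returns 1 while LENGTH() returns 3. The two values differ only when the string contains at least one multi-byte character.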
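The same idea can be sketched in Python, where len() on a str counts characters and len() on the encoded bytes counts bytes. This assumes the column is stored as UTF-8; has_multibyte is a hypothetical helper name.

```python
def has_multibyte(value: str, encoding: str = "utf-8") -> bool:
    """Mirror of: CHAR_LENGTH(descr) <> LENGTH(descr)"""
    char_count = len(value)                   # like CHAR_LENGTH()
    byte_count = len(value.encode(encoding))  # like LENGTH()
    return char_count != byte_count


# Hypothetical sample values; "\u20ac" is the euro sign (3 bytes in UTF-8).
for s in ["plain ascii", "price: \u20ac5"]:
    print(repr(s), len(s), len(s.encode("utf-8")), has_multibyte(s))
```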
Have you tried the COLLATION and CHARSET functions on your descr column?
You can find the description of these functions here:
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html
I think the COERCIBILITY function fits your need better. You can do something like:
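select COERCIBILITY(descr COLLATE utf8_general_ci) from myTable;
and if this function returns 0 then you must edit the line. Note, however, that COERCIBILITY() reports collation coercibility rather than character width, and with an explicit COLLATE clause it returns 0 for every row, so the CHAR_LENGTH()/LENGTH() comparison is the more reliable test.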