I am trying to clean some data in my database.
I have a column that contains both letters and numbers.
I would like to build a query that will match every field containing more than 4 letters in a row.
1293.8093CHINA34324 -- (YES)
MY32498VN34983-294TH32498PH -- (NO)
WORLD_3244932 -- (YES)
9HEY850249.243943 -- (NO)
32484359-78049 -- (NO)
3294832.49234PROGRAMMATION -- (YES)
Thx a lot for your help.
REGEXP to the rescue
SELECT * FROM my_table WHERE my_column REGEXP '[a-zA-Z]{4}';
My original query said [A-Z]; Tim kindly edited it to make the regex case-insensitive. The original regex assumed that your MySQL database uses a case-insensitive collation (which is the default). Note that {4} matches four or more letters in a row; if you strictly need more than four, use {5} instead.
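The pattern can be sanity-checked against the sample rows from the question before running it on real data. The sketch below uses Python's `re` module rather than MySQL, so it only approximates how REGEXP behaves; the names `samples` and `has_letter_run` are made up for illustration.

```python
import re

# Sample values from the question, paired with the expected answer.
samples = {
    "1293.8093CHINA34324": True,
    "MY32498VN34983-294TH32498PH": False,
    "WORLD_3244932": True,
    "9HEY850249.243943": False,
    "32484359-78049": False,
    "3294832.49234PROGRAMMATION": True,
}

# Same pattern as in the query: four consecutive ASCII letters anywhere.
LETTER_RUN = re.compile(r"[a-zA-Z]{4}")


def has_letter_run(value: str) -> bool:
    """Return True if value contains at least four letters in a row."""
    return LETTER_RUN.search(value) is not None


results = {value: has_letter_run(value) for value in samples}
for value, found in results.items():
    print(f"{value}: {'YES' if found else 'NO'}")
```

Every sample row agrees with the YES/NO labels in the question, since the short runs (MY, VN, TH, PH, HEY) are all under four letters.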
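This is a query for SQL Server, which has no REGEXP operator, so the four-letter run is spelled out with LIKE character classes (under a case-insensitive collation, [a-z] matches both cases):
select name from [tests].[dbo].[emplo]
where name like '%[a-z][a-z][a-z][a-z]%'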
Related
I have a table with VARCHAR(191) that actually uses a unique id which is constructed of 5 chars from these options: a-z, A-Z, 0-9 (example: 7hxYy)
I am performing a simple lookup with select * from table where token = '7hxYy', yet I want the token to be case-sensitive, since I can currently query for 7hxyy and it will still match.
How can I convert this column to binary without affecting the SQL select statements?
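Try modifying the column's data type like this:
alter table `table` modify column token varchar(191) binary;
In MySQL, the BINARY attribute on a CHAR or VARCHAR column selects the binary (_bin) collation of the column's character set, so comparisons on the column become case-sensitive.
Or you can change your query like this:
select * from `table` where BINARY `token` = '7hxYy';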
I was trying to convert an existing varchar column with a unique index on it to a case sensitive column. So to do this, I updated the collation of the particular column.
Previous value: utf8mb4_unicode_ci
Current value: utf8mb4_bin
Now I have a row in my table TEST_TABLE whose test_column value is abcd.
When I try to run a simple query like SELECT * FROM TEST_TABLE WHERE test_column = 'abcd'; it returns no result.
However when I try SELECT * FROM TEST_TABLE WHERE test_column LIKE 'abcd'; it returns the data correctly.
Also when I try SELECT * FROM TEST_TABLE WHERE BINARY test_column = 'abcd'; it returns the data correctly.
One more thing I tried was creating a duplicate of the table with the column collation set to utf8mb4_bin at creation time and then copying all the data from the original table. Against that copy, the query SELECT * FROM TEST_TABLE WHERE test_column = 'abcd'; works fine.
So this seems to be a problem with the BINARY conversion. Is there any solution to this, or am I doing something wrong?
This seems to be an issue with MySQL. The steps I followed to resolve it are as follows:
Drop the unique index on the column.
Change the collation of the column.
Create the unique index again.
Now it works as expected. It seems MySQL didn't rebuild the unique index when the collation was changed; the steps above solved my issue.
How did you change the collation? There are about 4 ways that you might think to do it. Most do something different.
Probably ALTER TABLE ... CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin was what you needed.
Why "bin"? You want to match case and accents? That is "abcd" != "Abcd"?
SELECT * FROM table1 WHERE a = '1';
SELECT * FROM table1 WHERE a = CONVERT(1, CHAR);
Column a is of type VARCHAR, and I have already created an index on it. The first query uses the index but the second one doesn't. Any clue why?
I suspect that maybe MySQL takes CHAR and VARCHAR as two different types, so I changed column a to CHAR, and the index doesn't work either.
Sounds like a CHARACTER SET or COLLATION problem. Please provide:
SHOW VARIABLES LIKE 'char%';
how you are connecting to the database
SHOW CREATE TABLE table1;
Probably the answer involves changing the CONVERT:
CONVERT(1, CHAR charset utf8)
What is the real query? There may be some other approach that is better for your situation. Using CONVERT is a kludge; let's get to the reason you need it.
To see the conversion(s), do
EXPLAIN FORMAT=JSON SELECT ...
I have a table 'products' which has a column partnumber.
I want to ignore special characters in the records while searching.
Suppose I have the following 5 records in partnumber:
XP-12345
MV-334-3454
XP1-5555
VX-AP-XP-1000
VT1232223
Now, if I search for "XP1", the output should be the following records:
XP-12345
XP1-5555
VX-AP-XP-1000
How do I write a MySQL query for this?
You can achieve this using the CONCAT() function. From your comments on the question and on Jorden's answer, I understand that you want to search for the string XP1 while ignoring special characters such as -, _ and #.
In a LIKE pattern, _ matches exactly one arbitrary character, so you can use this query:
SELECT partnumber FROM products
WHERE partnumber LIKE concat('%XP','_','1%')
OR partnumber LIKE '%XP1%';
Note: You can check the required output on SQL Fiddle and adjust the query to fit any additional requirements.
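To see what the two LIKE conditions do and do not catch, here is a small sketch that runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here (its `%` and `_` wildcards behave the same way for this data), and the extra row `XP--1999` is a hypothetical value added to show the limitation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (partnumber TEXT)")

# The five part numbers from the question plus one hypothetical row
# with two separators between "XP" and "1".
rows = ["XP-12345", "MV-334-3454", "XP1-5555", "VX-AP-XP-1000", "VT1232223", "XP--1999"]
conn.executemany("INSERT INTO products VALUES (?)", [(r,) for r in rows])

# '%XP_1%' is what concat('%XP','_','1%') evaluates to.
query = """
    SELECT partnumber FROM products
    WHERE partnumber LIKE '%XP_1%'
       OR partnumber LIKE '%XP1%'
    ORDER BY rowid
"""
found = [row[0] for row in conn.execute(query)]
print(found)
conn.close()
```

The three expected records are returned, but `XP--1999` is not, because `_` stands for exactly one character. The approach therefore only covers zero or one separator at a single known position in the search term.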
Define a MySQL function which strips the symbols from a provided string.
DELIMITER //
CREATE FUNCTION STRIP_SYMBOLS(input VARCHAR(255))
RETURNS VARCHAR(255) DETERMINISTIC NO SQL
BEGIN
    DECLARE output VARCHAR(255) DEFAULT '';
    DECLARE c CHAR(1);
    DECLARE i INT DEFAULT 1;
    -- Walk the string one character at a time (<= so the last character is included).
    WHILE i <= CHAR_LENGTH(input) DO
        SET c = SUBSTRING(input, i, 1);
        -- Keep only letters and digits.
        IF c REGEXP '[a-zA-Z0-9]' THEN
            SET output = CONCAT(output, c);
        END IF;
        SET i = i + 1;
    END WHILE;
    RETURN output;
END//
DELIMITER ;
Then select the records from your table where the partnumber with the symbols stripped contains XP1:
SELECT * FROM products WHERE STRIP_SYMBOLS(partnumber) LIKE '%XP1%';
-- Returns: XP-12345, XP1-5555, VX-AP-XP-1000
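This might be painfully slow if your table is large, because the function has to run for every row. In that case, keep a column in your table updated with the symbol-stripped partnumber, either with a generated column (MySQL 5.7.6 or higher) or with a trigger (earlier versions). Note that a generated column expression cannot call a stored function, so it would need built-in functions such as nested REPLACE() calls.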
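The stripping logic is easy to check outside MySQL before creating the function. The following is a minimal Python sketch of the same idea, not the stored function itself; `strip_symbols` and `part_numbers` are illustrative names.

```python
import re


def strip_symbols(value: str) -> str:
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^a-zA-Z0-9]", "", value)


part_numbers = ["XP-12345", "MV-334-3454", "XP1-5555", "VX-AP-XP-1000", "VT1232223"]

# Rough equivalent of: STRIP_SYMBOLS(partnumber) LIKE '%XP1%'
# (the `in` test is case-sensitive, unlike LIKE under a case-insensitive collation).
hits = [p for p in part_numbers if "XP1" in strip_symbols(p)]
print(hits)
```

Because the separators are removed before matching, any number of symbols between the letters and digits is tolerated, which a fixed LIKE pattern cannot do.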
You need to use REGEXP to allow for "contains" searches.
Example:
SELECT partnumber
FROM partnumber_tbl
WHERE partnumber REGEXP '[XP1]\-';
This matches any value that contains one of the characters X, P or 1 immediately followed by a hyphen, so it is broader than a search for the literal string XP1.
Here is a live example.
And here is some regexp info for further reading. It's the official docs; I hate reading them, especially Oracle's.
You could do something like this, which removes the hyphens before comparing:
SELECT partnumber
FROM products
WHERE REPLACE(partnumber, '-', '') LIKE '%XP1%';
I have imported data from an XLS file into a table, but it contains some garbage (non-ASCII characters).
I want to remove those non-printable characters from the database.
Here is a query I found that selects the entries containing non-ASCII characters:
select * from TABLE where COLUMN regexp '[^ -~]';
But how can I remove those characters from the table using a MySQL query or procedure?
Please give suggestions.
Thanks in advance.
Since the question is about "detect and replace", I wouldn't suggest the DELETE query from @TheWitness. Instead I would do something like this:
UPDATE some_table SET some_column = REGEXP_REPLACE(some_column, '[^ -~]', '') WHERE some_column REGEXP '[^ -~]';
The query above uses a regular expression to find every character outside the range from space to tilde (the printable ASCII characters), and REGEXP_REPLACE replaces each of them with an empty string. REGEXP_REPLACE requires MySQL 8.0 or later.
More on REGEXP_REPLACE
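Before running the UPDATE on real data, it can help to see exactly what the character class `[^ -~]` removes. This is a short Python sketch using the same pattern with the `re` module, not MySQL's REGEXP_REPLACE; the `dirty` string and the `clean` helper are invented for the example.

```python
import re

# Anything outside the printable ASCII range from space (0x20) to tilde (0x7E).
NON_PRINTABLE_ASCII = re.compile(r"[^ -~]")


def clean(value: str) -> str:
    """Remove characters outside the printable ASCII range."""
    return NON_PRINTABLE_ASCII.sub("", value)


# Hypothetical imported value: a non-breaking space, a BEL control
# character and an accented letter are mixed into ordinary text.
dirty = "Widget\u00a0A\x07 (caf\u00e9)"
cleaned = clean(dirty)
print(repr(cleaned))
```

Note that the pattern also strips tabs, newlines and every accented or non-Latin letter, so it is worth checking whether any of those are legitimate data before updating the table.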
It's fairly simple: just change your SELECT into a DELETE, as follows. Note that this removes the entire row, not just the offending characters.
DELETE FROM TABLE WHERE COLUMN regexp '[^ -~]';