SQL query to select strings that contain a "Unit Separator" character - mysql

I have a table like this
I want to get the records whose content contains a Unit Separator character.
I have tried many things without success. I tried char(31), 0x1f and many other ways, but did not get the desired result. This is the query I tried:
SELECT * FROM `submissions_answers` WHERE `question_id`=90 AND `answer` like '%0x1f%'
How can I do this? Please help me.

Problem
The expression you tried won't work because answer LIKE '%0x1f%' is looking for a string with literally '0x1f' as part of it - it doesn't get converted to an ASCII code.
Solutions
Some alternatives for this part of the expression that ought to work are:
answer LIKE CONCAT('%', 0x1F, '%')
answer REGEXP 0x1F
INSTR(answer, 0x1F) > 0
Further consideration
If none of these work then there may be a further possibility. Are you sure the character seen in the strings is actually 0x1F? I only ask because the first thing I tried was to paste in ␟, but it turns out MySQL reports a decimal character code of 226 for it rather than 31. I'm not sure which client you are using, but if the 0x1F character is in the string, it might not actually appear in the output.
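As a rough illustration of where 226 comes from, the sketch below compares the real control character (code point 31) with the visible glyph ␟ (U+241F) as UTF-8 bytes. Python is used here only for illustration and is not part of the original answer; the variable names are made up.

```python
# Illustration only: the Unit Separator control character versus the
# visible "symbol for unit separator" glyph that can be pasted as text.
unit_separator = "\x1f"      # control character, code point 31
visible_symbol = "\u241f"    # the glyph, a different character entirely

us_bytes = unit_separator.encode("utf-8")
symbol_bytes = visible_symbol.encode("utf-8")

print(us_bytes.hex().upper())      # 1F
print(symbol_bytes.hex().upper())  # E2909F
print(symbol_bytes[0])             # 226, the first byte of the glyph
```

So a pasted ␟ is stored as three bytes, the first of which is 226, and it will not match a search for byte 0x1F.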
Demo
Some tests demonstrating the points above: SQL Fiddle demo

You can use:
SELECT * FROM submissions_answers WHERE question_id=90 AND instr(answer,char(31))>0
The keyword here being the INSTR MySQL function, which you can read about here. This function returns the position of the first occurrence of substring (char(31)) in the string (answer).

Yet another way...
SELECT * FROM `submissions_answers`
WHERE `question_id`=90
AND HEX(`answer`) REGEXP '^(..)*1F'
Explanation of the regexp:
^ - start matching at the beginning (of the hex string)
(..)* - match any number (*) of two-character hex pairs (..), that is, whole bytes
then match 1F, the hex for US (Unit Separator).
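The reason for anchoring on pairs can be sketched in Python (an illustration with hypothetical helper names, not MySQL code): a plain substring search over the hex text can hit "1F" where it straddles two bytes, while the anchored pattern only matches on byte boundaries.

```python
import re

def contains_byte_aligned(data: bytes, hex_byte: str) -> bool:
    # Mimics HEX(col) REGEXP '^(..)*1F': skip whole bytes, then match.
    hex_text = data.hex().upper()
    return re.search(r"^(..)*" + hex_byte, hex_text) is not None

def contains_unaligned(data: bytes, hex_byte: str) -> bool:
    # Naive substring search over the hex text, ignoring byte boundaries.
    return hex_byte in data.hex().upper()

real_match = b"a\x1fb"       # hex 611F62: 1F sits on a byte boundary
false_friend = b"\xa1\xf0"   # hex A1F0: "1F" straddles two bytes

print(contains_byte_aligned(real_match, "1F"))    # True
print(contains_byte_aligned(false_friend, "1F"))  # False
print(contains_unaligned(false_friend, "1F"))     # True (a false positive)
```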

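You could convert the answer column into a HEX value, and then look for values containing that hex string. Note that E2909F is the UTF-8 encoding of the visible symbol ␟ (U+241F), not of the control character 0x1F.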
SELECT * FROM `submissions_answers`
WHERE HEX(`answer`) LIKE '%E2909F%'

Related

How to make this regex in mysql?

A few minutes ago I found out that MySQL accepts regular expressions, which is great because I think it can solve my problem, but I don't know how to write one. So, I need something like this:
SELECT name FROM products WHERE name REGEXP 'regex code'
To explain a little more: the name must be in the format 123425HT and not string99-123425HT. The values 123425HT and string99-123425HT are arbitrary examples.
Please help, thanks!
Something like this:
SELECT * FROM t2 WHERE str REGEXP "^([0-9]+)([a-zA-Z]{2})$";
This regexp finds strings that consist of one or more digits followed by exactly two letters (upper or lower case), for example:
123123hd
12345435MF
6572Sg
If you want to allow exactly 6 digits only, change [0-9]+ to [0-9]{6}.
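To check the pattern against sample values before running it on the table, a quick sketch in Python can help. This assumes Python's re module behaves like MySQL's regex engine for this pattern, which should hold because it uses only basic constructs; the candidates list is made up.

```python
import re

# Same pattern as in the query above: digits, then exactly two letters.
pattern = re.compile(r"^([0-9]+)([a-zA-Z]{2})$")

candidates = ["123425HT", "string99-123425HT", "6572Sg", "123425H"]
matches = [name for name in candidates if pattern.search(name)]

print(matches)  # ['123425HT', '6572Sg']
```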

Having trouble matching a single character in an SQL table

I need to use the '_' wildcard to find all ids that are only one letter; there are a few of them. However, when I run my query no rows are returned.
Here's my query:
SELECT *
FROM table
WHERE id LIKE '_';
I have a table, let's call it Table1, that has two columns, id and name.
id has either 1 or 2 characters to label a name. I'm trying to find only the names where the id is a single character. Here's an example of the table:
id name
A Alfred
AD Andy
B Bob
BC Bridget
I only want to return Alfred and Bob in this example.
I don't want the solution but any advice or ideas would be helpful.
Here is a screenshot of my query:
http://i.imgur.com/EWTfoVI.png?1
And here is a small example of my table:
http://i.imgur.com/urGRZeK.png?1
So in this example of my table I would ideally like only East Asia... to be returned.
If I search specifically for the character it works, but for some strange reason the '_' wildcard doesn't.
For example:
SELECT *
FROM icao
WHERE prefix_code ='Z';
This works.
Try using TRIM
Select *
FROM [Table]
where TRIM(ID) LIKE '_';
In MySQL, the underscore is used to represent a wildcard for a single character. You can read more about pattern matching here.
The way you have it written, your query will pull any rows where the id column is just one single character, you don't need to change anything.
Here is an SQL Fiddle example.
EDIT
One troubleshooting tip is to make sure there is no whitespace before or after the prefix code. If there is, and you need to remove it, add TRIM():
SELECT *
FROM myTable
WHERE TRIM(id) LIKE '_';
Here is an example with TRIM.
EDIT 2
A little explanation of your weird behavior, hopefully. In MySQL, if there is trailing white space on a value, it will still match if you say id = 'Z'; as seen by this fiddle now. However, leading white space will not match this, but will still be corrected by TRIM(), because that removes white space from both the front and the back end of the varchar.
TL;DR You have trailing white space after Z and that's causing the problem.
The most likely explanation for the behavior you observe is trailing spaces (or other whitespace) in the value. That is, you see one character
'A'
But the value may actually be stored as two (or more) characters.
'A '
To see what's actually stored, you can use the HEX and LENGTH functions.
SELECT t.foo
, LENGTH(t.foo)
, HEX(t.foo)
FROM mytable t
WHERE t.foo LIKE 'A%'
The % is a wildcard for the LIKE operator that matches any number of characters (zero, one or more).
You can use the RTRIM() function to remove trailing spaces...
SELECT RTRIM(t.foo)
, LENGTH(RTRIM(t.foo))
, HEX(RTRIM(t.foo))
FROM mytable t
WHERE t.foo LIKE 'A%'
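To see what such a check reveals, here is a small Python emulation of LENGTH and HEX for a value with and without a trailing space. It is only an illustration; describe is a hypothetical helper and UTF-8 storage is assumed.

```python
def describe(value: str):
    # Emulates SELECT foo, LENGTH(foo), HEX(foo) for a UTF-8 column.
    raw = value.encode("utf-8")
    return value, len(raw), raw.hex().upper()

print(describe("A"))   # ('A', 1, '41')
print(describe("A "))  # ('A ', 2, '4120')
```

The trailing 20 in the hex output is the space that makes the value two characters long, which is why a single '_' does not match it.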
SELECT *
FROM table
WHERE LENGTH(id)=1
Strange... in my case it works perfectly (I am using MySQL 5.5).
Please try this:
select * from mysql.help_topic where name like '_';
What result set do you get?

Using MySQL LIKE operator for fields encoded in JSON

I've been trying to get a table row with this query:
SELECT * FROM `table` WHERE `field` LIKE "%\u0435\u0442\u043e\u0442%"
Field itself:
Field
--------------------------------------------------------------------
\u0435\u0442\u043e\u0442 \u0442\u0435\u043a\u0441\u0442 \u043d\u0430
Although I can't seem to get it working properly.
I've already tried experimenting with the backslash character:
LIKE "%\\u0435\\u0442\\u043e\\u0442%"
LIKE "%\\\\u0435\\\\u0442\\\\u043e\\\\u0442%"
But none of them seem to work either.
I'd appreciate if someone could give a hint as to what I'm doing wrong.
Thanks in advance!
EDIT
Problem solved.
Solution: even after correcting the syntax of the query, it didn't return any results. After making the field BINARY the query started working.
As documented under String Comparison Functions:
Note
Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
Therefore:
SELECT * FROM `table` WHERE `field` LIKE '%\\\\u0435\\\\u0442\\\\u043e\\\\u0442%'
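The two rounds of backslash stripping can be sketched in Python. This is a deliberately simplified model, not MySQL's real parser: strip_one_escape_level is a hypothetical helper that turns a backslash followed by any character into just that character, and it ignores special escapes such as newline.

```python
import re

def strip_one_escape_level(text: str) -> str:
    # Simplified model: a backslash plus any character yields that character.
    return re.sub(r"\\(.)", r"\1", text)

typed_in_query = r"\\\\u0435"   # four backslashes, as written in the SQL
after_parser = strip_one_escape_level(typed_in_query)
after_like = strip_one_escape_level(after_parser)

print(after_parser)  # \\u0435  (what the LIKE operator receives)
print(after_like)    # \u0435   (what is matched against the column)

too_few = r"\\u0435"            # two backslashes, as in the first attempt
print(strip_one_escape_level(strip_one_escape_level(too_few)))  # u0435
```

With only two backslashes the pattern ends up searching for u0435 without a backslash, which is why that attempt returned nothing.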
See it on sqlfiddle.
This can be useful for those who use PHP; it works for me:
$where[] = 'organizer_info LIKE(CONCAT("%", :organizer, "%"))';
$bind['organizer'] = str_replace('"', '', quotemeta(json_encode($orgNameString)));

Rightmost occurrence string match in MySQL

I would like to extract the file extension from a field in MySQL that contains filenames. This means I need to find the final '.' character in the field and extract everything after that. The following code example partially works:
SELECT LCASE(RIGHT(filename, LENGTH(filename) - LOCATE('.', filename)))
FROM mytable;
except that it falls down for cases where the file name contains more than one '.', where it extracts too much. In most programming languages I'd expect to find a function that gives me a rightmost match, but I can't find any such thing for MySQL, nor can I find any discussion from people who have had the same problem and found a workaround.
There is the substring_index function - it does exactly what you are looking for:
SELECT substring_index(filename, '.', -1) FROM mytable
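For comparison, the same "everything after the last delimiter" behavior can be sketched in Python. This is an illustration only; substring_after_last is a made-up name. As with substring_index, a value without the delimiter is returned whole.

```python
def substring_after_last(text: str, delimiter: str) -> str:
    # Split once from the right and keep the final piece.
    return text.rsplit(delimiter, 1)[-1]

print(substring_after_last("archive.tar.gz", "."))  # gz
print(substring_after_last("report.PDF", "."))      # PDF
print(substring_after_last("README", "."))          # README (no dot found)
```

The last case shows that file names without a dot need separate handling if an empty extension is wanted.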
Edit:
See Martin's answer, using substring_index(), with a negative count parameter is a MUCH better approach!
I'm downvoting myself (actually that's not possible...) and upvoting Martin's answer; I wish I could pass the accepted answer to him... Maybe the OP will do that.
Original answer:
The following may do the trick (note: you may also want to deal with the case of a filename value without a dot character). LOCATE on the reversed string returns the length of the extension plus one, hence the - 1:
SELECT LCASE(RIGHT(filename, LOCATE('.', REVERSE(filename)) - 1))
FROM mytable;
Beware however that this type of post-facto parsing can be quite expensive (read slow), and you may consider extracting the file extension to a separate column, at load time.

How can I find non-ASCII characters in MySQL?

I'm working with a MySQL database that has some data imported from Excel. The data contains non-ASCII characters (em dashes, etc.) as well as hidden carriage returns or line feeds. Is there a way to find these records using MySQL?
MySQL provides comprehensive character set management that can help with this kind of problem.
SELECT whatever
FROM tableName
WHERE columnToCheck <> CONVERT(columnToCheck USING ASCII)
The CONVERT(col USING charset) function turns unconvertible characters into replacement characters, so the converted and unconverted text will be unequal.
See this for more discussion. https://dev.mysql.com/doc/refman/8.0/en/charset-repertoire.html
You can use any character set name you wish in place of ASCII. For example, if you want to find out which characters won't render correctly in code page 1257 (Lithuanian, Latvian, Estonian) use CONVERT(columnToCheck USING cp1257)
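The round-trip idea can be sketched in Python as an analogy, not as MySQL's implementation: encode to ASCII with replacement characters and compare against the original. has_non_ascii is a hypothetical helper name.

```python
def has_non_ascii(text: str) -> bool:
    # Unconvertible characters become '?', so the round trip differs.
    converted = text.encode("ascii", errors="replace").decode("ascii")
    return converted != text

print(has_non_ascii("plain text"))     # False
print(has_non_ascii("em\u2014dash"))   # True (contains U+2014)
print(has_non_ascii("line\r\nbreak"))  # False (CR and LF are ASCII)
```

Note the last case: carriage returns and line feeds are valid ASCII, so this comparison alone will not flag the hidden line breaks mentioned in the question.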
You can define ASCII as all characters that have a decimal value of 0 - 127 (0x00 - 0x7F) and find columns with non-ASCII characters using the following query
SELECT * FROM TABLE WHERE NOT HEX(COLUMN) REGEXP '^([0-7][0-9A-F])*$';
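A minimal Python sketch of the same test, assuming the column bytes are available as a bytes value (is_ascii_only is a hypothetical name): every byte must have a first hex digit between 0 and 7.

```python
import re

# Same pattern as in the query: each byte's high hex digit is 0 to 7.
ASCII_ONLY_HEX = re.compile(r"^([0-7][0-9A-F])*$")

def is_ascii_only(data: bytes) -> bool:
    return ASCII_ONLY_HEX.search(data.hex().upper()) is not None

print(is_ascii_only(b"hello\r\n"))              # True
print(is_ascii_only("caf\u00e9".encode("utf-8")))  # False (bytes C3 A9)
print(is_ascii_only(b""))                       # True (empty value)
```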
This was the most comprehensive query I could come up with.
It depends exactly what you're defining as "ASCII", but I would suggest trying a variant of a query like this:
SELECT * FROM tableName WHERE columnToCheck REGEXP '[^A-Za-z0-9]';
That query will return all rows where columnToCheck contains any non-alphanumeric characters. If you have other characters that are acceptable, add them to the negated character class in the regular expression. For example, if periods, commas, and hyphens are OK, change the query to:
SELECT * FROM tableName WHERE columnToCheck REGEXP '[^A-Za-z0-9.,-]';
The most relevant page of the MySQL documentation is probably 12.5.2 Regular Expressions.
This is probably what you're looking for:
select * from TABLE where COLUMN regexp '[^ -~]';
It should return all rows where COLUMN contains non-ASCII characters (or non-printable ASCII characters such as newline).
One missing character from everyone's examples above is the termination character (\0). This is invisible to the MySQL console output and is not discoverable by any of the queries heretofore mentioned. The query to find it is simply:
select * from TABLE where COLUMN like '%\0%';
Based on the correct answer, but taking into account ASCII control characters as well, the solution that worked for me is this:
SELECT * FROM `table` WHERE NOT `field` REGEXP "[\\x00-\\xFF]|^$";
It does the same thing: searches for violations of the ASCII range in a column, but lets you search for control characters too, since it uses hexadecimal notation for code points. Since there is no comparison or conversion (unlike @Ollie's answer), this should be significantly faster, too. (Especially if MySQL does early-termination on the regex query, which it definitely should.)
It also avoids returning fields that are zero-length. If you want a slightly-longer version that might perform better, you can use this instead:
SELECT * FROM `table` WHERE `field` <> "" AND NOT `field` REGEXP "[\\x00-\\xFF]";
It does a separate check for length to avoid zero-length results, without considering them for a regex pass. Depending on the number of zero-length entries you have, this could be significantly faster.
Note that if your default character set is something bizarre where 0x00-0xFF don't map to the same values as ASCII (is there such a character set in existence anywhere?), this would return a false positive. Otherwise, enjoy!
Try this query to find records containing special characters:
SELECT *
FROM tableName
WHERE fieldName REGEXP '[^a-zA-Z0-9#:. \'\-`,\&]'
@zende's answer was the only one that covered columns with a mix of ASCII and non-ASCII characters, but it also had that problematic hex thing. I used this:
SELECT * FROM `table` WHERE NOT `column` REGEXP '^[ -~]+$' AND `column` !=''
In Oracle we can use below.
SELECT * FROM TABLE_A WHERE ASCIISTR(COLUMN_A) <> COLUMN_A;
For this kind of question we can also use this method.
Question from SQLZoo (Non-ASCII characters):
Find all details of the prize won by PETER GRÜNBERG
Answer: SELECT * FROM nobel WHERE winner LIKE 'P% GR%_%berg';