MySQL CONCAT with REGEXP: word boundary and quote

Here is my query
SELECT producer FROM producers WHERE producer REGEXP CONCAT('[[:<:]]', 'dell\'', '[[:>:]]')
I replaced a MySQL LIKE with this to get word-boundary matching, following another example here. But now I have a problem with the escaped apostrophe: the query doesn't find dell' in the database even when there is a match.

select count(*) from (select 'dell\'' as c) t where c regexp '[[:<:]]dell\''; -- -> 1
select count(*) from (select 'dell\'' as c) t where c regexp '[[:<:]]dell\'[[:>:]]'; -- -> 0
So it's the trailing boundary requirement which fails. Which makes sense. Quoting from the docs:
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
As ' is not a word character, it cannot be the end of a word, hence [[:>:]] can't match.
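One way to keep the leading boundary while still allowing a trailing apostrophe is to replace [[:>:]] with an explicit "non-word character or end of string" alternative. The following is only a minimal sketch, assuming the same pre-8.0 regex syntax as the query in the question; the sample values in the derived table are made up for illustration.

```sql
-- Hypothetical sample rows; only the first two are meant to match.
SELECT c
FROM (
  SELECT 'dell\'' AS c            -- term at the end of the string
  UNION ALL SELECT 'dell\' xps'   -- term followed by a space
  UNION ALL SELECT 'dell\'s'      -- term followed by a word character
) t
WHERE c REGEXP CONCAT('[[:<:]]', 'dell\'', '([^[:alnum:]_]|$)');
```

The leading [[:<:]] still works here because the term starts with a word character; only the trailing side needs the explicit alternative.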

I created a stored function that solved the issue for me: just call it with your search term. You can adapt the prefix, suffix and value to your own condition.
DELIMITER $$
DROP FUNCTION IF EXISTS pp$$
CREATE FUNCTION pp(slist VARCHAR(100)) RETURNS VARCHAR(120) CHARSET latin1
DETERMINISTIC
BEGIN
  -- Wrap the search term in word-boundary markers.
  RETURN CONCAT('[[:<:]]', slist, '[[:>:]]');
END$$
DELIMITER ;

Related

Comparison with trailing spaces in MySQL

This SQL query:
select c1 from table where c1='';
returns rows that have c1=' ' (one empty space) in MySQL.
Is this intended or a bug?
EDIT: please check SQL Fiddle link here, and the number of spaces in SELECT query doesn't matter.
It's all stated in the documentation. I've quoted the important points here, but I suggest reading the full documentation.
VARCHAR values are not padded when they are stored. Trailing spaces
are retained when values are stored and retrieved, in conformance with
standard SQL.
On the other hand, CHAR values are padded when they are stored but
trailing spaces are ignored when retrieved.
All MySQL collations are of type PADSPACE. This means that all CHAR,
VARCHAR, and TEXT values in MySQL are compared without regard to any
trailing spaces. “Comparison” in this context does not include the
LIKE pattern-matching operator, for which trailing spaces are
significant.
Explanation: Trailing spaces are ignored when comparing strings with the comparison operator ('='), but they are significant for LIKE (the pattern-matching operator).
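The difference can be tried out on a small table. This is a sketch, not a definitive test: t_demo and its rows are hypothetical, and the first query only behaves as described when the column uses a PAD SPACE collation (the default collation on MySQL 8.0 is NO PAD, so results there may differ).

```sql
-- Hypothetical demo table; VARCHAR keeps trailing spaces as stored.
CREATE TEMPORARY TABLE t_demo (c1 VARCHAR(10));
INSERT INTO t_demo VALUES (''), (' '), ('a');

-- '=' under a PAD SPACE collation ignores trailing spaces.
SELECT CONCAT('[', c1, ']') AS v FROM t_demo WHERE c1 = '';

-- LIKE treats trailing spaces as significant.
SELECT CONCAT('[', c1, ']') AS v FROM t_demo WHERE c1 LIKE '';

-- Adding a length check to '=' is another way to exclude ' '.
SELECT CONCAT('[', c1, ']') AS v FROM t_demo WHERE c1 = '' AND CHAR_LENGTH(c1) = 0;
```

Wrapping the value in brackets makes any trailing space visible in the output.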
This is documented behaviour.
The MySQL documentation for LIKE mentions
trailing spaces are significant, which is not true for CHAR or
VARCHAR comparisons performed with the = operator:
SQL Server works the same way.
If your column is of type CHAR and not VARCHAR, then this is correct.
On CHAR fields, trailing blanks are ignored in comparisons.
So
field = ''
field = ' '
are the same.
This behavior is in accordance with ANSI SQL-92 standard. Any database conforming to this standard will exhibit same behavior. Quote:
3) The comparison of two character strings is determined as follows:
a) If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD attribute, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a <space>.
b) The result of the comparison of X and Y is given by the collating sequence CS.
So, according to these specs 'abc' = 'abc ' and '' = ' ' evaluate to true (but '' = '\t' is false).
If c1 is CHAR(1), then this is correct, as CHAR columns are fixed width and will be filled with blanks if necessary.
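So even if you put '' into a CHAR(1) field, it is stored padded as ' ' (MySQL strips the padding again on retrieval unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled). Also, filtering for an empty string will match a value of ' '.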
Please accept Martin Smith's answer, as he gave the correct hint before me.
Also, as per MySQL documentation, trailing whitespace is ignored when comparing strings with =, so if your c1 column contains only spaces (or one in your case), it will be returned even though you filter WHERE c1 = '':
In particular, trailing spaces are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)
-- list each stored value of c1 together with its length
select c1, length(c1) as l
from t_name
group by c1, l;
Try this -
Select case when c1 = '' then ' ' else c1 end from table ;

MySQL regex for word boundary containing '#'

I'm trying to search for an example phrase: '#test123' using regex like:
SELECT (...) WHERE x RLIKE '[[:<:]]#test123[[:>:]]'
With no luck. Presumably the word-boundary marker '[[:<:]]' does not treat '#' as a word character.
How can I achieve this? Is there a way to use word-boundary markers in a MySQL regex, but with exceptions?
MySQL 5.7 Reference Manual / ... / Regular Expressions:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
So # acts as a word delimiter, not as a word character. We need to expand the "word characters" class to include # too. The simplest way is to enumerate the custom word characters directly, a-z0-9_#:
SELECT * FROM
(
SELECT '#test123' AS x UNION ALL
SELECT 'and #test123 too' UNION ALL
SELECT 'not#test123not' UNION ALL
SELECT 'not#test123' UNION ALL
SELECT '#test123not' UNION ALL
SELECT 'not # test123' UNION ALL
SELECT 'test123' UNION ALL
SELECT '#west123'
) t
WHERE x RLIKE '([^a-z0-9_#]|^)#test123([^a-z0-9_#]|$)';
Result:
x
----------------
#test123
and #test123 too
I think you can use below expression instead:
'[.#.][[:<:]]test123[[:>:]]'
Note: don't use non-word literals inside [[:<:]][[:>:]] and use [..] for characters.
Or (with thanks to @Y.B.):
'(^|.*[^a-zA-Z0-9_])[.#.][[:<:]]test123[[:>:]]'
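The patterns above rely on the [[:<:]] and [[:>:]] markers, which MySQL 8.0.4 and later (with the ICU regex engine) no longer support. As a hedged sketch for those versions, the "custom word character" idea can be written with lookarounds instead; the sample rows are a subset of the ones used earlier, and the query is meant to keep only the first two.

```sql
-- Assumes MySQL 8.0.4+ (ICU regex engine), where [[:<:]] and [[:>:]] are not available.
SELECT x
FROM (
  SELECT '#test123' AS x
  UNION ALL SELECT 'and #test123 too'
  UNION ALL SELECT 'not#test123'
  UNION ALL SELECT '#test123not'
) t
-- The term must not be preceded or followed by a custom word character.
WHERE x REGEXP '(?<![a-z0-9_#])#test123(?![a-z0-9_#])';
```

The lookarounds do not consume characters, so no (^|...) or (...|$) alternatives are needed at the string edges.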

MySQL query, checking for single vs double digits

I have a column in my client-provided database that has values such as '2; 3; 14' or '1', etc. I am using MySQL. How do I write the query so that
1) I can check if the column contains a number (1, for example)
2) I won't get a 'hit' if I am checking for a '1' and the value is actually '14', for example.
Thanks in advance.
If the column is a VARCHAR and you want a row returned when searching for '1' in '1;3;14', you can use the REGEXP operator with word-boundary markers.
select * from test
where col regexp '[[:<:]]1[[:>:]]'
SQL FIddle Demo
From MySQL docs
Word Boundary Markers [[:<:]], [[:>:]]
These markers stand for word boundaries.
They match the beginning and end of words, respectively.
A word is a sequence of word characters that is not preceded by or followed by word characters.
A word character is an alphanumeric character in the alnum class or an underscore (_).
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]'; -> 1
mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]'; -> 0
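A regex-free alternative, offered only as a sketch: FIND_IN_SET matches whole list items, so '1' cannot hit '14'. It assumes the separator is always exactly '; ' as in the question's example, which is converted to the comma that FIND_IN_SET expects; the derived table and its values are made up for illustration.

```sql
-- Hypothetical sample rows; the first row must not match a search for '1'.
SELECT col
FROM (
  SELECT '2; 3; 14' AS col
  UNION ALL SELECT '1'
  UNION ALL SELECT '1; 14'
) t
-- Turn '; ' into ',' and look for '1' as a complete list element.
WHERE FIND_IN_SET('1', REPLACE(col, '; ', ',')) > 0;
```

If the data uses inconsistent spacing around the semicolons, the REPLACE step would need adjusting, so the regex approach may be more forgiving there.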

How to check for uppercase letters in MySQL?

I want to check if a string consists only of uppercase letters. I know that RLIKE/REGEXP are not case sensitive in MySQL, so I tried to use the :upper: character class:
SELECT 'z' REGEXP '^[[:upper:]]+$';
This gives true, although the z is in lower case. Why is that?
REGEXP is not case sensitive, except when used with binary strings.
http://dev.mysql.com/doc/refman/5.7/en/regexp.html
So with that in mind, just do something like this:
SELECT * FROM `users` WHERE `email` REGEXP BINARY '[A-Z]';
Using the above example, you'd get a list of emails that contain one or more uppercase letters.
For me this works and is not using a regexp. It basically compares the field with itself uppercased by mysql itself.
-- will detect all names that are not in uppercase
SELECT
name, UPPER(name)
FROM table
WHERE
BINARY name <> BINARY UPPER(name)
;
Change to a case-sensitive collation, e.g.
CHARACTER SET latin1 COLLATE latin1_general_cs
then try this query:
SELECT 'z' REGEXP '^[A-Z]+$'
SQLFiddle Demo
The most voted answer doesn't work for me, I get the error:
Character set 'utf8mb4_unicode_ci' cannot be used in conjunction with 'binary' in call to regexp_like.
I used the MD5 to compare the original value and the lowercased value:
SELECT * FROM user WHERE MD5(email) <> MD5(LOWER(email));
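On versions that raise the regexp_like error quoted above, a case-sensitive match can also be requested directly. This is a minimal sketch assuming MySQL 8.0 or later, where REGEXP_LIKE accepts a match-type argument and 'c' asks for case-sensitive matching; the sample strings are hypothetical.

```sql
-- Assumes MySQL 8.0+; 'c' requests case-sensitive matching.
SELECT s, REGEXP_LIKE(s, '^[A-Z]+$', 'c') AS all_upper
FROM (
  SELECT 'ABC' AS s      -- the only all-uppercase value here
  UNION ALL SELECT 'AbC'
  UNION ALL SELECT 'z'
) t;
```

Unlike the MD5 comparison, this checks that every character is an uppercase ASCII letter, which is what the original question asked for.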

Select trims spaces from strings - is this a bug or in the spec?

In MySQL:
select 'a' = 'a ';
returns 1.
You're not the first to find this frustrating. In this case, use LIKE for literal string comparison:
SELECT 'a' LIKE 'a '; -- returns 0
This behavior is specified in SQL-92 and SQL:2008. For the purposes of comparison, the shorter string is padded to the length of the longer string.
From the draft (8.2 <comparison predicate>):
If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD characteristic, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a <space>.
In addition to the other excellent solutions:
select binary 'a' = 'a '
I googled for "mysql string" and found this:
In particular, trailing spaces [using LIKE] are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator
From the documentation:
All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces
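That quote describes older releases. On MySQL 8.0 and later, some collations are NO PAD, so it can be worth checking which kind is in use. The following is a sketch under that assumption; the two collation names are just examples, and the _utf8mb4 introducers are there so the comparison does not depend on the connection character set.

```sql
-- Assumes MySQL 8.0+, where INFORMATION_SCHEMA.COLLATIONS has a PAD_ATTRIBUTE column.
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME IN ('utf8mb4_general_ci', 'utf8mb4_0900_ai_ci');

-- Compare the same literals under each collation.
SELECT _utf8mb4'a' = _utf8mb4'a ' COLLATE utf8mb4_general_ci AS general_ci,
       _utf8mb4'a' = _utf8mb4'a ' COLLATE utf8mb4_0900_ai_ci AS uca_0900;
```

A collation reported as NO PAD treats trailing spaces as significant in '=' comparisons, so the two columns of the second query are not expected to agree.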
The trailing spaces are stored in VARCHAR in MySQL 5.0.3+:
CREATE TABLE t_character (cv1 CHAR(10), vv1 VARCHAR(10), cv2 CHAR(10), vv2 VARCHAR(10));
INSERT
INTO t_character
VALUES ('a', 'a', 'a ', 'a ');
SELECT CONCAT(cv1, cv1), CONCAT(vv2, vv1)
FROM t_character;
but not used in comparison.
Here's another workaround that might help:
select 'a' = 'a ' and length('a') = length('a ');
returns 0