MySQL query, checking for single vs double digits - mysql

I have a column in my client-provided database that has values such as '2; 3; 14' or '1', etc. I am using MySQL. How do I write the query so that
1) I can check if the column contains a number (1, for example)
2) I won't get a 'hit' if I am checking for a '1' and the value is actually '14', for example.
Thanks in advance.

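If the column is a VARCHAR and you want a row returned when searching for '1' in '1;3;14', you can use the REGEXP operator with the word-boundary markers. (These markers work in MySQL versions before 8.0.4; later versions use the ICU regex library, where the word-boundary marker is \b, written '\\b' inside a string literal.)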
select * from test
where col regexp '[[:<:]]1[[:>:]]'
From MySQL docs
Word Boundary Markers [[:<:]], [[:>:]]
These markers stand for word boundaries.
They match the beginning and end of words, respectively.
A word is a sequence of word characters that is not preceded by or followed by word characters.
A word character is an alphanumeric character in the alnum class or an underscore (_).
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]'; -> 1
mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]'; -> 0
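To sanity-check the boundary logic outside the database, here is a minimal Python sketch. It only approximates [[:<:]] and [[:>:]] with lookarounds from Python's re module; the helper name contains_number and the sample values are my own assumptions, not part of the answer above.

```python
import re

# Approximation of MySQL's POSIX-style markers using lookarounds:
# [[:<:]] ~ position where a word starts, [[:>:]] ~ position where a word ends.
WORD_START = r"(?<!\w)(?=\w)"
WORD_END = r"(?<=\w)(?!\w)"


def contains_number(value: str, number: int) -> bool:
    """Return True if `number` appears in `value` as a whole token."""
    pattern = WORD_START + re.escape(str(number)) + WORD_END
    # re.ASCII keeps \w limited to [A-Za-z0-9_], like the alnum class plus underscore.
    return re.search(pattern, value, flags=re.ASCII) is not None


if __name__ == "__main__":
    for sample in ["2; 3; 14", "1", "1; 3; 14", "21; 11"]:
        print(repr(sample), contains_number(sample, 1))
```

Only the values that hold a standalone 1 should be reported as hits; '14', '21' and '11' are rejected because the digit 1 there touches another word character.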

Related

MySQL - query to get all rows that a specific character is non-English

I have a table that has nvarchar elements.
This table has two kinds of elements:
elements with only digit characters
elements with digit characters and the 3rd character is non-English character
I want a query to get all rows whose 3rd character is non-English.
EDIT
Using WHERE SUBSTRING(<table>.ColumnName, 3, 1) NOT BETWEEN '0' AND '9' also worked for me.
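I'd use REGEXP_LIKE (available in MySQL 8.0 and later) with a regex that requires the third character to be a non-digit. The pattern is anchored with ^ because a regex match can otherwise start at any position in the string:
SELECT *
FROM mytable
WHERE REGEXP_LIKE(mycol, '^..[^[:digit:]]')
In MySQL versions older than 8.0, you could use the REGEXP operator:
SELECT *
FROM mytable
WHERE mycol REGEXP '^..[^[:digit:]]'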
You can also use the RLIKE operator. The query below matches rows whose third character is neither a digit nor an English letter:
SELECT *
FROM mytable
WHERE SUBSTR(mycol, 3, 1) NOT RLIKE '^[A-Za-z0-9]$';
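A pattern for "the third character is not a digit" needs a start anchor, because a regex search otherwise succeeds at any position in the string. The Python sketch below illustrates the difference; the function and constant names are hypothetical, and Python's re module stands in for MySQL's regex engine here.

```python
import re

# Anchored at the start: any two characters, then one non-digit.
THIRD_NOT_DIGIT = re.compile(r"^..[^0-9]", flags=re.DOTALL)

# Unanchored variant, kept only to show the pitfall: it can match
# a non-digit found anywhere from the third position onwards.
UNANCHORED = re.compile(r"..[^0-9]", flags=re.DOTALL)


def third_char_is_non_digit(value: str) -> bool:
    """True only when the character at position 3 exists and is not 0-9."""
    return THIRD_NOT_DIGIT.search(value) is not None


if __name__ == "__main__":
    for sample in ["12x45", "12345", "1234x", "12"]:
        anchored = third_char_is_non_digit(sample)
        loose = UNANCHORED.search(sample) is not None
        print(f"{sample!r}: anchored={anchored} unanchored={loose}")
```

The value '1234x' is the interesting case: its third character is a digit, yet the unanchored pattern still finds a match further along the string.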

MySQL regex for word boundary containing '#'

I'm trying to search for an example phrase: '#test123' using regex like:
SELECT (...) WHERE x RLIKE '[[:<:]]#test123[[:>:]]'
With no luck. Probably the word boundary selector '[[:<:]]' does not count '#' as a word character.
How can I achieve this? Is there a way to use the MySQL regex word boundary selector but with exceptions?
MySQL 5.7 Reference Manual / ... / Regular Expressions:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
So # is a non-word character: it delimits words rather than being part of one, which means [[:<:]] cannot match directly before it. We need to expand the "word character" class to include # as well. The simplest way is to enumerate the custom word characters directly, as a-z0-9_#:
SELECT * FROM
(
SELECT '#test123' AS x UNION ALL
SELECT 'and #test123 too' UNION ALL
SELECT 'not#test123not' UNION ALL
SELECT 'not#test123' UNION ALL
SELECT '#test123not' UNION ALL
SELECT 'not # test123' UNION ALL
SELECT 'test123' UNION ALL
SELECT '#west123'
) t
WHERE x RLIKE '([^a-z0-9_#]|^)#test123([^a-z0-9_#]|$)';
Result:
x
----------------
#test123
and #test123 too
I think you can use the expression below instead:
'[.#.][[:<:]]test123[[:>:]]'
Note: don't put non-word literals between [[:<:]] and [[:>:]], and use the [.x.] syntax for such characters.
Or (with thanks to @Y.B.), to also reject a word character directly before the #:
'(^|.*[^a-zA-Z0-9_])[.#.][[:<:]]test123[[:>:]]'
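The "custom word character class" idea can be tried quickly in Python. This is a sketch under assumptions: has_tag and WORD_CHARS are names I made up, re.IGNORECASE stands in for MySQL's usual case-insensitive collation, and the sample strings are the ones from the UNION ALL query in this thread.

```python
import re

# Custom "word character" class that also treats '#' as part of a word.
WORD_CHARS = "a-z0-9_#"


def has_tag(text: str, tag: str) -> bool:
    """True if `tag` occurs with no custom word character on either side."""
    pattern = rf"(?:[^{WORD_CHARS}]|^){re.escape(tag)}(?:[^{WORD_CHARS}]|$)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


samples = [
    "#test123",
    "and #test123 too",
    "not#test123not",
    "not#test123",
    "#test123not",
    "not # test123",
    "test123",
    "#west123",
]

matches = [s for s in samples if has_tag(s, "#test123")]

if __name__ == "__main__":
    for m in matches:
        print(m)
```

Only the first two samples survive, which agrees with the result table shown for the RLIKE query.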

mysql concat regexp word boundary and quote

Here is my query
SELECT producer FROM producers WHERE producer REGEXP CONCAT('[[:<:]]', 'dell\'', '[[:>:]]')
I replaced a MySQL LIKE with this to get word-boundary matching, following another example here. But now I have a problem with the escaped apostrophe: it doesn't find dell' in the database even when there is a match.
select count(*) from (select 'dell\'' as c) t where c regexp '[[:<:]]dell\''; -- -> 1
select count(*) from (select 'dell\'' as c) t where c regexp '[[:<:]]dell\'[[:>:]]'; -- -> 0
So it's the trailing boundary requirement which fails. Which makes sense. Quoting from the docs:
These markers stand for word boundaries. They match the beginning and
end of words, respectively. A word is a sequence of word characters
that is not preceded by or followed by word characters. A word
character is an alphanumeric character in the alnum class or an
underscore (_).
As ' is not a word character, it cannot be the end of a word, hence [[:>:]] can't match.
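Here is a small Python sketch of why the trailing marker fails, and of one possible workaround that is not from the answer above: instead of demanding "end of word" after the apostrophe, only demand that no word character follows. The lookarounds approximate MySQL's markers and all names are illustrative; in MySQL itself the looser check could be spelled ([^[:alnum:]_]|$).

```python
import re

WORD_START = r"(?<!\w)(?=\w)"     # approximates [[:<:]]
WORD_END = r"(?<=\w)(?!\w)"       # approximates [[:>:]]
NOT_BEFORE_WORD = r"(?!\w)"       # looser: the next character is not a word character

term = re.escape("dell'")

# Strict: the term must end exactly where a word ends. The apostrophe is not
# a word character, so this can never hold for a term ending in an apostrophe.
strict = re.compile(WORD_START + term + WORD_END, re.ASCII)

# Relaxed: the term just must not run straight into another word character.
relaxed = re.compile(WORD_START + term + NOT_BEFORE_WORD, re.ASCII)

if __name__ == "__main__":
    for sample in ["dell'", "buy dell' now", "dell's"]:
        print(repr(sample), bool(strict.search(sample)), bool(relaxed.search(sample)))
```

Whether "dell's" should count as a hit depends on your data; the relaxed form rejects it because the apostrophe is followed by a letter.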
I created a stored function that solved this for me. Call it with your value; you can change the prefix and suffix to suit your condition (for a value that ends in a non-word character such as ', drop the trailing [[:>:]]).
DELIMITER $$
DROP FUNCTION IF EXISTS pp$$
CREATE FUNCTION pp(slist VARCHAR(100)) RETURNS VARCHAR(120) CHARSET latin1 DETERMINISTIC
BEGIN
    -- Wrap the search term in MySQL word-boundary markers.
    RETURN CONCAT('[[:<:]]', slist, '[[:>:]]');
END$$
DELIMITER ;

More efficient word boundary query in mySQL

I have a table with half a million phrases and I am doing word matching using this query:
SELECT * FROM `searchIndex` WHERE `indexData` RLIKE '[[:<:]]Hirt'
The indexData field has a FULLTEXT index and is datatype longtext.
I want to match on items like
"Alois M. Hirt"
"Show Biz - Al Hirt, in a new role, ..."
"Al Hirt's Sinatraville open 9 p..."
"Hirt will be playing..."
and not on "shirt" or "thirteen" or "thirty" etc.
The query is succeeding but it frequently takes 3 seconds to return and I wondered if there was a better, more efficient way of doing this word boundary match?
If I were to add another index to indexData what would be the correct keylength to use?
TIA
No need to have a FULLTEXT index. MySQL has special markers for word boundaries. From the MySQL doc:
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]'; -> 1
mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]'; -> 0
setsuna's answer worked very well:
SELECT * FROM searchIndex WHERE MATCH (indexData) AGAINST ('Hirt*' IN BOOLEAN MODE);

Select trims spaces from strings - is this a bug or in the spec?

In MySQL:
select 'a' = 'a ';
returns 1.
You're not the first to find this frustrating. In this case, use LIKE for literal string comparison:
SELECT 'a' LIKE 'a '; -- returns 0
This behavior is specified in SQL-92 and SQL:2008. For the purposes of comparison, the shorter string is padded to the length of the longer string.
From the draft (8.2 <comparison predicate>):
If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD characteristic, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a <space>.
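The padding rule quoted above is easy to model. The following Python sketch is only an illustration of the PAD SPACE idea, with made-up function names; it ignores everything else a real collation does (case folding, accent handling and so on).

```python
def pad_space_equal(x: str, y: str) -> bool:
    """Compare two strings the PAD SPACE way: right-pad the shorter one
    with spaces to the length of the longer one, then compare."""
    width = max(len(x), len(y))
    return x.ljust(width) == y.ljust(width)


def no_pad_equal(x: str, y: str) -> bool:
    """Compare without padding, so trailing spaces are significant."""
    return x == y


if __name__ == "__main__":
    print(pad_space_equal("a", "a "))   # trailing space is ignored
    print(no_pad_equal("a", "a "))      # trailing space counts
    print(pad_space_equal("a", " a"))   # leading spaces always count
```

Only trailing spaces are absorbed by the padding; a leading space still makes the strings differ.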
In addition to the other excellent solutions:
select binary 'a' = 'a '
I googled for "mysql string" and found this:
In particular, trailing spaces [using LIKE] are significant, which is not true for CHAR or VARCHAR comparisons performed with the = operator
From the documentation (this describes MySQL before 8.0; the utf8mb4_0900 collations introduced in 8.0 are NO PAD):
All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces
The trailing spaces are stored in VARCHAR in MySQL 5.0.3+:
CREATE TABLE t_character (cv1 CHAR(10), vv1 VARCHAR(10), cv2 CHAR(10), vv2 VARCHAR(10));
INSERT
INTO t_character
VALUES ('a', 'a', 'a ', 'a ');
SELECT CONCAT(cv1, cv1), CONCAT(vv2, vv1)
FROM t_character;
but not used in comparison.
Here's another workaround that might help:
select 'a' = 'a ' and length('a') = length('a ');
returns 0