How to regex in a MySQL query

I have a simple task: I need to find records that start with a fixed string followed by a single digit. This is what I'm trying:
SELECT trecord FROM `tbl` WHERE (trecord LIKE 'ALA[d]%')
And
SELECT trecord FROM `tbl` WHERE (trecord LIKE 'ALA[0-9]%')
But both queries always return a null record:
trecord
-------
null
Whereas if I execute the following query
SELECT trecord FROM `tbl` WHERE (trecord LIKE 'ALA%')
it returns
trecord
-------
ALA0000
ALA0001
ALA0002
This means that I do have records that start with ALA followed by a digit.
EDIT
I'm doing it using PHP MySQL and innodb engine to be specific.

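MySQL's LIKE only understands the wildcards % and _; it has no bracket character classes, so [0-9] is matched literally. I think you can use REGEXP instead of LIKE: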
SELECT trecord FROM `tbl` WHERE (trecord REGEXP '^ALA[0-9]')
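The difference between the two operators can be tried out without a MySQL server. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, since SQLite's LIKE happens to treat brackets literally as well), registers a hypothetical regexp helper backed by Python's re module, and adds two made-up rows (ALAX001, ALB0001) to the sample data.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite rewrites "value REGEXP pattern" as regexp(pattern, value).
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE tbl (trecord TEXT)")
conn.executemany(
    "INSERT INTO tbl (trecord) VALUES (?)",
    [("ALA0000",), ("ALA0001",), ("ALA0002",), ("ALAX001",), ("ALB0001",)],
)


def records(where_clause):
    # Run one of the WHERE clauses from the question against the sample table.
    sql = f"SELECT trecord FROM tbl WHERE {where_clause} ORDER BY trecord"
    return [row[0] for row in conn.execute(sql)]


# Brackets are literal characters for LIKE, so nothing matches.
like_brackets = records("trecord LIKE 'ALA[0-9]%'")
# A plain prefix search also returns ALAX001, which has no digit after ALA.
like_prefix = records("trecord LIKE 'ALA%'")
# The anchored regex keeps only rows with a digit right after ALA.
regexp_digit = records("trecord REGEXP '^ALA[0-9]'")
```

Note that '^ALA[0-9]' only requires at least one digit after ALA; it says nothing about what follows that digit.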

In my case (Oracle), it's WHERE REGEXP_LIKE(column, 'regex.*'). See here:
SQL Function
Description
REGEXP_LIKE
This function searches a character column for a pattern. Use this
function in the WHERE clause of a query to return rows matching the
regular expression you specify.
...
REGEXP_REPLACE
This function searches for a pattern in a character column and
replaces each occurrence of that pattern with the pattern you specify.
...
REGEXP_INSTR
This function searches a string for a given occurrence of a regular
expression pattern. You specify which occurrence you want to find and
the start position to search from. This function returns an integer
indicating the position in the string where the match is found.
...
REGEXP_SUBSTR
This function returns the actual substring matching the regular
expression pattern you specify.
(Of course, REGEXP_LIKE matches any value that contains the pattern, so if you want a complete match, you'll have to anchor it with ^ (beginning) and $ (end), e.g.: '^regex.*$'.)

Related

How to find variable pattern in MySql with Regex?

I am trying to pull a product code from a long set of string formatted like a URL address. The pattern is always 3 letters followed by 3 or 4 numbers (ex. ???### or ???####). I have tried using REGEXP and LIKE syntax, but my results are off for both/I am not sure which operators to use.
The first select statement is close to trimming the URL to show just the code, but oftentimes will show a random string of numbers it may find in the URL string.
The second select statement is more rudimentary, but I am unsure which operators to use.
Which would be the quickest solution?
SELECT columnName, SUBSTR(columnName, LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", columnName), LENGTH(columnName) - LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", REVERSE(columnName))) AS extractedData FROM tableName
SELECT columnName FROM tableName WHERE columnName LIKE '%___###%' OR columnName LIKE '%___####%'
-- Will take a substring of this result as well
Example Data:
randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc
In this case, the desired string is "xyz123" and the location of said pattern is variable based on each entry.
EDIT
SELECT column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), SUBSTR(column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), LENGTH(column) - LOCATE(column REGEXP "^.*[a-zA-Z]{3}[0-9]{3,4}", REVERSE(column))) AS extractData From mainTable
This expression is still not grabbing the right data, but I feel like it may get me closer.
I suggest using
REGEXP_SUBSTR(column, '(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])')
Details:
(?<=[&?]random_code=[^&#]{0,256}-) - immediately on the left, there must be & or ?, random_code=, and then zero to 256 chars other than & and # followed with a - char
[a-zA-Z]{3} - three ASCII letters
[0-9]{3,4} - three to four ASCII digits
(?![^&#]) - that are followed either with &, # or end of string.
See the online demo:
WITH cte AS ( SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc' val
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz4567&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz89&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz00000&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-aaaaa11111&hello_world=us&etc_etc')
SELECT REGEXP_SUBSTR(val,'(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])') output
FROM cte
I'd make use of capture groups:
(?<=[=\-\\])([a-zA-Z]{3}[\d]{3,4})(?=[&])
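I assume that with [^=\-] you wanted to match a code that has "-", "\" or "=" in front of it, without including that character in the result. As written, [^=\-] does the opposite: it matches any character other than "=" or "-". To get what you want, use a "positive lookbehind": (?<=...).
I also added a lookahead, (?=...), for "&".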
If you'd like to fidget more with regex I recommend RegExr
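As a quick way to check the lookbehind/lookahead pattern above against the sample URLs, here is a small sketch in Python. Python's re module accepts this pattern unchanged; whether a given MySQL version accepts it is a separate question. The names CODE, extract_code, BASE and TAIL are made up for the illustration.

```python
import re

# Same pattern as in the answer above: a "=", "-" or "\" on the left,
# three letters, three or four digits, then a "&" on the right.
CODE = re.compile(r"(?<=[=\-\\])([a-zA-Z]{3}[\d]{3,4})(?=[&])")


def extract_code(url):
    # Return the product code, or None when the URL has no such code.
    match = CODE.search(url)
    return match.group(1) if match else None


BASE = "randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-"
TAIL = "&hello_world=us&etc_etc"

codes = [extract_code(BASE + c + TAIL) for c in
         ["xyz123", "xyz4567", "xyz89", "xyz00000", "aaaaa11111"]]
```

Because the lookahead insists on a following "&", a code that sits at the very end of the URL would not be found; widening the lookahead to also accept end of string is one way to cover that case.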

Why is the regular expression setminus operator not working for MySQL?

I am trying to select all client names without vowels from a table (should therefore return an empty list) using the setminus operator with regular expressions, but it is simply returning the entire column. The same happens if I try to select all client names without 'a' or 'e' or any other vowel.
This is the query I'm using:
select client_name from client
where client_name regexp '[^aeiou]';
If I try doing a condition like below, then the inside caret actually does take every character other than 'a'. I'm not sure why it doesn't work by itself though.
select client_name from client
where client_name regexp '^[^a]'
Expected - empty output
Actual Results - whole column is returned
The regular expression can match anywhere in the name. So it will match any name that has any non-vowel character, not where all the characters are not vowels. You need to anchor it and quantify it:
WHERE client_name REGEXP '^[^aeiou]*$'
This tests all the characters in the name.
Or you can negate the test:
WHERE client_name NOT REGEXP '[aeiou]'
The regexp matches a vowel anywhere in the name. Then using NOT makes this return the names that don't match.
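The difference between "contains a non-vowel" and "contains only non-vowels" can be seen in a small sketch using Python's re module, whose search behaves like an unanchored REGEXP here. The names list is invented sample data, all lowercase so that case sensitivity does not matter.

```python
import re

names = ["anna", "bob", "eve", "brr"]


def matching(pattern, values):
    # re.search, like REGEXP, succeeds if the pattern matches anywhere.
    return [v for v in values if re.search(pattern, v)]


# Every name contains at least one non-vowel, so all of them match.
unanchored = matching(r"[^aeiou]", names)
# Anchored and quantified: every character must be a non-vowel.
anchored = matching(r"^[^aeiou]*$", names)
# Negated test: keep names in which no vowel is found at all.
negated = [v for v in names if not re.search(r"[aeiou]", v)]
# The variant from the question: names whose first character is not "a".
not_starting_with_a = matching(r"^[^a]", names)
```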

How to use SQL to remove superfluous characters from names?

How do I remove all superfluous full-stop . and semi-colon ; characters from end of last name field values in SQL?
One way to check whether the last character is a full stop or semicolon is to use a substring function to get the last character and compare it to the characters you are looking for. (There are several ways to do this, for example using the LIKE or REGEXP operator.)
If that last character matches, then lop off that last character. One way to do that is to use a substring function. (Use the CHAR_LENGTH function to return the number of characters in the string.)
For example, something like this:
UPDATE mytable t
SET t.last_name = SUBSTR(t.last_name,1,CHAR_LENGTH(t.last_name)-1)
WHERE SUBSTRING(t.last_name,CHAR_LENGTH(t.last_name),1) IN ('.',';')
But, I'd strongly recommend that you test those expressions using a SELECT statement, before running an UPDATE statement.
SELECT t.last_name AS old_val
, SUBSTR(t.last_name,1,CHAR_LENGTH(t.last_name)-1) AS new_val
FROM mytable t
WHERE SUBSTRING(t.last_name,CHAR_LENGTH(t.last_name),1) IN ('.',';')
Update only the rows that end with a semicolon or dot, dropping the last character:
update emp
set ename = substring(ename, 1, char_length(ename) - 1)
where ename REGEXP '[.;]$';
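Both UPDATE statements above remove a single trailing character per run, so a value such as "Smith.;" would need two passes. The sketch below, in Python, contrasts that with stripping the whole trailing run at once; strip_one and strip_all are hypothetical helper names. In MySQL 8.0 and later, the same idea could presumably be written with REGEXP_REPLACE(last_name, '[.;]+$', ''), but that is an assumption about the version in use.

```python
import re


def strip_one(name):
    # Mirrors the UPDATE statements: drop the last character only
    # when it is a full stop or a semicolon.
    return name[:-1] if name.endswith((".", ";")) else name


def strip_all(name):
    # Remove the whole run of trailing full stops and semicolons.
    return re.sub(r"[.;]+$", "", name)


one_pass = strip_one("Smith.;")
all_at_once = strip_all("Smith.;")
```

Characters inside the name are left alone in both versions, so "St. John;" becomes "St. John".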

Mysql SELECT all rows where char exists in value but not the last one

I need a SELECT query in MySQL that will retrieve all rows in one table whose field values contain the "?" character, with one condition: the character is not the last character.
Example:
ID Field
1 123??see
2 12?
3 45??78??
Returning rows would then be those from ID 1 and 3 that match the condition given
The only statement I have is:
SELECT *
FROM table
WHERE Field LIKE '%?%'
But this query does not solve my problem.
The LIKE expressions also support a wildcard "_" which matches exactly one character.
So you can write an expression like the example below, and know that your "?" will not be the last character in the string. There must be at least one more character.
WHERE intrebare LIKE '%?_%'
Re the comment from @JohnRuddell:
Yes, that's true, this will match the string "??" because a "?" exists in a position that is not the last character.
It depends on whether the OP means for that to be a match or not. The OP says the string "45??78??" is a match, but it's not clear whether they would intend "4578??" to be a match.
An alternative is to use a regular expression, but this is a little more tricky because you have to escape a literal "?", so it won't be interpreted as a regexp metacharacter. Then also escape the escape character.
WHERE intrebare REGEXP '\\?[^?]'
You can just add an additional condition to the WHERE clause requiring that the last character is not a "?":
SELECT *
FROM intrebari
WHERE intrebare LIKE '%?%' AND intrebare NOT LIKE '%?'
You could also do it like this:
SELECT *
FROM intrebari
WHERE intrebare LIKE '%?%' AND RIGHT(intrebare,1) <> '?'
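The approaches above do not all select the same rows. The following sketch compares them on the question's sample data plus the "4578??" case raised in the comments. It uses Python's sqlite3 module as a stand-in for MySQL's LIKE (an assumption) and Python's re module for the regular expression; SQLite has no RIGHT() function, so the last query is represented by its NOT LIKE equivalent.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE intrebari (id INTEGER, intrebare TEXT)")
conn.executemany(
    "INSERT INTO intrebari VALUES (?, ?)",
    [(1, "123??see"), (2, "12?"), (3, "45??78??"), (4, "4578??")],
)


def ids(where_clause):
    # Return the ids of the rows selected by the given WHERE clause.
    sql = f"SELECT id FROM intrebari WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# A "?" followed by at least one more character of any kind.
followed_by_any = ids("intrebare LIKE '%?_%'")
# Contains a "?" and the string does not end with "?".
not_ending = ids("intrebare LIKE '%?%' AND intrebare NOT LIKE '%?'")
# A "?" followed by a character that is not "?" (the REGEXP variant).
rows = conn.execute("SELECT id, intrebare FROM intrebari ORDER BY id").fetchall()
followed_by_other = [i for i, s in rows if re.search(r"\?[^?]", s)]
```

On this data, only the regular expression returns exactly IDs 1 and 3. The LIKE '%?_%' version also returns "4578??", and the NOT LIKE version drops ID 3 because that value ends with "?".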

Find MySQL DB rows with a match in a pipe delimited column

I'm querying a table that has a column with member_ids stuffed in a pipe delimited string. I need to return all rows where there is an 'exact' match for a specific member_id. How do I deal with other IDs in the string which might match 'part' of my ID?
I might have some rows as follows:
1|34|11|23
1011
23|1
5|1|36
64|23
If I want to return all rows with the member_id '1' (rows 1, 3 and 4), is that possible without having to extract all rows and explode the column to check whether any of the items in the resulting array match?
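MySQL's regular expressions support metacharacters that match word boundaries. (In MySQL 8.0.4 and later, which use the ICU regex library, write '\\b1\\b' instead of the [[:<:]] and [[:>:]] notation shown here.)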
SELECT ...
FROM mytable
WHERE member_ids REGEXP '[[:<:]]1[[:>:]]'
See http://dev.mysql.com/doc/refman/5.6/en/regexp.html
If you don't like that, you can search using a simpler regular expression, but you have to escape the pipes because they have special meaning to regular expressions. And you also have to escape the escaping backslashes so you get literal backslashes through to the regular expression parser.
SELECT ...
FROM mytable
WHERE member_ids REGEXP '\\|1\\|'
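As written, that pattern only matches an ID with a pipe on both sides. You can cover the first and last positions in the same expression if you modify your strings to include a delimiter at the start and the end of the string, for example by matching against CONCAT('|', member_ids, '|'). Then you don't have to add special cases for beginning of string and end of string.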
Note this is bound to do a table-scan. There's no way to index a regular expression match in MySQL. I agree with @MichaelBerkowski, you would be better off storing the member IDs in a subordinate table, one ID per row. Then you could search and sort and all sorts of other things that the pipe-delimited string makes awkward, inefficient, or impossible. See also my answer to Is storing a delimited list in a database column really that bad?
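The two matching strategies from this answer can be checked against the sample rows with a short sketch in Python. has_member and has_member_regex are hypothetical helper names; the first mirrors the idea of wrapping the string in delimiters, the second uses \b as the word-boundary marker.

```python
import re

rows = ["1|34|11|23", "1011", "23|1", "5|1|36", "64|23"]


def has_member(member_ids, member_id):
    # Wrap the list in delimiters so that the first and last IDs
    # are surrounded by pipes just like the ones in the middle.
    return f"|{member_id}|" in f"|{member_ids}|"


def has_member_regex(member_ids, member_id):
    # Word boundaries keep "1" from matching inside "11" or "1011".
    pattern = rf"\b{re.escape(str(member_id))}\b"
    return re.search(pattern, member_ids) is not None


by_delimiter = [r for r in rows if has_member(r, 1)]
by_boundary = [r for r in rows if has_member_regex(r, 1)]
```

Both versions select rows 1, 3 and 4 of the sample data and skip "1011" and "64|23".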
'|' has a specific meaning in REGEXP. So suppose that the ids are separated by another delimiter like '~'.
Then you can run this code:
SELECT * FROM `t1`
where (Address Regexp '^1~') or
(Address Regexp '~1$') or
(Address Regexp '^1$') or
(Address Regexp '~1~')