SQL - find exact match inside a string - mysql

I am using a MySQL database. I have a table with a column for the account hierarchy; the values in this column look like this:
/home/first level hierarchy / second level hierarchy / third level hierarchy/
(some have only one level and some have three levels).
I want to get only the first-level hierarchy items.
My SQL query looks like this:
SELECT Account.name FROM Account WHERE account_hierarchy LIKE "/home/%/"
The problem is that the result contains every level after home, since they all end with /.
Is there any way in SQL to get only the first level?

I think you want exactly three slashes. You can get that using:
where hierarchy like '/home/%/' and
hierarchy not like '/home/%/%/'
In databases that support regular expressions, there are alternative solutions.
Actually, if you want just a beginning and ending slash, then there is no slash in the middle with other characters around it. So you can use:
where hierarchy not like '/home/%_/_%'
This assumes that all begin with '/home/'.
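To check both variants quickly, here is a small sketch that runs them against three sample rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: the % and _ wildcards behave the same way for these patterns), and the table name `account` and the sample paths are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (hierarchy TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?)",
    [("/home/a/",), ("/home/a/b/",), ("/home/a/b/c/",)],
)

# Variant 1: matches '/home/<something>/' but rejects anything with a further level.
two_patterns = [row[0] for row in conn.execute(
    "SELECT hierarchy FROM account "
    "WHERE hierarchy LIKE '/home/%/' AND hierarchy NOT LIKE '/home/%/%/'"
)]

# Variant 2: rejects any row that has a slash with characters on both sides after '/home/'.
one_pattern = [row[0] for row in conn.execute(
    "SELECT hierarchy FROM account WHERE hierarchy NOT LIKE '/home/%_/_%'"
)]

print(two_patterns)
print(one_pattern)
```

Both queries keep only the first-level row in this sample.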

In MySQL 8.0 and later, you can use REGEXP_SUBSTR.
Example:
SELECT REGEXP_SUBSTR('/a/b/c', '[a-z]+', 3), returns b
The third argument is the position at which the search starts. Position 3 is the second '/', so the first match from there is 'b', the piece after the second '/'.
You might need to change the regular expression '[a-z]+', which only matches lower-case paths. If you also need upper-case matches, change it to '[a-zA-Z]+'.
All the other possibilities of regular expressions (in MySQL) are explained here.
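To see how the start position drives the result, here is a rough Python imitation of the three-argument call. The helper `regexp_substr` is hypothetical and only mimics the 1-based start position; it is not the MySQL implementation and ignores the other optional arguments.

```python
import re

def regexp_substr(text, pattern, pos=1):
    """Rough stand-in for REGEXP_SUBSTR(text, pattern, pos); pos is 1-based."""
    # Python's search() takes a 0-based start index, hence pos - 1.
    match = re.compile(pattern).search(text, pos - 1)
    return match.group(0) if match else None

first = regexp_substr("/a/b/c", "[a-z]+")      # search starts at the first character
second = regexp_substr("/a/b/c", "[a-z]+", 3)  # search starts at the second '/'
third = regexp_substr("/a/b/c", "[a-z]+", 5)   # search starts at the third '/'
print(first, second, third)
```

So the 3 in the example is a character position, not a count of slashes; with longer path segments the position would have to be computed first.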

Related

Difference between two seemingly similar regular expressions in MySQL - identical outputs

I have been trying to teach myself MySQL, and was wondering if somebody could please explain the difference between the use of the regex metacharacters '*' and '?'. The book I am using describes them both as "matching (0) or one instances of the strings preceding it". I tried using both while looking for the same thing in a practice table I created and got exactly the same output, so if one of the operators is supposed to be greedy and the other not, it doesn't look like that is always the case with every table.
Edit 1: I'm including a screenshot of the output I got from '*' that shows it matching a statement of the form 'ax*' to just a.
Edit 2: regex101.com does not list MySQL as a "flavor" and when I try to do 'al*' to Alexandra there, it says no match for any of the flavors. Is the fact that MySQL Workbench is returning Alexandra as one of the outputs something specific to MySQL that does not apply to any of these other languages?
They are not quite the same. "*" means "0 or more". "?" means "0 or 1", like "optional". So, given "ax*b" and "ax?b":
Neither will match "a"
Both will match "ab"
Both will match "axb"
Only the first will match "axxb" or "axxxxxxb"
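The list above can be checked with a few lines of Python. This is only a sketch using Python's re module as a stand-in for MySQL's regex engine. The last two lines illustrate the question's edit: MySQL's REGEXP matches anywhere in the string, and is case-insensitive for non-binary strings by default, which is what re.search with IGNORECASE imitates here.

```python
import re

candidates = ["a", "ab", "axb", "axxb", "axxxxxxb"]

# fullmatch() requires the whole string to match, so the quantifiers can be compared directly.
star = [s for s in candidates if re.fullmatch(r"ax*b", s)]      # x repeated 0 or more times
optional = [s for s in candidates if re.fullmatch(r"ax?b", s)]  # x present 0 or 1 times

print(star)
print(optional)

# An unanchored, case-insensitive search finds 'al*' inside 'Alexandra'
# (the match is just 'Al' at the start).
unanchored = re.search(r"al*", "Alexandra", re.IGNORECASE)
print(unanchored.group(0))
```

When the pattern is not anchored, 'ax*' and 'ax?' both match any string containing an 'a', which would explain getting identical output from both.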

Searching for entry with id in comma separated list in mysql

I want to get entries from a mysql table, which contain a given id within a comma separated list. I want to use regular expressions and the LIKE selector.
My current approach looks like this
SELECT * FROM table WHERE list LIKE '%,0,%';
with the problem being that this ignores the first and last element in a list like '0,1,2,3'.
I've tried using the | or operator to test for all possible cases.
SELECT * FROM table WHERE list LIKE '(%,0,%)|(^0,%)';
I've tried this with and without the ^ character and with and without the parenthesis, but in all cases this approach didn't even match the characters in the middle. In fact, the or operator doesn't seem to be working in even the simplest expressions like
SELECT * FROM table WHERE list LIKE '%(1|2)%';
You should fix your data model! Do not store lists of things -- especially numbers -- in a string. SQL has a great data model for storing lists: it is called a table.
If you are stuck with someone else's really, really, really bad choice of data model, you can work around it. MySQL has a handy function, find_in_set(), that does what you want:
WHERE find_in_set('0', list) > 0
Concatenate commas to the start and the end of the list:
SELECT * FROM table WHERE concat(',', list, ',') LIKE '%,0,%';
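Here is a quick sketch of why wrapping the list in commas helps. It uses Python's sqlite3 module instead of MySQL, so the concatenation is written with SQLite's || operator rather than CONCAT; the table `items`, the column `id_list`, and the rows are invented for the example. Note that LIKE only understands % and _, which is why the | attempts in the question never matched.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id_list TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("0,1,2",), ("1,0,2",), ("1,2,0",), ("10,20",), ("0",)],
)

# Naive pattern: only finds the id when it sits in the middle of the list.
naive = [row[0] for row in conn.execute(
    "SELECT id_list FROM items WHERE id_list LIKE '%,0,%' ORDER BY rowid"
)]

# Wrapped pattern: every element, including the first and last, is now surrounded by commas.
wrapped = [row[0] for row in conn.execute(
    "SELECT id_list FROM items WHERE ',' || id_list || ',' LIKE '%,0,%' ORDER BY rowid"
)]

print(naive)
print(wrapped)
```

The row '10,20' is correctly left out in both cases, because the commas keep 0 from matching inside 10 or 20.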

Performance of LIKE 'xyz%' v/s LIKE '%xyz'

I was wondering how the LIKE operator actually works.
Does it simply start from first character of the string and try matching pattern, one character moving to the right? Or does it look at the placement of the %, i.e. if it finds the % to be the first character of the pattern, does it start from the right most character and starts matching, moving one character to the left on each successful match?
Not that I have any use case in my mind right now, just curious.
edit: made question narrow
If there is an index on the column, putting constant characters in the front will lead your dbms to use a more efficient searching/seeking algorithm. But even at the simplest form, the dbms has to test characters. If it is able to find it doesn't match early on, it can discard it and move onto the next test.
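As a rough illustration of why a constant prefix helps when there is an index, think of the index as a sorted list. This is only a sketch of the idea in Python, not how any particular DBMS implements it; the function names and sample values are made up.

```python
from bisect import bisect_left

def prefix_range(sorted_values, prefix):
    """Find values starting with prefix via two binary searches (like an index range scan)."""
    lo = bisect_left(sorted_values, prefix)
    # Upper bound: the prefix followed by the largest code point.
    # (Simplification: assumes no value contains that code point right after the prefix.)
    hi = bisect_left(sorted_values, prefix + chr(0x10FFFF))
    return sorted_values[lo:hi]

def suffix_scan(values, suffix):
    """A leading wildcard gives the sort order nothing to work with: test every value."""
    return [v for v in values if v.endswith(suffix)]

names = sorted(["abc", "xyz", "xyzzy", "xy", "axyz", "xyza"])
starts_with = prefix_range(names, "xyz")  # LIKE 'xyz%'
ends_with = suffix_scan(names, "xyz")     # LIKE '%xyz'
print(starts_with)
print(ends_with)
```

The prefix search touches only a small slice of the sorted list, while the suffix search has to look at every entry.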
The LIKE search condition uses wildcards to search for patterns within a string. For example:
WHERE name LIKE 'Mickey%'
will locate all values that begin with 'Mickey', optionally followed by any number of characters. Under a case-insensitive, accent-insensitive collation the match ignores case and accents. You can use multiple % wildcards, for example
WHERE name LIKE '%mouse%'
will return all values with 'mouse' (or 'Mouse' or 'mousé') in it.
The % also matches zero characters, meaning that
WHERE name LIKE '%A%'
will return all values that start with 'A', contain 'A', or end with 'A'.
You can use _ (underscore) for any character on a single position:
WHERE name LIKE '_at%'
will give you all values with 'a' as the second letter and 't' as the third. The first letter can be anything. For example: 'Batman'
In T-SQL, if you use [] you can find values in a range.
WHERE name LIKE '[c-f]%'
finds any value beginning with a letter between c and f, inclusive; that is, it returns any value that starts with c, d, e or f. The [] syntax is T-SQL only. Use [^ ] to find values not in a range.
Finding all values that contain a number:
WHERE name LIKE '%[0-9]%'
returns everything that has a number in it. Example: 'Godfather2'
If you are looking for all values with a '-' (dash) in the 3rd position, use two underscores:
WHERE name LIKE '__-%'
It will return for example: 'Lo-Res'
To find values whose names end in 'xyz', use:
WHERE name LIKE '%xyz'
returns anything that ends with 'xyz'
To find a % sign in a name, use brackets:
WHERE name LIKE '%[%]%'
will return for example: 'Top%Movies'
To search for [, put brackets around it:
WHERE name LIKE '%[[]%'
gives results as: 'New York [NY]'
The database collation's sort order determines both case sensitivity and the sort order for the range of characters. You can optionally use COLLATE to specify the collation sort order used by the LIKE operator.
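The portable wildcards (% and _) from the examples above can be tried out with a short script. This sketch uses Python's sqlite3 module rather than T-SQL or MySQL, so the bracket syntax is left out; the table `movies`, the helper `like`, and the sample names are invented. SQLite's LIKE happens to be case-insensitive for ASCII letters by default, which plays the role of a case-insensitive collation here.

```python
import sqlite3

# In-memory SQLite database as a stand-in; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("Batman",), ("Catwoman",), ("Lo-Res",), ("Mickey Mouse",), ("abcxyz",)],
)

def like(pattern):
    """Return all names matching the LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM movies WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]

print(like("_at%"))     # any first character, then 'at', then anything
print(like("__-%"))     # '-' in the third position
print(like("%mouse%"))  # contains 'mouse', ignoring ASCII case in SQLite
print(like("%xyz"))     # ends with 'xyz'
```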
Usually the main performance bottleneck is I/O. The efficiency of the LIKE operator only matters if your whole table fits in memory; otherwise I/O will take most of the time.
AFAIK Oracle can use indexes for prefix matching (like 'abc%'), but these indexes cannot be used for more complex expressions.
Anyway, if you only have this kind of query, you should consider using a simple index on the related column. (This is probably true for other RDBMSs as well.)
Otherwise the LIKE operator is generally slow, but most RDBMSs have some kind of full-text search solution. I think the main reason for the slowness is that LIKE is too general. Full-text indexes usually have lots of options that tell the database what you really want to search for, and with this additional information the DB can do its task more efficiently.
As a rule of thumb, if you want to search in a text field and you think performance can be an issue, you should consider your RDBMS's full-text search solution. If the real goal is not text searching but some kind of "design side effect" (for example XML/JSON/statuses stored in a field as text), then you should probably consider a more efficient way of storing the data (if there is any...).

Finding small letter between two capital letters - MySQL

I've got a problem: I need to find every single phrase like AbC (a small b between two capital letters).
For example, given the statement:
Little John had a ProBlEm and need to know how to do tHiS.
I need to select ProBlEm and tHiS (you see, BlE and HiS: one small letter between two capitals).
How can I select this?
In MySQL you can use a binary (to ensure case sensitivity) regular expression to filter for those records that contain such a pattern:
WHERE my_column REGEXP BINARY '[[:upper:]][[:lower:]][[:upper:]]'
However, it is not so straightforward to extract the substrings which match such a pattern from within MySQL. One can use a UDF, e.g. lib_mysqludf_preg, but it's probably a task more suited to being performed within your application layer. In either case, regular expressions can again help to simplify this task.
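If the extraction is done in the application layer, it could look something like this. This is a minimal sketch in Python (the question does not say which language the application uses), and the function name `words_with_pattern` is made up.

```python
import re

# One lower-case letter between two upper-case letters (ASCII only in this sketch).
PATTERN = re.compile(r"[A-Z][a-z][A-Z]")

def words_with_pattern(text):
    """Return the words that contain an upper-lower-upper sequence."""
    words = re.findall(r"[A-Za-z]+", text)
    return [word for word in words if PATTERN.search(word)]

sentence = "Little John had a ProBlEm and need to know how to do tHiS."
found = words_with_pattern(sentence)
print(found)
```

The SQL filter narrows down the rows, and a loop like this then pulls out the matching words from each returned value.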
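First you have to split the string into words. Please refer to this SO question.
Then test each retrieved word. Note that LIKE in MySQL does not support bracket ranges such as [A-Z], so use a case-sensitive regular expression instead:
word REGEXP BINARY '[A-Z][a-z][A-Z]'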

MySQL: REGEXP to remove part of a record

I have a table "locales" with a column named "name". The records in name always begin with a number of characters followed by an underscore (i.e. "foo_", "bar_"...). A record can have more than one underscore, and the pattern before the underscore may be repeated (i.e. "foo_bar_", "foo_foo_").
How, with a simple query, can I get rid of everything before the first underscore including the first underscore itself?
I know how to do this in PHP, but I cannot understand how to do it in MySQL.
SELECT LOCATE('_', 'foo_bar_') ... will give you the location of the first underscore and SUBSTR('foo_bar_', LOCATE('_', 'foo_bar_')) will give you the substring starting from the first underscore. If you want to get rid of that one, too, increment the locate-value by one.
If you now want to replace the values in the table itself, you can do this with an update statement like UPDATE table SET column = SUBSTR(column, LOCATE('_', column) + 1), where the + 1 also drops the underscore.
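Here is a small sketch of that update, run through Python's sqlite3 module as a stand-in for MySQL. One difference to keep in mind: SQLite has no LOCATE, so the sketch uses instr(name, '_'), which takes its arguments in the opposite order. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locales (name TEXT)")
conn.executemany(
    "INSERT INTO locales VALUES (?)",
    [("foo_bar_",), ("foo_foo_",), ("bar_",)],
)

# Keep everything after the first underscore; the + 1 skips the underscore itself.
conn.execute("UPDATE locales SET name = substr(name, instr(name, '_') + 1)")

updated = [row[0] for row in conn.execute("SELECT name FROM locales ORDER BY rowid")]
print(updated)
```

A name that consists only of the prefix, like 'bar_', ends up as an empty string.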
select substring('foo_bar_text' from locate('_','foo_bar_text'))
Before MySQL 8.0, MySQL regexes can only match data; they can't do replacements (8.0 added REGEXP_REPLACE). On older versions you'd need to do the replacing client-side in your PHP script, or use standard string operations in MySQL to make the changes.
UPDATE sometable SET somefield=RIGHT(somefield, LENGTH(somefield) - LOCATE('_', somefield));
Probably got some off-by-one errors in there, but that's the basic way of going about it.