MySQL REGEXP: only strings containing two words

I have a table with rows of strings.
I'd like to search for the strings that consist of only
two words.
I tried a few ways with [[:space:]] etc., but MySQL was also returning
three- and four-word strings.

Try this:
SELECT * FROM yourTable WHERE field REGEXP('^[[:alnum:]]+[[:blank:]]+[[:alnum:]]+$');
More details in the documentation:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
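The anchored pattern can also be tried outside the database. A minimal sketch in Python, assuming [A-Za-z0-9] as a stand-in for [[:alnum:]] and [ \t] for [[:blank:]] (Python's re module has no POSIX bracket classes); the sample strings and the name two_word_rows are made up for illustration:

```python
import re

# Assumed Python translation of '^[[:alnum:]]+[[:blank:]]+[[:alnum:]]+$'.
TWO_WORDS = re.compile(r"^[A-Za-z0-9]+[ \t]+[A-Za-z0-9]+$")

def two_word_rows(rows):
    """Return only the strings made of exactly two alphanumeric words."""
    return [row for row in rows if TWO_WORDS.match(row)]

# Hypothetical sample data, not taken from the question.
sample = ["hello world", "one", "three word string", "a  b", "trailing space "]
print(two_word_rows(sample))
```

Because the separator is quantified with +, two words separated by several blanks still count as a two-word string.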

^\w+\s\w+$ should do well. (Before MySQL 8.0 the regex engine does not understand \w and \s, so use bracket classes such as [[:alnum:]] and [[:space:]] there; in 8.0 and later the backslashes must be doubled inside a string literal.)
Note: what I have noticed lately is that almost nobody uses the ^ and $ anchors.
They are essential if you want to test whether a string starts or ends with something, or to match the whole string exactly, as you do here. Unanchored patterns like the one you used (I assume something like \w[[:space:]]\w) match if the condition holds anywhere within the string, which is why three- and four-word strings were returned as well.
Keep that in mind and regex will serve you well :)
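To see the difference the anchors make, here is a small sketch using Python's re module as a stand-in for the database engine (an assumption; the dialects differ in details, but ^ and $ behave the same way for single-line strings):

```python
import re

text = "three word string"

# Without anchors the pattern only has to match somewhere inside the string.
unanchored = re.search(r"\w+\s\w+", text)
# With ^ and $ the whole string must consist of exactly two words.
anchored = re.search(r"^\w+\s\w+$", text)

print(unanchored.group(0) if unanchored else None)
print(anchored)
```

The unanchored search finds a two-word fragment inside the three-word string, while the anchored one rejects it.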

REGEXP ('^[a-z0-9]+[[:space:]][a-z0-9]+$')

Related

Match specific string before user input

I have the following strings:
SDZ420-1241242,
AS42-9639263,
SPF3-2352353
I want to skip the SDZ420- part while searching and search only on the trailing digits. So far I've tried RLIKE '^[a-zA-Z\d-]', which works, but I am confused about how to add the digits that follow (user input, say 1241242) to it. I cannot use LIKE '%$input', since that would return a row even if I just input '242' as the search string.
In simple words, a user input of '1241242' should return the row with 'SDZ420-1241242'. Is there any other approach other than creating a separate table with the numbers only?
Note that, without jumping through some crazy hoops, this search has to scan every row in the table; an index on the column will not be used. An index is generally used only when you search on the start of the value, that is, with LIKE 'needle%', and not with RLIKE. If that's a problem, storing the digits separately and putting an index on that is probably the simplest way to solve your problem here.
To query for the final few digits, why not:
SELECT * FROM foo WHERE colName LIKE ?
with the string made in your programming language via:
String searchTerm = "%-" + digits;
You can also pass in the number as a string and use:
where substring_index(colname, '-', -1) = ?
This does not require changing the value in the application code.
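The LIKE approach can be tried end to end. A minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption: SQLite's LIKE treats % the same way); the function name find_by_digits is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (colName TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?)",
    [("SDZ420-1241242",), ("AS42-9639263",), ("SPF3-2352353",)],
)

def find_by_digits(digits):
    """Return values that end with '-' followed by exactly these digits."""
    # The hyphen is part of the pattern, so '242' cannot match '...-1241242'.
    # Note: a % or _ inside the user input would still act as a wildcard here.
    cur = conn.execute(
        "SELECT colName FROM foo WHERE colName LIKE ?", ("%-" + digits,)
    )
    return [row[0] for row in cur]

print(find_by_digits("1241242"))
print(find_by_digits("242"))
```

Passing the pattern as a bound parameter keeps the user input out of the SQL text itself.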

MySql Specific Search - Replace String

I need to search for words that start with certain number prefixes.
Example:
0119
0129
0139
0149
But there are other prefixes, such as 0155859 and 0128889.
If I search for 0%9, I get results I don't want, including the 0155859 and 0128889 ones.
I need to search for and list ONLY the ones that have 0119, etc.
How do I do it?
0XX9 (where XX is any two characters, so 0119, 0129, etc. The % matches all other characters until a 9 appears, and I don't want that.)
I'm still working on my English, so correct me if I didn't express myself clearly!
In a LIKE pattern, the _ character matches any single character. So you can do:
WHERE word LIKE '0__9%'
This matches a word that begins with 0, then any two characters, then 9, then anything after that.
My gut feeling at seeing your question was to consider using REGEXP, which is MySQL's regex matching operator. Try the following query:
SELECT *
FROM yourTable
WHERE word REGEXP '0[0-9][0-9]9'
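The pattern matches any word containing a zero, followed by any two digits, followed by a 9. Because it is not anchored, the match can occur anywhere in the word; prefix the pattern with ^ to require it at the start.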
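The effect of anchoring this pattern can be checked quickly. A sketch with Python's re module standing in for MySQL's REGEXP (an assumption; only syntax common to both is used). The value 5501295 is a made-up example of a word that merely contains 0129:

```python
import re

# The first four values follow the question; '5501295' is hypothetical.
words = ["0119555", "0129555", "0155859", "0128889", "5501295"]

# Unanchored: 0, two digits, 9 may appear anywhere in the word.
contains = [w for w in words if re.search(r"0[0-9][0-9]9", w)]
# Anchored with ^: the word has to start with 0, two digits, 9.
starts_with = [w for w in words if re.search(r"^0[0-9][0-9]9", w)]

print(contains)
print(starts_with)
```

Only the anchored version behaves like LIKE '0__9%', with the extra restriction that the two middle characters must be digits.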

How to find words by partial strings

I have been trying to solve this problem for hours, but I don't know how to approach it, so I need a push in the right direction.
I want to create a page where users can find the appropriate word, by providing word length and characters.
For example, user wants to find all the 5 letter words, where the second letter is R and fourth V, like this:
_R_V_
I have a table with column WORDS with words "letter", "moon", "drive", "mrive" and the query should return: "drive" and "mrive".
Is it possible to do it in MySQL?
While looking for a direction, I found that I should create a trie structure. I don't know how to do that, but I will learn it if there is no easier way.
Yes, you can use LIKE:
SELECT * FROM YourTable t
WHERE t.word_col LIKE '_R_V_'
The _ wildcard stands for any single character. This also forces the string to be exactly 5 characters long, since the % wildcard is not used. Whether 'R' also matches a lowercase 'r' depends on the column's collation.
You can find a great explanation about LIKE wildcards in the link above.
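For the page described in the question, the pattern has to be built from the word length and the known letters. A minimal sketch, using Python's sqlite3 as a stand-in for MySQL (an assumption: SQLite's LIKE is case-insensitive for ASCII by default, which mimics a case-insensitive MySQL collation); the helper name like_pattern is made up:

```python
import sqlite3

def like_pattern(length, known):
    """Build a LIKE pattern of `length` characters.

    `known` maps 1-based positions to letters; every other position
    becomes the single-character wildcard '_'.
    """
    return "".join(known.get(pos, "_") for pos in range(1, length + 1))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (word_col TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [("letter",), ("moon",), ("drive",), ("mrive",)],
)

pattern = like_pattern(5, {2: "R", 4: "V"})
found = [
    row[0]
    for row in conn.execute(
        "SELECT word_col FROM YourTable WHERE word_col LIKE ? ORDER BY word_col",
        (pattern,),
    )
]
print(pattern, found)
```

If users may type % or _ as letters, those characters would need escaping before being placed in the pattern.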

Performance of LIKE 'xyz%' v/s LIKE '%xyz'

I was wondering how the LIKE operator actually works.
Does it simply start from the first character of the string and try to match the pattern, moving one character to the right each time? Or does it look at the placement of the %, i.e. if it finds the % to be the first character of the pattern, does it start from the rightmost character and match from there, moving one character to the left on each successful match?
Not that I have any use case in my mind right now, just curious.
edit: made question narrow
If there is an index on the column, putting constant characters at the front of the pattern lets your DBMS use a more efficient seek instead of testing every row. But even in the simplest form, the DBMS has to test characters. If it can tell early on that a value doesn't match, it can discard it and move on to the next one.
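Why a leading constant helps can be illustrated with a sorted Python list standing in for an index. This is only an analogy under that assumption, not how any particular DBMS implements its indexes; the function names are made up:

```python
from bisect import bisect_left

def prefix_search(sorted_values, prefix):
    """LIKE 'prefix%': binary search finds the matching range directly."""
    lo = bisect_left(sorted_values, prefix)
    # Assumption: no value contains U+FFFF, so this bounds the prefix range.
    hi = bisect_left(sorted_values, prefix + "\uffff")
    return sorted_values[lo:hi]

def suffix_search(values, suffix):
    """LIKE '%suffix': sort order does not help, every value is tested."""
    return [v for v in values if v.endswith(suffix)]

names = sorted(["abcxyz", "xyz", "xyzabc", "xyzzy", "zebra"])
print(prefix_search(names, "xyz"))
print(suffix_search(names, "xyz"))
```

Values sharing a prefix sit next to each other in sorted order, so two binary searches locate them; values sharing a suffix are scattered, so the whole list has to be scanned.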
The LIKE search condition uses wildcards to search for patterns within a string. For example:
WHERE name LIKE 'Mickey%'
will locate all values that begin with 'Mickey', optionally followed by any number of characters. Under a case-insensitive, accent-insensitive collation the comparison ignores case and accents, and you can use multiple % wildcards, for example
WHERE name LIKE '%mouse%'
will return all values with 'mouse' (or 'Mouse' or 'mousé') in it.
The % also matches an empty string, meaning that
WHERE name LIKE '%A%'
will return all values that start with 'A', contain 'A' or end with 'A'.
You can use _ (underscore) to match any single character in one position:
WHERE name LIKE '_at%'
will give you all values with 'a' as the second letter and 't' as the third. The first letter can be anything. For example: 'Batman'
In T-SQL, if you use [] you can find values in a range.
WHERE name LIKE '[c-f]%'
This finds any value beginning with a letter between c and f, inclusive, meaning it returns any value that starts with c, d, e or f. The [] syntax is T-SQL only. Use [^ ] to find values not in a range.
Finding all values that contain a number:
WHERE name LIKE '%[0-9]%'
returns everything that has a number in it. Example: 'Godfather2'
If you are looking for all values whose 3rd character is a '-' (dash), use two underscores:
WHERE name LIKE '__-%'
It will return, for example: 'Lo-Res'
To find values that end in 'xyz', use:
WHERE name LIKE '%xyz'
returns anything that ends with 'xyz'
To find a % sign in a name, use brackets:
WHERE name LIKE '%[%]%'
will return for example: 'Top%Movies'
To search for [, put brackets around it:
WHERE name LIKE '%[[]%'
gives results such as: 'New York [NY]'
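When the search text comes from a user, the bracket escaping can be done in application code before the pattern is sent to the server. A minimal sketch for T-SQL style LIKE patterns; the function names escape_tsql_like and contains_pattern are made up for illustration:

```python
import re

def escape_tsql_like(text):
    """Wrap each T-SQL LIKE metacharacter (%, _ and [) in brackets
    so that it is matched literally."""
    return re.sub(r"[%_\[]", lambda m: "[" + m.group(0) + "]", text)

def contains_pattern(text):
    """Build a 'contains' pattern: %<escaped text>%."""
    return "%" + escape_tsql_like(text) + "%"

print(contains_pattern("%"))
print(contains_pattern("["))
print(contains_pattern("50%_off"))
```

The substitution runs in a single pass, so brackets added by the escaping are not escaped a second time.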
The database collation determines both case sensitivity and the sort order used for character ranges. You can optionally use COLLATE to specify the collation used by the LIKE operator.
Usually the main performance bottleneck is I/O. The efficiency of the LIKE operator matters only if your whole table fits in memory; otherwise I/O will take most of the time.
AFAIK Oracle can use indexes for prefix matching (LIKE 'abc%'), but these indexes cannot be used for more complex expressions.
Anyway, if you only have this kind of query, you should consider a simple index on the relevant column. (This is probably true for other RDBMSs as well.)
Otherwise the LIKE operator is generally slow, but most RDBMSs have some kind of full-text search solution. I think the main reason for the slowness is that LIKE is too general. Full-text indexes usually have many options that tell the database what you really want to search for, and with this additional information the DB can do its task more efficiently.
As a rule of thumb, if you want to search in a text field and you think performance may be an issue, consider your RDBMS's full-text search solution. If the real goal is not text searching but the search is a kind of "design side effect" (for example XML/JSON/statuses stored in a field as text), then you should probably consider a more efficient way of storing the data (if there is one).

MySQL REGEXP matching positive numbers

I have a VARCHAR(25) column with this data inside:
886,-886
-886
0,-1234
1234
(empty)
0
The numbers can be anywhere from 1 to n digits long,
and I need to be able to pull any row that has at least one positive number in it.
I was thinking of something like
REGEXP '[^-,][0-9]+'
but this pulls -886, because the digits after the minus sign match the regexp on their own.
You probably do not need a regex:
COL NOT LIKE '-%' AND COL NOT LIKE '%,-%'
However, this is a bad example of storing data in the wrong data type.
Split on the comma and store the values in multiple rows, and you will save time handling questions like this one.
Try using this:
"[^-\d]\d+\b"
which should work if I understood your question correctly.
A good regex reference table: http://www.regular-expressions.info/reference.html
I was able to figure out the best solution:
`COL` NOT LIKE '%-%'
I forgot to mention that the column might also contain words like:
all,-886
none,886
0,1,2,3,none
0
etc...
Try
REGEXP '[[:<:]][^-,][0-9]+[[:>:]]'
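The [[:<:]] and [[:>:]] markers indicate word boundaries. (They are not supported by the ICU regex engine that MySQL uses from 8.0.4 on, where \\b is used instead.)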
^\b\d+\b$
will give you positive integers.
^[^-]\d+((,([^-]\d+))?)+$
will give you only the lists where all the values are positive integers
For a list containing any positive integer (where all values are valid integers, negative or positive), I think this will check out:
^((-\d+,)?)+[^-]\d+((,([^-]\d+))?|(,-\d+)?)+$
Here's a great site for Regular Expressions:
http://gskinner.com/RegExr/
I use it all the time for testing live.
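The sample data from the question can be checked against a candidate pattern outside the database. A sketch with Python's re module, under the assumption that a value "counts" when a comma-separated token consists of digits only, so 0 is counted too (use [1-9][0-9]* instead of [0-9]+ to exclude it). The pattern uses only basic syntax, so it is likely portable to MySQL's REGEXP, but that is untested here:

```python
import re

# Assumed rule: a token matches if it is all digits with no leading minus.
HAS_UNSIGNED = re.compile(r"(^|,)[0-9]+(,|$)")

rows = [
    "886,-886", "-886", "0,-1234", "1234", "", "0",
    "all,-886", "none,886", "0,1,2,3,none",
]
matches = [r for r in rows if HAS_UNSIGNED.search(r)]
print(matches)

# The question's pattern matches inside '-886' because [^-,] is free
# to consume the first digit instead of the character before it.
leak = re.search(r"[^-,][0-9]+", "-886")
print(leak.group(0))
```

Tying each token to the start of the string or a preceding comma is what prevents the digits of a negative number from matching on their own.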