MySQL REGEXP matching NO exclamation point and then a word - mysql

I have a problem putting together the right REGEXP in MySQL. I have a database that could have something like this:
id | geo
---+--------
1 | NL
2 | US NL
3 | !US
4 | US
These are entries for geo-targeting or geo-blocking. Row 3 means "not US"; row 1 means NL only. To find everything that targets the US, I am using:
SELECT * FROM db WHERE geo REGEXP '[[:<:]]US[[:>:]]'
This would return 2, 3 and 4, but I don't want 3. I tried this:
SELECT * FROM db WHERE geo REGEXP '^![[:<:]]US[[:>:]]'
But that only matches values that start with an exclamation point. I'm looking for a REGEXP that matches the word 'US' when it is NOT preceded by an exclamation point. I just can't figure out how to express 'doesn't contain' instead of 'starts with', since both are written with '^'.

You can use this regex:
SELECT * FROM db WHERE geo REGEXP '(^|[^!])[[:<:]]US[[:>:]]';
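This matches US as a whole word only when it sits at the start of the string or is preceded by a character other than !. Note that [[:<:]] and [[:>:]] are the word-boundary markers of the older regex library; MySQL 8.0.4 and later use ICU, where the equivalent marker is \\b.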
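To check the logic of the pattern outside MySQL, here is a minimal Python sketch. It assumes Python's \b behaves like the MySQL word-boundary markers for this data; the names PATTERN, rows and ids_targeting_us are made up for the illustration.

```python
import re

# Python's \b stands in for MySQL's [[:<:]] and [[:>:]] word-boundary markers.
# (^|[^!]) accepts either the start of the string or any character except '!'.
PATTERN = re.compile(r"(^|[^!])\bUS\b")

# Sample rows copied from the question.
rows = {1: "NL", 2: "US NL", 3: "!US", 4: "US"}


def ids_targeting_us(rows):
    """Return the ids whose geo value contains US without a leading '!'."""
    return [row_id for row_id, geo in rows.items() if PATTERN.search(geo)]


print(ids_targeting_us(rows))
```

Row 3 is rejected because the only character in front of US is the exclamation point, which [^!] refuses to match.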


Regex for specific pattern - String followed by numbers

We're trying to use a regular expression inside MySQL.
Say we have a 2-column table with 5 rows as follows:
1 marketing
2 marketing1
3 marketing12
4 office5
5 marketing44Tomorrow
I'd like to have a SELECT statement that returns: marketing, marketing1, marketing12. Meaning a string (marketing) followed by nothing or by a number only.
This statement:
select * from ddd
where column_name2 REGEXP 'marketing[0-9]'
doesn't work as it does not return "marketing" alone and it will return "marketing44Tomorrow".
You can use: marketing([0-9]+)?[[:>:]]
`marketing` - the literal string **marketing**
`([0-9]+)?` - optionally followed by one or more digits
`[[:>:]]` - followed by an end-of-word boundary, so no letter or digit may come next
Result:
SELECT * FROM ddd WHERE column_name2 REGEXP 'marketing([0-9]+)?[[:>:]]'
Try this:
select * from ddd where column_name2 REGEXP '^marketing[0-9]*$'
As a conclusion, the perfect answer to my question in the MySQL context is:
SELECT * FROM ddd WHERE column_name2 REGEXP 'marketing([0-9]+)?[[:>:]]'
"MJN Belief" got it almost right up here.
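A quick way to check both patterns against the sample rows is a small Python sketch. It assumes Python's \b is an adequate stand-in for [[:>:]] here; the names UNANCHORED, ANCHORED and values are hypothetical.

```python
import re

# Accepted pattern, with \b standing in for MySQL's [[:>:]] end-of-word marker.
UNANCHORED = re.compile(r"marketing([0-9]+)?\b")
# Stricter variant: the whole value must be "marketing" plus optional digits.
ANCHORED = re.compile(r"^marketing[0-9]*$")

# Sample values copied from the question.
values = ["marketing", "marketing1", "marketing12", "office5", "marketing44Tomorrow"]

unanchored_hits = [v for v in values if UNANCHORED.search(v)]
anchored_hits = [v for v in values if ANCHORED.search(v)]

print(unanchored_hits)
print(anchored_hits)
```

Both patterns select the same three rows from this table. They differ on a value such as "telemarketing", which the unanchored pattern accepts because nothing ties it to the start of the string.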

MySQL strip ' on where queries

I have a large table in which some records contain an apostrophe (') and others don't (martin's, lay's, martins, lays, and so on).
At the moment the client has to type the exact text, for example martin's, to find all records that contain "martin's". That is awkward, so I need the client to be able to search by either "martins" or "martin's".
This is a simple example:
A mysql table like:
ID | Title
---------------
1 lays
2 lay's
3 some text
4 other text
5 martin's
I need a SQL query that lets me search by lays or lay's, where both searches return a result like:
ID | Title
---------------
1 lays
2 lay's
I've tried many solutions from other posts, but I can't get it to work :-(
I'd appreciate any help.
Just remove the single quote:
select t.*
from t
where replace(t.title, '''', '') = 'lays';
To search for titles that contain the word:
select t.*
from t
where replace(t.title, '''', '') LIKE '%lays%';
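Since the client may type either lays or lay's, the search term needs the same treatment as the column. Below is a minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; REPLACE behaves the same way for this case). The function name search_titles is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO t (id, title) VALUES (?, ?)",
    [(1, "lays"), (2, "lay's"), (3, "some text"), (4, "other text"), (5, "martin's")],
)


def search_titles(conn, term):
    """Match titles against the term, ignoring apostrophes on both sides."""
    needle = term.replace("'", "")
    cursor = conn.execute(
        "SELECT id, title FROM t WHERE REPLACE(title, '''', '') = ? ORDER BY id",
        (needle,),
    )
    return cursor.fetchall()


print(search_titles(conn, "lays"))
print(search_titles(conn, "lay's"))
```

Passing the term as a bound parameter also avoids having to escape the apostrophe by hand.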

Creating an encryption in MySQL

I am trying to create an encryption in MySQL.
Let's say there is a string of characters.
"I can run for as many".
I want to replace each letter with the letter four positions after it in the alphabet.
For example,
'a' replaced with 'e'
'b' replaced with 'f'
and so on.
The Final output of the above would look something like this.
"M ger vyr jsv ew qerc"
The best I could come up with is below.
Since it is a nested function, it is not giving me the right result.
The below function replaces 'a' with 'e' and then again replaces 'e' with 'i'
until it reaches the end.
select messagetext,
replace(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(replace(replace(replace
(replace(replace(replace(replace(messagetext,'a','e'),
'b','f'),'c','g'),'d','h'),'e','i'),'f','j'),'g','k'),
'h','l'),'i','m'),'j','n'),'k','o'),'l','p'),'m','q'),
'n','r'),'o','s'),'p','t'),'q','u'),'r','v'),'s','w'),
't','x'),'u','y'),'v','z'),'w','a'),'x','b'),'y','c'),
'z','d')
from chat;
Any help would be much appreciated.
Reached partial solution, now stuck.
Putting it here so that others can work on this.
Query:
SELECT UNHEX(HEX(val) + REPEAT('04', LENGTH(val))) AS rot4 FROM caeser;
+--------+
| rot4 |
+--------+
| efghip |
| aq?? | <-- Need to rotate / mod hex value for this. (Stuck here)
+--------+
Table create queries:
CREATE TABLE caeser (val VARCHAR(255));
INSERT INTO caeser VALUES ('abcdef');
INSERT INTO caeser VALUES ('uvwxyz');
PS: will convert to community wiki, if others also contribute.
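The nested REPLACE calls fail because each call rescans the output of the previous one, so a letter that has already been shifted gets shifted again. A single pass that maps every character exactly once avoids that, and wrapping around the alphabet handles the uvwxyz case. Here is a minimal sketch of that idea in Python rather than MySQL; the function name shift_letters is hypothetical, and it assumes only ASCII letters are shifted while everything else is left as it is.

```python
import string


def shift_letters(text, shift=4):
    """Shift each ASCII letter forward by `shift` positions, wrapping at z."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Build one translation table so every character is mapped exactly once.
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


print(shift_letters("I can run for as many"))
print(shift_letters("uvwxyz"))
```

Because the table is applied in one pass, an 'a' that becomes 'e' is never looked at again.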

MySQL select UTF-8 string with '=' but not with 'LIKE'

I have a table with some words that come from medieval books and have some accented letters that don't exist anymore in the modern latin1 alphabet. I can represent these letters easily with UTF-8 combining characters. For example, to create a "J" with a tilde, I use the UTF-8 sequence \u004A+\u0303 and the J becomes accented with a tilde.
The table uses utf8 encoding and the field collation is utf8_unicode_ci.
My problem is the following: If I try to select the entire string, I receive the correct answer. If I try to select using 'LIKE', I receive the wrong answer.
For example:
mysql> select word, hex(word) from oldword where word = 'hua';
+--------+--------------+
| word | hex(word) |
+--------+--------------+
| hũa | 6875CC8361 |
| huã | 6875C3A3 |
| hua | 687561 |
| hũã | 6875CC83C3A3 |
+--------+--------------+
4 rows in set (0,04 sec)
mysql> select word, hex(word) from oldword where word like 'hua';
+-------+------------+
| word | hex(word) |
+-------+------------+
| huã | 6875C3A3 |
| hua | 687561 |
+-------+------------+
2 rows in set (0,04 sec)
I don't want to search only the entire word. I want to search words that start with some substring. Eventually the searched word is the entire word.
How could I select the partial string using like and match all the strings?
I tried to create a custom collation using this information, but the server became unstable and only after a lot of trials and errors I was able to revert to the utf8_unicode_ci collation again and the server returned to normal condition.
EDIT: There's a problem with this site and some characters don't display correctly. Please see the results on these pastebins:
http://pastebin.com/mckJTLFX
http://pastebin.com/WP87QvgB
After seeing Marcus Adams' answer I realized that the REPLACE function could be the solution for this problem, although he didn't mention this function.
I have only two different combining characters (acute and tilde), combined with other ASCII characters: for example j with tilde, j with acute, m with tilde, s with tilde, and so on. So I just have to remove these two characters when using LIKE.
After searching the manual, I learned about the UNHEX function that helped me to properly represent the combining characters alone in the query to remove them.
The combining tilde is represented by CC83 in HEX code and the acute is represented by CC81 in HEX.
So, the query that solves my problem is this one.
SELECT word, REPLACE(REPLACE(word, UNHEX("CC83"), ""), UNHEX("CC81"), "")
FROM oldword WHERE REPLACE(REPLACE(word, UNHEX("CC83"), ""), UNHEX("CC81"), "")
LIKE 'hua%';
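To see why removing the two combining marks makes the prefix comparison work, here is a minimal Python sketch outside MySQL. The names strip_marks and prefix_matches are hypothetical, and the word list is rebuilt from the hex values in the question. Plain Python comparison, unlike utf8_unicode_ci, does not treat the precomposed ã as a, so this only illustrates the combining-mark part.

```python
# Combining marks named in the answer: CC83 and CC81 in UTF-8.
COMBINING_TILDE = "\u0303"
COMBINING_ACUTE = "\u0301"

# The four words from the question, rebuilt from their hex dumps.
words = ["hu\u0303a", "hu\u00e3", "hua", "hu\u0303\u00e3"]


def strip_marks(word):
    """Remove the combining tilde and acute, like the nested REPLACE calls."""
    return word.replace(COMBINING_TILDE, "").replace(COMBINING_ACUTE, "")


def prefix_matches(words, prefix):
    """Return the words whose stripped form starts with the prefix."""
    return [w for w in words if strip_marks(w).startswith(prefix)]


# The combining tilde is a character of its own, so the first word has 4.
print(len(words[0]))
print(words[0].encode("utf-8").hex().upper())
print(prefix_matches(words, "hua") == [words[0], words[2]])
```

The length of 4 for the first word is what makes a character-by-character comparison against the three-character 'hua' go wrong.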
The problem is that LIKE performs the comparison character-by-character, and a letter with the "combining tilde" is literally two characters, though it displays as one (assuming your client supports displaying it as such).
There will never be a case where comparing e.g. hu~a to hua character-by-character will match because it's comparing ~ with a for the third character.
Collations (and coercions) work in your favor and handle such things when comparing the string as a whole, but not when comparing character-by-character.
Even if you considered using SUBSTRING() as a hack instead of using LIKE with a wildcard % to perform a prefix search, consider the following:
SELECT SUBSTRING('hũa', 1, 3) = 'hua'
-> 0
SELECT SUBSTRING('hũa', 1, 4) = 'hua'
-> 1
You kind of have to know the length you're going for or brute force it like this:
SELECT * FROM oldword
WHERE SUBSTRING(word, 1, 3) = 'hua'
OR SUBSTRING(word, 1, 4) = 'hua'
OR SUBSTRING(word, 1, 5) = 'hua'
OR SUBSTRING(word, 1, 6) = 'hua'
According to this:
ũ collates equal to plain U in all utf8 collations on 5.6.
j́ collates equal to plain J in most collations; exceptions:
utf8_general*ci because it is actually j plus an accent. And the "general" collations only look at one character (as distinguished from byte) at a time. Most collations take into consideration multiple characters, such as ch or ll in Spanish or ss in German.
utf8_roman_ci, which is quite an oddball. j́=i=j
(LIKE does not exactly follow the regular rules of collation. I am not versed on the details, but I think that J is represented as 2 characters causes it to work differently in LIKE than in WHERE or ORDER BY. Furthermore, I don't know whether REPLACE() collates like LIKE or the other places.)
You can use the % symbol like a wildcard character. For example this:
SELECT word
FROM myTable
WHERE word LIKE 'hua%';
This will pull all records that start with hua and have 0+ characters following it. Here is an SQL Fiddle example.

Find occurrence of a substring in string in MySQL?

I'm using this query to get results from my database:
MATCH(`Text2`) AGAINST ('$s')
I want to get results only when there is a full match of the string, like on Google when you put the search between quotes "".
How can I do this with Match/MySQL?
EG: Query is "ab cd"
ID | Text
1 ab cd
2 aab cda
3 aab a cd
Row 1 and 2 should be returned
SELECT * FROM your_table WHERE Text2 LIKE '%yourstring%';
Try this if you need the search to ignore case:
SELECT * FROM mytable WHERE LOWER(Text2) LIKE LOWER('%yourstring%');
You can use REGEXP for this purpose too.
SELECT *
FROM `tableName`
WHERE `columnName` REGEXP 'ab cd';
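One thing to keep in mind with REGEXP is that the search string is treated as a pattern, so characters such as . or + in user input would need escaping. The sketch below shows the substring check in Python on the rows from the question; the name ids_containing is hypothetical, and re.escape is used here on the assumption that the phrase should be matched literally.

```python
import re

# Sample rows copied from the question.
rows = {1: "ab cd", 2: "aab cda", 3: "aab a cd"}


def ids_containing(rows, phrase):
    """Return the ids whose text contains the phrase as a literal substring."""
    # re.escape keeps characters such as '.' or '+' in the phrase literal.
    pattern = re.compile(re.escape(phrase))
    return [row_id for row_id, text in rows.items() if pattern.search(text)]


print(ids_containing(rows, "ab cd"))
```

Row 3 is left out because the words appear in it but not next to each other.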