What is an efficient way to do pattern match in MySQL?

We have a large pattern match in a query (about 12 strings to check).
I now have to run this query in MySQL, but the PostgreSQL syntax does not carry over.
From PostgreSQL:
SELECT value
FROM ...
WHERE ...
AND `value` !~* '.*(text|text2|text3|text4|text5).*';
How do I do this efficiently in MySQL? I know the pattern above is probably not efficient at all. What is the most efficient way to do this?
This does the trick, but is probably the worst possible query for the job:
SELECT `value`
FROM ...
WHERE ...
AND NOT (
`value` LIKE '%text%'
OR `value` LIKE '%text2%'
OR `value` LIKE '%text3%'
OR `value` LIKE '%text4%'
OR `value` LIKE '%text5%');
Is REGEXP the way to go here? Then I'll have to figure out the corresponding expression.

Yes, REGEXP or its alternative spelling RLIKE:
WHERE value NOT RLIKE 'text|text2|text3|text4|text5'
(A regexp match is not anchored to the ends of the string unless you explicitly use ^ and $, so you don't need the .*(...).*.)
You should test whether it is actually faster than the version with LIKE. On my test database with four words to find, LIKE was still considerably faster. LIKE also has the advantage of being ANSI standard SQL, so you can use it on any database.
If this is a common query, you should also consider fulltext indexing.
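As a rough sketch of the fulltext suggestion, assuming a hypothetical table named items with a text column named value (neither name comes from the question), the index and a negated match could look like this. Full-text search matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for LIKE '%text%', and it is worth checking with EXPLAIN whether the negated form actually benefits from the index.

```sql
-- Hypothetical table and index names; adjust to your schema.
-- FULLTEXT indexes work on MyISAM, and on InnoDB from MySQL 5.6.
ALTER TABLE items ADD FULLTEXT INDEX ft_items_value (value);

-- Rows whose value contains none of the listed words.
-- In boolean mode, words without an operator are optional, so the
-- MATCH score is non-zero as soon as any one of them is present.
SELECT value
FROM items
WHERE NOT MATCH(value) AGAINST('text text2 text3 text4 text5' IN BOOLEAN MODE);
```

Words shorter than the server's minimum token length (ft_min_word_len for MyISAM, innodb_ft_min_token_size for InnoDB) and stopwords are not indexed, so very short search terms may need the LIKE or REGEXP form instead.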

Related

Unwanted results with REGEXP, MySQL

I am not good with regular expressions. Can someone help me optimize my MySQL query, please?
SELECT activity
FROM activity
WHERE (LOWER( activity_name ) REGEXP '>mit|mit,|edited mit')
ORDER BY created_date DESC
When I replace '>mit|mit,|edited mit' with 'mit|mit,|edited mit', it runs very fast but returns additional records that are not needed. I even tried '/[>]/mit|mit,|edited mit', but unfortunately got the wrong result.
Thank you

Possibly the reason for the burst of speed in the second regexp was that things had been cached by the first test. Did you try both regexps twice?
This should be a little better:
WHERE activity_name LIKE '%mit%'
AND LOWER( activity_name ) REGEXP '>mit|mit,|edited mit'
LIKE is faster than REGEXP, but not nearly as powerful. So, the LIKE will filter out most rows, then the REGEXP will finish the filtering.
Another slight speedup: If activity_name has a ..._ci collation, you don't need LOWER().
Even faster would be to have a FULLTEXT index and do
WHERE MATCH(activity_name) AGAINST('+mit' IN BOOLEAN MODE)
AND activity_name REGEXP '>mit|mit,|edited mit'

Query several 'starts with' values in mySQL

I was trying to query values in a table where they started with 'a', 'b' or 'c'. I know in MS SQL you can make a [charlist] to do this:
( LIKE '[abc]%' )
I was wondering what the correct syntax was in other databases such as Oracle or mySQL.
Thanks

In MySQL you can use regular expressions. Anchor the pattern with ^ so it only matches at the start of the value:
where some_column REGEXP '^[abc]'

MySQL's LIKE does not support the [charlist] syntax. Use % to match any sequence of characters and _ to match a single character:
SELECT * FROM `table` WHERE `column` LIKE 'a%' OR `column` LIKE 'b%'...
Alternatively, you can use REGEXP in MySQL, but it will hurt performance because REGEXP cannot use an index to look up rows.
A query with LIKE 'a%' can use an index on the column to look up all items starting with 'a' and return them efficiently. REGEXP has to evaluate the expression against every row, so performance suffers badly on large tables.
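The difference can be checked directly with EXPLAIN. The following is a minimal sketch, assuming a hypothetical table people with a column name; both names and the index name are made up for illustration.

```sql
-- Hypothetical table, column and index names.
CREATE INDEX idx_people_name ON people (name);

-- Each LIKE has a fixed prefix, so the optimizer can consider
-- a range scan on idx_people_name.
SELECT * FROM people
WHERE name LIKE 'a%' OR name LIKE 'b%' OR name LIKE 'c%';

-- Same intent with REGEXP; ^ anchors the match to the start of the value.
-- The expression is evaluated row by row rather than via the index.
SELECT * FROM people
WHERE name REGEXP '^[abc]';

-- Compare the access method the optimizer actually picks for each form.
EXPLAIN SELECT * FROM people
WHERE name LIKE 'a%' OR name LIKE 'b%' OR name LIKE 'c%';

EXPLAIN SELECT * FROM people
WHERE name REGEXP '^[abc]';
```

Whether the optimizer chooses the index for the LIKE form still depends on how selective the prefixes are, so the EXPLAIN output on real data is the deciding factor.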

MySQL LIke statement - multiple words

What would be the right SQL statement so that when I search for two words, for example 'text field' in a text box, it returns all results that contain both 'text' and 'field', using the LIKE statement? I can't find the right terms to search for. If possible, I want to make it dynamic: if a user searches for 5 words, all 5 words would be in the LIKE statement. I am trying to achieve something like this:
SELECT *
FROM TABLE
WHERE SEARCH (LIKE %searchterm1%)
OR (LIKE %searchterm2%)
OR (LIKE %searchterm3%) ....

Try This. http://dev.mysql.com/doc/refman/5.1/en/regexp.html#operator_regexp
SELECT *
FROM TABLE
WHERE SEARCH
REGEXP 'searchterm1|searchterm2|searchterm3'

Here's an example of a SQL SELECT statement that uses the LIKE comparison operator:
SELECT t.*
FROM mytable t
WHERE t.col LIKE CONCAT('%','cdef','%')
AND t.col LIKE CONCAT('%','hijk','%')
AND t.col LIKE CONCAT('%','mnop','%')
Only rows that have a value in the col column that contains all of the strings 'cdef', 'hijk', and 'mnop' will be returned.
You specifically asked about the LIKE comparison operator. There's also a REGEXP operator that matches regular expressions. And the Full-Text search feature may be a good fit for your use case.
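For the full-text route, a minimal sketch might look like the following. It reuses the mytable and col names from the example above; the index name is hypothetical. In boolean mode a leading + marks a word as required, which gives the "all words must be present" behaviour.

```sql
-- Hypothetical index name; FULLTEXT requires MyISAM, or InnoDB on MySQL 5.6+.
ALTER TABLE mytable ADD FULLTEXT INDEX ft_mytable_col (col);

-- Rows that contain both words; each + makes the word mandatory.
SELECT t.*
FROM mytable t
WHERE MATCH(t.col) AGAINST('+text +field' IN BOOLEAN MODE);
```

To make this dynamic, the application would prefix each user-supplied word with + and pass the resulting string as a bound parameter. Unlike LIKE '%text%', this matches whole words only; a trailing * (for example '+text*') matches words that start with the given prefix.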

Big MySQL table, REPLACE -> very slow query

I have a table with 17.6 million rows in a MyISAM database.
I want to search for an article number in it, but the result must not depend on special characters such as dots, commas and others.
I'm using a query like this:
SELECT * FROM `table`
WHERE
replace(replace(replace( replace( `haystack` , ' ', '' ),
'/', '' ), '-', '' ), '.', '' )
LIKE 'needle'
This method is very slow. The table has an index on haystack, but EXPLAIN shows the query cannot use it. That means the query must scan all 17.6 million rows, which takes 3.8 sec.
The query runs multiple times per page (10-15x), so the page loads extremely slowly.
What should I do? Is it a bad idea to use REPLACE inside the query?

As you do the replace on the actual data in the table, MySQL can't use the index, as it doesn't have any indexed data of the result of the replace which it needs to compare to the needle.
That said, if your replace settings are static, it might be a good idea to denormalize the data and to add a new column like haystack_search which contains the data with all the replaces applied. This column could be filled during an INSERT or UPDATE. An index on this column can then effectively be used.
Note that you probably want to use % in your LIKE query, as otherwise it is effectively the same as a normal equality comparison. However, if you use a search term like %needle% (that is, with a variable start), MySQL again can't use the index and falls back to a table scan, because it can only use the index when the search term has a fixed start, i.e. something like needle%.
So in the end, you might have to tune your database engine so that it can hold the table in memory. Another alternative with MyISAM tables (or, with MySQL 5.6 and up, also with InnoDB tables) is to use a fulltext index on your data, which again allows rather efficient searching.
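The denormalized column described above could be set up roughly as follows. This is a sketch under assumptions: the column name haystack_search is the one suggested in this answer, while the VARCHAR(255) type and the index name are guesses that need to be adapted to the real schema.

```sql
-- Add the search column and an index on it (type and index name are assumptions).
ALTER TABLE `table`
  ADD COLUMN haystack_search VARCHAR(255) NOT NULL DEFAULT '',
  ADD INDEX idx_haystack_search (haystack_search);

-- One-time backfill: apply the same replacements the slow query used.
UPDATE `table`
SET haystack_search =
  REPLACE(REPLACE(REPLACE(REPLACE(haystack, ' ', ''), '/', ''), '-', ''), '.', '');

-- The lookup compares against the indexed column directly,
-- so no function is applied to the column at query time.
SELECT * FROM `table`
WHERE haystack_search = 'needle';
```

The application (or a pair of INSERT and UPDATE triggers) has to keep haystack_search in sync with haystack. On MySQL 5.7 and later, a stored generated column is another way to have the server maintain the value.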

It's "bad" to apply functions to the column as it will force a scan of the column.
Perhaps this is a better method:
SELECT list
, of
, relevant
, columns
, only
FROM your_table
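WHERE haystack REGEXP '^two[ /.-]needles$'
In this scenario we are searching for "two needles", where the separator between the words can be any of the characters within the square brackets, i.e. "two needles", "two/needles", "two-needles" or "two.needles". MySQL's LIKE does not support bracketed character lists, so this uses REGEXP (with the hyphen placed last in the brackets so it is taken literally). Like the REPLACE version, it cannot use the index on haystack.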

You could try using LENGTH on the column; I'm not sure whether it has a better effect. Also, when using LIKE you should use %:
SELECT * FROM `table`
WHERE
haystack LIKE 'needle%' AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'/','')) = 0 AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'-','')) = 0 AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'.','')) = 0;
If the haystack is exactly needle, then do this:
SELECT * FROM `table`
WHERE
haystack='needle';

SQL SELECT LIKE (Insensitive casing)

I am trying to execute the sql query:
select * from table where column like '%value%';
But the data is saved as 'Value' ( V is capital ).
When I execute this query I don't get any rows.
How do I write the query so that it looks for 'value' irrespective of the casing of the characters?

Use the LOWER function on both the column and the search word(s). That way, even if the query contains something like %VaLuE%, the casing won't matter:
select qt.*
from query_table qt
where LOWER(column_name) LIKE LOWER('%vAlUe%');

If you want this column to be case-insensitive:
ALTER TABLE `schema`.`table`
CHANGE COLUMN `column` `column` TEXT CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
This way, you don't have to change your query.
The MySQL engine will also process your query more quickly than with the lower() function or any other tricks.
I also doubt that lower() is a good choice for index performance, since wrapping a column in a function generally prevents MySQL from using an index on it.

Either use a case-insensitive collation on your table, or force the values to be lower case, e.g.
WHERE lower(column) LIKE lower('%value%');

Try using a case insensitive collation
select * from table
where column like '%value%' collate utf8_general_ci

Use the lower() function:
select t.*
from table t
where lower(column) like '%value%';

You should use either the lower or the upper function to ignore case when searching a field with LIKE. The pattern must use the same case as the function:
select * from student where upper(sname) like 'S%';
OR
select * from student where lower(sname) like 's%';

If you are using PostgreSQL, a simpler solution is to use the case-insensitive LIKE (ILIKE):
SELECT * FROM table WHERE column ILIKE '%value%'

I know this is a very old question, but I'm posting this for posterity:
Non-binary string comparisons (including LIKE) are case-insensitive by default in MySQL:
https://dev.mysql.com/doc/refman/en/case-sensitivity.html
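A quick way to see this on a given server is to compare string literals directly and then inspect the column's collation. This is a small sketch; the table name is the placeholder used in the question, and the first result depends on the connection collation.

```sql
-- Returns 1 when the connection collation is case-insensitive
-- (a name ending in _ci, which is the usual default).
SELECT 'Value' LIKE '%value%';

-- Forcing a binary comparison makes the match case-sensitive.
SELECT 'Value' LIKE BINARY '%value%';

-- The Collation column shows how each column of the table compares strings;
-- a _bin or _cs collation there would explain a case-sensitive LIKE.
SHOW FULL COLUMNS FROM `table`;
```

If the column turns out to use a binary or case-sensitive collation, that would explain why the original query returned no rows, and changing the collation or adding a COLLATE clause to the query are the options discussed in the other answers.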

This does the same thing in PostgreSQL. ILIKE matches irrespective of casing (note that string literals take single quotes):
SELECT *
FROM table
WHERE column_name ILIKE '%value%'
