Unwanted results with REGEXP, MySQL

I am not good with regex. Can someone help me optimize my MySQL query, please?
SELECT activity
FROM activity
WHERE (LOWER( activity_name ) REGEXP '>mit|mit,|edited mit')
ORDER BY created_date DESC
When I replace '>mit|mit,|edited mit' with 'mit|mit,|edited mit', it runs much faster, but it returns additional records that are not needed. I even tried '/[>]/mit|mit,|edited mit'; unfortunately, I got the wrong result.
Thank you

Possibly the reason for the burst of speed in the second regexp was that things had been cached by the first test. Did you try both regexps twice?
This should be a little better:
WHERE activity_name LIKE '%mit%'
AND LOWER( activity_name ) REGEXP '>mit|mit,|edited mit'
LIKE is faster than REGEXP, but not nearly as powerful. So, the LIKE will filter out most rows, then the REGEXP will finish the filtering.
Another slight speedup: If activity_name has a ..._ci collation, you don't need LOWER().
Even faster would be to have a FULLTEXT index and do
WHERE MATCH(activity_name) AGAINST('+mit' IN BOOLEAN MODE)
AND LOWER( activity_name ) REGEXP '>mit|mit,|edited mit'
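As for why dropping the '>' returns extra rows: the bare alternative mit already matches every string that contains "mit", so the other two alternatives add nothing. A minimal sketch in Python, assuming Python's re module behaves like MySQL's REGEXP for patterns this simple; the sample activity names are invented for illustration:

```python
import re

# Hypothetical activity names, invented for illustration only.
names = ["<b>mit</b>", "submit form", "mit, harvard", "edited mit", "summit"]

narrow = re.compile(r">mit|mit,|edited mit")
broad = re.compile(r"mit|mit,|edited mit")

# search() is unanchored, like REGEXP: the pattern may match anywhere.
narrow_hits = [n for n in names if narrow.search(n.lower())]
broad_hits = [n for n in names if broad.search(n.lower())]

print(narrow_hits)  # only names with ">mit", "mit," or "edited mit"
print(broad_hits)   # every name that contains "mit" anywhere
```

The broad pattern also catches "submit form" and "summit", which is the kind of unwanted record described in the question.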

Related

How to convert WHERE IN clause with multiple values to RegExp or Like

I want to convert this:
SELECT id,songTitle,artistName, trackId
FROM songs
WHERE (songTitle, artistName) IN (('come together', 'the beatles'),('all the small things', 'blink-182'))
To something like this but I don't know the right syntax:
SELECT id,songTitle,artistName, trackId
FROM songs
WHERE (songTitle, artistName) IN LIKE (('%come together%', '%the beatles%'),('%all the small things%', '%blink-182%'))
Except that I'm searching for hundreds more songs at once. We could use REGEXP too; I just don't know the right syntax for either of those.
WHERE (a,b) IN ((1,2), ...) is very poorly optimized.
Leading wildcards in LIKE prevent use of an index.
You can't do the construct you attempted.
So, performance aside, let's look at how to perform the task:
WHERE ( songTitle LIKE '%come together%' AND artistName LIKE '%the beatles%')
OR ( .... )
OR ...
Sorry, there is no shortcut.
REGEXP can't help in this case.
FULLTEXT indexing is something to consider, but I don't see that it would help in this example.
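With hundreds of songs, the OR chain is easier to generate than to type. A minimal sketch in Python, using an in-memory SQLite database purely as a stand-in for MySQL; the helper name build_pair_filter and the sample rows are hypothetical, and a MySQL driver would typically use %s placeholders rather than ?:

```python
import sqlite3

def build_pair_filter(pairs):
    """Return (where_clause, params) for a list of (title, artist) pairs."""
    if not pairs:
        raise ValueError("need at least one (title, artist) pair")
    # One parenthesised AND group per pair, joined with OR.
    clause = " OR ".join(
        "(songTitle LIKE ? AND artistName LIKE ?)" for _ in pairs
    )
    # Wrap each value in wildcards; values are bound, never interpolated.
    params = [f"%{value}%" for pair in pairs for value in pair]
    return clause, params

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE songs (id INTEGER PRIMARY KEY, songTitle TEXT, "
    "artistName TEXT, trackId TEXT)"
)
conn.executemany(
    "INSERT INTO songs (songTitle, artistName, trackId) VALUES (?, ?, ?)",
    [
        ("Come Together (Remastered)", "The Beatles", "t1"),
        ("All the Small Things", "blink-182", "t2"),
        ("Come Together", "Aerosmith", "t3"),
    ],
)

pairs = [("come together", "the beatles"), ("all the small things", "blink-182")]
clause, params = build_pair_filter(pairs)
rows = conn.execute(
    f"SELECT id, songTitle, artistName, trackId FROM songs "
    f"WHERE {clause} ORDER BY id",
    params,
).fetchall()
print(rows)
```

Only the fixed clause text is formatted into the SQL string; the search values travel as bound parameters. The sketch does not escape % or _ inside the values, so a title containing those characters would act as a wildcard.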

ORDER BY in MySql based on LIKE condition

I am having difficulty sorting results based on a field in MySQL. For example, if I search for the word "Terms", I should get the results that start with 'Terms' first, then 'Terms and', then 'Terms and conditions', and so on.
Can anyone please help me fetch the search results this way, efficiently, with a MySQL query?
SELECT * FROM your_table WHERE your_column LIKE "Terms%" ORDER BY your_column;
Based on the storage engine and mysql version you probably can use the full text search capabilities of MySQL. For example:
SELECT *, MATCH (your_column) AGAINST ('Terms' IN BOOLEAN MODE) AS relevance
FROM your_table
WHERE MATCH (your_column) AGAINST ('Terms' IN BOOLEAN MODE)
ORDER BY relevance
You can find more info here: http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Or, if you don't want FTS, another possible solution is to order strictly by the difference in string length:
SELECT * FROM your_table WHERE your_column LIKE "Terms%" ORDER BY ABS(LENGTH(your_column) - LENGTH('Terms'));
You are looking for fulltext search. Below a very simple example
SELECT id, name, MATCH (name) AGAINST ('string' > 'string*' IN BOOLEAN MODE) AS score
FROM tablename WHERE MATCH (name) AGAINST ('string' > 'string*' IN BOOLEAN MODE)
ORDER BY score DESC
The advantage of this is that you can control the value of words. This is very basic, you can 'up' some matches or words (or 'down' them)
In my example an exact match ('string') would get a higher score than the string with something attached ('string*'). The following line is even one step broader:
'string' > 'string*' > '*string*'
This documentation about full-text search explains a lot. It's a long read, but worth it and complete.
Don't use fulltext index if you search for prefix string!
With LIKE "Terms%", the optimizer can use an index on your_column, if one exists:
SELECT * FROM your_table
WHERE your_column LIKE "Terms%"
ORDER BY CHAR_LENGTH(your_column),your_column
Note the ORDER BY clause: it sorts by string length first, and uses alphabetical order only to sort strings of equal length.
And please use CHAR_LENGTH and not LENGTH: the former counts the number of characters, whereas the latter counts the number of bytes. With a variable-length encoding such as utf8, this makes a difference.
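The length-then-alphabetical ordering and the character/byte distinction can be sketched in Python. This only mimics the SQL above: len() on a str plays the role of CHAR_LENGTH, len() on the UTF-8 bytes plays the role of LENGTH, and the sample values are invented. Note that startswith is case-sensitive, whereas LIKE under a _ci collation is not.

```python
# Hypothetical column values, invented for illustration.
values = ["Terms and conditions", "Terms", "Terms and", "Termes", "T\u00e9rminos"]

# Equivalent of: WHERE your_column LIKE "Terms%"
matches = [v for v in values if v.startswith("Terms")]

# Equivalent of: ORDER BY CHAR_LENGTH(your_column), your_column
ordered = sorted(matches, key=lambda v: (len(v), v))
print(ordered)

# Characters versus bytes for a string with one accented letter.
word = "T\u00e9rminos"
char_count = len(word)                  # what CHAR_LENGTH would count
byte_count = len(word.encode("utf-8"))  # what LENGTH would count in utf8
print(char_count, byte_count)
```

The accented letter takes two bytes in UTF-8, so the byte count is one higher than the character count, which is enough to change the sort order between strings of otherwise equal length.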

SQL search result ranking

I have a table called objects with the following columns:
object_id,
name_english(vchar),
name_japanese(vchar),
name_french(vchar),
object_description
for each object.
When a user performs a search, they may enter English, Japanese, or French, and my SQL statement is:
SELECT
o.object_id,
o.name_english,
o.name_japanese,
o.name_french,
o.object_description
FROM
objects AS o
WHERE
o.name_english LIKE CONCAT('%',:search,'%') OR
o.name_japanese LIKE CONCAT('%',:search,'%') OR
o.name_french LIKE CONCAT('%',:search,'%')
ORDER BY
o.name_english, o.name_japanese, o.name_french ASC
And some of the entries are like:
Tin spoon,
Tin Foil,
Doctor Martin Shoes,
Martini glass,
Cutting board,
Ting Soda.
So, when the user searches for the word "Tin", it returns all of these results. Instead, I want to return only the results that specifically include the term "Tin", or to display the results ranked by relevance. How can I achieve that?
Thanks.
You can use MySQL FULLTEXT indexes to do that. This requires the MyISAM table type (InnoDB also supports FULLTEXT indexes from MySQL 5.6 onward), an index on (name_english, name_japanese, name_french, object_description) or whatever fields you want to search on, and the appropriate use of the MATCH ... AGAINST operator on exactly that set of columns.
See the manual at http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html, and the examples on the following page http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html
After running the query above, you will get all sorts of results, including ones that you are not interested in. You can then apply regular expressions to the result set returned by the MySQL server to filter out what you need.
This should do the trick - you may have to filter out duplicates, but the basic idea is obvious.
SELECT
`object`.`object_id`,
`object`.`name_english`,
`object`.`name_japanese`,
`object`.`name_french`,
`object`.`object_description`, 1 AS ranking
FROM `objects` AS `object`
WHERE `object`.`name_english` LIKE CONCAT(:search,'%') OR `object`.`name_japanese` LIKE CONCAT(:search,'%') OR `object`.`name_french` LIKE CONCAT(:search,'%')
UNION
SELECT
`object`.`object_id`,
`object`.`name_english`,
`object`.`name_japanese`,
`object`.`name_french`,
`object`.`object_description`, 10 AS ranking
FROM `objects` AS `object`
WHERE `object`.`name_english` LIKE CONCAT('%',:search,'%') OR `object`.`name_japanese` LIKE CONCAT('%',:search,'%') OR `object`.`name_french` LIKE CONCAT('%',:search,'%')
ORDER BY ranking, `name_english`, `name_japanese`, `name_french` ASC
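One way to avoid the duplicates is to compute the rank in a single pass with a CASE expression instead of a UNION. This is a different technique from the query above, sketched here in Python against an in-memory SQLite database as a stand-in for MySQL, and reduced to the name_english column for brevity; the rank values 1 and 10 are carried over from the UNION version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (object_id INTEGER PRIMARY KEY, name_english TEXT)"
)
conn.executemany(
    "INSERT INTO objects (name_english) VALUES (?)",
    [("Tin spoon",), ("Tin Foil",), ("Doctor Martin Shoes",),
     ("Martini glass",), ("Cutting board",), ("Ting Soda",)],
)

search = "Tin"
rows = conn.execute(
    """
    SELECT name_english,
           -- prefix matches rank ahead of matches elsewhere in the name
           CASE WHEN name_english LIKE ? THEN 1 ELSE 10 END AS ranking
    FROM objects
    WHERE name_english LIKE ?
    ORDER BY ranking, name_english
    """,
    (search + "%", "%" + search + "%"),
).fetchall()

names = [name for name, _ in rows]
print(names)
```

Each row is read once and gets exactly one rank, so nothing has to be de-duplicated afterwards. "Ting Soda" still ranks as a prefix match here; separating whole-word matches from prefixes would need an extra condition or full-text search.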

Mysql multiple row index

I have the following query in MySQL.
SELECT
*
FROM
Accounts AS a
WHERE
('s' IS NULL OR (a.FirstName LIKE CONCAT('s','%') OR
a.LastName LIKE CONCAT('s','%') OR
a.FullName LIKE CONCAT('s','%')
)
)
How should I index the table?
p.s.
's' is actually a variable in stored proc, so 's' IS NULL and concat are necessary.
First of all, a quick suggestion: do not use CONCAT if you don't have to. With a literal, ('s' IS NULL) is always FALSE, so the result depends only on the second condition, and the query can be rewritten as:
SELECT
*
FROM
Accounts AS a
WHERE
a.FirstName LIKE 's%' OR
a.LastName LIKE 's%' OR
a.FullName LIKE 's%'
Indexes that might help, but will not necessarily, are:
create index idx_01 on accounts(FirstName);
create index idx_02 on accounts(LastName);
create index idx_03 on accounts(FullName);
You can also consider a FULL TEXT SEARCH index for your table.
1. 's' IS NULL is always false.
2. Is there any reason you're using CONCAT('s','%') instead of 's%'?
3. Try a composite index on (FirstName, LastName, FullName), although it might not work really well for (VAR)CHARs (or even at all, it seems).
4. Since #3 didn't work, I can only refer you to the MySQL manual now. There's a bit about how MySQL uses indexes with LIKE here.
For you, full-text indexing is also an option: add a FULLTEXT index on the three fields, then use the MATCH() ... AGAINST() syntax. For example:
SELECT * FROM articles WHERE MATCH (title,body)
AGAINST ('superb catch' IN BOOLEAN MODE);

What is an efficient way to do pattern match in MySQL?

We have a large pattern match in a query (about 12 strings to check).
Now I have to run this query in MySQL, but the PostgreSQL syntax does not seem to carry over.
From PostgreSQL:
SELECT value
FROM ...
WHERE ...
AND `value` !~* '.*(text|text2|text3|text4|text5).*';
How do I do this in MySQL in an efficient way? I know that this is probably not efficient at all. What is the most efficient way to do this?
This does the trick, but is probably the worst query possible to do this:
SELECT `value`
FROM ...
WHERE ...
AND NOT (
`value` LIKE '%text%'
OR `value` LIKE '%text2%'
OR `value` LIKE '%text3%'
OR `value` LIKE '%text4%'
OR `value` LIKE '%text5%');
Is REGEXP the way to go here? Then I'll have to figure out the corresponding expression.
Yes, REGEXP or its alternative spelling RLIKE:
WHERE value NOT RLIKE 'text|text2|text3|text4|text5'
(a regexp match is not anchored to the ends of the string unless you explicitly use ^ and $, so you don't need the .*(...).*.)
You should test it to see whether it is actually faster than the version with LIKE. On my test database with four words to find, LIKE was still considerably faster. It also has the advantage of being ANSI standard SQL so you can use it on any database.
If this is a common query, you should also consider fulltext indexing.
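The point about anchoring can be checked outside the database. A minimal sketch in Python, assuming re.search behaves like RLIKE for this simple alternation; the sample values are invented, and IGNORECASE is used because the PostgreSQL !~* operator is case-insensitive:

```python
import re

# Hypothetical column values, invented for illustration.
values = ["some text here", "nothing relevant", "text5", "TEXT3 upper"]

short = re.compile(r"text|text2|text3|text4|text5", re.IGNORECASE)
padded = re.compile(r".*(text|text2|text3|text4|text5).*", re.IGNORECASE)

# Equivalent of: WHERE value NOT RLIKE '...'
kept_short = [v for v in values if not short.search(v)]
kept_padded = [v for v in values if not padded.search(v)]

print(kept_short)
print(kept_padded)
```

Both patterns keep the same rows, so the leading and trailing .* only add work for the regex engine. With these placeholder strings the first alternative already covers the rest, since every other alternative starts with "text"; real patterns would normally differ.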