Boost MySQL query with a lot of LIKE conditions - mysql

I have a table with ~25K rows. The user can provide keywords and negative keywords to filter rows.
The query slows down when a user adds a lot of keywords and/or negative keywords.
The query looks like this:
SELECT id, title, description
FROM entities
WHERE
(
title LIKE '%keyword_1%' OR description LIKE '%keyword_1%'
OR title LIKE '%keyword_2%' OR description LIKE '%keyword_2%'
OR title LIKE '%keyword_3%' OR description LIKE '%keyword_3%'
)
AND
(
title NOT LIKE '%negative_keyword_1%' OR description NOT LIKE '%negative_keyword_1%'
OR title NOT LIKE '%negative_keyword_2%' OR description NOT LIKE '%negative_keyword_2%'
OR title NOT LIKE '%negative_keyword_3%' OR description NOT LIKE '%negative_keyword_3%'
)
For example, a query with 9 keywords and 130 negative keywords takes ~7 seconds.
Is there a better way to filter those rows without LIKE? Maybe the whole logic is wrong.
I tried MATCH() AGAINST(), but it is slower than LIKE for some reason.

In a case like yours it might help to filter the positive matches in the database and then test them for the presence of a negative keyword in memory. So try something like:
SELECT id, title, description
FROM entities
WHERE
(
title LIKE '%keyword_1%' OR description LIKE '%keyword_1%'
OR title LIKE '%keyword_2%' OR description LIKE '%keyword_2%'
OR title LIKE '%keyword_3%' OR description LIKE '%keyword_3%'
)
This gives you results which match positive keywords. You can then filter out the matches in memory which contain negative keywords.
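The in-memory step could look roughly like the sketch below. It assumes the rows come back from the driver as dictionaries with `title` and `description` keys, and it lowercases both sides to mimic a case-insensitive LIKE; the function name `filter_negative` is made up for illustration.

```python
def filter_negative(rows, negative_keywords):
    """Drop rows whose title or description contains any negative keyword."""
    # Lowercase once so the substring test is case-insensitive (an assumption
    # that mirrors LIKE under a case-insensitive collation).
    lowered = [kw.lower() for kw in negative_keywords if kw]
    kept = []
    for row in rows:
        # Treat NULL columns (None) as empty strings.
        fields = ((row["title"] or "").lower(), (row["description"] or "").lower())
        if not any(kw in field for kw in lowered for field in fields):
            kept.append(row)
    return kept
```

With 130 negative keywords and a few hundred positive matches this is only a few tens of thousands of substring checks, which is usually cheap compared with running them as LIKE conditions over every row.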
Also consider giving MySQL a chance to use an index with LIKE. A pattern without a leading wildcard, LIKE 'keyword_1%', can use an index on the column. This reduces your search options from CONTAINS to STARTS WITH, but it might be sufficient for your case.
Fulltext search
Another option is to use MySQL fulltext search by defining a fulltext index on your title and description columns.
CREATE FULLTEXT INDEX idx_title ON entities(title);
CREATE FULLTEXT INDEX idx_description ON entities(description);
Or you can merge these two columns into a single column for search purposes. Then you need only one fulltext index.
The query syntax is then:
MATCH (title) AGAINST ('keyword_1')
instead of
title LIKE '%keyword_1%'
For this search I would also recommend filtering only the positive matches in the database and then filtering out, in memory, the matches that contain negative keywords.
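As a rough sketch of the positive-only fulltext query, the helper below builds the SQL with placeholders instead of pasting keywords into the string. It assumes the two single-column FULLTEXT indexes defined above and a driver that uses `%s` placeholders (as common MySQL drivers for Python do); `build_fulltext_query` is a hypothetical name.

```python
def build_fulltext_query(keywords):
    """Return (sql, params) matching any keyword in title or description.

    Assumes a FULLTEXT index exists on title and another on description.
    """
    terms = [kw.strip() for kw in keywords if kw.strip()]
    if not terms:
        raise ValueError("at least one keyword is required")
    # In natural language mode, space-separated words match rows that
    # contain any of them.
    search = " ".join(terms)
    sql = (
        "SELECT id, title, description FROM entities "
        "WHERE MATCH (title) AGAINST (%s) "
        "OR MATCH (description) AGAINST (%s)"
    )
    return sql, (search, search)
```

The returned pair would be passed to the driver as `cursor.execute(sql, params)`; the negative keywords are then applied to the result in memory.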

Related

MySql Match() Against in Boolean mode

I'm not able to figure out why this is not working.
Here is my query which works:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('Visual*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
and here is my pages table:
id title
===============================
1 About Us
2 Visual Data
but this one does not return any records:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('About*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
here is SQL Fiddle : http://sqlfiddle.com/#!9/d264f2/2
There are several important concepts when using full text search -- and the documentation has more details.
One key concept is what defines a word. That is not important here, but MySQL lets you specify the delimiters.
Another key concept is that only some words are indexed. Two common reasons why words are not indexed are:
They are too short (or I suppose too long, but that is unusual).
They are in the stop words list.
The words in the stop words list are usually "filler" words -- such as "the", "otherwise", . . . and you might guess "about".
You can override the stop words list by providing another stop words list (or none at all) and then rebuilding the index.
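To make the effect concrete, here is a small sketch of which words from a title would end up in the index under these two rules. The stop word set is a tiny illustrative sample, not the server's real list, and the minimum length of 3 is an assumed setting; both depend on the storage engine and server configuration.

```python
import re

# Illustrative subset only; the real list depends on engine and configuration.
SAMPLE_STOPWORDS = {"about", "the", "of", "to"}

def indexed_words(text, stopwords=SAMPLE_STOPWORDS, min_len=3):
    """Return the words of `text` that survive the length and stop word rules."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if len(w) >= min_len and w not in stopwords]
```

Under these assumptions "About Us" contributes no words at all to the index, while "Visual Data" contributes both, which matches the behaviour seen in the question.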

Searching table efficiently for specific phrase

I want to search an entire column for a specific phrase. I know I can use the SQL statement of SELECT description FROM questions WHERE description LIKE '%what%' AND description LIKE '%if%' to search for the phrase "What if" in the description column. My problem is that if I have a million entries, then searching through the column might take a while.
Is there a way to search through an entire column efficiently to check if a specific phrase exists?
Full text search provides exactly what you want. If you create a FULLTEXT index on the column you search, MySQL can look for your phrase in that column efficiently and return a relevance score for each match, which you can use to order your results.

MySql plural search without Fulltext

I want to make a plural search on my table, but I don't want to use FULLTEXT. I tried FULLTEXT, but my table doesn't support it. My query looks like this:
SELECT
*
FROM
items
WHERE
LOWER(items.`name`) LIKE '%parameter%'
OR LOWER(items.brand) LIKE '%parameter%'
OR LOWER(items.sku) LIKE '%parameter%'
When I search for 'shirt' it returns good results, but when I search for 'shirts' it doesn't. Is there a way to make a plural search without fulltext?
I suggest you create a separate table for items, using the MyISAM engine,
with the fields you want to search and the primary id.
Now you can do a full-text search on the new table to retrieve the ID, and based on the ID you can retrieve the fields from the main items table.
The additional table for "items" needs to be updated regularly, maybe through a trigger or an automated script.
The following will match all values beginning with the parameter passed:
SELECT
*
FROM
items
WHERE
LOWER(items.`name`) LIKE 'parameter%'
OR LOWER(items.brand) LIKE 'parameter%'
OR LOWER(items.sku) LIKE 'parameter%'
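The prefix query above finds 'shirts' when the user types 'shirt', but not the other way round. One possible workaround, sketched here under the assumption that only simple English plurals matter, is to derive candidate singular forms in application code and search for each of them. The function `search_variants` is hypothetical and deliberately naive.

```python
def search_variants(term):
    """Return the term plus naive singular candidates (English 's'/'es' only)."""
    term = term.strip().lower()
    variants = [term]
    if term.endswith("es") and len(term) > 3:
        variants.append(term[:-2])  # boxes -> box
    if term.endswith("s") and len(term) > 2:
        variants.append(term[:-1])  # shirts -> shirt
    return variants
```

Each variant would then be bound to its own LIKE condition. Irregular plurals are not handled, so this is no substitute for real stemming.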

how to search several columns in a sql query using concat and UPPER

I have a DB and a web page where I want to display the results of my DB search.
I have tried a couple of ways. Both work, but they are incomplete for what I want to accomplish.
I want to be able to search my table columns id,Id_Nombre,Pais,Estado,Ciudad,website, all of them, with a word or several words from my search text.
This code works, but it is case sensitive, so I have to type the word exactly:
$query = "SELECT * FROM Medios_table WHERE concat(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE '%$Busqueda%'";
As a result, if I type a word like "People" in my search box and that word is stored in any of those columns as "people", it won't find it.
The second code I use works, but only with one column:
$query = "SELECT * FROM Medios_table WHERE UPPER(Id_Nombre) LIKE UPPER('%$Busqueda%')";
The result for this code is great, since it finds the word no matter the case used, but I need to extend this type of search to all the other columns, and so far everything I have tried does not work.
I have tried:
$query = "SELECT * FROM Medios_table WHERE UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE UPPER(concat(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
etc.
any help is greatly appreciated, thanks.
Did you try UPPER around each column name? Like this:
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id),UPPER(Id_Nombre),UPPER(Pais),UPPER(Estado),UPPER(Ciudad),UPPER(website)) LIKE UPPER('%$Busqueda%')";
Your upper(concat(...)) version should work. If you had a problem with that, it's probably just a typo, like you left out a parenthesis or something.
You can't say upper(x,y,z) because upper takes only one parameter. You must do the concat first, then do the upper, i.e. upper(concat(x,y,z)).
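A sketch of the upper(concat(...)) form with a bound parameter instead of string interpolation follows. It swaps in CONCAT_WS with a space separator, because CONCAT returns NULL as soon as any argument is NULL while CONCAT_WS skips NULL values; that swap, the helper name `build_search`, and the `%s` placeholder style are assumptions, not part of the original query.

```python
COLUMNS = ["id", "Id_Nombre", "Pais", "Estado", "Ciudad", "website"]

def build_search(term, columns=COLUMNS):
    """Return (sql, params) for a case-insensitive search across columns."""
    # Column names are fixed identifiers from the code, never user input.
    joined = ", ".join(columns)
    sql = (
        "SELECT * FROM Medios_table "
        f"WHERE UPPER(CONCAT_WS(' ', {joined})) LIKE UPPER(%s)"
    )
    # Only the search term travels as a parameter, wrapped in wildcards.
    return sql, (f"%{term}%",)
```

The separator also keeps the end of one column from running into the start of the next, which plain concatenation would allow.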
That said, this query will be very slow on a big database, because the db engine has to read every record in the table, and then for each one search it character by character. If the table is small or this is done infrequently, that might be acceptable. If the table is big and/or this query will be executed often, you really need a totally different approach.
Update
If you really need to search against any text in a field, where you cannot make any assumptions about the text being searched in or the text being searched for in advance, you should investigate fulltext searches. http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
If you will be doing these sorts of searches all the time, you might want to build a dictionary, that is, build a list of all the words with all the records that each word occurs in. I think that's basically what fulltext does, so this may or may not gain you anything.
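The dictionary idea can be sketched in a few lines. The example assumes the records fit in memory as a mapping from record id to text; `build_word_index` is an illustrative name, and a real fulltext index does considerably more than this.

```python
import re
from collections import defaultdict

def build_word_index(records):
    """Map each lowercased word to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(record_id)
    return index
```

Looking up a word is then a single dictionary access instead of a scan over every record.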
But if you have some foreknowledge, that is, if it's not really that you want to search for arbitrary text occurring anywhere in arbitrary text, then break out the things you want to search for into separate fields.
To take a simple example, suppose you have a field that contains customer full name, like "Fred Smith", "Mary Jones", etc. You want to search for someone by last name. You could search for
where full_name like '%Smith%'
But this would require reading every record in the table. If, instead, you broke the field into first name and last name, then you could search for
where last_name='Smith'
If you have an index on last_name this would be a very fast search.
If you're just trying to look for an entered value in any of several fields, it is much, much faster to do
where estado='Toledo' or ciudad='Toledo' or pais='Toledo'
(assuming that you have indexes on estado, ciudad, and pais) than
where concat(estado, ciudad, pais) like '%Toledo%'
(where no index will do you any good).
If you want to do case-insensitive searches, create an index on upper(estado) instead of on estado, etc. Then "where upper(estado)='TOLEDO'" can use the index.
Also, I'm not sure about MySQL, but some database engines can use an index if the LIKE pattern does not begin with a wildcard. That is, "somefield like '%x%'" must read every record in the table. But on some DBs, "somefield like 'x%'" can use an index to get to the records where the field starts with "x" and then just process those.
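If you go the starts-with route, user input should not be able to sneak its own wildcards into the pattern. A minimal sketch, assuming the default backslash escape character for LIKE and a hypothetical helper name `prefix_pattern`:

```python
def prefix_pattern(user_input):
    """Escape LIKE wildcards in user input and append a trailing %."""
    escaped = (
        user_input.replace("\\", "\\\\")  # escape the escape character first
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    return escaped + "%"
```

The result is meant to be passed as a bound parameter, for example to `WHERE estado LIKE %s`, rather than concatenated into the SQL text.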
You need full-text search
This link may be useful Mysql full-text search

How to write mysql query to search database?

I own a wallpaper website and I'm trying to write a search feature that will search the database for the terms the user enters. I have 2 fields in the database that I'm searching against: TAGS and NAME.
The current way I'm doing it is I take the search term divide it up into multiple words and then search the database using those terms. So if a user searches for "New York" my query will look like this
SELECT * FROM wallpapers
WHERE tags LIKE '%New%' OR name LIKE '%new%'
or tags LIKE '%York%' OR name LIKE '%York%'
The issue with that of course is that anything with the term new in it will be pulled up also like say "new car" etc. If I replace the query above with the following code then it's too vague and only like 2 wallpapers will show up
SELECT * FROM wallpapers
WHERE tags LIKE '%New York%' OR name LIKE '%New York%'
Does anyone have a better way to write a search query?
Looks like you want to introduce the concept of relevance.
Try:
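-- MySQL needs the table-qualified wallpapers.* next to another select item,
-- and every derived table needs an alias.
-- Note: a row appears once for each relevance tier it matches.
select * from (
SELECT 1 as relevance, wallpapers.* FROM wallpapers
WHERE tags LIKE '%New York%' OR name LIKE '%New York%'
union
select 10 as relevance, wallpapers.* FROM wallpapers
WHERE (tags LIKE '%New%' OR name LIKE '%new%')
and (tags LIKE '%York%' OR name LIKE '%York%')
union
select 100 as relevance, wallpapers.* FROM wallpapers
WHERE tags LIKE '%New%' OR name LIKE '%new%'
union
select 100 as relevance, wallpapers.* FROM wallpapers
WHERE tags LIKE '%York%' OR name LIKE '%York%'
) as ranked
order by relevance asc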
By the way, this will perform very, very poorly if your database grows too large - you want to be formatting your columns consistently so they're all upper case (or all lower case), and you want to avoid wildcards in your where clauses if you possibly can.
Once this becomes a problem, look at full text searching.
Perhaps this is a really dumb question, but could the following be what you want?
SELECT * FROM wallpapers
WHERE ( tags LIKE '%New%' OR name LIKE '%new%' )
and ( tags LIKE '%York%' OR name LIKE '%York%' )
This searches for wallpapers that contain both words, anywhere in either column.
Warning: beware of SQL injection when building queries this way, for example when searching for "words" like new'york or new%york. Perhaps the easiest way is to treat all non-alphanumeric characters as spaces when splitting, so that new#york and similar become new and york.
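The splitting rule described above might be sketched like this. It only keeps ASCII letters and digits, which is an assumption that would drop accented characters, and `split_terms` is an illustrative name.

```python
import re

def split_terms(search):
    """Split a search string on any run of non-alphanumeric characters."""
    return [term for term in re.split(r"[^0-9A-Za-z]+", search) if term]
```

Splitting this way removes quotes and wildcards from the terms, but passing each term as a bound parameter is still the safer way to keep user input out of the SQL text.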
Notes about searching:
Searching this way in a database forces a full table scan. As long as you only have a few wallpapers this is not a big problem. Nearly all current cheap hardware should be able to search through a million wallpapers within a second or so, as long as the database fits into memory.
However, on bigger sites where the tags and name information exceeds available RAM, you will certainly have a problem. Then it is time to try some other way to search. What to do depends heavily on the expected usage pattern, so more information is needed to answer that.
You can use your first query to do a first selection and then rank the results: a wallpaper with both keywords "New" and "York" will be ranked higher than a wallpaper with only one of the keywords.
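A minimal sketch of that ranking step, assuming the first query's rows arrive as dictionaries with `tags` and `name` keys (the function name `rank_by_matches` is made up):

```python
def rank_by_matches(rows, keywords):
    """Sort rows so those matching more distinct keywords come first."""
    lowered = [kw.lower() for kw in keywords]

    def score(row):
        fields = ((row["tags"] or "").lower(), (row["name"] or "").lower())
        # Count each keyword once, whichever field it appears in.
        return sum(any(kw in field for field in fields) for kw in lowered)

    # sorted() is stable, so rows with equal scores keep their original order.
    return sorted(rows, key=score, reverse=True)
```

For example, `rank_by_matches(rows, ["New", "York"])` puts wallpapers that mention both words ahead of those that mention only one.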
If the second query only returns 2 wallpapers, can you give some examples of the tags/name of wallpapers that the query does not return but that you would like to have?
Is it a problem with uppercase letters? Spaces?
I believe you may benefit from full-text indexes. Read up on MySQL full-text search. On MySQL versions before 5.6 your tables need to use the MyISAM engine; from 5.6 onward InnoDB supports FULLTEXT indexes as well.
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html