I have a database with 75,000+ rows with 500+ entries added per day.
Each row has a title and description.
I created an RSS feed which gives you the latest entries for a specific search term (e.g. http://site.com/rss.rss?q=Pizza would output an RSS feed for the search term "Pizza").
I was wondering what would be the best way to write the SQL query for this. Right now I have:
SELECT *
FROM `table`
WHERE ((`title` LIKE '%searchterm%') OR (`description` LIKE '%searchterm%'))
LIMIT 20;
But the problem is that the query takes between 2 and 10 seconds to execute.
Is there a better way to write the query? Do I have to cache the results (and how would I do that?), or would changing something in the database structure (indexes?) speed up the query?
A relatively simple solution for this would be incorporating a FULLTEXT index on these two fields and subsequently searching by using this index.
ALTER TABLE table ADD FULLTEXT(title, description);
Then, whenever you need to perform a search, you'd do the following:
SELECT id FROM table
WHERE MATCH (title, description) AGAINST ('keyterm');
Fulltext indexed search is the built-in solution included in most SQL databases. It's much faster than using LIKE. It also fits your specific case, because you are only interested in natural-language search terms.
The fulltext index also comes with an algorithm for ranking results by relevance. You can read more about it here
EDIT
In the alter statement, I missed the fulltext index name, it should be:
ALTER TABLE table ADD FULLTEXT ft_index_name(title, description);
Try:
SELECT * FROM table
WHERE MATCH (title,description) AGAINST ('searchterm');
Make sure you add a full text index on title, description together.
Don't try to reinvent the wheel. MATCH and AGAINST are provided by MySQL to do exactly that and to make your life easy. However, note that full-text search works on MyISAM tables (InnoDB only supports FULLTEXT indexes from MySQL 5.6 onward); on older versions you can work around this for InnoDB too. You can add the FULLTEXT index by altering the table like this:
ALTER TABLE table ADD FULLTEXT(title,description);
If you're using a query with LIKE '%term%', indexes can't be used. They can be used only if the pattern has no leading wildcard, like 'term%'. Think of an address book with tabs: you can very quickly find contacts starting with the letter L, but to find contacts with "on" somewhere in the name, you have to scan the whole address book.
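To see the address-book effect for yourself, here is a minimal sketch that asks a database for its query plan. It uses Python's built-in sqlite3 module as a stand-in (not MySQL), and the `contacts` table, index name and sample names are made up for illustration. The column is declared COLLATE NOCASE because that is what SQLite needs to apply its prefix optimization for LIKE with default settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; NOCASE collation lets SQLite use the index for LIKE 'L%'.
conn.execute("CREATE TABLE contacts (name TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX idx_contacts_name ON contacts (name)")
conn.executemany(
    "INSERT INTO contacts (name) VALUES (?)",
    [("Larson",), ("Lee",), ("Monroe",), ("Ronson",)],
)


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


prefix_sql = "SELECT name FROM contacts WHERE name LIKE 'L%'"
infix_sql = "SELECT name FROM contacts WHERE name LIKE '%on%'"

prefix_plan = plan(prefix_sql)  # pattern anchored at the start
infix_plan = plan(infix_sql)    # leading wildcard

print(prefix_plan)
print(infix_plan)
```

On a typical SQLite build the first plan is reported as a SEARCH on the index and the second as a SCAN over every entry. In MySQL you would run EXPLAIN on your own query to check the same thing.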
The better alternative could be to use full text indexes:
CREATE FULLTEXT INDEX title_desc
ON table (title, description)
And then in the query:
SELECT title, description FROM table
WHERE MATCH (title, description) AGAINST ('+Pizza' IN BOOLEAN MODE)
I would go with JohnB's or gtr32x's answer (Full Text Indexing). To complement their answers, there's a manual way to create a basic full-text index that's simple and super fast...
Split title and description into keywords, and place them in a Keywords table, which has a foreign key to the original RSS article. Make sure the keyword column in Keywords is indexed. Then you can do something like:
SELECT DISTINCT ra.*
FROM RssArticle ra
INNER JOIN Keywords k ON k.ArticleID = ra.ArticleID
WHERE k.Keyword IN ( 'SearchTerm1', 'SearchTerm2', 'SearchTerm3')
LIMIT 20;
And it's fast!
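As a rough sketch of how that Keywords table could be filled and queried, here is a small runnable example. It uses Python with an in-memory SQLite database as a stand-in for MySQL. The table names follow the query above, while the column name `Keyword`, the helper functions and the sample articles are assumptions for illustration. Splitting on runs of letters and digits is just one simple way to tokenize.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RssArticle (
        ArticleID INTEGER PRIMARY KEY, title TEXT, description TEXT
    );
    CREATE TABLE Keywords (
        Keyword TEXT NOT NULL,
        ArticleID INTEGER NOT NULL REFERENCES RssArticle (ArticleID)
    );
    CREATE INDEX idx_keywords_keyword ON Keywords (Keyword);
""")


def add_article(title, description):
    """Insert an article and index every distinct lowercase word in it."""
    cur = conn.execute(
        "INSERT INTO RssArticle (title, description) VALUES (?, ?)",
        (title, description),
    )
    article_id = cur.lastrowid
    words = set(re.findall(r"[a-z0-9]+", (title + " " + description).lower()))
    conn.executemany(
        "INSERT INTO Keywords (Keyword, ArticleID) VALUES (?, ?)",
        [(word, article_id) for word in sorted(words)],
    )
    return article_id


def search(terms, limit=20):
    """Return titles of the newest articles containing any of the terms."""
    placeholders = ", ".join("?" for _ in terms)
    sql = (
        "SELECT DISTINCT ra.ArticleID, ra.title FROM RssArticle ra "
        "INNER JOIN Keywords k ON k.ArticleID = ra.ArticleID "
        f"WHERE k.Keyword IN ({placeholders}) "
        "ORDER BY ra.ArticleID DESC LIMIT ?"
    )
    params = [term.lower() for term in terms] + [limit]
    return [row[1] for row in conn.execute(sql, params)]


add_article("Best pizza in town", "Wood-fired oven and fresh dough")
add_article("Pasta night", "Homemade pasta with pizza-style toppings")
add_article("Sushi bar opens", "Fresh fish daily")
print(search(["Pizza"]))
```

The lookup is an equality match on an indexed column, which is why it stays fast, but it only finds whole words: searching for "pizz" would return nothing.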
Try any of the following four queries:
select * from myTable where concat_ws(' ',title,description) like '%pizza%';
select * from myTable where concat_ws(' ',title,description) regexp '.*pizza+.*';
select title,description from myTable where concat_ws(' ',title,description) like '%pizza%';
select title,description from myTable where concat_ws(' ',title,description) regexp '.*pizza+.*';
The point is to concatenate the columns before searching.
A few pointers: drop the * in your SELECT statement and pull only the columns you need, and make sure to add indexes to the columns that are being searched.
SELECT `title`,`description`
FROM `table`
WHERE `title` LIKE '%$searchterm%' OR `description` LIKE '%$searchterm%' LIMIT 25;
Did you create an index for title and for description?
You should consider Sphinx for Full Text Search capabilities.
Thanks for the comment Tyler.
I restate my answer:
1) Create an index on title and description columns, but your query would be limited to the example below, and that's not ideal for finding all relevant rows:
SELECT *
FROM `table`
WHERE title LIKE 'searchterm%' OR description LIKE 'searchterm%'
LIMIT 20;
2) As others have mentioned, use MySQL Full-Text Search. Before MySQL 5.6 you are limited to the MyISAM table engine, as FULLTEXT indexes weren't available for InnoDB. However, you can mix engines in MySQL, so you can make this table MyISAM even if all your other tables are InnoDB.
3) Use an external Full-Text Search engine, such as Sphinx. This will give you more relevant search results (MySQL Full-Text Search leaves much to be desired), it will perform better, and it abstracts the burden of Full-Text Searching away from your database.
Related
I am new to SQL programming.
I have a table job where the fields are id, position, category, location, salary range, description, refno.
I want to implement a keyword search from the front end. The keyword can reside in any of the fields of the above table.
This is the query I have tried, but it returns many duplicate rows:
SELECT
a.*,
b.catname
FROM
job a,
category b
WHERE
a.catid = b.catid AND
a.jobsalrange = '15001-20000' AND
a.jobloc = 'Berkshire' AND
a.jobpos LIKE '%sales%' OR
a.jobloc LIKE '%sales%' OR
a.jobsal LIKE '%sales%' OR
a.jobref LIKE '%sales%' OR
a.jobemail LIKE '%sales%' OR
a.jobsalrange LIKE '%sales%' OR
b.catname LIKE '%sales%'
For a single keyword on VARCHAR fields you can use LIKE:
SELECT id, category, location
FROM table
WHERE
(
category LIKE '%keyword%'
OR location LIKE '%keyword%'
)
For a description, you're usually better off adding a full-text index and doing a Full-Text Search (MyISAM only before MySQL 5.6; InnoDB supports it from 5.6 on):
SELECT id, description
FROM table
WHERE MATCH (description) AGAINST('keyword1 keyword2')
SELECT
*
FROM
yourtable
WHERE
id LIKE '%keyword%'
OR position LIKE '%keyword%'
OR category LIKE '%keyword%'
OR location LIKE '%keyword%'
OR description LIKE '%keyword%'
OR refno LIKE '%keyword%';
Ideally, have a keyword table containing the fields:
Keyword
Id
Count (possibly)
with an index on Keyword. Create an insert/update/delete trigger on the other table so that, when a row is changed, every keyword is extracted and put into (or replaced in) this table.
You'll also need a table of words to not count as keywords (if, and, so, but, ...).
In this way, you'll get the best speed for queries wanting to look for the keywords and you can implement (relatively easily) more complex queries such as "contains Java and RCA1802".
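Here is a minimal sketch of that design, including a "contains Java and RCA1802" style query. It is written in Python against an in-memory SQLite database as a stand-in. The keywords are maintained from application code rather than a trigger to keep the example short, and the table layout, stop-word list and function names are assumptions, not a fixed recipe.

```python
import sqlite3

# Words that are never stored as keywords (extend as needed).
STOP_WORDS = {"if", "and", "so", "but", "the", "a", "of"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE keyword (keyword TEXT NOT NULL, id INTEGER NOT NULL);
    CREATE INDEX idx_keyword ON keyword (keyword);
""")


def extract_keywords(text):
    """Lowercase the words, strip punctuation, drop stop words."""
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    return {w for w in words if w and w not in STOP_WORDS}


def save_job(job_id, description):
    """Insert or replace a job and rebuild its keyword rows."""
    conn.execute(
        "INSERT OR REPLACE INTO job (id, description) VALUES (?, ?)",
        (job_id, description),
    )
    conn.execute("DELETE FROM keyword WHERE id = ?", (job_id,))
    conn.executemany(
        "INSERT INTO keyword (keyword, id) VALUES (?, ?)",
        [(word, job_id) for word in extract_keywords(description)],
    )


def jobs_with_all(*terms):
    """Return ids of jobs that contain every one of the given terms."""
    wanted = sorted({term.lower() for term in terms})
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        f"SELECT id FROM keyword WHERE keyword IN ({placeholders}) "
        "GROUP BY id HAVING COUNT(DISTINCT keyword) = ? ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, wanted + [len(wanted)])]


save_job(1, "Java developer and RCA1802 assembly programmer")
save_job(2, "Java developer")
save_job(3, "RCA1802 hardware, but no software")
print(jobs_with_all("Java", "RCA1802"))
```

The GROUP BY ... HAVING COUNT(DISTINCT keyword) clause is what turns the IN list into an "all of these words" match instead of "any of these words".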
"LIKE" queries will work but they won't scale as well.
Personally, I wouldn't use the LIKE string comparison on the ID field or any other numeric field. It doesn't make sense for a search for ID# "216" to return 16216, 21651, 3216087, 5321668..., and so on and so forth; likewise with salary.
Also, if you want to use prepared statements to prevent SQL injections, you would use a query string like:
SELECT * FROM job WHERE `position` LIKE CONCAT('%', ? ,'%') OR ...
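For illustration, this is roughly what binding the keyword as a parameter looks like from application code. The sketch uses Python's sqlite3 module as a stand-in, so the wildcards are attached with SQLite's || operator instead of CONCAT. The table contents and the function name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, position TEXT)")
conn.executemany(
    "INSERT INTO job (position) VALUES (?)",
    [("Sales Manager",), ("Presales Engineer",), ("Accountant",)],
)


def search_positions(keyword):
    """Return positions containing `keyword`, passed as a bound parameter."""
    # SQLite concatenates with ||; in MySQL this would be CONCAT('%', ?, '%').
    sql = "SELECT position FROM job WHERE position LIKE '%' || ? || '%' ORDER BY id"
    return [row[0] for row in conn.execute(sql, (keyword,))]


print(search_positions("sales"))
# A classic injection attempt is treated as plain text to search for.
print(search_positions("' OR '1'='1"))
```

Because the keyword never becomes part of the SQL text, quotes in user input cannot change the query. Note that % and _ inside the keyword still act as wildcards unless you escape them.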
I will explain the method I usually prefer.
First of all, bear in mind that this method sacrifices memory to gain computation speed.
Second, you need the right to edit the table structure.
1) Add a field (I usually call it "digest") where you store all the data from the table.
The field will look like:
"n-n1-n2-n3-n4-n5-n6-n7-n8-n9" etc., where each n is a single word.
I achieve this using a regular expression that replaces " " with "-".
This field is the result of all the table data "digested" into one single string.
2) Use LIKE '%keyword%' on the digest field:
SELECT * FROM table WHERE digest LIKE '%keyword%'
You can even build the query with a little loop so that you can search for multiple keywords at the same time, like this:
SELECT * FROM table WHERE
digest LIKE '%keyword1%' AND
digest LIKE '%keyword2%' AND
digest LIKE '%keyword3%' ...
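The "little loop" could look something like the sketch below. It is Python with an in-memory SQLite table standing in for the real one. The `items` table, its columns and both helper functions are hypothetical, and the keywords are bound as parameters rather than pasted into the SQL string.

```python
import sqlite3


def make_digest(*fields):
    """Join all words of all fields with '-' into one searchable string."""
    return "-".join(" ".join(fields).split())


def build_digest_query(keywords):
    """Return (sql, params) matching rows whose digest contains every keyword."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    clauses = " AND ".join("digest LIKE ?" for _ in keywords)
    params = ["%" + keyword + "%" for keyword in keywords]
    return "SELECT id FROM items WHERE " + clauses + " ORDER BY id", params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, "
    "location TEXT, description TEXT, digest TEXT)"
)
rows = [
    (1, "Sales", "Berkshire", "senior sales manager"),
    (2, "IT", "London", "java developer"),
]
for row_id, category, location, description in rows:
    conn.execute(
        "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
        (row_id, category, location, description,
         make_digest(category, location, description)),
    )

sql, params = build_digest_query(["sales", "berk"])
print(sql)
print([row[0] for row in conn.execute(sql, params)])
```

Keep in mind that every LIKE here still has a leading wildcard, so the database reads the digest of every row. The gain is one column to scan instead of many, not an index lookup.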
You can find another, simpler option in the thread "Match Against..", with more detailed help in section 11.9.2, Boolean Full-Text Searches, of the MySQL manual.
This is just in case someone needs a more compact option. It requires creating a FULLTEXT index on the table, which can be accomplished easily.
Information on how to create indexes (MySQL): MySQL FULLTEXT Indexing and Searching
A FULLTEXT index can list more than one column. With an index named search, the resulting SQL statement would be:
SELECT *, MATCH (`column`) AGAINST('+keyword1* +keyword2* +keyword3*') AS relevance
FROM `documents` USE INDEX(search)
WHERE MATCH (`column`) AGAINST('+keyword1* +keyword2* +keyword3*' IN BOOLEAN MODE)
ORDER BY relevance;
I tried with multiple columns, with no luck. Multiple columns are allowed in a FULLTEXT index, but the column list in MATCH() has to match the column list of an index exactly, so a single-column MATCH needs its own single-column index.
Depending on your criteria, you can use either option.
I know this is a bit late, but this is what I did in our application. It works for me, and I hope it helps someone:
SELECT * FROM `landmarks`
WHERE `landmark_name` LIKE '%keyword%'
OR `landmark_description` LIKE '%keyword%'
OR `landmark_address` LIKE '%keyword%'
-- '%keyword%' also matches values that start or end with the keyword
I was trying to make a very fast & efficient approach to fetch the records using keywords as search.
Our MySQL table MASTER contains 30,000 rows and has 4 fields.
ID
title (FULLTEXT)
short_descr (FULLTEXT)
long_descr (FULLTEXT)
Can anyone suggest which one is more efficient?
LIKE %
MYSQL's AGAINST
It would be nice if some one can write a SQL query for the keywords
Weight Loss Secrets
SELECT id FROM MASTER
WHERE (title LIKE '%Weight Loss Secrets%' OR
short_descr LIKE '%Weight Loss Secrets%' OR
long_descr LIKE '%Weight Loss Secrets%')
Thanks in advance
The FULLTEXT index should be faster. It may be a good idea to add all the columns to one FULLTEXT index.
ALTER TABLE MASTER
ADD FULLTEXT INDEX `FullTextSearch`
(`title` ASC, `short_descr` ASC, `long_descr` ASC);
Then run the search using IN BOOLEAN MODE:
SELECT id FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE);
This will find rows that contain all 3 keywords.
However, this won't give you an exact phrase match; the keywords just need to be present in the same row.
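If the keywords come from user input, the '+Weight +Loss +Secrets' string has to be built somewhere. Here is a small sketch in Python of one way to do it. The function name is made up, and stripping everything except word characters is an assumption made so that user-typed boolean operators cannot change the meaning of the search.

```python
import re


def to_boolean_query(phrase):
    """Turn 'Weight Loss Secrets' into '+Weight +Loss +Secrets'.

    Only runs of word characters are kept, so operators such as
    + - * " ( ) typed by the user are dropped, not interpreted.
    """
    words = re.findall(r"\w+", phrase)
    return " ".join("+" + word for word in words)


print(to_boolean_query("Weight Loss Secrets"))
print(to_boolean_query('+Weight -Loss "Secrets"'))
```

The result would then be passed to AGAINST (... IN BOOLEAN MODE) as a bound parameter. A side effect of this simple tokenizer is that words with apostrophes or hyphens are split into separate required terms.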
If you also want an exact match, you could do it like this, but it's a bit hacky and would only work if your table doesn't get too big.
SELECT id FROM
(
SELECT id, CONCAT(title,' ',short_descr,' ', long_descr) AS SearchField
FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE)
) result WHERE SearchField LIKE '%Weight Loss Secrets%'
I am currently looking into using FULLTEXT indexes in MySQL for search functionality within a web site.
Basically, the user can go to an advanced search page, and select 1 or more columns to search against, e.g. they can search Title, Description and Comments or either only 1 column or a mixture of the three and when they perform the search these selected columns are searched for against the keywords.
I had created 1 index for the title, 1 index for the description and 1 index for the comments and then tried to run the following query:
SELECT * FROM support_calls WHERE MATCH(Title, Description) AGAINST('+these, +are, +some, +keywords')
I got an error from MySQL saying that the MATCH didn't match any fulltext indexes and I found that I need to create an index which included Title and Description together instead of having them in separate indexes.
If this is the case, it is going to add some complexity, as I will have to create an index for every single variation of the columns the user can select. Am I going about this the right way, or is there a better solution?
First execute the query below, and then run your MATCH() query.
ALTER TABLE support_calls ADD FULLTEXT (
Title, Description
)
I'm deploying a Rails application that aggregates coupon data from various third-party providers into a searchable database. Searches are conducted across four fields for each coupon: headline, coupon code, description, and expiration date.
Because some of these third-party providers do a rather bad job of keeping their data sorted, and because I don't want duplicate coupons to creep into my database, I've implemented a unique compound index across those four columns. That prevents the same coupon from being inserted into my database more than once.
Given that I'm searching against these columns (via simple WHERE column LIKE %whatever% matching for the time being), I want these columns to each individually benefit from the speed gains to be had by indexing them.
So here's my question: will the compound index across all columns provide the same searching speed gains as if I had applied an individual index to each column? Or will it only guarantee uniqueness among the rows?
Complicating the matter somewhat is that I'm developing in Rails, so my question pertains both to SQLite3 and MySQL (and whatever we might port over to in the future), rather than one specific RDBMS.
My guess is that the indexes will speed up searching across individual columns, but I really don't have enough "under the hood" database expertise to feel confident in that judgement.
Thanks for lending your expertise.
"will the compound index across all columns provide the same searching speed gains as if I had applied an individual index to each column?"
Nope. The order of the columns in the index is very important. Let's suppose you have an index like this: create unique index index_name on table_name (headline, coupon_code, description, expiration_date)
In this case these queries will use the index:
select * from table_name where headline = 1
select * from table_name where headline = 1 and coupon_code = 2
and these queries won't use the unique index:
select * from table_name where coupon_code = 1
select * from table_name where description = 1 and coupon_code = 2
So the rule is something like this: when you have multiple fields indexed together, you have to specify the first k fields to be able to use the index.
So if you want to be able to search on any one of these fields, you should create an index on each of them separately (besides the combined unique index).
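Since the question also mentions SQLite3, here is a minimal sketch that checks this leading-column rule there by asking for the query plan. It is Python with the built-in sqlite3 module. The `coupons` table, the index name and the sample values are made up, and other databases report their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons (
        id INTEGER PRIMARY KEY,
        headline TEXT,
        coupon_code TEXT,
        description TEXT,
        expiration_date TEXT
    );
    CREATE UNIQUE INDEX idx_coupons_unique
        ON coupons (headline, coupon_code, description, expiration_date);
""")


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


# Filters on the first indexed column: the index can be searched.
leading = plan("SELECT * FROM coupons WHERE headline = 'Free shipping'")
# Skips the first indexed column: every entry has to be visited.
non_leading = plan("SELECT * FROM coupons WHERE coupon_code = 'SAVE10'")

print(leading)
print(non_leading)
```

With no statistics gathered, SQLite reports the first query as a SEARCH using the compound index and the second as a SCAN. Adding a separate index on coupon_code would turn the second one into a SEARCH as well.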
Also, be careful with the LIKE operator.
This will use an index: SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';
and this will not: SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';
index usage http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
multiple column index http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
I want to write a query to select from a table all rows with the word "piggy" in a column called Description.
SELECT * FROM table WHERE ...?
Thank you!
In MySQL, % is a wild-card. You don't use wild-cards with the = operator but with the LIKE operator.
SELECT * FROM table WHERE `Description` LIKE "%piggy%"
http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html
select * from table where description like '%piggy%'
will select and return all rows where the word piggy is part of the value of the column. If you want to count how many rows then:
select count(*) from table where description like '%piggy%'
select * from table where description like '%piggy%'
As mentioned several times, you need a LIKE query for this. I would only warn that LIKE '%piggy%' cannot use an ordinary index, so it scans every row and gets terribly slow on a large table. If you want better performance, use a FULLTEXT index with MATCH ... AGAINST; before MySQL 5.6 that requires a MyISAM table, as InnoDB didn't support fulltext indexes.
Anyway, if you want to implement a search engine, better look for existing APIs. I don't know what programming language you're using, but if it were Java, I'd recommend Apache Lucene.