I have a table named data with about 500,000 records. The table has these columns:
id,
title,
created_date,
content...
Of these columns, content holds large amounts of text.
I used this search query:
SELECT count(*) from data WHERE content LIKE '%keyword%';
This query takes around 9 seconds to execute.
I then tried Full-Text Search with this query:
SELECT count(*) from data
WHERE MATCH(content) AGAINST ('keyword*' IN BOOLEAN MODE);
This query is much faster, taking only about 0.4 seconds, but its results are not the same as those of the query that uses the LIKE operator.
According to the MySQL documentation, the full-text query above only matches words of the form "keyword---" and ignores "---keyword", so it does not meet my search requirement.
So I want to ask: is there any other way to replace the LIKE operator, or any way to speed up searching with LIKE?
You can use LOCATE():
SELECT COUNT(*) FROM data WHERE LOCATE("keyword", content) > 0
Or INSTR():
SELECT COUNT(*) FROM data WHERE INSTR(content, "keyword") > 0
If you have a lot of text in your DB and need this kind of search, you could do several things:
you can configure full-text search indexes in MySQL
you can use another full-text search engine such as Sphinx or Elasticsearch. They are much more powerful and scalable compared to native MySQL full-text search.
Also, regarding full-text search in MySQL: try using + instead of *.
With *, words match if they begin with the word preceding the * operator.
A leading + sign indicates that a word must be present in each row that is returned.
I found some threads about this, but nothing fits my case.
I have a search field in my mobile app. After each text change, a real-time search runs by calling my API.
The search request starts only if 3 or more characters have been entered, and it searches ONLY 1 DB column, called TITLE. So each time the user enters a letter, a query searches for the current text.
Currently I have it like this (I know this solution is very bad). $searchedword is the word the user entered:
if (!empty($searchedword) && strlen($searchedword) > 2) {
    $searchedword = strtolower($searchedword);
    $sql = "SELECT * FROM TABLE ";
    $result = $mysqli->query($sql);
    $output = '';
    if ($result->num_rows > 0) {
        while ($data = $result->fetch_array()) {
            $title = strtolower($data['title']);
            $content = $data['content'];
            // Keep only rows whose title contains the searched word
            if (strpos($title, $searchedword) !== false) { $output .= $title . ',' . $content; }
        }
    }
}
So this just checks whether the title from the DB contains the searched word. This works very well, but I think it is very bad for performance, because each time the user enters a letter in the search field, all the data from the table is queried and scanned for that word.
I want to rewrite my code to get the best performance.
So my first question is: should I add a FULLTEXT INDEX to the TITLE column in the DB? Will it help, or will it just increase disk usage? I am searching against only 1 column, and that column holds just a title (1 or 2 words max).
My second question: what is the best-performing query for my case, given that I need to search after each letter the user enters?
Can I search this way?
SELECT * FROM TABLE WHERE MATCH (title) AGAINST ('$searchedword' IN NATURAL LANGUAGE MODE)
However, it seems this returns a row only if the word matches a whole word in the title and returns nothing when the word is only part of the title, so it is not a good solution.
The only solution which works is this:
SELECT * FROM TABLE WHERE title LIKE '%$searchedword%'
But what about performance? I also don't understand how this works, because the searched word is converted to lowercase and I have removed the accents from it, while the TITLE column in the DB has accents and uppercase letters, yet this search works very well!
If your title column has a collation like utf8mb4_general_ci, you don't have to worry about dealing with upper case, lower case, and diacritical marks in your MySQL WHERE clauses. MySQL will do it for you. It is really good at handling character sets and collations in all kinds of languages. (Such things are very helpful to Swedish-language users, and the inventors of MySQL are Swedish.)
FULLTEXT with NATURAL LANGUAGE MODE is probably not the right approach for this application. It works on words, not chunks of letters. So it probably won't give you anything until your user has typed a whole word, and not a stop word. And, it is a little squirrely when you search a table with only a few rows. So, that might be a problem if you're just getting started.
It does order the results by the closeness of the match, so the most likely hit is the first one. So, if you know you have a phrase to search, it's good.
For your progressive-search application you may want to use one of these two LIKE queries.
SELECT title FROM tbl WHERE title LIKE CONCAT('$searchedword', '%') /*insecure*/
or this one which is much slower but finds your partial match anywhere in the title, not just at the beginning.
SELECT title FROM tbl WHERE title LIKE CONCAT('%', '$searchedword', '%') /*insecure*/
Avoid running these queries until you have gathered at least a few letters from your user, otherwise you'll get absurdly many results.
In these cases say SELECT title not SELECT *, and create an ordinary index on the title column. That way MySQL can satisfy the whole query from the index, which will make it much faster.
And, use MySQL's WHERE functionality to do the matching. Don't fetch the whole table from MySQL and search it in your php program.
And, use prepared statements. Because cybercreeps.
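The advice above (match in the database, select only title, wait for a few letters, bind the user's input as a parameter) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard sqlite3 module as a stand-in for PHP and MySQL so that it runs on its own, and search_titles, min_chars, and limit are hypothetical names. In PHP the same idea would go through mysqli's prepare() and bind_param().

```python
import sqlite3

def search_titles(conn, searched_word, min_chars=3, limit=20):
    """Return titles containing searched_word, using a bound parameter.

    Hypothetical helper: sqlite3 stands in for MySQL here.
    """
    # Skip the query until the user has typed a few letters.
    if len(searched_word) < min_chars:
        return []
    # Escape LIKE metacharacters so the input is matched literally.
    escaped = (searched_word.replace("\\", "\\\\")
                            .replace("%", "\\%")
                            .replace("_", "\\_"))
    # The pattern is passed as a parameter, never pasted into the SQL text.
    rows = conn.execute(
        "SELECT title FROM tbl WHERE title LIKE ? ESCAPE '\\' "
        "ORDER BY title LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (title TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?)",
    [("Red Apple",), ("Pineapple Juice",), ("Banana",), ("100% Orange",)],
)

print(search_titles(conn, "app"))
print(search_titles(conn, "' OR '1'='1"))
```

Because the search text travels as data, a string such as ' OR '1'='1 is just an unusual title fragment and matches nothing.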
Full-text search is not matching the exact value when I search for multiple words.
This is the query:
SELECT * FROM csv WHERE match(data) against('"TMN PANTAI"' IN BOOLEAN MODE)
It returns results, but it matches rows containing "TMN", rows containing "PANTAI", and rows containing "TMN PANTAI".
How can I search for the exact match "TMN PANTAI"?
Full-text indexes work on individual words rather than on the literal string including its spaces, so you may not get an exact-string match from them alone.
If you still want to take advantage of the full-text index, you can narrow the result set with a MATCH ... AGAINST query and add a LIKE condition on top. This will be more efficient than running LIKE against the whole table, since the LIKE now has fewer records to filter.
SELECT * FROM csv WHERE match(data) against('+TMN +PANTAI' IN BOOLEAN MODE)
AND data like '%TMN PANTAI%'
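The two-stage idea behind that query (a cheap word lookup first, then an exact substring check on the few survivors) can be sketched in plain Python. This is an illustration only, not how MySQL implements full-text search; build_word_index and exact_phrase_search are hypothetical names and the sample rows are made up.

```python
import re
from collections import defaultdict

def build_word_index(rows):
    """Map each lowercased word to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(row_id)
    return index

def exact_phrase_search(rows, index, phrase):
    """Stage 1: rows containing every word. Stage 2: exact substring check."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    # Stage 1 mirrors '+TMN +PANTAI': every word must be present.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    # Stage 2 mirrors LIKE '%TMN PANTAI%' on the reduced candidate set.
    needle = phrase.lower()
    return sorted(i for i in candidates if needle in rows[i].lower())

rows = {
    1: "TMN PANTAI INDAH",
    2: "PANTAI TMN",
    3: "TMN SRI",
    4: "JALAN TMN PANTAI 2",
}
index = build_word_index(rows)
print(exact_phrase_search(rows, index, "TMN PANTAI"))
```

Row 2 survives the word stage because it contains both words, and is dropped only by the substring stage.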
search term=['ISBN number on site']
The column is sentence, in a MySQL table. It contains many different sentences.
The sentence I want to find is
"The AutoLink feature comes with Google's latest toolbar and provides links in a webpage to Amazon.com if it finds a book's ISBN number on the site."
However, when I use the following statement:
SELECT * FROM testtable
where Sentence like "%ISBN number on site%" ;
I do not get the result. This is because the search term ("ISBN number on site") is missing one word ("the") compared with the sentence.
How can I change my statement to get the sentence I want? Thanks.
Assume that we do not change the search term=['ISBN number on site']
This is not a simple question. Your best bet is to use some type of full-text search. Full-text search can be configured with stopwords (words that are omitted from the search, such as the word "the") and with a minimum word length (words shorter than a certain number of characters are also omitted from the search).
However, if you simply use
SELECT * FROM testtable
WHERE MATCH (sentence)
AGAINST ('ISBN number on site');
Then MySQL will return not just the record with the value you were looking for, but the records that have some of the words only, and in different order. The one you showed will probably be one of the highest ranking one, but there is no guarantee that it will be highest ranked one.
You may want to use Boolean fulltext search and prepend + to every search word to force MySQL to return those records only that have all the search words present:
SELECT * FROM testtable
WHERE MATCH (sentence)
AGAINST ('+ISBN +number +on +site' IN BOOLEAN MODE);
But "on" will either be a stopword (it is on the default stopword lists) or be shorter than the minimum word length, so it should be omitted from the search expression (otherwise you will not get back any results):
SELECT * FROM testtable
WHERE MATCH (sentence)
AGAINST ('+ISBN +number +site' IN BOOLEAN MODE);
I know that this requires altering the search expression, but it will get you the best results using MySQL's built-in functionality.
The alternative is to use another full-text search engine, such as Sphinx, to perform the search for you.
Try:
SELECT * FROM testtable where Sentence like '%ISBN number on%site%' ;
The wildcard can go in the middle of a string too.
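A small sketch of the mid-pattern wildcard, run against SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained; the % wildcard behaves the same way in this case). The extra rows and the matching_ids helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (id INTEGER PRIMARY KEY, Sentence TEXT)")
conn.executemany(
    "INSERT INTO testtable VALUES (?, ?)",
    [
        (1, "The AutoLink feature comes with Google's latest toolbar and "
            "provides links in a webpage to Amazon.com if it finds a book's "
            "ISBN number on the site."),
        (2, "ISBN number on site"),
        (3, "The site lists every ISBN number."),
        (4, "Print the ISBN number on the cover, not on our website."),
    ],
)

def matching_ids(pattern):
    """Return the ids of rows whose Sentence matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM testtable WHERE Sentence LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]

# Without the inner wildcard only the literal phrase matches.
print(matching_ids("%ISBN number on site%"))
# With it, any text may sit between "on" and "site".
print(matching_ids("%ISBN number on%site%"))
```

Note that the inner % matches any run of characters, so row 4 is returned as well ("site" is found inside "website"). The pattern is broader than "allow one missing word".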
I am building a search feature for the messages part of my site, and have a messages database with a little over 9,000,000 rows and an index on the sender, subject, and message fields. I was hoping to use the MySQL LIKE clause in my query, for example:
SELECT sender, subject, message FROM Messages WHERE message LIKE '%EXAMPLE_QUERY%';
to retrieve results. Unfortunately, MySQL doesn't use indexes when a leading wildcard is present, and the leading wildcard is necessary because the search text could appear anywhere in the message (this is how the wildcards work, no?). Queries are very, very slow, and I cannot use a full-text index either, because of the annoying 50% rule (I just can't afford to rule that much out). Is there any way (or any alternative) to optimize a query using LIKE and two wildcards? Any help is appreciated.
You should either use full-text indexes (you said you can't), design a full-text search by yourself or offload the search from MySQL and use Sphinx/Lucene. For Lucene you can use Zend_Search_Lucene implementation from Zend Framework or use Solr.
Normal indexes in MySQL are B+Trees, and they can't be used if the start of the string is not known (which is the case when you have a wildcard at the beginning).
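A rough way to see why this is so: an ordered index behaves like a sorted list of keys. The sketch below uses Python's bisect module on a sorted list as a simplified stand-in for a B+Tree (an assumption for illustration, not MySQL's actual implementation); prefix_lookup and contains_lookup are hypothetical names.

```python
import bisect

def prefix_lookup(sorted_keys, prefix):
    """Like LIKE 'prefix%': binary-search to the first candidate, then read
    neighbours while they still share the prefix."""
    i = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches

def contains_lookup(sorted_keys, fragment):
    """Like LIKE '%fragment%': the sort order says nothing about where the
    fragment might occur, so every key has to be inspected."""
    return [key for key in sorted_keys if fragment in key]

keys = sorted(["example", "exam", "sample", "xample", "counterexample"])

print(prefix_lookup(keys, "exam"))
print(contains_lookup(keys, "xampl"))
```

Keys sharing a prefix sit next to each other in sorted order, so the prefix search touches only a small slice. Keys containing "xampl" are scattered across the list, which is why the leading-wildcard search degrades to a full scan.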
Another option is to implement search on your own, using a reference table. Split the text into words and create a table that contains word, record_id. Then, when searching, split the query into words and look up each of the words in the reference table. This way you are not limiting yourself to the beginning of the whole text, but only to the beginning of each word (and you'll match the rest of the words anyway).
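The reference-table approach could look roughly like this. It is a minimal sketch that uses Python's sqlite3 module in place of MySQL so that it is self-contained; the table and column names (messages, message_words, word, record_id), the helper functions, and the sample messages are assumptions for illustration.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT);
    CREATE TABLE message_words (word TEXT, record_id INTEGER);
    CREATE INDEX idx_message_words_word ON message_words (word);
""")

def split_words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def add_message(record_id, text):
    """Store the message and one reference row per distinct word."""
    conn.execute("INSERT INTO messages VALUES (?, ?)", (record_id, text))
    conn.executemany(
        "INSERT INTO message_words VALUES (?, ?)",
        [(word, record_id) for word in set(split_words(text))],
    )

def search(query):
    """Return ids of messages containing a word starting with each query word."""
    result = None
    for word in split_words(query):
        # Prefix pattern only (no leading wildcard), so an ordinary index
        # on word is usable in principle.
        rows = conn.execute(
            "SELECT record_id FROM message_words WHERE word LIKE ?",
            (word + "%",),
        )
        ids = {row[0] for row in rows}
        result = ids if result is None else result & ids
    return sorted(result) if result else []

add_message(1, "Meeting moved to Friday")
add_message(2, "Example query for the meeting")
add_message(3, "Friday lunch")

print(search("meet fri"))
print(search("exam"))
print(search("eeting"))
```

The last call returns nothing: as the answer says, matching is anchored at the beginning of each word, so a fragment from the middle of a word is still not found.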
'%EXAMPLE_QUERY%' is a very, very bad idea. Here are some suggestions:
A. Avoid wildcards at the start of LIKE queries; use 'EXAMPLE_QUERY%' instead.
B. Create keywords on which you can easily use MATCH.
If you want to stick with using MySQL, you should use FULL TEXT indexes. Full text indexes index words in a text block. You can then search on word stems and return the results in order of relevance. So you can find the word "example" within a block of text, but you still can't search efficiently on "xampl" to find "example".
MySQL's full text search is not great, but it is functional.
http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
select * from emp where ename like '%e';
returns the rows whose ename ends with the letter e.
select * from emp where ename like 'A%';
returns the rows whose ename begins with the letter A.
select * from emp where ename like '_a%';
returns the rows whose ename has a as its second letter.
I found the following query in our MySQL slow query log:
SELECT target_status
FROM link_repository
WHERE target_url LIKE CONCAT('%', 'bundle/rpi_/activity/rpi_bridge/bridge_manual.pdf')
When I pointed this out to the developer manager in a conversation about slow page loads, he stated:
come on; concat() is a simple string concatenation and '%' is the wildcard in the search string. I know that searching strings is not the fastest of operations (that's why we have lucene-like engines, but this is trivial stuff)
There are about 18k rows in link_repository, which isn't much. The documentation I'm finding says that indexes on character strings don't help with leading wildcards. Is there an alternative strategy one can use?
For LIKE to use an index, the pattern has to start with a literal prefix. MySQL compares from left to right, so if the pattern starts with a wildcard, MySQL will do a table scan and no index will be used.
However, if you are using InnoDB tables, you can try a Full-Text Index.
You can add a Full-Text Index on the column, use MATCH ... AGAINST to find candidate rows, and then add a RIGHT() condition to keep only the results that end with your string.
CREATE FULLTEXT INDEX target_url ON link_repository(target_url);
Then you can query the records like so:
SELECT target_status
FROM link_repository
WHERE MATCH (target_url) AGAINST ('bundle/rpi_/activity/rpi_bridge/bridge_manual.pdf')
AND RIGHT(target_url, 49) = 'bundle/rpi_/activity/rpi_bridge/bridge_manual.pdf'