MySQL query shows no results with some bad words? - mysql

I am working on an adult site, and I have built an internal search for it.
The search uses this query:
SELECT SQL_CALC_FOUND_ROWS
id_photo, title, description, model, data_ins,
MATCH(title, description, model) AGAINST('".trim(strtolower(addslashes($_GET['q'])))."') as score
FROM ".$prefix."photo
WHERE MATCH(title, description, model) AGAINST('".trim(strtolower(addslashes($_GET['q'])))."')
ORDER BY score DESC LIMIT ".$start.", ".$step."
Everything works smoothly and without php or mysql errors, but the client pointed out a strange thing to me.
For example:
Searching for the word starting with "c" and ending with "ck" returns no results.
Searching for the word starting with "d" and ending with "ck" returns the correct results.
I use something similar to this to verify if there are results:
$photo_query_id = $db->prepare("my query");
$photo_query_id->execute();
if($photo_query_id->rowCount() < 1){
//...
}
The two words are both used hundreds of times in both titles and descriptions, so why does mysql sometimes prefer not to show results?
Is there a list of bad words in some MySQL config file that is blocking queries? If so, where do I find it and how do I modify it?

Use a BOOLEAN MODE search or use the InnoDB database engine for your table. When you do a natural language search against a MyISAM full-text index, words that appear in more than 50% of the rows are treated as stopwords.
From the documentation:
The 50% threshold can surprise you when you first try full-text searching to see how it works, and makes InnoDB tables more suited to experimentation with full-text searches. If you create a MyISAM table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results until the table contains more rows. Users who need to bypass the 50% limitation can build search indexes on InnoDB tables, or use the boolean search mode explained in Section 12.10.2, “Boolean Full-Text Searches”.
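As a minimal sketch of the BOOLEAN MODE suggestion, here is how the question's query could be assembled with bound parameters instead of addslashes(). It is written in Python for illustration; the function name build_boolean_search is hypothetical, the %s placeholders assume a MySQL driver that uses that parameter style, and the table prefix is assumed to come from trusted configuration rather than user input.

```python
def build_boolean_search(prefix, raw_query, start, step):
    """Return (sql, params) for a BOOLEAN MODE full-text search.

    `prefix` is assumed to be a trusted table prefix from configuration;
    it is interpolated into the SQL text, so it must never be user input.
    """
    term = raw_query.strip().lower()
    # The same MATCH expression is used for scoring and for filtering.
    match = "MATCH(title, description, model) AGAINST(%s IN BOOLEAN MODE)"
    sql = (
        "SELECT id_photo, title, description, model, data_ins, "
        f"{match} AS score "
        f"FROM {prefix}photo "
        f"WHERE {match} "
        "ORDER BY score DESC LIMIT %s, %s"
    )
    # The search term is bound twice because the placeholder appears twice.
    return sql, (term, term, int(start), int(step))
```

Note that in boolean mode characters such as +, -, * and " act as operators, so the application may want to strip or handle them before binding the term.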

Related

Realtime search against only 1 column in Mysql - without any plugins

I found some threads about this, but none of them fits my case.
I have a search field in my mobile app, where after text change, the real time search is running via calling my API.
The search request starts only if there are 3 or more characters entered and is searching ONLY in 1 DB column, called TITLE. So each time the user enters a letter, a query is searching for it.
Currently I have it like this (I know this solution is very bad). $searchedword is the word user entered:
if (!empty($searchedword) && strlen($searchedword) > 2) {
    $searchedword = strtolower($searchedword);
    $sql = "SELECT * FROM TABLE ";
    $result = $mysqli->query($sql);
    $output = '';
    if ($result->num_rows > 0) {
        while ($data = $result->fetch_array()) {
            $title = strtolower($data['title']);
            $content = $data['content'];
            if (strpos($title, $searchedword) !== false) { $output .= $title . ',' . $content; }
        }
    }
}
So this just checks whether the title from the DB contains the searched word. It works very well, but I think it performs badly: every time the user types a letter into the search field, all rows are fetched from the table and scanned for that word.
I want to rewrite my code for the best performance.
So my first question is: should I add a FULLTEXT index to the TITLE column? Will it help, or will it just use more disk space? I am searching against only 1 column, and it holds just a title (1 or 2 words max).
My second question: what is the best-performing query for my case, given that I need to search after each letter the user enters?
Can I use the search this way?
SELECT * FROM TABLE WHERE MATCH (title) AGAINST ('$searchedword' IN NATURAL LANGUAGE MODE)
However, it seems this returns results only if the word completely matches the title, and nothing when the word is just part of the title, so it is not a good solution.
The only solution which works is this:
SELECT * FROM TABLE WHERE title LIKE '%$searchedword%'
But what about performance? I also don't understand how this works: the searched word is converted to lowercase and I have removed its accents, while the TITLE column in the DB has accents and uppercase letters, yet this search works very well!
If your title column has a collation like utf8mb4_general_ci, you don't have to worry about dealing with upper case, lower case, and diacritical marks in your MySQL WHERE clauses. MySQL will do it for you. It is really good at handling character sets and collations in all kinds of languages. (Such things are very helpful to Swedish-language users, and MySQL was originally developed by a Swedish company.)
FULLTEXT with NATURAL LANGUAGE MODE is probably not the right approach for this application. It works on words, not chunks of letters. So it probably won't give you anything until your user has typed a whole word, and not a stop word. And, it is a little squirrely when you search a table with only a few rows. So, that might be a problem if you're just getting started.
It does order the results by the closeness of the match, so the most likely hit is the first one. So, if you know you have a phrase to search, it's good.
For your progressive-search application you may want to use one of these two LIKE queries.
SELECT title FROM tbl WHERE title LIKE CONCAT('$searchedword', '%') /*insecure*/
or this one which is much slower but finds your partial match anywhere in the title, not just at the beginning.
SELECT title FROM tbl WHERE title LIKE CONCAT('%', '$searchedword', '%') /*insecure*/
Avoid running these queries until you have gathered at least a few letters from your user, otherwise you'll get absurdly many results.
In these cases say SELECT title not SELECT *, and create an ordinary index on the title column. That way MySQL can satisfy the whole query from the index, which will make it much faster.
And, use MySQL's WHERE functionality to do the matching. Don't fetch the whole table from MySQL and search it in your php program.
And, use prepared statements. Because cybercreeps.
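A minimal sketch of the two LIKE queries above with the search word bound as a parameter, written in Python for illustration. The helper names escape_like and build_title_search are hypothetical, the table name tbl is taken from the answer, and the %s placeholder assumes a MySQL driver that uses that parameter style. Escaping % and _ keeps a user's literal wildcard characters from widening the match; backslash is MySQL's default LIKE escape character.

```python
def escape_like(term, escape="\\"):
    """Escape LIKE wildcards so user input is matched literally."""
    return (
        term.replace(escape, escape + escape)  # escape the escape char first
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def build_title_search(searched_word, anywhere=False, min_length=3):
    """Return (sql, params), or None when too few letters were typed."""
    word = searched_word.strip()
    if len(word) < min_length:
        return None
    # Prefix match by default; a leading % finds the word anywhere (slower).
    pattern = escape_like(word) + "%"
    if anywhere:
        pattern = "%" + pattern
    sql = "SELECT title FROM tbl WHERE title LIKE %s"
    return sql, (pattern,)
```

The pair returned by build_title_search can be passed to a driver's execute call; the None result covers the case where the user has typed fewer than three letters and no query should run.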

Using Full-Text Search instead of "Like '%____%' query

I am querying a table (about 150,000 rows and growing) with a big varchar field (size 2000) which can't be indexed (and there's no point even if it could be). I am using SQL Server 2008.
The query I used till now was:
select * from tbl_name
where field_name like '%bla bla%'
("bla bla" is according to what the user searched for)
In order to improve performance, I want to start using the Full-Text Search feature (I have already defined a catalog and a full-text index on this field).
I am a bit confused by what I have read about querying with this option.
What query should I use to get exactly the same results as the query I used before?
Comments:
I would like the results to be case-insensitive, as before (meaning if the user searches for "LG" he will also get results that contain "Lg").
If user enters "Sams" he will also get "Samsung".
Thanks!
Eran.
CONTAINS() will get you the LIKE() functionality you are seeking with one exception - I noticed in the comments that you also want to match the second entry - "hhhEranttt". Unfortunately, due to the lack of suffix search this is currently not possible.
For the other entries you can run a prefix search - CONTAINS(field_name, '"eran*"') which matches all the other entries since full-text searches are case-insensitive.
HTH.
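A minimal sketch of turning user input into the prefix term shown above, written in Python for illustration. The helper name contains_prefix_term is hypothetical, and the qmark-style ? placeholder in the comment assumes an ODBC-style driver. Double quotes are removed from the input because they delimit the term inside the CONTAINS search condition.

```python
def contains_prefix_term(user_input):
    """Build a quoted prefix term such as '"sams*"' for CONTAINS()."""
    # Drop double quotes and collapse whitespace so the term stays well formed.
    cleaned = " ".join(user_input.replace('"', " ").split())
    if not cleaned:
        return None
    return f'"{cleaned}*"'


# Assumed usage: bind the term instead of concatenating it into the SQL text.
SQL = "SELECT * FROM tbl_name WHERE CONTAINS(field_name, ?)"
```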

MySQL table MATCH AGAINST

Hi all,
I have created a simple table called classics in a DB called publications on XAMPP. I am trying to do a MATCH AGAINST search for an author name, which I thought I understood.
Also, I have made sure the table is FULLTEXT indexed on both the author and title columns, as required. The table is of type MyISAM.
I tried this and it failed.
SELECT author FROM classics WHERE MATCH(author) AGAINST('Charles');
I know Charles is present in the author column, as you can see, but I get no rows returned.
Now if I rewrite it to search for any other author, it works:
SELECT author FROM classics WHERE MATCH(author) AGAINST ('jane');
Here is what I get with jane...
I'm not sure, but it seemed earlier that I had to include both fields I'd indexed in the query, instead of being able to search author alone. Is this correct, and does anyone know why I can't get charles returned?
Many thanks!
It's not returning those rows because "charles" appears in 50% of the rows. This is a well-documented restriction of MySQL FULLTEXT search.
If you want to get around this restriction, you can use BOOLEAN MODE.
Here's the relevant excerpt from the manual:
A word that matches half of the rows in a table is less likely to locate relevant documents. In fact, it most likely finds plenty of irrelevant documents. We all know this happens far too often when we are trying to find something on the Internet with a search engine. It is with this reasoning that rows containing the word are assigned a low semantic value for the particular data set in which they occur. A given word may reach the 50% threshold in one data set but not another.
The 50% threshold has a significant implication when you first try full-text searching to see how it works: If you create a table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results. Be sure to insert at least three rows, and preferably many more. Users who need to bypass the 50% limitation can use the boolean search mode; see Section 12.9.2, “Boolean Full-Text Searches”.
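The arithmetic behind the rule can be sketched in a few lines of Python. This is a simplified model of the documented behaviour, not MySQL's actual implementation: it ignores stopword lists, minimum word length and relevance weights, and the function name and sample author rows are made up for illustration.

```python
def natural_language_matchable(rows, word):
    """Simplified model of the MyISAM 50% rule for natural language search.

    Returns False when the word is absent, or when it occurs in at least
    half of the rows and is therefore treated like a stopword.
    """
    word = word.lower()
    containing = sum(1 for text in rows if word in text.lower().split())
    if containing == 0:
        return False
    return containing / len(rows) < 0.5


# Hypothetical author column with four rows.
authors = ["Charles Dickens", "Charles Darwin", "Jane Austen", "Mark Twain"]
print(natural_language_matchable(authors, "charles"))  # 2 of 4 rows
print(natural_language_matchable(authors, "jane"))     # 1 of 4 rows
```

In this model a word found in 2 of 4 rows sits exactly on the threshold and is excluded, while a word found in 1 of 4 rows remains searchable.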

Aggregate most relevant results with MySQL's fulltext search across many tables

I am running fulltext queries on multiple tables on MySQL 5.5.22. The application uses innodb tables, so I have created a few MyISAM tables specifically for fulltext searches.
For example, some of my tables look like
account_search
===========
id
account_id
name
description
hobbies
interests
product_search
===========
id
product_id
name
type
description
reviews
As these tables are solely for fulltext search, they are denormalized. Data can come from multiple tables and is aggregated into the search table. Besides the ID columns, the rest of the columns are assigned to 1 fulltext index.
To work around the "50%" rule with fulltext searches, I am using IN BOOLEAN MODE.
So for the above, I would run:
SELECT *, MATCH(name, type, description, reviews) AGAINST('john') as relevance
FROM product_search
WHERE MATCH(name, type, description, reviews) AGAINST('john*' IN BOOLEAN MODE) LIMIT 10
SELECT *, MATCH(name, description, hobbies, interests) AGAINST('john') as relevance
FROM account_search
WHERE MATCH(name, description, hobbies, interests) AGAINST('john*' IN BOOLEAN MODE) LIMIT 10
Let's just assume that we have products called "john" as well :P
The problems I am facing are:
To get meaningful relevance, I need to use a search without IN BOOLEAN MODE. This means the search is subject to the 50% rule and the word length rules. So, quite often, if most of the products in the product_search table are called john, their relevance is returned as 0.
Relevances between multiple queries are not comparable. (I think a relevance of 14 from one query does not equal a relevance of 14 from another different query).
Searches will not be just limited to these 2 tables, there are other "object types", for example: "orders", "transactions", etc.
I would like to be able to return the top 7 most relevant results of ALL object types given a set of keywords (1 search box returns results for ALL objects).
Given the above, what are some algorithms, or perhaps even better ideas, for getting the top 7?
I know I can use things like Solr and Elasticsearch. I have already tried them and am in the process of integrating them into the application, but I would like to be able to provide search for those who only have access to MySQL.
So after thinking about this for a while, I decided that the relevance ranking has to be done with 1 query within MySQL.
This is because:
Relevance between separate queries can't be compared.
It's hard to combine the contents of multiple searches together in meaningful ways.
I have switched to using 1 index table dedicated to search. Entries are inserted, removed, and updated following inserts, removals and updates to the real underlying data in the InnoDB tables (this is all automatic).
The table looks like this:
search
==============
id //id for the entry
type //the table the data came from
column //column the data came from
type_id //id of the row in the original table
content //text
There's a full text index on the content column. It is important to realize that not all columns from all tables will be indexed; only things that I deem useful in search have been added.
Thus, it's just a simple case of running a query to match on content, retrieve what we have and do further processing. To process the final result, a few more queries would be required to ask the parent table for the title of the search result and perhaps some other meta data, but this is a workable solution.
I don't think this approach will really scale (updates and inserts will need to update this table as well), but I think it is a pretty good way to provide decent application wide search for smaller deployments of the application.
For scalability, use something like elastic search, solr or lucene.
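The "further processing" step for the single search table can be sketched as follows, in Python for illustration. It assumes each matching row has already been fetched as a (type, type_id, score) tuple; the function name top_results and the choice to sum the scores of several matching columns of the same object are assumptions, not part of the original design.

```python
import heapq
from collections import defaultdict


def top_results(hits, limit=7):
    """Combine per-column hits into per-object scores and keep the best ones.

    `hits` is an iterable of (type, type_id, score) tuples, one per matching
    row of the search table.
    """
    totals = defaultdict(float)
    for obj_type, type_id, score in hits:
        # Several columns of one object may match; add their scores together.
        totals[(obj_type, type_id)] += score
    return heapq.nlargest(limit, totals.items(), key=lambda item: item[1])
```

Each returned entry is a ((type, type_id), score) pair, which gives the follow-up queries enough information to look up titles and other metadata in the parent tables.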

How to perform search on MySQL table for a website

How do I perform a search similar to that of Wikipedia on a MySQL table (or several tables at a time) without crawling the database first? (Search on wikipedia used to show you the relevancy in percentage).
What I'm looking for here is how to determine relevancy of the results and sort them accordingly, especially in case where you pull data from several tables at a time.
What do you use for search function on your websites?
You can use MySQL's full-text search functionality. You need to have a FULLTEXT index on the fields to be searched. For natural language searches, it returns a relevance value which is "a similarity measure between the search string and the text in that row."
If you are searching multiple tables, the relevance value should be comparable across sets of results; you could do a UNION of individual fulltext queries on each of the tables, then sort the results of the union based on the relevance value.
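A minimal sketch of the UNION approach, written in Python for illustration. The function name build_union_search is hypothetical, the table and column names are assumed to come from trusted application configuration and to carry a matching FULLTEXT index, every table is assumed to have an id column, and the %s placeholders assume a MySQL driver that uses that parameter style. Whether scores from different tables are truly comparable is taken on the answer's word here; MySQL computes relevance per table.

```python
def build_union_search(tables, keywords, limit=7):
    """Return (sql, params) for one UNION ALL query over several tables.

    `tables` maps a trusted table name to its FULLTEXT-indexed columns.
    """
    parts = []
    params = []
    for table, columns in tables.items():
        cols = ", ".join(columns)
        parts.append(
            f"(SELECT '{table}' AS source, id, "
            f"MATCH({cols}) AGAINST(%s) AS relevance "
            f"FROM {table} WHERE MATCH({cols}) AGAINST(%s))"
        )
        # One binding for the score expression, one for the WHERE clause.
        params.extend([keywords, keywords])
    sql = " UNION ALL ".join(parts) + " ORDER BY relevance DESC LIMIT %s"
    params.append(int(limit))
    return sql, tuple(params)
```

The source column records which table each row came from, so the application can tell the object types apart after the combined sort.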