How to match keywords with SQL query? - mysql

I am building a website that accepts keywords from the user and outputs the matching data. For instance, if the user enters 'Restaurant pizza', my database should return the matching records.
My current table has a column called category and five columns named keyword1 to keyword5, which contain each record's specialized area, e.g. 'pizza', 'chicken' or 'bbq'.
But I have no idea how to write the SQL query, since the user may enter the keywords in any order: category first or specialized area first.
A query like the following will surely return no results (given that the user entered 'Restaurant pizza'):
SELECT *
FROM message
WHERE category LIKE 'Restaurant pizza'
OR keyword1 LIKE 'Restaurant pizza'
OR keyword2 LIKE 'Restaurant pizza'
I guess it would be a bad idea to split the input into words and then run every word through the WHERE clause, but I really do not know how else to achieve my goal.
In addition, could you give me some advice on how to build an index for this scenario?

You should create a FULLTEXT index on the category and keyword columns. Then, when querying, split the query string on the delimiter (the space character) and build a query something like:
SELECT * FROM items
WHERE
MATCH (category,keyword1,keyword2,keyword3,keyword4,keyword5)
AGAINST ('pizza')
AND
MATCH (category,keyword1,keyword2,keyword3,keyword4,keyword5)
AGAINST ('restaurant');
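The split-and-build step could be sketched on the application side roughly as follows. This is only an illustration: the function name build_fulltext_query is hypothetical, and the %s placeholder style assumes a MySQL driver that uses it (others use ?). The words are passed as parameters rather than pasted into the SQL string.

```python
def build_fulltext_query(user_input, table="items"):
    """Return (sql, params) that require every word of user_input to match."""
    columns = "category,keyword1,keyword2,keyword3,keyword4,keyword5"
    words = user_input.split()  # split on whitespace
    if not words:
        raise ValueError("no keywords given")
    # One MATCH ... AGAINST clause per word, joined with AND.
    clause = f"MATCH ({columns}) AGAINST (%s)"
    where = "\n  AND ".join([clause] * len(words))
    sql = f"SELECT * FROM {table}\nWHERE {where}"
    return sql, words


sql, params = build_fulltext_query("Restaurant pizza")
print(sql)
print(params)
```

The resulting sql and params would then be handed to the driver's execute call, so the order in which the user typed the words does not matter.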

You may try this:
SELECT *
FROM message
WHERE `category` LIKE '%Restaurant%'
AND (`keyword1` IN ('Pizza','chicken','bbq')
OR `keyword2` IN ('Pizza','chicken','bbq'))

Related

Using a SQL wildcard ignoring non-alphanumeric characters in a column

I have a two-column table:
product name
cars ["bmw", "mazda", "ford", "dodge"]
fruit ["lemon", "orange", "lime", "apple"]
I'm using a wildcard to search the product's name column. My question is, is there a way to search a column only by alphanumeric characters and ignore the " and [ ]?
For example, if the user searches for bmw, the query would be LIKE '%bmw%' and it would return cars. However, if the user searches for bmw" (query: LIKE '%bmw"%') or enters dodge"] (query: LIKE '%dodge"]%'), I want it to return no results.
My current query:
SELECT product, name FROM `test1` WHERE name LIKE '%bmw%'
It doesn't need to be a wildcard basically, I am after the query only providing the product if the exact name is used but because of the format of the name column it's giving me problems.
You might want to clean the data before it's being added to the query.
For example, you can use a regex to replace unwanted characters, or to allow only certain characters in the text.
Then you can add that "cleaned" data as a parameter to that query.
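A minimal sketch of that cleaning step, assuming only ASCII letters and digits are valid in a name. The helper names are hypothetical, and wrapping the cleaned term in double quotes is an extra assumption that relies on the names being stored as quoted strings, as in the sample rows.

```python
import re


def clean_search_term(raw):
    """Drop everything except ASCII letters and digits."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)


def exact_name_pattern(raw):
    """LIKE pattern that only matches a complete quoted name (assumption)."""
    return '%"' + clean_search_term(raw) + '"%'


# The cleaned value is passed as a parameter, not concatenated into the SQL.
sql = "SELECT product, name FROM test1 WHERE name LIKE %s"
params = (exact_name_pattern('dodge"]'),)
print(params)
```

With the quotes included in the pattern, a partial name such as bm would no longer match "bmw".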
As suggested, JSON_SEARCH() is what I needed to achieve my desired outcome.
See: https://database.guide/json_search-find-the-path-to-a-string-in-a-json-document-in-mysql/
Specifically Example 5 – Wildcards.

Using a MySQL Select with a RegEx Expression embedded within a String

I am using PHP to access a mysql database field that contains up to 2500 characters per record.
I want to build queries that will return only the records that include a single word, like 'taco'.
Sometimes, however, the user will need to search for a word like 'jalapeno'. Except that jalapeno may exist in the database as 'jalapeno' or as 'jalapeño'. The query should return both instances.
As a further complication, the user may also need to search for a word like 'creme', which may appear as 'creme' or 'créme', but never as 'crémé'.
It seems like I should be able to construct something that uses a replace, and then a Regular Expression, so that the letter 'n' is always replaced with '[n|ñ]', and then search for a string with an embedded Regular Expression like this: 'jalape[n|ñ]o'. Except that does not work. MySQL treats the RegEx syntax as literals.
None of the following return the results that I am looking for:
SELECT id, record FROM table WHERE record like '%jalapeno%';
SELECT id, record FROM table WHERE record REGEXP 'jalapeno';
SELECT id, record FROM table WHERE record REGEXP 'jalape[n|ñ]o';
SELECT id, record FROM table WHERE REGEXP_LIKE(record, 'jalape[n|ñ]o', 'im');
Additionally, I can use PHP to do a replacement of the potential characters, but I end up with stuff like this:
SELECT id, record FROM table WHERE (record like '%creme%' || record like '%crémé%');
I would be Ok with a search like this, but it seems overly complicated to construct programmatically:
SELECT id, record FROM table WHERE (record like '%creme%' || record like '%crémé%' || record like '%cremé%' || record like '%créme%' );
Is there a MySQL method that provides a REGEX 'OR' to be embedded within a String?
Maybe something like this:
SELECT id, record FROM table WHERE record like '%cr[e|é]m[e|é]%' ;
Or is there another solution that would not require the construction of an excessively convoluted SQL Statement?
Thanks for anyone who spent time trying to figure this out.
As I commented above, REGEXP_LIKE() is not available in the MySQL release I am using; it was only added in MySQL 8.0.
Here is my solution. Note that this works on MySQL 5.7.x.
SELECT id, record FROM table WHERE record RLIKE 'jalape(n|ñ)o';
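Building such a pattern programmatically might look like the sketch below. The VARIANTS table and the function name are hypothetical and would need to be extended to whatever accented forms actually occur in the data. The result is meant to be passed as the parameter of an RLIKE query like the one above.

```python
import re

# Hypothetical mapping from a plain letter to the accented variant to accept.
VARIANTS = {"n": "ñ", "e": "é"}


def accent_pattern(word):
    """Turn 'jalapeno' into a pattern with (x|y) groups for mapped letters."""
    parts = []
    for ch in word:
        if ch in VARIANTS:
            parts.append(f"({ch}|{VARIANTS[ch]})")
        else:
            parts.append(re.escape(ch))  # keep other characters literal
    return "".join(parts)


pattern = accent_pattern("jalapeno")
print(pattern)  # jalap(e|é)(n|ñ)o
```

Note that this accepts every combination of the mapped letters, so a search for creme would also accept crémé. If that matters, the mapping would have to be applied per position rather than per letter.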

Search XML feed's description against a keyword from database

I'm working on a project where I use XML feeds as input. I have to filter the items whose title or description matches specific keywords. If an item contains smart phone in its title or description, I have to add that item to the database under the category "Smart phone".
The query I use here is:
$title = $item->title;
$desc = $item->description;
$sql = "SELECT cid FROM tbl_keyword WHERE MATCH(keyword) AGAINST ('" . $title . " " . $desc . "' IN BOOLEAN MODE)";
The query returns a value, but it also picks up other rows from the database, such as smart watch and smart toys.
I want to know how to make the search respect the space, so that the keyword is treated as a phrase.
The query has to match the exact keyword.
The table looks like this:
id cid keyword
1 6 smart phone
2 6 iphone
3 7 smart watch
When I get a title such as "Smart phones are not essential", the query should return only cid 6.
How can I implement this?
As far as I know, you can't just search the whole sentence ("Smart phones are not essential") against the database to get that exact result.
There are two ways to do this:
1. (Recommended) Break the sentence on spaces ("Smart", "phones", "are", "not", "essential") and then run the following query:
select * from tbl_keyword where keyword like "%smart%" or keyword like "%phones%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%"
This query returns a list of possible entries from the database. You then need to compare those results with your sentence in your programming language.
2. This way returns the entry directly from the database (but in the worst case it may rule out some important entries).
Break the sentence down into single words, like this: ("Smart", "phones", "are", "not", "essential")
Then break it into two-word pairs, like this: ("Smart phones", "phones are", "are not", "not essential", "essential Smart", "Smart are", "Smart not", "phones not")
Use both lists to retrieve entries from the database (this just narrows down the filter):
select * from tbl_keyword where (keyword like "%smart%" or keyword like "%phones%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%") and (keyword like "%smart phones%" or keyword like "%phones are%" or keyword like "%are not%" or keyword like "%not essential%" or keyword like "%essential smart%")
Hope this will help you ...
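The comparison step of the first approach, done in the application after the candidate rows come back, could be sketched like this. The function name and the (cid, keyword) row shape are assumptions. The check is a plain case-insensitive substring test, so "smart phone" is found inside "Smart phones".

```python
def matching_cids(sentence, rows):
    """rows: iterable of (cid, keyword) tuples fetched from tbl_keyword."""
    text = sentence.lower()
    # Keep a cid only if its whole keyword phrase occurs in the sentence.
    return sorted({cid for cid, keyword in rows if keyword.lower() in text})


rows = [(6, "smart phone"), (6, "iphone"), (7, "smart watch")]
print(matching_cids("Smart phones are not essential", rows))  # [6]
```

Because this is a substring test, a very short keyword such as "art" would also match inside "smart". A word-boundary check would be needed if that is a concern.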

MySQL search within the last 5 characters in a column?

My user table has a column "name" which contains information like this:
Joe Lee
Angela White
I want to search for either first name or last name efficiently. First name is easy, I can do
SELECT * FROM user WHERE name LIKE "ABC%"
But for last name, if I do
SELECT * FROM user WHERE name LIKE "%ABC"
That would be extremely slow.
So I am thinking about counting the characters of the input, for example, "ABC" has 3 characters, and if I can search only the last three characters in name column, that would be great. So I want something like
SELECT * FROM user WHERE substring(name, end-3, end) LIKE "ABC%"
Is there anything in MySQL that can do this?
Thanks so much!
PS. I cannot do fulltext because our search engine doesn't support that.
The reason that
WHERE name LIKE '%ith'
is a slow way to look for 'John Smith' by last name is the same reason that
WHERE Right(name, InStr(name, ' ' )) LIKE 'smi%'
or any other expression on the column is slow. It defeats the use of the index for quick lookup and leaves the MySQL server doing a full table scan or full index scan.
If you were using Oracle (that is, if you worked for a formerly wealthy employer) you could use function indexes. As it is you have to add some extra columns or some other helping data to accelerate your search.
Your smartest move is to split your first and last names into separate columns. Several other people have pointed out good reasons for doing that.
If you can't do that you could try creating an extra column which contains the name string reversed, and create an index on that column. That column will have, for example, 'John Smith' stored as 'htimS nhoJ'. Then you can search as follows.
WHERE nameReversed LIKE CONCAT(REVERSE('ith'),'%')
This search will use the index and be decently fast. I've had good success with it.
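A small runnable sketch of the reversed-column idea. It uses Python's built-in sqlite3 purely so the example is self-contained. The table layout, index name and helper name are assumptions, and the string reversal is done in Python rather than with MySQL's REVERSE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (name TEXT, nameReversed TEXT)")
conn.execute("CREATE INDEX idx_name_reversed ON user (nameReversed)")

# Store the reversed string alongside the original when the row is written.
for name in ["Joe Lee", "Angela White", "John Smith"]:
    conn.execute("INSERT INTO user VALUES (?, ?)", (name, name[::-1]))


def find_by_name_ending(conn, suffix):
    """Search by name ending, expressed as a prefix match on nameReversed."""
    pattern = suffix[::-1] + "%"  # 'ith' becomes 'hti%'
    rows = conn.execute(
        "SELECT name FROM user WHERE nameReversed LIKE ?", (pattern,)
    )
    return [row[0] for row in rows]


print(find_by_name_ending(conn, "ith"))  # ['John Smith']
```

The point of the design is that the wildcard ends up at the end of the pattern, which is the form an ordinary index can serve.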
You're close. In MySQL you should be able to use InStr(str, substr) and Substring(str, pos) to do the following:
SELECT * FROM user WHERE Substring(name, InStr(name, ' ') + 1) LIKE 'ABC%'
InStr(name, ' ') returns the position of the first space character. Substring() then returns everything after that position, which is the last name (basically; problems arise when you have multiple names, multiple spaces etc). LIKE 'ABC%' then matches a last name starting with ABC.
You cannot use a fixed offset as you suggest, because names longer or shorter than 3 characters would not be matched properly.
However, as Zane said, it's a much better practice to use separate fields.
If it is a MyISAM table, you may use full-text search to do the same.
You can use the REGEXP operator:
SELECT * FROM user WHERE name REGEXP "ABC$"
http://dev.mysql.com/doc/refman/5.1/en/regexp.html

MySQL UNION query correct handling for 3 or more words

I need your help to solve this problem.
My website has a search field. Let's say the user types in "Korg X 50".
In my database, the table "products" has a field "name" that holds "X50" and a field "brand" that holds "Korg". Is there a way to use UNION to get the correct record?
And what if the user enters "Korg X-50"?
Thank you very much !
Matteo
Maybe you should use full-text search:
SELECT brand, name, MATCH (brand,name) AGAINST ('Korg X 50') AS score
FROM products WHERE MATCH (brand,name) AGAINST ('Korg X 50')
As far as I understand you don't need UNION but something like
SELECT * FROM table1
WHERE CONCAT(field1, field2) LIKE '%your_string%'
On the client side, strip from your_string every character (space, hyphen, etc.) that may appear in user input but cannot appear in field1 or field2.
So, user input Korg X 50 as well as Korg X-50 becomes KorgX50.
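That client-side normalization could be sketched as below, assuming brand and name contain only ASCII letters and digits. The function names are hypothetical. The returned value would be bound as the parameter of the CONCAT(...) LIKE query above.

```python
import re


def normalize(user_input):
    """Remove spaces, hyphens and anything else that is not a letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", user_input)


def like_param(user_input):
    """Build the '%...%' value to bind to the LIKE placeholder."""
    return "%" + normalize(user_input) + "%"


print(normalize("Korg X 50"))   # KorgX50
print(normalize("Korg X-50"))   # KorgX50
print(like_param("Korg X-50"))  # %KorgX50%
```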
You will need to get some form of searchable text.
Either parse the input into multiple keywords and match each one separately, or append them all together and match against the columns appended in the same way.
You will also need either a regex, or maybe a simpler search and replace, to get rid of spaces and dashes after the append and before the comparison.
In general, allowing users to search for open-ended text strings is more complicated than "what union do I use". Ideally you will also worry about slight misspellings, capitalization, and keyword order.
You may consider pulling all keywords out of your normal record into a separate keyword list associated with each product, then using that list to perform your searches.
If you do not want to parse the user input and want to use it as it is, then you will need a query like this:
select * from products where concat_ws(' ',brand,name) = 'user_input' -- or
select * from products where concat_ws(' ',brand,name) like '%user_input%'
However, this query won't return a result if the user enters "Korg X-50" and your table contains "Korg" and "X50"; you will need to do something else to achieve that. You may look at http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex, although it won't be a complete solution. For that, look at text indexing libraries, e.g. Lucene.