How to implement phrase match in RediSearch?

I'm trying to implement a targeting feature on our ad server (DSP). Currently, our ads are attached to targeted keywords, e.g. "Baseball caps", and all of these keywords are indexed in RediSearch. When a user on the publisher side searches for, e.g., "Baseball caps for girls", our ad server should return the ads attached to keywords contained in that search, like the example ad above. But I have a problem figuring out how to write the queries:
FT.SEARCH dummyIndex "@Keyword: Baseball caps for girls" won't return the example ad with "Baseball caps", since the query string has extra words that do not exist within the keyword.
Using | (OR) won't work either, because
FT.SEARCH dummyIndex "@Keyword: (Baseball | girls)" will return the example ad, but it should not, since the query string does not contain "caps".
Is there any way I can achieve this using RediSearch? Please note that indexing the user query strings rather than the keywords, and then searching for the keywords within that index, is not an option.
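To pin down the matching rule being asked for, here is a minimal sketch in plain Python. It is not a RediSearch query and does not solve the problem inside the index; it only illustrates the intended semantics, assuming whitespace tokenization, case-insensitive comparison, and that a keyword must appear as a contiguous phrase in the user's search. The function names are hypothetical.

```python
def normalize(text):
    """Lower-case the text and split it into words on whitespace."""
    return text.lower().split()


def keyword_matches(keyword, query):
    """Return True if the keyword's words occur as a contiguous run in the query."""
    kw, q = normalize(keyword), normalize(query)
    if not kw:
        return False
    n = len(kw)
    return any(q[i:i + n] == kw for i in range(len(q) - n + 1))


def matching_keywords(keywords, query):
    """Filter the indexed keywords down to those contained in the query as a phrase."""
    return [k for k in keywords if keyword_matches(k, query)]


# Hypothetical keyword set, for illustration only.
keywords = ["Baseball caps", "Baseball bats", "Caps for girls"]
print(matching_keywords(keywords, "Baseball caps for girls"))
print(matching_keywords(keywords, "Baseball for girls"))
```

With this rule, "Baseball caps" matches the search "Baseball caps for girls" but not "Baseball for girls", which is the behaviour the two FT.SEARCH attempts above fail to produce.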

Related

Search in multiple column with at least 2 words in keyword

I have a table that stores some data. This is my table structure:
Course    Location
Wolden    New York
Sertigo   Seattle
Monad     Chicago
Donner    Texas
I want to search that table with, for example, the keyword Sertigo Seattle, and have it return row number two as the result.
I have this query, but it doesn't work:
SELECT * FROM courses_data a WHERE CONCAT_WS(' ', a.Courses, a.Location) LIKE '%Sertigo Seattle%'
Does anyone know how to write a query that achieves this?
If you want to search against the course and location then use:
SELECT *
FROM courses_data
WHERE Course = 'Sertigo' AND Location = 'Seattle';
Efficient searching is usually implemented by preparing the search string before running the actual search:
You split the search string "Sertigo Seattle" into two words: "Sertigo" and "Seattle". You trim those words (remove enclosing whitespace characters). You might also want to normalize the words, for example by converting them to lower case to implement a case-insensitive search.
Then you run a search for the discrete words:
SELECT *
FROM courses_data
WHERE
(Course = 'Sertigo' AND Location = 'Seattle')
OR
(Course = 'Seattle' AND Location = 'Sertigo');
Of course that query is created using a prepared statement and parameter binding, using the extracted and trimmed words as dynamic parameters.
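The steps above (split, trim, normalize, then bind the words as parameters) could look roughly like this. It is only a sketch: it uses Python's built-in sqlite3 module with an in-memory table so that it runs anywhere, whereas the question is about MySQL, and the helper names `prepare_words` and `find_course` are made up for illustration.

```python
import sqlite3


def prepare_words(search):
    """Split on whitespace (which also trims each word) and lower-case the words."""
    return [word.lower() for word in search.split()]


def find_course(conn, search):
    """Match the two search words against Course and Location in either order."""
    words = prepare_words(search)
    if len(words) != 2:
        raise ValueError("this sketch expects exactly two search words")
    first, second = words
    sql = (
        "SELECT Course, Location FROM courses_data "
        "WHERE (Course = ? AND Location = ?) "
        "OR (Course = ? AND Location = ?)"
    )
    # The words are bound as parameters, never concatenated into the SQL text.
    return conn.execute(sql, (first, second, second, first)).fetchall()


conn = sqlite3.connect(":memory:")
# COLLATE NOCASE makes '=' case-insensitive in SQLite, similar to MySQL's
# default case-insensitive collations.
conn.execute(
    "CREATE TABLE courses_data ("
    "Course TEXT COLLATE NOCASE, Location TEXT COLLATE NOCASE)"
)
conn.executemany(
    "INSERT INTO courses_data VALUES (?, ?)",
    [("Wolden", "New York"), ("Sertigo", "Seattle"),
     ("Monad", "Chicago"), ("Donner", "Texas")],
)
print(find_course(conn, "  Seattle   Sertigo "))
```

Because the comparison is a plain equality on the columns, an index on Course and Location can be used, which is the point of preparing the words first.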
This is much more efficient than a wildcard-based search with the LIKE operator, because the database engine can make use of the indexes you (hopefully) created for that table. You can check that with the EXPLAIN feature MySQL offers.
It also makes sense to measure performance: run the different search approaches in a loop, say 1000 times, and measure the time taken. You will get a clear and meaningful result. Monitoring CPU and memory usage during such a test is also of interest.
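A measurement loop of that kind might be sketched as follows. This again uses an in-memory SQLite table with made-up data purely so the example is runnable; the absolute numbers say nothing about MySQL, and the table contents, index name and `time_query` helper are assumptions for illustration.

```python
import sqlite3
import time


def time_query(conn, sql, params, runs=1000):
    """Run the same query `runs` times and return the total elapsed seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql, params).fetchall()
    return time.perf_counter() - start


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses_data (Course TEXT, Location TEXT)")
conn.execute("CREATE INDEX idx_course_location ON courses_data (Course, Location)")
# Synthetic rows, only there to give the queries something to scan.
conn.executemany(
    "INSERT INTO courses_data VALUES (?, ?)",
    [(f"Course{i}", f"City{i % 50}") for i in range(2000)],
)

exact = time_query(
    conn,
    "SELECT * FROM courses_data WHERE Course = ? AND Location = ?",
    ("Course42", "City42"),
)
wildcard = time_query(
    conn,
    "SELECT * FROM courses_data WHERE Course || ' ' || Location LIKE ?",
    ("%Course42 City42%",),
)
print(f"exact match: {exact:.4f}s, wildcard LIKE: {wildcard:.4f}s")
```

On a real MySQL server you would run the same kind of loop against your own table and compare the timings alongside the EXPLAIN output.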

How to use fulltext matching for emails in MySQL?

I'm adding a "search by email" functionality on my contacts database table (MySQL, InnoDB engine) for our CRM system. I added the FULLTEXT index and partial searches like this
WHERE MATCH(cc.email_c) AGAINST('Joe' IN NATURAL LANGUAGE MODE)
are working fine. However, if I try to match against a full email address like this
WHERE MATCH(cc.email_c) AGAINST('Joe@gmail.com' IN NATURAL LANGUAGE MODE)
all email addresses with Gmail get returned. I've already ruled out BOOLEAN MODE; from what I understand, it doesn't support the "@" symbol in a query. Is there any other way of using MATCH ... AGAINST while still being able to search by the exact full address?
I can always use a SQL query with LIKE and some wildcards, but I would still prefer to have a full-text search.
Convert it to AGAINST('+Joe +gmail +com' IN BOOLEAN MODE)
But beware: short names and "stop words" should not have a '+' in front of them.
Then, because it won't be precise enough, combine it with a LIKE:
WHERE MATCH(email) AGAINST('+Joe +gmail +com' IN BOOLEAN MODE)
AND email LIKE '%Joe@gmail.com%'
The first part gives you speed; the second part will test only the ones that pass the first test.
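Building the boolean-mode string from an address could be sketched like this. The minimum token length and the stopword set are assumptions you would replace with your server's actual settings, and `build_boolean_query` is a hypothetical helper name. Tokens that are too short or are stopwords are left without the '+', following the warning above.

```python
import re


def build_boolean_query(email, min_token_len=3, stopwords=frozenset()):
    """Turn an email address into a boolean-mode search string.

    min_token_len and stopwords are assumptions: set them to match the
    full-text configuration of your own server.
    """
    # Split on anything that is not a letter or digit ('@', '.', '-', ...).
    tokens = [t for t in re.split(r"[^0-9A-Za-z]+", email) if t]
    parts = []
    for token in tokens:
        required = len(token) >= min_token_len and token.lower() not in stopwords
        # Only tokens that can actually be in the index get a leading '+'.
        parts.append(("+" if required else "") + token)
    return " ".join(parts)


print(build_boolean_query("Joe@gmail.com"))
print(build_boolean_query("Joe@gmail.com", stopwords={"com"}))
print(build_boolean_query("Al@gmail.com"))
```

The resulting string would be bound as the AGAINST parameter, with the full address bound separately for the LIKE part.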

Search XML feed's description against a keyword from database

I'm working on a project where I use XML feeds to get input. I have to filter the items whose title or description matches specific keywords. If an item contains smart phone in its title or description, I have to add that item to the database under the category "Smart phone".
The query I use here is
$title = $item->title;
$desc = $item->description;
SELECT cid FROM tbl_keyword WHERE MATCH(keyword) AGAINST ('".$title." ".$desc."' IN BOOLEAN MODE);
The query returns a value, but it also picks up other rows from the database, such as smart watch and smart toys.
I want to know how to make the search respect spaces, so that the query matches the exact keyword.
The table looks like this:
id  cid  keyword
1   6    smart phone
2   6    iphone
3   7    smart watch
When I get a title such as "Smart phones are not essential", the query should return only cid 6.
How can I implement this?
As far as I know, you can't just search the whole sentence ("Smart phones are not essential") against the database and get that exact result.
There are two ways to do this:
1. (Recommended) Break the sentence on spaces ("Smart", "Phones", "are", "not", "essential") and then apply the following query:
select * from tbl_keyword where keyword like "%smart%" or keyword like "%phones%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%"
This query outputs a list of possible entries from the database. You then need to compare these results with your query sentence in your programming language.
2. This way outputs the entry directly from the database (but in the worst case it may rule out some important entries).
Break the sentence down into single words like this: ("Smart", "Phones", "are", "not", "essential")
Then break the sentence into word pairs like this: ("Smart Phones", "Phones are", "are not", "not essential", "essential Smart", "Smart are", "Smart not", "Phones not")
Use both of these to retrieve entries from the database (this just narrows down the filter):
select * from tbl_keyword where (keyword like "%smart%" or keyword like "%phones%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%") and (keyword like "%smart phones%" or keyword like "%phones are%" or keyword like "%are not%" or keyword like "%not essential%" or keyword like "%essential smart%")
Hope this will help you ...
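For the first (recommended) approach, the "fetch candidates, then compare in your programming language" step could be sketched as below. It uses Python with an in-memory SQLite copy of the table from the question so that it is runnable; `find_matching_cids` is a hypothetical name, and the comparison is a deliberately loose lower-cased substring test, which is an assumption about what "match" should mean here.

```python
import sqlite3


def find_matching_cids(conn, sentence):
    """Fetch candidate keywords per word, then keep those contained in the sentence."""
    words = sentence.lower().split()
    if not words:
        return []
    # One "keyword LIKE ?" per word, joined with OR, with bound parameters.
    where = " OR ".join("keyword LIKE ?" for _ in words)
    rows = conn.execute(
        f"SELECT cid, keyword FROM tbl_keyword WHERE {where}",
        [f"%{w}%" for w in words],
    ).fetchall()
    text = " ".join(words)
    # Substring test: "smart phone" is found inside "smart phones ...",
    # while "smart watch" is not.
    return sorted({cid for cid, keyword in rows if keyword.lower() in text})


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_keyword (id INTEGER, cid INTEGER, keyword TEXT)")
conn.executemany(
    "INSERT INTO tbl_keyword VALUES (?, ?, ?)",
    [(1, 6, "smart phone"), (2, 6, "iphone"), (3, 7, "smart watch")],
)
print(find_matching_cids(conn, "Smart phones are not essential"))
```

A plain substring test also matches inside longer words (a keyword "art" would be found in "smart"), so a stricter word-boundary comparison may be needed depending on your keywords.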

mysql boolean mode fulltext search with wildcards and literals

I'm pretty new to MySQL full-text searches and I ran into this problem today:
My company table has a record with "e-magazine AG" in the name column. I have a full-text index on the name column.
When I execute this query the record is not found:
SELECT id, name FROM company WHERE MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE);
I need to work with quotes because of the dash and to use the wildcard because I implement a "search as you type" functionality.
When I search for the whole term "e-magazine AG", the record is found.
Any ideas what I'm doing wrong here? I read about adding the dash to the list of word characters (config update needed) but I'm searching for a way to do this programmatically.
This clause
MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE);
will search for "e" AND NOT "magazi"; i.e. the - inside "e-magazi" is interpreted as a NOT operator even though it is inside quotation marks.
For this reason it will not work as expected.
A solution is to apply an extra HAVING clause with a LIKE.
I know HAVING is slow, but it is only applied to the results of the MATCH, so not too many rows should be involved.
I suggest something like:
SELECT id, name
FROM company
WHERE MATCH(name) AGAINST('magazine' IN BOOLEAN MODE)
HAVING name LIKE '%e-magazi%';
MySQL fulltext treats the word e-magazine in a text as a phrase and not as a word. Because of that, it is split into the two words e and magazine. And while it builds the search index, it does not add the e to the index because of ft_min_word_len (default is 4 characters).
The same length limitation applies to the search query. That is the reason why a search for e-magazine returns exactly the same results as a-magazine: the a and the - are fully ignored.
But now you want to find the exact phrase e-magazine. For that you use quotes, and that is the completely correct way to find phrases, but MySQL does not support operators for phrases, only for words:
https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html
With this modifier, certain characters have special meaning at the beginning or end of words in the search string
Some people would suggest using the following query:
SELECT id, name
FROM company
WHERE MATCH(name) AGAINST('e-magazi*' IN BOOLEAN MODE)
HAVING name LIKE 'e-magazi%';
As I said, MySQL ignores the e- and searches for the wildcard word magazi*. After those results are obtained, it uses HAVING to additionally filter the results for e-magazi*, including the e-. That way you will find the phrase e-magazine AG. Of course, HAVING is only needed if the search phrase contains the wildcard operator, and you should never use quotes. This operator is used by your user and not you!
Note: As long as you do not surround the search phrase with %, it will find only fields that start with that word. And you do not want to surround it, because it would find bee-magazine as well. So maybe you need additional conditions in the HAVING clause, such as OR name LIKE '% e-magazi%' OR name LIKE '%\\ne-magazi%', to make it usable inside of texts.
Trick
But finally I prefer a trick so HAVING isn't needed at all:
If you add texts to your database table, add them additionally to a separate fulltext indexed column and replace words like up-to-date with up-to-date uptodate.
If a user searches for up-to-date replace it in the query with uptodate.
That way you can still find "specific" in "user-specific", but also "up-to-date" as a whole (and not only "date").
Bonus
If a user searches for -well-known huge ports, MySQL treats that as: must not include "well", may include "known" and "huge". Of course you could solve that with another extra query variant as well, but with the trick above you remove the hyphen, so the search query simply looks like this:
SELECT id
FROM texts
WHERE MATCH(text) AGAINST('-wellknown huge ports' IN BOOLEAN MODE)
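The two rewrites behind this trick (one when storing text, one when handling the user's query) could be sketched in Python as follows. The regular expression and the function names `expand_for_index` and `rewrite_query` are assumptions for illustration; the pattern treats any run of word characters joined by hyphens as one hyphenated word.

```python
import re

# Two or more word parts joined by hyphens, e.g. "up-to-date".
HYPHENATED = re.compile(r"\w+(?:-\w+)+")


def expand_for_index(text):
    """Append the de-hyphenated form after each hyphenated word.

    The result is what would be stored in the separate fulltext-indexed column.
    """
    return HYPHENATED.sub(
        lambda m: f"{m.group(0)} {m.group(0).replace('-', '')}", text
    )


def rewrite_query(query):
    """Replace each hyphenated word in a user query with its de-hyphenated form.

    A leading '-' operator is not part of the word, so it is left in place.
    """
    return HYPHENATED.sub(lambda m: m.group(0).replace("-", ""), query)


print(expand_for_index("an up-to-date list"))
print(rewrite_query("up-to-date"))
print(rewrite_query("-well-known huge ports"))
```

The last call keeps the leading minus as the boolean NOT operator while collapsing well-known into a single searchable word.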

Use REGEXP in MySQL to match keywords for search engine in random order

I'm trying to use a regular expression to match a user entered search string to a title of an entry in my MySQL database.
For example, I have the following rows in a table in my database:
id title
1 IM2 - Article 3 Funky Business
2 IM2 - Article 4 There's no Business That's not Show Business
3 IM2 - There's no Business That's not Show Business
4 CO4 - Life's a business
When a user searches for "IM Article Business", the following query will be executed (spaces are replaced by "(.*)" using str_replace):
SELECT * FROM mytable WHERE title REGEXP 'IM(.*)Article(.*)Business'
This will return the first 2 rows.
Now, I want it to show the same results when a user uses the same words, but in another order, for example: "Business IM Article". The results MUST contain all words entered, only the order of how the words are entered shouldn't matter.
I couldn't figure out how to do it in any way and hoped regular expressions would be the answer. I've never used them before, so does anybody know how to do this?
Thanks,
Pascal
This isn't something regular expressions are great at. Fortunately, it's something SQL is pretty good at. (I'm not going to use MySQL's REGEXP keyword, which I didn't even know existed, and will instead use the SQL-standard "%" wildcard matching with LIKE.)
select * from mytable where title like '%IM%' and title like '%Article%' and title like '%Business%'
Now title has to contain all three strings, but you haven't specified an order. Exactly what you want.
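Generating that query from whatever the user typed could look like the sketch below: one LIKE condition per word, joined with AND, with the words bound as parameters. It runs against an in-memory SQLite copy of the example rows so it can be tried directly; `search_titles` and `escape_like` are hypothetical helper names, and the explicit ESCAPE clause is written for SQLite (MySQL already uses the backslash as its default LIKE escape character).

```python
import sqlite3


def escape_like(word):
    """Escape LIKE wildcards so the user's input is matched literally."""
    return word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_titles(conn, search):
    """Return rows whose title contains every search word, in any order."""
    words = search.split()
    if not words:
        return []
    # One condition per word; AND means all of them must be present.
    where = " AND ".join("title LIKE ? ESCAPE '\\'" for _ in words)
    params = [f"%{escape_like(w)}%" for w in words]
    sql = f"SELECT id, title FROM mytable WHERE {where} ORDER BY id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "IM2 - Article 3 Funky Business"),
        (2, "IM2 - Article 4 There's no Business That's not Show Business"),
        (3, "IM2 - There's no Business That's not Show Business"),
        (4, "CO4 - Life's a business"),
    ],
)
print([row[0] for row in search_titles(conn, "Business IM Article")])
```

The word order in the search string no longer matters, because each word is tested independently against the title.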