I want to sort search results by relevance in MySQL, but nothing I have tried works. I tried a FULLTEXT search and a LIKE search, and neither gives me the ordering below.
For example, with the search phrase "first second":
First Second Keyword -> (full phrase) at the start
Keyword First Second Keyword -> (full phrase) in the middle or at the end
Firstsecond Keyword -> (words run together) at the start
Keyword Firstsecond Keyword -> (words run together) in the middle or at the end
First Keyword -> (first word only) at the start
Keyword First Keyword -> (first word only) in the middle or at the end
Second Keyword -> (second word only) at the start
Keyword Second Keyword -> (second word only) in the middle or at the end
After those, I want all of the remaining partial matches to follow in this order:
Firstasdasd Keyword
Keyword asdasdfirst Keyword
Keyword asdasdfirstasdasd Keyword
Secondasdasd Keyword
Keyword asdasdsecond Keyword
Keyword asdasdsecondasdasd Keyword
Yeah, FULLTEXT won't do this kind of search. It's word-oriented.
What you want isn't really simple SQL. I think you want something like this to hit the first keyword.
SET @match := 'first';
SELECT MIN(priority) priority, value
FROM (
SELECT 1 priority, value FROM tbl WHERE value LIKE CONCAT(@match, ' %')
UNION ALL
SELECT 2 priority, value FROM tbl WHERE value LIKE CONCAT('% ', @match, ' %')
UNION ALL
SELECT 3 priority, value FROM tbl WHERE value LIKE CONCAT('%', @match, '%')
) results
GROUP BY value
ORDER BY MIN(priority), value;
Generally, you use a series of SELECTs combined with UNION ALL, each with its own priority for the full-word and embedded-word matches, and take the lowest priority number (the best match) you get for each row.
You'll need to elaborate on that general pattern to handle both keywords. This looks like it could turn into a real pain in the neck to get right. And because a LIKE pattern with a leading wildcard ('%something') cannot use an index, it won't be very fast.
If you're going to large scale with this, it might be worth investigating Sphinx.
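To make the tiered ordering from the question concrete, here is a minimal Python sketch that assigns each candidate a tier number in application code instead of SQL. The names `relevance_tier` and `rank` and the tier numbering are assumptions made for illustration, not part of the answer above. As a simplification, all substring-only matches for a word share one tier.

```python
import re


def relevance_tier(value, phrase):
    """Return a tier number for value (lower is better), or None if nothing matches."""
    text = value.lower()
    words = phrase.lower().split()
    if not words:
        return None
    # Whole-word candidates in priority order: the full phrase,
    # the words run together, then each word on its own.
    terms = list(dict.fromkeys([" ".join(words), "".join(words), *words]))
    for i, term in enumerate(terms):
        match = re.search(r"\b" + re.escape(term) + r"\b", text)
        if match:
            # Odd tier when the match starts the value, even tier otherwise.
            return 2 * i + (1 if match.start() == 0 else 2)
    # Fall back to substring matches, one tier per word.
    base = 2 * len(terms)
    for j, word in enumerate(words):
        if word in text:
            return base + j + 1
    return None


def rank(values, phrase):
    """Drop non-matching values and sort the rest by tier, then alphabetically."""
    scored = [(relevance_tier(v, phrase), v) for v in values]
    return [v for tier, v in sorted(s for s in scored if s[0] is not None)]
```

This only makes sense once a cheaper query (FULLTEXT or LIKE) has narrowed the candidates to a manageable number of rows.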
Related
I have a query on a full text column 'abstract':
SELECT title
FROM citation
WHERE MATCH(abstract)
AGAINST ('+orange' IN BOOLEAN MODE)
My question: I want to generate the AGAINST clause using results from a sub select, eg (pseudocode):
SELECT title
FROM citation
WHERE MATCH(abstract)
AGAINST (
CONCAT("'","+",
SELECT names
FROM fruits
WHERE type = "citrus"),"' "
), "IN BOOLEAN MODE"
)
Is this possible?
The argument to AGAINST needs to be a constant string; it cannot be a subquery or a column value.
See the MySQL documentation on full-text search for details.
Your query is more likely to work with LIKE; however, using it makes the query a lot slower.
Or you could do something like
SET @finding_element = (SELECT c1.full_name FROM customer c1 WHERE c1.id = 2);
SELECT c.full_name FROM customer c WHERE MATCH(c.full_name) AGAINST (@finding_element);
(The query I have used is just for reference.)
Happy Learning
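Another way around the constant-argument restriction is to run the sub select first, build the search string in application code, and pass it as a bound parameter. The sketch below assumes the fruit names have already been fetched into a Python list. The function name `build_boolean_query`, the `required` flag, and the `%s` placeholder style (which depends on the driver) are all assumptions.

```python
# Placeholder syntax is driver-specific; %s is an assumption here.
SQL = "SELECT title FROM citation WHERE MATCH(abstract) AGAINST (%s IN BOOLEAN MODE)"


def build_boolean_query(names, required=True):
    """Turn a list of names into a boolean-mode search string like '+orange +lemon'."""
    prefix = "+" if required else ""
    terms = []
    for name in names:
        # Replace anything that is not a letter or digit with a space so that
        # data cannot smuggle in boolean-mode operators such as +, -, * or ".
        cleaned = " ".join("".join(ch if ch.isalnum() else " " for ch in name).split())
        if not cleaned:
            continue
        if " " in cleaned:
            # Multi-word names are searched as a quoted phrase.
            cleaned = '"' + cleaned + '"'
        terms.append(prefix + cleaned)
    return " ".join(terms)
```

The returned string would be passed as the single parameter of `SQL`. With `required=True` every name must appear in the abstract; pass `required=False` to match rows containing any of the names.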
I am using MySQL REGEXP to assign reviews to different topics and output them in separate columns. The problem is that some reviews may not get assigned to any topic, which is why I need an "Other" column. How do I modify the query below to achieve that?
SELECT
text,
text REGEXP 'keywords' AND text REGEXP 'other keywords' AND .... AS Cleanliness,
text REGEXP 'keywords' AND text REGEXP 'other keywords' AND .... AS Restaurant,
text REGEXP 'keywords' AND text REGEXP 'other keywords' AND .... AS Wifi
FROM review_table;
Note that a review can belong to multiple topics.
The end result should look like this:
One solution would be to create another REGEXP expression that represents the negation of all the other expressions. But that can quickly become tedious to maintain.
Another option is to just wrap the query and analyze the results in the outer query to generate the additional column. This should be as simple as:
SELECT x.*, (Cleanliness + Restaurant + Wifi = 0) AS Other
FROM (
-- original query
) x
Tip: in MySQL, a condition evaluates to 1 when it is true and 0 when it is false. This means that this expression:
CASE
WHEN review REGEXP 'relevant keywords'
AND review REGEXP 'additional keywords if necessary'
THEN 1
ELSE 0
END AS 'Cleanliness'
Can also be written:
(
review REGEXP 'relevant keywords'
AND review REGEXP 'additional keywords if necessary'
) AS 'Cleanliness'
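The wrapping approach and the 1/0 behaviour described above can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in, with a user-defined REGEXP function. The sample reviews and keyword patterns are invented for illustration, and SQLite's regular expression behaviour here comes from Python's `re` module, not from MySQL.

```python
import re
import sqlite3


def regexp(pattern, text):
    # SQLite evaluates "value REGEXP pattern" by calling regexp(pattern, value).
    return int(re.search(pattern, text or "", re.IGNORECASE) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)
conn.execute("CREATE TABLE review_table (text TEXT)")
conn.executemany(
    "INSERT INTO review_table VALUES (?)",
    [("The room was clean",), ("Slow wifi but great food",), ("Parking was expensive",)],
)

# Hypothetical keyword patterns; the inner SELECT plays the role of the original query.
QUERY = """
SELECT x.*, (Cleanliness + Restaurant + Wifi = 0) AS Other
FROM (
    SELECT text,
           text REGEXP 'clean|dirty'    AS Cleanliness,
           text REGEXP 'food|breakfast' AS Restaurant,
           text REGEXP 'wifi|internet'  AS Wifi
    FROM review_table
) x
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each row carries the three topic flags followed by `Other`, which is 1 only when all three flags are 0.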
I think we can negate the whole condition with NOT (expression):
CASE
WHEN NOT (review REGEXP 'relevant keywords'
AND review REGEXP 'additional keywords if necessary')
THEN 1
ELSE 0
END AS 'Irrelevant'
Reference: https://dev.mysql.com/doc/refman/5.7/en/regexp.html
Related: negate regex pattern in mysql
I have a mysql query as follows.
$query="SELECT name,activity FROM appid
where result > 5 AND name LIKE :term ORDER BY name ASC LIMIT 0,40";
$result = $pdo->prepare($query);
$result->bindValue(':term','%'.$_GET["q"].'%',PDO::PARAM_STR);
$result->execute();
What I want to do is this.
I have an entry like this that I want to find:
'News & Weather'
However, when I type
'news weather'
it of course does not find it. How can I type that and still retrieve that entry?
Regular expressions can do the trick:
select *
from appid
where name rlike 'news|weather' -- Matches 'news' or 'weather'
Another example:
select *
from appid
where name rlike 'news.*weather' -- Matches 'news' followed by 'weather'
-- with any characters in between, or none at all
-- (in that order)
Just one more:
select *
from appid
where name rlike '^news.{1,}weather$' -- Matches any text that starts with 'news'
-- and has at least one character before
-- ending with 'weather'
Regular expressions can be used to create very complicated filters on text fields. See the MySQL documentation on regular expressions for more information.
If you can add a full-text index to your table, Full-text search might be the better way to go with this. Specifically, a boolean Full-Text search:
select *
from appid
where match(name) against ('+news +weather' in boolean mode)
I believe the only possible way is through code:
Option A: Replace the spaces in your query parameter with '%' in code, but that of course means the words must appear in the order they were typed.
Option B: Split your parameter on spaces and dynamically construct your query with as many LIKEs as needed, adding an additional ":termN" parameter for each one.
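Option B could look roughly like the sketch below. It is written in Python for illustration, while the question uses PHP with PDO. The function name `build_like_query` is hypothetical, and the `:termN` placeholders follow the naming suggested above. LIKE wildcards typed by the user are escaped with a backslash, which is MySQL's default LIKE escape character.

```python
def build_like_query(search):
    """Build one 'name LIKE :termN' condition per word, in any order."""
    sql = "SELECT name, activity FROM appid WHERE result > 5"
    params = {}
    for i, word in enumerate(search.split()):
        key = f"term{i}"
        # Escape the escape character first, then the LIKE wildcards.
        escaped = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        sql += f" AND name LIKE :{key}"
        params[key] = f"%{escaped}%"
    sql += " ORDER BY name ASC LIMIT 0,40"
    return sql, params
```

For the input "news weather" this yields two LIKE conditions joined with AND, so 'News & Weather' matches regardless of what sits between the two words.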
First off, there seems to be no way to get an exact match using a full-text search. This seems to be a highly discussed issue when using the full-text search method, and there are lots of different solutions to achieve the desired result; however, most seem very inefficient. Because I'm forced to use full-text search due to the volume of my database, I recently had to implement one of these solutions to get more accurate results.
I could not use the ranking results from the full-text search because of how it works. For instance, if you searched for a movie called Toy Story and there was also a movie called The Story Behind Toy Story, that would come up instead of the exact match because it found the word Story twice as well as Toy.
I do track my own ranking, which I call "Popularity": each time a user accesses a record, the number goes up. I use this data point to weight my results to help determine what the user might be looking for.
I also have the issue where I sometimes need to fall back to a LIKE search when there is no exact match, i.e. searching Goonies should return The Goonies (the most popular result).
So here is an example of my current stored procedure for achieving this:
DECLARE @Title varchar(255)
SET @Title = '"Toy Story"'
--need to remove quotes from parameter for LIKE search
DECLARE @Title2 varchar(255)
SET @Title2 = REPLACE(@Title, '"', '')
--get top 100 results using full-text search and sort them by popularity
SELECT TOP(100) id, title, popularity As Weight into #TempTable FROM movies WHERE CONTAINS(title, @Title) ORDER BY [Weight] DESC
--check if exact match can be found
IF EXISTS(select * from #TempTable where Title = @Title2)
--return exact match
SELECT TOP(1) * from #TempTable where Title = @Title2
ELSE
--no exact match found, try using like with wildcards
SELECT TOP(1) * from #TempTable where Title like '%' + @Title2 + '%'
DROP TABLE #TempTable
This stored procedure is executed about 5,000 times a minute, and crazily enough it's not bringing my server to its knees. But I really want to know if there is a more efficient approach to this. Thanks.
You should use full text search CONTAINSTABLE to find the top 100 (possibly 200) candidate results and then order the results you found using your own criteria.
It sounds like you'd like to ORDER BY
exact match of the phrase (=)
the fully matched phrase (LIKE)
higher value for the Popularity column
the Rank from the CONTAINSTABLE
But you can toy around with the exact order you prefer.
In SQL that looks something like:
DECLARE @title varchar(255)
SET @title = '"Toy Story"'
--need to remove quotes from parameter for LIKE search
DECLARE @title2 varchar(255)
SET @title2 = REPLACE(@title, '"', '')
SELECT
m.ID,
m.title,
m.Popularity,
k.Rank
FROM Movies m
INNER JOIN CONTAINSTABLE(Movies, title, @title, 100) as [k]
ON m.ID = k.[Key]
ORDER BY
CASE WHEN m.title = @title2 THEN 0 ELSE 1 END,
CASE WHEN m.title LIKE '%' + @title2 + '%' THEN 0 ELSE 1 END,
m.popularity desc,
k.rank
This will give you the movies that contain the exact phrase "Toy Story", ordered by their popularity.
SELECT
m.[ID],
m.[Popularity],
k.[Rank]
FROM [dbo].[Movies] m
INNER JOIN CONTAINSTABLE([dbo].[Movies], [Title], N'"Toy Story"') as [k]
ON m.[ID] = k.[Key]
ORDER BY m.[Popularity]
Note the above would also give you "The Goonies Return" if you searched "The Goonies".
I've got the feeling you don't really like the fuzzy part of the full-text search but you do like the performance part.
Maybe this is a path: if you insist on getting the EXACT match before a weighted match, you could try to hash the value. For example, 'Toy Story' -> bring to lowercase -> toy story -> hash into something like 4de2gs5sa (with whatever hash you like), store that hash alongside the title, and perform the exact-match search on the hash.
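A minimal sketch of that normalize-then-hash step, written in Python for illustration. The function name `title_key`, the choice of SHA-256, and the extra whitespace collapsing are assumptions; any stable hash and normalization would do, as long as the stored values and the search input go through the same function.

```python
import hashlib


def title_key(title):
    """Lowercase the title, collapse whitespace, and return a hex digest for exact lookups."""
    normalized = " ".join(title.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

The digest would be stored in an indexed column when a row is written, and an exact-match lookup then compares `title_key(search_input)` against that column before falling back to the full-text search.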
In Oracle I've used UTL_MATCH for similar purposes. (http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/u_match.htm)
Even though the Jaro Winkler algorithm, for instance, might take a while if you compare the title column from table 1 against table 2, you can improve performance by only partially joining the two tables. In some cases I have compared person names in table 1 with table 2 using Jaro Winkler, but limited the results not just to those above a certain Jaro Winkler threshold, but also to name pairs where the first letter is the same. For instance, I would compare Albert with Aden, Alfonzo, and Alberto using Jaro Winkler, but not Albert with Frank, which limits the number of cases where the algorithm needs to run.
Jaro Winkler may actually be suitable for movie titles as well. Although you are using SQL server (can't use the utl_match package) it looks like there is a free library called "SimMetrics" which has the Jaro Winkler algorithm among other string comparison metrics. You can find detail on that and instructions here: http://anastasiosyal.com/POST/2009/01/11/18.ASPX?#simmetrics
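The first-letter restriction described above can be sketched as follows. Note that the similarity function here is `difflib.SequenceMatcher` from the Python standard library, used purely as a stand-in; it is not Jaro Winkler. The names `similarity` and `blocked_matches` and the 0.8 threshold are assumptions, and the `metric` parameter is where a real Jaro Winkler implementation would be plugged in.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similarity(a, b):
    # Stand-in metric from the standard library; NOT Jaro Winkler.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def blocked_matches(left, right, threshold=0.8, metric=similarity):
    """Compare only names that share a first letter and keep pairs scoring >= threshold."""
    buckets = defaultdict(list)
    for name in right:
        if name:
            buckets[name[0].lower()].append(name)
    results = []
    for name in left:
        if not name:
            continue
        # Only candidates in the same first-letter bucket are ever scored.
        for candidate in buckets.get(name[0].lower(), []):
            score = metric(name, candidate)
            if score >= threshold:
                results.append((name, candidate, score))
    return results
```

With the names from the example, Albert is scored against Aden, Alfonzo, and Alberto, while Frank is never passed to the metric at all.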
In my table I have first name and last name columns. Some names are upper case (ABRAHAM), some are lower case (abraham), and some are capitalized (Abraham).
So when I filter with the condition REGEXP '^[abc]', I am not getting the proper records. How can I convert the names to lower case in the SELECT query?
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abc]';
This is my query. It works fine if the records are lower case, but my records are mixed: not all customer names are lower case, and most are capitalized.
So the query above does not return the proper records.
I think you should query your database making sure that the names are lowercased. Suppose that name is the name you wish to find, and in your application you've lowercased it to 'abraham'; then your query should look like this:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) = name
Since I don't know what language you use, I've just written name as a placeholder, but make sure that it is lowercased and you should retrieve Abraham, ABRAHAM or any variation of the name!
Hope it helps!
Have you tried:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) REGEXP '^[abc]';
I don't know since when, but nowadays MySQL's REGEXP is case-insensitive unless one of the operands is a binary string, so the LOWER() call may not even be needed.
https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html
You don't need regexp to search for names starting with a specific string or character.
SELECT * FROM `test_tbl` WHERE cus_name LIKE 'abc%' ;
% is the wildcard character. The search is case-insensitive unless you set the binary attribute for the column cus_name or you use the BINARY operator:
SELECT * FROM `test_tbl` WHERE BINARY cus_name LIKE 'abc%' ;
A few valid options already presented, but here's one more with just regex:
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abcABC]';