FULLTEXT search on two columns of different tables in MySQL

Can anyone help me write a query for a fulltext search?
I have two tables, Product and Generic.
Table-Product:
1. ProductID (Integer)
2. GenericID (Integer)-FK
3. Product_Name (Varchar)
And in Table-Generic:
1. GenericID (Integer)
2. Generic_Name (Varchar)
What I need is to search for the input string in the combined names from both Product_Name and Generic_Name.
My sample query is given below:
SELECT
prod.ProductID AS ID,
generic.Generic_Name AS genericName,
prod.Product_Name AS packageName
FROM
Product prod
INNER JOIN
Generic generic ON prod.GenericID = generic.GenericID
WHERE
MATCH (prod.Product_Name ,generic.Generic_Name) AGAINST('+acb* +ace* +serr* +para*' IN BOOLEAN MODE)
ORDER BY prod.Product_Name ASC
It doesn't work because the columns are in different tables.

FULLTEXT search operations each use a FULLTEXT index. That index can only be on one table.
So, you could try using two fulltext search operations...
WHERE (
match(prod.Product_Name) against('+acb* +ace* +serr* +para*' in boolean mode)
OR
match(generic.Generic_Name) against('+acb* +ace* +serr* +para*' in boolean mode)
)
Or, for best performance and result-set ranking, you could build a new name table like this
GenericId NOT a primary key
IsGeneric 1 or 0
Name either Product_Name or Generic_Name
You would construct this table from the union of the names in your other two tables. For example, it might contain
4321 0 Advil
4321 0 Motrin
4321 1 Ibuprofen
4322 0 Coumadin
4322 1 Warfarin
Then, a query like this would do the trick
select prod.ProductID AS ID,
generic.Generic_Name AS genericName,
prod.Product_Name AS packageName
FROM Product prod
INNER JOIN Generic generic ON prod.GenericID = generic.GenericID
INNER JOIN Name ON Name.GenericID = prod.GenericID
WHERE MATCH(Name.Name) AGAINST('+acb* +ace* +serr* +para*' in boolean mode)
ORDER BY prod.Product_Name ASC
The second alternative takes more work to program. But because it puts both trade names and generic names into a single fulltext index, it will be faster and is likely to give better results.
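As a rough sketch of how the combined name table could be built: the column types, the index name ft_name, and the choice of InnoDB are assumptions, not something stated above. InnoDB supports FULLTEXT indexes from MySQL 5.6 onward; on older servers the table would need to be MyISAM.

```sql
-- Hypothetical lookup table holding both trade names and generic names.
-- GenericID is deliberately not a primary key: many names share one generic.
CREATE TABLE Name (
    GenericID INT NOT NULL,
    IsGeneric TINYINT(1) NOT NULL,      -- 1 = generic name, 0 = product name
    Name      VARCHAR(255) NOT NULL,
    FULLTEXT INDEX ft_name (Name)       -- index name is an assumption
) ENGINE=InnoDB;

-- Fill it from the union of the names in the two existing tables.
INSERT INTO Name (GenericID, IsGeneric, Name)
SELECT GenericID, 0, Product_Name FROM Product
UNION ALL
SELECT GenericID, 1, Generic_Name FROM Generic;
```

The table is a copy of data held elsewhere, so it has to be refreshed (or maintained with triggers) whenever Product or Generic changes.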

Related

Mysql fulltext index search returning weird result

I have an ingredient table, and I want all the recipes that contain certain ingredients. My table structure is below.
Table (ingredient), with a fulltext index on the ingredient column:
------------------------------------------------------
ingredientID rcteID ingredient
310 1 Mint Leaves
311 1 Corriender Leaves
312 1 GreenChili
I am trying to fetch the records above with the fulltext search query below, but it returns nothing.
SELECT `Ingredient`.`ingredientID` , `Ingredient`.`rcteID`
FROM `ingredient` AS `Ingredient`
WHERE MATCH (`Ingredient`.`ingredient`)
AGAINST ('+Mint Leaves +Corriender Leaves +Greenchili' IN BOOLEAN MODE)
AND `Ingredient`.`rcteID`
IN ( 1 )
GROUP BY `Ingredient`.`rcteID`
Why doesn't the query above match those records?
When I tried the query below, it worked. I only changed the order of the search text.
SELECT `Ingredient`.`ingredientID` , `Ingredient`.`rcteID`
FROM `ingredient` AS `Ingredient`
WHERE MATCH (`Ingredient`.`ingredient`)
AGAINST ('+Greenchili +Mint Leaves +Corriender Leaves' IN BOOLEAN MODE)
AND `Ingredient`.`rcteID`
IN ( 1 )
GROUP BY `Ingredient`.`rcteID`
OUTPUT
--------------------
ingredientID rcteID
311 1
I don't understand what's going on. Why does the first query return no result while the second one does?

This is not a real explanation, but you can run this query to see the score.
SELECT MATCH (`Ingredient`.`ingredient`)
AGAINST ('+Mint Leaves +Corriender Leaves +Greenchili' IN BOOLEAN MODE)
FROM `ingredient` AS `Ingredient`
WHERE MATCH (`Ingredient`.`ingredient`)
AGAINST ('+Mint Leaves +Corriender Leaves +Greenchili' IN BOOLEAN MODE)
I believe your query means: find ingredient rows that each contain ALL of Mint Leaves, Corriender Leaves, and Greenchili. No such row exists in your data set, so MySQL cannot find a match. Note that in boolean mode the + operator applies only to the word that immediately follows it, so in '+Mint Leaves' only Mint is required and Leaves is optional.
However, if you put each part of your query in brackets, it is a different story:
SELECT `Ingredient`.`ingredientID` , `Ingredient`.`rcteID`
FROM `ingredient` AS `Ingredient`
WHERE MATCH (`Ingredient`.`ingredient`)
AGAINST ('(+Greenchili) (+Mint Leaves) (+Corriender Leaves)' IN BOOLEAN MODE)
AND `Ingredient`.`rcteID`
IN ( 1 )
GROUP BY `Ingredient`.`rcteID`
This query can be translated as: fetch the ingredients that contain AT LEAST one of Mint Leaves, Corriender Leaves, or Greenchili, and group them by rcteID.
UPDATED:
SELECT t1.rcteID FROM `ingredient` t1
JOIN `ingredient` t2 ON t2.rcteID = t1.rcteID
JOIN `ingredient` t3 ON t3.rcteID = t2.rcteID
WHERE
MATCH (t1.`ingredient`) AGAINST ('+Greenchili' IN BOOLEAN MODE)
AND
MATCH (t2.`ingredient`) AGAINST ('+Mint Leaves' IN BOOLEAN MODE)
AND
MATCH (t3.`ingredient`) AGAINST ('+Corriender Leaves' IN BOOLEAN MODE)
AND t1.`rcteID` IN ( 1 )
GROUP BY t1.`rcteID`
I think this query will work for you. It follows the same idea as yours, but it looks for the 3 keywords separately, one per self-join, and returns only the rcteID values that have all 3 ingredients.

MYSQL Search for empty fields in table

I'm searching through multiple tables.
SELECT DISTINCT cv.id, cv.tJobTitle, cv.tJobTitleAlt, cv.rEmployer, employee.firstName, employee.surname, cv.recentJobTitle, match ( cv.title, cv.recentJobTitle, cv.targetJobTitle, cv.targetJobTitleAlt ) AGAINST ('Desktop' IN BOOLEAN MODE) AS relevance
FROM cv AS cv, employee AS employee, country AS country
WHERE cv.showTo=1 AND cv.status=1 AND cv.employeeIDFK = employee.id AND cv.countryISO2FK='GB'
AND cv.countryISO2FK=country.iso2
AND match ( cv.title, cv.recentJobTitle, cv.targetJobTitle, cv.targetJobTitleAlt ) AGAINST ('Desktop' IN BOOLEAN MODE )
AND cv.salaryType='1' AND cv.salaryMax <=23088 OR cv.salaryMin is NUll
ORDER BY relevance DESC
I have a price value that I search for in my database, but I also have a tick box that says: if the price has not been set, show that record too.
So if the price field is empty, the record should still appear in the result.
I have tried the query above, but it gives me more than 100 records even though my table has only 2 records.

Assuming country.iso2 is a unique field, I'm guessing that you have multiple CVs per employee or vice versa.
NOTE: It's good advice to avoid using the comma notation for INNER JOINs. Also, this will only work where your field3 is really empty and not NULL.
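One more thing worth checking in the original WHERE clause: AND binds more tightly than OR, so the trailing `OR cv.salaryMin is NUll` is evaluated on its own and lets through every combination of rows from the three comma-joined tables where salaryMin is NULL, ignoring all the other conditions. A sketch of the same query with explicit joins and the salary test in parentheses follows; it keeps the asker's column names and assumes nothing else about the schema.

```sql
SELECT DISTINCT cv.id, cv.tJobTitle, cv.tJobTitleAlt, cv.rEmployer,
       employee.firstName, employee.surname, cv.recentJobTitle,
       MATCH (cv.title, cv.recentJobTitle, cv.targetJobTitle, cv.targetJobTitleAlt)
           AGAINST ('Desktop' IN BOOLEAN MODE) AS relevance
FROM cv
INNER JOIN employee ON cv.employeeIDFK = employee.id
INNER JOIN country  ON cv.countryISO2FK = country.iso2
WHERE cv.showTo = 1
  AND cv.status = 1
  AND cv.countryISO2FK = 'GB'
  AND MATCH (cv.title, cv.recentJobTitle, cv.targetJobTitle, cv.targetJobTitleAlt)
          AGAINST ('Desktop' IN BOOLEAN MODE)
  AND cv.salaryType = '1'
  -- Parentheses keep the "price not set" case inside the other filters.
  AND (cv.salaryMax <= 23088 OR cv.salaryMin IS NULL)
ORDER BY relevance DESC;
```

Whether the NULL test belongs on salaryMin or salaryMax depends on which column actually holds the unset price; the sketch leaves it as written in the question.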

How to compare two comma-separated string lists using MySQL

I used a Java method called 'containsAll()' to check if ArrayLists have common content.
Let's say I have a list A (one row), and several other lists in a MySQL table (in column 'name', row by row).
All lists consist of comma-separated Strings (at least one String in a list) - names or whatever.
Now, I want to check if all Strings in list A can be found in any of the rows in column 'name'.
The result set should show every row in 'name' that matches: a matching row must contain all the Strings in list A and may contain additional Strings.
Example I
A: 'Mr.T'
____name_________________________________________
'Hannibal'
'Hannibal','Face','Murdock','Mr.T','Donald Duck'
'Face','Donald Duck'
'Superman','Chuck Norris','Mr.T'
_________________________________________________
Result set: 'Hannibal','Face','Murdock','Mr.T','Donald Duck' -AND-
'Superman','Chuck Norris','Mr.T'
Example II
A: 'Rocky', 'Mr.T', 'Apollo'
______name__________________________________________________
'Hannibal','Face','Murdock','Donald Duck','Superman','Mr.T'
'Rocky','Apollo','Ivan'
'Apollo', 'Superman','Hannibal','Rocky','Mr.T','Chuck Norris'
'Rocky','Mr.T','Apollo','Chuck Norris'
_____________________________________________________________
Result set: 'Apollo', 'Superman','Hannibal','Rocky','Mr.T','Chuck Norris' -AND-
'Rocky','Mr.T','Apollo','Chuck Norris'
I wonder if one can carry out those results using a MySQL query.
Thank you in advance.

It appears you want to do an array intersection, except your array is a single column. It can be done, but it will be slow, difficult to debug and will not leverage the power of relational databases. A better way would be to change your table schema to something like this:
Table groups
group_id int unsigned not null auto_increment primary key,
character_list text
Table members_in_groups
group_id int unsigned not null,
group_member varchar(45) not null
Then you can query like this:
SELECT group_id, character_list
FROM groups g
JOIN members_in_groups m USING (group_id)
WHERE m.group_member IN ('Mr.T', ...);
The groups table is probably very like your current table. The members_in_groups table is the same data chopped up into easily searchable parts.
ETA given your comment, this should work if you can guarantee that each character_list contains only one instance of each character:
SELECT g.group_id,
SUM(CASE WHEN m.group_member IN ('Mr.T', 'Apollo', 'Rocky') THEN 1 ELSE 0 END) AS tally,
g.character_list
FROM groups g
JOIN members_in_groups m ON (g.group_id=m.group_id)
GROUP BY g.group_id, g.character_list
HAVING SUM(CASE WHEN m.group_member IN ('Mr.T', 'Apollo', 'Rocky') THEN 1 ELSE 0 END) = 3;
In this case the HAVING clause must equal 3 because there are 3 members in IN ('Mr.T', 'Apollo', 'Rocky').
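To make the normalized layout concrete, here is a small sketch loaded with the four lists from Example II. The composite primary key is an assumption added here; it enforces the "only one instance of each character per list" guarantee that the tally relies on. The table name is quoted with backticks because GROUPS is a reserved word from MySQL 8.0 onward.

```sql
CREATE TABLE `groups` (
    group_id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    character_list TEXT
);

CREATE TABLE members_in_groups (
    group_id     INT UNSIGNED NOT NULL,
    group_member VARCHAR(45)  NOT NULL,
    PRIMARY KEY (group_id, group_member)   -- assumption: no duplicate members
);

INSERT INTO `groups` (group_id, character_list) VALUES
    (1, 'Hannibal,Face,Murdock,Donald Duck,Superman,Mr.T'),
    (2, 'Rocky,Apollo,Ivan'),
    (3, 'Apollo,Superman,Hannibal,Rocky,Mr.T,Chuck Norris'),
    (4, 'Rocky,Mr.T,Apollo,Chuck Norris');

INSERT INTO members_in_groups (group_id, group_member) VALUES
    (1, 'Hannibal'), (1, 'Face'), (1, 'Murdock'),
    (1, 'Donald Duck'), (1, 'Superman'), (1, 'Mr.T'),
    (2, 'Rocky'), (2, 'Apollo'), (2, 'Ivan'),
    (3, 'Apollo'), (3, 'Superman'), (3, 'Hannibal'),
    (3, 'Rocky'), (3, 'Mr.T'), (3, 'Chuck Norris'),
    (4, 'Rocky'), (4, 'Mr.T'), (4, 'Apollo'), (4, 'Chuck Norris');

-- Lists that contain every member of A = ('Rocky', 'Mr.T', 'Apollo').
SELECT g.group_id, g.character_list
FROM `groups` AS g
INNER JOIN members_in_groups AS m ON g.group_id = m.group_id
WHERE m.group_member IN ('Rocky', 'Mr.T', 'Apollo')
GROUP BY g.group_id, g.character_list
HAVING COUNT(*) = 3;
```

With this data the final query returns groups 3 and 4, which are the two lists named in the Example II result set.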

I solved this issue by using the REGEXP function in MySQL:
SELECT * FROM `table` WHERE `column` REGEXP("(value1|value2|value3)");

SELECT * FROM tbl_name WHERE field_name LIKE '%\'Mr.T\'%'

Search with relevance ranking using containstable and freetext

I've read that you can rank the results of a search using CONTAINSTABLE along with CONTAINS and FREETEXT under SQL Server 2008. I've only recently used FREETEXT for the first time. FREETEXT goes through the words separately and compares each one to the indexed column. I want to be able to search for phrases first and then single words.
Let's say the description column is indexed. I'm using a stored procedure query like this:
SELECT id, description, item FROM table WHERE FREETEXT(description, @strsearch)
For example, if 3 rows contain the word apple and I search for 'apple cake', the row with id2 should come first, and the other two should follow:
id1 apple pie 4/01/2012
id2 apple cake 2/29/2011
id3 candy apple 5/9/2011
For example, if 4 rows contain food-related words and I search for 'fast food restaurant', the row with id3 should come first, followed by id1 (not an exact match, but it has 'fast food' in the column), and then the other two:
id1 McDonalds fast food
id2 healthy food
id3 fast food restaurant
id4 Italian restaurant

Does this article help?
MSDN : Limiting Ranked Result Sets (Full-Text Search)
It implies, in part, that using an additional parameter will allow you to limit the result to the ones with the greatest relevance (which you can influence using WEIGHT) and also order by that relevance (RANK).
top_n_by_rank is an integer value, n, that specifies that only the n
highest ranked matches are to be returned, in descending order.
The doc doesn't have an example for FREETEXT; it only references CONTAINSTABLE. But it definitely implies that CONTAINSTABLE outputs a RANK column that you could use to ORDER BY.
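For illustration, a sketch of ordering by the RANK column using FREETEXTTABLE, the table-valued counterpart of FREETEXT. The table name dbo.Items is hypothetical, and the sketch assumes a full-text index exists on description with id as its full-text key column.

```sql
DECLARE @strsearch NVARCHAR(255) = N'fast food restaurant';

-- FREETEXTTABLE returns one row per match with [KEY] and [RANK] columns;
-- the last argument (10) is top_n_by_rank.
SELECT t.id, t.description, ft.[RANK]
FROM dbo.Items AS t                       -- hypothetical table name
INNER JOIN FREETEXTTABLE(dbo.Items, description, @strsearch, 10) AS ft
    ON t.id = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```

The rank values are only meaningful relative to each other within one query, so they suit ORDER BY but not comparison across searches.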
I don't know if there is any way to enforce your own definition of relevance. It may make sense to pull out the top 10 relevant matches according to FTS, then apply your own ranking on the output, e.g. you can split up the search terms using a function, and order by how many of the words matched. For simplicity and easy repro in the following example I am not using Full-Text in the subquery but you can replace it with whatever you're actually doing. First create the function:
IF OBJECT_ID('dbo.SplitStrings') IS NOT NULL
DROP FUNCTION dbo.SplitStrings;
GO
CREATE FUNCTION dbo.SplitStrings(@List NVARCHAR(MAX))
RETURNS TABLE
AS
RETURN ( SELECT Item FROM
( SELECT Item = x.i.value('(./text())[1]', 'nvarchar(max)')
FROM ( SELECT [XML] = CONVERT(XML, '<i>'
+ REPLACE(@List, ' ', '</i><i>') + '</i>').query('.')
) AS a CROSS APPLY [XML].nodes('i') AS x(i) ) AS y
WHERE Item IS NOT NULL
);
GO
Then a simple script that shows how to perform the matching:
DECLARE @foo TABLE
(
id INT,
[description] NVARCHAR(450)
);
INSERT @foo VALUES
(1,N'McDonalds fast food'),
(2,N'healthy food'),
(3,N'fast food restaurant'),
(4,N'Italian restaurant'),
(5,N'Spike''s Junkyard Dogs');
DECLARE @searchstring NVARCHAR(255) = N'fast food restaurant';
SELECT x.id, x.[description]--, MatchCount = COUNT(s.Item)
FROM
(
SELECT f.id, f.[description]
FROM @foo AS f
-- pretend this actually does full-text search:
--where (FREETEXT(description,@strsearch))
-- and ignore how I actually matched:
INNER JOIN dbo.SplitStrings(@searchstring) AS s
ON CHARINDEX(s.Item, f.[description]) > 0
GROUP BY f.id, f.[description]
) AS x
INNER JOIN dbo.SplitStrings(@searchstring) AS s
ON CHARINDEX(s.Item, x.[description]) > 0
GROUP BY x.id, x.[description]
ORDER BY COUNT(s.Item) DESC, [description];
Results:
id description
-- -----------
3 fast food restaurant
1 McDonalds fast food
2 healthy food
4 Italian restaurant

Why doesn't this sub-query seem to work?

Before anything, I am not looking for a re-write. This was presented to me, and I can't seem to figure out if this is a bug in general or some kind of syntactic craziness that occurs due to the peculiarity of the script. Okay with that said on with the setup:
Microsoft SQL Server Standard Edition (64-bit)
Version 10.50.2500.0
On a table located in a generic database, defined as:
CREATE TABLE [dbo].[Regions](
[RegionID] [int] NOT NULL,
[RegionGroupID] [int] NOT NULL,
[IsDefault] [bit] NOT NULL,
CONSTRAINT [PK_Regions] PRIMARY KEY CLUSTERED
(
[RegionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
insert some values:
INSERT INTO [dbo].[Regions]
([RegionID],[RegionGroupID],[IsDefault])
VALUES
(0,1,0),
(1,1,0),
(2,1,0),
(3,2,0),
(4,2,0),
(5,2,0),
(6,3,0),
(7,3,0),
(8,3,0)
Now run the query (to select a single row from each group; remember, no rewrite suggestions!):
SELECT RXXID FROM (
SELECT
RXX.RegionID as RXXID,
ROW_NUMBER() OVER (PARTITION BY RXX.RegionGroupID ORDER BY RXX.RegionGroupID) AS RXXNUM
FROM Regions as RXX
) AS tmp
WHERE tmp.RXXNUM = 1
You should get:
RXXID
-----------
0
3
6
Now stick that inside an update statement (with a preset to 0 and a select all after):
UPDATE Regions SET IsDefault = 0
UPDATE Regions
SET IsDefault = 1
WHERE RegionID IN (
SELECT RXXID FROM (
SELECT
RXX.RegionID as RXXID,
ROW_NUMBER() OVER (PARTITION BY RXX.RegionGroupID ORDER BY RXX.RegionGroupID) AS RXXNUM
FROM Regions as RXX
) AS tmp
WHERE tmp.RXXNUM = 1
)
SELECT * FROM Regions
ORDER BY RegionGroupID
and get this result:
RegionID RegionGroupID IsDefault
----------- ------------- ---------
0 1 1
1 1 1
2 1 1
3 2 1
4 2 1
5 2 1
6 3 1
7 3 1
8 3 1
zomg wtf lamaz?
While I don't claim to be a SQL guru, this seems neither proper nor correct. And to make things more crazy, if you drop the primary key it seems to work:
Drop primary key:
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Regions]') AND name = N'PK_Regions')
ALTER TABLE [dbo].[Regions] DROP CONSTRAINT [PK_Regions]
And re-run update statement set, result:
RegionID RegionGroupID IsDefault
----------- ------------- ---------
0 1 1
1 1 0
2 1 0
3 2 1
4 2 0
5 2 0
6 3 1
7 3 0
8 3 0
Isn't that a b?
Does anyone have any clue what is going on here? My guess is some kind of sub-query caching. Is this a bug? It sure doesn't seem like what SQL should be doing.

Just update as a CTE directly:
WITH tmp AS (
SELECT
RegionID as RXXID,
RegionGroupID,
IsDefault,
ROW_NUMBER() OVER (PARTITION BY RegionGroupID ORDER BY RegionID) AS RXXNUM
FROM Regions
)
UPDATE tmp SET IsDefault = 1 WHERE RXXNUM = 1
select * from Regions
Added more columns to illustrate. You can see this on http://sqlfiddle.com/#!3/03913/9
I'm not 100% sure what is going on in your example, but since you partition and order by the same column, you are not guaranteed to get the same order back, because all the rows in a partition are tied. Shouldn't you order by RegionID or some other column, as I did on sqlfiddle?
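A minimal sketch of that suggestion applied to the original statement, keeping its shape and changing only the ORDER BY inside ROW_NUMBER(). Because RegionID is unique, the row numbering is no longer tied, so the subquery yields the same three IDs however the plan evaluates it.

```sql
UPDATE Regions SET IsDefault = 0;

UPDATE Regions
SET IsDefault = 1
WHERE RegionID IN (
    SELECT RXXID FROM (
        SELECT
            RXX.RegionID AS RXXID,
            -- Order by the unique RegionID so there are no ties within a group.
            ROW_NUMBER() OVER (PARTITION BY RXX.RegionGroupID
                               ORDER BY RXX.RegionID) AS RXXNUM
        FROM Regions AS RXX
    ) AS tmp
    WHERE tmp.RXXNUM = 1
);

SELECT * FROM Regions
ORDER BY RegionGroupID;
```

With the sample data from the question, only RegionID 0, 3, and 6 end up with IsDefault = 1, with or without the primary key.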
Back to your question:
If you change your UPDATE (with the clustered index in place) to a SELECT, you get all 9 rows back.
If you drop the PK and run the SELECT, you get only 3 rows. Back to your update statement: inspecting the execution plans shows that they differ slightly.
In the first query (with the PK), the plan scans the clustered index for the outer reference (note that it does not have the alias RXX) and then, for each of those rows, does a lookup against RXX. Because your row-number ordering is tied, every RegionID can be ROW_NUMBER() 1 within its RegionGroupID. SQL Server would know this from your PK, I guess, and can say that for every RegionID, this RegionID can be row number 1. The result is therefore valid, if surprising.
In the second query, there is no index, so you get a table scan on Regions; the plan then builds a probe table from RXX and joins differently (a single pass, so ROW_NUMBER() can be 1 for only one row per RegionGroupID). In that scan, every RegionID has only one ROW_NUMBER(), though you cannot be 100% certain it will be the same one every time.
This means:
When your subquery does not have a deterministic order on every execution, you should avoid a multiple-pass (NESTED LOOPS) join type and use a single-pass (MERGE or HASH) join instead.
To fix this without changing the structure of your query, add OPTION (HASH JOIN) or OPTION (MERGE JOIN) to the UPDATE that contains the subquery. With the PK in place, the statements become:
UPDATE Regions SET IsDefault = 0
UPDATE Regions
SET IsDefault = 1
WHERE RegionID IN (
SELECT RXXID FROM (
SELECT
RXX.RegionID as RXXID,
ROW_NUMBER() OVER (PARTITION BY RXX.RegionGroupID ORDER BY RXX.RegionGroupID) AS RXXNUM
FROM Regions as RXX
) AS tmp
WHERE tmp.RXXNUM = 1
)
OPTION (HASH JOIN)
SELECT * FROM Regions
ORDER BY RegionGroupID
With either of these two join types, the execution plan properties show an actual number of rows of 3.

Your query in plain language is something like:
For each row in Regions, check if RegionID exists in some sub-query. That means the sub-query is executed for each row in Regions. (I know that is not literally the case, but it is the semantics of the query.)
Since you use RegionGroupID for both the ordering and the partition, you really have no idea which RegionID will be returned, so it might very well be a different ID each time the sub-query is checked.
Update:
Doing the update with a join against the derived table instead of using IN changes the semantics of the query, and it changed the result as well.
This works as expected:
UPDATE R
SET IsDefault = 1
FROM Regions as R
inner join
(
SELECT RXXID FROM (
SELECT
RXX.RegionID as RXXID,
ROW_NUMBER() OVER (PARTITION BY RXX.RegionGroupID ORDER BY RXX.RegionGroupID) AS RXXNUM
FROM Regions as RXX
) AS tmp
WHERE tmp.RXXNUM = 1
) as C
on R.RegionID = C.RXXID
