MATCH AGAINST with multiple columns in MySQL - mysql

I am trying to create a simplified search box that will match multiple columns against a keyword or keywords the user inputs. The following code is my attempt at using MySQL's MATCH/AGAINST. However, I cannot get the query to execute properly when there are two columns specified, in this case 'topic, country'. It will execute when I run the code as either 'country' or 'topic', but not both.
Is there a secret to this?
SELECT * ,
MATCH (
topic, country
)
AGAINST (
'China'
) AS score
FROM reports2
WHERE MATCH (
topic, country
)
AGAINST (
'China'
)
ORDER BY score DESC
yields:
#1191 - Can't find FULLTEXT index matching the column list
Which I find to be an inaccurate description of the error, because there are FULLTEXT indices for both of these. I even copied the table, turned it into MyISAM from INNODB, in order to do so.
Any suggestions would be appreciated.

You don't need a fulltext index on each column; you need a single fulltext index covering both columns:
FULLTEXT INDEX (topic, country)
Having a separate index on each column will not work:
FULLTEXT (topic), FULLTEXT (country) /* will not work as expected */
I also think that the column order is important, but I could be mistaken in that regard.
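For the table in the question, the fix might look like the sketch below. The index name ft_topic_country is a hypothetical choice; the table and column names come from the question, and the engine is assumed to support FULLTEXT (MyISAM, or InnoDB on MySQL 5.6 and later).

```sql
-- Assumption: reports2 has text columns `topic` and `country`.
-- ft_topic_country is a hypothetical index name.
ALTER TABLE reports2
  ADD FULLTEXT INDEX ft_topic_country (topic, country);

-- The MATCH() column list now corresponds to a single FULLTEXT index,
-- so error 1191 should no longer be raised.
SELECT *,
       MATCH (topic, country) AGAINST ('China') AS score
FROM reports2
WHERE MATCH (topic, country) AGAINST ('China')
ORDER BY score DESC;
```

Running SHOW INDEX FROM reports2 afterwards is one way to confirm that both columns are listed under the same Key_name.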

Related

Should I create separate MySQL indexes for url_title vs url_title, url_description, url_keywords?

Using MySQL 5.7, I have a table of urls containing url_title, url_description and url_keywords.
Sometimes I just need to look in url_title, but sometimes I look for something in all columns.
Is it better to create one index containing all 3 columns, or to create a separate index for url_title alone plus another index containing all 3 columns?
E.g. will a search on url_title be slower with the 3-column index than with a single-column index?
Or can MySQL search only the given column even if the index contains 3 columns?
Later edit: this is a sample query but I do have other less important variations:
SELECT *
FROM urls
WHERE match(url_title, url_description,
url_keywords, url_paragraphs)
against('red boots' IN BOOLEAN MODE)
LIMIT 500
Update: You didn't mention in your original post that you were talking about fulltext indexes, not conventional B-tree indexes.
Fulltext indexes are a different type. You must specify ALL the columns of the fulltext index in your MATCH() clause. No fewer, and no more, and they must be in the same order as they appear in the index definition.
If you want to do a fulltext search only on a single column sometimes, then you will have to create another fulltext with that single column.
Below is my original answer, that I wrote before you clarified that you were using a fulltext index. Perhaps it will help someone else.
MySQL can use the index if the column(s) you search are the leftmost column(s) of that index. It can use a subset of the columns of a multi-column index.
For example, given an index on (a, b, c), the following query uses all three columns:
SELECT ... WHERE a = ? AND b = ? AND c = ?
The following query uses the first column a of the index, because it's the leftmost column.
SELECT ... WHERE a = ?
The following query uses the first two columns of the index, because they're consecutive and the leftmost subset of columns.
SELECT ... WHERE a = ? AND b = ?
The following query uses only the first column a of the index, because the conditions don't match consecutive columns of the index. It will use the index to narrow down the search to rows matching the a condition, but then it will have to examine each of those rows to evaluate the c condition, even though c is part of the same index.
SELECT ... WHERE a = ? AND c = ?
MySQL has an optimization called index condition pushdown that provides a shortcut for this. It delegates evaluation of the c condition to the storage engine, knowing that c is part of the index. It still counts as examining the row, but it makes the row read a little less costly.
The following query cannot use the index at all, because the conditions are not on leftmost columns of that index.
SELECT ... WHERE b = ? AND c = ?
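The cases above can be checked with EXPLAIN. This is a minimal sketch: the table t, the index name idx_abc and the extra column d are hypothetical, added only so the queries have something to run against.

```sql
-- Hypothetical table; d is a column outside the index.
CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  d INT,
  INDEX idx_abc (a, b, c)
) ENGINE=InnoDB;

EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;  -- all three columns
EXPLAIN SELECT * FROM t WHERE a = 1;                      -- leftmost column
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;            -- leftmost two columns
EXPLAIN SELECT * FROM t WHERE a = 1 AND c = 3;            -- only a narrows the search
EXPLAIN SELECT * FROM t WHERE b = 2 AND c = 3;            -- no leftmost column
```

Compare the key and key_len columns of each plan: key_len grows with the number of index columns used for the lookup, and "Using index condition" in the Extra column indicates index condition pushdown. On an empty or very small table the optimizer may choose differently, so it is worth loading representative data first.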
The guidelines for FULLTEXT indexes and MATCH...AGAINST are different than for INDEX. For this:
SELECT *
FROM urls
WHERE match(url_title, url_description,
url_keywords, url_paragraphs)
against('red boots' IN BOOLEAN MODE)
LIMIT 500
(and assuming ENGINE=InnoDB), you need a FULLTEXT index with all 4 columns in it.
FULLTEXT(url_title, url_description,
url_keywords, url_paragraphs)
If you might also be searching, say, just url_title in another query, then you would also need FULLTEXT(url_title). (Etc)
See if either of these would be 'better' for your application:
against('+red +boots' IN BOOLEAN MODE)
against('red boots')
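The two-index setup described above could be sketched as follows. The index names ft_url_all and ft_url_title are hypothetical; the columns are taken from the sample query.

```sql
-- Hypothetical index names. The indexes are added in separate statements
-- because, as far as I know, InnoDB builds only one FULLTEXT index per ALTER TABLE.
ALTER TABLE urls
  ADD FULLTEXT INDEX ft_url_all (url_title, url_description,
                                 url_keywords, url_paragraphs);
ALTER TABLE urls
  ADD FULLTEXT INDEX ft_url_title (url_title);

-- Search all four columns; the column list matches ft_url_all.
SELECT *
FROM urls
WHERE MATCH(url_title, url_description, url_keywords, url_paragraphs)
      AGAINST('+red +boots' IN BOOLEAN MODE)
LIMIT 500;

-- Search the title only; the column list matches ft_url_title.
SELECT *
FROM urls
WHERE MATCH(url_title)
      AGAINST('+red +boots' IN BOOLEAN MODE)
LIMIT 500;
```

In boolean mode, '+red +boots' requires both words to be present, while 'red boots' matches rows containing either word.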

How do you order the indexing columns in MySQL if you are using order by in your query?

I am reading an article about how Pinterest shards their MySQL database: https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f
And here they have an example of a table:
CREATE TABLE board_has_pins (
board_id INT,
pin_id INT,
sequence INT,
INDEX(board_id, pin_id, sequence)
) ENGINE=InnoDB;
And they are showing how they query from that table:
SELECT pin_id FROM board_has_pins
WHERE board_id=241294561224164665 ORDER BY sequence
LIMIT 50 OFFSET 150
What I don't understand here is the ordering of the index. Would it not make more sense if the index was like this since they are ordering by sequence and filtering by board_id?
INDEX(board_id, sequence, pin_id)
Am I missing something here or have I misunderstood how indexing works?
You are correct. The better index for this query is:
INDEX(board_id, sequence, pin_id)
The columns should be in this order:
1. Column(s) involved in equality comparisons. If there are multiple columns, their order does not matter.
2. Column(s) involved in the ORDER BY clause, in the same order they appear in the ORDER BY.
3. Other columns used to fetch values, like pin_id.
Once the equality conditions find the subset of matching rows, they are all tied with respect to their order, because naturally they all have the same value for the column of the equality condition (board_id in this case).
The tie is resolved by the order of the next column in the index. If (and only if) the next column is the one used in the ORDER BY clause, then the rows can be read in index order, with no further work needed to sort them.
I can't explain the choice made in the Pinterest blog post you linked to. My guess is that it's a mistake, because the index is not optimal for the query they showed.
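One way to check this is to compare EXPLAIN output for the two index orders. The sketch below reuses the table from the question; the index name idx_board_seq_pin is hypothetical, and BIGINT is an assumption made here because the board_id literal in the sample query does not fit in a 32-bit INT.

```sql
-- Assumption: BIGINT ids, since 241294561224164665 exceeds the INT range.
-- idx_board_seq_pin is a hypothetical index name.
CREATE TABLE board_has_pins (
  board_id BIGINT,
  pin_id   BIGINT,
  sequence INT,
  INDEX idx_board_seq_pin (board_id, sequence, pin_id)
) ENGINE=InnoDB;

EXPLAIN
SELECT pin_id FROM board_has_pins
WHERE board_id = 241294561224164665
ORDER BY sequence
LIMIT 50 OFFSET 150;
```

With this column order one would expect the Extra column to show no "Using filesort". Repeating the test with INDEX(board_id, pin_id, sequence) should show the extra sort step.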

how to query and display rows of an Index table (MySql DBMS)

I am quite new to the SQL world and have read many blog posts about indexes.
I have two questions:
1- How can I query an index table directly and select and display its contents?
For example, suppose table A has two indexes named A_index_1 and A_index_2. I want to do something like select * from A.A_index_2 to display its contents and experiment with it.
2- The second question is more complicated. According to this link,
the rows in a multi-column index are sorted, and it claims that the column next to the leftmost column is sorted as well. Now suppose we have an index with columns like this:
IDX1 on Table A : Country | Province | City | Street | Shop
and suppose there are tons of rows that share the same country, province and city. If we query: select * from A where Country='c' order By Province
then, according to the link, because Province is adjacent to Country in the index it is already sorted, so the sorting part of that query is skipped (a no-op). Now suppose we want to run this query: select * from A where Country='c' and Province='p' Order By City
The question is:
Among rows with the same province, is the City column in the index sorted as well?
And will the sorting step be skipped because of that?
The same question applies to the other columns in the index whose preceding columns hold the same data.
I found it myself.
1- According to this link, you can visualize the index table and query it like this:
Visualizing an index helps in understanding what queries the index supports. You can query the database to retrieve the entries in index order
SELECT <INDEX COLUMN LIST>
FROM <TABLE>
ORDER BY <INDEX COLUMN LIST>
If you put the index definition and table name into the query, you will get a sample from the index.
2- According to the link, a multi-column index is ordered starting from the leftmost column. Rows that share the same value in the leftmost column are ordered by the second column from the left, rows that share the same values in the first and second columns are ordered by the third column, and so forth.
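Applied to the table from the question, the idea might look like this. It is a sketch that assumes table A has an index IDX1 on (Country, Province, City, Street, Shop), as described in the question.

```sql
-- Assumption: IDX1 on A covers (Country, Province, City, Street, Shop).
-- Listing the indexed columns in index order shows the rows as the index stores them.
SELECT Country, Province, City, Street, Shop
FROM A
ORDER BY Country, Province, City, Street, Shop;

-- Check whether the sort by City can be satisfied from the index order.
EXPLAIN
SELECT * FROM A
WHERE Country = 'c' AND Province = 'p'
ORDER BY City;
```

If the Extra column of the EXPLAIN output does not mention "Using filesort", the rows were read in index order and no separate sort was needed.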
With whatever knowledge I have gathered while working with Oracle DB, I can say the following:
There's nothing like query 'from' index. You can query 'on' index.
Let's say you have a table:
Col1 VARCHAR2(10)
Col2 NUMBER
Col3 DATE
And this table is indexed on Col3. In that case, if you want to use the index, your query should have a WHERE clause that filters on Col3.
So basically, indexes make queries faster when you are filtering results, provided the leading column(s) of the index are used in the WHERE clause of your query.
So following query uses index:
SELECT * FROM myTab WHERE Col3 BETWEEN TRUNC(SYSDATE -1) AND TRUNC(SYSDATE)
while following query does not:
SELECT * FROM myTab WHERE Col1 = 'Hello!'
I would add that indexes are a good idea only when you know what you are going to do. Too many indexes are also bad practice. A good approach is to index the columns that are frequently used in your filtering queries (the WHERE clause of the query).
What you posted in your question, is something like a partition. Which is a logical space in the table. So if you want to query some specific set of data, you do something like:
SELECT * FROM myTab PARTITION(myPart);

LIKE % or AGAINST for FULLTEXT search?

I am trying to find a fast and efficient way to fetch records using keywords as the search input.
Our MySQL table MASTER contains 30,000 rows and has 4 fields.
ID
title (FULLTEXT)
short_descr (FULLTEXT)
long_descr (FULLTEXT)
Can anyone suggest which one is more efficient?
LIKE %
MYSQL's AGAINST
It would be nice if someone could write a SQL query for the keywords
Weight Loss Secrets
SELECT id FROM MASTER
WHERE (title LIKE '%Weight Loss Secrets%' OR
short_descr LIKE '%Weight Loss Secrets%' OR
long_descr LIKE '%Weight Loss Secrets%')
Thanks in advance
The FULLTEXT index should be faster. It may be a good idea to add all the columns to one fulltext index.
ALTER TABLE MASTER
ADD FULLTEXT INDEX `FullTextSearch`
(`title`, `short_descr`, `long_descr`);
Then execute the search using IN BOOLEAN MODE:
SELECT id FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE);
This will find rows that contain all 3 keywords.
However, this won't give you an exact phrase match; the keywords just need to be present in the same row.
If you also want an exact match you could do it like this, but it's a bit hacky and would only work if your table doesn't get too big.
SELECT id FROM
(
-- id must be selected here so the outer query can return it
SELECT id, CONCAT(title,' ',short_descr,' ', long_descr) AS SearchField
FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE)
) result WHERE SearchField LIKE '%Weight Loss Secrets%'
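Boolean mode also supports phrase search with double quotes, which may be a simpler alternative to the LIKE post-filter. This is a sketch, not part of the original answer; it assumes the same three-column FULLTEXT index on MASTER.

```sql
-- Assumption: a FULLTEXT index on (title, short_descr, long_descr) exists.
-- The double quotes inside the search string request the words as a phrase.
SELECT id
FROM MASTER
WHERE MATCH (title, short_descr, long_descr)
      AGAINST ('"Weight Loss Secrets"' IN BOOLEAN MODE);
```

The phrase is matched on indexed words rather than raw characters, so stopword and minimum word length settings apply, and results can differ from a LIKE '%...%' comparison.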

MySQL fulltext search over multiple columns: result confusion

I have a search query which performs a fulltext search on the DB.
$sql = "SELECT
*
FROM
`tbl_auction_listing` AS `al`
JOIN
`tbl_user` AS `u` ON `al`.`user_id` = `u`.`user_id`
LEFT JOIN
`tbl_gallery_details` AS `gd` ON `al`.`user_id` = `gd`.`user_id`
LEFT JOIN
`tbl_self_represented_details` AS `sr` ON `u`.`user_id` = `sr`.`user_id`
WHERE
`al`.`status` = '" . ACTIVE . "'
AND
`al`.`start_date` < NOW()
AND
`al`.`end_date` > NOW()
AND
MATCH(`al`.`listing_title`,
`al`.`description`,
`al`.`provenance`,
`al`.`title`,
`al`.`artist_full_name`,
`al`.`artist_first_name`,
`al`.`artist_last_name`,
`sr`.`artist_name`,
`gd`.`gallery_name`,
`u`.`username`) AGAINST('$search_query' IN BOOLEAN MODE)";
When I search for 'Cardozo, Horacio' or 'cardozo' or 'horacio' I get no results; however, I know there is an artist with 2 records in the db with artist_full_name = Cardozo, Horacio.
If I remove all MATCH fields and just have al.artist_full_name I get 2 results. If I add in al.description I get 1 result because 'Horacio Cardozo' exists in the description.
Is there a way to have the search return all records if any condition (any search query word) is met in any of the MATCH fields? I tried removing IN BOOLEAN MODE but that produced the same results.
It appears that InnoDB tables do not allow searches over several fulltext indexes in the same MATCH() condition.
Here your fields do not all belong to the same table, therefore they are covered by different indexes. Notice the same limitation applies if you had a table like this:
CREATE TABLE t (
f1 VARCHAR(20),
f2 VARCHAR(20),
FULLTEXT(f1), FULLTEXT(f2)
) ENGINE=InnoDB;
SELECT * FROM t
WHERE MATCH(f1, f2) AGAINST ('something in f2'); -- likely to return no row
It looks like a fulltext search may only search on the first fulltext index it encounters, but this is only something I deduce from this experience; please do not take it for granted.
The bottom line is that you should split your search so as to use one single fulltext index per MATCH() clause:
SELECT * FROM auction, user, gallery, ...
WHERE
MATCH(auction.field1, auction.field2) AGAINST ('search query' IN BOOLEAN MODE) OR
MATCH(auction.field3) AGAINST ('search query' IN BOOLEAN MODE) OR
MATCH(user.field1, user.field2, user.field3) AGAINST...
This is an illustration of a possible query if you had two distinct indexes on auction and one on user. You need to adapt it to your actual structure (please post your tables' descriptions if you need more guidance).
Notice this only applies to InnoDB tables. Interestingly, MyISAM tables do not seem to show the same limitation.
Update: it turns out this was a bug in the InnoDB engine, fixed in 5.6.13/5.7.2. The above example now rightfully fails with "Can't find FULLTEXT index matching the column list". Indeed, there is no index on (f1, f2), but one on (f1) and another one on (f2). As the changelog advises:
Unlike MyISAM, InnoDB does not support boolean full-text searches on
nonindexed columns, but this restriction was not enforced, resulting
in queries that returned incorrect results.
It is noteworthy that while such queries return a correct result set with MyISAM, they run slower than one might expect, as they silently ignore existing fulltext indexes.
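For the example table t above, two possible ways to stay within the one-index-per-MATCH() rule can be sketched like this. The index name ft_f1_f2 is hypothetical.

```sql
-- Option 1: one MATCH() per existing single-column FULLTEXT index.
SELECT * FROM t
WHERE MATCH(f1) AGAINST ('something' IN BOOLEAN MODE)
   OR MATCH(f2) AGAINST ('something' IN BOOLEAN MODE);

-- Option 2: add an index whose column list matches MATCH(f1, f2).
-- ft_f1_f2 is a hypothetical index name.
ALTER TABLE t ADD FULLTEXT INDEX ft_f1_f2 (f1, f2);

SELECT * FROM t
WHERE MATCH(f1, f2) AGAINST ('something' IN BOOLEAN MODE);
```

Option 2 only helps when all the columns live in the same table; for columns spread across joined tables, as in the question, separate MATCH() clauses per table are still needed.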