MySQL - any way to help fulltext search with another index?

Let's say I have an "articles" table with these columns:
article_text: fulltext indexed
author_id: indexed
Now I want to search for a term that appears in an article written by a particular author.
So something like:
select * from articles
where author_id=54
and match (article_text) against ('foo');
The EXPLAIN for this query tells me that MySQL is only going to use the fulltext index.
I believe MySQL can only use one index, but it seems wise to fetch all the articles a particular author has written first, before fulltext searching for the term. So is there any way to help MySQL?
For example, what if you did a self-join?
select articles.* from articles as acopy
join articles on acopy.author_id = articles.author_id
where
articles.author_id = 54
and match(articles.article_text) against ('foo');
The EXPLAIN for this lists the author_id index first, then the fulltext search.
Does that mean it's actually only doing the fulltext search on the limited set as filtered by author_id?
ADDENDUM
The EXPLAIN plan for the self-join is as follows:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: acopy
type: ref
possible_keys: index_articles_on_author_id
key: index_articles_on_author_id
key_len: 5
ref: const
rows: 20
filtered: 100.00
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: articles
type: fulltext
possible_keys: index_articles_on_author_id,fulltext_articles
key: fulltext_articles
key_len: 0
ref:
rows: 1
filtered: 100.00
Extra: Using where
2 rows in set (0.00 sec)

Ok, so, since
Index Merge is not applicable to full-text indexes
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
I would try this approach (replace author_id_index with the name of your index on author_id):
select * from articles use index (author_id_index)
where author_id=54
and match (article_text) against ('foo');
The problem here is the following:
It is indeed impossible to use a regular index in combination with a full-text index.
If you join the table with itself, you are already using an index on each side of the join (the ON clause uses the author_id column, so you definitely need the index there).
You will have to decide which is most efficient by running some test cases: whether using the author index is better than using the text one.
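One way to run that comparison is to put the two plans side by side. This is only a sketch: it assumes the index names shown in the question's EXPLAIN output (index_articles_on_author_id and fulltext_articles). Newer MySQL manuals also note that index hints are silently ignored for natural-language full-text searches, so check the key column of the second plan rather than assuming the hint was honored.

```sql
-- Plan the optimizer picks on its own
-- (the question reports the fulltext index here).
EXPLAIN SELECT * FROM articles
WHERE author_id = 54
  AND MATCH (article_text) AGAINST ('foo');

-- Same query, hinted toward the author_id index.
-- Index name taken from the question's EXPLAIN output.
EXPLAIN SELECT * FROM articles USE INDEX (index_articles_on_author_id)
WHERE author_id = 54
  AND MATCH (article_text) AGAINST ('foo');
```

If the key column is the same in both plans, the hint had no effect on your version and the comparison has to be made some other way, for example by timing the self-join from the question against the plain query on realistic data.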

Related

"SELECT [value]" vs "SELECT [value] FROM [table] LIMIT 1" in MySQL

Which query is better?
SELECT true;
SELECT true FROM users LIMIT 1;
In terms of:
Best practice
Performance
The first query has less overhead because it doesn't reference any tables.
mysql> explain select true\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: NULL
partitions: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
filtered: NULL
Extra: No tables used
Whereas the second query does reference a table, which means it has to spend time:
Checking that the table exists and, if the query references any columns, that those columns exist.
Checking that your user has privileges to read that table.
Acquiring a metadata lock, so no one does any DDL or LOCK TABLES while your query is reading it.
Starting to do an index-scan, even though it will be cut short by the LIMIT.
Here's the explain for the second query for comparison:
mysql> explain select true from mysql.user limit 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: user
partitions: NULL
type: index
possible_keys: NULL
key: PRIMARY
key_len: 276
ref: NULL
rows: 8
filtered: 100.00
Extra: Using index
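The first query always returns one row with the value true.
The second query returns one row with the value true only if the users table has at least one row; on an empty table it returns no rows. Without the LIMIT 1 it would return one row per row in users, each with true as its only value.
So if you just need one row, use the first query. Use a query against the table only if the result should depend on what the table contains.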
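The difference in row counts is easy to see on a scratch table. A minimal sketch, assuming a throwaway table named users_demo (a hypothetical name, not the asker's users table):

```sql
-- Scratch table standing in for users (hypothetical name).
CREATE TABLE users_demo (id INT PRIMARY KEY);

-- No table involved: always exactly one row.
SELECT TRUE;

-- Table is empty: zero rows come back.
SELECT TRUE FROM users_demo LIMIT 1;

INSERT INTO users_demo VALUES (1), (2);

-- Table now has rows: LIMIT 1 caps the result at one row.
SELECT TRUE FROM users_demo LIMIT 1;

DROP TABLE users_demo;
```

If the intent of the second form is "does the table contain anything?", SELECT EXISTS (SELECT 1 FROM users_demo) states that directly and always returns a single row holding 0 or 1.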
In either case, it is obvious you want the value of TRUE :) With this intention, the "SELECT TRUE" is the most efficient as it won't cause MySQL to go further looking for users table, no matter how many rows in it, and then go even further with "LIMIT 1" if there are rows!
By the term BEST PRACTICE, I am not sure what you meant here, because, from my point of view, this doesn't even require a PRACTICE, let alone BEST, as I fail to see any real life application of this approach.

MySQL LIKE performance: is OR with an indexed prefix pattern better than '%...%' alone?

Is it better to use this SQL code, assuming the right index is applied to the column?
Suppose Constant is input from a text field.
select ...
from .....
where lower(column) like 'Constant%' or lower(column) like '%Constant%'
Is that better than this?
select ...
from .....
where lower(column) like '%Constant%'
In the first query I try to match Constant with a prefix LIKE that can use an index, hoping to get lucky with a match, and only then fall back to the full substring match.
All I want is for performance not to decrease. If both queries run in the same time, or if the first one sometimes gets a performance boost, that is OK with me.
I use lower() because we use DEFAULT CHARSET=utf8 COLLATE=utf8_bin.
I created a little table:
create table dotdotdot (
col varchar(20),
othercol int,
key(col)
);
I did an EXPLAIN on a query similar to the one you showed:
explain select * from dotdotdot where lower(col) = 'value'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dotdotdot
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
Notice the type: ALL which means it can't use the index on col. By using the lower() function, we spoil the ability for MySQL to use the index, and it has to resort to a table-scan, evaluating the expression for every row. As your table gets larger, this will get more and more expensive.
And it's unnecessary anyway! String comparisons are case-insensitive in the default collations. So unless you deliberately declared your table with a case-sensitive collation or binary collation, it's just as good to skip the lower() function call, so you can use an index.
Example:
explain select * from dotdotdot where col = 'value'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dotdotdot
partitions: NULL
type: ref
possible_keys: col
key: col
key_len: 23
ref: const
rows: 1
filtered: 100.00
Extra: NULL
The type: ref indicates the use of a non-unique index.
Also compare this to pattern-matching with a leading wildcard. This also defeats the use of an index, so MySQL has to do a table-scan.
explain select * from dotdotdot where col like '%value%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dotdotdot
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
Using wildcards like this for pattern-matching is terribly inefficient!
Instead, you need to use a fulltext index.
You might like my presentation Full Text Search Throwdown and the video here: https://www.youtube.com/watch?v=-Sa7TvXnQwY
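As a rough illustration of the fulltext route on the demo table above (a sketch only; the index name ft_col is made up, and InnoDB supports FULLTEXT indexes only from MySQL 5.6 onward):

```sql
-- Add a full-text index on the demo column (index name is illustrative).
ALTER TABLE dotdotdot ADD FULLTEXT INDEX ft_col (col);

-- Word search that can be served by the full-text index
-- instead of a table scan; check the key column in the plan.
EXPLAIN SELECT * FROM dotdotdot
WHERE MATCH (col) AGAINST ('value');

-- Word-prefix search needs boolean mode and a trailing *.
SELECT * FROM dotdotdot
WHERE MATCH (col) AGAINST ('val*' IN BOOLEAN MODE);
```

Keep in mind that a fulltext index matches whole words (or word prefixes in boolean mode), not arbitrary substrings, so it is not an exact replacement for LIKE '%value%'.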
In the other answer you ask if using OR helps. It doesn't.
explain select * from dotdotdot where col like 'value%' or col like '%value%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dotdotdot
partitions: NULL
type: ALL
possible_keys: col
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
Notice the optimizer identifies the col index as a possible key, but then ultimately decides not to use it (key: NULL).
No, this would not improve the query performance significantly.
MySQL will match the WHERE clause "per row" and therefore inspect ALL of the conditions before proceeding to the next row. Hitting the index first may slightly increase the performance if there is a match, but this gain will most likely be overtaken by the double evaluation in case the first condition does not match.
What could have helped is :
1) run the query with like 'Constant%'
2) run another query with like '%Constant%'
in which case, the first one may be accelerated if there is a match.
However, you will most likely suffer from the overhead and perform worse in 2 queries than in one.
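Moreover, the LIKE operator is case-insensitive under the default (_ci) collations, so lower(column) is unnecessary there. With a binary collation such as utf8_bin, as in the question, the comparison is case-sensitive.
Meanwhile, if you expect your data to match mainly on the first condition and only rarely on the second, then yes, this could bring an improvement, because the second condition is not evaluated when the first one matches.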
Using LOWER() prevents use of the index. So, switch to a ..._ci collation and ditch the LOWER.
Consider a FULLTEXT index; it is much faster than LIKE '%...'. The former is fast; the latter is a full table scan.
OR is almost always a performance killer.
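Switching the collation and dropping LOWER() might look like the sketch below. It reuses the dotdotdot demo table from the earlier answer and assumes utf8_general_ci is an acceptable collation for the data; adjust the column definition to match the real table, since MODIFY replaces the whole definition.

```sql
-- Make the column case-insensitive so LOWER() is no longer needed.
-- MODIFY restates the full column definition; add NOT NULL, DEFAULT,
-- etc. here if the real column has them.
ALTER TABLE dotdotdot
  MODIFY col VARCHAR(20) CHARACTER SET utf8 COLLATE utf8_general_ci;

-- A prefix pattern on the bare column is eligible for the index on col;
-- on a tiny table the optimizer may still prefer a scan.
EXPLAIN SELECT * FROM dotdotdot WHERE col LIKE 'value%';
```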

Index of three columns in mySQL

I have three columns a, b and c, and I have indexed them as (a,b,c). I have a query like this:
SELECT * FROM tablename WHERE a=something and c=someone
My question is: does this query use this index or not?
It may use the first column (a) of the index, but it can't use the third column (c).
One way to tell is to look at the output of EXPLAIN.
Here's an example:
mysql> create table tablename (a int, b int, c int, key (a,b,c));
...I filled it with some random data...
mysql> explain SELECT * FROM tablename WHERE a=125 and c=456\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tablename
type: ref
possible_keys: a
key: a
key_len: 5
ref: const
rows: 20
Extra: Using where; Using index
The above shows ref: const, which means only one constant value is used to find rows in the index. The key_len: 5 also shows that only part of the index is used: each nullable INT column takes 5 bytes in the key (4 for the value plus 1 for the NULL flag), so an entry covering all three integers would be 15 bytes.
mysql> explain SELECT * FROM tablename WHERE a=125 and b = 789 and c=456\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tablename
type: ref
possible_keys: a
key: a
key_len: 15
ref: const,const,const
rows: 1
Extra: Using index
When we use conditions on all three columns, it shows ref: const,const,const showing that all three values are being used to look up index entries. And the key_len is large enough to be an entry of three integers.
As Mihal says, if you prefix the query with EXPLAIN, the optimizer will tell you if it uses the index or not. Bill is partially correct in that it will only look up the value for a in the index, but if the table only contains the columns a,b and c, then the index is covering and the values for b and c will be retrieved from the index without reference to the table data - but the DBMS will still scan through all values of b and c in the index - not just going directly to the specified value for c.
It may be possible to fudge a query to make it use an index to a greater depth, assuming that b is an integer whose values all fall within the range given below (here, the signed MEDIUMINT range):
SELECT *
FROM tablename
WHERE a='something'
AND b BETWEEN -8388608 AND 8388607
AND c='someone'
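Whether the BETWEEN trick actually narrows the lookup on c is worth checking with EXPLAIN, because the MySQL manual's range-optimization section says that once a key part is compared with a range operator such as BETWEEN, later key parts are no longer used to narrow the range. If the query on a and c is common, a separate index may be simpler. A sketch, with idx_a_c as a made-up index name:

```sql
-- Index whose leading columns are exactly the ones the query filters on.
ALTER TABLE tablename ADD INDEX idx_a_c (a, c);

-- Compare key and key_len with the earlier plans to see
-- whether both columns are now used for the lookup.
EXPLAIN SELECT * FROM tablename WHERE a = 125 AND c = 456;
```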

Best way to index table for speeding up order by

I have the following table structure.
town:
id (MEDIUMINT, PRIMARY KEY, auto-increment),
town (VARCHAR(150), not null),
lat (FLOAT(10,6), not null),
lng (FLOAT(10,6), not null)
I frequently use the query "SELECT * FROM town ORDER BY town". I tried indexing town, but the index is not being used. What is the best way to index the table so that I can speed up my queries?
EXPLAIN output (a UNIQUE INDEX is present on town):
mysql> EXPLAIN SELECT * FROM studpoint_town order by town \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: studpoint_town
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using filesort
1 row in set (0.00 sec)
Regards,
Ravi.
Your EXPLAIN output indicates that currently the studpoint_town table has only 3 rows. As explained in the manual:
The output from EXPLAIN shows ALL in the type column when MySQL uses a table scan to resolve a query. This usually happens under the following conditions:
[...]
The table is so small that it is faster to perform a table scan than to bother with a key lookup. This is common for tables with fewer than 10 rows and a short row length. Don't worry in this case.
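To see what the plan would look like once the index is used, one option is to hint it, or to select only the indexed column. This is a sketch that assumes the unique index on town is itself named town; check SHOW INDEX for the real name.

```sql
-- List the index names defined on the table.
SHOW INDEX FROM studpoint_town;

-- Ask the optimizer to treat a table scan as very expensive,
-- so it reads rows in index order if it can.
EXPLAIN SELECT * FROM studpoint_town FORCE INDEX (town) ORDER BY town;

-- Selecting only the indexed column lets the index cover the query.
EXPLAIN SELECT town FROM studpoint_town ORDER BY town;
```

On a three-row table none of this matters for speed; it only becomes relevant once the table has grown.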

Why isn't MySQL using any of these possible keys?

I have the following query:
SELECT t.id
FROM account_transaction t
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY tc.id
When I do an EXPLAIN the first row shows, among other things, this:
table: t
type: ALL
possible_keys: account_id,transaction_code_id,account_transaction_transaction_code_id,account_transaction_account_number
key: NULL
rows: 465663
Why is key NULL?
Another issue you may be encountering is a data type mismatch. For example, if your column is a string data type (CHAR, for example) and your query compares it to an unquoted number, then MySQL won't use the index.
SELECT * FROM tbl WHERE col = 12345; # No index
SELECT * FROM tbl WHERE col = '12345'; # Index
Source: Just fought this same issue today, and learned the hard way on MySQL 5.1. :)
Edit: Additional information to verify this:
mysql> desc das_table \G
*************************** 1. row ***************************
Field: das_column
Type: varchar(32)
Null: NO
Key: PRI
Default:
Extra:
*************************** 2. row ***************************
[SNIP!]
mysql> explain select * from das_table where das_column = 189017 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: das_table
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 874282
Extra: Using where
1 row in set (0.00 sec)
mysql> explain select * from das_table where das_column = '189017' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: das_table
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 34
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)
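When the value arrives as a number, it can be turned into a string before the comparison so that both sides have the same type. A small sketch using the das_table example above:

```sql
-- Quote the literal so it is compared as a string.
EXPLAIN SELECT * FROM das_table WHERE das_column = '189017';

-- Or convert a numeric value to a string explicitly;
-- the conversion applies to the constant, not to the indexed column.
EXPLAIN SELECT * FROM das_table WHERE das_column = CAST(189017 AS CHAR);
```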
It might be because the statistics are out of date, or because MySQL knows that you always have a 1:1 ratio between the two tables.
You can force an index to be used in the query and see whether that speeds things up. If it does, try running ANALYZE TABLE to make sure the statistics are up to date.
By specifying USE INDEX (index_list), you can tell MySQL to use only one of the named indexes to find rows in the table. The alternative syntax IGNORE INDEX (index_list) can be used to tell MySQL to not use some particular index or indexes. These hints are useful if EXPLAIN shows that MySQL is using the wrong index from the list of possible indexes.
You can also use FORCE INDEX, which acts like USE INDEX (index_list) but with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the given indexes to find rows in the table.
Each hint requires the names of indexes, not the names of columns. The name of a PRIMARY KEY is PRIMARY. To see the index names for a table, use SHOW INDEX.
From http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
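Applied to the query in the question, that advice might look like the sketch below. The index name transaction_code_id comes from the possible_keys list in the question; whether it is the right index to force is something only a test on the real data can show.

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE account_transaction;

-- Re-check the plan with one of the candidate indexes forced.
-- The hint goes after the table alias.
EXPLAIN SELECT t.id
FROM account_transaction t FORCE INDEX (transaction_code_id)
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY tc.id;
```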
Index for the group by (=implicit order by)
...
GROUP BY tc.id
The GROUP BY does an implicit sort on tc.id.
tc.id is not listed as a possible key,
but t.transaction_code_id is.
Change the code to
SELECT t.id
FROM account_transaction t
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY t.transaction_code_id
This will put the potential index transaction_code_id into view.
Indexes for the joins
If the joins (nearly) fully join the three tables, there's no need to use the index, so MySQL doesn't.
Other reasons for not using an index
If a large percentage of the rows under consideration (40% IIRC) contain the same value, MySQL does not use the index, because not using it is faster.