SQL query running really slow - mysql

I am running this query to search the database:
SELECT
IFNULL(firstname, '') AS firstname,
IFNULL(lastname, '') AS lastname,
IFNULL(age, ' ') AS age,
email,
telephone,
comments,
ref
FROM person
RIGHT JOIN
order ON person.oID = order.ref
WHERE
LOWER(firstname) LIKE LOWER ('%{$search}%') OR
LOWER(lastname) LIKE LOWER ('%{$search}%') OR
LOWER(email) LIKE LOWER ('%{$search}%') OR
LOWER(telephone) LIKE LOWER ('%{$search}%') OR
LOWER(ref) LIKE LOWER ('%{$search}%');
It's doing a lot of processing, but how can I get these results faster? The page takes about 6-7 seconds to load. If I run the query in phpMyAdmin, it takes 3-4 seconds. It's not a huge database, 3000 entries or so. I have added an index to the ref, email, firstname and lastname columns, but that doesn't seem to have made any difference. Can anyone help?

This query is slow because you've combined two convenient but slow features of MySQL in the slowest possible way.
FUNCTION(column) LIKE '%matchstring%' requires a scan of the table; no ordered index can help satisfy this search because it's unanchored.
condition OR condition OR condition across different columns means no single index can narrow the search, so every one of those conditions may have to be evaluated for every row.
You also happen to be ignoring the fact that MySQL's searches are already case-insensitive if you have set up your column collations correctly.
Finally, it's not clear what you're doing with the RIGHT JOINed table data. Which columns of your result set come from that table? If you don't need data from that table get rid of it.
So, in summary, what you have is slow x many.
So, how can you fix this? The most important thing is for you to get rid of as many of these unanchored scans as possible. If you can change them to
email LIKE '{$search}%'
so the LOWER() functions and leading %s in the LIKE terms can be eliminated, you will have a big win.
If this sort of cast-a-wide-net search feature is critical to your application, you should consider using MySQL fulltext searching.
Or you could consider creating a new column in your table that's the concatenation of all the columns you presently search, so you can search it just once.
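A minimal sketch of that concatenated-column idea follows. The column name search_text and its size are made up for illustration, and it assumes firstname, lastname, email and telephone all live in the person table; adjust to your schema.

```sql
-- search_text is a hypothetical column name; size it to fit your data.
ALTER TABLE person ADD COLUMN search_text VARCHAR(1000);

-- CONCAT_WS skips NULL values, so a missing firstname does not blank the row.
UPDATE person
SET search_text = CONCAT_WS(' ', firstname, lastname, email, telephone);

-- One unanchored LIKE instead of one per column.
SELECT firstname, lastname, email, telephone
FROM person
WHERE search_text LIKE '%smith%';
```

This is still a scan, just a single pattern match per row instead of five, and the column has to be refreshed whenever the underlying columns change.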
Edit to explain LIKE slowness
If the column haystack is indexed, the search haystack LIKE 'needle%' runs quite quickly. That's because the BTREE style index is inherently ordered. To search this way, MySQL can random-access the first possible match, and then scan sequentially to the last possible match.
But the search haystack LIKE '%needle%' can't use random access to find the first possible match in the index. The first possible match could be anywhere. So it has to scan all the values of the haystack one by one for the needle.
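You can see the difference yourself with EXPLAIN. This is only a sketch: the index name ix_person_email is hypothetical, and it assumes email is a column of person.

```sql
-- Hypothetical index name; skip this if email is already indexed.
CREATE INDEX ix_person_email ON person (email);

-- Anchored pattern: the optimizer can do a range scan on the index.
EXPLAIN SELECT email FROM person WHERE email LIKE 'smith%';

-- Unanchored pattern: every value has to be examined.
EXPLAIN SELECT email FROM person WHERE email LIKE '%smith%';
```

Compare the type and rows columns of the two plans. The anchored search will typically show a range access with a small row estimate, while the unanchored one has to walk the whole index or table.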

I would suggest that you change the right join to an inner join. The fields that you are looking for look like they are coming from the person table anyway, so the where clause is turning the query into an inner join.
SELECT
IFNULL(firstname, '') AS firstname,
IFNULL(lastname, '') AS lastname,
IFNULL(age, ' ') AS age,
email,
telephone,
comments,
ref
FROM person INNER JOIN
`order`
ON person.oID = `order`.ref
WHERE
LOWER(firstname) LIKE LOWER ('%{$search}%') OR
LOWER(lastname) LIKE LOWER ('%{$search}%') OR
LOWER(email) LIKE LOWER ('%{$search}%') OR
LOWER(telephone) LIKE LOWER ('%{$search}%') OR
LOWER(ref) LIKE LOWER ('%{$search}%');
Second, create an index on order(ref). This should greatly reduce the search space for the where clause. The syntax is:
create index order_ref on `order`(ref);
By the way, order is a bad name for a table, because it is a SQL reserved word. I would suggest orders instead.

Why don't you use full-text search instead of a bunch of ORs and LOWER()s?
SELECT
IFNULL(firstname, '') AS firstname,
IFNULL(lastname, '') AS lastname,
IFNULL(age, ' ') AS age,
email,
telephone,
comments,
ref
FROM person
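RIGHT JOIN
`order` ON person.oID = `order`.ref
WHERE
MATCH (firstname, lastname, email, ref)
AGAINST ('$search' IN BOOLEAN MODE)
MATCH() takes plain column names, so LOWER() is not used here; full-text matching is case-insensitive with the usual collations anyway. For this to work you need a FULLTEXT index on exactly the columns listed in MATCH():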
ALTER TABLE person ADD FULLTEXT(firstname, lastname,email,ref);

How to get the closest matches first from MySQL

I have a table of 10 million records and am trying to select user details like firstname, lastname and country. I need the results ordered so that rows matching a condition (order by column="abc") are ranked at the top.
what I have tried
Query one
-- this is much slower, at 45+ seconds
select firstname, lastname, town
from user_db
order by town="abc" DESC
limit 25;
Query two
-- much faster with 0.00019 seconds
select firstname, lastname, town
from user_db
order by town DESC
limit 25;
The problem
The first query works but takes 45+ seconds, while if I remove the equality expression from the order by clause, as in the second query, it's much faster. Obviously I do use where clauses in practice; this is a simplified example.
other notes
There are currently no joins in the query, as it is just a simple select statement of user details, and my setup is pretty good, with 30GB RAM and 2TB of storage, all local.
Indexes: All columns mentioned have indexes, but the (order by town="abc") clause triggers a full table scan and, as a result, this ends up taking 2 minutes to finish.
Is there a way to get results ranked by closest matches first faster within a single query?
Any help will gladly be appreciated. Thank you.
It looks to me like your user_db table has an index on your town column. That means ORDER BY town DESC LIMIT 25 can be satisfied in O(1) constant time by random-accessing the index to the last row and then scanning 25 rows of the index.
But your ORDER BY town='abc' DESC LIMIT 25 has to look at, and sort, every single row in the table. MySQL doesn't use an index to help compute that town='abc' condition when it appears in an ORDER BY clause.
Many people with requirements like yours use FULLTEXT searching and ordering by the MATCH() function. That gets a useful ordering for a person looking at the closest matches like in the searching location bar of a web browser. But don't expect Google-like match accuracy from MySQL.
You can decouple the query into two queries each one being very fast.
First, create an index on town.
create index ix1 on user_db (town);
Then get the matches, with a limit of 25 rows:
select * from user_db where town = 'abc' limit 25
The query above may return any number of rows between 0 and 25: let's call this number R. Then, get the non-matches:
select * from user_db where town <> 'abc' limit 25 - R
Assemble both result sets and problem solved. Even if the second query results in a table scan, it will be concluded earlier resulting in a low cost.
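If you want the two steps in one statement, something like the following sketch should behave the same way. The is_match column is a made-up helper used only for ordering, and the column list is taken from the question.

```sql
-- Matches first (at most 25), served by the index on town.
(SELECT firstname, lastname, town, 1 AS is_match
 FROM user_db
 WHERE town = 'abc'
 LIMIT 25)
UNION ALL
-- Then at most 25 non-matches to fill the remaining slots.
(SELECT firstname, lastname, town, 0 AS is_match
 FROM user_db
 WHERE town <> 'abc'
 LIMIT 25)
-- Only the (at most 50) combined rows are sorted here, not the whole table.
ORDER BY is_match DESC
LIMIT 25;
```

Each branch stops after 25 rows, so the final sort works on a tiny intermediate result instead of 10 million rows. The second branch fetches a few more rows than strictly needed, which is the price of not knowing R in advance.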
One way is to add a new column to the select list that holds the value of town="abc", then sort by this column.
I'm rebuilding my workspace right now so I cannot try it properly, but something like:
select firstname, lastname, town, town="abc" as sortme
from user_db
order by sortme desc, town, lastname, firstname
limit 25;
As long as it is unclear what you mean by "closest match", it is difficult to answer your question. Are "abd", "bc" etc. regarded as a close match to "abc"? Should the word "abc" appear in the town, and should it match "abcville"?
There are a number of options.
Appearance of search string
Using a like "%abc%" where clause will find all towns with the string "abc" appearing in it.
select firstname, lastname
from user_db
where town like "%abc%"
order by town
Leave out the first % (that is, use "abc%") if you want to find only towns starting with "abc". The advantage is that this search can probably use the index on town, if there is one. There is no ranking, but you could add a sort.
Use a fulltext index
Create a FULLTEXT index on town:
ALTER TABLE user_db
ADD FULLTEXT(town);
And use this with a match:
SELECT
MATCH(town) AGAINST('abc') AS Relevance,
firstname, lastname
FROM user_db
WHERE MATCH(town) AGAINST('abc')
ORDER BY Relevance DESC
LIMIT 15
MATCH works on whole words, so in this case the string "abc" must appear as a separate word in the town name in order to match. The NATURAL LANGUAGE mode works well for plain text but might not do so for town names.
To be honest, I have no experience with FULLTEXT and MATCH performance, but it is probably well optimized and works fairly well on large tables.
Create additional fields
As storage is cheap and time is not, you might want to consider adding additional fields with search strings or alternative spellings for 'town', creating all the indexes you'll need, and using those as the search source. As this requires analysis of your use case, it is difficult to provide a solution.

Optimizing search query

This might seem to be a redundant question, but I can't find the right answer to this issue.
I have a TableA with more than 50 columns. I am implementing search functionality that looks for a query in about 10 columns of this table. TableA contains more than a million rows.
For this I have created a composite index on these 10 columns.
index (col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10)
Now I am splitting the user's query on spaces, i.e. $search_words = $search_query.split(' ');, and using the individual words in my search query. Example:
SELECT something FROM tableA
WHERE ( MATCH ( col_1, col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10 )
AGAINST ( ' +word1* +word2* +word3* +word4* ' IN BOOLEAN MODE ) )
This query works fine for general searches, but if a user's query contains individual letters, like A E I O Co., it takes too much time. What is the best way to optimise the query, or is there another way to perform the search in this situation?
If you feed a too-short string to InnoDB's FULLTEXT, it returns zero results. So filter out any words that are shorter than innodb_ft_min_token_size.
If necessary, test for them separately using REGEXP '[[:<:]]A[[:>:]]' to look for the 1-letter word A. (MySQL 8.0 no longer supports these word-boundary markers; use '\\bA\\b' there instead.)
Or combine them. This checks for the only 1-letter English words: REGEXP '[[:<:]][AI][[:>:]]'
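Put together, that could look roughly like the sketch below. It assumes the application has already split the search string, sent only the long-enough words to MATCH, and kept the single letter A aside; word1 and word2 are the placeholders from the question.

```sql
-- Minimum word length that InnoDB indexes (3 unless it has been changed).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Long words go through the FULLTEXT index; the 1-letter word is
-- checked with REGEXP only on the rows that MATCH has already kept.
SELECT something
FROM tableA
WHERE MATCH (col_1, col_2, col_3, col_4, col_5,
             col_6, col_7, col_8, col_9, col_10)
      AGAINST ('+word1* +word2*' IN BOOLEAN MODE)
  AND CONCAT_WS(' ', col_1, col_2, col_3, col_4, col_5,
                col_6, col_7, col_8, col_9, col_10)
      REGEXP '[[:<:]]A[[:>:]]';   -- on MySQL 8.0 write '\\bA\\b' instead
```

If the search string contains only short words there is nothing left for MATCH, and the REGEXP alone means a full scan, so you may prefer to reject such searches outright.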

How to make mysql query fast while searchs with like

I have three tables and I have to search them with a LIKE match. The query runs over 10,000 records. It works fine but takes 4 seconds to give results. What can I do to improve the speed and take it down to 1 second?
profile_category_table
----------------------
restaurant
sea food restaurant
profile_keywords_table
----------------------
rest
restroom
r.s.t
company_profile_table
---------------------
maha restaurants
indian restaurants
Query:
SELECT name
FROM (
(SELECT PC_name AS name
FROM profile_category_table
WHERE PC_status=1
AND PC_parentid!=0
AND (regex_replace('[^a-zA-Z0-9\-]','',remove_specialCharacter(PC_name)) LIKE '%rest%')
GROUP BY PC_name)
UNION
(SELECT PROFKEY_name AS name
FROM profile_keywords_table
WHERE PROFKEY_status=1
AND (regex_replace('[^a-zA-Z0-9\-]','',remove_specialCharacter(PROFKEY_name)) LIKE '%rest%')
GROUP BY PROFKEY_name)
UNION
(SELECT COM_name AS name
FROM company_profile_table
WHERE COM_status=1
AND (regex_replace('[^a-zA-Z0-9\-]','',remove_specialCharacter(COM_name)) LIKE '%rest%')
GROUP BY COM_name))a
ORDER BY IF(name LIKE '%rest%',1,0) DESC LIMIT 0, 2
I have added indexes on those columns too.
If a user searches for the text rest in the textbox, the auto-suggestion results should be:
results
restaurant
sea food restaurant
maha restaurants
indian restaurants
rest
restroom
r.s.t
I used regex_replace('[^a-zA-Z0-9-]','',remove_specialCharacter(COM_name)) to remove special characters from the field value and to match against the keyword.
There are lots of things you can consider:
The main performance killer here is probably regex_replace(...) LIKE '%FOO%'. Because you are applying a function to the columns, indexes cannot be used, leaving you with several full table scans. Not to mention that a regex replace is heavyweight. To optimize, you may:
Keep a separate column that stores the "sanitized" data and create an index on it, so that your query becomes where pc_name_sanitized like '%FOO%'
I am not sure whether it is available in MySQL, but a lot of DBMSs have a feature called a function-based index. You could consider using it to index the result of the regex replace function.
However, even after the above changes, you will find the performance is not very attractive. In most cases, using LIKE with a wildcard at the front prevents indexes from being used. If possible, try to do an exact match, or have the beginning of the string provided, e.g. where pc_name_sanitized like 'FOO%'
As other users have mentioned, using UNION is also a performance killer. Try to use UNION ALL instead if possible.
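For one of the three tables, the sanitized-column idea could be sketched like this. The column name PC_name_sanitized, its size and the index name are made up, and regex_replace() and remove_specialCharacter() are assumed to be the asker's own functions, called exactly as in the question.

```sql
-- Hypothetical column holding the pre-cleaned name.
ALTER TABLE profile_category_table
  ADD COLUMN PC_name_sanitized VARCHAR(255);

-- Pay the regex cost once per row here, not once per search.
UPDATE profile_category_table
SET PC_name_sanitized =
    regex_replace('[^a-zA-Z0-9\-]', '', remove_specialCharacter(PC_name));

CREATE INDEX ix_pc_name_sanitized
  ON profile_category_table (PC_name_sanitized);

-- Anchored search on the plain column can use the index.
SELECT PC_name AS name
FROM profile_category_table
WHERE PC_status = 1
  AND PC_parentid != 0
  AND PC_name_sanitized LIKE 'rest%';
```

The same pattern would apply to the other two tables. The sanitized column has to be maintained on insert and update, for example from the application or a trigger.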
I'm going to say don't filter in the query. Do that in whatever language you're programming in. regex_replace is a heavy operation regardless of the environment, and you're doing it several times in a query over 10,000 records with a union of who knows how many more.
Rewrite it completely.
UNION statements are killing performance, and you're doing the LIKE on too many fields.
Moreover, you're searching a temporary table (SELECT field FROM (...subquery...)), which has no indexes, so it is really slow (every row has to go through a full scan).
Since you use UNION between all the queries, you can remove the GROUP BY from each of them. And since you select only names containing "rest", you can also remove IF(name LIKE '%rest%',1,0) from the ORDER BY clause.

MySql Explain ignoring the unique index in a particular query

I started looking into indexes in depth for the first time and began analyzing our db, starting with the users table. I searched SO for a similar question but was not able to frame my search well, I guess.
While going through a particular concept, this first observation left me wondering about the difference between these two EXPLAINs (the first query uses 'a%' while the second uses 'ab%').
[Total number of rows in users table = 9193]:
1) explain select * from users where email_address like 'a%';
(Actual matching rows = 1240)
2) explain select * from users where email_address like 'ab%';
(Actual matching rows = 109)
The index looks like this :
My question:
Why is the index totally ignored in the first query? Does MySQL think it is a better idea not to use the index in case 1? If yes, why?
If MySQL estimates, based on the statistics it collects on the distribution of the values, that the matching rows exceed a certain ratio of the total rows (typically 1/11 of the total), it deems it more efficient to simply scan the whole table, reading the disk pages sequentially, rather than use the index and jump around the disk pages in random order.
You could try your luck with this query, which may use the index:
where email_address between 'a' and 'az'
Although doing the full scan may actually be faster.
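If you want to check that for yourself, you can force the index and compare. The index name ix_email_address below is a placeholder, since the real name is not shown in the question.

```sql
-- What the optimizer chooses on its own.
EXPLAIN SELECT * FROM users WHERE email_address LIKE 'a%';

-- Same query with the index forced (index name is hypothetical).
EXPLAIN SELECT * FROM users FORCE INDEX (ix_email_address)
WHERE email_address LIKE 'a%';
```

Then run both versions without EXPLAIN and time them. If the forced version is not faster, the optimizer made the right call.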
This is not a direct answer to your question, but I still want to point it out (in case you don't already know):
Try:
explain select email_address from users where email_address like 'a%';
explain select email_address from users where email_address like 'ab%';
MySQL would now use indexes in both the queries above since the columns of interest are directly available from the index.
Probably, in the case where you do a "select *", index access is more costly, since the optimizer has to go through the index records, find the row ids and then go back to the table to retrieve the other column values.
But in the query above, where you only do a "select email_address", the optimizer knows that all the desired information is available right from the index, and hence it would use the index irrespective of the 30% rule.
Experts, please correct me if I am wrong.

Mysql optimization for REGEXP

This query (with different name instead of "jack") happens many times in my slow query log. Why?
The Users table has many fields (more than these three I've selected) and about 40.000 rows.
select name,username,id from Users where ( name REGEXP
'[[:<:]]jack[[:>:]]' ) or ( username REGEXP '[[:<:]]jack[[:>:]]' )
order by name limit 0,5;
id is primary and autoincrement.
name has an index.
username has a unique index.
Sometimes it takes 3 seconds!
If I explain the select on MySQL I've got this:
select type: SIMPLE
table: Users
type: index
possible keys: NULL
key: name
key len: 452
ref: NULL
rows: 5
extra: Using where
Is this the best I can do? What can I fix?
If you must use regexp-style WHERE clauses, you definitely will be plagued by slow-query problems. For regexp-style search to work, MySQL has to compare every value in your name column with the regexp. And, your query has doubled the trouble by also looking at your username column.
This means MySQL can't take advantage of any indexes, which is how all DBMSs speed up queries of large tables.
There are a few things you can try. All of them involve saying goodbye to REGEXP.
One is this:
WHERE name LIKE CONCAT('jack', '%') OR username LIKE CONCAT('jack', '%')
If you create indexes on your name and username columns this should be decently fast. It will look for all names/usernames beginning with 'jack'. NOTICE that
WHERE name LIKE CONCAT('%','jack') /* SLOW!!! */
will look for names ending with 'jack' but will be slow like your regexp-style search.
Another thing you can do is figure out why your application needs to be able to search for part of a name or username. You can either eliminate this feature from your application, or figure out some better way to handle it.
Possible better ways:
Ask your users to break up their names into given-name and surname fields, and search separately.
Create a separate "search all users" feature that only gets used when a user needs it, thereby reducing the frequency of your slow regexp-style query.
Break up their names into a separate name-words table yourself using some sort of preprocessing program. Search the name-words table without regexp.
Figure out how to use MySQL full text search for this feature.
All of these involve some programming work.
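As a rough sketch of the name-words idea: the table name user_name_words, its column names and sizes are all made up, and it assumes Users.id is an INT. Your preprocessing program would insert one row per word found in name or username.

```sql
-- One row per (word, user); the primary key doubles as the search index
-- and keeps a word that appears in both name and username from repeating.
CREATE TABLE user_name_words (
  word    VARCHAR(100) NOT NULL,
  user_id INT          NOT NULL,
  PRIMARY KEY (word, user_id)
);

-- Whole-word lookup by equality instead of REGEXP.
SELECT u.name, u.username, u.id
FROM user_name_words AS w
JOIN Users AS u ON u.id = w.user_id
WHERE w.word = 'jack'
ORDER BY u.name
LIMIT 0, 5;
```

The equality test on word can use the primary key, so only the users that actually contain the word are read and sorted. The words table has to be kept up to date whenever a name or username changes.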
I got a 50% speedup just by adding fieldname != '' to the where clause. It makes MySQL use indexes.
SELECT name, username, id
FROM users
WHERE name != ''
AND (name REGEXP '[[:<:]]jack[[:>:]]' or username REGEXP '[[:<:]]jack[[:>:]]')
ORDER BY name
LIMIT 0,5;
Not a perfect solution but helps.
Add a LIKE condition in front of the REGEXP, changing the query
from
SELECT cat_ID, categoryName FROM category WHERE cat_ID REGEXP '^15-64-8$' ORDER BY categoryName
to
SELECT cat_ID, categoryName FROM category WHERE cat_ID LIKE '15-64-8%' and cat_ID REGEXP '^15-64-8$' ORDER BY categoryName
Of course, that works only if you are searching for phrases whose beginning you know; otherwise a full-text index is the solution.