Intermittently slow MySQL table - why?

We recently had an issue I'd never seen before, where, for about 3 hours, one of our MySQL tables got extremely slow. This table holds forum posts and currently has about one million rows. The query that became slow is a very common one in our application:
SELECT * FROM `posts` WHERE (`posts`.forum_id = 1) ORDER BY posts.created_at DESC LIMIT 1;
We have an index on the posts table on (forum_id, created_at) which normally allows this query and sort to happen in memory. But during these three hours, not so much: what is normally an instantaneous query took anywhere from 2 to 45 seconds. Then it went back to normal.
I've pored through our slow query log and nothing else looks out of the ordinary. I've looked at New Relic (this is a Rails app) and all other actions ran at essentially their normal speed. We didn't have an unusual number of posts today. I can't find anything else weird in our logs. And the database wasn't swapping; it still had gigabytes of memory available to use.
I'm wondering if MySQL could change its mind back and forth about which index to use for a given query, and whether, for whatever reason, it started deciding to do a full table scan on this query for a few hours today. But if that were true, why would it have stopped doing the full table scans?
Has anyone else encountered an intermittently slow query that defied reason? Or do you have any creative ideas about how one might go about debugging a problem like this?

I'd try the MySQL EXPLAIN statement...
EXPLAIN SELECT * FROM `posts` WHERE (`posts`.forum_id = 1) ORDER BY posts.created_at DESC LIMIT 1;
It may be worth checking the MySQL response time in your Rails code, and if it exceeds a threshold, running the EXPLAIN and logging the details somewhere.
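A server-side alternative, if it is easier than instrumenting the app: have MySQL itself capture anything over a threshold, then EXPLAIN the captured statements by hand. The threshold values here are just examples:
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 2;                -- seconds; slower statements get logged
SET GLOBAL log_queries_not_using_indexes = 1;  -- also catch queries that skip indexes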
Table locking also springs to mind - is the posts table updated by a cron job or a hefty query while the SELECTs are going on?
Hope that helps a bit!

On a site I work on, we recently switched to InnoDB from MyISAM, and we found that some simple select queries which had both WHERE and ORDER BY clauses were using the index for the ORDER BY clause, resulting in a table scan to find the few desired rows (but, heck, they didn't need to be sorted when it finally found them all!)
As noted in the linked article, if you have a small LIMIT value, your ORDER BY clause is the first member of the primary key (so the data on file is ordered by it), and there are many results that match your WHERE clause, using that ORDER BY index isn't a bad idea for MySQL. However, I presume created_at is not the first member of your primary key, so it's not a particularly smart idea in this case.
I don't know why MySQL would switch indexes if you haven't changed anything, but I'd suggest you try running ANALYZE TABLE on the relevant table. You might also change the query to remove the LIMIT and ORDER BY clauses and sort at the application level, provided the result set is small enough; or you could add a USE INDEX hint so it never guesses wrong.
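For example (the index name here is a guess; check SHOW INDEX FROM posts for the real one):
ANALYZE TABLE posts;
SELECT * FROM `posts` USE INDEX (index_posts_on_forum_id_and_created_at)
WHERE (`posts`.forum_id = 1) ORDER BY posts.created_at DESC LIMIT 1;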
You could also change the wait_timeout value to something smaller so that these queries that use a bad index simply never complete (and don't drag down all of the legitimate queries with them). You will still be able to run long queries interactively, even with a small wait_timeout, since there is a separate configuration parameter for interactive sessions.
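Something along these lines; the values are examples only, and note that wait_timeout applies to non-interactive (application) connections while interactive_timeout covers interactive clients such as the mysql CLI:
SET GLOBAL wait_timeout = 30;
SET GLOBAL interactive_timeout = 28800;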

Related

Will record order change between two identical queries in MySQL without ORDER BY?

The problem is that I need to do pagination. I want to use ORDER BY and LIMIT, but my colleague told me MySQL will return records in the same order, and since this job doesn't care in which order the records are shown, we don't need ORDER BY.
So I want to ask: is what he said correct? Of course, assume that no records are updated or inserted between the two queries.
You don't show your query here, so I'm going to assume that it's something like the following (where ID is the primary key of the table):
select *
from TABLE
where ID >= :x:
limit 100
If this is the case, then with MySQL you will probably get rows in the same order every time. This is because the only predicate in the query involves the primary key, which is the clustered index in MySQL, so scanning it is usually the most efficient way to retrieve the rows.
However, "probably" may not be good enough for you, and if your actual query is any more complex than this one, "probably" no longer applies at all. Even though you may think that nothing changes between queries (i.e., no rows inserted or deleted) and so you'll get the same query plan, that is not true.
For one thing, the block cache will have changed between queries, which may cause the optimizer to choose a different query plan. Or maybe not. But I wouldn't take the word of anyone other than one of the MySQL maintainers that it won't.
Bottom line: use an order by on whatever column(s) you're using to paginate. And if you're paginating by the primary key, that might actually improve your performance.
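For instance, keyset-style paging on the primary key, written in the same placeholder style as the query above:
select *
from TABLE
where ID > :last_id:
order by ID
limit 100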
The key point here is that database engines need to handle potentially large datasets and need to care (a lot!) about performance. MySQL is never going to waste any resource (CPU cycles, memory, whatever) doing an operation that doesn't serve any purpose. Sorting result sets that aren't required to be sorted is a pretty good example of this.
When issuing a given query, MySQL will try hard to return the requested data as quickly as possible. When you insert a bunch of rows and then run a simple SELECT * FROM my_table query, you'll often see that rows come back in the same order as they were inserted. That makes sense because the obvious way to store the rows is to append them as inserted, and the obvious way to read them back is from start to end. However, this simplistic scenario won't apply everywhere, every time:
Physical storage changes. You won't just be appending new rows at the end forever. You'll eventually update values and delete rows, and at some point freed disk space will be reused.
Most real-life queries aren't as simple as SELECT * FROM my_table. The query optimizer will try to leverage indexes, which can have a different order. Or it may decide that the fastest way to gather the required information is to perform internal sorts (typical for GROUP BY queries).
You mention paging. Indeed, I can think of some ways to create a paginator that doesn't require sorted results. For instance, you could assign page numbers in advance and keep them in a hash map or dictionary: items within a page may appear in arbitrary locations, but paging would be consistent. This is of course pretty suboptimal: it's hard to code and requires constant updating as data mutates. ORDER BY is basically the easiest way. What you can't do is base your paginator on the assumption that SQL result sets are ordered sets, because they aren't; neither in theory nor in practice.
As an anecdote, I once used a major framework that implemented pagination using the ORDER BY and LIMIT clauses. (I won't say the name because it isn't relevant to the question... well, dammit, it was CakePHP 2.) It worked fine when sorting by ID. But it also allowed users to sort by arbitrary columns, which were often not unique, and I once found an item that was shown on two different pages, because the framework was naively sorting by a single non-unique column and that row made its way into both ORDER BY type LIMIT 10 and ORDER BY type LIMIT 10, 10, since both orderings satisfied the requested condition.
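The standard fix is to append a unique column as a tie-breaker so that both pages see one consistent total order (table and column names here are illustrative):
SELECT * FROM items ORDER BY type, id LIMIT 10;
SELECT * FROM items ORDER BY type, id LIMIT 10, 10;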

Group by, Order by and Count MySQL performance

I have the following query to get the 15 most sold plates in a place:
This query takes 12 seconds to execute over 100,000 rows. I think this execution takes too long, so I am searching for a way to optimize the query.
I ran the EXPLAIN SQL command in phpMyAdmin and I got this:
(screenshot of the EXPLAIN output)
According to this, the main problem is with the p table, which is being scanned in full, but how can I fix this? The id of the p table is a primary key; do I need to also set it as an index? Also, is there anything else I can do to make the query run faster?
You can make a relationship between the two tables.
https://database.guide/how-to-create-a-relationship-in-mysql-workbench/
Besides this, you can also use a LEFT JOIN so you won't load the whole right-hand table in.
ORDER BY is a slow operation in MySQL; if you are processing the rows in code afterwards, you can sort them there instead, which can be faster than ORDER BY.
I hope I helped; community, feel free to edit :)
You did include the explain plan, but you did not give any information about your table structure, data distribution, cardinality or volumes. Assuming your indexes are accurate and you have an even data distribution, the query is having to process over 12 million rows, not 100,000. Even then, that is relatively poor performance, but you never told us what hardware this sits on nor the background load.
A query with so many joins is always going to be slow - are they all needed?
the main problem is on the p table which is scanning the entire table
Full table scans are not automatically bad. The cost of dereferencing an index lookup, compared with a streaming read, is roughly 20 times higher per row. Since the only constraint you apply to this table is its joins to the other tables, there is nothing in the question you asked to suggest there is much scope for improving this.

Improve performance on MySQL fulltext search query

I have the following MySQL query:
SELECT p.*, MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') AS score
FROM posts p
WHERE p.post_id <> 23
AND MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') > 0
ORDER BY score DESC LIMIT 1
With 108,000 rows, it takes ~200ms. With 265,000 rows, it takes ~500ms.
Under performance testing (~80 concurrent users) it shows ~18 sec average latency.
Is there any way to improve the performance of this query?
EXPLAIN OUTPUT:
UPDATED
We have added a new mirror MyISAM table containing only post_id and description, and synchronized it with the posts table via triggers. Now a fulltext search on this new MyISAM table takes ~400ms (under the same performance load where InnoDB shows ~18 sec; a huge performance boost). It looks like MyISAM fulltext search is much quicker than InnoDB's in MySQL. Could you please explain why?
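The setup was roughly the following; this is a sketch with illustrative names and types (our exact schema differs), and matching AFTER UPDATE / AFTER DELETE triggers are needed as well:
CREATE TABLE posts_search (
    post_id INT UNSIGNED NOT NULL PRIMARY KEY,
    description TEXT,
    FULLTEXT KEY ft_description (description)
) ENGINE=MyISAM;
CREATE TRIGGER posts_search_ai AFTER INSERT ON posts
FOR EACH ROW
    INSERT INTO posts_search (post_id, description)
    VALUES (NEW.post_id, NEW.description);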
MySQL profiler results:
Tested on AWS RDS db.t2.small instance
Original InnoDB posts table:
MyISAM mirror table with post_id, description only:
Here are a few tips on what to look for in order to maximise the speed of such queries with InnoDB:
Avoid redundant sorting. Since InnoDB already sorts the result according to ranking, the MySQL query processing layer does not need to sort again to get the top matching results.
Avoid row-by-row fetching to get the matching count. InnoDB provides all the matching records; all those not in the result list have a ranking of 0 and need not be retrieved, and InnoDB has a count of the total matching records on hand, so there is no need to recount.
Covered index scan. InnoDB results always contain the matching records' document IDs and their ranking. So if only the document ID and ranking are needed, there is no need to go to the user table to fetch the record itself.
Narrow the search result early to reduce user-table access. If you want the top N matching records, there is no need to fetch all matching records from the user table: first select the top N matching doc IDs, and then fetch only the corresponding records with these doc IDs (see the sketch below).
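A minimal sketch of that last point applied to the query from the question (assuming post_id is the primary key of posts): pick the top doc ID using only the fulltext result, then fetch the full row.
SELECT p.*
FROM (
    SELECT post_id,
           MATCH (description) AGAINST ('random text that you can use in sample web pages or typography samples') AS score
    FROM posts
    WHERE MATCH (description) AGAINST ('random text that you can use in sample web pages or typography samples') > 0
      AND post_id <> 23
    ORDER BY score DESC
    LIMIT 1
) AS best
JOIN posts p ON p.post_id = best.post_id;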
I don't think you can get much faster looking only at the query itself; maybe try removing the ORDER BY part to avoid unnecessary sorting. To dig deeper into this, profile the query using MySQL's built-in profiler.
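For example (profiling is per-session; the query number comes from the SHOW PROFILES listing):
SET profiling = 1;
-- run the slow query here
SHOW PROFILES;             -- lists recent statements with their total time
SHOW PROFILE FOR QUERY 1;  -- stage-by-stage breakdown for statement #1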
Other than that, you might look into the configuration of your MySQL server. Have a look at this chapter of the MySQL manual; it contains some good information on how to tune the fulltext index to your needs.
If you've already maximized the capabilities of your MySQL server configuration, then consider looking at the hardware itself - sometimes even a low-cost solution like moving the tables to another, faster hard drive can work wonders.
My best guess for the performance hit is the number of rows being returned by the query. To test this, simply remove the order by score and see if that improves the performance.
If it does not, then the issue is the full text index. If it does, then the issue is the order by. If so, the problem becomes a bit more difficult. Some ideas:
Determine a hardware solution to speed up the sorts (getting the intermediate files to be in memory).
Modifying the query so it returns fewer values. This might involve changing the stop-word list, changing the query to boolean mode (see the sketch after this list), or other ideas.
Finding another way of pre-filtering the results.
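For illustration, a boolean-mode variant with required terms; the chosen terms are just an example, and note that boolean mode skips the natural-language relevance ranking:
SELECT p.*
FROM posts p
WHERE p.post_id <> 23
  AND MATCH (p.description) AGAINST ('+typography +samples' IN BOOLEAN MODE)
LIMIT 1;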
The issue here is the WHERE p.post_id <> 23 condition.
Design your system in such a way that columns not covered by the fulltext index, like post_id, need not be added to the WHERE clause.
Basically, MySQL runs the fulltext search on the indexed column first and then filters on post_id. Hence, if the fulltext search returns a lot of matches, the response time will not be what you expect.
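For example, one way to keep post_id out of the fulltext query is to search first and discard afterwards (a sketch; the LIMIT 2 leaves room to drop post 23 from the top results):
SELECT *
FROM (
    SELECT p.post_id, p.description,
           MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') AS score
    FROM posts p
    WHERE MATCH (p.description) AGAINST ('random text that you can use in sample web pages or typography samples') > 0
    ORDER BY score DESC
    LIMIT 2
) AS top_hits
WHERE post_id <> 23
LIMIT 1;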

What does OPTIMIZE TABLE do, or how do I properly optimize the PK on disk

I was under the impression that OPTIMIZE TABLE fixes fragmentation. So, if before the optimize I ran
select * from t -- (no order by, no nothing)
I would get the records in their order on disk,
while after doing the optimize and running this query again, the result would be ordered by the PK.
I just tried it on a table of mine, and nothing changed; I still get an arbitrary order of records.
I am storing all my tables in one file. I am using InnoDB, MySQL 5.5.
Am I missing something? Should I have defined the PK in some other way?
Without an ORDER BY clause you are never guaranteed an order.
Your assumption of
if before I would do select * from t (no order by, no nothing) I would
get the order of the records on the disk
is wrong.
How the database decides to retrieve records and display them on the screen (or whatever you're viewing them through) is totally up to the internal implementation of the database. In the past this might have been disk order, but the only way to know is to check whether the database (in your case MySQL) mentions anything about it in its documentation.
I doubt they would, though, because then people would depend on this ordering, and the developers couldn't improve their record-retrieval algorithms without breaking existing applications.
Edit:
As for optimizing the table, try using an index that reflects the query results you're looking for.
Edit 2:
Another thought: the situation you just described is a classic caching issue. Because the database already has the result set stored away somewhere in the original odd ordering, your optimization won't show a reordering until the cached data set is no longer cached. How to flush the caches is a bit beyond my knowledge.

I need to speed up a specific MySQL query on a large table

Hi, I know there are a lot of topics dedicated to query optimization strategies, but this one is so specific that I couldn't find the answer anywhere on the internet.
I have a large table of products in an e-shop (approx. 180k rows) and the table has 65 columns. Yeah, yeah, I know it's quite a lot, but I store information there about books, DVDs, Blu-rays and games.
Still, I am not bringing many of the columns into the query, but the select is still quite tricky. There are many conditions that need to be considered and compared. Query below:
SELECT *
FROM products
WHERE production = 1
AND publish_on < '2012-10-23 11:10:06'
AND publish_off > '2012-10-23 11:10:06'
AND price_vat > '0.5'
AND ean <> ''
AND publisher LIKE '%Johnny Cash%'
ORDER BY bought DESC, datec DESC, quantity_storage1 DESC, quantity_storage2 DESC, quantity_storage3 DESC
LIMIT 0, 20
I have already tried putting indexes one by one on the columns in the WHERE clause and even in the ORDER BY clause; then I tried to create a compound index on (production, publish_on, publish_off, price_vat, ean).
The query is still slow (a couple of seconds), and it needs to be fast since this is an e-shop solution and people leave when they don't get their results quickly. And I am still not counting the time I need to count all the found rows so I can do paging.
I mean, the best way to make it quick would be to simplify the query, but all the conditions and the sorting are a must in this case.
Can anyone help with this kind of issue? Is it even possible to speed this kind of query up, or is there any other way I could, for example, simplify the query and leave the rest to the PHP engine to sort the results?
Oh, I am really clueless about this... Share your wisdom, people, please...
Many thanks in advance
First of all, be sure about what you really need to select, and replace the '*' in
Select * from
with something more specific:
Select id, name, ....
There is no JOIN or anything else in your query, so the speed-up options are quite limited, I think.
Check that your MySQL server can use enough memory.
Have a look at these settings in your my.cnf:
key_buffer_size = 384M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 64M
Have a look at the maximum allowed concurrency; MySQL recommends CPUs * 2:
thread_concurrency = 4
You should really think about splitting the table according to the information you use, following standard normalization. If possible.
If it's a production system with no way to split the tables, then think about a caching server. But this will only help if you have a lot of recurring queries that are the same.
This is what I would do, knowing nothing about the underlying implementation or the system at all.
Edit:
Making as many columns indexed as you can won't necessarily speed up your system. The more indexes ≠ the more speed.
Thanks to all of you for the good remarks.
I have probably found the solution, because I was able to reduce the query time from 2.8s down to 0.3s.
SOLUTION:
using SELECT * is really naive on large tables (65 cols), so I realized I only need 25 of them on the page; the others can easily be used on the product page itself.
I also reindexed my table a little. I created a compound index on
production, publish_on, publish_off, price_vat, ean
then I created another one specifically for search, including the columns
title, publisher, author
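In DDL form the reindexing was roughly this (the index names are mine):
ALTER TABLE products
    ADD INDEX idx_products_listing (production, publish_on, publish_off, price_vat, ean),
    ADD INDEX idx_products_search (title, publisher, author);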
The last thing I did was to use a query like
SELECT SQL_CALC_FOUND_ROWS ID, title, alias, url, type, preorder, subdescription,....
which allowed me to get the number of matching rows more quickly using
mysql_result(mysql_query("SELECT FOUND_ROWS()"), 0)
after the mysql_query() call... However, I cannot understand how it could be quicker: EXPLAIN EXTENDED says the query is not using any index, yet it is still 0.5s quicker than counting the rows with a separate query.
It seems to be working rather fine. If the ORDER BY clause weren't there, it would be evil quick, but that's something I have no influence over.
I still need to check my server settings...
Thank y'all for all your help..