Inconsistent response time in MySQL select query

We have a lot of MySQL SELECT queries for reporting needs. Most are a little complex: they generally include five or six joins and three or four subqueries in the SELECT clause.
All the indexes are properly in place in the production environment. We have checked with EXPLAIN multiple times and the plans look fine.
Some of the queries behave very strangely in terms of response time. The same query returns in less than 500 milliseconds at times (which suggests the indexes are working), and when we run it again a minute or so later it comes back with a much higher response time, varying from five or six seconds up to 30 seconds. Sometimes (around once in 20 runs) it gives a timeout error.
This might be due to server load, but the variance is so frequent and so high that we think there is something else we need to configure to solve it.
Can someone please point me in the right direction on what else to check?
Thanks,
Sumit

This kind of behaviour is usually caused by a bottleneck in your stack.
It's like a rotating door in a building - the door can handle 1 person at a time, and each person takes 3 seconds; as long as people don't arrive at a rate over 1 person every 3 seconds, you don't know it's a bottleneck. If people arrive at a faster rate for a short period of time, the queue grows a little but disappears quickly. If people arrive at a rate of 1 person every 2.5 seconds for an hour, the queue becomes unmanageable, and can take far longer than that 1 hour to disappear.
Your database system is made up of a long corridor with rotating doors - most doors can operate in parallel, but they are all limited.
(Sorry for the rubbish analogy, but I find it helps to visualize these things with real-world images).
If the queries are showing a high degree of variance in their performance profile, I'd look at the system performance monitor (top in Linux, Perfmon in Windows) and try to correlate slow performance with the behaviour of the system. If you see a sudden spike in CPU utilization when the queries slow down, that's likely to be your bottleneck; if you see a sudden spike in disk throughput, you might look there.
Once you have a hypothesis about the bottleneck, you can look at ways of resolving it - throwing hardware at the problem is usually the cheapest option.
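If the system monitor points at the database itself, MySQL can give you a matching snapshot from the inside. A rough sketch of what I'd run the moment a query turns slow (all standard MySQL commands; which counters matter depends on your workload):

SHOW FULL PROCESSLIST;                        -- long-running or lock-waiting statements right now
SHOW GLOBAL STATUS LIKE 'Threads_running';    -- how many statements are executing concurrently
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%';   -- row-lock waits and time spent waiting
SHOW ENGINE INNODB STATUS\G                   -- detailed InnoDB activity (\G is a mysql client terminator)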


Surprising timing stats for sql queries

I have two queries whose timing parameters I want to analyze.
The first query is taking much longer than the second one, while in my opinion it should be the other way round. Any explanations?
First query:
select mrn
from EncounterInformation
limit 20;
Second query:
select enc.mrn, fl.fileallocationid
from EncounterInformation as enc inner join
FileAllocation as fl
on enc.encounterIndexId = fl.encounterid
limit 20;
The first query runs in 0.760 seconds in MySQL, while the second one surprisingly runs in 0.509 seconds.
There are many reasons why measured performance between two queries might be different:
The execution plans for the queries (the dominant factor)
The size of the data being returned (perhaps mrn is a string that is really long for the result set in the first query but not the second)
Other database activity, that locks tables and indexes
Other server activity
Data and index caches that are pre-loaded -- either in the database itself or in the underlying OS components
Your observation is correct: the first should be faster than the second. More important, though, is that this timing simply does not make sense for such a simple query:
The first query runs in 0.760 seconds
select mrn
from EncounterInformation
limit 20;
The work done for this would typically be to load one data page (or maybe a handful). That would only consistently take 0.760 seconds if:
You had really slow data storage (think "carrier pigeons").
EncounterInformation is a view and not a table.
You don't understand timings.
If the difference is between 0.760 milliseconds and 0.509 milliseconds, then the difference is really small and likely due to other issues -- warm caches, other activity on the server, other database activity.
It is also possible that you are measuring elapsed time and not database time, so network congestion could potentially be an issue.
If you are querying views, all bets are off without knowing what the views are. In fact, if you care about performance you should be including the execution plan in the question.
I can't explain the difference. What I can say is that your observation is reasonable, but your question lacks a lot of information, which suggests you need to learn more about how to interpret timings. I would suggest that you start by learning about EXPLAIN.
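A minimal starting point, using the two queries from the question; comparing the type, key, and rows columns of the two plans usually shows where the time actually goes:

EXPLAIN
SELECT mrn
FROM EncounterInformation
LIMIT 20;

EXPLAIN
SELECT enc.mrn, fl.fileallocationid
FROM EncounterInformation AS enc
INNER JOIN FileAllocation AS fl
    ON enc.encounterIndexId = fl.encounterid
LIMIT 20;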

code ping time meter - is this really true?

I am using a sort of code ping to measure the time it takes to process the whole page, on all the pages in my web portal.
I figured if I set a $count_start in the header, initialised with the current timestamp, and a $count_end in the footer, initialised the same way, the difference is a meter that roughly tells me how well optimised the page is (queries, loading time of everything on that particular page).
Say for one page I get 0.0075 seconds, for others I get 0.045, etc. I'm working on optimising the queries this way.
My question is: if this "rough loading time" meter says a page takes 0.007 seconds,
will 1000 users querying the same page at the same time each get the result in 0.007 * 1000 = 7 seconds? Meaning they will each get the page after 7 seconds?
thanks
Luckily, it doesn't usually mean that.
The missing variable in your equation is how your database and your application server and anything else in your stack handles concurrency.
To illustrate this strictly from the MySQL perspective, I wrote a test client program that establishes a fixed number of connections to the MySQL server, each in its own thread (and so, able to issue a query to the server at approximately the same time).
Once all of the threads have signaled back that they are connected, a message is sent to all of them at the same time, to send their query.
When each thread gets the "go" signal, it looks at the current system time, then sends the query to the server. When it gets the response, it looks at the system time again, and then sends all of the information back to the main thread, which compares the timings and generates the output below.
The program is written in such a way that it does not count the time required to establish the connections to the server, since in a well-behaved application the connections would be reusable.
The query was SELECT SQL_NO_CACHE COUNT(1) FROM ... (an InnoDB table with about 500 rows in it).
threads 1 min 0.001089 max 0.001089 avg 0.001089 total runtime 0.001089
threads 2 min 0.001200 max 0.002951 avg 0.002076 total runtime 0.003106
threads 4 min 0.000987 max 0.001432 avg 0.001176 total runtime 0.001677
threads 8 min 0.001110 max 0.002789 avg 0.001894 total runtime 0.003796
threads 16 min 0.001222 max 0.005142 avg 0.002707 total runtime 0.005591
threads 32 min 0.001187 max 0.010924 avg 0.003786 total runtime 0.014812
threads 64 min 0.001209 max 0.014941 avg 0.005586 total runtime 0.019841
Times are in seconds. The min/max/avg are the best/worst/average times observed running the same query. At a concurrency of 64, you notice the best case wasn't all that different from the best case with only 1 query. But the biggest take-away here is the total runtime column. That value is the difference in time from when the first thread sent its query (they all send their query at essentially the same time, but "precisely" the same time is impossible since I don't have a 64-core machine to run the test script on) to when the last thread received its response.
Observations: the good news is that the 64 queries taking an average of 0.005586 seconds definitely did not require 64 * 0.005586 seconds = 0.357504 seconds to execute... it didn't even require 64 * 0.001089 seconds (the best case time) = 0.069696 seconds. All of those queries were started and finished within 0.019841 seconds... or only about 28.5% of the time it would have theoretically taken for them to run one after another.
The bad news, of course, is that the average execution time on this query at a concurrency of 64 is over 5 times as high as the time when it's only run once... and the worst case is almost 14 times as high. But that's still far better than a linear extrapolation from the single-query execution time would suggest.
Things don't scale indefinitely, though. As you can see, the performance does deteriorate with concurrency and at some point it would go downhill -- probably fairly rapidly -- as we reached whichever bottleneck occurred first. The number of tables, the nature of the queries, any locking that is encountered, all contribute to how the server performs under concurrent loads, as do the performance of your storage, the size, performance, and architecture of the system's memory, and the internals of MySQL -- some of which can be tuned and some of which can't.
But of course, the database isn't the only factor. The way the application server handles concurrent requests can be another big part of your performance under load, sometimes to a larger extent than the database, and sometimes less.
One big unknown from your benchmarks is how much of that time is spent by the database answering the queries, how much by the application server executing the business logic, and how much by the code that renders the page results into HTML.
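One way to isolate the database's share of that time is session profiling. A rough sketch (SHOW PROFILE is deprecated in recent MySQL versions in favour of the Performance Schema but still works; the table name below is just a placeholder):

SET profiling = 1;
SELECT SQL_NO_CACHE COUNT(1) FROM some_table;   -- some_table is a placeholder for one of your queries
SHOW PROFILES;                                  -- each statement with its total database-side duration
SHOW PROFILE FOR QUERY 1;                       -- stage-by-stage breakdown of the first statement
SET profiling = 0;

Anything your page timer measures beyond what the profile shows is being spent in the application or in rendering, not in MySQL.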

How Long is Acceptable to Run a MySQL Query

When I test this query it takes around 17 - 20 seconds to complete.
UPDATE ex_hotel_temp
SET specialoffer='1'
WHERE hid IN
(SELECT hid
FROM ex_dates
WHERE offer_id IS NOT NULL
OR xfory_id IS NOT NULL
OR long_id IS NOT NULL
OR early_id IS NOT NULL
GROUP BY hid)
Although this is a cronjob running at night to do some housekeeping on the database (there is no site visitor sitting waiting for the result), it seems to me to be an unacceptable load on the server. Am I right, or am I fussing over nothing?
When I run each element of the query individually it takes about 0.001 sec. Should I therefore break it up into a series of simple queries instead?
LATER EDIT:
With the assistance of the comments and answers received, I decided to split the query into two. The result is this:
$query_hotel = "SELECT hid FROM ex_dates WHERE offer_id IS NOT NULL OR xfory_id IS NOT NULL OR long_id IS NOT NULL OR early_id IS NOT NULL GROUP BY hid";
$hotel = mysql_query($query_hotel, $MySQL_XXX) or die(mysql_error());
$row_hotel = mysql_fetch_assoc($hotel);
$totalRows_hotel = mysql_num_rows($hotel);
$hid_array = array();
do {
    array_push($hid_array, $row_hotel['hid']);
} while ($row_hotel = mysql_fetch_assoc($hotel));
$hid_list = implode("','",$hid_array);
$hid_list = "'$hid_list'";
// Mark the hotels as having a special offer
$query_update = "UPDATE ex_hotel_temp SET specialoffer='1' WHERE hid IN ($hid_list)";
$result = mysql_query($query_update, $MySQL_XXX) or die(mysql_error());
It's not pretty, but it works.
As there are two queries with a bit of PHP thrown in, I can't get an accurate measure of how long it takes to run, but just by looking at the time for the page to load it is obviously much closer to the fractions of a second than 20 seconds.
Thanks to all.
You say that this runs overnight in a CRON job, and you say this supports a "site" - if this is a public-facing website, yes, you should worry.
There's no such thing as business hours on the interwebs - there will be visitors interacting with your website, hopefully trying to buy stuff, at all hours of the day; even "national" sites tend to see traffic through the night in my experience (though typically at only a small rate compared to peak hours).
It's possible that your CRON job is causing other queries to run slowly too - it depends on what's causing the query to run slowly, and whether you're using transactions. The problem with web sites is that users tend to get impatient when the site is slow, refreshing the page, often creating more traffic to the database, and if there are other slow queries on the site, it's not impossible that the site becomes unusable for a while, even with a fairly limited number of users.
So, if there may be users of your site while the script runs, it's definitely worth tidying up.
The other reason you might worry is that in my experience, database performance is not linear - queries don't slow down in linear proportion to the number of records in your table. Instead, they tend to be hockey-stick like - everything is fine, until you reach a tipping point, and everything grinds to a halt. You may be riding that hockey-stick curve, and it could easily escalate from 17-20 seconds to 17-20 minutes.
The fix looks simple - the group by is redundant, and splitting the query into smaller queries should help the subselect use indices.
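One possible rewrite, sketched with the table and column names from the original query: replace the IN (subquery) with a join against a de-duplicated derived table, which older MySQL versions tend to optimize much better.

UPDATE ex_hotel_temp AS h
JOIN (
    SELECT DISTINCT hid
    FROM ex_dates
    WHERE offer_id IS NOT NULL
       OR xfory_id IS NOT NULL
       OR long_id  IS NOT NULL
       OR early_id IS NOT NULL
) AS d ON d.hid = h.hid
SET h.specialoffer = '1';
-- The DISTINCT replaces the redundant GROUP BY, and the derived table is
-- materialized once up front rather than re-checked row by row.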
I wouldn't worry; just make sure that the cron job does not time out halfway through the process.
I have personally had queries in the past that ran for several minutes in cron jobs without any problems.

Getting top line metrics FAST from a large MySQL DB?

I'm painfully aware there probably isn't a magic bullet for this, but it's becoming a problem. Each user has hundreds of thousands of rows of metrics data across 3 tables, and this is updated on a second-by-second basis.
When a user logs in, I want to quickly deliver them top line stats for a number of their assets (i.e. alongside each asset in navi they have top level stats).
I've tried a number of ideas; but please - if someone has some advice or experience in this area it'd be great. Stuff tried or looked into so far:-
Produce static versions of top line stats every hour or so - This is intensive across all users and all assets. So how this can be done regularly, I'm not sure.
Call stats via AJAX, so they can be processed and fill in (getting top level stats right now can take up to 10 seconds for a larger user) once page has loaded. This could also cache stats in session to save redoing queries each page load.
Query run at 30 min intervals, i.e. you log on, it'll query and then it'll hopefully use query cache every time it's loaded (only 1/2 seconds) until the next 30min interval.
The first one seems to have most legs, but I'm not sure how to do this, given only a small number of users will be needing those stats - it seems awfully expensive to do it for everyone all the time.
Your options 1 and 3 are what is known in MySQL terms as a materialized view. MySQL doesn't currently support materialized views natively, but the concept can be implemented by hand; the linked article provides examples.
Hundreds of thousands of records isn't that much. Good indexes and the use of analytic queries will get you quite far. Sadly, that concept isn't fully implemented either, but there are workarounds, as indicated in the link provided.
It really depends on what the top line stats are: do you want real-time data down to the second, or are 10-, 20-, or even 30-minute intervals acceptable? Using the event scheduler, you can schedule the creation/update of reporting table(s) containing summarized data that is far faster to query. That data is then available with delivery times of fractions of a second, because all the heavy lifting has already been done; your focus can then be on indexing these tables to improve performance without worrying about the impact on production tables.
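A hedged sketch of that approach; the table and column names (asset_metrics, asset_stats, metric_value) are hypothetical stand-ins for your schema:

CREATE TABLE IF NOT EXISTS asset_stats (
    asset_id     INT PRIMARY KEY,
    metric_total BIGINT,
    refreshed_at DATETIME
);

SET GLOBAL event_scheduler = ON;    -- events only fire if the scheduler is enabled

CREATE EVENT refresh_asset_stats
ON SCHEDULE EVERY 30 MINUTE
DO
    REPLACE INTO asset_stats (asset_id, metric_total, refreshed_at)
    SELECT asset_id, SUM(metric_value), NOW()
    FROM asset_metrics
    GROUP BY asset_id;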
You are in the data-warehousing domain with this setup, which means that not all of the usual normalization rules apply. So my approach would be to use triggers to fill a separate stats table.
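A rough sketch of the trigger approach, reusing the same hypothetical asset_metrics / asset_stats tables as in the previous example:

DELIMITER //
CREATE TRIGGER asset_metrics_ai
AFTER INSERT ON asset_metrics
FOR EACH ROW
BEGIN
    -- keep the running total current on every metrics insert
    INSERT INTO asset_stats (asset_id, metric_total, refreshed_at)
    VALUES (NEW.asset_id, NEW.metric_value, NOW())
    ON DUPLICATE KEY UPDATE
        metric_total = metric_total + NEW.metric_value,
        refreshed_at = NOW();
END//
DELIMITER ;

Note that the trigger adds a small write on every insert into the metrics table, so it trades a little insert throughput for instant reads.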

Target mysql query response times

As I'm looking to optimize some of my mysql queries and improve performance, I'm realizing that I'm just not sure what sort of response times I should be aiming for.
Obviously lower is better, but what do you normally target as a 'slow' query for a consumer site?
My queries are currently running between 0.00 to 0.70 seconds.
I assume 0.00 is pretty good, but what about 0.70 seconds? I assume I should be looking for improved performance on that? If so, what do you usually aim for?
----- A bit more info about my query, to answer your questions (though I really am interested in yours and what you do) -----
My queries are on joined tables with between 1-3 million rows (sometimes more).
I run my queries on the same box, which has 4 GB of RAM and a SATA drive.
That box runs 6 different databases two of which are regularly being queried, one regularly being updated.
The times I provided are for the 'read' database which is rarely written to.
For web requests, I am running between 1 & 3 queries (but I cache the data using memcache as well for performance).
The times I gave were reported by the mysql command-line client on the machine itself (so I assume that is actual runtime).
The queries have many joins, between 4 & 7 depending on the query.
It depends on whether you're on the same machine as the DB or on another machine.
It also depends on the kind of query: how many joins, how many rows, whether you use indexes to pull the data. So there is no good answer of the form "a query should take this and that long".
I'm running some complicated queries between two machines, and they take between 780 and 850 ms, and that means nothing to you...
(Probably on the same machine they would be shorter; over an internet connection, longer; etc.)
It depends. Normally on smallish queries (under 50 000 rows queried), I'd expect < 150-200ms, but it depends on the hardware and environment.
In your example, you mention 700ms. What is the query doing? How much RAM is in the box? What disks are you using? These all affect the outcome.
It really depends on how many queries you are executing on a page and what acceptable response times are for your application.
A 100 second query might be okay for a data mining application, whereas a 2 second query is probably unacceptable for a web application.
When working on web applications, I have a debugging feature that I can turn on that outputs queries and execution times at the bottom of the page. Anything between 0.5 and 1 second gets flagged with yellow and anything higher gets flagged with red. I have a link that I can click to get EXPLAIN output for quick optimization.
Of course, if you have a web page with dozens of queries running at .3 seconds each, you have a problem.
If you are good at reading the explain for a query, then you can get an idea of how well you have optimized your queries. Red flags are things like temporary tables and lack of indexes. As far as absolute times, I think you will have a hard time finding a hard and fast rule on what 'fast' is.
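The server can do similar flagging to the debugging feature described above via the slow query log; a rough sketch (the threshold and file path are illustrative, not recommendations):

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.5;                            -- log anything over half a second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- path is an example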