How to improve the MySQL "fetch" time of a result set? - mysql

I have a query that returns a large data set needed for reporting purposes. Currently, the "duration" shown in MySQL Workbench, which I'm assuming to be execution time, is about 7 seconds, so the query itself is fairly optimized. It returns a measly 6,000 rows, but according to the "fetch" time it takes nearly 150 seconds to return them.
Now, there are over 50 columns, which may explain some of the slowness, but when I extracted the data set into a spreadsheet, it turned out to be only about 4MB. I'm certainly not an expert, but I didn't expect 4MB to take 150 seconds to come over the pipe. I went ahead and performed the same query on a localhost setup to eliminate networking issues. Same result! It took about 7 seconds to execute and 150 seconds to return the data on the same machine.
This report is expected to run in real time, on demand, so having the end user wait 2 minutes is unacceptable for this use case. How can I improve the time it takes to return the data from MySQL?
UPDATE: Thank you all for starting to point me in the right direction. As it turns out, the "duration" and "fetch" metrics in Workbench are horribly inaccurate. The two minutes I was experiencing was all execution time, and in fact my query needed optimizing. Thanks again; this had me scratching my head. I will never rely on these metrics again...
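For anyone hitting the same confusion: one way to get server-side timing that doesn't depend on the client's "duration"/"fetch" split is EXPLAIN ANALYZE, available from MySQL 8.0.18. A minimal sketch; report_data and the predicate are made-up placeholders:

EXPLAIN ANALYZE
SELECT *                              -- hypothetical reporting query
FROM   report_data
WHERE  report_date >= '2020-01-01';

This actually executes the query and annotates each plan step with real time spent, so execution cost can't be confused with transfer cost.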

Related

MySQL 8 why is initial response to query slow, speeds up thereafter

After several minutes of inactivity (no use of the website), MySQL 8 slows right down. An initial query after inactivity can take a minute, but thereafter only seconds. The same query (like logging in) takes a second or two if there is already activity on the server.
Has anyone encountered this, or does anyone know how to correct this behavior? The machine itself has a significant amount of resources; it's just the first "warm up" call that is slow.
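One likely culprit is a cold InnoDB buffer pool: after idle time the hot pages may have been evicted (or swapped out at the OS level), so the first query pays for disk reads. A hedged sketch for keeping the pool warm across restarts on MySQL 5.7+; exact variable support depends on version:

SET GLOBAL innodb_buffer_pool_dump_at_shutdown = ON;  -- snapshot the hot page list at shutdown
SET GLOBAL innodb_buffer_pool_dump_now = ON;          -- or snapshot immediately
SET GLOBAL innodb_buffer_pool_load_now = ON;          -- reload the snapshot on demand
-- innodb_buffer_pool_load_at_startup is not dynamic; it must be set in my.cnf.

This helps across restarts; for idle-time eviction, a periodic "warming" query against the hot tables is a common workaround.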

MySQL query takes 10X time every once in a while

I am working on an issue where a MySQL SELECT query usually completes in 2 minutes, but takes more than 25 minutes every once in a while (once in ten executions).
Could this be a:
1: Index issue - If this were the case, the query would take a similar time on every execution
2: Resource crunch - CPU utilization did go up to 60-70% when this query got stuck (it is usually around 40%)
3: Table lock issue - The logs say that the table was only locked for 10 ms. I do not know how to check for this.
Please suggest which issue appears most likely.
Thanks in advance...
Edit: Required information
Total rows: 40,000,000
Can't post the query or the schema (Will get fired)
Anyway, I just wanted to know the general analysis techniques.
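A generic starting point, assuming diagnostics can be run while a slow execution is in flight, is to look at what the server is doing at that moment (a sketch, not specific to any schema):

-- What is running right now, and for how long?
SHOW FULL PROCESSLIST;

-- Open InnoDB transactions, their age, and their current statement
SELECT trx_id, trx_state, trx_started, trx_query
FROM   information_schema.innodb_trx;

-- MySQL 5.7+ with the sys schema: who is blocking whom
SELECT * FROM sys.innodb_lock_waits;

If the slow runs coincide with long-lived transactions or lock waits, that points at cases 2 or 3; if the process list shows the query alone and busy, the plan itself (case 1) is more suspect.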

Execution time is different for the same query. What should be the reason?

When I execute the same query multiple times in the MySQL console, the execution times vary every time.
I could understand the difference if it were in milliseconds, but sometimes the same query takes 1 second and sometimes 5 seconds.
What could be the reason in this case?
MANY reasons:
the result was cached and the cache got cleared
the table is locked (maybe because it is executing another big query)
the disk is slow or busy doing other things
you are running out of memory
the results might be changing (pulling 1k records vs pulling 500k records)
the Server is remote, so you might have network problems
it is the ghost in the machine
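To tell these cases apart, stage-level timing helps. A sketch using the session profiler (deprecated since MySQL 5.6 but still widely available; some_table stands in for the real query):

SET profiling = 1;                  -- enable per-query profiling for this session
SELECT COUNT(*) FROM some_table;    -- run the query in question
SHOW PROFILES;                      -- list recent queries with total durations
SHOW PROFILE FOR QUERY 1;           -- stage breakdown: locking, reading, sending data, ...

A run dominated by "Sending data" usually means I/O or row volume; time stuck in a lock stage points at contention.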

Simple query on a large MySQL table takes very long time at first, much quicker later

We are struggling with a query that is slow only the first time it is called; afterwards it is much faster.
The first time the query is done, it takes anywhere from 15-20 seconds. Subsequent calls take < 1.5 seconds. However if not called again for a few hours, the query will take 15-20 seconds again.
The table holds daily readings for an entity called system (foreign key): system id, date, sample reading, and a flag indicating whether the reading is done (past). The query asks for a range of one year of samples (365 days) for 200 selected systems.
It looks like this:
SELECT system_id,
       sample_date,
       reading
FROM   Dailyreadings
WHERE  past = 1
  AND  reading IS NOT NULL
  AND  sample_date < '2014-02-25'
  AND  sample_date >= DATE('2013-01-26')
  AND  system_id IN (list_of_ids)
list_of_ids represents a list of 200 system ids for which we want the readings.
We have an index on system_id, an index on sample_date, and a composite index on both. The query usually returns ~70,000 rows, and when I run EXPLAIN on it I can see the index is used, with the plan expecting to scan only ~70,000 rows.
MySQL is on Amazon RDS. The engine for all tables is InnoDB.
The Dailyreadings table has about 60 million rows, so it is quite large. However I can't understand how a very simple range query, can take up to 20 seconds. This is done on a read only replica, so concurrent writes aren't an issue I would guess. This also happens on a staging copy of the DB which has very few read/write requests going on at the same time.
After reading many, many questions about slow first-time queries, I assume the problem is that the first time, the data needs to be read from disk, and afterwards it is cached. However, I fail to see why such a simple query would take so much time reading from disk. I also tried many tweaks to the InnoDB parameters and couldn't get this to improve. Even doubling the RAM of the system didn't seem to help.
Any pointers as to what could be the problem? and how we can improve the time it takes for the first query? Any ideas how to pinpoint the exact problem?
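One way to test the cold-cache theory before guessing at parameters is to compare buffer pool counters around each run (a sketch; sample the counters immediately before and after the query):

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';          -- reads that missed the pool and hit disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';  -- logical read requests

If Innodb_buffer_pool_reads jumps by tens of thousands on the first run and barely moves on later runs, the 15-20 seconds really is disk I/O.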
edit
It seems the problem might be in the IN clause, which is slow since the list is big (200 items). Is this a known issue? Is there a way to accelerate this?
The query probably runs fast after the first run because MySQL is caching it. To see how your query runs with caching disabled, try: SELECT SQL_NO_CACHE system_id ...
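Applied to the full query above (note SQL_NO_CACHE only bypasses the query cache, which was removed in MySQL 8.0; it does not touch the buffer pool):

SELECT SQL_NO_CACHE system_id,
       sample_date,
       reading
FROM   Dailyreadings
WHERE  past = 1
  AND  reading IS NOT NULL
  AND  sample_date < '2014-02-25'
  AND  sample_date >= DATE('2013-01-26')
  AND  system_id IN (list_of_ids)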
Also, I found that comparing dates on tables with lots of data hurts performance. When possible, I saved the dates as integer unix timestamps and compared them as plain integers, which worked faster.
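A sketch of that idea against the table above; the sample_ts column and index name are made up for illustration:

-- One-time migration: store the date as an integer unix timestamp
ALTER TABLE Dailyreadings ADD COLUMN sample_ts INT UNSIGNED;
UPDATE Dailyreadings SET sample_ts = UNIX_TIMESTAMP(sample_date);
CREATE INDEX idx_system_ts ON Dailyreadings (system_id, sample_ts);

-- The range predicate then becomes a plain integer comparison
SELECT system_id, sample_date, reading
FROM   Dailyreadings
WHERE  sample_ts >= UNIX_TIMESTAMP('2013-01-26')
  AND  sample_ts <  UNIX_TIMESTAMP('2014-02-25')
  AND  past = 1
  AND  reading IS NOT NULL;

Whether this beats a native DATE column depends on version and engine; in recent MySQL both are fixed-width and index comparably, so measure before committing to it.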

Reporting with MySQL - Simplest Query taking too long

I have a MySQL table on an Amazon RDS instance with 250,000 rows. When I try to
SELECT * FROM tableName
without any conditions (just for testing; the normal query specifies the columns I need, but I need most of them), the query takes between 20 and 60 seconds to execute. This will be the base query for my report, and the report should run in under 60 seconds, so I think this will not work out (it times out the moment I add the joins). The report runs without any problems in our smaller test environments.
Could it be that the query is taking so long because MySQL is trying to lock the table and waiting for all writes to finish? There might be quite a lot of writes on this table. I am doing the query on a MySQL slave, since I do not want to lock up the production system with my queries.
I have no feel for how many rows is a lot for a relational DB. Are 250,000 rows with ~30 columns (varchar, date and integer types) a lot?
How can I speed up this query (hardware, software, query optimization ...)?
Can I tell MySQL that I do not care that the data might be inconsistent (it is a snapshot from a reporting database)? See the sketch after these questions.
Is there a chance that this query will run in under 60 seconds, or do I have to adjust my goals?
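On the consistency question: if dirty reads are acceptable for a reporting snapshot, the isolation level can be lowered so the SELECT skips building a consistent read view (a sketch; plain InnoDB SELECTs don't block on writers anyway, so the gain may be small):

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM tableName;   -- the reporting query from above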
Remember that MySQL has to prepare your result set and transport it to your client. In your case, this could be 200MB of data it has to shuttle across the connection, so 20 seconds is not bad at all. Most libraries, by default, wait for the entire result to be received before forwarding it to the application.
To speed it up, fetch only the columns you need, or do it in chunks with LIMIT. SELECT * is usually a sign that someone's being super lazy and not optimizing at all.
If your library supports streaming resultsets, use that, as then you can start getting data almost immediately. It'll allow you to iterate on rows as they come in without buffering the entire result.
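A sketch of the chunked approach using keyset pagination (assumes an auto-increment primary key named id; LIMIT with OFFSET also works but re-scans the skipped rows on every chunk):

-- First chunk
SELECT id, col1, col2             -- col1, col2: whatever columns the report needs
FROM   tableName
ORDER  BY id
LIMIT  10000;

-- Each following chunk resumes after the last id seen
SELECT id, col1, col2
FROM   tableName
WHERE  id > 10000                 -- last id from the previous chunk
ORDER  BY id
LIMIT  10000;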
A table with 250,000 rows is not too big for MySQL at all.
However, waiting for those rows to be returned to the application does take time. That is network time, and there are probably a lot of hops between you and Amazon.
Unless your report is really going to process all the data, check the performance of the database with a simpler query, such as:
SELECT COUNT(*) FROM tableName;
EDIT:
Your problem is unlikely to be due to the database. It is probably due to network traffic. As mentioned in another answer, streaming might solve the problem. You might also be able to play with the data formats to get the total size down to something more reasonable.
A last-resort step would be to save the data to a text file, compress the file, move it over, and uncompress it. Although this sounds like a lot of work, you might get 5x - 10x compression on the data, saving oodles of time on transmission and still leaving a large overall performance improvement for the rest of the processing.
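A sketch of the export step (SELECT ... INTO OUTFILE writes to the database server's filesystem, so it needs the FILE privilege and a server-writable path; on RDS, where the filesystem isn't accessible, you would dump from the client instead, e.g. with the mysql command-line tool):

SELECT *
FROM   tableName
INTO OUTFILE '/tmp/report.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';

Compression then happens outside MySQL with whatever tool is on hand.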
I got updated specs from my client and was able to reduce the number of users returned to 250, which goes through (with a lot of JOINs) in 60 seconds.
So maybe the answer really is: don't dump a whole table with one query; fetch only the exact data you need. The client has SQL access, and he will have to update his queries so that only relevant users are returned.
You should never really use * as a wildcard. Choose the fields that you actually want, and then create a combined index on those fields.
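A sketch of that combined-index idea (column names are hypothetical; if the index contains every column the query touches, InnoDB can answer from the index alone, a so-called covering index):

CREATE INDEX idx_report_cover
    ON tableName (customer_id, created_at, status);

-- Can be served entirely from the index, no row lookups:
SELECT customer_id, created_at, status
FROM   tableName
WHERE  customer_id = 42;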
If you have thousands of rows, another option is to implement pagination.
If the result data is used directly in a report, no one can look at more than about 100 rows in a single shot anyway.