I have a query in a vBulletin system that fetches the latest threads with image attachments, along with each thread's first attachment ID.
Here is the query:
SELECT thread.threadid,
thread.title,
thread.postuserid,
thread.postusername,
thread.dateline,
thread.replycount,
post.pagetext,
(
SELECT attachment.attachmentid
FROM `vb_attachment` AS attachment
LEFT JOIN `vb_filedata` AS data
ON data.filedataid=attachment.filedataid
WHERE attachment.contentid=thread.firstpostid
AND attachment.contenttypeid=1
AND data.extension IN('jpg','gif','png')
AND data.thumbnail_filesize>0
ORDER BY attachmentid ASC
LIMIT 1
) AS firstattachmentid
FROM `vb_thread` AS thread
LEFT JOIN `vb_post` AS post
ON post.postid=thread.firstpostid
WHERE thread.forumid IN(331, 318)
HAVING firstattachmentid>0
ORDER BY thread.dateline DESC
LIMIT 0, 5
You can see the EXPLAIN results for the query here:
The problem: the query usually runs in 0.00001 seconds, so almost instantly, as it is a well-optimized query overall. However, after a new thread is created (even a thread outside forum IDs 331 and 318), it takes 40+ seconds (executed directly from a MySQL GUI), and even the EXPLAIN takes 2+ seconds! The slow EXPLAIN shows the same results regarding index usage.
After running the same query two or three times, it is back to its usual speed.
If anyone could explain what happens, and how to fix the problem, I would appreciate the help.
Thanks.
MySQL caches the results of queries to allow it to return the results of the same query quicker later.
Adding a new thread invalidates the cached result for every query that references the changed tables, so the next time your query runs MySQL has to execute it in full and cache the result again.
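On MySQL 5.x you can watch this happening through the cache counters (note the query cache was removed entirely in MySQL 8.0):
SHOW STATUS LIKE 'Qcache%';
SHOW VARIABLES LIKE 'query_cache%';
A run of your thread query right after an INSERT should increment Qcache_inserts rather than Qcache_hits, which matches the slow-then-fast pattern you describe.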
I have found MySQL subqueries to perform badly. Some tactics I have used to avoid subqueries:
Restructure the query as a join without subqueries.
Restructure the query as several queries.
Return more data than you need and then do some work with this data in your application.
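As a sketch of the first tactic, the correlated subquery above could be rewritten as a join against a derived table (untested, and assuming the vBulletin schema shown in the question):
SELECT thread.threadid,
thread.title,
thread.postuserid,
thread.postusername,
thread.dateline,
thread.replycount,
post.pagetext,
firstatt.firstattachmentid
FROM `vb_thread` AS thread
LEFT JOIN `vb_post` AS post
ON post.postid=thread.firstpostid
JOIN (
SELECT attachment.contentid,
MIN(attachment.attachmentid) AS firstattachmentid
FROM `vb_attachment` AS attachment
JOIN `vb_filedata` AS data
ON data.filedataid=attachment.filedataid
WHERE attachment.contenttypeid=1
AND data.extension IN('jpg','gif','png')
AND data.thumbnail_filesize>0
GROUP BY attachment.contentid
) AS firstatt
ON firstatt.contentid=thread.firstpostid
WHERE thread.forumid IN(331, 318)
ORDER BY thread.dateline DESC
LIMIT 0, 5
The inner join against the derived table replaces the HAVING firstattachmentid>0 filter, since only first posts that actually have a qualifying attachment survive the join.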
Related
I am currently trying to run a JOIN between two tables in a local MySQL database and it's not working. Below is the query; I am even limiting it to 10 rows just to run a test. After running this query for 15-20 minutes, it tells me "Error Code: 2013. Lost connection to MySQL server during query". My computer is not going to sleep, and I'm not doing anything to interrupt the connection.
SELECT rd_allid.CreateDate, rd_allid.SrceId, adobe.Date, adobe.Id
FROM rd_allid JOIN adobe
ON rd_allid.SrceId = adobe.Id
LIMIT 10
The rd_allid table has 17 million rows of data and the adobe table has 10 million. I know this is a lot, but I have a strong computer. My processor is an i7 6700 3.4GHz and I have 32GB of ram. I'm also running this on a solid state drive.
Any ideas why I cannot run this query?
"Why I cannot run this query?"
There's not enough information to determine definitively what is happening. We can only make guesses and speculations. And offer some suggestions.
I suspect MySQL is attempting to materialize the entire resultset before the LIMIT 10 clause is applied. For this query, there's no optimization for the LIMIT clause.
And we might guess that there is not a suitable index for the JOIN operation, which is causing MySQL to perform a nested loops join.
We also suspect that MySQL is encountering some resource limitation which is causing the session to be terminated. Possibly it is filling up all space in /tmp (that usually throws an error, something like "invalid/corrupted myisam table '#tmpNNN'", or something of that ilk), or it could be some other resource constraint. Without doing an analysis, we're just guessing.
It's possible MySQL wrote something to the error log (hostname.err). I'd check there.
But whatever condition MySQL is running into (the answer to the question "Why I cannot run this query?"), I'm seriously questioning the purpose of the query. Why is that query being run? Why is returning that particular resultset important?
There are several possible queries we could execute. Some of those will run a long time, and some will be much more performant.
One of the best ways to investigate query performance is to use MySQL EXPLAIN. That will show us the query execution plan, revealing the operations MySQL will perform, in what order, and which indexes will be used.
We can make some suggestions as to some possible indexes to add, based on the query shown e.g. on adobe (id, date).
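For instance, indexes along these lines might help (hypothetical; verify against the actual table definitions and any existing indexes first):
CREATE INDEX idx_adobe_id_date ON adobe (Id, Date);
CREATE INDEX idx_rd_allid_srceid ON rd_allid (SrceId, CreateDate);
With both in place, the join and all the selected columns can be served entirely from the indexes, avoiding a nested-loops scan of the full tables.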
And we can make some suggestions about modifications to the query (e.g. adding a WHERE clause, using a LEFT JOIN, incorporating inline views, etc.). But we don't have enough of a specification to recommend a suitable alternative.
You can try something like:
SELECT rd_allidT.CreateDate, rd_allidT.SrceId, adobeT.Date, adobeT.Id
FROM
(SELECT CreateDate, SrceId FROM rd_allid ORDER BY SrceId LIMIT 1000) rd_allidT
INNER JOIN
(SELECT Id, Date FROM adobe ORDER BY Id LIMIT 1000) adobeT ON adobeT.Id = rd_allidT.SrceId;
This may help you get faster response times.
Also, if you are not interested in the whole relation, you can add WHERE clauses inside the derived tables; these are applied before the INNER JOIN, which can make the query faster as well.
I have a table with 70 rows. For learning/testing purposes I wrote out a query for each row. So I wrote:
SELECT * FROM MyTable WHERE id="id1";
SELECT * FROM MyTable WHERE id="id2";
/*etc*/
SELECT * FROM MyTable WHERE id="id70";
And ran it in Sequel Pro. All of the queries took a total of 5 seconds. This seems like a really long time since I had read that MySQL has a feature called The MySQL Query Cache. It seems like a query cache, if it is this slow, is pretty useless and I might as well write my own layer of query caching between the database layer and the frontend.
Is it correct that the MySQL query cache is this slow? Or do I need to activate something or fix something to get it to work?
Per the cache documentation, it maps the text of a select statement to the returned result. Since all of those are different, the result wouldn't be cached until they have all been executed once. Does it take just as long the second time?
5 seconds seems slow even without the cache for a normal case though. How big is the table? Is id the primary key? If it is not the PK, then the server is reading every row, and just returning the one that met the criteria you asked for.
Edit - Since you're using a hosted solution, are you running the query from something on the host network, or across the internet? If it's across the internet, then the problem is almost certainly network latency rather than execution time. Especially running the queries individually, since you'll incur transit time for each select.
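If per-statement round trips are indeed the cost, batching the 70 lookups into one statement pays the network latency once instead of 70 times, e.g.:
SELECT * FROM MyTable WHERE id IN ('id1', 'id2', 'id70');
(listing all 70 ids in the IN clause).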
If you query just based on primary key, you might as well use the memcached interface.
https://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html
I'm trying to optimize the performance of some queries in my application.
In one query with multiple joins and a fulltext search I use SQL_CALC_FOUND_ROWS in a first query for pagination.
Unfortunately the performance of the query is very slow; without the SQL_CALC_FOUND_ROWS the query is about 100 times faster.
Is there a possibility to get better performance in this case?
I tried a single count query without the SQL_CALC_FOUND_ROWS, but that query is an additional second slower than the SQL_CALC_FOUND_ROWS query.
Without knowing anything about your table structure and query we can't possibly tell if the query can be faster.
With SQL_CALC_FOUND_ROWS, MySQL first needs to process all matching records; with a plain LIMIT 10 it can stop after the first 10 found records. That is the performance difference. There is no way to get the total row count faster without using either a count query or SQL_CALC_FOUND_ROWS.
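To make the trade-off concrete, the two pagination patterns look like this (table and column names are made up for illustration):
SELECT SQL_CALC_FOUND_ROWS id, title FROM articles WHERE published=1 LIMIT 10;
SELECT FOUND_ROWS();
versus
SELECT id, title FROM articles WHERE published=1 LIMIT 10;
SELECT COUNT(*) FROM articles WHERE published=1;
The first pair forces the full result set to be processed before the limit; in the second pair the LIMIT query can stop early, and the COUNT(*) can sometimes be satisfied from a smaller covering index.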
However most of the time, a query can be optimized in different ways.
If you're looking into fulltext search, consider using sphinx. In my experience it always will outperform MySQL: http://www.sphinxsearch.com/
I've linked a MySQL view into MS Access via ODBC, but it's running WAY slow.
It's a simple select, that compares two other selects to find records that are unique to the first select.
SELECT `contacts_onlinedonors`.`contactkey` AS `contactkey`
FROM (`hal9k3-testbed`.`contacts_onlinedonors`
LEFT JOIN `hal9k3-testbed`.`contacts_offlinedonors`
ON(( `contacts_onlinedonors`.`contactkey` =
`contacts_offlinedonors`.`contactkey` )))
WHERE Isnull(`contacts_offlinedonors`.`contactkey`)
The slow query log says it returns 34,000 rows after examining 1.5 Billion. There are only 200,000 in the base table. What the heck?
The field "contactkey" is obviously an index on the table.
First thing to do is to "explain" this query.
See http://dev.mysql.com/doc/refman/5.0/en/explain.html
The idea is to figure out what the mysql server is doing, which indexes it is using, and adding indexes where needed, or rewriting your query so it can use indexes.
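For this query that would look like the following (IS NULL is the standard spelling of the Isnull() test above):
EXPLAIN
SELECT `contacts_onlinedonors`.`contactkey` AS `contactkey`
FROM (`hal9k3-testbed`.`contacts_onlinedonors`
LEFT JOIN `hal9k3-testbed`.`contacts_offlinedonors`
ON(( `contacts_onlinedonors`.`contactkey` =
`contacts_offlinedonors`.`contactkey` )))
WHERE `contacts_offlinedonors`.`contactkey` IS NULL
If the output shows no key being used on contacts_offlinedonors, the 1.5 billion rows examined are explained: MySQL is scanning one table once for every row of the other.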
Recently I was pulled into the boss-man's office and told that one of my queries was slowing down the system. I then was told that it was because my WHERE clause began with 1 = 1. In my script I was just appending each of the search terms to the query so I added the 1 = 1 so that I could just append AND before each search term. I was told that this is causing the query to do a full table scan before proceeding to narrow the results down.
I decided to test this. We have a user table with around 14,000 records. The queries were run five times each using both phpMyAdmin and PuTTY. In phpMyAdmin I limited the queries to 500, but in PuTTY there was no limit. I tried a few different basic queries and clocked their times. I found that the 1 = 1 seemed to make the query faster than the same query with no WHERE clause at all. This is on a live database, but the results seemed fairly consistent.
I was hoping to post on here and see if someone could either break down the results for me or explain to me the logic for either side of this.
Well, your boss-man and his information source are both idiots. Adding 1=1 to a query does not cause a full table scan. The only thing it does is make query parsing take a minuscule amount longer. Any decent query planner (including MySQL's) will recognize this condition as a no-op and drop it.
I tried this on my own database (solar panel historical data), nothing interesting out of the noise.
mysql> select sum(KWHTODAY) from Samples where Timestamp >= '2010-01-01';
seconds: 5.73, 5.54, 5.65, 5.95, 5.49
mysql> select sum(KWHTODAY) from Samples where Timestamp >= '2010-01-01' and 1=1;
seconds: 6.01, 5.74, 5.83, 5.51, 5.83
Note I used ajreal's query cache disabling.
First of all, did you SET SESSION query_cache_type=off; during both tests?
Secondly, your test queries in phpMyAdmin and PuTTY (the mysql client) are different, so how can you compare them?
You should run the same query in both places.
Also, you cannot assume phpMyAdmin runs with the query cache off, and the time phpMyAdmin displays includes PHP processing, which you should avoid as well.
Therefore, you should just do the testing in the mysql client instead.
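A minimal test session along those lines might look like this (MySQL 5.x syntax, and assuming the question's 14,000-row user table is literally named user):
SET SESSION query_cache_type = OFF;
SELECT * FROM user WHERE 1=1 LIMIT 500;
SELECT * FROM user LIMIT 500;
Run each statement several times in the mysql client and compare the reported execution times.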
This isn't a really accurate way to determine what's going on inside MySQL. Things like caching and network variations could skew your results.
You should look into using "explain" to find out what query plan MySQL is using for your queries with and without your 1=1. A DBA will be more interested in those results. Also, if your 1=1 is causing a full table scan, you will know for sure.
The explain syntax is here: http://dev.mysql.com/doc/refman/5.0/en/explain.html
How to interpret the results are here: http://dev.mysql.com/doc/refman/5.0/en/explain-output.html
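For example (again assuming the question's user table is named user, and an indexed id column, which is hypothetical here):
EXPLAIN SELECT * FROM user WHERE 1=1 AND id = 123;
EXPLAIN SELECT * FROM user WHERE id = 123;
If the optimizer really drops the 1=1, the two plans will be identical; a full table scan would instead show up as type ALL with no key used.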