So I've been struggling to run this query. It takes a really long time.
It's MySQL with InnoDB. The fields I am using are indexed. It's on a pretty beefy server with around 10 GB allocated to the InnoDB buffer pool.
UPDATE TEMP_account_product p
JOIN products_temp c ON (c.`some_id` = p.`old_someid`)
SET p.`product` = c.id
WHERE p.product IS NULL;
The thing to note here is that both tables contain around 900,000 rows. The filter (WHERE p.product IS NULL) matches around 800,000 of them.
I have a feeling I'm kinda screwed here but thought I'd try anyway.
I think the possible reasons for slow execution of this kind of statement are:
Most probable: the updated column is indexed and the statement updates a lot of rows. In that case MySQL has to do a lot of work maintaining that index during the UPDATE. Drop the index before running the statement and recreate it afterwards (if needed).
The JOIN is slow, i.e. it is done without indexes. You can check this by running a SELECT with the same JOIN. Add indexes in that case.
The WHERE filtering is slow, i.e. MySQL does a full scan to apply it. You can check how fast it is by running a SELECT with the same filter.
I suggest running it in batches, so that you don't need to rely on the query plan to decide not to bring the entire result set into memory before it starts doing the updates. Add something like LIMIT 1000 to the query, and then run it until the number of affected rows is zero (the technique depends on your environment, but I think it could be done in SQL).
Update: this is not a valid option as-is.
Sure enough, I overlooked this in the UPDATE docs:
For the multiple-table syntax ... In this case, ORDER BY and LIMIT cannot be used.
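Since LIMIT is not available for the multi-table form, one alternative way to batch is by primary-key range. The following is only a sketch under assumptions: it assumes TEMP_account_product has an integer primary key, here called id, which the question does not show. It uses Python's built-in sqlite3 module purely so the loop can run standalone. SQLite has no UPDATE ... JOIN, so the stand-in statement uses a correlated subquery, and the MySQL form is shown as a string for comparison only.

```python
import sqlite3


def id_ranges(min_id, max_id, batch_size):
    """Yield inclusive (start, end) key ranges covering min_id..max_id."""
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


# MySQL form of one batch (not executed here). The range filter takes the
# place of LIMIT. ASSUMPTION: p.id is an integer primary key.
MYSQL_BATCH_SQL = """
UPDATE TEMP_account_product p
JOIN products_temp c ON c.some_id = p.old_someid
SET p.product = c.id
WHERE p.product IS NULL AND p.id BETWEEN %s AND %s
"""

# SQLite stand-in so the sketch is runnable without a MySQL server.
SQLITE_BATCH_SQL = """
UPDATE TEMP_account_product
SET product = (SELECT c.id FROM products_temp c
               WHERE c.some_id = TEMP_account_product.old_someid)
WHERE product IS NULL AND id BETWEEN ? AND ?
"""


def run_batches(conn, batch_size):
    """Run the update one key range at a time, committing after each batch."""
    lo, hi = conn.execute(
        "SELECT MIN(id), MAX(id) FROM TEMP_account_product"
    ).fetchone()
    if lo is None:  # empty table, nothing to do
        return 0
    batches = 0
    for start, end in id_ranges(lo, hi, batch_size):
        conn.execute(SQLITE_BATCH_SQL, (start, end))
        conn.commit()
        batches += 1
    return batches


# Tiny made-up data set, just to exercise the loop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products_temp (id INTEGER PRIMARY KEY, some_id INTEGER);
CREATE TABLE TEMP_account_product (
    id INTEGER PRIMARY KEY, old_someid INTEGER, product INTEGER);
CREATE INDEX idx_some_id ON products_temp (some_id);
""")
conn.executemany("INSERT INTO products_temp VALUES (?, ?)",
                 [(i, i + 100) for i in range(1, 11)])
conn.executemany(
    "INSERT INTO TEMP_account_product (id, old_someid, product) "
    "VALUES (?, ?, NULL)",
    [(i, i + 100) for i in range(1, 11)])

batches = run_batches(conn, batch_size=4)
remaining = conn.execute(
    "SELECT COUNT(*) FROM TEMP_account_product WHERE product IS NULL"
).fetchone()[0]
```

Committing after every range keeps each transaction small. The batch size is a tuning knob, and 4 is used here only because the sample data has ten rows.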
I am trying to speed up a simple SELECT query on a table that has around 2 million entries, in a MariaDB MySQL database. It took over 1.5s until I created an index for the columns that I need, and running it through PhpMyAdmin showed a significant boost in speed (now takes around 0.09s).
The problem is, when I run it through my PHP server (mysqli), the execution time does not change at all. I'm logging my execution time by running microtime() before and after the query, and it takes ~1.5s to run it, regardless of having the index or not (tried removing/readding it to see the difference).
Query example:
SELECT `pair`, `price`, `time` FROM `live_prices` FORCE INDEX
(pairPriceTime) WHERE `time` = '2022-08-07 03:01:59';
Index created:
ALTER TABLE `live_prices` ADD INDEX pairPriceTime (pair, price, time);
Any thoughts on this? Does PHP PDO ignore indexes? Do I need to restart the server in order for it to "acknowledge" that there is a new index? (Which is a problem since I'm using a shared hosting service...)
If that is really the query, then it needs an INDEX starting with the value tested in the WHERE:
INDEX(time)
Or, to make a "covering index":
INDEX(time, pair, price)
However, I suspect that most of your accesses involve pair? If so, then other queries may need
INDEX(pair, time)
especially if you ask for a range of times.
To discuss various options further, please provide EXPLAIN SELECT ...
PDO, mysqli, phpmyadmin -- These all work the same way. (A possible exception deals with an implicit LIMIT on phpmyadmin.)
Try hard to avoid the use of FORCE INDEX -- what helps on today's query and dataset may hurt on tomorrow's.
When you see puzzling anomalies in timings, run the query twice. Caching may be the explanation.
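The point about an index needing to start with the column tested in the WHERE can be checked on a small scale. This is a rough sketch only: it uses Python's built-in sqlite3 as a stand-in, so the plan wording differs from MySQL's EXPLAIN output, and the index name idx_time is made up for illustration. With only (pair, price, time) the planner should have to scan, and once an index led by time exists it should be able to seek on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE live_prices (pair TEXT, price REAL, time TEXT)")
# The index from the question: time is the LAST column, not the first.
conn.execute("CREATE INDEX pairPriceTime ON live_prices (pair, price, time)")

QUERY = ("SELECT pair, price, time FROM live_prices "
         "WHERE time = '2022-08-07 03:01:59'")


def plan(conn):
    """Return SQLite's query plan text for QUERY (column 3 is the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[3] for row in rows)


before = plan(conn)  # only the (pair, price, time) index exists

# Hypothetical index name; the leading column matches the WHERE clause.
conn.execute("CREATE INDEX idx_time ON live_prices (time)")
after = plan(conn)

print("before:", before)
print("after: ", after)
```

On MySQL or MariaDB the equivalent check is the EXPLAIN SELECT ... requested above, looking at the key and rows columns.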
The MySQL documentation says:
The FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the named indexes to find rows in the table.
The MariaDB documentation on FORCE INDEX says:
FORCE INDEX works by only considering the given indexes (like with USE_INDEX) but in addition, it tells the optimizer to regard a table scan as something very expensive. However, if none of the 'forced' indexes can be used, then a table scan will be used anyway.
Use of the index is not mandatory. Since you have specified only one condition (the time), the optimizer can choose some other index for the fetch. I would suggest adding another condition to the WHERE clause, or adding an ORDER BY:
order by pair, price, time
I ended up creating another index (just for the time column) and it did the trick, running at ~0.002s now. Setting the LIMIT clause had no effect since I was always getting 423 rows (for 423 coin pairs).
Bottom line, I probably needed a more specific index, although the weird part is that the first index worked great on PMA but not through PHP, but the second one now applies to both approaches.
Thank you all for the kind replies :)
I have a script that tries to read all the rows from a table like this:
select count(*) from table where col1 = 'Y' or col1 is null;
col1 and col2 are not indexed, and this query usually takes ~20 seconds. But if someone else is already running the same query, it gets blocked and takes ages.
We just have around 100k rows in the table and I tried it without the where clause and it causes the same issue.
The table uses InnoDB so, it doesn't store the exact count but I am curious if there is any concurrency parameter I should look into. I am not sure if absence of indexes on the table causes the issue but it doesn't make sense to me.
Thanks!
If the columns are not indexed, MySQL has to read the table's entire data files from disk to find your rows. A single hard disk does not handle concurrent read-intensive operations very well. You have to add an index.
It looks like your SELECT COUNT(*)... query is being serialized with other operations on your table. Unless you tell the MySQL server otherwise, your query will do its best to be very precise.
Try changing the transaction isolation level by issuing this command immediately before your query.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Setting this enables so-called dirty reads, which means you might not count everything in the table that changes during your operation. But that probably will not foul up your application too badly.
(Adding appropriate indexes is always a good idea, but not the cause of the problem you ask about.)
The two queries below do the same thing: they show all the ids of table1 that are also present in table2. What puzzles me is that the simple SELECT is way, way faster than the JOIN. I would have expected the JOIN to be a bit slower, but not by that much: 5 seconds vs. 0.2.
Can anyone elaborate on this ?
SELECT table1.id FROM
table1,table2 WHERE
table1.id=table2.id
Duration/Fetch 0.295/0.028 (MySql Workbench 5.2.47)
SELECT table1.id
FROM table1
INNER JOIN table2
ON table1.id=table2.id
Duration/Fetch 5.035/0.027 (MySql Workbench 5.2.47)
Q: Can anyone elaborate on this?
A: Before we go the "a bug in MySQL" route that #a_horse_with_no_name seems impatient to race down, we'd really need to ensure that this is repeatable behavior, and isn't just a quirk.
And to do that, we'd really need to see the elapsed time result from more than one run of the query.
If the query cache is enabled on the server, we want to run the queries with the SQL_NO_CACHE hint added (SELECT SQL_NO_CACHE table1.id ...) so we know we aren't retrieving cached results.
I'd repeat the execution of each query at least three times, and throw out the result from the first run, and average the other runs. (The purpose of this is to eliminate the impact of the table data not being in the cache, either InnoDB buffer, or the filesystem cache.)
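The "run several times, drop the first, average the rest" procedure can be written down as a small helper. This is a minimal sketch, not tied to any particular MySQL driver: run_query is a hypothetical zero-argument callable that executes the statement and fetches all rows, and the clock parameter exists only so the helper can be tested with a fake timer.

```python
import time


def average_warm_time(run_query, runs=3, clock=time.perf_counter):
    """Time run_query `runs` times and average all but the first run.

    The first run is discarded because it may pay for loading table data
    into the buffer pool or filesystem cache.
    """
    if runs < 2:
        raise ValueError("need at least 2 runs to discard the first one")
    durations = []
    for _ in range(runs):
        start = clock()
        run_query()
        durations.append(clock() - start)
    warm = durations[1:]  # throw out the cold first run
    return sum(warm) / len(warm)


# Usage with a stand-in workload instead of a real query:
elapsed = average_warm_time(lambda: sum(range(100_000)), runs=4)
print(f"average warm run: {elapsed:.6f}s")
```

Note that this measures time as seen by the client, so it includes network and fetch time as well as server time.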
Also, run an EXPLAIN SELECT ... for each query. And compare the access plans.
If either of these tables is MyISAM storage engine, note that MyISAM tables are subject to locking by DML operations; while an INSERT, UPDATE or DELETE operation is run on the table, the SELECT statements will be blocked from accessing the table. (But five seconds seems a bit much for that, unless these are really large tables, or really inefficient DML statements).
With InnoDB, the SELECT queries won't be blocked by DML operations.
Elapsed time is also going to depend on what else is going on on the system.
But the total elapsed time is going to include more than just the time in the MySQL server. Temporarily turning on the MySQL general_log would allow you to capture the statements that are actually being processed by the server.
This looks like something that could be further optimized by the database engine if indeed you are running both queries under the exact same context.
SQL is declarative. By successfully declaring what you want, the engine has free rein to restructure the "how" of your request to bring back the fastest result.
The earliest versions of SQL didn't even have the keyword JOIN. There was only the comma.
There are many coding constructs in SQL that imperatively force a single inferior methodology over another, and they should be avoided. JOIN shouldn't be avoided. Something sounds amiss. JOIN is the core element of SQL. It would be a shame to always have to use commas.
There are a zillion factors that go into the performance of a JOIN, all based on your environment, schema, and data. Chances are that your table1 and table2 represent a fringe case that may have gotten past the optimization algorithms.
The SQL_NO_CACHE worked, the new results are:
Duration/Fetch 5.065 / 0.027 for the select where and
Duration/Fetch 5.050 / 0.027 for the join
I would have thought that the "select where" would be faster, but the join was actually a tad swifter. The difference is negligible, though.
I would like to thank everyone for their response.
I'm running a forum on a VPS, running Percona DB, with PHP 5.5.8, Opcode caching, etc, it's all very speed orientated.
I'm also running New Relic, (yes I have the t-shirt).
I'm tuning the application by optimising whichever of the forum's DB queries sits at the top of my time-consumed list.
Right now the most time-consuming query I have, because it's the most frequently used, is a simple hit counter on each topic.
So the query is:
UPDATE topics SET num_views = num_views + 1 WHERE id_topic = ?
I can't think of a simpler way to perform this, or if any of the various other ways might be quicker, and why.
Is there a way of writing this query to be even faster, or an index I can add to a field to aid speed?
Thanks.
Assuming id_topic is indexed, you're not going to get better. The only recommendation I would have is to look at the other indexes on this table and make sure you don't have redundant ones that include num_views in them. That would decrease update speed on this update.
For example if you had the following indexes
( some_column, num_views)
( some_column, num_views, another_column)
Index #1 would be extraneous, since it is a leftmost prefix of index #2, and would just add to the insert/update overhead.
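A quick way to spot this kind of redundancy is to look for indexes whose column list is a leftmost prefix of another index's column list. The helper below is a hypothetical sketch: the index names are made up, and the column lists would have to be collected by hand, for example from SHOW INDEX output. It deliberately ignores uniqueness, so a UNIQUE index it flags may still be needed for its constraint.

```python
def redundant_indexes(indexes):
    """Return names of indexes that are a strict leftmost prefix of another.

    `indexes` maps an index name to its ordered tuple of column names.
    """
    redundant = []
    for name, cols in indexes.items():
        cols = tuple(cols)
        for other_name, other_cols in indexes.items():
            other_cols = tuple(other_cols)
            if (other_name != name
                    and len(cols) < len(other_cols)
                    and other_cols[:len(cols)] == cols):
                redundant.append(name)
                break
    return redundant


# The two indexes from the example above, plus an unrelated one.
example = {
    "idx_1": ("some_column", "num_views"),
    "idx_2": ("some_column", "num_views", "another_column"),
    "idx_3": ("num_views",),
}
flagged = redundant_indexes(example)
print(flagged)
```

Here idx_3 is not flagged, because num_views is not the leading column of any other index.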
Not sure if that is an improvement, but you could check the following:
How about only adding a row for each page hit to the table instead of locking and updating the row?
And then using a count to get the results, and cache them instead of doing the count each time?
(And maybe compacting the table once per day?)
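The insert-then-compact idea could look roughly like the sketch below. It is an illustration under assumptions, not a drop-in for the forum software: the topic_hits table is hypothetical, and Python's built-in sqlite3 stands in for Percona/MySQL so the example runs on its own. Each page view appends a row, and a periodic job folds the accumulated rows into topics.num_views and clears the log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (
    id_topic  INTEGER PRIMARY KEY,
    num_views INTEGER NOT NULL DEFAULT 0);
-- Hypothetical append-only log: one row per page view.
CREATE TABLE topic_hits (id_topic INTEGER NOT NULL);
INSERT INTO topics (id_topic) VALUES (1), (2);
""")


def record_hit(conn, id_topic):
    """Cheap per-request write: append a row instead of updating topics."""
    conn.execute("INSERT INTO topic_hits (id_topic) VALUES (?)", (id_topic,))


def compact(conn):
    """Periodic job: fold logged hits into num_views, then empty the log."""
    with conn:  # both statements commit together, or roll back together
        conn.execute("""
            UPDATE topics
            SET num_views = num_views + (
                SELECT COUNT(*) FROM topic_hits h
                WHERE h.id_topic = topics.id_topic)
        """)
        conn.execute("DELETE FROM topic_hits")


for topic in (1, 1, 2):
    record_hit(conn, topic)
compact(conn)

views = dict(conn.execute("SELECT id_topic, num_views FROM topics"))
log_rows = conn.execute("SELECT COUNT(*) FROM topic_hits").fetchone()[0]
```

Whether this actually beats the single-row UPDATE depends on the workload, so it is worth measuring before adopting it. On a busy server the compaction step would also need care so that hits arriving while it runs are not deleted uncounted.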
I have a web app, which has quite a few queries being fired from every page. As more data was added to the DB, we noticed that the pages were taking longer and longer to load.
On examining PhpMyAdmin -> Status -> Joins, we noticed this (with the number in red):
Select_full_join 348.6 k The number of joins that do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.
How do I determine which joins are causing the problems? Are all the joins equally to be blamed?
How do I determine which columns should be indexed, for the performance to be proper?
We are using CakePHP + MySQL, and the queries are all auto-generated.
The rule of thumb that I have always used is that if I am using a join, the fields that I am joining on need to be indexed.
For instance, if you have a query like the following:
SELECT t1.name, t2.salary
FROM employee AS t1
INNER JOIN info AS t2 ON t1.name = t2.name;
Both t1.name and t2.name should be indexed.
Below are some good reads for this as well:
Optimizing MySQL: Importance of JOIN Order
How to optimize MySQL JOIN queries through indexing
And in general, this guy's site has some good info as well.
MySQL Optimizer Team
Edit: This is always helpful.
And if you have access to your server settings, check out:
MySQL Slow Server Logs
Once you have a log of slow queries, you can run EXPLAIN on them to see what needs indexing.
If you don't know which queries are running inefficiently, you have a couple of choices.
You could try this:
Try issuing the command SHOW FULL PROCESSLIST from phpmyadmin while your web site is active. It will show you, hopefully, a bunch of slow running queries. The FULL processlist should give you the entire query. You could then use the EXPLAIN command to figure out what it's doing.
You should also try this:
Think through the work your application is doing on behalf of your users. Think through which of your queries have to romp through lots of data to deliver value to the users. Think through which tables are growing as your application gets used more and more.
Then, find your queries that deliver that value, and that access your growing tables. Again, use the EXPLAIN command to see how MySQL is processing them, and add indexes as needed.
I suspect it will be very obvious which indexes you should add. Add the obvious ones, then let your system stabilize for a couple of workdays, then remeasure.
Notice that this is a normal part of bringing a new application into production.