Simple MySQL Query Taking 2 to 3 seconds? - mysql

I have a fairly simple process running that periodically pulls RSS feeds and updates articles in a MySQL database.
The articles table is filled to about 130k rows right now. For each article found, the processor checks to see if the article already exists. These queries almost always take 300 milliseconds, and about every 10 or 20 tries, they take more than 2 seconds.
SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
# Query_time: 2.754567 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
I have an index on the guid column, but whenever a new article is encountered it's added to the articles table, invalidating the query cache (right?).
Some of the other fields in the slow query log report 120+ rows examined.
Of course on my development machine, these queries take about 0.2 milliseconds.
The server is a virtual host from Engine Yard Solo (EC2) with 1.7GB of memory and whatever CPU EC2 ships with these days.
Any advice would be greatly appreciated.
Update
As it turns out the problem was between the chair and the keyboard.
I had an index on 'id', but was querying on 'guid'.
Adding an index on 'guid' brought the query time down to 0.2 ms each.
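For reference, the fix amounts to a one-liner along these lines (the index name is arbitrary):
ALTER TABLE articles ADD INDEX index_articles_on_guid (guid);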
Thanks for all the helpful tips everyone!

Run:
EXPLAIN SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
Notice the EXPLAIN in front. That'll tell you what MySQL is doing. It's hard to believe probing one row from an index could ever take 2.7s, unless your machine is seriously overloaded and/or thrashing. Considering the row counts of 0, I'm guessing MySQL did a full table scan to find nothing, which probably means you don't have the index you think you do.
To answer your other question, whenever you make any change to the articles table, all the query cache entries involving that table are invalidated.
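On MySQL versions that still have the query cache, you can watch this happening via the cache counters (a diagnostic sketch, assuming the query cache is enabled):
SHOW STATUS LIKE 'Qcache%';
Run it before and after an INSERT into articles; Qcache_hits stops climbing for the invalidated entries while Qcache_inserts ticks up as queries are re-cached.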

The log says that no rows were read or even examined, so the problem is not with your query but most likely with your server. EC2's Achilles' heel is its I/O performance; perhaps MySQL had to load the index from disk while the server's disks were completely saturated.
If your index is small enough to fit in memory (make sure your my.cnf allocates enough memory to key_buffer_size (MyISAM) or innodb_buffer_pool_size (InnoDB)), you should be able to preload it using
SELECT guid FROM articles
Check out the EXPLAIN to make sure it says "Using Index." If it doesn't, this one should:
SELECT guid FROM articles FORCE INDEX (guid) WHERE LENGTH(guid) > 0
Alternatively, if guid isn't your PRIMARY KEY or UNIQUE, you can drop its index and instead index a much smaller derived column to retrieve records quickly, at a fraction of the index size. The column guid_crc32 would be an INT UNSIGNED holding the CRC32 of guid:
ALTER TABLE articles ADD COLUMN guid_crc32 INT UNSIGNED, ADD INDEX guid_crc32 (guid_crc32);
UPDATE articles SET guid_crc32 = CRC32(guid);
Your SELECT query would then look like this:
SELECT id FROM articles WHERE guid = 'http://example.com/feed.rss' AND guid_crc32 = CRC32('http://example.com/feed.rss') LIMIT 1;
The optimizer should use the index on guid_crc32, which should be both faster and smaller than searching through guid.
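One caveat: new rows need guid_crc32 populated too. A minimal sketch using a trigger (assuming your insert code doesn't already set the column):
CREATE TRIGGER articles_guid_crc32 BEFORE INSERT ON articles
FOR EACH ROW SET NEW.guid_crc32 = CRC32(NEW.guid);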

If this table gets updated a lot, MySQL may not keep the index statistics up to date. Try CHECK TABLE articles to verify the table is fine, and ANALYZE TABLE articles to refresh the index statistics.
Also check whether EXPLAIN on your query gives the same results on your dev and prod machines. If the results differ, try OPTIMIZE TABLE.
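For reference, the maintenance statements in question:
CHECK TABLE articles;    -- verify table integrity
ANALYZE TABLE articles;  -- refresh index statistics (key distribution)
OPTIMIZE TABLE articles; -- rebuild and defragment the table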
Are these MyISAM or InnoDB tables?

Assuming GUID is indexed and ID is your primary key, something is "wrong." In that scenario, it is an index-only query. Perhaps the index is being bumped from memory and the disks are busy.
Depending on your update / insert / delete pattern, your database may be crying for an "optimize" command.
SQL Commands I'd like to see the output of:
show table status like 'articles';
explain SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
explain articles;
System commands I'd like to see the output of (assuming Linux):
iostat 5 5
Tell us how much memory you actually have, because if it's really 1.7MB, something is wrong, or something really exciting is happening.
Edit: how much memory is allocated to your SQL server in my.cnf?
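For example, the memory-related lines in my.cnf look something like this (values illustrative, not recommendations):
[mysqld]
key_buffer_size = 256M          # MyISAM index cache
innodb_buffer_pool_size = 1G    # InnoDB data and index cache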

Related

Running even basic SQL queries on a >60 GB table in MariaDB

I am running MariaDB on a vServer (8 CPU vCores, 32 GB RAM) with a few dozen database tables which aggregate data from external services around the web for efficient use across my collection of websites (basically an API layer with database caching and its own API for easy use in all of my projects).
All but one of these database tables allow quick, basic queries such as
SELECT id, content FROM tablename WHERE date_added > somedate
(with "content" being some JSON data). I am using InnoDB as the storage engine to allow inserts without table locking, "id" is always the primary key in any table and most of these tables only have a few thousand or maybe a few hundred thousand entries, resulting in a few hundred MB.
One table where things don't work properly, though, already has >6 million entries (potentially heading toward 100 million) and uses >60 GB including indexes. I can insert, update and select by "id", but anything more complex (e.g. involving a search on 1 or 2 additional fields or sorting the results) runs forever. Example:
SELECT id FROM tablename WHERE extra = ''
This query would select entries where "extra" is empty. There is an index on "extra" and
EXPLAIN SELECT id FROM tablename WHERE extra = ''
tells me it is just a SIMPLE query with the correct index automatically chosen ("Using where; Using index"). If I set a low LIMIT I am fine, but when selecting thousands of results the query never stops running. Using more than one field in my search, even with a combined index and the index name explicitly added to the query, I'm out of luck as well.
Since there is more than enough storage available on my vServer, and MariaDB/InnoDB table size limits are nowhere near this low, I suspect some setting or other limitation is preventing me from running queries on larger database tables. Looking through all the settings of MariaDB, I couldn't find anything appropriate, though.
I would be glad if someone could point me in the right direction.

mysql - Deleting Rows from InnoDB is very slow

I have a MySQL database with approx. 1 TB of data. Table fuelinjection_stroke has approx. 1,000,000,000 rows. DBID is the primary key, automatically incremented by one with each insert.
I am trying to delete the first 1,000,000 rows using a very simple statement:
DELETE FROM fuelinjection_stroke WHERE DBID < 1000000;
This query is taking very long (>24h) on my dedicated 8-core Xeon server (32 GB memory, SAS storage).
Any idea whether the process can be sped up?
I believe your table becomes locked. I've faced the same problem and found that deleting 10k records at a time is pretty fast. So you might want to write a simple script/program which deletes records in chunks.
DELETE FROM fuelinjection_stroke WHERE DBID < 1000000 LIMIT 10000;
And keep executing it until it deletes everything.
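If you want to keep the loop inside MySQL, here's a minimal sketch as a stored procedure (the procedure name delete_in_chunks is made up):
DELIMITER //
CREATE PROCEDURE delete_in_chunks()
BEGIN
  REPEAT
    -- each pass deletes at most 10k rows, keeping locks and undo short-lived
    DELETE FROM fuelinjection_stroke WHERE DBID < 1000000 LIMIT 10000;
  UNTIL ROW_COUNT() = 0 END REPEAT;
END //
DELIMITER ;
CALL delete_in_chunks();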
Are you space deprived? Is down time impossible?
If not, you could add a new INT column of length 1 and default it to 1 for "active" (or whatever your terminology is) and 0 for "inactive". Actually, you could use 0 through 9 as 10 different states if necessary.
Adding this new column will take a looooooooong time, but once it's over, your UPDATEs should be lightning fast as long as you do it off the PRIMARY (as you do with your DELETE) and you don't index this new column.
The reason why InnoDB takes so long to DELETE on such a massive table as yours is because of the cluster index. It physically orders your table based upon your PRIMARY (or first UNIQUE it finds...or whatever it feels like if it can't find PRIMARY or UNIQUE), so when you pull out one row, it now reorders your ENTIRE table physically on the disk for speed and defragmentation. So it's not the DELETE that's taking so long. It's the physical reordering after that row is removed.
When you create a new INT column with a default value, the space will be filled, so when you UPDATE it, there's no need for physical reordering across your huge table.
I'm not sure exactly what your schema is, but using a column for a row's state is much faster than DELETEing; however, it will take more space.
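A sketch of that approach (the column name active is illustrative):
ALTER TABLE fuelinjection_stroke ADD COLUMN active INT(1) NOT NULL DEFAULT 1;
-- "deleting" then becomes a fast update driven off the PRIMARY key:
UPDATE fuelinjection_stroke SET active = 0 WHERE DBID < 1000000;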
Try setting values:
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT (for non-Windows machines)
innodb_buffer_pool_size=25GB (currently it is close to 21GB)
innodb_doublewrite=0
innodb_support_xa=0
innodb_thread_concurrency=0...1000 (try different values, beginning with 200)
References:
The MySQL docs for descriptions of the different variables.
MySQL Server Setting Tuning
MySQL Performance Optimization basics
http://bugs.mysql.com/bug.php?id=28382
What indexes do you have?
I think your issue is that the delete is rebuilding the index on every iteration.
I'd delete the indexes if any, do the delete, then re-add the indexes. It'll be far faster (I think).
I was having the same problem, and my table has several indices that I didn't want to have to drop and recreate. So I did the following:
CREATE TABLE keepers
  SELECT * FROM origTable WHERE {clause to retrieve rows to preserve};
TRUNCATE TABLE origTable;
INSERT INTO origTable
  SELECT NULL, keepers.col2, ... keepers.col(last) FROM keepers;
DROP TABLE keepers;
About 2.2 million rows were processed in about 3 minutes.
Your database may be checking for records in other tables that need to be modified through foreign keys (cascading deletes).
But I-Conica's answer is a good point (+1). Deleting a single record and updating a lot of indexes, done 100,000 times, is inefficient. Just drop the index, delete all records, and create it again.
And of course, check if there is any kind of lock in the database. One user or application can lock a record or table, and your query will wait until the user releases the resource or it reaches a timeout. One way to check whether your database is doing real work or just waiting is to launch the query from a connection that sets the innodb_lock_wait_timeout parameter to a few seconds. If it fails, at least you know that the query is OK and that you need to find and release that lock. Examples of locks are SELECT * FROM xxx FOR UPDATE and uncommitted transactions.
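For example, from a separate session:
SET SESSION innodb_lock_wait_timeout = 5;  -- fail fast instead of waiting
DELETE FROM fuelinjection_stroke WHERE DBID < 1000000;
If the DELETE errors out with a lock wait timeout, something else is holding a lock.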
For such large tables, I'd rather use MyISAM, especially if there aren't a lot of transactions needed.
I don't know the exact answer to your question, but here's another way to delete those rows; please try this:
DELETE FROM fuelinjection_stroke WHERE DBID IN
(
    -- "SELECT TOP" is SQL Server syntax; MySQL uses LIMIT, and the extra
    -- derived table works around MySQL's ban on LIMIT in an IN subquery
    SELECT DBID FROM
    (SELECT DBID FROM fuelinjection_stroke ORDER BY DBID ASC LIMIT 1000000) AS t
);

is breaking up large MySQL queries beneficial

I have a MySql table that has about 5.5 million rows of data. Every month I need to reload this data with a new data file. Since it takes a while to remove the old data, I'm adding limit 2000000 to split the job up into chunks. Example:
DELETE FROM `list_content` WHERE list_id = 3 limit 2000000
My theory is that memory might be released after the query is done, and splitting it into chunks like this might be beneficial in not consuming resources. However I haven't found anything that supports my theory. Is there any benefit to splitting up a query like this instead of just letting it run for 20 minutes?
To answer your question directly - there is no benefit. Assuming that you're not hitting any physical limits (like the maximum amount of physical memory your OS supports) and that MySQL is a well written program, then no.
A couple things you might consider
(1)
DELETE QUICK FROM `list_content` WHERE list_id = 3
followed by a
OPTIMIZE TABLE `list_content`
DELETE QUICK will not perform any housekeeping on the index blocks as it's deleting. You can then do all the housekeeping at once with the OPTIMIZE TABLE statement.
(2)
If you're wiping out the whole table, then load the new data into a new table... call it list_temp
DROP TABLE `list_content`; -- very quick since this is simply a file delete op
RENAME TABLE `list_temp` to `list_content`; -- also quick since this is a file rename op
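Putting it together, a sketch of the full reload (the CSV path and delimiters are made up for illustration):
CREATE TABLE `list_temp` LIKE `list_content`;
LOAD DATA INFILE '/tmp/new_list.csv' INTO TABLE `list_temp`
  FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
DROP TABLE `list_content`;
RENAME TABLE `list_temp` TO `list_content`;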

mysql: slow query on indexed field

The orders table has 2m records. There are ~900K unique ship-to-ids.
There is an index on ship_to_id (the field is int(8)).
The query below takes nearly 10 minutes to complete. I've run SHOW PROCESSLIST, which shows Command = Query and State = Sending data.
When I run explain, the existing index is used, and possible_keys is NULL.
Is there anything I should do to speed this query up? Thanks.
SELECT
ship_to_id as customer_id
FROM orders
GROUP BY ship_to_id
HAVING SUM( price_after_discount ) > 0
It does not look like you have a useful index. Try adding an index on price_after_discount, and add a WHERE condition like this:
WHERE price_after_discount > 0
to minimize the number of rows you need to sum, as you can obviously discard any that are 0 (assuming prices are never negative).
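A sketch of the rewritten query:
SELECT ship_to_id AS customer_id
FROM orders
WHERE price_after_discount > 0
GROUP BY ship_to_id
HAVING SUM(price_after_discount) > 0;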
Also try running "top" command and look at the io "wait" column while the query is running. If its high, it means your query causes a lot of disk I/O. You can increase various memory buffers if you have the RAM to speed this up (if you're using innodb) or myisam is done through filesystem cacheing. Restarting the server will flush these caches.
If you do not have enough RAM (and you shouldn't need too much for 2M records), then consider a partitioning scheme, maybe on the ship_to_id column (if your version of MySQL supports it).
If some of the orders in that table are no longer current (i.e. not going to change again), you could archive them off into another table to reduce how much data has to be scanned.
Another option is to throw a last_modified timestamp on the table with an index. You could then keep track of when the query is run and store the results in another table (query_results). When it's time to run the query again, you would only need to select the orders that were modified since the last time the query was run, then use that to update the query_results. The logic is a little more complicated, but it should be much faster assuming a low percentage of the orders are updated between query executions.
MySQL will use an index for a GROUP BY, at least according to the documentation.
To be most useful, all the columns used in the query should be in the index. This prevents the engine from having to reference the original data as well as the index. So, try an index on orders(ship_to_id, price_after_discount).
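For example (the index name is arbitrary):
ALTER TABLE orders ADD INDEX idx_ship_price (ship_to_id, price_after_discount);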

Approximately how long should it take to delete 10m records from an MySQL InnoDB table with 30m records?

I am deleting approximately 1/3 of the records in a table using the query:
DELETE FROM `abc` LIMIT 10680000;
The query appears in the processlist with the state "updating". There are 30m records in total. The table has 5 columns and two indexes, and when dumped to SQL the file is about 9GB.
This is the only database and table in MySQL.
This is running on a machine with 2GB of memory, a 3 GHz quad-core processor and a fast SAS disk. MySQL is not performing any reads or writes other than this DELETE operation. No other "heavy" processes are running on the machine.
This query has been running for more than 2 hours -- how long can I expect it to take?
Thanks for the help! I'm pretty new to MySQL, so any tidbits about what's happening "under the hood" while running this query are definitely appreciated.
Let me know if I can provide any other information that would be pertinent.
Update: I just ran a COUNT(*), and in 2 hours, it's only deleted 200k records. I think I'm going to take Joe Enos' advice and see how well inserting the data into a new table and dropping the previous table performs.
Update 2: Sorry, I actually misread the number. In 2 hours, it's not deleted anything. I'm confused. Any suggestions?
Update 3: I ended up using mysqldump with --where "true LIMIT 10680000,31622302" and then importing the data into a new table. I then deleted the old table and renamed the new one. This took just over half an hour.
Don't know if this would be any better, but it might be worth thinking about doing the following:
Create a new table and insert 2/3 of the original table into the new one.
Drop the original table.
Rename the new table to the original table's name.
This would prevent the log file from having to record all the deletes, but I don't know if inserting 20m records is faster than deleting 10m.
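A sketch of that approach, assuming the auto-increment primary key is named id and the rows to discard are the 10,680,000 with the lowest ids (both assumptions are illustrative; the WHERE also assumes ids are contiguous from 1):
CREATE TABLE `abc_new` LIKE `abc`;
INSERT INTO `abc_new` SELECT * FROM `abc` WHERE id > 10680000;
DROP TABLE `abc`;
RENAME TABLE `abc_new` TO `abc`;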
You should post the table definition.
Also, to understand why it is taking so much time, try enabling profiling on the delete request via:
SET profiling=1;
DELETE FROM abc LIMIT 10680000;
SET profiling=0;
SHOW PROFILES;
SHOW PROFILE ALL FOR QUERY X; (X is the ID of your query shown in SHOW PROFILES)
and post what it returns (but I think the query must finish before it will return the profiling data).
http://dev.mysql.com/doc/refman/5.0/en/show-profiles.html
Also, I think you'll get more responses on ServerFault ;)
When you run this query, the InnoDB log file for the database is used to record all the details of the rows that are deleted - and if this log file isn't large enough from the outset, it'll be auto-extended as and when necessary (if configured to do so). I'm not familiar with the specifics, but I expect this auto-extension is not blindingly fast. 2 hours does seem like a long time, but it doesn't surprise me if the log file is growing as the query runs.
Is the table from which the records are being deleted on the end of a foreign key (i.e. does another table reference it through a FK constraint)?
I hope your query has ended by now ... :) but from what I've seen, LIMIT with large numbers (and I've never tried numbers this large) is very slow. I would try something based on the PK, like
DELETE FROM abc WHERE abc_pk < 10680000;