I have one MyISAM table with 620,000 rows. I'm running XAMPP on a dual-core server with 2GB RAM. Apache is installed as a Windows service; MySQL is controlled from the XAMPP control panel.
The query below is taking 30+ seconds to run.
select `id`,`product_name`,`search_price`,`field1`,`field2`,
`field3`,`field4`
from `all`
where MATCH (`product_name`) AGAINST ('searchterm')
AND `search_price` BETWEEN 0 AND 1000
ORDER BY `search_price` DESC
LIMIT 0, 30
I have a FULLTEXT index on product_name, a BTREE index on search_price, and auto-increment on id.
If I EXPLAIN the above query, the results are:
id: 1
select_type: SIMPLE
table: all
type: fulltext
possible_keys: search_price,FULLTEXT_product_name
key: FULLTEXT_product_name
key_len: 0
ref: NULL
rows: 1
Extra: Using where; Using filesort
How can I speed up this query? Should it be taking this long on a table of 620,000 rows?
I've just noticed that this only happens when the database has not been queried for a while, so I'm guessing it has to do with the cache: the first query takes 30+ seconds, and if I run it a second time it takes 1 second.
MySQL will do the fulltext search first, then look up the rest of the info, filter on price, sort on price, and finally deliver 30. There is essentially no way to shorten that process.
Yes, caching is likely to be the explanation for 30 seconds becoming 1 second.
Switching to InnoDB (which now has FULLTEXT) may provide some benefits.
If running entirely MyISAM, do you have key_buffer_size set to about 20% of available RAM? If you were much lower (or higher) than this, that could cause performance problems.
If running entirely InnoDB, set innodb_buffer_pool_size to about 70% of available RAM.
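As a rough sketch of how to check and adjust these settings: the 400 MB figure below simply applies the 20% guideline to the asker's 2 GB machine and is an assumption, not a measured recommendation.

```sql
-- Show the current sizes (in bytes).
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- MyISAM only: roughly 20% of 2 GB.
-- key_buffer_size is a dynamic variable, so this applies without a restart,
-- but the value is lost when the server restarts.
SET GLOBAL key_buffer_size = 400 * 1024 * 1024;
```

To keep the value across restarts, set it under [mysqld] in the server's configuration file (my.ini on a Windows XAMPP install, my.cnf elsewhere).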
MySQL's FULLTEXT search is somewhat limited once the size of the table goes above 300,000 rows, and it will perform even worse if you search for really common words (in, the, of, and so on, which are commonly marked as stop words). I recommend using Sphinx full-text search or Apache Lucene.
I am running MariaDB on a vServer (8 CPU vCores, 32 GB RAM) with a few dozen database tables which aggregate data from external services around the web for efficient use across my collection of websites (basically an API layer with database caching and its own API for easy use in all of my projects).
All but one of these database tables allow quick, basic queries such as
SELECT id, content FROM tablename WHERE date_added > somedate
(with "content" being some JSON data). I am using InnoDB as the storage engine to allow inserts without table locking, "id" is always the primary key in any table and most of these tables only have a few thousand or maybe a few hundred thousand entries, resulting in a few hundred MB.
One table where things don't work properly though has already >6 million entries (potentially heading to 100 million) and uses >60 GB including indexes. I can insert, update and select by "id" but anything more complex (e.g. involving a search in 1 or 2 additional fields or sorting the results) runs into infinity. Example:
SELECT id FROM tablename WHERE extra = ''
This query would select entries where "extra" is empty. There is an index on "extra" and
EXPLAIN SELECT id FROM tablename WHERE extra = ''
tells me it is just a SIMPLE query with the correct index automatically chosen ("Using where; Using index"). If I set a low LIMIT I am fine, but when selecting thousands of results the query never stops running. Using more than one field in my search leaves me out of luck as well, even with a combined index and with the name of that index explicitly added to the query.
Since there is more than enough storage available on my vServer and MariaDB/InnoDB don't have such low size limits for tables there might be some settings or other limitations that would prevent me from running queries on larger database tables. Looking through all the settings of MariaDB I couldn't find anything appropriate though.
I would be glad if someone could point me in the right direction.
I'm trying to run the below query and it's taking hours and hours. We've got a dedicated server for the queries (not running on localhost).
It's an InnoDB table with around 74 million rows. I've indexed the two columns involved in the grouping (TRAN_URN, UCI) in the hope of speeding up the query.
insert into data.urn_uci_lookup (TRAN_URN, UCI, `Count`)
select TRAN_URN,UCI, count(*) as `Count`
from data.diablo18
group by TRAN_URN, UCI
Is this inefficient for some reason? How can I improve it?
EDIT: Here is the EXPLAIN plan
id: 1
select_type: SIMPLE
table: diablo18
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 74631102
Extra: Using temporary; Using filesort
Cheers,
Lucas
This query is going to read the entire 74 million rows. It is also going to recreate much of the table in a new table, depending on how many groups you have.
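The EXPLAIN output in the question shows no usable key (possible_keys is NULL), so the grouping is done with a temporary table and a filesort. One thing that may be worth testing is a single composite index covering both grouping columns. This assumes the two columns are currently indexed separately or not at all, and the index name below is made up.

```sql
-- Hypothetical index name. Building it on 74 million rows will itself take a while.
ALTER TABLE data.diablo18 ADD INDEX idx_tran_urn_uci (TRAN_URN, UCI);

-- Re-check the plan. Ideally "Using index" replaces
-- "Using temporary; Using filesort" in the Extra column.
EXPLAIN
SELECT TRAN_URN, UCI, COUNT(*) AS `Count`
FROM data.diablo18
GROUP BY TRAN_URN, UCI;
```

Even with such an index, the query still has to visit every index entry, so this reduces the work done per row rather than the number of rows read.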
We don't have enough information about your server or data set to do much but make educated guesses.
You want to look into your InnoDB configuration, especially how much memory you have allocated to it. It should be almost the entirety of the server's available RAM, less what's needed for basic overhead (the more the better), as described in https://dev.mysql.com/doc/refman/5.5/en/innodb-buffer-pool.html.
Your server io subsystem may be the bottleneck. If IO is slow, the server may just be stuck trying to keep up with the required reads/writes of this query. Setting up a high performance database server is much more complicated than installing the mysql server on a "dedicated" machine.
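A quick way to get a first impression of whether memory or I/O is the limit is to compare the buffer pool size with the buffer pool read counters. This is only a sketch, and the counters are cumulative since server start, so they are best sampled before and after running the query.

```sql
-- Configured buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Innodb_buffer_pool_read_requests: logical read requests.
-- Innodb_buffer_pool_reads: reads that could not be served from the pool
--                           and had to go to disk.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If Innodb_buffer_pool_reads grows quickly relative to Innodb_buffer_pool_read_requests while the query runs, the working set probably does not fit in the buffer pool and the disks are doing most of the work.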
I have a table of 14 million rows (around 3.1 GB) and I am trying to perform a full-text search on it. The query is really slow: it takes around 9 seconds for a simple boolean-mode AND query. The same query executes instantly on my private cluster. Can someone explain this behavior of the RDS instance?
SELECT count(*)
FROM table_name WHERE id=97
AND match(body) against ('+data +big' IN BOOLEAN MODE)
A high IO rate often indicates insufficient memory, or buffers that are too small. A 3GB table, including indexes, should fit entirely in the memory of a (much-less-than) $500-per-month dedicated server.
MySQL has many different buffers, and as many parameters to fiddle with. The following buffers are the most important; compare their sizes in the two environments:
If InnoDB: innodb_buffer_pool_size
If MyISAM: key_buffer_size and read_buffer_size
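To make that comparison, the same statement can be run on both servers and the values (in bytes) set side by side:

```sql
-- Run on the private cluster and on the RDS instance, then compare.
SHOW VARIABLES
WHERE Variable_name IN ('innodb_buffer_pool_size',
                        'key_buffer_size',
                        'read_buffer_size');
```

On RDS these values come from the instance's DB parameter group rather than from a my.cnf file you edit directly.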
Have you added a FULLTEXT index on the body column? If not, try this; it should make a big difference:
ALTER TABLE `table_name` ADD FULLTEXT INDEX `bodytext` (`body`);
Hope it helps
Try this
SELECT count(1)
FROM table_name WHERE id=97
AND match(body) against ('+data +big' IN BOOLEAN MODE)
COUNT(1) counts rows, just as COUNT(*) does; neither form reads every column. MySQL handles the two the same way, so expect little or no difference from this change.
Can you post the explain itself?
Since the DB version, table, indexes and execution plans are the same, you need to compare the machine/cluster configurations. The main points of comparison are available CPU power, cores used in a single transaction, storage read speed, memory size and memory read speed/frequency. Amazon provides a variety of configurations, so maybe your private cluster is much more powerful than the Amazon RDS instance configuration.
To add to above, you can level the load between CPU, IO and Memory to increase throughput.
With match() against() you search across your entire 3GB fulltext index, and there is no way to force another index in this case.
To speed up your query you need to make your fulltext index lighter so you can:
1 - clean all the useless characters and stopwords from your fulltext index
2 - create multiple fulltext indexes and pick the appropriate one
3 - change fulltext searches to a LIKE clause and force another index such as 'id'.
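Option 3 could look roughly like the following. It is a sketch that assumes an ordinary index on id already exists, so the LIKE filters only run over the rows with id = 97.

```sql
-- Substring matching instead of full-text matching.
-- MySQL can use the index on id to narrow the rows first,
-- then apply the LIKE conditions to what is left.
SELECT COUNT(*)
FROM table_name
WHERE id = 97
  AND body LIKE '%data%'
  AND body LIKE '%big%';
```

Note that LIKE matches substrings rather than whole words, so '%big%' also matches "bigger"; the counts may differ from the boolean-mode full-text query.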
Try placing id in the text index and say:
match(BODY,ID) against ('+big +data +97' IN BOOLEAN MODE) and id=97
You might also look at sphinx which can be used with MySQL easily.
I have a fairly simple process running that periodically pulls RSS feeds and updates articles in a MySQL database.
The articles table is filled to about 130k rows right now. For each article found, the processor checks to see if the article already exists. These queries almost always take 300 milliseconds, and about every 10 or 20 tries, they take more than 2 seconds.
SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
# Query_time: 2.754567 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
I have an index on the guid column but whenever a new article is encountered, it's added to the articles table - invalidating the query cache (right?).
Some of the other fields in the slow query log report 120+ rows examined.
Of course on my development machine, these queries take about 0.2 milliseconds.
The server is a virtual host from Engine Yard Solo (EC2) with 1.7GB of memory and whatever CPU EC2 ships with these days.
Any advice would be greatly appreciated.
Update
As it turns out the problem was between the chair and the keyboard.
I had an index on 'id', but was querying on 'guid'.
Adding an index on 'guid' brought the query time down to 0.2 ms each.
Thanks for all the helpful tips everyone!
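For reference, the fix described in the update would look something like this. The index name is hypothetical, and the statement assumes guid is a VARCHAR column short enough to be indexed in full.

```sql
-- Hypothetical index name. If guid were a TEXT column, a prefix length
-- such as guid(255) would be required here.
ALTER TABLE articles ADD INDEX idx_guid (guid);

-- Confirm that the lookup now uses the new index.
EXPLAIN SELECT id FROM `articles`
WHERE guid = 'http://example.com/feed.rss' LIMIT 1;
```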
Run:
EXPLAIN SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
Notice the EXPLAIN in front. That'll tell you what MySQL is doing. It's hard to believe probing one row from an index could ever take 2.7s, unless your machine is seriously overloaded and/or thrashing. Considering the row counts of 0, I'm guessing MySQL did a full table scan to find nothing, which probably means you don't have the index you think you do.
To answer your other question, whenever you make any change to the articles table, all the query cache entries involving that table are invalidated.
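To see how much the query cache is actually helping, the server's cache counters can be inspected. This applies only to MySQL versions that still have the query cache; it was removed in MySQL 8.0.

```sql
-- Is the query cache enabled, and how large is it?
SHOW VARIABLES LIKE 'query_cache%';

-- Qcache_hits versus Qcache_inserts gives a rough idea of how often
-- cached results are reused before being thrown away.
SHOW GLOBAL STATUS LIKE 'Qcache%';
```

On a table that receives frequent inserts, a low number of hits relative to inserts would be consistent with the invalidation behaviour described above.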
The log says that no rows were read or even examined, so the problem is not with your query but most likely with your server. EC2's Achilles' heel is its IO/s, perhaps MySQL had to load the index from disk but the server's disks were completely saturated.
If your index is small enough to fit in memory (make sure your my.cnf allocates enough memory to key_buffer (MyISAM) or innodb_buffer_pool_size (InnoDB)), you should be able to preload it using
SELECT guid FROM articles
Check out the EXPLAIN to make sure it says "Using Index." If it doesn't, this one should:
SELECT guid FROM articles FORCE INDEX (guid) WHERE LENGTH(guid) > 0
Alternatively, if guid isn't your PRIMARY KEY or UNIQUE, you may remove its index and create another indexed column used to retrieve records quickly at a fraction of the index size. The column guid_crc32 would be an INT UNSIGNED and would hold the CRC32 of guid
ALTER TABLE articles ADD COLUMN guid_crc32 INT UNSIGNED, ADD INDEX guid_crc32 (guid_crc32);
UPDATE articles SET guid_crc32 = CRC32(guid);
Your SELECT query would then look like this:
SELECT id FROM articles WHERE guid = 'http://example.com/feed.rss' AND guid_crc32 = CRC32('http://example.com/feed.rss') LIMIT 1;
The optimizer should use the index on guid_crc32, which should be both faster and smaller than searching through guid.
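With this scheme, every new row has to store the checksum as well, otherwise the lookup will miss it. A minimal sketch, assuming the remaining columns of articles have defaults or are filled in elsewhere:

```sql
-- Keep guid_crc32 in sync when inserting a new article.
INSERT INTO articles (guid, guid_crc32)
VALUES ('http://example.com/feed.rss',
        CRC32('http://example.com/feed.rss'));
```

The SELECT keeps the guid = '...' condition alongside the guid_crc32 condition because two different GUIDs can share the same CRC32 value; the index narrows the candidates and the string comparison picks the exact match.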
If this table gets updated a lot, then MySQL may not update the index counts properly. Try "CHECK TABLE articles" to update the index counts and see if your table is fine.
Also check whether running EXPLAIN on your query gives the same results on your dev and prod machines. If the results are different, try OPTIMIZE TABLE.
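Put together, the maintenance checks suggested above might be run like this. SHOW INDEX is an addition that is not part of the original suggestion; it simply makes the index statistics visible.

```sql
-- Verify the table; for MyISAM this also updates the key statistics.
CHECK TABLE articles;

-- Inspect the indexes and their estimated cardinality.
SHOW INDEX FROM articles;

-- Defragment the table and rebuild index statistics.
-- On MyISAM the table is locked while this runs.
OPTIMIZE TABLE articles;
```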
Are these myisam or innodb tables?
Assuming GUID is indexed and ID is your primary key, something is "wrong." In that scenario, it is an index only query. The index is being bumped from memory and the disks are busy, perhaps.
Depending on your update / insert / delete pattern, your database may be crying for an "optimize" command.
SQL Commands I'd like to see the output of:
show table status like 'articles';
explain SELECT id FROM `articles` WHERE (guid = 'http://example.com/feed.rss') LIMIT 1;
explain articles;
System commands I'd like to see the output of (assuming Linux):
iostat 5 5
Tell us how much memory you have because 1.7mb is wrong, or something really exciting is happening.
Edit: how much memory is available to your SQL server in my.cnf?
One day I suspect I'll have to learn hadoop and transfer all this data to a non-structured database, but I'm surprised to find the performance degrade so significantly in such a short period of time.
I have a mysql table with just under 6 million rows.
I am doing a very simple query on this table, and believe I have all the correct indexes in place.
the query is
SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date
the explain returns
id: 1
select_type: SIMPLE
table: updateshows
type: range
possible_keys: date_idx
key: date_idx
key_len: 7
ref: NULL
rows: 648997
Extra: Using where
So I am using the correct index as far as I can tell, but this query is taking 11 seconds to run.
The database is MyISAM, and phpMyAdmin says the table is 1.0GiB.
Any ideas here?
Edited:
The date_idx index covers both the date and venid columns. Should those be two separate indexes?
What you want to make sure is that the query will use ONLY the index, so make sure that the index covers all the fields you are selecting. Also, since a range condition is involved, you need to have venid first in the index, because it is queried as a constant. I would therefore create an index like so:
ALTER TABLE events ADD INDEX indexNameHere (venid, date, time);
With this index, all the information that is needed to complete the query is in the index. This means that, hopefully, the storage engine is able to fetch the information without actually seeking inside the table itself. However, MyISAM might not be able to do this, since it doesn't store the data in the leaves of the indexes, so you might not get the speed increase you desire. If that's the case, try to create a copy of the table, and use the InnoDB engine on the copy. Repeat the same steps there and see if you get a significant speed increase. InnoDB does store the field values in the index leaves, and allows covering indexes.
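One way to set up such an InnoDB copy for comparison is sketched below. The table name events_innodb is made up, and the sketch assumes the index from the ALTER TABLE above was already added to events, so that CREATE TABLE ... LIKE carries it over.

```sql
-- Copy the structure (columns and index definitions) of the original table.
CREATE TABLE events_innodb LIKE events;

-- Switch the empty copy to InnoDB before loading any data.
ALTER TABLE events_innodb ENGINE = InnoDB;

-- Copy the rows across; on 6 million rows this will take some time.
INSERT INTO events_innodb SELECT * FROM events;

-- Compare the plan against the one for the MyISAM table.
EXPLAIN SELECT date, time FROM events_innodb
WHERE venid = '47975' AND date >= '2009-07-11' ORDER BY date;
```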
Now, hopefully you'll see the following when you explain the query:
mysql> EXPLAIN SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date;
id select_type table type possible_keys key [..] Extra
1 SIMPLE events range date_idx, indexNameHere indexNameHere Using index, Using where
Try adding a key that spans venid and date (or the other way around, or both...)
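A sketch of that suggestion, with a made-up index name. Putting venid first lets MySQL jump straight to one venue and then scan its dates in order.

```sql
-- Hypothetical index name: equality column first, range column second.
ALTER TABLE events ADD INDEX idx_venid_date (venid, date);

-- Check which index the optimizer picks and how many rows it expects to read.
EXPLAIN SELECT date, time FROM events
WHERE venid = '47975' AND date >= '2009-07-11' ORDER BY date;
```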
I would imagine that a 6M row table should be able to be optimised with quite normal techniques.
I assume that you have a dedicated database server, and it has a sensible amount of ram (say 8G minimum).
You will want to ensure you've tuned MySQL to use your RAM efficiently. If you're running a 32-bit OS, don't. If you are using MyISAM, tune your key buffer to use a significant proportion, but not too much, of your RAM.
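Whether the MyISAM key buffer is large enough can be estimated from the server's key cache counters. This is only a rough check, and the counters are cumulative since the server started.

```sql
-- Current key buffer size, in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Key_read_requests: requests to read a key block from the cache.
-- Key_reads: requests that missed the cache and went to disk.
SHOW GLOBAL STATUS LIKE 'Key_read%';
```

The ratio Key_reads / Key_read_requests is the cache miss rate; a ratio that stays high under normal load suggests the key buffer is too small for the indexes in use.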
In any case you want to run repeated performance testing on production-grade hardware.
Try putting an index on the venid column.