I've written a SELECT statement in MySQL. The duration is 50 seconds, and the fetch is 206 seconds, which is a long time. I'd like to understand WHICH part of my query is inefficient so I can improve its run time, but I'm not sure how to do that in MySQL.
My table has a little over 1,000,000 records. I have an index built in as well:
KEY `idKey` (`id`,`name`),
Here is my query:
SELECT name, id, alt_id, count(id), min(cost), avg(resale), code from
history where name like "%brian%" group by id;
I've looked at the MySQL execution plan, but I can't tell from it what is wrong:
If I hover over the "Full Index Scan" part of the image, I see this:
Access Type: Index
Full Index Scan
Key/Index:
Used Key Parts: id, name
Possible Keys: idKey, id-Key, nameKey
Attach Condition:
(`allhistory`.`history`.`name` LIKE '%brian%')
Rows Examined Per Scan: 1098181
Rows Produced Per Join: 1098181
Filter: 100%
I know I can scan a smaller subset of the data by adding LIMIT 100 to the query, and that makes it much faster (28 second duration, 0.000 second fetch), but I want to see all the records, so I don't really want to put a limit on it.
Can someone more knowledgeable on this topic suggest where my query, my index, or my methodology might be inefficient for what I'm trying to accomplish?
The only real solution to this question is MySQL's full-text search functionality.
I don't consider LIKE with a leading wildcard a workable solution. Table scans are not a solution with millions of rows.
I wrote up an answer in this link; I hope that reference and its quick walkthrough help you find a workable solution.
Here is one of the MySQL manual pages on full-text search.
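For the history query in the question, a full-text version might look roughly like the sketch below. The index name ft_name is made up for illustration, and FULLTEXT indexes need InnoDB (MySQL 5.6 or later) or MyISAM. Note that full-text search matches words, not arbitrary substrings, so it is not an exact replacement for LIKE '%brian%'.

```sql
-- Hypothetical index name; requires InnoDB (MySQL 5.6+) or MyISAM.
ALTER TABLE history ADD FULLTEXT INDEX ft_name (name);

-- 'brian*' matches words that START with "brian"; unlike LIKE '%brian%',
-- it does not match words that merely contain it (e.g. "obrian").
-- The select list is copied from the question; with ONLY_FULL_GROUP_BY
-- enabled, the non-aggregated columns would need ANY_VALUE() or aggregates.
SELECT name, id, alt_id, COUNT(id), MIN(cost), AVG(resale), code
FROM history
WHERE MATCH(name) AGAINST('brian*' IN BOOLEAN MODE)
GROUP BY id;

-- Check that the plan now uses the fulltext index instead of a full index scan.
EXPLAIN
SELECT id, COUNT(id)
FROM history
WHERE MATCH(name) AGAINST('brian*' IN BOOLEAN MODE)
GROUP BY id;
```

If substring matches inside words are really required, this approach will miss rows, and the full scan is hard to avoid.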
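I'm thinking your covering index may be backwards. Try switching the order to (name, id) so that the WHERE clause can take advantage of the index. Bear in mind that a pattern with a leading wildcard, like '%brian%', cannot be used for an index range lookup; the reordered index only narrows the search if the pattern is anchored at the start ('brian%').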
Related
I have the following query to get the 15 most sold plates in a place:
This query takes 12 seconds to execute over 100,000 rows. I think that is too long, so I am looking for a way to optimize the query.
I ran the EXPLAIN command in phpMyAdmin and got this:
[image: EXPLAIN output]
According to this, the main problem is the p table, which is being scanned in full, but how can I fix this? The id of the p table is a primary key; do I also need to add it as an index? Also, is there anything else I can do to make the query run faster?
You can define a relationship between the two tables:
https://database.guide/how-to-create-a-relationship-in-mysql-workbench/
Besides this, you can also use a LEFT JOIN so you won't load the whole right table.
ORDER BY can be slow in MySQL; if you process the results in code afterwards, you can sort them there instead, which may be faster.
I hope I helped and Community feel free to edit :)
You did include the explain plan, but you did not give any information about your table structure, data distribution, cardinality, or volumes. Assuming your indexes are accurate and you have an even data distribution, the query has to process over 12 million rows, not 100,000. Even then, that is relatively poor performance. But you never told us what hardware this sits on, nor the background load.
A query with so many joins is always going to be slow - are they all needed?
the main problem is on the p table which is scanning the entire table
Full table scans are not automatically bad. The cost of dereferencing an index lookup as opposed to a streaming read is about 20 times more. Since the only constraint you apply to this table is its joins to other tables, there's nothing in the question you asked to suggest there is much scope for improving this.
I have a table with about 22 million rows and about 20 columns containing property data. Currently a query like:
SELECT * FROM fulldataset WHERE county = 'MIDDLESBROUGH'
takes an average of 42 seconds to run. To try and improve this, I created an index on the county column like this:
ALTER TABLE fulldataset ADD INDEX county (county)
There has been no improvement at all in the speed of the same query.
So I used EXPLAIN SELECT to try and find out what was happening. If I SELECT * from countyA, it returns around 85k entries, after ~42 seconds. If I EXPLAIN SELECT the same query it says it's using the county Index I created and that the number of rows is around 167k, which is wrong but better than searching all 22 million.
Likewise, if I SELECT * for countyB I get around 48k results and EXPLAIN SELECT tells me there are around 91k rows. The EXPLAIN SELECT statement returns the result instantly, so it's able to instantly tell that there are around half as many entries for countyB as there are for countyA. The problem is the queries don't execute any faster. If it's only checking 91k rows shouldn't it be very quick?
Here's a screenshot of what I'm doing: image
EDIT: As pointed out, the query itself is not what is taking time. In answer to my own question in the comments, a multiple column index worked wonders.
The query is not the problem. If you look closely at the output of your program you will see that the query execution took less than 1s, but fetching all the rows took 42s.
If you have to wait 42s before you see anything, then I recommend using another querying tool that fetches only the first X rows and displays them in pages.
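The same paging idea can be expressed in the query itself. This is only a sketch: it assumes the table has a primary key column called id, which the question does not show.

```sql
-- First page of 100 rows. Ordering by the (assumed) primary key `id`
-- keeps the pages stable from one request to the next.
SELECT *
FROM fulldataset
WHERE county = 'MIDDLESBROUGH'
ORDER BY id
LIMIT 100 OFFSET 0;

-- Second page: same query, with the offset advanced by the page size.
SELECT *
FROM fulldataset
WHERE county = 'MIDDLESBROUGH'
ORDER BY id
LIMIT 100 OFFSET 100;
```

Each page transfers only 100 rows to the client, so the fetch time stays small even though the full result has tens of thousands of rows.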
EXPLAIN is designed to be fast. To achieve that, its "Rows" value is only a crude estimate. It can often be off by a factor of 2, so don't read too much into 85K vs 167K.
Since EXPLAIN is delivering only a single row (or a small number of rows), the "fetch" time is very low.
If you are selecting the AVG() of some column, it has to first read all the relevant rows, doing the computation as it goes. It cannot even start to deliver data until it has finished all the reading.
If you are reading all the rows, it can (but I am not sure that it does) start delivering rows starting with the first row.
If you do something like SELECT * FROM tbl ORDER BY x (and x is not indexed), then you get the worst of both worlds. First it has to read all the rows and write them to a temp table, then it sorts that temp table; only then can it begin to deliver the rows.
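One way to see whether a sort step is involved is to look at the Extra column of EXPLAIN. The sketch below reuses the placeholder names tbl and x from the paragraph above; the index name idx_x is made up.

```sql
-- Without an index on x, Extra typically shows "Using filesort".
EXPLAIN SELECT * FROM tbl ORDER BY x LIMIT 100;

-- Hypothetical index on the sort column.
ALTER TABLE tbl ADD INDEX idx_x (x);

-- With the index in place, the optimizer can often read rows in index order;
-- if "Using filesort" disappears from Extra, no separate sort pass is needed.
EXPLAIN SELECT * FROM tbl ORDER BY x LIMIT 100;
```

Without the LIMIT, the optimizer may still prefer a full scan plus sort, so the plan is worth checking for the exact query you intend to run.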
I think "duration" and "fetch" are not very useful; the sum of the two is more useful. Here's another example of it: Mysql same querys one with index second without getting 10000xFetch time?
Notice how the sum is consistent, but the separation is not.
I was trying to optimize my MySQL queries, but found out that I'm actually doing this wrong. I changed my query from:
SELECT * FROM test WHERE tst_title LIKE '1%'
To:
SELECT * FROM `test` WHERE MATCH(tst_title) AGAINST("+1*" IN BOOLEAN MODE)
And the runtime for the FULLTEXT query was terrible. See the results below:
USING LIKE:
Showing rows 0 - 24 (1960 total, Query took 0.0004 sec)
USING FULLTEXT:
Showing rows 0 - 24 (1960 total, Query took 0.0033 sec)
I've read many tutorials explaining why you should use FULLTEXT (since it actually searches by index). But how can it be slower at retrieving data than the LIKE statement (since LIKE has to go through every single record to check whether it matches)?
I literally can't figure out why this is happening. Help with optimization would be much appreciated!
Unless you set the minimum word length (ft_min_word_len for MyISAM, innodb_ft_min_token_size for InnoDB) to a smaller number than the default, FULLTEXT cannot find all the values starting with 1.
If tst_title is a numeric value (INT, FLOAT, etc.), then both LIKE and FULLTEXT are terrible ways to do the query.
Given INDEX(tst_title) (and it being VARCHAR or TEXT), LIKE is likely to run faster, since it only has to check the entries starting with 1.
The timings you list smell like the Query Cache took over. For timing purposes, use SELECT SQL_NO_CACHE ... to avoid the QC.
If you use MATCH or LIKE without having FULLTEXT or INDEX, respectively, then the query has no choice but to scan all rows in the table.
Where did 1960 total come from? Does the timing include computing that?
Is the table MyISAM? Or InnoDB? There are differences in FULLTEXT that may factor in this thread.
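The checks suggested above could be run along these lines. This is a sketch, not a definitive diagnosis: the index name idx_tst_title is made up, and it assumes tst_title is a VARCHAR column (a TEXT column would need a prefix length in the index definition).

```sql
-- Which storage engine is the table using? (See the Engine column.)
SHOW TABLE STATUS LIKE 'test';

-- Minimum indexed word length for FULLTEXT (MyISAM and InnoDB respectively).
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Hypothetical ordinary index so the prefix LIKE can do a range scan.
ALTER TABLE test ADD INDEX idx_tst_title (tst_title);

-- A pattern anchored at the start can use the index; check the "key" column.
EXPLAIN SELECT * FROM test WHERE tst_title LIKE '1%';

-- Bypass the query cache when timing. This only matters before MySQL 8.0,
-- where the query cache was removed.
SELECT SQL_NO_CACHE * FROM test WHERE tst_title LIKE '1%';
```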
From what I've read, if you were using ...tst_title LIKE '%1%', it would be slower, as it has to perform a full table scan and cannot use an index. The one you currently have, with the wildcard only on the right, can use an index, and that is probably why it is faster than FULLTEXT.
Not too sure on it myself, but this is what I read and hope it helps.
EDIT:
You may want to read this answer for a full explanation of FULLTEXT vs LIKE.
I have a MySQL table MtgoxTrade(id,time,price,amount,type,tid) with more than 500M records. I need to query three fields (time,price,amount) from all records:
SELECT time, price, amount FROM MtgoxTrade;
It takes 110 seconds on Windows 7, which is too slow. My questions are:
Will a compound index help with this? Note that my SQL query has no WHERE clause.
Are there any other optimizations that could improve the query performance here?
Updated: Sorry, to be clear, the MtgoxTrade table has 6 fields in total: (id,time,price,amount,type,tid). My SQL only needs to query three of them (time,price,amount). I have already tried adding a composite index on (time,price,amount), but it does not seem to help.
If this is your real query: NO, nothing could possibly help. Think about it: you are asking for the contents of the whole 500M+ row table to be delivered! It will be slow no matter what you do, because the whole table must be processed.
If you can constrain your program logic to only process some smaller subset of your table, then it is possible to make it faster.
For example, you can process only the results for the last month using a WHERE clause:
SELECT time, price, amount
FROM MtgoxTrade
WHERE time BETWEEN '2013-09-01' AND '2013-09-21'
This can be really fast, but you would still need to add an index on the time field, like this:
CREATE INDEX mtgoxtrade_time_idx ON MtgoxTrade (time);
I have a Java application, and I would like to get some data from a table and display it in the application.
I have millions of records, and the query gets really slow when I go to the last records. It takes a good few minutes to get the results.
select Id from Table1x where description like '%error%' and Id between 0 and 1329999 limit 0, 1000
The above query returns a result quickly; that is, the first pages come back fast. But when I move to the last pages, it becomes slow.
select Id from Table1x where description like '%error%' and Id between 0 and 1329999 limit 644000, 1000
This query is slow and takes 17 seconds.
Any ideas on how to make this faster? Id is the primary key of table1x.
The problem is in the LIKE. To get the first 1000 records, the database only needs to scan the table until it finds 1000 records that match the search. For the other query, the database needs to keep matching records until it has found 645000 of them, which makes it much slower. There is no sorting or other filtering, so the index on Id doesn't help at all.
An index on description would help, but not if you start the search with a wildcard, like you do now.
I see two solutions.
The first option is to add a FULLTEXT index on the description field. It allows you to look for the word error using MATCH rather than LIKE. I think it will be a lot faster, but the index will make the table larger too, and I'm not sure how well it holds up in the long run.
Second solution: since you're obviously looking for errors (I think you're building a report on a log table?), you may add a column with a record type. You can give each record a type (just an integer) that indicates whether that record holds an error or not. You will need to update your table once, and insert the type along with new records, but it will make your query faster.
I must admit that this second solution is based on assumptions about the data and your goal. If I'm wrong about that, please provide additional information and I may find a solution that suits you better.
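Both options could be sketched as follows. The index names, the record_type column, and the convention that 1 means "error" are all made up for illustration, and FULLTEXT indexes need InnoDB (MySQL 5.6 or later) or MyISAM.

```sql
-- Option 1: FULLTEXT index on description (hypothetical index name).
ALTER TABLE Table1x ADD FULLTEXT INDEX ft_description (description);

-- 'error*' matches words starting with "error" (error, errors, ...);
-- unlike LIKE '%error%', it does not match "error" buried inside a word.
SELECT Id
FROM Table1x
WHERE MATCH(description) AGAINST('error*' IN BOOLEAN MODE)
  AND Id BETWEEN 0 AND 1329999
LIMIT 644000, 1000;

-- Option 2: a record type column (hypothetical name; 1 = error, 0 = other).
ALTER TABLE Table1x ADD COLUMN record_type TINYINT NOT NULL DEFAULT 0;

-- One-time backfill; this pass still scans the table, but only once.
UPDATE Table1x SET record_type = 1 WHERE description LIKE '%error%';

-- Index that covers both the filter and the selected Id.
ALTER TABLE Table1x ADD INDEX idx_type_id (record_type, Id);

SELECT Id
FROM Table1x
WHERE record_type = 1
  AND Id BETWEEN 0 AND 1329999
LIMIT 644000, 1000;
```

With the second option, new rows need record_type set at insert time, as described above. A large offset still has to skip 644000 index entries, but it no longer has to evaluate LIKE against every description.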