Group by, Order by and Count MySQL performance - mysql

I have the following query to get the 15 most sold plates in a place:
This query takes 12 seconds to execute over 100,000 rows. I think this is far too long, so I am looking for a way to optimize the query.
I ran the EXPLAIN SQL command in phpMyAdmin and I got this:
(screenshot of the EXPLAIN output)
According to this, the main problem is on the p table, which is scanning the entire table, but how can I fix this? The id of the p table is a primary key; do I need to set it as an index as well? Also, is there anything else I can do to make the query run faster?

You can create a relationship (foreign key) between the two tables.
https://database.guide/how-to-create-a-relationship-in-mysql-workbench/
Besides this, you can also use a LEFT JOIN so you won't load the whole right table in.
ORDER BY can be slow in MySQL; if you are processing the results in application code afterwards, you can just sort there, which is much faster than ORDER BY.
I hope I helped, and community, feel free to edit :)
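As a rough sketch of the indexing side (table and column names here are assumptions, since the original query is not shown in the question), the usual first step is to index the column on the foreign-key side of the join; the primary key of p is already indexed:

```sql
-- Hypothetical names: an "orders" table joins to the plates table "p"
-- on orders.plate_id = p.id. p.id is the PRIMARY KEY and thus already
-- indexed; it is the joining column on the other side that needs one:
CREATE INDEX idx_orders_plate_id ON orders (plate_id);

-- Re-run EXPLAIN afterwards to check whether the full scan on p is gone:
EXPLAIN SELECT ...;
```

Whether this helps depends on which side of the join MySQL drives from, which is exactly what the EXPLAIN output shows.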

You did include the explain plan, but you did not give any information about your table structure, data distribution, cardinality, or volumes. Assuming your indices are accurate and you have an even data distribution, the query has to process over 12 million rows, not 100,000. But even then, that is relatively poor performance. And you never told us what hardware this sits on, nor the background load.
A query with so many joins is always going to be slow - are they all needed?
the main problem is on the p table which is scanning the entire table
Full table scans are not automatically bad. Dereferencing an index lookup costs roughly 20 times as much per row as a streaming read. Since the only constraints you apply to this table are its joins to other tables, there is nothing in the question you asked to suggest there is much scope for improving this.

Related

Assistance in Improving a Query's Performance

Overview:
I have a system that builds query statements, some of which must join certain tables to others based on parameters passed into the system. When running performance tests on the generated queries, I noticed that some were doing FULL TABLE SCANS, which in many cases, from what I've read, is not good for large tables.
What I'm trying to do:
1 - Remove the full table scans
2 - Speed up the Query
3 - Find out if there is a more efficient query I can have the system build instead
The Query:
SELECT a.p_id_one, b.p_id_two, b.fk_id_one, c.fk_id_two, d.fk_id_two,
d.id_three, d.fk_id_one
FROM ATable a
LEFT JOIN BTable b ON a.p_id_one = b.fk_id_one
LEFT JOIN CTable c ON b.p_id_two = c.fk_id_two
LEFT JOIN DTable d ON b.p_id_two = d.fk_id_two
WHERE a.p_id_one = 1234567890
The Explain
Query Time
Showing rows 0 - 10 (11 total, Query took 0.0016 seconds.)
Current issues:
1 - Query time for my system/DBMS (phpMyAdmin) takes between 0.0013 seconds and 0.0017 seconds.
What have I done to fix?
The full table scans or 'ALL' access-type queries are being run on tables ('BTable', 'DTable'), so I've tried to use FORCE INDEX on the appropriate ids.
Using FORCE INDEX removes the full table scans, but it doesn't speed up the performance.
I double checked my fk_constraints and index relationships to ensure I'm not missing anything. So far everything checks out.
2 - Advisor shows multiple warnings a few relate back to the full table scans and the indexes.
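For reference, the FORCE INDEX hint the asker describes goes after the table alias and before the ON clause; the index name below is a placeholder, not one taken from the question:

```sql
SELECT a.p_id_one, b.p_id_two
FROM ATable a
LEFT JOIN BTable b FORCE INDEX (idx_fk_id_one)  -- hypothetical index name
  ON a.p_id_one = b.fk_id_one
WHERE a.p_id_one = 1234567890;
```

As the answer below notes, forcing an index is usually the wrong long-term fix even when it changes the plan.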
Question(s):
Assume all indexes are available and created
1 - Is there a better way to perform this query?
2 - How many joins are too many joins?
3 - Could the joins be the problem?
4 - Does the issue rest within the WHERE clause?
5 - What optimize technique/tool could I have missed?
6 - How can I get this query to perform at a speed between 0.0008 and 0.0001?
If images and visuals are needed to help clarify my situation please do ask in a comment below. I appreciate any and all assistance.
Thank you =)
"p_id_one" does not tell us much. Is this an auto_increment? Real column names sometimes gives important clues of cardinality and intent. As Willem said, "there must be more to this issue" and "what is the overall problem".
LEFT -- do you need it? It prevents certain forms of optimizations; remove it if the 'right' table row is not optional.
WHERE a.p_id_one = 1234567890 needs INDEX(p_id_one). Is that the PRIMARY KEY already? In that case, an extra INDEX is not needed. (Please provide SHOW CREATE TABLE.)
Are those really the columns/expressions you are SELECTing? It can make a difference -- especially when suggesting a "covering index" as an optimization.
Please provide the output from EXPLAIN SELECT ... (That is not what you provided in the discussion.) That output would help with clues of 1:many, cardinality, etc.
If these are FOREIGN KEYs, you already have indexes on b.fk_id_one, c.fk_id_two, and d.fk_id_two; so there is nothing more to do there.
1.6ms is an excellent time for a query involving 4 tables. Don't plan on speeding it up significantly. You probably handle hundreds of connections doing thousands of similar queries per second. Do you need more than that?
Are you using InnoDB? That is better at concurrent access.
Your example does not seem to have any full table scans; please provide an example that does.
ALL on a 10-row table is nothing to worry about; on a million-row table it is a big deal. Will your tables grow significantly?
You should note this when worrying about ALL: a full table scan is sometimes faster than using the 'perfect' index. The optimizer decides on the scan when the estimated number of rows is more than about 20% of the table. A table scan is efficient because it reads straight through the table, even if it skips 80% of the rows. Using an index is more complex: the index is scanned, but for each row found in the index, a lookup is needed into the data to find the row. If you see ALL when you don't think you should, the index is probably not very selective. Don't worry.
Don't use FORCE INDEX -- although it may help the query with today's values, it may hurt tomorrow's query.
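To supply the information requested above, the usual diagnostics look like this (the SELECT is the one from the question; the output will of course vary with your schema):

```sql
-- Show the full table definition, including indexes and the PRIMARY KEY:
SHOW CREATE TABLE ATable;

-- Show the optimizer's plan for the actual query:
EXPLAIN SELECT a.p_id_one, b.p_id_two, b.fk_id_one, c.fk_id_two, d.fk_id_two,
               d.id_three, d.fk_id_one
FROM ATable a
LEFT JOIN BTable b ON a.p_id_one = b.fk_id_one
LEFT JOIN CTable c ON b.p_id_two = c.fk_id_two
LEFT JOIN DTable d ON b.p_id_two = d.fk_id_two
WHERE a.p_id_one = 1234567890;
```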

mysql determining which part of a query is the slowest

I've written a SELECT statement in MySQL. The duration is 50 seconds and the fetch is 206 seconds, which is a long time. I'd like to understand WHICH part of my query is inefficient so I can improve its run time, but I'm not sure how to do that in MySQL.
My table has a little over 1,000,000 records. I have an index built in as well:
KEY `idKey` (`id`,`name`),
Here is my query:
SELECT name, id, alt_id, count(id), min(cost), avg(resale), code from
history where name like "%brian%" group by id;
I've looked at the mySQL Execution Plan, but I can't garner from that what is wrong:
If I highlight over the "Full Index Scan" part of the image, I see this:
Access Type: Index
Full Index Scan
Key/Index:
Used Key Parts: id, name
Possible Keys: idKey, id-Key, nameKey
Attached Condition:
(`allhistory`.`history`.`name` LIKE '%brian%')
Rows Examined Per Scan: 1098181
Rows Produced Per Join: 1098181
Filter: 100%
I know I can scan a smaller subset of data by adding LIMIT 100 to the query, and while that makes the time much shorter (28 second duration, 0.000 sec fetch), I also want to see all the records, so I don't really want to put a limit on it.
Can someone more knowledgeable on this topic suggest where my query, my index, or my methodology might be inefficient for what I'm trying to accomplish?
This question really only has a solution in MySQL's full-text search functionality.
I don't consider the use of LIKE a workable solution; table scans are not a solution with millions of rows.
I wrote up an answer in this link; I hope you find a workable solution for yours with that reference and quick walkthrough.
Here is one of the MySQL manual pages on full-text search.
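A minimal full-text sketch along those lines (assuming MySQL 5.6+ with InnoDB, which supports FULLTEXT indexes; table and column names are taken from the question). Note that full-text search matches whole words, so it is not an exact substitute for the substring semantics of LIKE '%brian%':

```sql
-- One-time: add a full-text index on the searched column.
ALTER TABLE history ADD FULLTEXT INDEX ft_name (name);

-- Then search with MATCH ... AGAINST instead of LIKE:
SELECT id, name, count(id), min(cost), avg(resale)
FROM history
WHERE MATCH(name) AGAINST('brian')
GROUP BY id;
```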
I'm thinking your covering index may be backwards. Try switching the order to (name, id). That way the WHERE clause can take advantage of the index.
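Swapping the key order as suggested would look like this; bear in mind that with a leading wildcard in LIKE, MySQL can at best scan the whole index rather than seek into it, so any gain comes from the index being narrower than the table:

```sql
-- Replace the (id, name) key with (name, id):
ALTER TABLE history DROP INDEX `idKey`,
                    ADD KEY `idKey` (`name`, `id`);
```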

How to do performance tuning for huge MySQL table?

I have a MySQL table MtgoxTrade(id,time,price,amount,type,tid) with more than 500M records, and I need to query the three fields (time,price,amount) from all records:
SELECT time, price, amount FROM MtgoxTrade;
It takes 110 seconds on Win7, which is too slow. My questions are:
Will a compound index help here? Note that my SQL query has no WHERE clause.
Is there any other optimization that could improve the query performance here?
Updated: I'm sorry, the MtgoxTrade table has six fields in total: (id,time,price,amount,type,tid). My SQL only needs to query three fields (time,price,amount). I already tried adding a composite index on (time,price,amount), but it does not seem to help.
If this is your real query, then no, nothing could possibly help. Come to think of it, you are asking to deliver the contents of a whole 500M+ row table! It will be slow no matter what you do; the whole table must be processed.
If you can constrain your program logic to only process some smaller subset of your table, then it is possible to make it faster.
For example, you can process only results for last month using WHERE clause:
SELECT time, price, amount
FROM MtgoxTrade
WHERE time BETWEEN '2013-09-01' AND '2013-09-21'
This can work really fast, but you would still need to add an index on the time field, like this:
CREATE INDEX mtgoxtrade_time_idx ON mtgoxtrade (time);

MySQL Query Caching (2)

This is not a problem as such; it belongs to site optimization. I have 110K records of hotels. When I run a SELECT query, it pulls data from all 110K records.
If I search for hotels with more than a 3-star rating, a price between $100 and $300, and within Mexico City, suppose I get 45 matching results.
Is there any way that, when I add more refinements, the data is pulled from just the 45 matching records rather than going through all 110K?
The key is indexes, my friend... make sure you have indexes on all columns used in the WHERE clause, and this will reduce the rows examined when selecting...
On a side note... 110k rows is still an extremely small data set for MySQL, so it shouldn't pose much of a performance issue even if you haven't got correct indexing on the table.
It depends more on how often your data updates.
See:
The MySQL Query Cache
Query Caching in MySQL
Caching question MySQL or Filesystem
I am asking whether there is any way that, when I add more refinement, the data will be pulled from just the 45 matching records and not go through the 110K records.
Then create a view of those 45 rows and run your queries against it.
Create the view with a query (table and column names here are assumptions):
CREATE VIEW refined AS
  SELECT * FROM hotels
  WHERE rating > 3 AND price BETWEEN 100 AND 300 AND city = 'Mexico City';
And after that run further SELECT queries against that view, like:
SELECT * FROM refined WHERE ...;
First of all, I tend to agree with Brian: indexes matter.
Check what kind(s) of queries are most frequent, and construct multi-column indexes on the table accordingly. Note that the order of columns in the indexes does matter (as the index is a tree, the first column appears at the tree root, so if your query does not use that column, the whole tree is useless).
Enable the slow query log to see which queries actually take long (if any) or don't use indexes, so you can improve your indexes over time.
Having said this, the query cache is a real performance boost if your table data is mostly read. Here is a useful article on the MySQL query cache.
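Enabling the slow query log mentioned above can be sketched like this (these are real MySQL system variables; the one-second threshold is just an example value):

```sql
-- Log queries slower than 1 second, plus queries that use no index:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Check where the log file is written:
SHOW VARIABLES LIKE 'slow_query_log_file';
```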

how to query data in a fast way after table splitting?

I have a MySQL table with about 1,000 million records. It is very slow when I run a query.
So I split this table by ID into 10 sub-tables with the same structure.
table_1 (1-1000000)
table_2 (1000001-2000000)
table_3 (2000001-3000000)
......
But how can I query data in a fast way after this table splitting?
When I query a user like this: select name from table where name='mark', I don't know which table to query, because I can't derive the ID range from the name.
Splitting tables this way is not the right approach at all, given your example query. You have actually created more issues than you have solved.
Let's get back to the big table:
Step 1 is to see why it is slow, so post the output of the EXPLAIN command to get an overview.
Step 2 is to see whether you can improve that query. Statements like "indexes are not a good solution" can be true, but if so, please provide measurements showing this.
Step 3 is to think outside the box. You are running queries on a very big table that constantly receives inserts. Consider an index designed specifically for search; for example, consider indexing with Solr for the search commands.
Eventually you might even reach the hardware limit: it just can't get faster on this hardware. But first work through the steps, and add the right information, concrete measurements, and specifications, so you can get even more complete support on your case.
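Steps 1 and 2 above can be made concrete like this (the table name is a generic placeholder; on a table this size, building the index itself will take a long time and significant disk space):

```sql
-- Step 1: inspect the plan for the slow query:
EXPLAIN SELECT name FROM big_table WHERE name = 'mark';

-- Step 2: if the plan shows a full scan (type: ALL), an index on
-- the filtered column lets MySQL seek instead of scanning:
CREATE INDEX idx_name ON big_table (name);
```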