I switched an app from Laravel 5.1 to Laravel 5.8 by setting up a fresh 5.8 project and copying over the files, making some adjustments here and there.
The issue is that the queries with whereHas have become extremely slow.
Here is an example:
Article::whereHas('categories', function ($category) {
        $category->where('link', 'foto');
    })
    ->active()
    ->recent()
    ->take(3)
    ->get();
On Laravel 5.1, this code generates the following query, which completes in 0.05-0.07 seconds:
SELECT *
FROM `articles`
WHERE `articles`.`deleted_at` IS NULL
  AND
    (SELECT count(*)
     FROM `categories`
     INNER JOIN `article_category`
       ON `categories`.`id` = `article_category`.`category_id`
     WHERE `article_category`.`article_id` = `articles`.`id`
       AND `link` = 'foto'
       AND `categories`.`deleted_at` IS NULL) >= 1
ORDER BY IFNULL(published_at, created_at) DESC
LIMIT 3
And here is the EXPLAIN output:
+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+
| 1 | PRIMARY | articles | ALL | NULL | NULL | NULL | NULL | 4846 | 100.00 | Using where; Using filesort |
| 2 | DEPENDENT SUBQUERY | categories | ref | PRIMARY,categories_link_index | categories_link_index | 767 | const | 1 | 100.00 | Using index condition; Using where |
| 2 | DEPENDENT SUBQUERY | article_category | ref | article_category_category_id_foreign,article_category_article_id_foreign | article_category_article_id_foreign | 4 | lcf.articles.id | 1 | 100.00 | Using where |
+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+
On Laravel 5.8, the same code generates the following query, which takes 10-13 seconds:
SELECT *
FROM `articles`
WHERE EXISTS
    (SELECT *
     FROM `categories`
     INNER JOIN `article_category`
       ON `categories`.`id` = `article_category`.`category_id`
     WHERE `articles`.`id` = `article_category`.`article_id`
       AND `link` = 'foto'
       AND `categories`.`deleted_at` IS NULL)
  AND `articles`.`deleted_at` IS NULL
ORDER BY IFNULL(published_at, created_at) DESC
LIMIT 3
And here is the EXPLAIN output:
+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 107 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | articles | ALL | PRIMARY | NULL | NULL | NULL | 4846 | 75.01 | Using where |
| 2 | MATERIALIZED | categories | ref | PRIMARY,categories_link_index | categories_link_index | 767 | const | 1 | 100.00 | Using index condition; Using where |
| 2 | MATERIALIZED | article_category | ref | article_category_category_id_foreign,article_category_article_id_foreign | article_category_category_id_foreign | 4 | lcf.categories.id | 107 | 100.00 | |
+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+
I ran both codebases on the same server, same MariaDB 10.2.24 database. The dataset size is approximately 6k articles, 80 categories and 10k records in the pivot.
What should I do here? So far I have discovered a bit more than 10 queries suffering from this problem in the codebase. Can I somehow flip a switch in config and make them all check the existence using the old way? Or should I somehow instruct every query to improve their plan?
UPDATE
I just noticed that with whereHas(..., '>', 0) I get almost the old query (actually WHERE (SELECT COUNT...) > 0) with the old performance. However, whereHas(..., '>=', 1) is still reduced to the query with EXISTS. The question remains whether I can switch this behaviour for the whole app without editing each query.
ANSWERS TO COMMENTS
Indexes on articles
+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| articles | 0 | PRIMARY | 1 | id | A | 4846 | NULL | NULL | | BTREE | | |
| articles | 1 | articles_author_id_foreign | 1 | author_id | A | 18 | NULL | NULL | YES | BTREE | | |
+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Indexes on article_category
+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| article_category | 0 | PRIMARY | 1 | id | A | 9676 | NULL | NULL | | BTREE | | |
| article_category | 1 | article_category_category_id_foreign | 1 | category_id | A | 90 | NULL | NULL | | BTREE | | |
| article_category | 1 | article_category_article_id_foreign | 1 | article_id | A | 9676 | NULL | NULL | | BTREE | | |
+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The data to run the examples can be found here: https://gist.github.com/tontonsb/b97bc33066a67e9d8bc3654f2c01c103
With this reduced dataset the slow query runs faster, but it is still 2.8 vs 0.07 seconds, so the problem can be clearly seen, at least on MariaDB 10.2.24. The speed probably improved because I removed the other columns and their indices.
Try this:
mpyw/eloquent-has-by-non-dependent-subquery: Convert has() and whereHas() constraints to non-dependent subqueries.
$articles = Article::query()
    ->hasByNonDependentSubquery('categories', function ($category) {
        $category->where('link', 'foto');
    })
    ->active()
    ->recent()
    ->take(3)
    ->get();
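To see why swapping the existence check is safe, here is a minimal sketch that runs the three query shapes side by side: the COUNT form from Laravel 5.1, the EXISTS form from Laravel 5.8, and an IN form with a non-dependent subquery. It uses Python's sqlite3 with an in-memory database as a stand-in for MariaDB, and the sample rows are made up for illustration. The IN form is only an assumption about the general shape such a rewrite aims for, since the exact SQL emitted by the package is not shown in this thread. The sketch only checks that all three select the same rows; it says nothing about the plans MariaDB would pick.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    published_at TEXT,
    created_at TEXT NOT NULL,
    deleted_at TEXT
);
CREATE TABLE categories (id INTEGER PRIMARY KEY, link TEXT NOT NULL, deleted_at TEXT);
CREATE TABLE article_category (
    id INTEGER PRIMARY KEY,
    article_id INTEGER NOT NULL,
    category_id INTEGER NOT NULL
);
-- Made-up sample data: article 4 is soft-deleted, article 3 is only in 'video'.
INSERT INTO categories (id, link) VALUES (1, 'foto'), (2, 'video');
INSERT INTO articles (id, published_at, created_at, deleted_at) VALUES
    (1, '2019-01-05', '2019-01-01', NULL),
    (2, NULL,         '2019-01-03', NULL),
    (3, '2019-01-04', '2019-01-02', NULL),
    (4, '2019-01-06', '2019-01-06', '2019-01-07'),
    (5, '2019-01-02', '2019-01-01', NULL);
INSERT INTO article_category (article_id, category_id) VALUES
    (1, 1), (2, 1), (3, 2), (4, 1), (5, 1), (5, 2);
""")

# The part of the subquery that all three variants share.
RELATED = """
    FROM categories
    INNER JOIN article_category
        ON categories.id = article_category.category_id
    WHERE link = 'foto'
      AND categories.deleted_at IS NULL
"""

conditions = {
    # Laravel 5.1 style: dependent COUNT subquery.
    "count": "(SELECT count(*) " + RELATED
             + " AND article_category.article_id = articles.id) >= 1",
    # Laravel 5.8 style: dependent EXISTS subquery.
    "exists": "EXISTS (SELECT * " + RELATED
              + " AND article_category.article_id = articles.id)",
    # Non-dependent subquery: no reference to the outer articles row.
    "in": "articles.id IN (SELECT article_category.article_id " + RELATED + ")",
}


def run(condition):
    """Return the ids of the three most recent matching articles."""
    sql = (
        "SELECT id FROM articles WHERE articles.deleted_at IS NULL AND "
        + condition
        + " ORDER BY IFNULL(published_at, created_at) DESC LIMIT 3"
    )
    return [row[0] for row in conn.execute(sql)]


results = {name: run(cond) for name, cond in conditions.items()}
print(results)
```

Because the three forms are logically equivalent, the choice between them is purely a question of which plan the optimizer produces for each, so it is worth running EXPLAIN on the real server for whichever variant you adopt.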
Related
I'm stuck with a query on an InnoDB table in a MySQL database.
I need to find orders based on a fulltext search on two text fields which contain order and customer details as JSON-encoded text.
Here is the table schema:
+--------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | int(11) | NO | MUL | NULL | |
| comment | text | NO | | NULL | |
| modified | datetime | NO | | NULL | |
| created | datetime | NO | MUL | NULL | |
| items | mediumtext | NO | MUL | NULL | |
| addressinfo | text | NO | | NULL | |
+--------------+------------+------+-----+---------+----------------+
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| orders | 0 | PRIMARY | 1 | id | A | 69144 | NULL | NULL | | BTREE | | |
| orders | 1 | user_id | 1 | user_id | A | 45060 | NULL | NULL | | BTREE | | |
| orders | 1 | created | 1 | created | A | 69240 | NULL | NULL | | BTREE | | |
| orders | 1 | search | 1 | items | NULL | 69240 | NULL | NULL | | FULLTEXT | | |
| orders | 1 | search | 2 | addressinfo | NULL | 69240 | NULL | NULL | | FULLTEXT | | |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The table has around 150,000 rows.
It has one fulltext index on the items and addressinfo columns.
And here comes the query:
SELECT
id
FROM
orders
WHERE
MATCH (items, addressinfo) AGAINST (
'+simon* +white* ' IN BOOLEAN MODE
)
ORDER BY
id DESC
LIMIT
20
This is the EXPLAIN result:
+----+-------------+--------+------------+----------+---------------+--------+---------+-------+------+----------+---------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+----------+---------------+--------+---------+-------+------+----------+---------------------------------------------------+
| 1 | SIMPLE | orders | NULL | fulltext | search | search | 0 | const | 1 | 100.00 | Using where; Ft_hints: no_ranking; Using filesort |
+----+-------------+--------+------------+----------+---------------+--------+---------+-------+------+----------+---------------------------------------------------+
On large resultsets the query takes around 30 seconds to process on a standard LAMP VM.
Without the ORDER BY id DESC clause, the query is much faster and finishes in around 0.6 seconds.
The only difference in the EXPLAIN result is that "Using filesort" is missing in the faster query. Measuring the query says that 98% of the processing time (27s) is used for "Creating Sort Index".
Is there any way to do the fulltext search on this table with ORDER BY in a reasonable processing time (less than a second)?
I already tried different approaches e.g. putting the order by column into the fulltext index (text_id as TEXT column) with no luck.
The approach from here: How to make a FULLTEXT search with ORDER BY fast? is also not faster.
As the application runs on a shared host I'm very limited in optimizing MySQL ini values or Memory values.
Thanks a lot!
You might gain some time by using a derived table, so that the fulltext match runs first and only the matching ids are joined back and sorted.
Try it.
Query
SELECT
orders.id
FROM (
SELECT
id
FROM
orders
WHERE
MATCH (items, addressinfo) AGAINST (
'+simon* +white* ' IN BOOLEAN MODE
)
)
AS
orders_match
INNER JOIN
orders
ON
orders_match.id = orders.id
ORDER BY
orders.id DESC
LIMIT 20
I have two tables CUSTOMER_ORDER_PUBLIC and LINEITEM_PUBLIC which have the following indices:
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| CUSTOMER_ORDER_PUBLIC | 1 | O_ORDERKEY | 1 | O_ORDERKEY | A | 2633457 | NULL | NULL | YES | BTREE | | |
| CUSTOMER_ORDER_PUBLIC | 1 | O_ORDERDATE | 1 | O_ORDERDATE | A | 2350 | NULL | NULL | YES | BTREE | | |
| CUSTOMER_ORDER_PUBLIC | 1 | PUB_C_CUSTKEY | 1 | PUB_C_CUSTKEY | A | 273000 | NULL | NULL | | BTREE | | |
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
and:
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| LINEITEM_PUBLIC | 0 | PRIMARY | 1 | PUB_L_ORDERKEY | A | 16488602 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 0 | PRIMARY | 2 | PUB_L_LINENUMBER | A | 44146904 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 1 | LINEITEM_PRIVATE_FK2 | 1 | PUB_L_PARTKEY | A | 2083757 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 1 | LINEITEM_PRIVATE_FK3 | 1 | PUB_L_SUPPKEY | A | 85599 | NULL | NULL | | BTREE | | |
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Each time I run EXPLAIN on a specific query, I get the following:
mysql> EXPLAIN SELECT *
FROM CUSTOMER_ORDER_PUBLIC
LEFT OUTER JOIN LINEITEM_PUBLIC ON O_ORDERKEY= PUB_L_ORDERKEY;
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
| 1 | SIMPLE | CUSTOMER_ORDER_PUBLIC | NULL | ALL | NULL | NULL | NULL | NULL | 2900769 | 100.00 | NULL |
| 1 | SIMPLE | LINEITEM_PUBLIC | NULL | ref | PRIMARY | PRIMARY | 4 | TPCH.CUSTOMER_ORDER_PUBLIC.O_ORDERKEY | 2 | 100.00 | NULL |
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
For some reason the query optimizer is not using the index (O_ORDERKEY) even if I use a FORCE INDEX. I know a lot of people posted similar questions but I tried everything and nothing seems to help!
Any other suggestions would be greatly appreciated!
Edit:
The query used is the following:
SELECT * FROM CUSTOMER_ORDER_PUBLIC
LEFT OUTER JOIN LINEITEM_PUBLIC ON O_ORDERKEY= PUB_L_ORDERKEY;
For this query:
SELECT *
FROM CUSTOMER_ORDER_PUBLIC cop LEFT OUTER JOIN
LINEITEM_PUBLIC lp
ON cop.O_ORDERKEY = lp.PUB_L_ORDERKEY;
you want an index on LINEITEM_PUBLIC(PUB_L_ORDERKEY). Of course, you already have this index, because PUB_L_ORDERKEY is the first column of the primary key.
There is no reason to use an index on CUSTOMER_ORDER_PUBLIC, because all rows in the table are going to the result set.
The FORCE INDEX hint tells the optimizer that a full scan of the table is very expensive.
The most likely explanation for the observed behavior is that the optimizer thinks it needs to access every row in the table, and the index suggested in the hint is not a covering index for the query.
Based on the EXPLAIN output, we only see evidence of a single predicate on the JOIN operation. And it looks like the optimizer is choosing CUSTOMER_ORDER_PUBLIC as the driving table for the join, and using an index on the LINEITEM_PUBLIC table.
I'm not sure any of that answers the question you asked. (I'm not sure that there was a question asked.) Absent an actual SQL statement, we are just making guesses.
I have a question: Aside from the FORCE INDEX hint, why would we expect the optimizer to use a particular index? And why would that be a reasonable expectation?
I have a query that joins two tables and orders the data on the primary key. This is resulting in the very popular problem of MySQL "Using index; Using temporary; Using filesort."
The issue is causing a severe latency problem in my production tables with about 400k records.
Here's more info:
I have two tables: Doctor and Area. The Doctor table has a foreign key pointing to Area.
Doctor:
+-----------------------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| area_id | int(11) | NO | MUL | NULL | |
+-----------------------------+---------------+------+-----+---------+----------------+
Doctor indexes:
+---------------+------------+------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| doctor | 0 | PRIMARY | 1 | id | A | 5546 | NULL | NULL | | BTREE | | |
| doctor | 1 | doctor_dfd0e917 | 1 | area_id | A | 29 | NULL | NULL | | BTREE | | |
+---------------+------------+------------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Area:
+------------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
+------------------------+-------------+------+-----+---------+----------------+
And the Area indexes:
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| area | 0 | PRIMARY | 1 | id | A | 24 | NULL | NULL | | BTREE | | |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
I'm trying to run the following query:
SELECT `doctor`.`id`,
`area`.`id`
FROM
`doctor`
INNER JOIN
`area` ON (`doctor`.`area_id` = `area`.`id`)
ORDER BY
`doctor`.`id` DESC LIMIT 100;
The EXPLAIN returns the following (with the problematic Using index; Using temporary; Using filesort):
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
| 1 | SIMPLE | area | index | PRIMARY | PRIMARY | 4 | NULL | 24 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | doctor | ref | doctor_dfd0e917 | doctor_dfd0e917 | 4 | area.id | 191 | Using index |
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
If I remove the ORDER BY clause, I get the desired effect:
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
| 1 | SIMPLE | area | index | PRIMARY | PRIMARY | 4 | NULL | 24 | Using index |
| 1 | SIMPLE | doctor | ref | doctor_dfd0e917 | doctor_dfd0e917 | 4 | area.id | 191 | Using index |
+----+-------------+---------------+-------+------------------------+------------------------+---------+--------------+------+----------------------------------------------+
Why is the ORDER BY clause causing problems here even though I'm using the primary key?
Thank you in advance.
It seems that you only have one area per doctor. See how this query works:
SELECT d.id,
       (SELECT a.id FROM area a WHERE a.id = d.area_id) as area_id
FROM doctor d
ORDER BY d.id DESC
LIMIT 100;
If you are using the inner join to filter out doctors whose area does not exist, then add an EXISTS check:
SELECT d.id,
       (SELECT a.id FROM area a WHERE a.id = d.area_id) as area_id
FROM doctor d
WHERE EXISTS (SELECT 1 FROM area a WHERE a.id = d.area_id)
ORDER BY d.id DESC
LIMIT 100;
There is a good chance that both of these will scan the doctor table in primary key order, picking up the information from area as needed.
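As a quick sanity check of that rewrite, the sketch below builds a small doctor/area pair and confirms that the scalar-subquery form returns the same rows as the original inner join. It uses Python's sqlite3 with an in-memory database in place of MySQL, and the table contents and the index name doctor_area_idx are made up for illustration. It verifies only that the results match, not which plan MySQL would choose.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE area (id INTEGER PRIMARY KEY);
CREATE TABLE doctor (
    id INTEGER PRIMARY KEY,
    area_id INTEGER NOT NULL REFERENCES area(id)
);
-- Hypothetical index name, mirroring the foreign key index in the question.
CREATE INDEX doctor_area_idx ON doctor(area_id);
INSERT INTO area (id) VALUES (1), (2), (3);
""")

# 500 made-up doctors spread over the three areas.
conn.executemany(
    "INSERT INTO doctor (id, area_id) VALUES (?, ?)",
    [(i, (i % 3) + 1) for i in range(1, 501)],
)

# Original query from the question.
join_rows = conn.execute("""
    SELECT doctor.id, area.id
    FROM doctor
    INNER JOIN area ON doctor.area_id = area.id
    ORDER BY doctor.id DESC
    LIMIT 100
""").fetchall()

# Rewrite: drive from doctor and look up the area per row.
subquery_rows = conn.execute("""
    SELECT d.id,
           (SELECT a.id FROM area a WHERE a.id = d.area_id) AS area_id
    FROM doctor d
    WHERE EXISTS (SELECT 1 FROM area a WHERE a.id = d.area_id)
    ORDER BY d.id DESC
    LIMIT 100
""").fetchall()

print(join_rows[:3])
print(join_rows == subquery_rows)
```

The rewrite leaves the optimizer only one table in the outer FROM clause, so it cannot start from area and then sort afterwards; that is the reason it may avoid the temporary table and filesort.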
Sorry, but after browsing nearly every post and question about it, I still can't get rid of "Using temporary" and "Using filesort" in a simple query. I know this is an index problem, but I can't find the right combination...
I also don't know whether the join order chosen by the optimizer is OK. I tested other orders using STRAIGHT_JOIN, but nothing was better... The query is pretty slow with ORDER BY, but really fast without it, and of course then there is no "Using temporary" or "Using filesort"! (There are around 100,000 rows in the points table.)
The query :
SELECT points.id,
points.id_owner,
points.point_title,
points.point_desc,
users.user_id,
users.username
FROM points
JOIN users ON points.id_owner = users.user_id
JOIN follows ON follows.id_followed = points.id_owner
WHERE points.deleted = 0
AND follows.id_follower = 22
ORDER BY points.id DESC
LIMIT 10
The EXPLAIN output:
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
| 1 | SIMPLE | follows | ref | FOLLOW_DUO | FOLLOW_DUO | 4 | const | 2 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | users | eq_ref | PRIMARY | PRIMARY | 4 | follows.id_followed | 1 | |
| 1 | SIMPLE | points | ref | GETPOINT1 | GETPOINT1 | 5 | users.user_id,const | 460 | Using where |
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
And here is the SHOW INDEX from the three tables :
SHOW INDEX FROM points
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| points | 0 | PRIMARY | 1 | id | A | 91987 | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 1 | id_owner | A | NULL | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 2 | deleted | A | NULL | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 3 | id | A | 91987 | NULL | NULL | | BTREE | |
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
SHOW INDEX FROM users
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| users | 0 | PRIMARY | 1 | user_id | A | 4 | NULL | NULL | | BTREE | |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
SHOW INDEX FROM follows
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| follows | 0 | PRIMARY | 1 | id | A | 5 | NULL | NULL | | BTREE | |
| follows | 0 | FOLLOW_DUO | 1 | id_follower | A | NULL | NULL | NULL | | BTREE | |
| follows | 0 | FOLLOW_DUO | 2 | id_followed | A | 5 | NULL | NULL | | BTREE | |
| follows | 1 | id_follower | 1 | id_follower | A | NULL | NULL | NULL | | BTREE | |
| follows | 1 | id_followed | 1 | id_followed | A | NULL | NULL | NULL | | BTREE | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
At this point I don't know what else to try to avoid "Using temporary" and "Using filesort"... So if you have an idea for me... Thank you in advance for your help!
It looks like too many rows are being examined in the points table. I used the following trick to avoid a temporary table in my own project. Please try these steps and run EXPLAIN again to see whether anything improves:
Drop the GETPOINT1 index from the points table, keeping only the primary key index.
Add an index on the columns (deleted, id_owner). Keep the columns in that order.
If you still don't see any improvement, remove that index, add indexes on (id, deleted, id_owner) and (deleted, id_owner, id) instead, and try again.
In addition, you may move follows.id_follower = 22 out of the WHERE clause and into the join condition, like JOIN follows ON follows.id_followed = points.id_owner AND follows.id_follower = 22.
Please also make sure the follows table has an index on (id_follower, id_followed), in that order; the FOLLOW_DUO index shown above already has this shape.
I can't guarantee it, but the steps above should give you some improvement.
I am trying to improve performance for an application. I might need to create summary tables that run on cron so the app doesn't take as long to load (5-10 seconds). Is that the best idea?
Given the following table:
mysql> describe school_data_sets_numeric_data;
+--------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| data_set_nid | int(11) | NO | MUL | NULL | |
| school_nid | int(11) | NO | MUL | NULL | |
| year | int(11) | NO | MUL | NULL | |
| description | varchar(255) | NO | | NULL | |
| value | decimal(18,5) | NO | | NULL | |
+--------------+---------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
And the following queries (run once for each data_set_nid for a school)
This query runs fast (0 seconds):
SELECT year, description, CONCAT(FORMAT((value/(SELECT SUM(value)
FROM `school_data_sets_numeric_data` as numeric_data_inner
WHERE year = numeric_data_outer.year and data_set_nid = numeric_data_outer.data_set_nid and school_nid = numeric_data_outer.school_nid)) * 100, 2), '%') as value
FROM `school_data_sets_numeric_data` as numeric_data_outer
WHERE data_set_nid = 38251 and school_nid = 32805 ORDER BY id DESC;
Explain:
+----+--------------------+--------------------+------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------------------------------------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------------------------------------------+------+-----------------------------+
| 1 | PRIMARY | numeric_data_outer | ref | data_set_nid,data_set_nid_2,school_nid | data_set_nid | 8 | const,const | 17 | Using where; Using filesort |
| 2 | DEPENDENT SUBQUERY | numeric_data_inner | ref | year,data_set_nid,data_set_nid_2,school_nid | data_set_nid | 8 | rocdocs_main_drupal_7.numeric_data_outer.data_set_nid,rocdocs_main_drupal_7.numeric_data_outer.school_nid | 9 | Using where |
+----+--------------------+--------------------+------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------------------------------------------+------+-----------------------------+
This query runs slow (1.43 seconds):
SELECT year, description, CONCAT(FORMAT((SUM(value)/(SELECT SUM(value)
FROM `school_data_sets_numeric_data` as numeric_data_inner
WHERE year = numeric_data_outer.year and data_set_nid = numeric_data_outer.data_set_nid)) * 100, 2), '%') as value
FROM `school_data_sets_numeric_data` as numeric_data_outer
WHERE data_set_nid = 38251 GROUP BY year,description ORDER BY id DESC;
Explain:
+----+--------------------+--------------------+------+----------------------------------+----------------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+------+----------------------------------+----------------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | numeric_data_outer | ref | data_set_nid,data_set_nid_2 | data_set_nid_2 | 4 | const | 90640 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | numeric_data_inner | ref | year,data_set_nid,data_set_nid_2 | year | 4 | func | 38871 | Using where |
+----+--------------------+--------------------+------+----------------------------------+----------------+---------+-------+-------+----------------------------------------------+
Correlated subqueries/subselects are often a bottleneck, partly because MySQL (before 8.0.18) only has a nested-loop join algorithm and no hash or merge joins.
I would try joining your main select to a derived table holding all the SUM values you need.
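A minimal sketch of that idea follows, using Python's sqlite3 with an in-memory database as a stand-in for MySQL. The sample rows are made up, the CONCAT/FORMAT presentation from the question is left out so the raw percentages can be compared, and the ordering is by year and description to keep the output deterministic. The derived table t computes each year's total once, and the main query joins to it instead of re-running a SUM per output row.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE school_data_sets_numeric_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    data_set_nid INTEGER NOT NULL,
    school_nid INTEGER NOT NULL,
    year INTEGER NOT NULL,
    description TEXT NOT NULL,
    value REAL NOT NULL
);
-- Made-up rows: two schools, two years, plus one row from another data set.
INSERT INTO school_data_sets_numeric_data
    (data_set_nid, school_nid, year, description, value) VALUES
    (38251, 32805, 2012, 'A', 10),
    (38251, 32805, 2012, 'B', 30),
    (38251, 32806, 2012, 'A', 20),
    (38251, 32806, 2012, 'B', 40),
    (38251, 32805, 2013, 'A', 25),
    (38251, 32805, 2013, 'B', 75),
    (99999, 32805, 2012, 'A', 1000);
""")

# Shape of the slow query: the SUM subquery is correlated to the outer row.
correlated = conn.execute("""
    SELECT o.year, o.description,
           SUM(o.value) * 100.0 /
           (SELECT SUM(i.value)
            FROM school_data_sets_numeric_data AS i
            WHERE i.year = o.year
              AND i.data_set_nid = o.data_set_nid) AS pct
    FROM school_data_sets_numeric_data AS o
    WHERE o.data_set_nid = ?
    GROUP BY o.year, o.description
    ORDER BY o.year, o.description
""", (38251,)).fetchall()

# Rewrite: per-year totals are computed once in a derived table and joined in.
derived = conn.execute("""
    SELECT o.year, o.description,
           SUM(o.value) * 100.0 / t.total AS pct
    FROM school_data_sets_numeric_data AS o
    JOIN (SELECT year, SUM(value) AS total
          FROM school_data_sets_numeric_data
          WHERE data_set_nid = ?
          GROUP BY year) AS t
      ON t.year = o.year
    WHERE o.data_set_nid = ?
    GROUP BY o.year, o.description, t.total
    ORDER BY o.year, o.description
""", (38251, 38251)).fetchall()

print(derived)
print(correlated == derived)
```

Whether this is fast enough to avoid the cron-built summary tables depends on the real data volume and indexes, so compare the EXPLAIN output of both forms on the actual server before deciding.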