I have a working, nice, indexed SQL query aggregating notes (sum of ints) for all my users and others stuffs. This is "query A".
I want to use this aggregated notes in others queries, say "query B".
If I create a View based on "query A", will the indexes of the original query will be used when needed if I join it in "query B" ?
Is that true for MySQL ? For others flavors of SQL ?
Thanks.
In MySQL, you cannot create an index on a view. MySQL uses indexes of the underlying tables when you query data against the views that use the merge algorithm. For the views that use the temptable algorithm, indexes are not utilized when you query data against the views.
https://www.percona.com/blog/2007/08/12/mysql-view-as-performance-troublemaker/
Here's a demo table. It has a userid attribute column and a note column.
mysql> create table t (id serial primary key, userid int not null, note int, key(userid,note));
If you do an aggregation to get the sum of note per userid, it does an index-scan on (userid, note).
mysql> explain select userid, sum(note) from t group by userid;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| 1 | SIMPLE | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
1 row in set (0.00 sec)
If we create a view for the same query, then we can see that querying the view uses the same index on the underlying table. Views in MySQL are pretty much like macros — they just query the underlying table.
mysql> create view v as select userid, sum(note) from t group by userid;
Query OK, 0 rows affected (0.03 sec)
mysql> explain select * from v;
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
2 rows in set (0.00 sec)
So far so good.
Now let's create a table to join with the view, and join to it.
mysql> create table u (userid int primary key, name text);
Query OK, 0 rows affected (0.09 sec)
mysql> explain select * from v join u using (userid);
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | test.u.userid | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
3 rows in set (0.01 sec)
I tried to use hints like straight_join to force it to read v then join to u.
mysql> explain select * from v straight_join u on (v.userid=u.userid);
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
"Using join buffer (Block Nested Loop)" is MySQL's terminology for "no index used for the join." It's just looping over the table the hard way -- by reading batches of rows from start to finish of the table.
I tried to use force index to tell MySQL that type=ALL is to be avoided.
mysql> explain select * from v straight_join u force index(PRIMARY) on (v.userid=u.userid);
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | v.userid | 1 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
Maybe this is using an index for the join? But it's weird that table u is before table t in the EXPLAIN. I'm frankly not sure how to understand what it's doing, given the order of rows in this EXPLAIN report. I would expect the joined table should come after the primary table of the query.
I only put a few rows of data into each table. One might get some different EXPLAIN results with a larger representative sample of test data. I'll leave that to you to try.
I have a MySQL query which has a JOIN of 12 tables. When I explain the query, It showing 394699 rows for one table and 185368 rows for another table. All other tables has 1-3 rows. The total result which I am getting from the query id 472 rows only. But for that, it is taking more than 1 minute.
Is there any way to check how many rows has been analyzed to produce such a result? So that, I can find which is the table costs the higher time.
I am giving the query structure below. As the table structure is too high, I am not able to provide it here.
SELECT h.nid,h.attached_nid,h.created, s.field_species_value as species, g.field_gender_value as gender, u.field_unique_id_value as unqid, n.title, dob.field_adult_healthy_weight_value as birth_date, dcolor.field_dog_primary_color_value as dogcolor, ccolor.field_primary_color_value as catcolor, sdcolor.field_dog_secondary_color_value as sdogcolor, sccolor.field_secondary_color_value as scatcolor, dpattern.field_dog_pattern_value as dogpattern, cpattern.field_cat_pattern_value as catpattern
FROM table1 h
JOIN table2 n ON n.nid = h.nid
JOIN table3 s ON n.nid = s.entity_id
JOIN table4 u ON n.nid = u.entity_id
LEFT JOIN table5 g ON n.nid = g.entity_id
LEFT JOIN table6 dob ON n.nid = dob.entity_id
LEFT JOIN table7 AS dcolor ON n.nid = dcolor.entity_id
LEFT JOIN table8 AS ccolor ON n.nid = ccolor.entity_id
LEFT JOIN table9 AS sdcolor ON n.nid = sdcolor.entity_id
LEFT JOIN table10 AS sccolor ON n.nid = sccolor.entity_id
LEFT JOIN table11 AS dpattern ON n.nid = dpattern.entity_id
LEFT JOIN table12 AS cpattern ON n.nid = cpattern.entity_id
WHERE h.title = '4208'
AND ((h.created BETWEEN 1483257600 AND 1485935999))
AND h.uid!=1
AND h.uid IN(
SELECT etid
FROM `table`
WHERE gid=464
AND entity_type='user')
AND h.attached_nid>0
ORDER BY CAST(h.created as UNSIGNED) DESC;
Below is the EXPLAIN result which I get
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
| 1 | PRIMARY | s | index | entity_id | field_species_value | 772 | NULL | 394699 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | u | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | n | eq_ref | PRIMARY | PRIMARY | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | g | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dob | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dcolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | ccolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | sdcolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | sccolor | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | dpattern | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | cpattern | ref | entity_id | entity_id | 4 | pantheon.s.entity_id | 1 | |
| 1 | PRIMARY | h | ref | attached_nid,nid,uid | nid | 5 | pantheon.s.entity_id | 3 | Using index condition; Using where |
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 4 | func | 1 | Using where |
| 2 | MATERIALIZED | og_membership | ref | entity,gid | gid | 4 | const | 185368 | Using where |
+------+--------------+---------------+--------+----------------------+---------------------+---------+----------------------+--------+----------------------------------------------+
You can find the ROWS_EXAMINED by using the Performance Schema.
Here is a link to the performance schema quick start guide.
https://dev.mysql.com/doc/refman/5.5/en/performance-schema-quick-start.html
This is the query I run in PHP applications, to find out what queries I need to optimize. You should be able to adapt it pretty easily.
The query finds the stats on the query that was run before this one. So in my apps, I run query after every query I run, store the results, then at the end of the PHP script I output the stats for every query I ran during the request.
SELECT `EVENT_ID`, TRUNCATE(`TIMER_WAIT`/1000000000000,6) as Duration,
`SQL_TEXT`, `DIGEST_TEXT`, `NO_INDEX_USED`, `NO_GOOD_INDEX_USED`, `ROWS_AFFECTED`, `ROWS_SENT`, `ROWS_EXAMINED`
FROM `performance_schema`.`events_statements_history`
WHERE
`CURRENT_SCHEMA` = '{$database}' AND `EVENT_NAME` LIKE 'statement/sql/%'
AND `THREAD_ID` = (SELECT `THREAD_ID` FROM `performance_schema`.`threads` WHERE `performance_schema`.`threads`.`PROCESSLIST_ID` = CONNECTION_ID())
ORDER BY `EVENT_ID` DESC LIMIT 1;
To decrease the number of rows accessed from og_membership, try adding an index containing the gid, entity_type, and etid fields. Including gid and entity_type should make the lookup more performant and including etid will make the index a covering index.
After adding the index, run EXPLAIN again to look at the results. Based on the new explain plan, either keep the index, remove the index, and/or add an additional index. Keep doing this until you get results you are satisfied with.
For sure, you will want to try and eliminate any mentions of Using temporary or Using filesort. Using temporary implies a temporary table is being used to make this query probably for the sheer size of your intermittent. Using filesort implies ordering isn't being satisfied with an index and is being done by examining the matching rows.
An detail explanation about EXPLAIN can be found at https://dev.mysql.com/doc/refman/5.7/en/explain-output.html.
Key-Value (EAV) schema sucks.
Indexes:
table1: INDEX(title, created)
table1: INDEX(uid, title, created)
table: INDEX(gid, entity_type, etid)
table* -- Is `entity_id` already an index? Can it be the PRIMARY KEY?
Does nid need to be NULL instead of NOT NULL?
If those don't do enough, try:
And turn the IN ( SELECT ... ) into a JOIN ( SELECT ... ) USING(hid)
If you still need help, please provide SHOW CREATE TABLE and EXPLAIN SELECT ...
have some table with index for two columns (user_id,date)
and SQL query
select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id in (select id from users force index (street_id) where street_id=30);
or
select user_id, stat.in, stat.out, stat.time, date
from stat where user_id in (select id from users force index (street_id) where street_id=30)
and date between STR_TO_DATE('2010-01-01 00:00:00', '%Y-%m-%d %H:%i:%s') and TR_TO_DATE('2014-05-22 23:59:59', '%Y-%m-%d %H:%i:%s')
In two case index must work, but I sink problem in in statement. If it's possible, how make it work?
Explain:
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 32028701 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_id | street_id | 8 | const | 650 | Using where; Using index |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
if search with one user_id index work
explain select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id=3991;
Explain:
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | stat | ref | user_id_2 | user_id_2 | 8 | const | 2973 | |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
First thing in the query the IN clause is creating havoc and if I am not wrong the indexes are not done properly.
So here is how it should be lets say the tables are as
create table users (id int, name varchar(100),street_id int);
insert into users values
(1,'a',20),(2,'b',30),(3,'c',10),(4,'d',20),(5,'e',10),(6,'f',40),(7,'g',20),
(8,'h',10),(9,'i',10),(10,'j',40);
create table stat (user_id int ,`in` int, `out` int, time int , date date);
insert into stat values
(1,1,1,20,'2014-01-01'),
(1,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-01'),
(2,1,1,20,'2014-01-01'),
(4,1,1,20,'2014-01-02'),
(6,1,1,20,'2014-01-02'),
(7,1,1,20,'2014-01-02'),
(8,1,1,20,'2014-01-02'),
(1,1,1,20,'2014-01-02'),
(2,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-03'),
(4,1,1,20,'2014-01-04'),
(5,1,1,20,'2014-01-04'),
(6,1,1,20,'2014-01-04'),
(7,1,1,20,'2014-01-04'),
(2,1,1,20,'2014-01-04'),
(3,1,1,20,'2014-01-04'),
(4,1,1,20,'2014-01-05'),
(5,1,1,20,'2014-01-05'),
(6,1,1,20,'2014-01-05'),
(7,1,1,20,'2014-01-05'),
(8,1,1,20,'2014-01-05'),
(9,1,1,20,'2014-01-05'),
(10,1,1,20,'2014-01-05'),
(1,1,1,20,'2014-01-06'),
(4,1,1,20,'2014-01-06');
Now add some indexes on the table
alter table users add index id_idx (id);
alter table users add index street_idx(street_id);
alter table stat add index user_id_idx(user_id);
Now if we execute the same query that you are trying to do using explain yields
EXPLAIN
select user_id, stat.`in`, stat.`out`, stat.time, date
from stat
where user_id in (select id from users force index (street_id) where street_id=30);
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 26 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_idx | street_idx | 5 | const | 1 | Using where |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
It still looks like trying to scan the entire table.
Now lets modify the query and use JOIN and see what explain has to say, note that I have index on both table for the joining key and which are of same type and size.
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where u.street_id=30 ;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Better hun ?? Now lets try a range search
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where
u.street_id=30
and s.date between '2014-01-01' AND '2014-01-06'
;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Still better right ??
So the underlying agenda is try avoiding IN queries. Use JOIN on indexed column and for search columns indexed them properly.