Is there anything wrong with having one view reference another view? For example, say I have a
users table
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) NOT NULL,
`last_name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
Then for the sake of argument a view that just shows all users
CREATE VIEW all_users AS SELECT * FROM users
And then a view that just returns their first_name and last_name
CREATE VIEW full_names AS SELECT first_name, last_name FROM all_users
Are there performance issue with basing one view off another? Let's also pretend this is the simplest of examples and a real world scenario would be much more complex, but the same general concept of basing one view of another view.
It depends on the ALGORITHM used. TEMPTABLE can be pretty expensive while MERGE should be the same as using the table directly so no loss there.
And for your example is the same:
mysql> explain select * from users;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| 1 | SIMPLE | users | system | NULL | NULL | NULL | NULL | 0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
1 row in set (0.01 sec)
mysql> explain select * from all_users;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| 1 | SIMPLE | users | system | NULL | NULL | NULL | NULL | 0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
1 row in set (0.00 sec)
mysql> explain select * from full_names ;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| 1 | SIMPLE | users | system | NULL | NULL | NULL | NULL | 0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
1 row in set (0.02 sec)
Related
I have a simple table like this,
CREATE TABLE `domain` (
`id` varchar(191) NOT NULL,
`time` bigint(20) DEFAULT NULL,
`task_id` bigint(20) DEFAULT NULL,
`name` varchar(512) DEFAULT NULL
PRIMARY KEY (`id`),
KEY `idx_domain_time` (`time`),
KEY `idx_domain_task_id` (`task_id`),
FULLTEXT KEY `idx_domain_name` (`name`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
And indexed like this:
mysql> show index from domain;
+--------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| domain | 0 | PRIMARY | 1 | id | A | 2036092 | NULL | NULL | | BTREE | | | NO |
| domain | 1 | idx_domain_name | 1 | name | NULL | NULL | NULL | NULL | YES | FULLTEXT |
+--------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
Index is used when I select only the id field:
mysql> explain SELECT id FROM `domain` WHERE task_id = '3';
+------+-------------+--------+------+--------------------+--------------------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------+------+--------------------+--------------------+---------+-------+---------+-------------+
| 1 | SIMPLE | domain | ref | idx_domain_task_id | idx_domain_task_id | 9 | const | 1018046 | Using index |
+------+-------------+--------+------+--------------------+--------------------+---------+-------+---------+-------------+
1 row in set (0.00 sec)
When I select all fields, it does not work:
mysql> explain SELECT * FROM `domain` WHERE task_id = '3';
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | domain | ALL | idx_domain_task_id | NULL | NULL | NULL | 2036092 | Using where |
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
mysql> explain SELECT id, name FROM `domain` WHERE task_id = '3';
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | domain | ALL | idx_domain_task_id | NULL | NULL | NULL | 2036092 | Using where |
+------+-------------+--------+------+--------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
What's wrong?
Indexes other than the Primary Key work by storing data for the indexed field(s) in index order, along with the primary key.
So when you SELECT the primary key by the indexed field, there is enough information in the index to completely satisfy the query. When you add other fields, there's no longer enough information in the index. That doesn't mean the database won't use the index, but now it's no longer as much of a slam dunk, and it comes down more to table statistics.
MySql optimizer will try to achieve the best performance so it may ignore an index. You can force optimizer to use the index you want if you are sure that will give you better performance. You can use :
SELECT * FROM `domain` USE INDEX (idx_domain_task_id) WHERE task_id = '3';
For more details please see this page Index Hints .
Background
I made a small table of 10 rows from a previous SELECT already ran (SavedAnimals).
I have a massive table (animals) which I would like to UPDATE using the rows with the same id as each row in my new table.
What I have tried so far
I can quickly SELECT the desired rows from the big table like this:
mysql> EXPLAIN SELECT * FROM animals WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 10 | |
| 1 | PRIMARY | animals | eq_ref | PRIMARY | PRIMARY | 8 | db_staging.SavedAnimals.animal_id | 1 | Using where |
| 2 | MATERIALIZED | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
But the "same" command on the UPDATE is not quick:
mysql> EXPLAIN UPDATE animals SET ignored=1, ignored_when=CURRENT_TIMESTAMP WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| 1 | PRIMARY | animals | index | NULL | PRIMARY | 8 | NULL | 34269464 | Using where |
| 2 | DEPENDENT SUBQUERY | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
2 rows in set (0.00 sec)
The UPDATE command never finishes if I run it.
QUESTION
How do I make mariaDB run with the Materialized select_type on the UPDATE like it does on the SELECT?
OR
Is there a totally separate way that I should approach this which would be quick?
Notes
Version: 10.3.23-MariaDB-log
Use JOIN rather than WHERE...IN. MySQL tends to optimize them better.
UPDATE animals AS a
JOIN SavedAnimals AS sa ON a.id = sa.animal_id
SET a.ignored=1, a.ignored_when=CURRENT_TIMESTAMP
WHERE a.ignored = 0
You should find an EXISTS clause more efficient than an IN clause. For example:
UPDATE animals a
SET a.ignored = 1,
a.ignored_when = CURRENT_TIMESTAMP
WHERE a.ignored = 0
AND EXISTS (SELECT * FROM SavedAnimals sa WHERE sa.animal_id = a.id)
How can I measure mysql "UPSERT" performance? More specifically, get information about the implied search before the insert/update/replace?
using mysql 8, with a schema that has three fields. Two are part of the primary key. Table is currently innodb but that is not a hard requirement.
CREATE TABLE IF NOT EXISTS `test`.`recent`
( `uid` int NOT NULL, `gid` int NOT NULL, `last` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`uid`,`gid`),
KEY `idx_last` (`last`) USING BTREE
) ENGINE=InnoDB;
+----------+----------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------+------+-----+-------------------+-------+
| uid | int(11) | NO | PRI | NULL | |
| gid | int(11) | NO | PRI | NULL | |
| last | datetime | NO | MUL | CURRENT_TIMESTAMP | |
+----------+----------+------+-----+-------------------+-------+
I plan to insert values using
INSERT INTO test.recent (uid,gid) VALUES (1, 1)
ON DUPLICATE KEY UPDATE last=NOW();
How do I go about figuring out the performance of this query, since EXPLAIN will not show the implied search, only the insert:
MYSQL> explain INSERT INTO test.recent (uid,gid) VALUES (1, 1) ON DUPLICATE KEY UPDATE last=NOW();
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | INSERT | recent | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set (0.00 sec)
MYSQL> explain INSERT INTO test.recent (uid,gid) VALUES (1, 1);
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | INSERT | recent | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set (0.00 sec)
which is different from the explain on an actual search:
MYSQL> explain select last from test.recent where uid=1 and gid=1;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | recent | NULL | const | PRIMARY | PRIMARY | 8 | const,const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
One of the variables I am trying to figure out, is if performance would change at all if I use a blind update instead:
MYSQL> explain REPLACE INTO test.recent VALUES (1, 1, NOW());
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | REPLACE | recent | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set (0.01 sec)
But as you can see, the information i get is the same (unhelpful) as I get for an "explain insert".
Another question I would like to answer based on measurements, is if things would change for better or worse (and by how much) If i tested both upserts approaches (on duplicate vs replace) with a DATE field (instead of the DATETIME), which in theory would results in less writes (but still the same number of implied searches). But again, explain is no help here.
Don't trust EXPLAIN with very few rows.
Your IODKU is optimal. Here's how it will work:
1-. Like a SELECT, drill down the PRIMARY KEY BTree to find the row with (1,1) or a gap where (1,1) should be. That is about as fast a lookup as can be had.
2a. If the row exists, UPDATE it.
2b. If the row does not exist, INSERT (and set last to the DEFAULT of `CURRENT_TIMESTAMP).
If you want, I can go into details of the steps. But we are talking sub-millisecond for each.
If I wanted to time the query, I would use some high-res timer. Very likely the timings would bounce around, depending of whether there is a breeze blowing, or a butterfly is flapping its wings, or the phase of the moon.
Caveat: If you "simplified" your query for this question, I may not be giving you correct info for the real query.
If you are doing a thousand IODKUs, then there may be optimizations that involve combining them. There are some typical optimizations that can easily give 10x speedup.
I have a working, nice, indexed SQL query aggregating notes (sum of ints) for all my users and others stuffs. This is "query A".
I want to use this aggregated notes in others queries, say "query B".
If I create a View based on "query A", will the indexes of the original query will be used when needed if I join it in "query B" ?
Is that true for MySQL ? For others flavors of SQL ?
Thanks.
In MySQL, you cannot create an index on a view. MySQL uses indexes of the underlying tables when you query data against the views that use the merge algorithm. For the views that use the temptable algorithm, indexes are not utilized when you query data against the views.
https://www.percona.com/blog/2007/08/12/mysql-view-as-performance-troublemaker/
Here's a demo table. It has a userid attribute column and a note column.
mysql> create table t (id serial primary key, userid int not null, note int, key(userid,note));
If you do an aggregation to get the sum of note per userid, it does an index-scan on (userid, note).
mysql> explain select userid, sum(note) from t group by userid;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| 1 | SIMPLE | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
1 row in set (0.00 sec)
If we create a view for the same query, then we can see that querying the view uses the same index on the underlying table. Views in MySQL are pretty much like macros — they just query the underlying table.
mysql> create view v as select userid, sum(note) from t group by userid;
Query OK, 0 rows affected (0.03 sec)
mysql> explain select * from v;
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
2 rows in set (0.00 sec)
So far so good.
Now let's create a table to join with the view, and join to it.
mysql> create table u (userid int primary key, name text);
Query OK, 0 rows affected (0.09 sec)
mysql> explain select * from v join u using (userid);
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | test.u.userid | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
3 rows in set (0.01 sec)
I tried to use hints like straight_join to force it to read v then join to u.
mysql> explain select * from v straight_join u on (v.userid=u.userid);
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
"Using join buffer (Block Nested Loop)" is MySQL's terminology for "no index used for the join." It's just looping over the table the hard way -- by reading batches of rows from start to finish of the table.
I tried to use force index to tell MySQL that type=ALL is to be avoided.
mysql> explain select * from v straight_join u force index(PRIMARY) on (v.userid=u.userid);
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | v.userid | 1 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
Maybe this is using an index for the join? But it's weird that table u is before table t in the EXPLAIN. I'm frankly not sure how to understand what it's doing, given the order of rows in this EXPLAIN report. I would expect the joined table should come after the primary table of the query.
I only put a few rows of data into each table. One might get some different EXPLAIN results with a larger representative sample of test data. I'll leave that to you to try.
I have this MySQL table:
CREATE TABLE `maillog` (
`Id` varchar(200) NOT NULL DEFAULT 'Test',
`email` varchar(255) DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `email` (`email`)
) ENGINE=MyISAM DEFAULT CHARSET=cp1251 ROW_FORMAT=DYNAMIC;
Now, I want to execute this query - but it's very slow for many rows:
SELECT Id FROM `maillog` ORDER BY `Id`;
Why does MySQL not use the primary key?
If I run EXPLAIN for this query, the result shows a NULL value for Key:
mysql> EXPLAIN SELECT Id FROM `maillog` ORDER BY `Id`;
+----+-------------+---------+------------+--------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+--------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | maillog | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
+----+-------------+---------+------------+--------+---------------+------+---------+------+------+----------+-------+
But the DESCRIBE query for this table, shows the Key PRI for the Id column:
mysql> DESCRIBE `maillog`;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| Id | varchar(200) | NO | PRI | Test | |
| email | varchar(255) | YES | MUL | NULL | |
+-------+--------------+------+-----+---------+-------+
I think it is very strange you set a primary key empty string as default.
Even i do not understand how mySql allows it.
Try to quit default value, primary key must be mandatory.
You have only 1 row in your DB, the query engine don't bother loading the index, it just make a sequential read on the table, so yes "no index".
try adding a few more row, this should change (1 line in maillog, 2 in maillog2):
SQL Fiddle
MySQL 5.6 Schema Setup:
Query 1:
explain SELECT Id FROM `maillog` ORDER BY `Id`
Results:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|---------|--------|---------------|--------|---------|--------|------|--------|
| 1 | SIMPLE | maillog | system | (null) | (null) | (null) | (null) | 1 | (null) |
Query 2:
explain SELECT Id FROM `maillog2` ORDER BY `Id`
Results:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|----------|-------|---------------|---------|---------|--------|------|-------------|
| 1 | SIMPLE | maillog2 | index | (null) | PRIMARY | 202 | (null) | 2 | Using index |