MySQL JOINing a TABLE to itself without a primary key

I created two tables in mysql. Each contains an integer idx and a string name. In one, the idx was the primary key.
CREATE TABLE table_indexed (
idx INTEGER,
name VARCHAR(24),
PRIMARY KEY(idx)
);
CREATE TABLE table_not_indexed (
idx INTEGER,
name VARCHAR(24)
);
I then added the same data to both tables: 3 million rows with distinct values for idx (1 to 3,000,000, randomly arranged) and random arrangements of 8 lowercase characters for name.
Then I ran a query where I joined each table to itself. The table without the primary key runs almost 3 times as fast.
mysql> SELECT COUNT(*)
-> FROM table_indexed t1 JOIN table_indexed t2
-> ON t1.idx = t2.idx;
+----------+
| COUNT(*) |
+----------+
| 3000000 |
+----------+
1 row in set (11.80 sec)
mysql> SELECT COUNT(*)
-> FROM table_not_indexed t1 JOIN table_not_indexed t2
-> ON t1.idx = t2.idx;
+----------+
| COUNT(*) |
+----------+
| 3000000 |
+----------+
1 row in set (4.12 sec)
EDIT: I asked MySQL to EXPLAIN the queries.
mysql> EXPLAIN SELECT COUNT(*)
-> FROM table_indexed t1 JOIN table_indexed t2
-> ON t1.idx = t2.idx;
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | index | PRIMARY | PRIMARY | 4 | NULL | 3171970 | 100.00 | Using index |
| 1 | SIMPLE | t2 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | index_test3000000.t1.idx | 1 | 100.00 | Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------------+---------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT COUNT(*)
-> FROM table_not_indexed t1 JOIN table_not_indexed t2
-> ON t1.idx = t2.idx;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 2993208 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2993208 | 10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql>

In both cases it does a table scan of t1, then looks for the matching row in t2.
In this case USING INDEX is equivalent to using the PK when the PK is involved. (EXPLAIN is a bit sloppy and inconsistent in this area.)
Sometimes you can get more details with EXPLAIN FORMAT=JSON SELECT .... (Might not be anything useful in this case.)
"rows" is just an estimate.
The non-indexed case reads t2 entirely into memory and builds a Hash index on it. With too small a value for join_buffer_size, you can experience the alternative -- repeated full table scans of t2.
Your experiment is a good example of when the "join buffer" is good, but not as good as an appropriate index.
Your experiment would probably come out the same with two separate tables instead of a "self-join".
"3 times as fast" -- I would expect a lot of variation in the "3" for different test cases.
For more on join_buffer_size, BNL, and BKA (Block Nested-Loop or Batched Key Access), see https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_join_buffer_size
It is potentially unsafe to set join_buffer_size bigger than 1% of RAM.
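
To make the difference between the two plans concrete, here is a rough Python sketch of the two join strategies that EXPLAIN reports: one index probe per row of t1, versus one in-memory hash table built from t2. It is only an illustration, not MySQL's implementation. The function names are made up, a sorted list stands in for the B-tree, and its timings say nothing about the server.

```python
import random
from bisect import bisect_left, bisect_right


def index_lookup_join_count(left, sorted_right):
    """Count matching pairs with one binary search per left row (index-style probe)."""
    return sum(
        bisect_right(sorted_right, key) - bisect_left(sorted_right, key)
        for key in left
    )


def hash_join_count(left, right):
    """Count matching pairs by building a hash table on `right`, then probing it."""
    buckets = {}
    for key in right:
        buckets[key] = buckets.get(key, 0) + 1
    return sum(buckets.get(key, 0) for key in left)


# Distinct idx values in random order, as in the question (scaled down).
rows = list(range(1, 10001))
random.shuffle(rows)

index_count = index_lookup_join_count(rows, sorted(rows))
hash_count = hash_join_count(rows, rows)
print(index_count, hash_count)
```

Both functions return the same count. They differ only in how each row of t1 finds its partner in t2, which is the part that join_buffer_size influences in the non-indexed case.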

Related

Which is faster INSTR vs Like Prefix For varchar In MYSQL [duplicate]

My goal is to find out whether a string occurs in a column. The column has no unique B-tree index. Which is faster and more efficient for a varchar in MySQL, INSTR or LIKE with a prefix pattern, and why?
Or are there other more-efficient methods?
INSTR(column, 'value') > 0
vs
column LIKE 'value%'
I looked up several questions, but there were only questions and answers about wild cards front and back.
For example,
column LIKE '%value%'

They are not the same.
column LIKE 'value%' is a starts-with match, equivalent to INSTR(column, 'value') = 1 rather than INSTR(column, 'value') > 0.
On the other hand, INSTR(column, 'value') > 0 is a contains-anywhere match, equivalent to column LIKE '%value%' rather than column LIKE 'value%'.
Of these four expressions, column LIKE 'value%' is likely to perform the best, because it's the only one that still has a chance of using any index for the column.
But it sounds like you want the contains anywhere match, and there's probably not any meaningful difference between column like '%value%' and INSTR(column, 'value') > 0. The best option here is likely a full-text search.
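
A quick way to check which expressions pair up is to run all four against a tiny table. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained). The table name demo, the column col and the sample rows are invented. It checks semantics only and says nothing about speed or index use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (col TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [("value first",), ("has value inside",), ("nothing here",)],
)


def matches(predicate):
    """Return the rows selected by a WHERE predicate, sorted for comparison."""
    sql = f"SELECT col FROM demo WHERE {predicate} ORDER BY col"
    return [row[0] for row in conn.execute(sql)]


# Starts-with pair
prefix_like = matches("col LIKE 'value%'")
prefix_instr = matches("instr(col, 'value') = 1")

# Contains-anywhere pair
contains_like = matches("col LIKE '%value%'")
contains_instr = matches("instr(col, 'value') > 0")

print(prefix_like, contains_like)
```

The sample data is all lowercase on purpose, because case sensitivity of LIKE and INSTR depends on the engine and collation.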

A simple test on my test table (integers) shows that LIKE is faster.
MySQL [test]> select * from integers where instr(t2,'A')>0;
+----+--------------------------------------+----------+------+
| i | t1 | f | t2 |
+----+--------------------------------------+----------+------+
| 42 | 8f0c8b96-aa60-11eb-aa31-309c23b7280c | 0.983418 | ABC |
+----+--------------------------------------+----------+------+
1 row in set (24.08 sec)
MySQL [test]> select * from integers where instr(t2,'A')>0;
+----+--------------------------------------+----------+------+
| i | t1 | f | t2 |
+----+--------------------------------------+----------+------+
| 42 | 8f0c8b96-aa60-11eb-aa31-309c23b7280c | 0.983418 | ABC |
+----+--------------------------------------+----------+------+
1 row in set (24.11 sec)
MySQL [test]> explain select * from integers where instr(t2,'A')>0;
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | integers | NULL | ALL | NULL | NULL | NULL | NULL | 2104867 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select * from integers where t2 like 'A%';
+----+--------------------------------------+----------+------+
| i | t1 | f | t2 |
+----+--------------------------------------+----------+------+
| 42 | 8f0c8b96-aa60-11eb-aa31-309c23b7280c | 0.983418 | ABC |
+----+--------------------------------------+----------+------+
1 row in set (1.00 sec)
MySQL [test]> select * from integers where t2 like 'A%';
+----+--------------------------------------+----------+------+
| i | t1 | f | t2 |
+----+--------------------------------------+----------+------+
| 42 | 8f0c8b96-aa60-11eb-aa31-309c23b7280c | 0.983418 | ABC |
+----+--------------------------------------+----------+------+
1 row in set (1.00 sec)
MySQL [test]> explain select * from integers where t2 like 'A%';
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | integers | NULL | ALL | NULL | NULL | NULL | NULL | 2104867 | 11.11 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
This table has 2621442 records, in MySQL 8.0.29, with this DDL:
CREATE TABLE `integers` (
`i` int NOT NULL,
`t1` varchar(36) DEFAULT NULL,
`f` float DEFAULT NULL,
`t2` varchar(1024) DEFAULT NULL,
PRIMARY KEY (`i`),
KEY `integers_t1` (`t1`),
KEY `idx_f` (`f`),
KEY `even` (((`i` % 2)))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_AUTO_RECALC=1

Outside query of subqueries is extremely slow (Mysql)

I have an aggregate query with subqueries nested two levels deep. What is strange is that the two subqueries run acceptably fast, but the outside query is unacceptably slow.
The basic idea behind the query is to use a lookup table to find all elements linked to a key, where the key is selected by querying for one of its elements. The resulting set should then be provided to the outside query, which matches it against its own keys/indexes.
Here with all outputs and statements:
We start with the two table definitions
CREATE TABLE `table1` (
`id1` int(11) NOT NULL DEFAULT '0',
`id2` int(11) NOT NULL,
`value` int(11) DEFAULT '0',
PRIMARY KEY (`id1`,`id2`),
KEY `k_id1` (`id1`),
KEY `k_id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `lookuptable1` (
`id3` int(11) NOT NULL,
`id4` int(11) NOT NULL,
PRIMARY KEY (`id3`,`id4`),
UNIQUE KEY `id4_idx` (`id4`),
KEY `id3_idx` (`id3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The inside subquery with its own subquery
SELECT lt1.id4
FROM lookuptable1 lt1
WHERE lt1.id3 = (SELECT pt1.id3
FROM lookuptable1 pt1
WHERE pt1.id4 = 5960)
+-----------+
| id4 |
+-----------+
| 5960 |
| 17215 |
| 3625734 |
| 9312798 |
+-----------+
4 rows in set (0.00 sec)
As you can see: Fast enough.
But the outside query is where the bad bottleneck lies.
Complete query
SELECT
t1.id1,
sum(t1.value)
FROM table1 t1
WHERE t1.id2 = 3 AND t1.id1 IN
(
SELECT lt1.id4
FROM lookuptable1 lt1
WHERE lt1.id3 = (SELECT pt1.id3
FROM lookuptable1 pt1
WHERE pt1.id4 = 5960)
);
+-----------+-----------------------+
| id1 | sum(t1.value) |
+-----------+-----------------------+
| 9312798 | 0 |
+-----------+-----------------------+
1 row in set (8.01 sec)
That is 8 seconds too slow.
Here is the EXPLAIN EXTENDED output for this query:
+----+--------------------+-------+--------+-------------------+-------------+---------+------------+---------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+--------+-------------------+-------------+---------+------------+---------+----------+--------------------------+
| 1 | PRIMARY | t1 | index | NULL | PRIMARY | 8 | NULL | 1454343 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | lt1 | eq_ref | PRIMARY,id3,id4 | PRIMARY | 8 | const,func | 1 | 100.00 | Using where; Using index |
| 3 | SUBQUERY | pt1 | const | id4 | id4_idx | 4 | | 1 | 100.00 | Using index |
+----+--------------------+-------+--------+-------------------+-------------+---------+------------+---------+----------+--------------------------+
As I understand from this, the outside query doesn't actually use the index that it could.
What could we possibly be doing wrong in this query? Surely it should be running much faster.
I tried running the outside query with the subqueries' result copy-pasted into the IN clause (in other words, the subqueries aren't run). It then runs fast, as expected. Here's the EXPLAIN EXTENDED output for that:
+----+-------------+-------+-------+----------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+----------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t1 | range | PRIMARY,k_id1 | PRIMARY | 4 | NULL | 5 | 100.00 | Using where |
+----+-------------+-------+-------+----------------+---------+---------+------+------+----------+-------------+
Oh yeah. This is running on MySQL 5.5

You could avoid the IN clause by using an inner join:
SELECT
t1.id1,
sum(t1.value)
FROM table1 t1
INNER JOIN (
SELECT lt1.id4
FROM lookuptable1 lt1
WHERE lt1.id3 = (SELECT pt1.id3
FROM lookuptable1 pt1
WHERE pt1.id4 = 5960)
) t on t.id4 = t1.id1 and t1.id2 = 3
This could improve your query.
Be sure you have a proper index on table1 (id1, id2).
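
Before swapping IN for a join, it is worth confirming that the two forms return the same rows. Here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, and GROUP BY t1.id1 is added as an assumption about the intent, so that each id1 gets its own sum. The two forms can only agree because id4 is unique in lookuptable1, so the join cannot duplicate rows of table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (
        id1 INTEGER, id2 INTEGER, value INTEGER,
        PRIMARY KEY (id1, id2)
    );
    CREATE TABLE lookuptable1 (
        id3 INTEGER, id4 INTEGER UNIQUE,
        PRIMARY KEY (id3, id4)
    );
    -- Invented sample data: ids 5960 and 17215 share the same id3.
    INSERT INTO lookuptable1 VALUES (1, 5960), (1, 17215), (2, 42);
    INSERT INTO table1 VALUES (5960, 3, 10), (17215, 3, 5), (42, 3, 99), (5960, 4, 7);
""")

# The inner subquery from the question, reused by both forms.
inner = """
    SELECT lt1.id4 FROM lookuptable1 lt1
    WHERE lt1.id3 = (SELECT pt1.id3 FROM lookuptable1 pt1 WHERE pt1.id4 = 5960)
"""

in_rows = conn.execute(f"""
    SELECT t1.id1, SUM(t1.value) FROM table1 t1
    WHERE t1.id2 = 3 AND t1.id1 IN ({inner})
    GROUP BY t1.id1 ORDER BY t1.id1
""").fetchall()

join_rows = conn.execute(f"""
    SELECT t1.id1, SUM(t1.value) FROM table1 t1
    INNER JOIN ({inner}) t ON t.id4 = t1.id1 AND t1.id2 = 3
    GROUP BY t1.id1 ORDER BY t1.id1
""").fetchall()

print(in_rows, join_rows)
```

This only checks that the results are equivalent. Whether the join is actually faster on MySQL 5.5 still has to be confirmed with EXPLAIN on the real tables.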

Find value within a range in database table

I need the SQL equivalent of this.
I have a table like this
ID MN MX
-- -- --
A 0 3
B 4 6
C 7 9
Given a number, say 5, I want to find the ID of the row where MN and MX contain that number, in this case that would be B.
Obviously,
SELECT ID FROM T WHERE ? BETWEEN MN AND MX
would do, but I have 9 million rows and I want this to run as fast as possible. In particular, I know that there can be only one matching row, I know that the MN-MX ranges cover the space completely, and so on. With all these constraints on the possible answers, there should be some optimizations I can make. Shouldn't there be?
All I have so far is indexing MN and using the following
SELECT ID FROM T WHERE ? BETWEEN MN AND MX ORDER BY MN LIMIT 1
but that is weak.

If you have an index spanning MN and MX it should be pretty fast, even with 9M rows.
alter table T add index mn_mx (mn, mx);
Edit
I just tried a test with a 1M-row table:
mysql> select count(*) from T;
+----------+
| count(*) |
+----------+
| 1000001 |
+----------+
1 row in set (0.17 sec)
mysql> show create table T\G
*************************** 1. row ***************************
Table: T
Create Table: CREATE TABLE `T` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`mn` int(10) DEFAULT NULL,
`mx` int(10) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `mn_mx` (`mn`,`mx`)
) ENGINE=InnoDB AUTO_INCREMENT=1048561 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> select * from T order by rand() limit 1;
+--------+-----------+-----------+
| id | mn | mx |
+--------+-----------+-----------+
| 112940 | 948004986 | 948004989 |
+--------+-----------+-----------+
1 row in set (0.65 sec)
mysql> explain select id from T where 948004987 between mn and mx;
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
| 1 | SIMPLE | T | range | mn_mx | mn_mx | 5 | NULL | 239000 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)
mysql> select id from T where 948004987 between mn and mx;
+--------+
| id |
+--------+
| 112938 |
| 112939 |
| 112940 |
| 112941 |
+--------+
4 rows in set (0.03 sec)
In my example I just had an incrementing range of mn values and then set mx to mn + 3, so that's why I got more than one row, but the same should apply to you.
Edit 2
Reworking your query will definitely be better
mysql> explain select id from T where mn<=947892055 and mx>=947892055;
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
| 1 | SIMPLE | T | range | mn_mx | mn_mx | 5 | NULL | 9 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+
It's worth noting that even though the first EXPLAIN reported many more rows to be scanned, I had a large enough InnoDB buffer pool to keep the entire table in RAM after creating it, so it was still pretty fast.

If there are no gaps in your set, a simple comparison against MN will work. Take the row with the largest MN that does not exceed the number:
SELECT ID FROM T WHERE MN <= ? ORDER BY MN DESC LIMIT 1
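
For a gap-free set of ranges, the row wanted is the one with the largest MN that does not exceed the number, so the sort has to be descending before LIMIT 1 is applied. Here is a minimal check using the three-row table from the question, with Python's sqlite3 module standing in for MySQL. The index name t_mn and the helper find_id are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID TEXT, MN INTEGER, MX INTEGER)")
conn.execute("CREATE INDEX t_mn ON T (MN)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [("A", 0, 3), ("B", 4, 6), ("C", 7, 9)],
)


def find_id(n):
    """Return the ID whose range starts at the largest MN <= n, or None."""
    row = conn.execute(
        "SELECT ID FROM T WHERE MN <= ? ORDER BY MN DESC LIMIT 1", (n,)
    ).fetchone()
    return row[0] if row else None


# Cross-check against the straightforward BETWEEN query from the question.
between_id = conn.execute(
    "SELECT ID FROM T WHERE ? BETWEEN MN AND MX", (5,)
).fetchone()[0]

print(find_id(5), between_id)
```

Note that this form never looks at MX, so it is only correct when the ranges really do cover the space without gaps. A number above the last MX would still return the last row.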

Why is mySQL query, left join 'considerably' faster than my inner join

I've researched this, but I still cannot explain why:
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155
Is significantly slower than:
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155
115ms Vs 478ms. They are both using InnoDB and there are relationships defined. The 'card_legality' contains approx 200k rows, while the 'legality' table contains 11 rows. Here is the structure for each:
CREATE TABLE `card_legality` (
`card_id` varchar(8) NOT NULL DEFAULT '',
`legality_id` int(3) NOT NULL,
`cl_boolean` tinyint(1) NOT NULL,
PRIMARY KEY (`card_id`,`legality_id`),
KEY `legality_id` (`legality_id`),
CONSTRAINT `card_legality_ibfk_2` FOREIGN KEY (`legality_id`) REFERENCES `legality` (`legality_id`),
CONSTRAINT `card_legality_ibfk_1` FOREIGN KEY (`card_id`) REFERENCES `card` (`card_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
And:
CREATE TABLE `legality` (
`legality_id` int(3) NOT NULL AUTO_INCREMENT,
`l_name` varchar(16) NOT NULL DEFAULT '',
PRIMARY KEY (`legality_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=latin1;
I could simply use LEFT-JOIN, but it doesn't seem quite right... any thoughts, please?
UPDATE:
As requested, I've included the results of EXPLAIN for each. I had run it previously, but I don't pretend to have a thorough understanding of it.
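+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                           | rows   | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+
|  1 | SIMPLE      | cl    | ALL    | PRIMARY       | NULL    | NULL    | NULL                          | 199747 | Using where |
|  1 | SIMPLE      | l     | eq_ref | PRIMARY       | PRIMARY | 4       | hexproof.co.uk.cl.legality_id |      1 |             |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------------+--------+-------------+
AND, inner join:
+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+
| id | select_type | table | type | possible_keys       | key         | key_len | ref                          | rows  | Extra       |
+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+
|  1 | SIMPLE      | l     | ALL  | PRIMARY             | NULL        | NULL    | NULL                         |    11 |             |
|  1 | SIMPLE      | cl    | ref  | PRIMARY,legality_id | legality_id | 4       | hexproof.co.uk.l.legality_id | 33799 | Using where |
+----+-------------+-------+------+---------------------+-------------+---------+------------------------------+-------+-------------+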

It is because card_id is a varchar. MySQL can't use the index on card_id when the column is compared with a number, as described in the MySQL documentation on type conversion. The important part is
For comparisons of a string column with a number, MySQL cannot use an
index on the column to look up the value quickly. If str_col is an
indexed string column, the index cannot be used when performing the
lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings that may
convert to the value 1, such as '1', ' 1', or '1a'.
If you change your queries to
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'
and
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'
You should see a huge improvement in speed and also see a different EXPLAIN.
Here is a similar (but easier) test to show this:
> desc id_test;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id | varchar(8) | NO | PRI | NULL | |
+-------+------------+------+-----+---------+-------+
1 row in set (0.17 sec)
> select * from id_test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+----+
9 rows in set (0.00 sec)
> explain select * from id_test where id = 1;
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | id_test | index | PRIMARY | PRIMARY | 10 | NULL | 9 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
> explain select * from id_test where id = '1';
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | id_test | const | PRIMARY | PRIMARY | 10 | const | 1 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)
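In the first case Extra shows Using where; Using index, and in the second only Using index. Also, type changes from index (a scan over all 9 index entries) to const (a single-row lookup), and ref changes from NULL to const. Needless to say, the second one is better.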
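
The quoted rule is easier to see with a small Python sketch. The helper mysql_like_number is hypothetical and only approximates the string-to-number conversion described in the quote (take the leading numeric prefix, otherwise 0). It is not MySQL's actual routine. The point is that a numeric comparison has to convert every stored string, while a string comparison can go straight to one key.

```python
import re


def mysql_like_number(text):
    """Hypothetical approximation: leading numeric prefix of the string, else 0."""
    match = re.match(r"\s*([+-]?\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else 0.0


# Invented varchar key values, including the variants named in the quote.
stored = ["1", " 1", "1a", "2", "10"]

# WHERE str_col = 1   -> every stored value must be converted and checked.
numeric_matches = [s for s in stored if mysql_like_number(s) == 1]

# WHERE str_col = '1' -> exact key, so an index lookup is possible.
string_matches = [s for s in stored if s == "1"]

print(numeric_matches, string_matches)
```

Three different strings satisfy the numeric comparison, so no single index entry can answer it. That matches the index scan in the first EXPLAIN and the const lookup in the second.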

L2G has it pretty much summed up, although I suspect it could be because of the varchar type used for card_id.
I actually printed out this informative page for quick benchmarking and profiling. Here is a quick poor man's profiling technique:
Time a SQL on MySQL
Enable Profiling
mysql> SET PROFILING = 1;
...
RUN your SQLs
...
mysql> SHOW PROFILES;
+----------+------------+-----------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------+
| 1 | 0.00014600 | SELECT DATABASE() |
| 2 | 0.00024250 | select user from user |
+----------+------------+-----------------------+
mysql> SHOW PROFILE for QUERY 2;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000034 |
| checking query cache for query | 0.000033 |
| checking permissions | 0.000006 |
| Opening tables | 0.000011 |
| init | 0.000013 |
| optimizing | 0.000004 |
| executing | 0.000011 |
| end | 0.000004 |
| query end | 0.000002 |
| freeing items | 0.000026 |
| logging slow query | 0.000002 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
Good-luck, oh and please post your findings!

I'd try EXPLAIN on both of those queries. Just prefix each SELECT with EXPLAIN and run them. It gives really useful info on how mySQL is optimizing and executing queries.

I'm pretty sure that MySQL has better optimization for left joins, but I have no evidence to back this up at the moment.
ETA: After a quick scout round I can't find anything concrete to uphold my view, so...

Getting max value from many tables

There are two ways that I can think of to obtain similar results from multiple tables. One is UNION and the other is JOIN. The similar questions on SO have all been answered with a UNION. Here's the code I just found:
SELECT max(up.id) AS up, max(sc.id) AS sc, max(cl.id) AS cl
FROM updates up, chat_staff sc, change_log cl
explain:
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
My question is -- Is this better than the following?
SELECT "up.id" AS K, max(id) AS V FROM updates
UNION
SELECT "sc.id" AS K, max(id) AS V FROM chat_staff
UNION
SELECT "cl.id" AS K, max(id) AS V FROM change_log
explain:
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 3 | UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| NULL | UNION RESULT | <union1,2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+

Both of those methods are just fine. In fact, I have another method:
SELECT
IFNULL(maxidup,0) max_id_up,
IFNULL(maxidsc,0) max_sc_up,
IFNULL(maxidcl,0) max_cl_up
FROM
(SELECT max(id) maxidup FROM updates) up,
(SELECT max(id) maxidsc FROM chat_staff) sc,
(SELECT max(id) maxidcl FROM change_log) cl
;
This method presents the three values side by side like your first example. It also shows 0 in the event that one of the tables is empty.
mysql> DROP DATABASE IF EXISTS junk;
Query OK, 3 rows affected (0.11 sec)
mysql> CREATE DATABASE junk;
Query OK, 1 row affected (0.00 sec)
mysql> use junk
Database changed
mysql> CREATE TABLE updates (id int not null auto_increment primary key,x int);
Query OK, 0 rows affected (0.07 sec)
mysql> CREATE TABLE chat_staff LIKE updates;
Query OK, 0 rows affected (0.07 sec)
mysql> CREATE TABLE change_log LIKE updates;
Query OK, 0 rows affected (0.06 sec)
mysql> INSERT INTO updates (x) VALUES (37),(84),(12);
Query OK, 3 rows affected (0.06 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> INSERT INTO change_log (x) VALUES (37),(84),(12),(14),(35);
Query OK, 5 rows affected (0.09 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SELECT
-> IFNULL(maxidup,0) max_id_up,
-> IFNULL(maxidsc,0) max_sc_up,
-> IFNULL(maxidcl,0) max_cl_up
-> FROM
-> (SELECT max(id) maxidup FROM updates) up,
-> (SELECT max(id) maxidsc FROM chat_staff) sc,
-> (SELECT max(id) maxidcl FROM change_log) cl
-> ;
+-----------+-----------+-----------+
| max_id_up | max_sc_up | max_cl_up |
+-----------+-----------+-----------+
| 3 | 0 | 5 |
+-----------+-----------+-----------+
1 row in set (0.00 sec)
mysql> explain SELECT IFNULL(maxidup,0) max_id_up, IFNULL(maxidsc,0) max_sc_up, IFNULL(maxidcl,0) max_cl_up FROM (SELECT max(id) maxidup FROM updates) up, (SELECT max(id) maxidsc FROM chat_staff) sc, (SELECT max(id) maxidcl FROM change_log) cl;
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
| 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | <derived3> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | <derived4> | system | NULL | NULL | NULL | NULL | 1 | |
| 4 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No matching min/max row |
| 2 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
6 rows in set (0.02 sec)
In my EXPLAIN plan, it has Select tables optimized away just like yours. Why?
Since id is indexed in all the tables, the index is used to retrieve the max(id) rather than the table. Thus, Select tables optimized away is the correct response.
Six of one, half dozen of the other. How you present data from there is strictly your personal preference.
UPDATE 2011-10-20 15:32 EDT
You commented : Do you know how table locking would compromise this? Let's say one of the tables in question is locked. Would this query lock the other two and keep 'em locked until the first one was freed up?
This would depend on the storage engine. If all the tables in question are MyISAM, it is a definite possibility, since MyISAM performs a full table lock on INSERT, UPDATE, and DELETE. If the three tables are InnoDB, you have the benefit of MVCC to provide transaction isolation, which gives every session its own point-in-time view of the data. Aside from DDL and an explicit LOCK TABLES against InnoDB, your query should not be blocked.

Actually, while they're similar, there's a subtle difference. The first gives you a one-row, three-column table (with the values going "across") and the second gives you a three-row, two-column table (with the values going "down").
Provided you're happy processing or viewing that data in either form, it's probably going to come down to performance.
In my experience (and this is nothing to do specifically with MySQL), the latter query will probably be better. That's because the DBMSs I work with are able to run queries like that in parallel for efficiency, combining them once all have completed. The fact that they're on different tables means that lock contention between them will be zero.
It may be that the query analysis engine of a DBMS could do a similar optimisation for the first query but it would require a lot more intelligence than I've seen from most of them.
One quick point, if you use union all instead of just union, you tell the database not to remove duplicate rows. You won't get any duplicates in this case due to the K column being different for all three sub-queries.
But, as with all optimisations, measure, don't guess! Certainly don't take as gospel the rants of random internet roamers (yes, even me).
Put together various candidate tables with the properties you're likely to have in production, and compare the performance of each.
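
As a starting point for that kind of comparison, here is a small sketch that checks that UNION and UNION ALL return the same rows for this query, using Python's sqlite3 module as a stand-in for MySQL. The row counts (3, 0 and 5) are invented, and ORDER BY K is added only to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three small tables; chat_staff is left empty on purpose.
for table, n_rows in (("updates", 3), ("chat_staff", 0), ("change_log", 5)):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, x INTEGER)")
    conn.executemany(
        f"INSERT INTO {table} (x) VALUES (?)", [(i,) for i in range(n_rows)]
    )


def max_ids(set_op):
    """Run the three-way max(id) query with the given set operator."""
    sql = f"""
        SELECT 'up.id' AS K, max(id) AS V FROM updates
        {set_op}
        SELECT 'sc.id' AS K, max(id) AS V FROM chat_staff
        {set_op}
        SELECT 'cl.id' AS K, max(id) AS V FROM change_log
        ORDER BY K
    """
    return conn.execute(sql).fetchall()


union_rows = max_ids("UNION")
union_all_rows = max_ids("UNION ALL")
print(union_rows)
```

Because K differs in every branch, there is nothing for UNION to deduplicate, so UNION ALL returns the same rows and skips that step. The empty table comes back as NULL (None in Python) rather than 0.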
