Given the following table:
CREATE TABLE `test` (
`a` int(255) unsigned NOT NULL AUTO_INCREMENT,
`b` varchar(64) NOT NULL DEFAULT '' ,
`c` varchar(32) NOT NULL DEFAULT '' ,
`d` varchar(32) NOT NULL DEFAULT '' ,
PRIMARY KEY (`a`),
KEY `b` (`b`) USING BTREE,
KEY `c` (`c`(19)) USING BTREE,
KEY `d` (`d`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='test';
insert test(a,b,c,d) values('1','1','1','1');
insert test(a,b,c,d) values('2','2','2','1');
insert test(a,b,c,d) values('3','3','3','1');
insert test(a,b,c,d) values('4','4','4','1');
I don't know which index the following SQL uses, but I understand that the InnoDB engine uses only one index per table in a query like this.
explain select * from test where b='2' and c='2' and d='2';
I executed the above SQL in MySQL, and EXPLAIN shows that the statement uses index `b`. Is there a rule that decides this? Or does the optimizer have rules that simply were not applied here?
In theory, MySQL would choose the index on the column that is most restrictive -- that is, the one that matches the fewest rows.
But for your query, you want an index that has all three columns, b, c, and d in any order.
For
where b='2' and c='2' and d='2';
have
INDEX(b,c,d) -- with the columns in any order
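A minimal sketch of that suggestion against the `test` table from the question. The index name `b_c_d` is made up for illustration, and on a four-row table the optimizer may still pick a different plan, so the EXPLAIN output should be checked rather than assumed.

```sql
-- Composite index covering all three equality tests (name is illustrative).
ALTER TABLE test ADD INDEX b_c_d (b, c, d);

-- Check which index the optimizer now reports in the `key` column.
EXPLAIN SELECT * FROM test WHERE b = '2' AND c = '2' AND d = '2';
```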
Another issue. Don't use "prefix" indexing without a good reason:
KEY `c` (`c`(19)) USING BTREE,
It is mostly useless.
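If the prefix is not needed, one way to replace it with a full-column index is sketched below. It assumes the index is still wanted at all; if a composite index starting with another column already serves the queries, dropping `c` entirely may be enough.

```sql
-- Replace the 19-character prefix index with an index on the whole column.
ALTER TABLE test
  DROP INDEX c,
  ADD INDEX c (c);
```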
Here are some guidelines: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Related
I have posts and websites (and connecting post_websites). Each post can be on multiple websites, and some websites share the content, so I am trying to access the posts which are attached to particular website IDs.
In most cases WHERE IN works fine, but for some websites the query is very slow, and I can't work out what makes the difference.
SELECT *
FROM `posts`
WHERE `posts`.`id` IN (
SELECT `post_websites`.`post_id`
FROM `post_websites`
WHERE `website_id` IN (
12054,
19829,
2258,
253
)
) AND
`status` = 1 AND
`posts`.`deleted_at` IS NULL
ORDER BY `post_date` DESC
LIMIT 6
Explain
| select_type | table | type | key | key_len | ref | rows | Extra |
|---|---|---|---|---|---|---|---|
| SIMPLE | post_websites | range | post_websites_website_id_index | 4 | NULL | 440 | Using index condition; Using temporary; Using filesort; Start temporary |
| SIMPLE | posts | eq_ref | PRIMARY | 4 | post_websites.post_id | 1 | Using where; End temporary |
Other version with EXISTS
SELECT *
FROM `posts`
WHERE EXISTS (
SELECT `post_websites`.`post_id`
FROM `post_websites`
WHERE `website_id` IN (
12054,
19829,
2258,
253
) AND
`posts`.`id` = `post_websites`.`post_id`
) AND
`status` = 1 AND
`deleted_at` IS NULL
ORDER BY `post_date` DESC
LIMIT 6
EXPLAIN:
| select_type | table | type | key | key_len | ref | rows | Extra |
|---|---|---|---|---|---|---|---|
| PRIMARY | posts | index | post_date_index | 5 | NULL | 12 | Using where |
| DEPENDENT SUBQUERY | post_websites | ref | post_id_website_id_unique | 4 | post.id | 1 | Using where; Using index |
Long story short: depending on the number of posts on each site and the number of websites sharing content, the query time ranges from 20 ms to 50 s!
Based on the EXPLAIN, the EXISTS version looks better, but in practice it can be very slow when the subquery matches only a few rows.
Is there a query I am missing that could work like a charm for all cases? Or should I check something before querying and choose the method of doing so dynamically?
migrations:
CREATE TABLE `posts` (
`id` int(10) UNSIGNED NOT NULL,
`title` varchar(225) COLLATE utf8_unicode_ci NOT NULL,
`description` varchar(500) COLLATE utf8_unicode_ci NOT NULL,
`post_date` timestamp NULL DEFAULT NULL,
`status` tinyint(4) NOT NULL DEFAULT '1',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
ALTER TABLE `posts`
ADD PRIMARY KEY (`id`),
ADD KEY `created_at_index` (`created_at`) USING BTREE,
ADD KEY `status_deleted_at_index` (`status`,`deleted_at`) USING BTREE,
ADD KEY `post_date_index` (`post_date`) USING BTREE,
ADD KEY `id_post_date_status_deleted_at` (`id`,`post_date`,`status`,`deleted_at`) USING BTREE;
CREATE TABLE `post_websites` (
`post_id` int(10) UNSIGNED NOT NULL,
`website_id` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
ALTER TABLE `post_websites`
ADD PRIMARY KEY (`website_id`, `post_id`),
ADD UNIQUE KEY `post_id_website_id_unique` (`post_id`,`website_id`),
ADD KEY `website_id_index` (`website_id`),
ADD KEY `post_id_index` (`post_id`);
eloquent:
$news = Post::select(['title', 'description'])
    ->where('status', 1)
    ->whereExists(
        function ($query) use ($sites) {
            $query->select('post_websites.post_id')
                ->from('post_websites')
                ->whereIn('website_id', $sites)
                ->whereRaw('post_websites.post_id = posts.id');
        })
    ->orderBy('post_date', 'desc')
    ->limit(6)
    ->get();
or
$q->whereIn('posts.id',
function ($query) use ($sites) {
$query->select('post_websites.post_id')
->from('post_websites')
->whereIn('website_id', $sites);
});
Thanks.
Many:many table: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
That says to get rid of id (because it slows things down), promote that UNIQUE to be the PK, and add an INDEX in the opposite direction.
Don't use IN ( SELECT ... ). A simple JOIN is probably the best alternative here.
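A sketch of what the JOIN form might look like for the query in the question. The DISTINCT is an assumption added here: a post attached to more than one of the listed websites would otherwise appear once per matching row in `post_websites`. Whether this is faster than the IN or EXISTS versions depends on the data, so it is worth comparing EXPLAIN output and timings for both a small and a large website.

```sql
-- JOIN instead of IN ( SELECT ... ); DISTINCT removes posts matched via several websites.
SELECT DISTINCT p.*
FROM post_websites AS pw
JOIN posts AS p ON p.id = pw.post_id
WHERE pw.website_id IN (12054, 19829, 2258, 253)
  AND p.status = 1
  AND p.deleted_at IS NULL
ORDER BY p.post_date DESC
LIMIT 6;
```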
Did some 3rd party package provide those 3 TIMESTAMPs for each table? Are they ever used? Get rid of them.
KEY `id_post_date_status_deleted_at` (`id`,`post_date`,`status`,`deleted_at`) USING BTREE;
is mostly backward. Some rules:
Don't start an index with the PRIMARY KEY column(s).
Do start an index with = tests: status,deleted_at
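Applied to the `posts` table, those two rules could lead to something like the sketch below. The new index name is hypothetical, and putting `post_date` last is an assumption aimed at the `ORDER BY post_date DESC` in the question; whether the optimizer actually uses it to avoid a sort should be confirmed with EXPLAIN.

```sql
-- Drop the index that starts with the primary key column.
-- Add one that starts with the equality tests, then the sort column.
ALTER TABLE posts
  DROP INDEX id_post_date_status_deleted_at,
  ADD INDEX status_deleted_at_post_date (status, deleted_at, post_date);
```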
I have two tables
CREATE TABLE `city_landmark` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`location` geometry NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`),
SPATIAL KEY `spatial_index1` (`location`)
) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=latin1
CREATE TABLE `device_locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`location` geometry NOT NULL,
PRIMARY KEY (`id`),
SPATIAL KEY `spatial_index_2` (`location`)
) ENGINE=InnoDB AUTO_INCREMENT=1000004 DEFAULT CHARSET=latin1
City landmark rows: 10000
Device locations rows: 1000002
I want to find out how many rows in 'device_locations' are within a certain proximity of each city landmark.
SELECT *,
ifnull(
(
SELECT 1
FROM city_landmark cl force INDEX (spatial_index1)
where st_within(cl.location, st_buffer(dl.location, 1, st_buffer_strategy('point_circle', 6)) ) limit 1), 0) 'in_range'
FROM device_locations dl
LIMIT 200;
This is really slow for some reason. Please suggest a better method?
For some reason it makes no difference if spatial_index1 is used or not.
With index: 2.067 seconds
Without index: 2.016 seconds
I'm not familiar with mysql spatial, I use postgresql with postgis. But I will speculate a little bit.
My guess is that because st_buffer has to be calculated for every row, the query cannot benefit from the index. The same is true of a regular index when you wrap the indexed column in a function.
So if your city location is a point geometry, add another field city_perimeter and fill it with the result of st_buffer. Then you can create a spatial index on city_perimeter.
Your query should become:
SELECT c.id, count(*)
FROM city_landmark c
JOIN device_locations d
ON st_within(d.location, c.city_perimeter)
GROUP BY c.id
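The query above needs the city_perimeter column to exist first. A rough sketch of that preparation step follows, assuming MySQL 5.7 or later (where InnoDB supports spatial indexes) and reusing the buffer radius and strategy from the question. The index name is made up. The column is added as nullable, filled, and only then made NOT NULL, because a spatial index requires a NOT NULL column.

```sql
-- Add the precomputed buffer column (nullable until it is filled).
ALTER TABLE city_landmark ADD COLUMN city_perimeter GEOMETRY NULL;

-- Fill it once instead of calling ST_Buffer for every row at query time.
UPDATE city_landmark
SET city_perimeter = ST_Buffer(location, 1, ST_Buffer_Strategy('point_circle', 6));

-- Spatial indexes need NOT NULL columns; the index name is illustrative.
ALTER TABLE city_landmark
  MODIFY city_perimeter GEOMETRY NOT NULL,
  ADD SPATIAL INDEX spatial_perimeter (city_perimeter);
```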
I have a table that has a composite PK.
CREATE TABLE `tag_value_copy` (
`tag_id` INT(11) NOT NULL,
`created_at` INT(11) NOT NULL,
`value` FLOAT NULL DEFAULT NULL,
PRIMARY KEY (`tag_id`, `created_at`)
)
COLLATE='utf8_unicode_ci'
ENGINE=InnoDB
ROW_FORMAT=COMPACT;
When I execute following query
DELETE FROM tag_value_copy WHERE (tag_id, created_at) IN ((1,2), (2,3), ..., (5,6))
mysql does not use index and goes through all rows. But why?
EXPLAIN SELECT * FROM tag_value_copy WHERE (tag_id,created_at) in ((1,1518136666), (2,1518154836)) does NOT use an index either.
UPD 1
show index from tag_value_copy
UPD 2
explain delete from tag_value_copy where (tag_id=1 and created_at=1518103037) or (tag_id=2 and created_at=1518103038)
The Why -- MySQL's optimizer does nothing toward optimizing (a, b) IN ((1,2), ...).
The Workaround -- Create a table with the pairs to delete. Then JOIN using an AND between each of the 2 columns.
None of these help: OR, FORCE INDEX.
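The workaround could be sketched as follows. The temporary table name `pairs_to_delete` is hypothetical, and the inserted pairs are just the ones spelled out in the question. The multi-table DELETE removes rows only from the aliased target `t`.

```sql
-- Staging table holding the (tag_id, created_at) pairs to remove.
CREATE TEMPORARY TABLE pairs_to_delete (
  tag_id     INT NOT NULL,
  created_at INT NOT NULL,
  PRIMARY KEY (tag_id, created_at)
);

INSERT INTO pairs_to_delete (tag_id, created_at)
VALUES (1, 2), (2, 3), (5, 6);

-- JOIN with an AND between the two columns so the composite PK can be used.
DELETE t
FROM tag_value_copy AS t
JOIN pairs_to_delete AS p
  ON t.tag_id = p.tag_id
 AND t.created_at = p.created_at;
```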
Why the heck do you have PRIMARY KEY (tag_id, created_at) ? Are you allowing the same tag to be entered multiple times?
I'm running MySQL 5.5 and found behaviour I didn't know of before.
Given this create:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name_UQ` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
With these inserts:
insert into test (name) values ('b');
insert into test (name) values ('a');
And this select:
select * from test;
MySQL does something I wasn't aware of:
2 a
1 b
It sorts automatically.
Given a table with one extra, non-unique column:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(128) DEFAULT NULL,
`other_column` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name_UQ` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And the same inserts (see above), the select (see above) gives this result:
1 b NULL
2 a NULL
Which is kind of expected.
Where is the behaviour of the first query (SQL Fiddle) documented? I'd like to see more of these peculiar things.
MySQL does not sort result sets automatically. The ordering of a result set is indeterminate unless the query specifies an order by clause.
You should never rely on any sort of "implicit" ordering. Seeing it in 1 query (or in 100) does not make it guaranteed. In fact, without an order by, the same query can return results in different orders on subsequent runs (although I'll admit that while this regularly occurs in other databases, it is unlikely in MySQL).
Instead, add the ORDER BY. Ordering by a primary key is remarkably efficient, so you don't have to worry about performance.
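A plausible explanation for the first result, offered as an assumption rather than something stated in this thread: an InnoDB secondary index also stores the primary key, so `name_UQ` alone contains both `id` and `name`, and the optimizer may read that index instead of the table, returning rows in `name` order. Once `other_column` is added the index no longer covers the query. EXPLAIN shows which index was read, and an explicit ORDER BY makes the order deterministic either way.

```sql
-- The `key` column shows which index the optimizer chose for the scan.
EXPLAIN SELECT * FROM test;

-- Explicit ordering; never depends on the access path.
SELECT * FROM test ORDER BY id;
```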
I have the following MySQL table:
CREATE TABLE `my_data` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`tstamp` int(10) unsigned NOT NULL DEFAULT '0',
`name` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I want to optimise for the following SELECT query only:
SELECT `id`, `name` FROM `my_data` ORDER BY `name` ASC
Will adding the following index increase performance, regardless of the size of the table?
CREATE INDEX `idx_name_id` ON `my_data` (`name`,`id`);
An EXPLAIN query suggests it would, but I have no quick way of testing with a large data set.
Yes it would. Even though you are still doing a full table scan, having the index will make the sort operation (due to the order by) unnecessary.
But it will also add overhead to INSERT and DELETE statements!
Indexes are useful when you use the column in the WHERE clause, or when the column is part of the join condition between two tables. See MySQL Doc
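One way to check the claim about the sort on a real table is to compare EXPLAIN output before and after creating the index from the question. If the index is used for ordering, the `Extra` column would be expected to show "Using index" and no "Using filesort", but the optimizer's choice can depend on version and table statistics, so treat that as something to verify.

```sql
-- Index from the question: sort column first, then the other selected column.
CREATE INDEX idx_name_id ON my_data (name, id);

-- Inspect `key` and `Extra` to see whether the sort step disappeared.
EXPLAIN SELECT id, name FROM my_data ORDER BY name ASC;
```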