Nested "select ... in" performance is slow - how to fix? - mysql

Here I have a simple join query. If the first two (inner) queries return results, the whole query finishes in about 0.3 seconds, but if they don't return any rows, the whole query takes more than half a minute. What causes this difference? How can I fix this problem and improve the performance?
SELECT * FROM music WHERE id IN
(
SELECT id FROM music_tag_map WHERE tag_id IN
(
SELECT id FROM tag WHERE content ='xxx'
)
)
LIMIT 10
Here's the table structure:
CREATE TABLE `tag` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index2` (`content`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `music` (
`id` int(7) NOT NULL AUTO_INCREMENT,
`name` varchar(500) NOT NULL,
`othername` varchar(200) DEFAULT NULL,
`player` varchar(3000) DEFAULT NULL,
`genre` varchar(100) DEFAULT NULL,
`sounds` text,
`create_time` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `player` (`player`(255)),
KEY `name` (`othername`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `music_tag_map` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`music_id` int(7) NOT NULL,
`tag_id` int(7) NOT NULL,
`times` int(11) DEFAULT '1',
PRIMARY KEY (`id`),
KEY `music_id` (`music_id`),
KEY `tag_id` (`tag_id`),
CONSTRAINT `music_tag_map_ibfk_1` FOREIGN KEY (`id`) REFERENCES `music` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `music_tag_map_ibfk_2` FOREIGN KEY (`tag_id`) REFERENCES `tag` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

There are no joins in that query; there are two sub-selects.
A joined query would be:
SELECT *
FROM music
JOIN music_tag_map ON music.id=music_tag_map.id
JOIN tag ON music_tag_map.tag_id=tag.id
WHERE tag.content = ?
LIMIT 10;
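Running EXPLAIN on each query will show you why the join performs better than the sub-select. With the sub-selects, MySQL scans the entire music table (the primary query) and checks the sub-select for each row; this is typical of MySQL versions before 5.6, which run IN sub-selects as dependent subqueries. That also explains the timing difference: with LIMIT 10 the scan can stop after ten matches, but when nothing matches it has to go through the whole table. With the joins, the optimizer can pick the order in which to scan the tables, so MySQL can use indices to get only the needed rows from all the tables.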
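To check that the two forms return the same rows, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows, the trimmed-down columns, and the helper name music_ids are assumptions made for illustration, and SQLite's planner is not MySQL's, so the printed plans only show the idea of comparing plans, not what MySQL would choose.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables; only the columns the
# queries touch are created, and the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, content TEXT NOT NULL UNIQUE);
CREATE TABLE music (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE music_tag_map (
    id INTEGER PRIMARY KEY, music_id INTEGER NOT NULL, tag_id INTEGER NOT NULL);
CREATE INDEX map_tag_id ON music_tag_map (tag_id);
INSERT INTO tag VALUES (1, 'rock'), (2, 'jazz');
INSERT INTO music VALUES (1, 'song a'), (2, 'song b'), (3, 'song c');
INSERT INTO music_tag_map VALUES (1, 1, 1), (2, 2, 2), (3, 3, 1);
""")

# The nested form from the question (it matches on music_tag_map.id).
SUBSELECT = """
SELECT music.id FROM music WHERE id IN (
    SELECT id FROM music_tag_map WHERE tag_id IN (
        SELECT id FROM tag WHERE content = ?))
ORDER BY music.id LIMIT 10
"""

# The joined form from the answer.
JOINED = """
SELECT music.id
FROM music
JOIN music_tag_map ON music.id = music_tag_map.id
JOIN tag ON music_tag_map.tag_id = tag.id
WHERE tag.content = ?
ORDER BY music.id LIMIT 10
"""

def music_ids(sql, tag):
    """Run one of the queries for a tag and return the matching music ids."""
    return [row[0] for row in conn.execute(sql, (tag,))]

rock_sub = music_ids(SUBSELECT, "rock")
rock_join = music_ids(JOINED, "rock")
missing_sub = music_ids(SUBSELECT, "xxx")
missing_join = music_ids(JOINED, "xxx")
print(rock_sub, rock_join, missing_sub, missing_join)

# Print SQLite's plan for each form (in MySQL you would use EXPLAIN instead).
for label, sql in (("sub-select", SUBSELECT), ("join", JOINED)):
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql, ("rock",)):
        print(label, row)
```

Note that the join can return a music row more than once if the mapping table holds several matching rows for it; here that cannot happen because the join is on the mapping table's primary key.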

Related

How to order by calculated column in mysql

Ordering by a calculated column is very slow, as in this query:
select
`users`.`id`,
`username`,
`about_me`,
(
select
count(*)
from
`interests`
inner join `users_interests` on `interests`.`id` = `users_interests`.`interest_id`
where
`users`.`id` = `users_interests`.`user_id`
and `interest_id` in (1, 2, 4) -- these are the ids of the interests the authenticated user has chosen
) as `interests_count`
from
`users`
order by
`interests_count` desc
So in this query I am ordering users by how many interests they share with the authenticated user. It is very slow on large tables. Any ideas for an alternative and much faster solution? Thanks in advance.
EDIT
Users Table
CREATE TABLE `users` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`roles_id` bigint(20) unsigned NOT NULL DEFAULT 2,
`username` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
`about_me` varchar(70) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`email` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`status` tinyint(1) NOT NULL DEFAULT 0,
`email_verified_at` timestamp NULL DEFAULT NULL,
`password` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `users_username_unique` (`username`),
UNIQUE KEY `users_email_unique` (`email`),
KEY `users_roles_id_foreign` (`roles_id`),
CONSTRAINT `users_roles_id_foreign` FOREIGN KEY (`roles_id`) REFERENCES
`user_roles` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=50102 DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci
Interests Table
CREATE TABLE `interests` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`interest_category_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `interests_interest_category_id_foreign` (`interest_category_id`),
CONSTRAINT `interests_interest_category_id_foreign` FOREIGN KEY
(`interest_category_id`) REFERENCES `interests_categories` (`id`) ON DELETE
CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=61 DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci
Pivot table between Users and Interests
CREATE TABLE `users_interests` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`interest_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `users_interests_user_id_foreign` (`user_id`),
KEY `users_interests_interest_id_foreign` (`interest_id`),
CONSTRAINT `users_interests_interest_id_foreign` FOREIGN KEY (`interest_id`) REFERENCES `interests` (`id`) ON DELETE CASCADE,
CONSTRAINT `users_interests_user_id_foreign` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=251552 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Without test data it's difficult to know whether this would be faster than your current query, but you might find a Common Table Expression (a.k.a. a WITH clause, available in MySQL 8.0+ and MariaDB 10.2+) helpful here:
with cteCount AS (select ui.`user_id`,
count(*) as `interests_count`
from `interests` i
inner join `users_interests` ui
on i.`id` = ui.`interest_id`
where ui.`interest_id` in (1, 2, 4)
group by ui.`user_id`)
select u.`id`,
`username`,
`about_me`,
cc.`interests_count`
from `users` u
inner join cteCount cc
on cc.`user_id` = u.`id`
order by
cc.`interests_count` desc
The performance of this query will be affected by the presence or absence of indexes on the appropriate columns. Just glancing at it I'd say you should have the following indexes available:
interests
Index 1: id
users_interests
Index 1: interest_id
users
Index 1: id
Try to directly join and aggregate.
SELECT u.id,
u.username,
u.about_me,
count(ui.interest_id) interests_count
FROM users u
LEFT JOIN users_interests ui
ON ui.user_id = u.id
AND ui.interest_id IN (1, 2, 4)
GROUP BY u.id,
u.username,
u.about_me
ORDER BY interests_count DESC;
Additionally create an index supporting the aggregation.
CREATE INDEX users_id_username_about_me
ON users
(id,
username,
about_me);
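The join-and-aggregate approach can be checked against the original correlated sub-query on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows are invented, the composite index on (user_id, interest_id) is an assumption of mine rather than something either answer proposes, and the ORDER BY is there because the question asks for users sorted by the count.

```python
import sqlite3

# In-memory SQLite stand-in; table and column names follow the question,
# the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL, about_me TEXT);
CREATE TABLE users_interests (
    id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, interest_id INTEGER NOT NULL);
CREATE INDEX ui_user_interest ON users_interests (user_id, interest_id);
INSERT INTO users VALUES (1, 'ann', NULL), (2, 'bob', NULL), (3, 'cy', NULL);
INSERT INTO users_interests (user_id, interest_id)
VALUES (1, 1), (1, 2), (1, 4), (2, 2), (2, 9), (3, 7);
""")

# Correlated sub-query, as in the question (one count per user row).
CORRELATED = """
SELECT u.id, u.username,
       (SELECT COUNT(*) FROM users_interests ui
         WHERE ui.user_id = u.id AND ui.interest_id IN (1, 2, 4)) AS interests_count
FROM users u
ORDER BY interests_count DESC, u.id
"""

# Join and aggregate: the LEFT JOIN keeps users with no matching interest,
# and COUNT(ui.interest_id) ignores the NULLs produced for them.
JOINED = """
SELECT u.id, u.username, COUNT(ui.interest_id) AS interests_count
FROM users u
LEFT JOIN users_interests ui
       ON ui.user_id = u.id AND ui.interest_id IN (1, 2, 4)
GROUP BY u.id, u.username
ORDER BY interests_count DESC, u.id
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
print(correlated_rows)
print(joined_rows)
```

Both queries rank the users the same way here; whether the join is actually faster on the real tables is something only EXPLAIN and timing on real data can show.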

MySQL Temporary Table with Group By and Group Concat Extremely Slow

I'm trying to build out a fairly simple temporary table. The table will end up being 2 columns:
1. A product ID
2. A string of concatenated compliance data in a format that can be consumed later
CREATE TEMPORARY TABLE products.compliances_data
(INDEX product_id_idx (product_id))
SELECT
products.product_id,
GROUP_CONCAT(JSON_OBJECT('compliance_code', cc.compliance_code, 'compliance_full_name', pc.full_name, 'compliance_web_description_short', pc.web_description_short) SEPARATOR ' - ') as compliances
FROM products.products products
LEFT OUTER JOIN products.material_compliance_map cc on products.material_id = cc.material_id
LEFT OUTER JOIN products.compliances pc on cc.compliance_id = pc.compliance_id
GROUP BY products.part_number
The table gathers and joins information from 3 tables.
1. The products table is the main source of information. It has a column for material_id.
2. The material_compliance_map table is a many-to-many table that maps material_ids to compliance_ids.
3. The compliances table is where the actual data is stored that I need to pull from to build out the concatenated object.
The problem is that creating the temporary table takes 3-4 minutes, yet it yields only about 1.2 million entries, which seems incredibly slow.
Running an explain on the select portion of the query yields:
[Image: EXPLAIN output for the SELECT]
There are only 642 entries in material_compliance_map, so there's not an awful lot of data here that needs to be traversed.
I've tried removing the GROUP_CONCAT, and that seems to speed up the query by about 33%. The problem seems to revolve around the GROUP BY clause.
How can I improve the speed when building this temp table?
EDIT:
Schemas:
material_compliance_map schema
CREATE TABLE `material_compliance_map` (
`material_id` int(11) NOT NULL,
`material_code` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`compliance_id` int(11) NOT NULL,
`compliance_code` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`material_id`,`compliance_id`),
KEY `fk_compliance_id_material_compliances_compliances` (`compliance_id`),
KEY `fk_material_id_material_compliances_materials` (`material_id`),
CONSTRAINT `fk_compliance_id_material_compliances_compliances` FOREIGN KEY (`compliance_id`) REFERENCES `compliances` (`compliance_id`) ON UPDATE CASCADE,
CONSTRAINT `fk_material_id_material_compliances_materials` FOREIGN KEY (`material_id`) REFERENCES `materials` (`material_id`) ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
compliances schema:
CREATE TABLE `compliances` (
`compliance_id` int(11) NOT NULL AUTO_INCREMENT,
`compliance_code` varchar(20) NOT NULL,
`full_name` varchar(155) DEFAULT NULL,
`web_description_short` varchar(45) DEFAULT NULL,
PRIMARY KEY (`compliance_id`),
UNIQUE KEY `compliance_id_UNIQUE` (`compliance_id`),
UNIQUE KEY `compliance_code_UNIQUE` (`compliance_code`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=latin1
products schema:
CREATE TABLE `products` (
`part_number` varchar(27) NOT NULL,
`material_code` varchar(30) DEFAULT NULL,
`material_id` int(11) DEFAULT NULL,
`size_code` varchar(15) DEFAULT NULL,
`size_id` int(11) DEFAULT NULL,
`erp_description_1` varchar(31) DEFAULT NULL,
`erp_description_2` varchar(31) DEFAULT NULL,
`search_description` varchar(250) DEFAULT NULL,
`weight_lbs` decimal(8,4) DEFAULT NULL,
`part_number_prefix` varchar(15) DEFAULT NULL,
`tight_tolerance` tinyint(4) DEFAULT NULL,
`product_id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`product_id`),
UNIQUE KEY `part_number_UNIQUE` (`part_number`),
UNIQUE KEY `product_id_UNIQUE` (`product_id`),
KEY `fk_material_id_products_materials` (`material_id`),
KEY `fk_size_id_products_sizes` (`size_id`),
KEY `product_id` (`product_id`),
KEY `size_id_idx` (`size_id`),
KEY `size_id_productsidx` (`size_id`),
KEY `material_id_idx` (`material_id`),
CONSTRAINT `fk_material_id_products_materials` FOREIGN KEY (`material_id`) REFERENCES `materials` (`material_id`),
CONSTRAINT `fk_size_id_products_sizes` FOREIGN KEY (`size_id`) REFERENCES `sizes` (`size_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1140987 DEFAULT CHARSET=latin1

MariaDB: Multiple joins and where clause searching for 1 in TINYINT Column not working

I have tables called person, book, image and imagehit.
Person has id and name; Book has id, owner_id and info; Image has id, owner_id, url and thumbnail, which is a TINYINT (about half the rows hold 0 and the other half hold 1). By the way, the image table stores the cover of each book in two versions: a big one and a thumbnail. The imagehit table stores the number of times the book has been retrieved from the database and has a column hits.
So I tried multiple INNER JOINs to retrieve all the thumbnails for the most popular books. The SQL query is the following:
SELECT `imagehit`.`hits`, `person`.`name`, `book`.`info`, `image`.`url`, `image`.`thumbnail` FROM `imagehit`
INNER JOIN `person` ON `person`.`id`=`book`.`owner_id`
INNER JOIN `image` ON `image`.`owner_id`=`book`.`id`
ORDER BY `imagehit`.`hits` DESC
WHERE `image`.`thumbnail`=1
LIMIT 10;
And that doesn't work, even though half the rows have 1 in image.thumbnail. If I change the following line:
WHERE `image`.`thumbnail`=1
To
WHERE `image`.`thumbnail`=0
it does work. So I went to the image table and ran a simple query like the following:
SELECT * FROM `image` WHERE `image`.`thumbnail`=0;
and it returned every row stored in the table. But when I browse the image table in phpMyAdmin, I see there are 1s stored in the table. :(
Any ideas why this happens? Thank you in advance.
Table definitions:
CREATE TABLE `image` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`owner_id` int(11) NOT NULL,
`thumbnail` tinyint(1) NOT NULL,
`url` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `image_url` (`url`),
KEY `image_owner_id` (`owner_id`),
CONSTRAINT `image_ibfk_1` FOREIGN KEY (`owner_id`) REFERENCES `book` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1450 DEFAULT CHARSET=utf8
CREATE TABLE `person` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`url` varchar(60) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `person_url` (`url`),
) ENGINE=InnoDB AUTO_INCREMENT=6287 DEFAULT CHARSET=utf8
CREATE TABLE `book` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`owner_id` int(11) NOT NULL,
`book` varchar(3000) NOT NULL,
`info` varchar(3000) NOT NULL,
`url` varchar(60) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `book_url` (`url`),
KEY `book_owner_id` (`owner_id`),
CONSTRAINT `book_ibfk_1` FOREIGN KEY (`owner_id`) REFERENCES `person` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=725 DEFAULT CHARSET=utf8
CREATE TABLE `imagehit` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`owner_id` int(11) NOT NULL,
`person_id` int(11) NOT NULL,
`hits` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `imagehit_person_id` (`person_id`),
KEY `imagehit_owner_id` (`owner_id`),
KEY `hits` (`hits`),
CONSTRAINT `imagehit_ibfk_1` FOREIGN KEY (`owner_id`) REFERENCES `image` (`id`),
CONSTRAINT `imagehit_ibfk_2` FOREIGN KEY (`person_id`) REFERENCES `person` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=725 DEFAULT CHARSET=utf8
Proof I'm not crazy:
I inserted the data using Peewee. When I created each row, I set thumbnail=True if the image was a thumbnail and thumbnail=False if it wasn't. The thumbnail column is a BooleanField in Peewee.
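For reference, here is a minimal sketch of a well-formed version of that query, run against Python's built-in sqlite3 module as a stand-in for MariaDB. The join path (imagehit to image to book to person) is my reading of the foreign keys in the table definitions, the sample rows are invented, and the sketch does not by itself explain why thumbnail=1 matches nothing in the real table. Two things differ from the query as posted: WHERE comes before ORDER BY, which SQL requires, and book is actually joined before its columns are used. The last query counts the rows per stored thumbnail value, which is one way to check what the table really contains.

```python
import sqlite3

# In-memory SQLite stand-in; columns are trimmed to the ones the query uses
# and the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES person (id),
    info TEXT NOT NULL);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES book (id),
    thumbnail INTEGER NOT NULL,
    url TEXT NOT NULL UNIQUE);
CREATE TABLE imagehit (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES image (id),
    hits INTEGER NOT NULL);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO book VALUES (1, 1, 'book one'), (2, 2, 'book two');
INSERT INTO image VALUES
    (1, 1, 0, 'one-big.jpg'), (2, 1, 1, 'one-thumb.jpg'),
    (3, 2, 0, 'two-big.jpg'), (4, 2, 1, 'two-thumb.jpg');
INSERT INTO imagehit VALUES (1, 1, 50), (2, 2, 5), (3, 3, 7), (4, 4, 20);
""")

# Clause order: FROM/JOIN, then WHERE, then ORDER BY, then LIMIT.
TOP_THUMBNAILS = """
SELECT imagehit.hits, person.name, book.info, image.url, image.thumbnail
FROM imagehit
INNER JOIN image ON image.id = imagehit.owner_id
INNER JOIN book ON book.id = image.owner_id
INNER JOIN person ON person.id = book.owner_id
WHERE image.thumbnail = 1
ORDER BY imagehit.hits DESC
LIMIT 10
"""

top_thumbnails = conn.execute(TOP_THUMBNAILS).fetchall()

# Diagnostic: how many rows are stored for each thumbnail value?
thumbnail_counts = conn.execute(
    "SELECT thumbnail, COUNT(*) FROM image GROUP BY thumbnail ORDER BY thumbnail"
).fetchall()

print(top_thumbnails)
print(thumbnail_counts)
```

If the same GROUP BY query on the real table reports only 0, the 1s seen in phpMyAdmin are worth a second look (a different database, table, or column, for example).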

How to optimize this mysql query - explain output included

This is the query (a search query basically, based on tags):-
select
SUM(DISTINCT(ttagrels.id_tag in (2105,2120,2151,2026,2046) )) as key_1_total_matches, td.*, u.*
from Tutors_Tag_Relations AS ttagrels
Join Tutor_Details AS td ON td.id_tutor = ttagrels.id_tutor
JOIN Users as u on u.id_user = td.id_user
where (ttagrels.id_tag in (2105,2120,2151,2026,2046)) group by td.id_tutor HAVING key_1_total_matches = 1
And following is the database dump needed to execute this query:-
CREATE TABLE IF NOT EXISTS `Users` (
`id_user` int(10) unsigned NOT NULL auto_increment,
`id_group` int(11) NOT NULL default '0',
PRIMARY KEY (`id_user`),
KEY `Users_FKIndex1` (`id_group`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=730 ;
INSERT INTO `Users` (`id_user`, `id_group`) VALUES
(303, 1);
CREATE TABLE IF NOT EXISTS `Tutor_Details` (
`id_tutor` int(10) unsigned NOT NULL auto_increment,
`id_user` int(10) NOT NULL default '0',
PRIMARY KEY (`id_tutor`),
KEY `Users_FKIndex1` (`id_user`),
KEY `id_user` (`id_user`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=58 ;
INSERT INTO `Tutor_Details` (`id_tutor`, `id_user`) VALUES
(26, 303);
CREATE TABLE IF NOT EXISTS `Tags` (
`id_tag` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(255) default NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=2957 ;
INSERT INTO `Tags` (`id_tag`, `tag`) VALUES
(2026, 'Brendan.\nIn'),
(2046, 'Brendan.'),
(2105, 'Brendan'),
(2120, 'Brendan''s'),
(2151, 'Brendan)');
CREATE TABLE IF NOT EXISTS `Tutors_Tag_Relations` (
`id_tag` int(10) unsigned NOT NULL default '0',
`id_tutor` int(10) unsigned default NULL,
`tutor_field` varchar(255) default NULL,
`cdate` timestamp NOT NULL default CURRENT_TIMESTAMP,
`udate` timestamp NULL default NULL,
KEY `Tutors_Tag_Relations` (`id_tag`),
KEY `id_tutor` (`id_tutor`),
KEY `id_tag` (`id_tag`),
KEY `id_tutor_2` (`id_tutor`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Tutors_Tag_Relations` (`id_tag`, `id_tutor`, `tutor_field`, `cdate`, `udate`) VALUES
(2105, 26, 'firstname', '2010-06-17 17:08:45', NULL);
ALTER TABLE `Tutors_Tag_Relations`
ADD CONSTRAINT `Tutors_Tag_Relations_ibfk_2` FOREIGN KEY (`id_tutor`) REFERENCES `Tutor_Details` (`id_tutor`) ON DELETE NO ACTION ON UPDATE NO ACTION,
ADD CONSTRAINT `Tutors_Tag_Relations_ibfk_1` FOREIGN KEY (`id_tag`) REFERENCES `Tags` (`id_tag`) ON DELETE NO ACTION ON UPDATE NO ACTION;
What does the query do?
This query searches for tutors whose data contains "Brendan" (in their name, biography or something similar). The id_tags 2105,2120,2151,2026,2046 are simply the tags which are LIKE "%Brendan%".
My questions are:
1. In the EXPLAIN output for this query, the ref column shows NULL for ttagrels, even though there are possible keys (Tutors_Tag_Relations, id_tutor, id_tag, id_tutor_2). Why is no key being used? How can I make the query use one? Is it possible at all?
2. The other two tables, td and u, are using references. Is any additional indexing needed on those? I think not.
Check the explain query output here
http://www.test.examvillage.com/explain.png
Don't analyze database performance with a single record in the table; with so few rows the optimizer may simply scan the table instead of using an index. Create at least 100 records.
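Generating that test data can be scripted. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL: it fills the three tables with 200 tutors and 400 tag relations, then prints the plan for a simplified form of the search query. The row counts, the way tag ids are assigned, and the simplified SELECT are all assumptions for illustration, and SQLite's planner is not MySQL's, so on the real server you would load comparable data and run EXPLAIN on the real query.

```python
import sqlite3

# In-memory SQLite stand-in with the tables trimmed to the joined columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id_user INTEGER PRIMARY KEY, id_group INTEGER NOT NULL DEFAULT 0);
CREATE TABLE Tutor_Details (id_tutor INTEGER PRIMARY KEY, id_user INTEGER NOT NULL);
CREATE TABLE Tutors_Tag_Relations (id_tag INTEGER NOT NULL, id_tutor INTEGER);
CREATE INDEX rel_id_tag ON Tutors_Tag_Relations (id_tag);
CREATE INDEX rel_id_tutor ON Tutors_Tag_Relations (id_tutor);
""")

# Synthetic data: 200 users and tutors, two tag relations per tutor.
conn.executemany("INSERT INTO Users VALUES (?, 1)", [(i,) for i in range(1, 201)])
conn.executemany("INSERT INTO Tutor_Details VALUES (?, ?)",
                 [(i, i) for i in range(1, 201)])
relations = []
for i in range(1, 201):
    relations.append((2000 + i % 200, i))  # tag ids 2000..2199, one per tutor
    relations.append((2300 + i, i))        # a second tag outside the searched ids
conn.executemany(
    "INSERT INTO Tutors_Tag_Relations (id_tag, id_tutor) VALUES (?, ?)", relations)

# Simplified form of the search: which tutors carry one of the searched tags?
SEARCH = """
SELECT td.id_tutor
FROM Tutors_Tag_Relations AS ttagrels
JOIN Tutor_Details AS td ON td.id_tutor = ttagrels.id_tutor
JOIN Users AS u ON u.id_user = td.id_user
WHERE ttagrels.id_tag IN (2105, 2120, 2151, 2026, 2046)
GROUP BY td.id_tutor
ORDER BY td.id_tutor
"""

relation_count = conn.execute("SELECT COUNT(*) FROM Tutors_Tag_Relations").fetchone()[0]
matching_tutors = [row[0] for row in conn.execute(SEARCH)]
print(relation_count, matching_tutors)

# Inspect the plan now that the tables hold a meaningful number of rows.
for row in conn.execute("EXPLAIN QUERY PLAN " + SEARCH):
    print(row)
```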

Optimize Join sentence with foreign keys, and show records with nulls

I have the following structure
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
CREATE TABLE IF NOT EXISTS `sis_param_tax` (
`id` int(5) NOT NULL auto_increment,
`description` varchar(50) NOT NULL,
`code` varchar(5) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=7;
CREATE TABLE IF NOT EXISTS `sis_param_city` (
`id` int(4) NOT NULL auto_increment,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
CREATE TABLE IF NOT EXISTS `sis_supplier` (
`id` int(15) NOT NULL auto_increment,
`name` varchar(200) NOT NULL,
`address` varchar(200) default NULL,
`phone` varchar(30) NOT NULL,
`fk_city` int(11) default NULL,
`fk_tax` int(11) default NULL,
PRIMARY KEY (`id`),
KEY `fk_city` (`fk_city`),
KEY `fk_tax` (`fk_tax`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=2 ;
ALTER TABLE `sis_supplier`
ADD CONSTRAINT `sis_supplier_ibfk_4` FOREIGN KEY (`fk_tax`) REFERENCES `sis_param_tax` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
ADD CONSTRAINT `sis_supplier_ibfk_3` FOREIGN KEY (`fk_city`) REFERENCES `sis_param_city` (`id`) ON DELETE SET NULL ON UPDATE CASCADE;
My questions are
1. This structure allows me to have a supplier with city and tax fields = null (in case user didn't set these values). Right?
2. If I delete "X" city, supplier's fk_city with city="X" are set to null, same with fk_tax. Right?
3. I want to optimize (IF POSSIBLE) the following join statement, so I can show suppliers that have fk_city and/or fk_tax = NULL
SELECT DISTINCT
sis_supplier.id,
sis_supplier.name,
sis_supplier.telefono,
sis_supplier.address,
sis_supplier.phone,
sis_supplier.cuit,
sis_param_city.name AS city,
sis_param_tax.description AS tax,
sis_supplier.fk_city,
sis_supplier.fk_tax
FROM
sis_supplier
LEFT OUTER JOIN sis_param_city
ON
sis_supplier.`fk_city` = sis_param_city.id
LEFT OUTER JOIN `sis_param_tax`
ON
sis_supplier.`fk_tax` = `sis_param_tax`.`id`
Thanks a lot in advance,
1. Yes.
2. Yes.
3. Yes, it's good to optimize. The query you showed looks fine. How is it not working for you?
Have you analyzed the query with EXPLAIN? This can help you tell when you have a query that isn't using indexes effectively. In fact, all of Chapter 7, "Optimization," would be recommended reading.
If you want to show records with NULLs, then use a RIGHT or LEFT JOIN, depending on your needs.
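The three points can be tried out on a small data set. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB: the supplier, city and tax rows are invented, the columns are trimmed, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on. It shows a supplier stored with NULL foreign keys, the LEFT JOIN keeping such suppliers in the result, and ON DELETE SET NULL clearing fk_city when a city is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript("""
CREATE TABLE sis_param_tax (
    id INTEGER PRIMARY KEY, description TEXT NOT NULL, code TEXT NOT NULL);
CREATE TABLE sis_param_city (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sis_supplier (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    phone TEXT NOT NULL,
    fk_city INTEGER REFERENCES sis_param_city (id) ON DELETE SET NULL ON UPDATE CASCADE,
    fk_tax INTEGER REFERENCES sis_param_tax (id) ON DELETE SET NULL ON UPDATE CASCADE);
INSERT INTO sis_param_tax VALUES (1, 'VAT', 'V');
INSERT INTO sis_param_city VALUES (1, 'X'), (2, 'Y');
INSERT INTO sis_supplier VALUES
    (1, 'Acme', '111', 1, 1),
    (2, 'Bolt', '222', NULL, NULL),
    (3, 'Core', '333', 2, NULL);
""")

# LEFT JOINs keep every supplier, with NULL where no city or tax is set.
SUPPLIERS = """
SELECT s.id, s.name, c.name AS city, t.description AS tax
FROM sis_supplier s
LEFT OUTER JOIN sis_param_city c ON s.fk_city = c.id
LEFT OUTER JOIN sis_param_tax t ON s.fk_tax = t.id
ORDER BY s.id
"""

before_delete = conn.execute(SUPPLIERS).fetchall()

# Deleting city "X" sets fk_city to NULL on the suppliers that referenced it.
conn.execute("DELETE FROM sis_param_city WHERE id = 1")
after_delete = conn.execute(SUPPLIERS).fetchall()

print(before_delete)
print(after_delete)
```

With an INNER JOIN instead, supplier 2 would disappear from the result entirely, which is why the outer joins are needed here.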