Performance issue when joining two large tables in MySQL

I have a multilingual CMS that uses a translations table (70k rows) containing all of the texts:
CREATE TABLE IF NOT EXISTS `translations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`key` int(11) NOT NULL,
`lang` int(11) NOT NULL,
`value` text CHARACTER SET utf8,
PRIMARY KEY (`id`),
KEY `key` (`key`,`lang`)
) ENGINE=MyISAM
and a products table (4k rows) containing products with translation keys:
CREATE TABLE IF NOT EXISTS `products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name_trans_id` int(11) NOT NULL,
`desc_trans_id` int(11) DEFAULT NULL,
`text_trans_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name_index` (`name_trans_id`),
KEY `desc_index` (`desc_trans_id`),
KEY `text_index` (`text_trans_id`)
) ENGINE=MyISAM
Now I need to get the top 20 products in alphabetical order. To do that I use this query:
SELECT
SQL_CALC_FOUND_ROWS
dt_table.* ,
t_name.value as 'name'
FROM
products as dt_table
LEFT JOIN
`translations` as t_name on dt_table.name_trans_id = t_name.key
WHERE
(t_name.lang = 1 OR t_name.lang is null)
ORDER BY
name ASC LIMIT 0, 20
It takes forever.
Any help optimizing this query/tables will be appreciated.
Thank you.

Try changing the structure of your translations table to:
CREATE TABLE IF NOT EXISTS `translations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`key` int(11) NOT NULL,
`lang` int(11) NOT NULL DEFAULT 0,
`value` text CHARACTER SET utf8,
PRIMARY KEY (`id`),
KEY `lang` (`lang`),
KEY `key` (`key`,`lang`),
FULLTEXT idx (`value`) -- InnoDB FULLTEXT indexes require MySQL 5.6 or later
) ENGINE=InnoDB;
because lang should be indexed once you use it in a WHERE clause.
Also try changing your query a little:
SELECT
dt_table.* ,
t_name.value as 'name',
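SUBSTR(t_name.value,1,100) as text_order -- SUBSTR positions start at 1; position 0 returns an empty string
FROM
products as dt_table
LEFT JOIN (
SELECT `key`, `value` FROM `translations` -- `key` is a reserved word and must be quoted
WHERE lang = 1 -- lang is declared NOT NULL, so no IS NULL check is needed here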
) as t_name
ON dt_table.name_trans_id = t_name.key
ORDER BY
text_order ASC LIMIT 0, 20
If you really need SQL_CALC_FOUND_ROWS (I don't understand why you need a counter for translation items),
you can run another query right after the first one:
SELECT COUNT(*) FROM products;
I am pretty sure you will be surprised by the performance :-)
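The same idea as the derived table above can also be written without a subquery, by moving the language filter into the ON clause of the LEFT JOIN. This is only a sketch against the tables from the question: it should let MySQL use the existing (`key`, `lang`) index for the join, but whether it is actually faster needs checking with EXPLAIN on real data. Note that, unlike the original WHERE condition, it keeps products that have translations only in other languages (their name comes back as NULL).

```sql
-- Sketch: language filter inside the join condition instead of the WHERE clause.
-- Uses only tables and columns from the question.
EXPLAIN
SELECT
    dt_table.*,
    t_name.value AS name
FROM products AS dt_table
LEFT JOIN translations AS t_name
    ON t_name.`key` = dt_table.name_trans_id
   AND t_name.lang = 1
ORDER BY name ASC
LIMIT 0, 20;
```

Drop the EXPLAIN keyword to run the query itself once the plan shows the `key` index being used for t_name.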

Related

I need to optimize tables and queries

I have 3 tables: info, data, and link. This query fetches the data:
select *
from data,link,info
where link.info_id = info.id and link.data_id = data.id
Please suggest optimization options for:
a) the tables
b) the query.
Queries for creating the tables:
CREATE TABLE info (
id int(11) NOT NULL auto_increment,
name varchar(255) default NULL,
`desc` text default NULL, -- desc is a reserved word and must be quoted
PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=cp1251;
CREATE TABLE data (
id int(11) NOT NULL auto_increment,
date date default NULL,
value INT(11) default NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=cp1251;
CREATE TABLE link (
data_id int(11) NOT NULL,
info_id int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=cp1251;
Thanks!
Never use commas in the FROM clause. Always use proper, explicit, standard, readable JOIN syntax:
select *
from data d join
link l
on l.data_id = d.id join
info i
on l.info_id = i.id;
Second, for this query your indexes are probably fine. I would also recommend a primary key index on link:
CREATE TABLE link (
data_id int(11) NOT NULL,
info_id int(11) NOT NULL,
PRIMARY KEY (data_id, info_id)
);
This is a good idea in general, even if it is not specific to this query.
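If the link table already exists and holds data, the same change can be applied in place instead of recreating the table. The sketch below is an assumption-laden example: the second index and its name idx_info_data are not from the answer, and are added only so that lookups starting from info can also be served by an index. ADD PRIMARY KEY will fail if the table contains duplicate (data_id, info_id) pairs, so those would have to be removed first.

```sql
-- Sketch: add the primary key to an existing link table.
-- idx_info_data is a hypothetical extra index for joins that start from info.
ALTER TABLE link
    ADD PRIMARY KEY (data_id, info_id),
    ADD INDEX idx_info_data (info_id, data_id);
```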

Optimize sql query to speed up a search which currently takes around 85 seconds

I have a database with about 2.7 million records. I need to fetch records from it, and for that I am using the queries below.
For the results:
SELECT r3.original_image_title, r3.uuid, r3.original_image_URL
FROM `image_attributes` AS r1
INNER JOIN `filenames` AS r3
WHERE r1.`uuid` = r3.`uuid`
AND r3.`status` = 1
AND r1.status = 1
AND (r1.`attribute_name` LIKE "Quvenzhané Wallis%" OR r3.original_image_URL LIKE "Quvenzhané Wallis%")
GROUP BY r3.`uuid`
LIMIT 0, 20
For the total count:
SELECT COUNT(DISTINCT(r1.`uuid`)) AS count
FROM `image_attributes` AS r1
INNER JOIN `filenames` AS r3
WHERE r1.`uuid` = r3.`uuid`
AND r3.`status` = 1
AND r1.status = 1
AND (r1.`attribute_name` LIKE "Quvenzhané Wallis%" OR r3.original_image_URL LIKE "Quvenzhané Wallis%")
The table structures are as follows:
CREATE TABLE IF NOT EXISTS `image_attributes` (
`index` int(11) NOT NULL AUTO_INCREMENT,
`attribute_name` text NOT NULL,
`attribute_type` varchar(255) NOT NULL,
`uuid` varchar(255) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`index`),
KEY `attribute_type` (`attribute_type`),
KEY `uuid` (`uuid`),
KEY `status` (`status`),
KEY `attribute_name` (`attribute_name`(50))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=2730431 ;
CREATE TABLE IF NOT EXISTS `filenames` (
`index` int(11) NOT NULL AUTO_INCREMENT,
`original_image_title` text NOT NULL,
`original_image_URL` text NOT NULL,
`uuid` varchar(255) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`index`),
KEY `uuid` (`uuid`),
KEY `original_image_URL` (`original_image_URL`(50))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=591967 ;
Please suggest how I can optimize the queries to make the search faster.
I would recommend a book called 'High Performance MySQL'. There is a section called 'Optimize databases and queries', or something like that.
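As one concrete thing to try on the query from the question: an OR across columns of two different tables often prevents MySQL from using either column's index, and a common workaround is to split the OR into a UNION so that each branch filters on a single column. The sketch below is not from the book or the answer. It returns only the matching uuid values, and whether each branch actually uses the prefix indexes on attribute_name and original_image_URL has to be confirmed with EXPLAIN.

```sql
-- Sketch: one branch per LIKE condition; UNION removes duplicate uuids.
(SELECT r3.uuid
 FROM image_attributes AS r1
 INNER JOIN filenames AS r3 ON r3.uuid = r1.uuid
 WHERE r1.status = 1
   AND r3.status = 1
   AND r1.attribute_name LIKE 'Quvenzhané Wallis%')
UNION
(SELECT r3.uuid
 FROM filenames AS r3
 INNER JOIN image_attributes AS r1 ON r1.uuid = r3.uuid
 WHERE r1.status = 1
   AND r3.status = 1
   AND r3.original_image_URL LIKE 'Quvenzhané Wallis%')
LIMIT 0, 20;
```

The remaining columns can then be fetched from filenames for the 20 uuids returned.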

MySQL: optimize slow query with EXPLAIN

I'm working on MySQL 5.5.29-0ubuntu0.12.04.1.
I need to create a query that can sort results by date and by score.
I read the documentation and the posts here on Stack Overflow (specifically this) about how to optimize a query, but I'm still struggling to do it well.
The key finding is that, to avoid the use of a temporary table, the ORDER BY or GROUP BY must contain only columns from the first table in the join queue. That is why I use the STRAIGHT_JOIN clause and two slightly different queries.
To avoid confusion, I'm going to assign a number to each query configuration:
1. order by date with STRAIGHT_JOIN clause
2. order by score with STRAIGHT_JOIN clause
3. order by date without STRAIGHT_JOIN clause
4. order by score without STRAIGHT_JOIN clause
Query 1 follows; it takes about 2.5 seconds to complete:
SELECT STRAIGHT_JOIN item.id AS id
FROM item
INNER JOIN score ON item.id = score.item_id
LEFT JOIN url ON item.url_id = url.id
LEFT JOIN doc ON url.doc_id = doc.id
INNER JOIN feed ON feed.id = item.feed_id
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id
WHERE score.user_id = 1 AND user_feed.id = 7
ORDER BY zen_time DESC
LIMIT 0, 10
Query 2 follows (the first two joined tables are swapped and the ordering column is different); it takes only about 0.01 seconds to complete:
SELECT STRAIGHT_JOIN item.id AS id
FROM score
INNER JOIN item ON item.id = score.item_id
LEFT JOIN url ON item.url_id = url.id
LEFT JOIN doc ON url.doc_id = doc.id
INNER JOIN feed ON feed.id = item.feed_id
INNER JOIN user_feed ON feed.id = user_feed.feed_id AND score.user_id = user_feed.user_id
LEFT JOIN star ON item.id = star.item_id AND score.user_id = star.user_id
JOIN unseen ON item.id = unseen.item_id AND score.user_id = unseen.user_id
WHERE score.user_id = 1 AND user_feed.id = 7
ORDER BY score DESC
LIMIT 0, 10
EXPLAIN and profiler results for queries 1 to 4: (output not included)
The table definitions follow:
CREATE TABLE `doc` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`md5` char(32) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `Md5_index` (`md5`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `feed` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`url` text NOT NULL,
`title` text,
PRIMARY KEY (`id`),
FULLTEXT KEY `Title_url_index` (`title`,`url`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `item` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` bigint(20) unsigned NOT NULL,
`url_id` bigint(20) unsigned DEFAULT NULL,
`md5` char(32) NOT NULL,
PRIMARY KEY (`id`),
KEY `Md5_index` (`md5`),
KEY `Zen_time_index` (`zen_time`),
KEY `Feed_index` (`feed_id`),
KEY `Url_index` (`url_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `score` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
`score` float DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`),
KEY Score_index (`score`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `star` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `unseen` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`item_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `User_item_index` (`user_id`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `url` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`doc_id` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY Doc_index (`doc_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `user` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_Email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `user_feed` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned NOT NULL,
`feed_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `User_feed_index` (`user_id`,`feed_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here are the row counts for the tables involved in the query:
Score: 68657
Item: 197602
Url: 198354
Doc: 186113
Feed: 754
User_feed: 721
Star: 0
Unseen: 150762
Which approach should I take, given that my program needs to order results by both zen_time and score as fast as possible?
Because of the different query speeds, I decided to run a more detailed analysis based on the results I want to achieve.
I need four result sets:
1. Select all the items from a specific feed, order them by SCORE.score (intelligent order)
2. Select all the items from a specific feed, order them by ITEM.zen_time (time order)
3. Select all the items, order them by SCORE.score (intelligent order)
4. Select all the items, order them by ITEM.zen_time (time order)
So the query has to be adapted to those conditions. Its variable parts are:
STRAIGHT_JOIN yes/no
First JOIN table score/item
WHERE condition on specific feed yes/no
ORDER BY score/zen_time
All of the tests have been executed with the SELECT SQL_NO_CACHE instruction.
Following are the results:
Now it's clear what I have to do for each result set:
1. No STRAIGHT_JOIN, first JOIN table SCORE
2. No STRAIGHT_JOIN, first JOIN table SCORE
3. STRAIGHT_JOIN (I did beat MySQL engine here :D ), first JOIN table SCORE
4. STRAIGHT_JOIN (I did beat MySQL engine here :D ), first JOIN table ITEM
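A comparison like this can be reproduced on a reduced query by running EXPLAIN on the same statement with and without STRAIGHT_JOIN. The sketch below is a cut-down version of the time-ordered query, not the full query from the question: it keeps only item and score, and it assumes item has the zen_time column that Zen_time_index refers to (the column is missing from the posted definition). In the EXPLAIN output, "Using temporary; Using filesort" in the Extra column indicates that the ordering could not be taken from an index on the first table.

```sql
-- Reduced form of the time-ordered query: item is forced to be read first.
-- Assumes item.zen_time exists, as implied by Zen_time_index.
EXPLAIN
SELECT STRAIGHT_JOIN SQL_NO_CACHE item.id AS id
FROM item
INNER JOIN score ON item.id = score.item_id
WHERE score.user_id = 1
ORDER BY item.zen_time DESC
LIMIT 0, 10;

-- Same query with the optimizer free to choose the join order.
EXPLAIN
SELECT SQL_NO_CACHE item.id AS id
FROM item
INNER JOIN score ON item.id = score.item_id
WHERE score.user_id = 1
ORDER BY item.zen_time DESC
LIMIT 0, 10;
```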

MySQL optimize count query

I've got a question about MySQL performance.
These are my tables:
(about 140.000 records)
CREATE TABLE IF NOT EXISTS `article` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`label` varchar(256) COLLATE utf8_unicode_ci NOT NULL,
`title` varchar(256) COLLATE utf8_unicode_ci NOT NULL,
`intro` text COLLATE utf8_unicode_ci NOT NULL,
`content` text COLLATE utf8_unicode_ci NOT NULL,
`date` int(11) NOT NULL,
`active` int(1) NOT NULL,
`language_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL,
`indexed` int(1) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=132911 ;
(about 400.000 records)
CREATE TABLE IF NOT EXISTS `article_category` (
`article_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
RUNNING THIS COUNT QUERY:
SELECT SQL_NO_CACHE COUNT(id) as total
FROM (`article`)
LEFT JOIN `article_category` ON `article_category`.`article_id` = `article`.`id`
WHERE `article`.`language_id` = 1
AND `article_category`.`category_id` = '<catid>'
This query takes a lot of resources, so I am wondering how to optimize it.
After the first execution the result is cached, so after the first run I am fine.
RUNNING THE EXPLAIN FUNCTION:
AFTER CREATING AN INDEX:
ALTER TABLE `article_category` ADD INDEX ( `article_id` , `category_id` ) ;
After adding indexes and changing LEFT JOIN to JOIN the query runs a lot faster!
Thanks for these fast replies :)
QUERY I USE NOW (I removed the language_id because it was not that necessary):
SELECT COUNT(id) as total
FROM (`article`)
JOIN `article_category` ON `article_category`.`article_id` = `article`.`id`
AND `article_category`.`category_id` = '<catid>'
I've read something about forcing an index, but I think that's not necessary anymore because the tables are already indexed, right?
Thanks a lot!
Martijn
You haven't created the necessary indexes on the tables:
Table article_category: create a compound index on (article_id, category_id)
Table article: create a compound index on (id, language_id)
If this doesn't help, post the EXPLAIN output.
The columns used in a JOIN condition should have an index, so you need to index article_id.
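Because the query filters on category_id and then joins on article_id, an index with category_id as the leading column is also worth testing. The sketch below is an assumption, not something from the answers: the index name idx_category_article is made up, and 42 stands in for the '<catid>' placeholder from the question. With such an index, EXPLAIN would ideally show article_category being read through it with "Using index" in the Extra column.

```sql
-- Sketch: index that matches the filter column first, then the join column.
-- idx_category_article is a hypothetical name; 42 is a placeholder category id.
ALTER TABLE article_category
    ADD INDEX idx_category_article (category_id, article_id);

EXPLAIN
SELECT COUNT(article.id) AS total
FROM article
JOIN article_category
    ON article_category.article_id = article.id
WHERE article_category.category_id = 42;
```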

How to query three joined MySQL tables?

I have three MySQL tables:
*page_category* table
CREATE TABLE `page_category` (
`id_page` VARCHAR(255) NOT NULL,
`name` VARCHAR(255) DEFAULT NULL,
`search_here` TEXT,
PRIMARY KEY (`id_page`),
FULLTEXT KEY `search` (`search_here`)
) ENGINE=MYISAM DEFAULT CHARSET=latin1
*page_category* table contains more than 2 million rows of data.
*user_page* table
CREATE TABLE `user_page` (
`user_id` VARCHAR(255) NOT NULL,
`id_page` VARCHAR(255) NOT NULL,
PRIMARY KEY (`user_id`,`id_page`)
) ENGINE=INNODB DEFAULT CHARSET=latin1
*user_page* table contains more than 10 million rows of data.
*user_relationship* table
CREATE TABLE `user_relationship` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`me` VARCHAR(255) DEFAULT NULL,
`friend` VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `me_friend` (`me`,`friend`)
) ENGINE=INNODB AUTO_INCREMENT=7517967 DEFAULT CHARSET=latin1
*user_relationship* table contains more than 1 million rows of data.
I run this query:
SELECT a.id_page AS ids, b.user_id,
a.name AS nama, c.me,
COUNT(c.me) AS nfriend,
GROUP_CONCAT(b.user_id SEPARATOR ',') AS friendlist
FROM
page_category a
LEFT JOIN user_page b
ON a.id_page = b.id_page
LEFT JOIN user_relationship c
ON
b.user_id = c.friend
WHERE
c.me='12' AND
MATCH(a.search_here) AGAINST('+book' IN BOOLEAN MODE);
The results take a very long time to appear. Did I write the query incorrectly?
You need to add proper indexes and restructure the query so that it creates fewer temporary tables. Try changing the join order and run EXPLAIN on each variant to find the best one.
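One way to act on that advice is sketched below, under several assumptions. The WHERE condition on c.me already discards rows where the LEFT JOINs found no match, so the joins are written as inner joins starting from user_relationship, which can use the existing me_friend key. The index idx_page_user is a hypothetical addition so that user_page can also be reached by id_page, and the GROUP BY is a guess at the intended "friends per page" result, since the original query has aggregates but no grouping.

```sql
-- Hypothetical index: lets user_page be looked up by id_page as well as by user_id.
ALTER TABLE user_page
    ADD INDEX idx_page_user (id_page, user_id);

-- Sketch: inner joins starting from the most selective table.
-- The GROUP BY is an assumption about the intended result.
EXPLAIN
SELECT a.id_page AS ids,
       a.name AS nama,
       COUNT(c.me) AS nfriend,
       GROUP_CONCAT(b.user_id SEPARATOR ',') AS friendlist
FROM user_relationship c
JOIN user_page b
    ON b.user_id = c.friend
JOIN page_category a
    ON a.id_page = b.id_page
WHERE c.me = '12'
  AND MATCH(a.search_here) AGAINST('+book' IN BOOLEAN MODE)
GROUP BY a.id_page, a.name;
```

Comparing this EXPLAIN output with that of the original query shows whether the join order and the number of rows examined actually improve.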