We have performance problems with a string-search query (MySQL 5.5). Table main contains a few text columns. Table multi contains zero or more multivalue rows per main entry. We want to count how many main rows have at least one of these fields starting with a specific string.
Here is the query:
select count(main_id) from main ma
where s1 like '888%' or s2 like '888%' or s3 like '888%'
or exists (select m.multivalue
from multi m
where ma.main_id=m.main_id and (m.multivalue like '888%'))
At only a few hundred thousand rows in main, and about the same number in multi, the query already takes seconds to complete.
Here is the explain:
PRIMARY ma ALL s1,s2,s3 100407 100.00 Using where
DEPENDENT SUBQUERY m ref PRIMARY,main_id,multivalue PRIMARY 8 sample.ma.main_id 1 100.00 Using where; Using index
And the table definitions:
CREATE TABLE `main` (
`main_id` bigint(20) NOT NULL AUTO_INCREMENT,
`s1` varchar(50) CHARACTER SET utf8 NOT NULL,
`s2` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
`s3` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`main_id`),
KEY `s1` (`s1`),
KEY `s2` (`s2`),
KEY `s3` (`s3`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `multi` (
`main_id` bigint(20) NOT NULL,
`multivalue` varchar(50) CHARACTER SET utf8 NOT NULL,
PRIMARY KEY (`main_id`,`multivalue`),
KEY `main_id` (`main_id`),
KEY `multivalue` (`multivalue`),
CONSTRAINT `FK_multi_main` FOREIGN KEY (`main_id`) REFERENCES `main` (`main_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Any ideas how to make this query faster?
innodb settings:
innodb_additional_mem_pool_size=32M
innodb_flush_log_at_trx_commit=1
innodb_log_buffer_size=64M
innodb_buffer_pool_size=1000M
innodb_log_file_size=128M
innodb_thread_concurrency=16
innodb_file_per_table
innodb_file_format=Barracuda
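One approach worth testing (a sketch, not from the original post) is to split the ORs into a UNION so each LIKE prefix predicate can use its own index, and to fold the EXISTS subquery into its own branch as a range scan on the multivalue index:

```sql
-- Sketch: each branch can use one index; UNION deduplicates main_id values.
SELECT COUNT(*) FROM (
    SELECT main_id FROM main  WHERE s1 LIKE '888%'
    UNION
    SELECT main_id FROM main  WHERE s2 LIKE '888%'
    UNION
    SELECT main_id FROM main  WHERE s3 LIKE '888%'
    UNION
    SELECT m.main_id FROM multi m WHERE m.multivalue LIKE '888%'
) t;
```

Whether this wins depends on how selective '888%' is in each column; EXPLAIN each branch to confirm the range scans.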
I have the following 2 tables (browsers and metrics). browsers is a dimension table which stores the name and version of a browser. metrics is a fact table which holds the browser_id and metrics, in conjunction with a date. According to EXPLAIN SELECT (...), no key is used on metrics and the primary key is used on browsers.
SELECT browsers.name AS browser_name,
SUM(visits_count) AS visits_count,
SUM(clicks_count) AS clicks_count,
IFNULL((100 / SUM(visits_count)) * SUM(clicks_count), 0) AS ctr,
SUM(cost_integral) AS cost_integral,
IFNULL((SUM(cost_integral) / SUM(visits_count)), 0) AS cpv_integral,
IFNULL((SUM(cost_integral) / SUM(clicks_count)), 0) AS cpc_integral,
SUM(conversions_count) AS conversions_count,
IFNULL((100 / SUM(clicks_count)) * conversions_count, 0) AS cvr,
SUM(revenue_integral) AS revenue_integral,
IFNULL((SUM(revenue_integral) / SUM(clicks_count)), 0) AS epc_integral,
(SUM(revenue_integral) - SUM(cost_integral)) AS profit_integral,
IFNULL((SUM(revenue_integral) - SUM(cost_integral)) / SUM(cost_integral) * 100, 0) AS roi
FROM metrics
JOIN browsers ON browsers.id = browser_id
GROUP BY browsers.name
Server:
8 vCPU, 32 GB Memory, 250 GB SSD
MySQL 8
Without all the SUM functions, the ~900 ms runtime is reduced by about 250 to 300 ms; without the GROUP BY it even drops to one- or two-digit milliseconds. Unfortunately I need the GROUP BY, as well as all of the SUM functions.
What can be the reason that such a server needs between 1 and 2 seconds to execute the query on a table with only 80,000 rows? According to EXPLAIN ANALYZE, the SUM functions account for 96% of the total time (actual time=845.038..845.052).
-- browsers-Table
CREATE TABLE `browsers` (
`id` bigint(20) UNSIGNED NOT NULL,
`name` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
`version` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `browsers`
ADD PRIMARY KEY (`id`),
ADD KEY `b_n` (`name`),
ADD KEY `b_v` (`version`),
ADD KEY `b_n_v` (`name`,`version`),
ADD KEY `b_v_n` (`version`,`name`);
ALTER TABLE `browsers`
MODIFY `id` bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT;
-- metrics-Table
CREATE TABLE `metrics` (
`reference_date` date NOT NULL,
`browser_id` bigint(20) UNSIGNED NOT NULL,
`visits_count` bigint(20) NOT NULL DEFAULT 0,
`cost_integral` bigint(20) NOT NULL DEFAULT 0,
`clicks_count` bigint(20) NOT NULL DEFAULT 0,
`conversions_count` bigint(20) NOT NULL DEFAULT 0,
`revenue_integral` bigint(20) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `metrics`
ADD UNIQUE KEY `mu` (`reference_date`,`browser_id`),
ADD KEY `metrics_browser_id_foreign` (`browser_id`);
ALTER TABLE `metrics`
ADD CONSTRAINT `metrics_browser_id_foreign` FOREIGN KEY (`browser_id`) REFERENCES `browsers` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
Even on my local server, with the same data, the query takes only ~10 ms, so I suspect a faulty server setting (according to mysqltuner there are no notable suggestions).
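If the aggregation scan itself dominates, one thing worth testing (an assumption, not from the thread; the index name is hypothetical) is a covering index on metrics containing every column the query reads, so the scan is served from the index instead of the full rows:

```sql
-- Hypothetical covering index: browser_id first for the join,
-- then every aggregated column, so EXPLAIN shows "Using index".
ALTER TABLE `metrics`
  ADD INDEX `m_cover` (`browser_id`, `visits_count`, `clicks_count`,
                       `cost_integral`, `conversions_count`, `revenue_integral`);
```

That said, a 100x gap between two machines on identical data usually points at configuration (buffer pool size, disk speed) rather than the schema.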
I have this table:
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
`data` json NOT NULL,
`jobname` varchar(100) COLLATE utf8_unicode_ci GENERATED ALWAYS AS
(json_unquote(json_extract(`data`,'$.jobname'))) VIRTUAL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
ALTER TABLE `mytable`
ADD KEY `session` (`session_id`),
ADD KEY `jobname` (`jobname`);
It has 2 million rows.
When I execute this query, it takes around 23 seconds to get the result.
SELECT JSON_EXTRACT(f.data, '$.jobdesc') AS jobdesc
FROM mytable f
WHERE f.session_id = 1
ORDER BY jobdesc DESC
I understand that it is slow because there is no index on the jobdesc field.
The data column contains 12 fields. I want to let the user sort by any of them. Is adding an index for each field a good approach?
Is there any way to improve it?
I am using MySQL 5.7.13.
You would have to create an indexed virtual column for each of your 12 fields, if you want the user to have the option of sorting by any of them.
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
`data` json NOT NULL,
`jobname` varchar(100) AS (json_unquote(json_extract(`data`,'$.jobname'))),
`jobdesc` varchar(100) AS (json_unquote(json_extract(`data`,'$.jobdesc'))),
...other extracted virtual fields...
KEY (`jobname`),
KEY (`jobdesc`),
...other indexes on virtual columns...
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
This makes me wonder: why bother using JSON? Why not just declare 12 conventional, non-virtual columns with indexes?
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
...no `data` column needed...
`jobname` varchar(100),
`jobdesc` varchar(100),
...
KEY (`jobname`),
KEY (`jobdesc`),
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
JSON is best when you treat it as a single atomic document, and don't try to use SQL operations on fields within it. If you regularly need to access fields within your JSON, make them into conventional columns.
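For the concrete jobdesc sort in the question, the existing table can also be extended in place rather than recreated; a sketch using standard MySQL 5.7 generated-column syntax (the expression mirrors the jobname column already in the table):

```sql
-- Add an indexed virtual column extracted from the JSON document,
-- so ORDER BY jobdesc can use an index instead of sorting extracted values.
ALTER TABLE `mytable`
  ADD COLUMN `jobdesc` varchar(100)
    AS (json_unquote(json_extract(`data`, '$.jobdesc'))) VIRTUAL,
  ADD KEY `jobdesc` (`jobdesc`);
```

A VIRTUAL column adds no row storage; only the secondary index materializes the extracted values.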
I have two huge InnoDB tables (page: 40M+ rows, 30GB+; stat: 45M+ rows, 10GB+). I have a query that selects rows from the join of these two tables, and it used to take about a second to execute. Recently the exact same query has been taking more than 20 seconds (sometimes up to a few minutes) to complete. I suspected that, with lots of inserts and updates, the tables might need optimization, so I ran OPTIMIZE TABLE using phpMyAdmin, but saw no improvement. I've Googled a lot but couldn't find anything that helps with this situation.
The query I mentioned earlier looks like below:
SELECT `c`.`unique`, `c`.`pub`
FROM `pages` `c`
LEFT JOIN `stat` `s` ON `c`.`unique`=`s`.`unique`
WHERE `s`.`isc`='1'
AND `s`.`haa`='0'
AND (`pubID`='24')
ORDER BY `eid` ASC LIMIT 0, 10
These are the tables structure:
CREATE TABLE `pages` (
`eid` int(10) UNSIGNED NOT NULL,
`ti` text COLLATE utf8_persian_ci NOT NULL,
`fat` text COLLATE utf8_persian_ci NOT NULL,
`de` text COLLATE utf8_persian_ci NOT NULL,
`fad` text COLLATE utf8_persian_ci NOT NULL,
`pub` varchar(100) COLLATE utf8_persian_ci NOT NULL,
`pubID` int(10) UNSIGNED NOT NULL,
`pubn` text COLLATE utf8_persian_ci NOT NULL,
`unique` tinytext COLLATE utf8_persian_ci NOT NULL,
`pi` tinytext COLLATE utf8_persian_ci NOT NULL,
`kw` text COLLATE utf8_persian_ci NOT NULL,
`fak` text COLLATE utf8_persian_ci NOT NULL,
`te` text COLLATE utf8_persian_ci NOT NULL,
`fae` text COLLATE utf8_persian_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;
ALTER TABLE `pages`
ADD PRIMARY KEY (`eid`),
ADD UNIQUE KEY `UNIQ` (`unique`(128)),
ADD KEY `pub` (`pub`),
ADD KEY `unique` (`unique`(128)),
ADD KEY `pubID` (`pubID`) USING BTREE;
ALTER TABLE `pages` ADD FULLTEXT KEY `faT` (`fat`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faA` (`fad`,`fae`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faK` (`fak`);
ALTER TABLE `pages` ADD FULLTEXT KEY `pubn` (`pubn`);
ALTER TABLE `pages` ADD FULLTEXT KEY `faTAK` (`fat`,`fad`,`fak`,`fae`);
ALTER TABLE `pages` ADD FULLTEXT KEY `ab` (`de`,`te`);
ALTER TABLE `pages` ADD FULLTEXT KEY `Ti` (`ti`);
ALTER TABLE `pages` ADD FULLTEXT KEY `Kw` (`kw`);
ALTER TABLE `pages` ADD FULLTEXT KEY `TAK` (`ti`,`de`,`kw`,`te`);
ALTER TABLE `pages`
MODIFY `eid` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
CREATE TABLE `stat` (
`sid` int(10) UNSIGNED NOT NULL,
`unique` tinytext COLLATE utf8_persian_ci NOT NULL,
`haa` tinyint(1) UNSIGNED NOT NULL,
`isc` tinyint(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;
ALTER TABLE `stat`
ADD PRIMARY KEY (`sid`),
ADD UNIQUE KEY `Unique` (`unique`(128)),
ADD KEY `isc` (`isc`),
ADD KEY `haa` (`haa`);
ALTER TABLE `stat`
MODIFY `sid` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
The following query took only 0.0126 seconds, with 38,685,601 total results according to phpMyAdmin:
SELECT `sid` FROM `stat` WHERE `isc`='1' AND `haa`='0'
and this one took 0.0005 seconds with 5,159,484 total results
SELECT `eid`, `unique`, `pubn`, `pi` FROM `pages` WHERE `pubID`='24'
Am I missing something? Can anybody help?
The slowdown is probably due to scanning so many rows, and that data no longer fits in cache. So, let's try to improve the query.
Replace INDEX(pubID) with INDEX(pubID, eid) -- This may allow both the WHERE and ORDER BY to be handled by the index, thereby avoiding a sort.
Replace TINYTEXT with VARCHAR(255) or some smaller limit. This may speed up tmp tables.
Don't use a prefix index on eid -- it's an INT!
Don't use UNIQUE with prefixing -- UNIQUE(x(128)) only checks the uniqueness of the first 128 characters!
Once you change to VARCHAR(255) (or less), you can apply UNIQUE to the entire column.
The biggest performance issue is filtering on two tables -- can you move the status flags into the main table?
Change LEFT JOIN to JOIN.
What does unique look like? If it is a "UUID", that could further explain the trouble.
If that is a UUID that is 39 characters, the string can be converted to a 16-byte column for further space savings (and speedup). Let's discuss this further if necessary.
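The 16-byte conversion mentioned above is typically done with UNHEX/REPLACE; a minimal sketch, assuming the values are standard hex UUIDs with dashes (the uuid_bin column name is hypothetical):

```sql
-- Store UUIDs as BINARY(16) instead of a ~36-byte utf8 string.
ALTER TABLE stat ADD COLUMN uuid_bin BINARY(16);
UPDATE stat SET uuid_bin = UNHEX(REPLACE(`unique`, '-', ''));
-- To read it back in text form:
--   SELECT LOWER(HEX(uuid_bin)) FROM stat;
```

Besides the space saving, the shorter fixed-length key also shrinks every secondary index that references it.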
5 million results in 0.5 ms is bogus -- it was fetching from the query cache. Either turn off the QC or run with SELECT SQL_NO_CACHE ...
+1 to @RickJames's answer, but following it I ran a test.
I would also recommend that you not use unique as a column name, because it's an SQL reserved word.
ALTER TABLE pages
CHANGE `unique` objectId VARCHAR(128) NOT NULL COMMENT 'Document Object Identifier',
DROP KEY `pubID`,
ADD KEY bktest1 (`pubID`, eid, objectId, pub);
ALTER TABLE stat
CHANGE `unique` objectId VARCHAR(128) NOT NULL COMMENT 'Document Object Identifier',
DROP KEY `unique`,
ADD UNIQUE KEY bktest2 (objectId, isc, haa);
mysql> explain SELECT `c`.`objectId`, `c`.`pub` FROM `pages` `c` JOIN `stat` `s` ON `c`.`objectId`=`s`.`objectId` WHERE `s`.`isc`='1' AND `s`.`haa`='0' AND (`pubID`='24') ORDER BY `eid` ASC LIMIT 0, 10;
+----+-------------+-------+------------+--------+-------------------------+---------+---------+-----------------------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------+---------+---------+-----------------------------+------+----------+--------------------------+
| 1 | SIMPLE | c | NULL | ref | unique,unique_2,bktest1 | bktest1 | 4 | const | 1 | 100.00 | Using where; Using index |
| 1 | SIMPLE | s | NULL | eq_ref | bktest2,haa,isc | bktest2 | 388 | test.c.objectId,const,const | 1 | 100.00 | Using index |
+----+-------------+-------+------------+--------+-------------------------+---------+---------+-----------------------------+------+----------+--------------------------+
By creating the multi-column indexes, this makes them covering indexes, and you see "Using index" in the EXPLAIN report.
It's important to put eid second in the bktest1 index, so you avoid a filesort.
This is the best you can hope to optimize this query without denormalizing or partitioning the tables.
Next you should make sure your buffer pool is large enough to hold all the requested data.
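To check whether the buffer pool can hold the working set, the total data + index size can be read from information_schema (a standard query, shown here as a starting point; compare the result against innodb_buffer_pool_size):

```sql
-- Rough total of data + indexes per schema, in MB
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024) AS total_mb
FROM information_schema.tables
GROUP BY table_schema;
```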
I have a large live database where around 1,000 users are submitting 2 or more updates every minute. At the same time, 4 users are pulling reports and adding new items. The 2 main tables currently contain around 2 million and 4 million rows.
Queries using these tables are taking too much time; even simple queries like:
"SELECT COUNT(*) FROM MyItemsTable" and "SELECT COUNT(*) FROM MyTransactionsTable"
are taking 10 seconds and 26 seconds respectively.
Large reports now take 15 minutes, which is far too long.
All the tables I'm using are InnoDB.
Is there any way to solve this problem before I have to resort to replication?
Thank you in advance for any help
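As an aside on the slow COUNT(*) queries: InnoDB has to scan an index to count exact rows, but when an estimate is enough, the statistics tables answer instantly (a common workaround, not from this thread; the estimate can be off by a large margin):

```sql
-- Approximate row count without a full scan (InnoDB statistics, not exact)
SELECT table_rows
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'MyItemsTable';
```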
Edit
Here is the structure and indexes of MyItemsTable:
CREATE TABLE `pos_MyItemsTable` (
`itemid` bigint(15) NOT NULL,
`uploadid` bigint(15) NOT NULL,
`itemtypeid` bigint(15) NOT NULL,
`statusid` int(1) NOT NULL,
`uniqueid` varchar(10) DEFAULT NULL,
`referencenb` varchar(30) DEFAULT NULL,
`serialnb` varchar(25) DEFAULT NULL,
`code` varchar(50) DEFAULT NULL,
`user` varchar(16) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`pass` varchar(100) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`expirydate` date DEFAULT NULL,
`userid` bigint(15) DEFAULT NULL,
`insertdate` datetime DEFAULT NULL,
`updateuser` bigint(15) DEFAULT NULL,
`updatedate` datetime DEFAULT NULL,
`counternb` int(1) DEFAULT '0',
PRIMARY KEY (`itemid`),
UNIQUE KEY `referencenb_unique` (`referencenb`),
KEY `MyItemsTable_r04` (`itemtypeid`),
KEY `MyItemsTable_r05` (`uploadid`),
KEY `FK_MyItemsTable` (`statusid`),
KEY `ind_MyItemsTable_serialnb` (`serialnb`),
KEY `uniqueid_key` (`uniqueid`),
KEY `ind_MyItemsTable_insertdate` (`insertdate`),
KEY `ind_MyItemsTable_counternb` (`counternb`),
CONSTRAINT `FK_MyItemsTable` FOREIGN KEY (`statusid`) REFERENCES `MyItemsTable_statuses` (`statusid`),
CONSTRAINT `MyItemsTable_r04` FOREIGN KEY (`itemtypeid`) REFERENCES `itemstypes` (`itemtypeid`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `MyItemsTable_r05` FOREIGN KEY (`uploadid`) REFERENCES `uploads` (`uploadid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Just having a few indexes does not mean your tables and queries are optimized.
Try to identify the queries that run the slowest and add specific indexes there.
Selecting * from a huge table where you have columns that contain text / images / files will always be slow. Try to limit the selection of such fat columns when you don't need them.
future readings:
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
and some more advanced configurations:
http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
UPDATE:
Try to use composite keys for some of the heaviest queries, by placing the main fields that are compared into ONE index:
`MyItemsTable_r88` (`itemtypeid`, `statusid`, `serialnb`), ...
This will give you faster results for queries that compare only columns from the index:
SELECT * FROM my_table WHERE `itemtypeid` = 5 AND `statusid` = 0 AND `serialnb` > 500
and extremely fast results if you also select only values from the index:
SELECT `serialnb` FROM my_table WHERE `statusid` = 0 AND `itemtypeid` IN (1,2,3);
These are really basic examples; you will have to read a bit more and analyze your data for the best results.
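To find the heaviest queries worth indexing in the first place, the slow query log is the usual starting point; a sketch (the threshold and file path are examples, not values from the thread):

```sql
-- Log every statement slower than 1 second
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- example path
```

The resulting log can then be summarized with mysqldumpslow (or pt-query-digest) to rank queries by total time.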
I'm trying to run a more or less simple query on a MySQL table with about 100,000 entries (size: 2,319.58 MB; avg: 24 KB per row).
This is the query:
SELECT
n0_.id AS id0,
n0_.sentTime AS sentTime1,
n0_.deliveredTime AS deliveredTime2,
n0_.queuedTime AS queuedTime3,
n0_.message AS message4,
n0_.failReason AS failReason5,
n0_.targetNumber AS targetNumber6,
n0_.sid AS sid7,
n0_.targetNumber AS targetNumber8,
n0_.sid AS sid9,
n0_.subject AS subject10,
n0_.targetAddress AS targetAddress11,
n0_.priority AS priority12,
n0_.targetUrl AS targetUrl13,
n0_.discr AS discr14,
n0_.notification_state_id AS notification_state_id15,
n0_.user_id AS user_id16
FROM
notifications n0_
INNER JOIN notification_states n1_ ON n0_.notification_state_id = n1_.id
WHERE
n0_.discr IN (
'twilioCallNotification', 'twilioSMSNotification',
'emailNotification', 'httpNotification'
)
ORDER BY
n0_.id desc
LIMIT
25 OFFSET 0
The query takes between 100 and 200 seconds.
This is the create statement of the table:
CREATE TABLE notifications (
id INT AUTO_INCREMENT NOT NULL,
notification_state_id INT DEFAULT NULL,
user_id INT DEFAULT NULL,
sentTime DATETIME DEFAULT NULL,
deliveredTime DATETIME DEFAULT NULL,
queuedTime DATETIME NOT NULL,
message LONGTEXT NOT NULL,
failReason LONGTEXT DEFAULT NULL,
discr VARCHAR(255) NOT NULL,
targetNumber VARCHAR(1024) DEFAULT NULL,
sid VARCHAR(34) DEFAULT NULL,
subject VARCHAR(1024) DEFAULT NULL,
targetAddress VARCHAR(1024) DEFAULT NULL,
priority INT DEFAULT NULL,
targetUrl VARCHAR(1024) DEFAULT NULL,
INDEX IDX_6000B0D350C80464 (notification_state_id),
INDEX IDX_6000B0D3A76ED395 (user_id),
PRIMARY KEY (id)
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE=InnoDB;
ALTER TABLE notifications ADD CONSTRAINT FK_6000B0D350C80464 FOREIGN KEY (notification_state_id) REFERENCES notification_states (id) ON DELETE CASCADE;
ALTER TABLE notifications ADD CONSTRAINT FK_6000B0D3A76ED395 FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE;
These queries are generated by Doctrine2 (ORM). We are using MySQL 5.5.31 on a virtual machine (1 GB RAM, single core) running Debian 6.0.7. Why is the performance of the query so bad? Is it a database design fault, or is the VM just too small to handle this amount of data?
Add an index on the discr column:
ALTER TABLE `notifications` ADD INDEX `idx_discr` (`discr` ASC);
You're filtering by notifications.discr but don't appear to have that indexed. Have you tried adding an index on it?
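Going one step further (an assumption on my part, not part of the answer above), a composite index that also contains the sort column may help the ORDER BY ... LIMIT, since each discr value's rows are then stored pre-sorted by id:

```sql
-- Hypothetical name: (discr, id) serves the IN filter and keeps
-- each matching range ordered by id, narrowing the final sort.
ALTER TABLE `notifications` ADD INDEX `idx_discr_id` (`discr`, `id`);
```

With an IN list over several discr values MySQL may still need a sort across the ranges, so verify with EXPLAIN whether the filesort disappears.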