MySQL adding longtext column making query extremely slow - any performance tip? - mysql

I have this table called stories that currently has 12 million records, on production.
CREATE TABLE `stories` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`headline` varchar(255) DEFAULT NULL,
`author_id` int(11) DEFAULT NULL,
`body` longtext NOT NULL,
`published_at` datetime DEFAULT NULL,
`type_id` int(11) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`aasm_state` varchar(255) NOT NULL,
`deleted` tinyint(1) DEFAULT '0',
`word_count` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `index_stories_on_cms_story_id` (`cms_story_id`),
KEY `typeid` (`type_id`),
KEY `index_stories_on_published_at` (`published_at`),
KEY `index_stories_on_updated_at` (`updated_at`),
KEY `index_stories_on_aasm_state_and_published_at_and_deleted` (`aasm_state`,`published_at`,`deleted`),
KEY `idx_author_id` (`author_id`)
) ENGINE=InnoDB AUTO_INCREMENT=511625276 DEFAULT CHARSET=utf8;
And I am performing the following queries: (just fetching the id runs fine)
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER
BY `stories`.`id` ASC
LIMIT 1000;
...
1000 rows in set (0.59 sec)
However, when I add the longtext column to it, I get:
mysql> SELECT `stories`.id
, `stories`.body
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC LIMIT 1000;
...
1000 rows in set (6 min 34.11 sec)
Any performance tip on how to deal with this table?

Typically a relational DBMS will apply ORDER BY after retrieving the initial result set - so it needs to load up all those stories then sort them. I don't have access to your record set, but at a guess, applying the sorting before retrieving the bulk content may improve performance:
SELECT *
FROM (
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC
LIMIT 1000
) ids
INNER JOIN stories bulk
ON ids.id=bulk.id
(BTW you might consider researching indexes more - what you have put here looks rather suspect).

I recommend this order for the index:
INDEX(`aasm_state`,`deleted`,id)
put the = tests first
end with range that matches the ORDER BY; hopefully this will avoid having to gather lots of rows, and sort them before getting to the LIMIT.
This index may help all variants of the query.

Related

Improve Laravel Eloquent Query

I have this relation in my model...
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere(function($aQuery){
$aQuery
->where('count', '=' , 0)
->whereHas('containers', function($bQuery){
$bQuery->whereIn('status', [0,1]);
});
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
And Sql query generated are :
select
*
from
`inventories`
where
`inventories`.`user_id` = 17
and `inventories`.`user_id` is not null
and `status` = 1
and `deleted_at` is null
and (
`count` >= 1
or (
`count` = 0
and exists (
select
*
from
`containers`
where
`inventories`.`id` = `containers`.`inventory_id`
and `status` in (0, 1)
)
)
)
and `inventories`.`deleted_at` is null
order by
`updated_at` desc
limit
10 offset 0
Unfortunately this take more than 2sec in MySql,
There are anyways to improve and reduce the query time for this?!
Each inventory has many containers. when inventory count is 0 (0 mean out of stock but sometimes there are disabled containers that mean inventory is not out of stock yet.) the real count is depend on count of containers with status [0,1] (containers have other statuses...).
I have an idea to have a column on inventory to count containers with [0,1] status, and update it in other processes to improve this query. but this take too much time and need to modify other process.
Inventories show create table
CREATE TABLE `inventories` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint unsigned NOT NULL,
`cabin_id` bigint unsigned NOT NULL,
`address_id` bigint unsigned NOT NULL,
`count` mediumint NOT NULL,
`status` mediumint NOT NULL,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`available_at` datetime DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=37837 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Containers show create table
CREATE TABLE `containers` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`inventory_id` bigint unsigned NOT NULL,
`order_id` bigint unsigned DEFAULT NULL,
`status` tinyint unsigned NOT NULL DEFAULT '1',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=64503 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Used Solution due comments (Thanks to #ysth #vixducis #Breezer ):
Changed Containers engine from MyISAM to InnoDB ,
Added INDEX to containers.inventory_id
And optimize code like below and limit whereHas select query
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere('count', '=' , 0)
->whereHas('containers', function ($bQuery) {
$bQuery
->select('inventory_id')
->whereIn('status', [0, 1]);
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
for whereHas we can use whereIn and subQuery like below
->whereIn('id', function ($subQuery) {
$subQuery->select('inventory_id')
->from('containers')
->whereIn('status', [0, 1]);
});
and for limiting select of dosentHave
->doesntHave('requests', 'and', function($query){
$query->select('inventory_id');
})
It looks like the containers table is still running on the MyISAM engine. While that engine is not deprecated, the development focus has shifted heavily towards InnoDB, and it should be a lot more performant. Switching to InnoDB is a good first step.
Secondly, I see that there is no index on containers.inventory_id. When experiencing performance issues when relating two tables, it's often a good idea to check whether adding an index on the column that relates the tables improves performance.
These two steps should make your query a lot faster.
when your data is big, whereHas statement sometimes run slowly because it use exists syntax. For more detailed explanation, you can read from this post.
To solve this, I prefer you to use mpyw/eloquent-has-by-non-dependent-subquery because it will use in syntax which will improve the performance. I already used this package on my project, and no problem until now.
Change to InnoDB.
inventories needs this composite index: INDEX(user_id, status, deleted_at, updated_at)
containers needs this composite index, not simply (inventory_id), but (inventory_id, status).
Redundant: inventories.user_id is not null because the test for 17 requires NOT NULL.
Redundant: deleted_at is null simply because it is in the query twice.

MySQL extremely simple date query is slow

I've went over this many times but I couldnt find a way to make this faster. I have a table with about 4 million records and I want to grab rows from a specific date range (which would only yield about 10000 results). My query takes 10 seconds to execute... why!?
SELECT *
FROM banjo_live.actions_activity
where userid IN (102,164,94,140)
AND actionsid=4
AND (actions_activity_timestamp between '2021-06-01 00:00:00'
AND '2021-06-31 23:23:23')
AND new_statusid NOT IN (10,13)
LIMIT 0, 50000
Surely this shouldnt take 10 seconds. What could be the issue?
Thanks
My table;
DROP TABLE IF EXISTS `actions_activity`;
CREATE TABLE `actions_activity` (
`actions_activity_id` int(11) NOT NULL AUTO_INCREMENT,
`orderid` int(11) NOT NULL,
`barcodeid` int(11) NOT NULL,
`skuid` int(11) NOT NULL,
`sku_code` varchar(50) CHARACTER SET latin1 COLLATE latin1_swedish_ci NULL DEFAULT NULL,
`actionsid` int(11) NOT NULL,
`action_note` text CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`starting_count` int(11) NOT NULL,
`new_count` int(11) NOT NULL,
`old_statusid` int(11) NOT NULL COMMENT 'Old Status',
`new_statusid` int(11) NOT NULL COMMENT 'New Status',
`userid` int(11) NOT NULL COMMENT 'Handled By',
`actions_activity_timestamp` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(0),
`actions_activity_created_at` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP,
`sessionid` int(11) NULL DEFAULT NULL,
PRIMARY KEY (`actions_activity_id`) USING BTREE,
INDEX `FetchingIndex`(`barcodeid`) USING BTREE,
INDEX `skuindex`(`skuid`) USING BTREE,
INDEX `searchbysession`(`sessionid`) USING BTREE,
FULLTEXT INDEX `sku_code`(`sku_code`)
) ENGINE = InnoDB AUTO_INCREMENT = 4336767 CHARACTER SET = latin1 COLLATE = latin1_swedish_ci ROW_FORMAT = Dynamic;
23:23:23 ?? -- Gordon's rewrite avoids typos like this. Or, I prefer this:
actions_activity_timestamp >= '2021-06-01' AND
actions_activity_timestamp < '2021-06-01' + INTERVAL 1 MONTH
Add a 2-column index where the second column is whichever of the other things in the WHERE is most selective:
INDEX(actionsid, ...)
Once you add an ORDER BY (cf, The Impaler), there may be a better index.
Are you really expecting 10K rows of output? That will choke most clients. Maybe there is some processing you could have SQL do so the output won't be as bulky?
First, I assume you intend:
SELECT *
FROM banjo_live.actions_activity
WHERE userid IN (102,164,94,140) AND
actionsid = 4 AND
actions_activity_timestamp >= '2021-06-01' AND
actions_activity_timestamp < '2021-07-01' AND
new_statusid NOT IN (10, 13)
LIMIT 0, 50000;
You want a composite index. Without knowing the sizes of the fields, I would suggest an index on (actionsid, userid, actions_activity_timestamp, new_statusid).

Concurrent queries on composite index with order by id drastically slow

I have a table defined as follows:
| book | CREATE TABLE `book` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`provider_id` int(10) unsigned DEFAULT '0',
`source_id` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`description` longtext COLLATE utf8_unicode_ci,
PRIMARY KEY (`id`),
UNIQUE KEY `provider` (`provider_id`,`source_id`),
KEY `idx_source_id` (`source_id`),
) ENGINE=InnoDB AUTO_INCREMENT=1605425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
when there are about 10 concurrent read with following sql:
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '1037122800') ORDER BY `book`.`id` ASC LIMIT 1
it becomes slow, it takes about 100 ms.
however if I changed it to
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '221630001') LIMIT 1
then it is normal, it takes several ms.
I don't understand why adding order by id makes query much slower? could anyone expain?
Try to add desired columns (Select Column Name,.. ) instead of * or Refer this.
Why is my SQL Server ORDER BY slow despite the ordered column being indexed?
I'm not a mysql expert, and not able to perform a detailed analysis, but my guess would be that because you are providing values for the UNIQUE KEY in the WHERE clause, the engine can go and fetch that row directly using an index.
However, when you ask it to ORDER BY the id column, which is a PRIMARY KEY, that changes the access path. The engine now guesses that since it has an index on id, and you want to order by id, it is better to fetch that data in PK order, which will avoid a sort. In this case though, it leads to a slower result, as it has to compare every row to the criteria (a table scan).
Note that this is just conjecture. You would need to EXPLAIN both statements to see what is going on.

MARIADB: Index not used for a select with join on a range

I have a first table containing my ips stored as integer (500k rows), and a second one containing ranges of black listed ips and the reason of black listing (10M rows)
here is the table structure :
CREATE TABLE `black_lists` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`ip_start` INT(11) UNSIGNED NOT NULL,
`ip_end` INT(11) UNSIGNED NULL DEFAULT NULL,
`reason` VARCHAR(3) NOT NULL,
`excluded` TINYINT(1) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `ip_range` (`ip_end`, `ip_start`),
INDEX `ip_start` ( `ip_start`),
INDEX `ip_end` (`ip_end`),
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=10747741
;
CREATE TABLE `ips` (
`id` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'Id ips',
`idhost` INT(11) NOT NULL COMMENT 'Id Host',
`ip` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Ip',
`ipint` INT(11) UNSIGNED NULL DEFAULT NULL COMMENT 'Int ip',
`type` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Type',
PRIMARY KEY (`id`),
INDEX `host` (`idhost`),
INDEX `index3` (`ip`),
INDEX `index4` (`idhost`, `ip`),
INDEX `ipsin` (`ipint`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=675651;
my problem is when I try to run this query no index is used and it takes an eternity to finish :
select i.ip,s1.reason
from ips i
left join black_lists s1 on i.ipint BETWEEN s1.ip_start and s1.ip_end;
I'm using MariaDB 10.0.16
True.
The optimizer has no knowledge that start..end values are non overlapping, nor anything else obvious about them. So, the best it can do is decide between
s1.ip_start <= i.ipint -- and use INDEX(ip_start), or
s1.ip_end >= i.ipint -- and use INDEX(ip_end)
Either of those could result in upwards of half the table being scanned.
In 2 steps you could achieve the desired goal for one ip; let's say #ip:
SELECT ip_start, reason
FROM black_lists
WHERE ip_start <= #ip
ORDER BY ip_start DESC
LIMIT 1
But after that, you need to see if the ip_end corresponding to that ip_start is <= #ip before deciding whether you have a black-listed item.
SELECT reason
FROM ( ... ) a -- fill in the above query
JOIN black_lists b USING(ip_start)
WHERE b.ip_end <= #ip
That will either return the reason or no rows.
In spite of the complexity, it will be very fast. But, you seem to have a set of IPs to check. That makes it more complex.
For black_lists, there seems to be no need for id. Suggest you replace the 4 indexes with only 2:
PRIMARY KEY(ip_start, ip_end),
INDEX(ip_end)
In ips, isn't ip unique? If so, get rid if id and change 5 indexes to 3:
PRIMARY KEY(idint),
INDEX(host, ip),
INDEX(ip)
You have allowed more than enough in the VARCHAR for IPv6, but not in INT UNSIGNED.
More discussion.

Why is this query being logged as "not using indexes"?

For some reason my slow query log is reporting the following query as "not using indexes" and for the life of me I cannot understand why.
Here is the query:
update scheduletask
set active = 0
where nextrun < date_sub( now(), interval 2 minute )
and enabled = 1
and active = 1;
Here is the table:
CREATE TABLE `scheduletask` (
`scheduletaskid` int(11) NOT NULL AUTO_INCREMENT,
`schedulethreadid` int(11) NOT NULL,
`taskname` varchar(50) NOT NULL,
`taskpath` varchar(100) NOT NULL,
`tasknote` text,
`recur` int(11) NOT NULL,
`taskinterval` int(11) NOT NULL,
`lastrunstart` datetime NOT NULL,
`lastruncomplete` datetime NOT NULL,
`nextrun` datetime NOT NULL,
`active` int(11) NOT NULL,
`enabled` int(11) NOT NULL,
`creatorid` int(11) NOT NULL,
`editorid` int(11) NOT NULL,
`created` datetime NOT NULL,
`edited` datetime NOT NULL,
PRIMARY KEY (`scheduletaskid`),
UNIQUE KEY `Name` (`taskname`),
KEY `IDX_NEXTRUN` (`nextrun`)
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=latin1;
Add another index like this
KEY `IDX_COMB` (`nextrun`, `enabled`, `active`)
I'm not sure how many rows your table have but the following might apply as well
Sometimes MySQL does not use an index, even if one is available. One
circumstance under which this occurs is when the optimizer estimates
that using the index would require MySQL to access a very large
percentage of the rows in the table. (In this case, a table scan is
likely to be much faster because it requires fewer seeks.)
try using the "explain" command in mysql.
http://dev.mysql.com/doc/refman/5.5/en/explain.html
I think explain only works on select statements, try:
explain select * from scheduletask where nextrun < date_sub( now(), interval 2 minute ) and enabled = 1 and active = 1;
Maybe if you use, nextrun = ..., it will macht the key IDX_NEXTRUN. In your where clause has to be one of your keys, scheduletaskid, taskname or nextrun
Sorry for the short answer but I don't have time to write a complete solution.
I believe you can fix your issue by saving date_sub( now(), interval 2 minute ) in a temporary variable before using it in the query, see here maybe: MySql How to set a local variable in an update statement (Syntax?).