MySQL extremely simple date query is slow

I've gone over this many times but I couldn't find a way to make this faster. I have a table with about 4 million records and I want to grab rows from a specific date range (which would only yield about 10,000 results). My query takes 10 seconds to execute. Why?
SELECT *
FROM banjo_live.actions_activity
where userid IN (102,164,94,140)
AND actionsid=4
AND (actions_activity_timestamp between '2021-06-01 00:00:00'
AND '2021-06-31 23:23:23')
AND new_statusid NOT IN (10,13)
LIMIT 0, 50000
Surely this shouldn't take 10 seconds. What could be the issue?
Thanks
My table:
DROP TABLE IF EXISTS `actions_activity`;
CREATE TABLE `actions_activity` (
`actions_activity_id` int(11) NOT NULL AUTO_INCREMENT,
`orderid` int(11) NOT NULL,
`barcodeid` int(11) NOT NULL,
`skuid` int(11) NOT NULL,
`sku_code` varchar(50) CHARACTER SET latin1 COLLATE latin1_swedish_ci NULL DEFAULT NULL,
`actionsid` int(11) NOT NULL,
`action_note` text CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`starting_count` int(11) NOT NULL,
`new_count` int(11) NOT NULL,
`old_statusid` int(11) NOT NULL COMMENT 'Old Status',
`new_statusid` int(11) NOT NULL COMMENT 'New Status',
`userid` int(11) NOT NULL COMMENT 'Handled By',
`actions_activity_timestamp` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(0),
`actions_activity_created_at` timestamp(0) NOT NULL DEFAULT CURRENT_TIMESTAMP,
`sessionid` int(11) NULL DEFAULT NULL,
PRIMARY KEY (`actions_activity_id`) USING BTREE,
INDEX `FetchingIndex`(`barcodeid`) USING BTREE,
INDEX `skuindex`(`skuid`) USING BTREE,
INDEX `searchbysession`(`sessionid`) USING BTREE,
FULLTEXT INDEX `sku_code`(`sku_code`)
) ENGINE = InnoDB AUTO_INCREMENT = 4336767 CHARACTER SET = latin1 COLLATE = latin1_swedish_ci ROW_FORMAT = Dynamic;

23:23:23 ?? -- Gordon's rewrite avoids typos like this. Or, I prefer this:
actions_activity_timestamp >= '2021-06-01' AND
actions_activity_timestamp < '2021-06-01' + INTERVAL 1 MONTH
Add a 2-column index where the second column is whichever of the other things in the WHERE is most selective:
INDEX(actionsid, ...)
Once you add an ORDER BY (cf, The Impaler), there may be a better index.
Are you really expecting 10K rows of output? That will choke most clients. Maybe there is some processing you could have SQL do so the output won't be as bulky?

First, I assume you intend:
SELECT *
FROM banjo_live.actions_activity
WHERE userid IN (102,164,94,140) AND
actionsid = 4 AND
actions_activity_timestamp >= '2021-06-01' AND
actions_activity_timestamp < '2021-07-01' AND
new_statusid NOT IN (10, 13)
LIMIT 0, 50000;
You want a composite index. Without knowing the sizes of the fields, I would suggest an index on (actionsid, userid, actions_activity_timestamp, new_statusid).
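As a rough sketch of that suggestion (the index name idx_action_user_ts is made up for illustration, and whether the optimizer actually picks the index depends on the data distribution), the index could be added and then checked with EXPLAIN:

```sql
-- Hypothetical index name. Column order follows the suggestion above:
-- the equality test first, then the IN list, then the range column.
ALTER TABLE banjo_live.actions_activity
    ADD INDEX idx_action_user_ts
        (actionsid, userid, actions_activity_timestamp, new_statusid);

-- Check the plan: the "key" column shows the new index if it is chosen.
EXPLAIN
SELECT *
FROM banjo_live.actions_activity
WHERE userid IN (102, 164, 94, 140)
  AND actionsid = 4
  AND actions_activity_timestamp >= '2021-06-01'
  AND actions_activity_timestamp <  '2021-07-01'
  AND new_statusid NOT IN (10, 13)
LIMIT 0, 50000;
```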

Related

Improve Laravel Eloquent Query

I have this relation in my model...
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere(function($aQuery){
$aQuery
->where('count', '=' , 0)
->whereHas('containers', function($bQuery){
$bQuery->whereIn('status', [0,1]);
});
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
And the generated SQL query is:
select
*
from
`inventories`
where
`inventories`.`user_id` = 17
and `inventories`.`user_id` is not null
and `status` = 1
and `deleted_at` is null
and (
`count` >= 1
or (
`count` = 0
and exists (
select
*
from
`containers`
where
`inventories`.`id` = `containers`.`inventory_id`
and `status` in (0, 1)
)
)
)
and `inventories`.`deleted_at` is null
order by
`updated_at` desc
limit
10 offset 0
Unfortunately this takes more than 2 seconds in MySQL.
Is there any way to improve this and reduce the query time?
Each inventory has many containers. When the inventory count is 0 (0 means out of stock, but sometimes there are disabled containers, which means the inventory is not out of stock yet), the real count depends on the number of containers with status [0,1] (containers have other statuses too).
I have an idea to add a column on inventory that counts the containers with status [0,1], and to update it in other processes to improve this query, but this would take too much time and require modifying other processes.
SHOW CREATE TABLE for inventories:
CREATE TABLE `inventories` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint unsigned NOT NULL,
`cabin_id` bigint unsigned NOT NULL,
`address_id` bigint unsigned NOT NULL,
`count` mediumint NOT NULL,
`status` mediumint NOT NULL,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`available_at` datetime DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=37837 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
SHOW CREATE TABLE for containers:
CREATE TABLE `containers` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`inventory_id` bigint unsigned NOT NULL,
`order_id` bigint unsigned DEFAULT NULL,
`status` tinyint unsigned NOT NULL DEFAULT '1',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=64503 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Solution used, based on the comments (thanks to @ysth, @vixducis and @Breezer):
Changed the containers engine from MyISAM to InnoDB,
added an INDEX on containers.inventory_id,
and optimized the code as below, limiting the columns selected by the whereHas subquery:
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere('count', '=' , 0)
->whereHas('containers', function ($bQuery) {
$bQuery
->select('inventory_id')
->whereIn('status', [0, 1]);
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
Instead of whereHas we can use whereIn with a subquery, like below:
->whereIn('id', function ($subQuery) {
$subQuery->select('inventory_id')
->from('containers')
->whereIn('status', [0, 1]);
});
And to limit the columns selected by doesntHave:
->doesntHave('requests', 'and', function($query){
$query->select('inventory_id');
})
It looks like the containers table is still running on the MyISAM engine. While that engine is not deprecated, the development focus has shifted heavily towards InnoDB, and it should be a lot more performant. Switching to InnoDB is a good first step.
Secondly, I see that there is no index on containers.inventory_id. When experiencing performance issues when relating two tables, it's often a good idea to check whether adding an index on the column that relates the tables improves performance.
These two steps should make your query a lot faster.
When your data is big, a whereHas statement sometimes runs slowly because it uses EXISTS syntax. For a more detailed explanation, you can read this post.
To solve this, I recommend using mpyw/eloquent-has-by-non-dependent-subquery, because it uses IN syntax, which can improve performance. I have already used this package in my project, with no problems so far.
Change to InnoDB.
inventories needs this composite index: INDEX(user_id, status, deleted_at, updated_at)
containers needs this composite index, not simply (inventory_id), but (inventory_id, status).
Redundant: inventories.user_id is not null because the test for 17 requires NOT NULL.
Redundant: deleted_at is null simply because it is in the query twice.
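The engine change and the two composite indexes above could be applied roughly as follows. This is only a sketch: the index names are invented for illustration, and the ALTER statements rebuild the tables, so they are best tried on a copy first.

```sql
-- Move containers to InnoDB (this rebuilds the table).
ALTER TABLE containers ENGINE = InnoDB;

-- Serves the correlated EXISTS lookup: join column first, then the filtered status.
-- Hypothetical index name.
ALTER TABLE containers
    ADD INDEX idx_inventory_status (inventory_id, status);

-- Equality-tested columns first, then the ORDER BY column.
-- Hypothetical index name.
ALTER TABLE inventories
    ADD INDEX idx_user_status_deleted_updated (user_id, status, deleted_at, updated_at);
```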

MySQL adding longtext column making query extremely slow - any performance tip?

I have this table called stories that currently has 12 million records, on production.
CREATE TABLE `stories` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`headline` varchar(255) DEFAULT NULL,
`author_id` int(11) DEFAULT NULL,
`body` longtext NOT NULL,
`published_at` datetime DEFAULT NULL,
`type_id` int(11) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`aasm_state` varchar(255) NOT NULL,
`deleted` tinyint(1) DEFAULT '0',
`word_count` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `index_stories_on_cms_story_id` (`cms_story_id`),
KEY `typeid` (`type_id`),
KEY `index_stories_on_published_at` (`published_at`),
KEY `index_stories_on_updated_at` (`updated_at`),
KEY `index_stories_on_aasm_state_and_published_at_and_deleted` (`aasm_state`,`published_at`,`deleted`),
KEY `idx_author_id` (`author_id`)
) ENGINE=InnoDB AUTO_INCREMENT=511625276 DEFAULT CHARSET=utf8;
And I am performing the following queries: (just fetching the id runs fine)
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER
BY `stories`.`id` ASC
LIMIT 1000;
...
1000 rows in set (0.59 sec)
However, when I add the longtext column to it, I get:
mysql> SELECT `stories`.id
, `stories`.body
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC LIMIT 1000;
...
1000 rows in set (6 min 34.11 sec)
Any performance tip on how to deal with this table?
Typically a relational DBMS will apply ORDER BY after retrieving the initial result set - so it needs to load up all those stories then sort them. I don't have access to your record set, but at a guess, applying the sorting before retrieving the bulk content may improve performance:
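SELECT bulk.id, bulk.body
FROM (
-- pick the 1000 ids first, touching only narrow columns
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC
LIMIT 1000
) ids
INNER JOIN stories bulk
ON ids.id=bulk.id
-- the join does not guarantee order, so sort the final result explicitly
ORDER BY bulk.id ASC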
(BTW you might consider researching indexes more - what you have put here looks rather suspect).
I recommend this order for the index:
INDEX(`aasm_state`,`deleted`,id)
Put the = tests first.
End with the range column that matches the ORDER BY; hopefully this will avoid having to gather lots of rows and sort them before getting to the LIMIT.
This index may help all variants of the query.
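A minimal sketch of adding that index and checking whether the slow query picks it up (idx_state_deleted_id is a made-up name, and building an index on a 12-million-row table will take time):

```sql
-- Hypothetical index name; equality columns first, then id for the range and the ORDER BY.
ALTER TABLE stories
    ADD INDEX idx_state_deleted_id (aasm_state, deleted, id);

-- If the index is chosen, the plan should no longer show "Using filesort".
EXPLAIN
SELECT `stories`.id, `stories`.body
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
  AND `stories`.`deleted` = 0
  AND `stories`.`published_at` <= '2020-01-14 06:16:04'
  AND `stories`.`id` > 519492608
ORDER BY `stories`.`id` ASC
LIMIT 1000;
```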

MARIADB: Index not used for a select with join on a range

I have a first table containing my IPs stored as integers (500k rows), and a second one containing ranges of blacklisted IPs and the reason for blacklisting (10M rows).
Here is the table structure:
CREATE TABLE `black_lists` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`ip_start` INT(11) UNSIGNED NOT NULL,
`ip_end` INT(11) UNSIGNED NULL DEFAULT NULL,
`reason` VARCHAR(3) NOT NULL,
`excluded` TINYINT(1) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `ip_range` (`ip_end`, `ip_start`),
INDEX `ip_start` (`ip_start`),
INDEX `ip_end` (`ip_end`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=10747741
;
CREATE TABLE `ips` (
`id` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'Id ips',
`idhost` INT(11) NOT NULL COMMENT 'Id Host',
`ip` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Ip',
`ipint` INT(11) UNSIGNED NULL DEFAULT NULL COMMENT 'Int ip',
`type` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Type',
PRIMARY KEY (`id`),
INDEX `host` (`idhost`),
INDEX `index3` (`ip`),
INDEX `index4` (`idhost`, `ip`),
INDEX `ipsin` (`ipint`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=675651;
My problem is that when I try to run this query, no index is used and it takes an eternity to finish:
select i.ip,s1.reason
from ips i
left join black_lists s1 on i.ipint BETWEEN s1.ip_start and s1.ip_end;
I'm using MariaDB 10.0.16
True.
The optimizer has no knowledge that start..end values are non overlapping, nor anything else obvious about them. So, the best it can do is decide between
s1.ip_start <= i.ipint -- and use INDEX(ip_start), or
s1.ip_end >= i.ipint -- and use INDEX(ip_end)
Either of those could result in upwards of half the table being scanned.
In 2 steps you could achieve the desired goal for one ip; let's say @ip:
SELECT ip_start, reason
FROM black_lists
WHERE ip_start <= @ip
ORDER BY ip_start DESC
LIMIT 1
But after that, you need to see if the ip_end corresponding to that ip_start is >= @ip before deciding whether you have a black-listed item.
SELECT b.reason
FROM ( ... ) a -- fill in the above query
JOIN black_lists b USING(ip_start)
WHERE b.ip_end >= @ip
That will either return the reason or no rows.
In spite of the complexity, it will be very fast. But, you seem to have a set of IPs to check. That makes it more complex.
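Put together for a single address, the two steps might look like the sketch below. It assumes the ranges do not overlap and that ip_start is unique; the placeholder IP is written as a user variable @ip, and the address itself is just an example value.

```sql
-- Example address, converted to its integer form.
SET @ip = INET_ATON('203.0.113.7');

SELECT b.reason
FROM (
    -- Step 1: the closest range starting at or below the address.
    -- ORDER BY ... DESC LIMIT 1 reads a single entry from INDEX(ip_start).
    SELECT ip_start
    FROM black_lists
    WHERE ip_start <= @ip
    ORDER BY ip_start DESC
    LIMIT 1
) a
JOIN black_lists b USING (ip_start)
-- Step 2: the address is black-listed only if that range also ends at or after it.
WHERE b.ip_end >= @ip;
```

An empty result means the address falls in no black-listed range.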
For black_lists, there seems to be no need for id. Suggest you replace the 4 indexes with only 2:
PRIMARY KEY(ip_start, ip_end),
INDEX(ip_end)
In ips, isn't ip unique? If so, get rid of id and change 5 indexes to 3:
PRIMARY KEY(ipint),
INDEX(idhost, ip),
INDEX(ip)
You have allowed more than enough in the VARCHAR for IPv6, but not in INT UNSIGNED.
More discussion.

Why is this query being logged as "not using indexes"?

For some reason my slow query log is reporting the following query as "not using indexes" and for the life of me I cannot understand why.
Here is the query:
update scheduletask
set active = 0
where nextrun < date_sub( now(), interval 2 minute )
and enabled = 1
and active = 1;
Here is the table:
CREATE TABLE `scheduletask` (
`scheduletaskid` int(11) NOT NULL AUTO_INCREMENT,
`schedulethreadid` int(11) NOT NULL,
`taskname` varchar(50) NOT NULL,
`taskpath` varchar(100) NOT NULL,
`tasknote` text,
`recur` int(11) NOT NULL,
`taskinterval` int(11) NOT NULL,
`lastrunstart` datetime NOT NULL,
`lastruncomplete` datetime NOT NULL,
`nextrun` datetime NOT NULL,
`active` int(11) NOT NULL,
`enabled` int(11) NOT NULL,
`creatorid` int(11) NOT NULL,
`editorid` int(11) NOT NULL,
`created` datetime NOT NULL,
`edited` datetime NOT NULL,
PRIMARY KEY (`scheduletaskid`),
UNIQUE KEY `Name` (`taskname`),
KEY `IDX_NEXTRUN` (`nextrun`)
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=latin1;
Add another index like this
KEY `IDX_COMB` (`nextrun`, `enabled`, `active`)
I'm not sure how many rows your table has, but the following might apply as well:
Sometimes MySQL does not use an index, even if one is available. One
circumstance under which this occurs is when the optimizer estimates
that using the index would require MySQL to access a very large
percentage of the rows in the table. (In this case, a table scan is
likely to be much faster because it requires fewer seeks.)
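The IDX_COMB index suggested above could be added and checked as in this sketch. On a table this small the optimizer may still prefer a full scan, which is exactly the situation the quoted passage describes, so the EXPLAIN output is something to inspect rather than a guaranteed result.

```sql
-- Index from the answer above.
ALTER TABLE scheduletask
    ADD KEY IDX_COMB (nextrun, enabled, active);

-- Inspect the plan through the equivalent SELECT.
EXPLAIN
SELECT *
FROM scheduletask
WHERE nextrun < DATE_SUB(NOW(), INTERVAL 2 MINUTE)
  AND enabled = 1
  AND active = 1;
```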
try using the "explain" command in mysql.
http://dev.mysql.com/doc/refman/5.5/en/explain.html
I think explain only works on select statements (that was the case before MySQL 5.6.3, which added EXPLAIN for UPDATE), try:
explain select * from scheduletask where nextrun < date_sub( now(), interval 2 minute ) and enabled = 1 and active = 1;
Maybe if you use nextrun = ..., it will match the key IDX_NEXTRUN. Your WHERE clause has to contain one of your keys: scheduletaskid, taskname or nextrun.
Sorry for the short answer but I don't have time to write a complete solution.
I believe you can fix your issue by saving date_sub( now(), interval 2 minute ) in a temporary variable before using it in the query, see here maybe: MySql How to set a local variable in an update statement (Syntax?).
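A sketch of that idea, using a session variable named @cutoff (the name is an assumption for illustration). Whether this changes how the query is logged is untested here; it only shows the syntax.

```sql
-- Compute the cutoff once and keep it in a user-defined session variable.
SET @cutoff = DATE_SUB(NOW(), INTERVAL 2 MINUTE);

UPDATE scheduletask
SET active = 0
WHERE nextrun < @cutoff
  AND enabled = 1
  AND active = 1;
```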

mysql query optimization (wrong indexes?) avoid filesort

I will try to explain myself quickly.
I have a table called 'artikli' which has about 1M records.
On this table I run lots of different queries, but one in particular is causing problems (long execution time) when ORDER BY is present.
This is my table structure:
CREATE TABLE IF NOT EXISTS artikli (
id int(11) NOT NULL,
name varchar(250) NOT NULL,
datum datetime NOT NULL,
kategorije_id int(11) default NULL,
id_valute int(11) default NULL,
podogovoru int(1) default '0',
cijena decimal(10,2) default NULL,
valuta int(1) NOT NULL default '0',
cijena_rezerva decimal(10,0) NOT NULL,
cijena_kupi decimal(10,0) default NULL,
cijena_akcija decimal(10,2) NOT NULL,
period int(3) NOT NULL default '30',
dostupnost enum('svugdje','samobih','samomojgrad','samomojkanton') default 'svugdje',
zemlja varchar(10) NOT NULL,
slike varchar(500) NOT NULL,
od_s varchar(34) default NULL,
od_id int(10) unsigned default NULL,
vrsta int(1) default '0',
trajanje datetime default NULL,
izbrisan int(1) default '0',
zakljucan int(1) default '0',
prijava int(3) default '0',
izdvojen decimal(1,0) NOT NULL default '0',
izdvojen_kad datetime NOT NULL,
izdvojen_datum datetime NOT NULL,
sajt int(1) default '0',
PRIMARY KEY (id),
KEY brend (brend),
KEY kanton (kanton),
KEY datum (datum),
KEY cijena (cijena),
KEY kategorije_id (kategorije_id,podogovoru,sajt,izdvojen,izdvojen_kad,datum)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And this is the query:
SELECT artikli.datum as brojx,
artikli.izdvojen as i,
artikli.izdvojen_kad as ii,
artikli.cijena as cijena, artikli.name
FROM artikli
WHERE artikli.izbrisan=0 and artikli.prodano!=3
and artikli.zavrseno=0 and artikli.od_id!=0
and (artikli.sajt=0 or (artikli.sajt=1 and artikli.dostupnost='svugdje'))
and kategorije_id IN (18)
ORDER by i DESC, ii DESC, brojx DESC
LIMIT 0,20
What I want to do is avoid the filesort, which is very slow.
It would have been a big help if you'd provided the explain plan for the query.
Why do you think it's the filesort which is causing the problem? Looking at the query, you seem to be applying a lot of filtering, which should reduce the output set significantly, but none of it can use the available indexes.
artikli.izbrisan=0 and artikli.prodano!=3
and artikli.zavrseno=0 and artikli.od_id!=0
and (artikli.sajt=0 or (artikli.sajt=1 and artikli.dostupnost='svugdje'))
and kategorije_id IN (18)
Although I don't know what the pattern of your data is, I suspect that you might get a lot more benefit by adding an index on :
kategorije_id,izbrisan,sajt
Are all those other indexes really being used already?
Although you'd get a LOT more bang for your buck by denormalizing all those booleans (assuming that the table is normalised to start with and there are not hidden functional dependencies in there).
C.
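The index proposed in this answer could be created as sketched below (the index name is invented for illustration), with EXPLAIN on the original query showing whether it is chosen:

```sql
-- Hypothetical index name; columns as proposed in the answer above.
ALTER TABLE artikli
    ADD INDEX idx_kat_izbrisan_sajt (kategorije_id, izbrisan, sajt);
```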
The problem is that you don't have an index on the izdvojen, izdvojen_kad and datum columns that are used by the ORDER BY.
Note that the large index you have starting with kategorije_id can't be used for sorting (although it will help somewhat with the where clause) because the columns you are sorting by are at the end of the index.
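An index on just the sort columns, as this answer describes, might be declared like this. The name is hypothetical, and whether the optimizer prefers it over an index that serves the filtering depends on how many rows the WHERE clause eliminates.

```sql
-- Hypothetical index name. The ORDER BY is DESC on all three columns,
-- so the index can be read backwards instead of sorting the rows.
ALTER TABLE artikli
    ADD INDEX idx_sort (izdvojen, izdvojen_kad, datum);
```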
Actually, the ORDER BY is not the basis for the index you want; the criteria that most narrow the query are. Filter down to the smaller set of data first and you'll read a smaller part of the table. I would change the WHERE clause a bit, but you'll know your data best. Put your smallest expected condition first and ensure an index is based on that, something like:
WHERE
artikli.izbrisan = 0
and artikli.zavrseno = 0
and artikli.kategorije_id IN (18)
and artikli.prodano != 3
and artikli.od_id != 0
and ( artikli.sajt = 0
or ( artikli.sajt = 1
and artikli.dostupnost='svugdje')
)
and have a compound index on (izbrisan, zavrseno, kategorije_id). I've moved the other != comparisons to the end as they are not specific key values; instead, they match ALL rows EXCEPT the value in question.
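As a sketch, that compound index could be declared as follows. The index name is made up, and zavrseno does not appear in the CREATE TABLE shown in the question; it is assumed to exist because the query filters on it.

```sql
-- Hypothetical index name; the three equality-tested columns from the WHERE clause.
ALTER TABLE artikli
    ADD INDEX idx_izbrisan_zavrseno_kat (izbrisan, zavrseno, kategorije_id);
```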