Nodejs Mysql optimizing query - mysql

I am using the mysql2 module in Node.js v8.9.4.
This is my function to get a message from the message queue that meets these conditions:
status == 0
the count of rows with the same botId and status == 1 is less than 10
retry_after in the wait table, both for botId + chatId and for botId alone, is less than NOW (a timestamp)
there is no row with the same chatId and status == 1
static async Find(activeMessageIds, maxActiveMsgPerBot) {
    let params = [maxActiveMsgPerBot];
    let filterActiveMessageIds = ' ';
    let time = Util.GetTimeStamp();
    if (activeMessageIds && activeMessageIds.length) {
        filterActiveMessageIds = 'q.id NOT IN (?) AND ';
        params.push(activeMessageIds);
    }
    let q =
        `select q.*
        from bot_message_queue q
        left join bot_message_queue_wait w on q.botId=w.botId AND q.chatId=w.chatId
        left join bot_message_queue_wait w2 on q.botId=w2.botId AND w2.chatId=0
        where
        q.status=0 AND
        q.botId NOT IN (select q2.botId from bot_message_queue q2 where q2.status=1 group by q2.botId HAVING COUNT(q2.botId)>?) AND
        ${filterActiveMessageIds}
        q.chatId NOT IN (select q3.chatId from bot_message_queue q3 where q3.status=1 group by q3.chatId) AND
        (w.retry_after IS NULL OR w.retry_after <= ?) AND
        (w2.retry_after IS NULL OR w2.retry_after <= ?)
        order by q.priority DESC,q.id ASC
        limit 1;`;
    params.push(time);
    params.push(time);
    let con = await DB.connection();
    let result = await DB.query(q, params, con);
    if (result && result.length) {
        result = result[0];
        let updateQ = `update bot_message_queue set status=1 where id=?;`;
        await DB.query(updateQ, [result.id], con);
    } else
        result = null;
    con.release();
    return result;
}
This query runs fine on my local dev system. It also runs fine in the server's phpMyAdmin, in a couple of milliseconds.
BUT when it runs through Node.js + mysql2, the CPU usage goes up to 100%.
There are only 2K rows in this table.
CREATE TABLE IF NOT EXISTS `bot_message_queue` (
`id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`botId` int(10) UNSIGNED NOT NULL,
`chatId` varchar(50) CHARACTER SET utf8 NOT NULL,
`type` varchar(50) DEFAULT NULL,
`message` longtext NOT NULL,
`add_date` int(10) UNSIGNED NOT NULL,
`status` tinyint(2) UNSIGNED NOT NULL DEFAULT '0' COMMENT '0=waiting,1=sendig,2=sent,3=error',
`priority` tinyint(1) UNSIGNED NOT NULL DEFAULT '5' COMMENT '5=normal messages,<5 = bulk messages',
`delay_after` int(10) UNSIGNED NOT NULL DEFAULT '1000',
`send_date` int(10) UNSIGNED DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `botId` (`botId`,`status`),
KEY `botId_2` (`botId`,`chatId`,`status`,`priority`),
KEY `chatId` (`chatId`,`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE IF NOT EXISTS `bot_message_queue_wait` (
`botId` int(10) UNSIGNED NOT NULL,
`chatId` varchar(50) CHARACTER SET utf8 NOT NULL,
`retry_after` int(10) UNSIGNED NOT NULL,
PRIMARY KEY (`botId`,`chatId`),
KEY `retry_after` (`retry_after`),
KEY `botId` (`botId`,`chatId`,`retry_after`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
UPDATE: Real table data here
UPDATE 2:
FetchMessageTime :
- Min : 1788 ms
- Max : 44285 ms
- Average : 20185.4 ms
The max was around 20 ms until yesterday :( Now it's 40 seconds!!!
UPDATE 3: I merged these 2 joins and wheres:
left join bot_message_queue_wait w on q.botId=w.botId AND q.chatId=w.chatId
left join bot_message_queue_wait w2 on q.botId=w2.botId AND w2.chatId=0
(w.retry_after IS NULL OR w.retry_after <= ?) AND
(w2.retry_after IS NULL OR w2.retry_after <= ?)
into a single one, I hope this will work as intended!
left join bot_message_queue_wait w on q.botId=w.botId AND ( q.chatId=w.chatId OR w.chatId=0 )
For the time being I also removed these 2 WHERE conditions, and the query time went back to normal:
q.botId NOT IN (select ...)
q.chatId NOT IN (select ...)
So these 2 WHERE conditions are the choke points and need to be fixed.

NOT IN ( SELECT ... ) is difficult to optimize.
OR is hard to optimize; it often prevents effective use of an index.
In ORDER BY, mixing DESC and ASC prevents the use of an index (until 8.0). Consider changing ASC to DESC. After that, INDEX(priority, id) might help.
What is ${filterActiveMessageIds}?
The GROUP BY is not needed in
NOT IN ( SELECT q3.chatId
from bot_message_queue q3
where q3.status=1
group by q3.chatId )
INDEX(status, chatId), in this order, would benefit that subquery.
INDEX(status, botId), in this order, would benefit the other subquery (the one on q2).
More on index creation: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

I would replace the NOT IN subquery with a NOT EXISTS in this case, as it can perform better.
Switch the ORDER BY to either all DESC or all ASC
So to optimize the query, first, add these indexes:
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_status_botid_chatid_priori_id` (`status`,`botId`,`chatId`,`priority`,`id`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_priority_id` (`priority`,`id`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_botid_status` (`botId`,`status`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_chatid_status` (`chatId`,`status`);
ALTER TABLE `bot_message_queue_wait` ADD INDEX `bot_message_queue_wa_idx_chatid_botid` (`chatId`,`botId`);
Now, you can try to run this query (please note I changed the order by to all DESC, so you can change it to ASC if that's a requirement):
SELECT
q.*
FROM
bot_message_queue q
LEFT JOIN
bot_message_queue_wait w
ON q.botId = w.botId
AND q.chatId = w.chatId
LEFT JOIN
bot_message_queue_wait w2
ON q.botId = w2.botId
AND w2.chatId = 0
WHERE
q.status = 0
AND NOT EXISTS (
SELECT
1
FROM
bot_message_queue AS q21
WHERE
q21.status = 1
AND q.botId = q21.botId
GROUP BY
q21.botId
HAVING
COUNT(q21.botId) > ?
ORDER BY
NULL
)
AND NOT EXISTS (
SELECT
1
FROM
bot_message_queue AS q32
WHERE
q32.status = 1
AND q.chatId = q32.chatId
GROUP BY
q32.chatId
ORDER BY
NULL
)
AND (
w.retry_after IS NULL
OR w.retry_after <= ?
)
AND (
w2.retry_after IS NULL
OR w2.retry_after <= ?
)
ORDER BY
q.priority DESC,
q.id DESC LIMIT 1
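To check that the NOT EXISTS form selects the same rows as the original NOT IN form, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it only demonstrates that the two forms are equivalent on this data, not how MySQL will plan them. The table is trimmed to the relevant columns, the rows are made up, and the original ordering (priority DESC, id ASC) is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bot_message_queue (
    id INTEGER PRIMARY KEY,
    botId INTEGER NOT NULL,
    chatId TEXT NOT NULL,
    status INTEGER NOT NULL DEFAULT 0,
    priority INTEGER NOT NULL DEFAULT 5
);
-- Made-up rows: bot 1 has two messages in flight; chats 'a' and 'b' are busy.
INSERT INTO bot_message_queue (id, botId, chatId, status, priority) VALUES
    (1, 1, 'a', 1, 5),
    (2, 1, 'b', 1, 5),
    (3, 1, 'c', 0, 5),
    (4, 2, 'a', 0, 5),
    (5, 2, 'd', 0, 4),
    (6, 3, 'e', 0, 5);
""")

NOT_IN = """
SELECT q.id FROM bot_message_queue q
WHERE q.status = 0
  AND q.botId NOT IN (SELECT q2.botId FROM bot_message_queue q2
                      WHERE q2.status = 1
                      GROUP BY q2.botId HAVING COUNT(*) > ?)
  AND q.chatId NOT IN (SELECT q3.chatId FROM bot_message_queue q3
                       WHERE q3.status = 1)
ORDER BY q.priority DESC, q.id ASC
"""

NOT_EXISTS = """
SELECT q.id FROM bot_message_queue q
WHERE q.status = 0
  AND NOT EXISTS (SELECT 1 FROM bot_message_queue q2
                  WHERE q2.status = 1 AND q2.botId = q.botId
                  GROUP BY q2.botId HAVING COUNT(*) > ?)
  AND NOT EXISTS (SELECT 1 FROM bot_message_queue q3
                  WHERE q3.status = 1 AND q3.chatId = q.chatId)
ORDER BY q.priority DESC, q.id ASC
"""

max_active = 1  # assumed limit of in-flight messages per bot for this demo
not_in_ids = [row[0] for row in conn.execute(NOT_IN, (max_active,))]
not_exists_ids = [row[0] for row in conn.execute(NOT_EXISTS, (max_active,))]
print(not_in_ids, not_exists_ids)  # [6, 5] [6, 5]
```

Message 3 is skipped because bot 1 is over the limit, and message 4 is skipped because chat 'a' already has a message in flight. On the real server, compare the EXPLAIN output of both forms before settling on one.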

Related

improve mysql select query with order by option

I have the following table with around 10 million records.
I am using the following query to retrieve data, but it takes more than 4 to 5 seconds to return a response.
Is there any way to improve the query?
CREATE TABLE `master` (
`organizationName` varchar(200) NOT NULL DEFAULT '',
`organizationNameQuery` varchar(200) DEFAULT NULL,
`organizationLinkedinHandle` varchar(200) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
`organizationDomain` varchar(110) NOT NULL DEFAULT '',
`source` varchar(10) NOT NULL DEFAULT '',
`modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY `master_inx` (`organizationName`(80),`organizationDomain`(80),`organizationLinkedinHandle`(80),`organizationNameQuery`(80),`source`),
KEY `organizationDomain` (`organizationDomain`),
KEY `domainWithModified` (`organizationDomain`,`modified`),
KEY `modifiedInx` (`modified`)
);
Query:
SELECT *
FROM (SELECT *
FROM Organizations.master
where ( ( organizationDomain like 'linkedin.com'
|| organizationNameQuery = 'linkedin.com')
and source like 'MY_SOURCE') ) M
ORDER BY M.modified DESC limit 1;
1 row in set (4.69 sec)
UPDATE
I found that by breaking up the OR operator I get the result faster.
For example:
SELECT *
FROM (SELECT *
FROM Organizations.master
where ( ( organizationDomain like 'linkedin.com')
and source like 'MY_SOURCE') ) M
ORDER BY M.modified DESC limit 1;
1 row in set (0.00 sec)
SELECT *
FROM (SELECT *
FROM Organizations.master
where ( (organizationNameQuery = 'linkedin.com')
and source like 'MY_SOURCE') ) M
ORDER BY M.modified DESC limit 1;
1 row in set (0.00 sec)
Use OR, not || in that context.
The performance villain is OR. Turn the OR into UNION:
( SELECT *
FROM Organizations.master
WHERE organizationDomain = 'linkedin.com'
AND source = 'MY_SOURCE'
ORDER BY modified DESC limit 1
) UNION ALL
( SELECT *
FROM Organizations.master
WHERE organizationNameQuery = 'linkedin.com'
AND source = 'MY_SOURCE'
ORDER BY modified DESC limit 1
)
ORDER BY modified DESC LIMIT 1;
Notes:
This formulation is likely to take about 0.00s to run.
The ORDER BY and LIMIT show up 3 times.
If you need OFFSET, things get a little tricky.
Change back to LIKE if you allow users to enter wildcards.
A leading wildcard would not be efficient.
UNION ALL is faster than UNION (aka UNION DISTINCT).
It needs two new composite indexes; the order of the 2 columns is not critical:
INDEX(organizationDomain, source),
INDEX(organizationNameQuery, source)
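A minimal sketch of the OR-to-UNION rewrite, using Python's built-in sqlite3 module as a stand-in for MySQL. The rows are made up and the table is trimmed to the relevant columns. SQLite does not accept MySQL's parenthesized `( SELECT ... ) UNION ALL ( SELECT ... )` form, so each branch is wrapped in a derived table here. The point is only that both formulations return the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master (
    organizationName TEXT NOT NULL,
    organizationNameQuery TEXT,
    organizationDomain TEXT NOT NULL,
    source TEXT NOT NULL,
    modified TEXT NOT NULL
);
-- The two composite indexes suggested in the answer.
CREATE INDEX domain_source ON master (organizationDomain, source);
CREATE INDEX namequery_source ON master (organizationNameQuery, source);
INSERT INTO master VALUES
    ('LinkedIn',      'linkedin',     'linkedin.com', 'MY_SOURCE', '2020-01-01 00:00:00'),
    ('LinkedIn Corp', 'linkedin.com', 'lnkd.in',      'MY_SOURCE', '2021-06-01 00:00:00'),
    ('Other',         'other',        'other.com',    'MY_SOURCE', '2022-01-01 00:00:00'),
    ('LinkedIn',      'linkedin.com', 'linkedin.com', 'OTHER',     '2023-01-01 00:00:00');
""")

WITH_OR = """
SELECT organizationName, modified FROM master
WHERE (organizationDomain = ? OR organizationNameQuery = ?) AND source = ?
ORDER BY modified DESC LIMIT 1
"""

# Each branch picks its own newest row; the outer query picks the newer of the two.
WITH_UNION = """
SELECT organizationName, modified FROM (
    SELECT * FROM (SELECT * FROM master
                   WHERE organizationDomain = ? AND source = ?
                   ORDER BY modified DESC LIMIT 1)
    UNION ALL
    SELECT * FROM (SELECT * FROM master
                   WHERE organizationNameQuery = ? AND source = ?
                   ORDER BY modified DESC LIMIT 1)
)
ORDER BY modified DESC LIMIT 1
"""

term, source = "linkedin.com", "MY_SOURCE"
or_row = conn.execute(WITH_OR, (term, term, source)).fetchone()
union_row = conn.execute(WITH_UNION, (term, source, term, source)).fetchone()
print(or_row, union_row)
```

Both queries return the 'LinkedIn Corp' row here. The speedup itself depends on MySQL using one composite index per branch, which this sketch does not measure.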
Having checked the query, I think you can drop the LIKE operator and use = instead.
SELECT * FROM (
SELECT * FROM Organizations.master
where ( (organizationDomain = 'linkedin.com' ||
organizationNameQuery = 'linkedin.com')
and source = 'MY_SOURCE')
) M
ORDER BY M.modified DESC limit 1

SQL query to select all rows with max column value

CREATE TABLE `user_activity` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`type` enum('request','response') DEFAULT NULL,
`data` longtext NOT NULL,
`created_at` datetime DEFAULT NULL,
`source` varchar(255) DEFAULT NULL,
`task_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
);
I have this data:-
Now I need to select all rows for user_id=527 where created_at value is the maximum. So I need the last 3 rows in this image.
I wrote this query:-
SELECT *
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND created_at = (SELECT Max(created_at)
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask',
'StopMonitoringUserTask' ));
This is very inefficient because I am running the exact same query again as an inner query except that it disregards created_at. What's the right way to do this?
I would use a correlated subquery:
SELECT ua.*
FROM user_activity ua
WHERE ua.user_id = 527 AND source = 'E1' AND
ua.task_name IN ('GetReportTask', 'StopMonitoringUserTask' ) AND
ua.created_at = (SELECT MAX(ua2.created_at)
FROM user_activity ua2
WHERE ua2.user_id = ua.user_id AND
ua2.source = ua.source AND
ua2.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
);
Although this might seem inefficient, you can create an index on user_activity(user_id, source, task_name, created_at). With this index, the query should have decent performance.
Order by created_at desc and limit your query to return 1 row.
SELECT *
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
ORDER BY created_at DESC
LIMIT 1;
I used EverSQL and applied my own changes to come up with this single-select query that uses self-join:-
SELECT *
FROM user_activity AS ua1
LEFT JOIN user_activity AS ua2
ON ua2.user_id = ua1.user_id
AND ua2.source = ua1.source
AND ua2.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND ua1.created_at < ua2.created_at
WHERE ua1.user_id = 527
AND ua1.source = 'E1'
AND ua1.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND ua2.created_at IS NULL;
However, I noticed that the response times of both queries were similar. I tried to use Explain to identify any performance differences; and from what I understood from its output, there are no noticeable differences because proper indexing is in place. So for readability and maintainability, I'll just use the nested query.
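For comparison, here is a small sketch of the three approaches from this thread on made-up data, using Python's built-in sqlite3 module as a stand-in for MySQL. The table is trimmed to the relevant columns. It suggests that the subquery and the self-join both keep all rows tied at the maximum created_at, while ORDER BY ... LIMIT 1 keeps only one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_activity (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    source TEXT,
    task_name TEXT,
    created_at TEXT
);
-- Made-up rows: ids 3, 4 and 5 share the latest matching created_at.
INSERT INTO user_activity VALUES
    (1, 527, 'E1', 'GetReportTask',          '2019-01-01 10:00:00'),
    (2, 527, 'E1', 'StopMonitoringUserTask', '2019-01-01 10:00:00'),
    (3, 527, 'E1', 'GetReportTask',          '2019-01-02 10:00:00'),
    (4, 527, 'E1', 'StopMonitoringUserTask', '2019-01-02 10:00:00'),
    (5, 527, 'E1', 'GetReportTask',          '2019-01-02 10:00:00'),
    (6, 528, 'E1', 'GetReportTask',          '2019-01-03 10:00:00'),
    (7, 527, 'E1', 'OtherTask',              '2019-01-04 10:00:00');
""")

TASKS = "('GetReportTask', 'StopMonitoringUserTask')"

SUBQUERY = f"""
SELECT ua.id FROM user_activity ua
WHERE ua.user_id = 527 AND ua.source = 'E1' AND ua.task_name IN {TASKS}
  AND ua.created_at = (SELECT MAX(ua2.created_at) FROM user_activity ua2
                       WHERE ua2.user_id = ua.user_id
                         AND ua2.source = ua.source
                         AND ua2.task_name IN {TASKS})
ORDER BY ua.id
"""

SELF_JOIN = f"""
SELECT ua1.id FROM user_activity ua1
LEFT JOIN user_activity ua2
       ON ua2.user_id = ua1.user_id
      AND ua2.source = ua1.source
      AND ua2.task_name IN {TASKS}
      AND ua1.created_at < ua2.created_at
WHERE ua1.user_id = 527 AND ua1.source = 'E1' AND ua1.task_name IN {TASKS}
  AND ua2.created_at IS NULL
ORDER BY ua1.id
"""

LIMIT_ONE = f"""
SELECT id FROM user_activity
WHERE user_id = 527 AND source = 'E1' AND task_name IN {TASKS}
ORDER BY created_at DESC LIMIT 1
"""

subquery_ids = [r[0] for r in conn.execute(SUBQUERY)]
self_join_ids = [r[0] for r in conn.execute(SELF_JOIN)]
limit_ids = [r[0] for r in conn.execute(LIMIT_ONE)]
print(subquery_ids, self_join_ids, len(limit_ids))  # [3, 4, 5] [3, 4, 5] 1
```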

Updating a column based on other 2 column's values

I have user_contents table. Here is the DDL
CREATE TABLE `user_contents` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`content_type` int(11) NOT NULL,
`order_id` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
CONSTRAINT `user_contents_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`)
)
order_id is a newly added column. I need to set order_id based on the values of content_type and user_id. content_type can be 0 or 1.
Based on content_type and user_id, I have to update order_id as shown in the result above. For each combination of user_id and content_type, order_id needs to be incremented starting from 0.
Can someone help me with the update query?
I am using MySQL version 5.7.23-0ubuntu0.16.04.1.
Edit: The requirement has changed slightly. user_id is no longer an int; it is now a varchar holding values like DAL001, HAL001, etc.
Try the following query, to update order_id values. This employs User-defined session variables.
This query basically consists of two parts. First part determines order_id for every id, based on the defined logic.
Second part joins with the user_contents table using id and updates the order_id values.
UPDATE user_contents AS uc
JOIN
(
SELECT
dt.id,
@oid := IF(@uid = dt.user_id AND
@ct = dt.content_type,
@oid + 1,
0) AS order_id,
@uid := dt.user_id,
@ct := dt.content_type
FROM
(
SELECT
id,
user_id,
content_type
FROM user_contents
ORDER BY user_id, content_type
) AS dt
CROSS JOIN (SELECT @oid := 0,
@uid := 0,
@ct := 0) AS user_init_params
) AS dt2 ON dt2.id = uc.id
SET uc.order_id = dt2.order_id
It would be better to use a view to achieve what you want. Here is one option that should work without window functions and without session variables:
CREATE VIEW user_contents_view AS (
SELECT
id,
user_id,
content_type,
(SELECT COUNT(*) FROM user_contents uc2
WHERE uc2.user_id = uc1.user_id AND
uc2.content_type = uc1.content_type AND
uc2.id < uc1.id) order_id
FROM user_contents uc1
);
The main problem with suggesting an update here is that the order_id column is apparently derived data. This means you might have to run more updates again in the future. A view avoids this problem completely by generating the output you want only when you actually need it.
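A quick sketch of what the view produces, using Python's built-in sqlite3 module as a stand-in for MySQL. The rows are made up, and user_id is a string to match the edited requirement (values like DAL001). The table is trimmed to the relevant columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_contents (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    content_type INTEGER NOT NULL
);
INSERT INTO user_contents (id, user_id, content_type) VALUES
    (1, 'DAL001', 0),
    (2, 'DAL001', 0),
    (3, 'DAL001', 1),
    (4, 'HAL001', 0),
    (5, 'DAL001', 0),
    (6, 'HAL001', 0);

-- order_id = number of earlier rows with the same user_id and content_type.
CREATE VIEW user_contents_view AS
SELECT uc1.id,
       uc1.user_id,
       uc1.content_type,
       (SELECT COUNT(*) FROM user_contents uc2
        WHERE uc2.user_id = uc1.user_id
          AND uc2.content_type = uc1.content_type
          AND uc2.id < uc1.id) AS order_id
FROM user_contents uc1;
""")

rows = conn.execute(
    "SELECT id, user_id, content_type, order_id FROM user_contents_view ORDER BY id"
).fetchall()
for row in rows:
    print(row)
# (1, 'DAL001', 0, 0)
# (2, 'DAL001', 0, 1)
# (3, 'DAL001', 1, 0)
# (4, 'HAL001', 0, 0)
# (5, 'DAL001', 0, 2)
# (6, 'HAL001', 0, 1)
```

Each (user_id, content_type) group is numbered from 0 in id order, and the numbering stays correct as rows are inserted or deleted because nothing is stored.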
string SQL = "SELECT MAX(order_id) FROM user_contents
WHERE user_id = 'label1' AND content_type ='label2'";
string sql = "UPDATE user_contents SET order_id='" +bb+ "' WHERE sl='1'";
After getting the maximum order_id, increment it, store it in a variable, and write it back with the UPDATE query.

Fetch data from two tables and differentiate between them

I have two tables and want to display rows from both of them on the same page, ordered by creation date.
Here my query:
SELECT R.*, R.id as id_return
FROM return R
UNION
ALL
SELECT A.*, A.id as id_buy
FROM buy A
WHERE
R.id_buyer = '$user' AND R.id_buyer = A.id_buyer AND (R.stats='1' OR R.stats='3') OR A.stats='4'
ORDER
BY R.date, A.date DESC LIMIT $from , 20
With this query I get this error message:
Warning: mysqli_fetch_array() expects parameter 1 to be mysqli_result, boolean given in ...
And here is how I think I can differentiate between the results (knowing whether a row comes from the table RETURN or from the table BUY):
if(isset($hist_rows["id_return"])) {
// show RETURN rows
} else {
// show BUY rows
}
What is wrong with the query, and is this method of differentiating between the tables correct?
EDIT
Here my tables sample:
CREATE TABLE IF NOT EXISTS `return` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_buyer` INT(12) NOT NULL,
`id_seller` INT(12) NOT NULL,
`message` TEXT NOT NULL,
`stats` INT(1) NOT NULL,
`date` varchar(30) NOT NULL,
`update` varchar(30),
PRIMARY KEY (`id`)
)
CREATE TABLE IF NOT EXISTS `buy` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_buyer` INT(12) NOT NULL,
`product` INT(12) NOT NULL,
`title` VARCHAR(250) NOT NULL,
`stats` INT(1) NOT NULL,
`date` varchar(30) NOT NULL,
PRIMARY KEY (`id`)
)
Be sure that both SELECTs, on return and on buy, produce the same number of columns with matching types in the same order. If not, the query fails.
Try selecting only the columns you need from both tables, and make sure they correspond in number and type:
SELECT R.col1, R.col2, R.id as id_return
FROM return R
UNION ALL
SELECT A.col1, A.col2, A.id as id_buy
FROM buy A
WHERE
........
Looking at your code, you should select the same number and types of columns from both tables, as in the sample below
(where I have added the columns that differ, selecting null from the table in which they are not present).
I have also applied the proper WHERE condition to each table.
SELECT
    'from return' AS `source_table`
    , R.`id`
    , R.`id_buyer`
    , null AS product
    , null AS title
    , R.`id_seller` AS id_seller
    , R.`message`
    , R.`stats`
    , R.`date`
    , R.`update`
FROM `return` R
WHERE R.id_buyer = '$user'
AND (R.stats='1' OR R.stats='3')
UNION ALL
SELECT
    'from buy'
    , A.`id`
    , A.`id_buyer`
    , A.`product`
    , A.`title`
    , null
    , null
    , A.`stats`
    , A.`date`
    , null
FROM buy A
WHERE
A.id_buyer = '$user'
AND A.stats='4'
ORDER BY `source_table`, date DESC LIMIT $from , 20
To retrieve the value of the first column in your case, use
echo $hist_rows["source_table"];
Otherwise, if the two tables are related in some way, you should look at a join (left join) to link the two tables and select the related columns
(but this is another question).
If you do need a left join, you can try
SELECT
R.`id`
, R.`id_buyer`
, R.`id_seller` as id_seller
, R.`message`
, R.`stats`
, R.`date`
, R.`update`
, A.`id`
, A.`id_buyer`
, A.`product`
, A.`title`
, null
, null
, A.`stats`
, A.`date`
FROM `return` R
LEFT JOIN buy A ON R.id_buyer = A.id_buyer
AND R.id_buyer = '$user'
AND (R.stats='1' OR R.stats='3')
AND A.stats='4'
ORDER BY R.date DESC LIMIT $from , 20
When you use union all, the queries need to have exactly the same columns in the same order. If the types are not quite the same, then they are converted to the same type.
So, you don't want union all. I'm guessing you want a join. Something like this:
SELECT r.col1, r.col2, . . ., r.id as id_return,
       b.col1, b.col2, . . ., b.id as id_buy
FROM `return` r JOIN
     buy b
     ON r.id_buyer = b.id_buyer
WHERE r.id_buyer = '$user' and
      (r.stats in (1, 3) OR b.stats = 4)
ORDER BY r.date, b.date DESC
LIMIT $from, 20;
This query is only a guess as to what you might want.
Since you're using a union, select a string that you set identifying each query:
SELECT 'R', R.*, R.id as id_return
FROM return R
UNION
ALL
SELECT 'A', A.*, A.id as id_buy
This way your string 'R' or 'A' is the first column, showing you where it came from. We can't really know why it's failing without the full query, but I'd guess your $from might be empty?
As for your
Warning: mysqli_fetch_array() expects parameter 1 to be mysqli_result, boolean given in ...
Run the query directly first to get the sql sorted out before putting it into your PHP script. The boolean false indicates the query failed.
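To illustrate the UNION ALL approach with a literal column that marks the source table, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The tables are trimmed, the rows are made up, and the values are passed as bound parameters instead of being interpolated like '$user' and $from in the question. Columns that exist in only one table are padded with NULL so both SELECTs line up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "return" (
    id INTEGER PRIMARY KEY,
    id_buyer INTEGER NOT NULL,
    message TEXT NOT NULL,
    stats INTEGER NOT NULL,
    date TEXT NOT NULL
);
CREATE TABLE buy (
    id INTEGER PRIMARY KEY,
    id_buyer INTEGER NOT NULL,
    product INTEGER NOT NULL,
    stats INTEGER NOT NULL,
    date TEXT NOT NULL
);
INSERT INTO "return" VALUES
    (1, 7, 'broken', 1, '2020-01-03'),
    (2, 7, 'late',   2, '2020-01-05'),
    (3, 8, 'other',  1, '2020-01-06');
INSERT INTO buy VALUES
    (1, 7, 100, 4, '2020-01-04'),
    (2, 7, 101, 1, '2020-01-07');
""")

HISTORY = """
SELECT 'return' AS source_table, id, id_buyer, NULL AS product, message, stats, date
FROM "return"
WHERE id_buyer = ? AND stats IN (1, 3)
UNION ALL
SELECT 'buy', id, id_buyer, product, NULL, stats, date
FROM buy
WHERE id_buyer = ? AND stats = 4
ORDER BY date DESC
LIMIT 20 OFFSET ?
"""

user, offset = 7, 0  # hypothetical buyer id and page offset
history = conn.execute(HISTORY, (user, user, offset)).fetchall()
sources = [(row[0], row[1]) for row in history]
print(sources)  # [('buy', 1), ('return', 1)]
```

The application can then branch on the first column of each row instead of testing whether an id_return key is set.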

Stock calculation during sales

I have a StockCard table with below schema:
create table StockCard(
id int(9) zerofill primary key auto_increment,
ref_page smallint(3) not null,
ref_number int(9) not null,
sur_key int unsigned null,
description varchar(255),
warehouse_product_id int(9) zerofill not null,
sc_date datetime not null,
qty int not null,
price double(15,2) not null,
reserved_qty int not null,
left_qty int not null,
status boolean not null, /* true = buy, false = sell*/
CONSTRAINT FOREIGN KEY (warehouse_product_id) REFERENCES Warehouse_Product(id),
CONSTRAINT FOREIGN KEY (ref_page) REFERENCES Page(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
For now, when I want to sell a product, I use LIMIT to check whether there is enough left_qty (stock) to sell that product. Below is my query, which runs in a loop for as long as avail_qty is still below ordered_qty:
SELECT SUM(StockCard.left_qty) AS avail_qty FROM StockCard
LEFT JOIN Warehouse_Product ON Warehouse_Product.id = StockCard.warehouse_product_id
WHERE Warehouse_Product.product_id = {product_id}
AND StockCard.status = 1 AND left_qty > 0
ORDER BY sc_date ASC, id ASC
LIMIT {row_limit}
-- this row_limit increases on every loop iteration as long as the SUM is smaller than the ordered qty
My question: does this approach actually improve query performance, or should I just drop the LIMIT? Feel free to suggest another approach.
Here is the simplified version of my issue: http://sqlfiddle.com/#!9/36ef5
Don't use LIMIT in a loop; it generates a lot of queries. See the fiddle here. First calculate the sum of the previous rows (x2, minqty) and the sum of the previous and current rows (x1, maxqty), then use them to filter the rows you need.
SELECT x3.*
FROM
( SELECT sc1.*, IFNULL(SUM(sc2.left_qty),0) maxqty
FROM StockCard sc1
LEFT JOIN StockCard sc2 ON sc2.id <= sc1.id
WHERE sc1.status = 1 AND sc2.status = 1 AND sc1.warehouse_product_id = 1 AND sc2.warehouse_product_id = 1
GROUP BY sc1.id
ORDER BY sc2.sc_date ASC, sc2.id ASC ) x1
LEFT JOIN
( SELECT sc1.*, IFNULL(SUM(sc2.left_qty),0)+1 minqty
FROM StockCard sc1
LEFT JOIN StockCard sc2 ON sc2.id < sc1.id
WHERE sc1.status = 1 AND sc2.status = 1 AND sc1.warehouse_product_id = 1 AND sc2.warehouse_product_id = 1
GROUP BY sc1.id
ORDER BY sc2.sc_date ASC, sc2.id ASC ) x2
ON x1.id = x2.id
LEFT JOIN StockCard x3 ON x3.id <= x1.id
AND x3.warehouse_product_id = x1.warehouse_product_id
WHERE IFNULL(x2.minqty,0) <= 15 -- change this to the ordered quantity
AND x1.maxqty >= 15 -- change this to the ordered quantity
For example, when the table contains left_qty values 15, 2 and 3, minqty and maxqty would be:
qty  min  max
15   1    15
2    16   17
3    18   20
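The minqty/maxqty arithmetic can be sketched outside SQL as a running total. This is only an illustration in Python of the idea behind the query, not a replacement for it. The function names qty_ranges and rows_needed are made up for the example, and the rows are assumed to be already sorted in FIFO order with left_qty > 0.

```python
from itertools import accumulate


def qty_ranges(left_qtys):
    """Return (qty, minqty, maxqty) per row, with rows given in FIFO order."""
    running = list(accumulate(left_qtys))  # sum of previous rows plus current row
    return [(qty, total - qty + 1, total) for qty, total in zip(left_qtys, running)]


def rows_needed(left_qtys, ordered_qty):
    """Indexes of the rows consumed to cover ordered_qty; empty if stock is short."""
    for index, (_, minqty, maxqty) in enumerate(qty_ranges(left_qtys)):
        # The order is fully covered by the row whose range contains ordered_qty.
        if minqty <= ordered_qty <= maxqty:
            return list(range(index + 1))
    return []


print(qty_ranges([15, 2, 3]))       # [(15, 1, 15), (2, 16, 17), (3, 18, 20)]
print(rows_needed([15, 2, 3], 16))  # [0, 1]
print(rows_needed([15, 2, 3], 21))  # []
```

An order of 16 lands in the second row's range (16 to 17), so the first two rows are consumed. An order of 21 exceeds the total stock of 20, so no row matches.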