SQL query to select all rows with max column value - mysql

CREATE TABLE `user_activity` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`type` enum('request','response') DEFAULT NULL,
`data` longtext NOT NULL,
`created_at` datetime DEFAULT NULL,
`source` varchar(255) DEFAULT NULL,
`task_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
);
I have this data:-
Now I need to select all rows for user_id=527 where created_at value is the maximum. So I need the last 3 rows in this image.
I wrote this query:-
SELECT *
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND created_at = (SELECT Max(created_at)
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask',
'StopMonitoringUserTask' ));
This is very inefficient because I am running the exact same query again as an inner query except that it disregards created_at. What's the right way to do this?

I would use a correlated subquery:
SELECT ua.*
FROM user_activity ua
WHERE ua.user_id = 527 AND source = 'E1' AND
ua.task_name IN ('GetReportTask', 'StopMonitoringUserTask' ) AND
ua.created_at = (SELECT MAX(ua2.created_at)
FROM user_activity ua2
WHERE ua2.user_id = ua.user_id AND
ua2.source = ua.source AND
ua2.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
);
Although this might seem inefficient, you can create an index on user_activity(user_id, source, task_name, created_at). With this index, the query should have decent performance.

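Order by created_at DESC and limit the query to one row. Note that this returns a single row only, so it fits only when one row holds the maximum created_at; if several rows tie for the maximum, as in the question, only one of them comes back.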
SELECT *
FROM user_activity
WHERE user_id = 527
AND source = 'E1'
AND task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
ORDER BY created_at DESC
LIMIT 1;

I used EverSQL and applied my own changes to come up with this single-select query that uses self-join:-
SELECT ua1.*
FROM user_activity AS ua1
LEFT JOIN user_activity AS ua2
ON ua2.user_id = ua1.user_id
AND ua2.source = ua1.source
AND ua2.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND ua1.created_at < ua2.created_at
WHERE ua1.user_id = 527
AND ua1.source = 'E1'
AND ua1.task_name IN ( 'GetReportTask', 'StopMonitoringUserTask' )
AND ua2.created_at IS NULL;
However, I noticed that the response times of both queries were similar. I ran EXPLAIN on both to look for performance differences, and from what I understood of its output there are no noticeable differences because proper indexing is in place. So for readability and maintainability, I'll just use the nested query.
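To make the comparison between these answers concrete, here is a minimal sketch that runs the nested query, the self-join and the LIMIT 1 variant against a tiny in-memory table. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows and the trimmed-down column list are made up for illustration, so it shows only which rows come back, not MySQL performance.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented test data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_activity ("
    " id INTEGER PRIMARY KEY, user_id INTEGER, source TEXT,"
    " task_name TEXT, created_at TEXT)"
)
rows = [
    (1, 527, "E1", "GetReportTask", "2018-01-01 10:00:00"),
    (2, 527, "E1", "GetReportTask", "2018-01-02 10:00:00"),
    (3, 527, "E1", "StopMonitoringUserTask", "2018-01-02 10:00:00"),
    (4, 527, "E2", "GetReportTask", "2018-01-03 10:00:00"),  # other source
    (5, 100, "E1", "GetReportTask", "2018-01-04 10:00:00"),  # other user
]
conn.executemany("INSERT INTO user_activity VALUES (?, ?, ?, ?, ?)", rows)

TASKS = "('GetReportTask', 'StopMonitoringUserTask')"

# Nested query: compare created_at with the filtered maximum.
nested = f"""
    SELECT id FROM user_activity
    WHERE user_id = 527 AND source = 'E1' AND task_name IN {TASKS}
      AND created_at = (SELECT MAX(created_at) FROM user_activity
                        WHERE user_id = 527 AND source = 'E1'
                          AND task_name IN {TASKS})
    ORDER BY id
"""

# Self-join (anti-join): keep rows for which no later matching row exists.
anti_join = f"""
    SELECT ua1.id
    FROM user_activity AS ua1
    LEFT JOIN user_activity AS ua2
           ON ua2.user_id = ua1.user_id
          AND ua2.source = ua1.source
          AND ua2.task_name IN {TASKS}
          AND ua1.created_at < ua2.created_at
    WHERE ua1.user_id = 527 AND ua1.source = 'E1'
      AND ua1.task_name IN {TASKS}
      AND ua2.id IS NULL
    ORDER BY ua1.id
"""

# ORDER BY ... LIMIT 1: returns one row even when several rows tie.
limit_one = f"""
    SELECT id FROM user_activity
    WHERE user_id = 527 AND source = 'E1' AND task_name IN {TASKS}
    ORDER BY created_at DESC LIMIT 1
"""

nested_ids = [r[0] for r in conn.execute(nested)]
anti_join_ids = [r[0] for r in conn.execute(anti_join)]
limit_ids = [r[0] for r in conn.execute(limit_one)]
print(nested_ids, anti_join_ids, limit_ids)
```

On this sample data the nested query and the self-join agree on the tied rows, while the LIMIT 1 variant returns just one of them.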

Related

Nodejs Mysql optimizing query

I am using mysql2 module in nodejs v8.9.4.
This is my function to get a message from the message queue that meets these conditions:
status==0
the count of rows for its botId with status==1 is less than 10
retry_after in the wait table, both for botId+chatId and for botId alone, is earlier than NOW (timestamp)
there is no row with the same chatId and status==1
static async Find(activeMessageIds, maxActiveMsgPerBot) {
    let params = [maxActiveMsgPerBot];
    let filterActiveMessageIds = ' ';
    let time = Util.GetTimeStamp();
    if (activeMessageIds && activeMessageIds.length) {
        filterActiveMessageIds = 'q.id NOT IN (?) AND ';
        params.push(activeMessageIds);
    }
    let q =
        `select q.*
        from bot_message_queue q
        left join bot_message_queue_wait w on q.botId=w.botId AND q.chatId=w.chatId
        left join bot_message_queue_wait w2 on q.botId=w2.botId AND w2.chatId=0
        where
            q.status=0 AND
            q.botId NOT IN (select q2.botId from bot_message_queue q2 where q2.status=1 group by q2.botId HAVING COUNT(q2.botId)>?) AND
            ${filterActiveMessageIds}
            q.chatId NOT IN (select q3.chatId from bot_message_queue q3 where q3.status=1 group by q3.chatId) AND
            (w.retry_after IS NULL OR w.retry_after <= ?) AND
            (w2.retry_after IS NULL OR w2.retry_after <= ?)
        order by q.priority DESC,q.id ASC
        limit 1;`;
    params.push(time);
    params.push(time);
    let con = await DB.connection();
    let result = await DB.query(q, params, con);
    if (result && result.length) {
        result = result[0];
        let updateQ = `update bot_message_queue set status=1 where id=?;`;
        await DB.query(updateQ, [result.id], con);
    } else
        result = null;
    con.release();
    return result;
}
This query runs fine on my local dev system. It also runs fine in the server's phpMyAdmin, in a couple of milliseconds.
BUT when it runs through nodejs+mysql2, the CPU usage goes up to 100%.
There are only 2K rows in this table.
CREATE TABLE IF NOT EXISTS `bot_message_queue` (
`id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`botId` int(10) UNSIGNED NOT NULL,
`chatId` varchar(50) CHARACTER SET utf8 NOT NULL,
`type` varchar(50) DEFAULT NULL,
`message` longtext NOT NULL,
`add_date` int(10) UNSIGNED NOT NULL,
`status` tinyint(2) UNSIGNED NOT NULL DEFAULT '0' COMMENT '0=waiting,1=sendig,2=sent,3=error',
`priority` tinyint(1) UNSIGNED NOT NULL DEFAULT '5' COMMENT '5=normal messages,<5 = bulk messages',
`delay_after` int(10) UNSIGNED NOT NULL DEFAULT '1000',
`send_date` int(10) UNSIGNED DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `botId` (`botId`,`status`),
KEY `botId_2` (`botId`,`chatId`,`status`,`priority`),
KEY `chatId` (`chatId`,`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE IF NOT EXISTS `bot_message_queue_wait` (
`botId` int(10) UNSIGNED NOT NULL,
`chatId` varchar(50) CHARACTER SET utf8 NOT NULL,
`retry_after` int(10) UNSIGNED NOT NULL,
PRIMARY KEY (`botId`,`chatId`),
KEY `retry_after` (`retry_after`),
KEY `botId` (`botId`,`chatId`,`retry_after`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
UPDATE: Real table data here
UPDATE 2:
FetchMessageTime :
- Min : 1788 ms
- Max : 44285 ms
- Average : 20185.4 ms
The max was around 20 ms until yesterday :( Now it's 40 seconds!
UPDATE 3: I merged these 2 joins and wheres:
left join bot_message_queue_wait w on q.botId=w.botId AND q.chatId=w.chatId
left join bot_message_queue_wait w2 on q.botId=w2.botId AND w2.chatId=0
(w.retry_after IS NULL OR w.retry_after <= ?) AND
(w2.retry_after IS NULL OR w2.retry_after <= ?)
into a single one, I hope this will work as intended!
left join bot_message_queue_wait w on q.botId=w.botId AND ( q.chatId=w.chatId OR w.chatId=0 )
For the time being I also removed these two WHERE conditions, and the query time went back to normal:
q.botId NOT IN (select ...)
q.chatId NOT IN (select ...)
So these two WHERE conditions are the choke points and need to be fixed.
NOT IN ( SELECT ... ) is difficult to optimize.
OR is hard to optimize; it often prevents MySQL from using an index effectively.
In ORDER BY, mixing DESC and ASC eliminates use of an index (until 8.0). Consider changing ASC to DESC. After that, INDEX(priority, id) might help.
What is ${filterActiveMessageIds}?
The GROUP BY is not needed in
NOT IN ( SELECT q3.chatId
from bot_message_queue q3
where q3.status=1
group by q3.chatId )
INDEX(status, chatId), in this order, would benefit that subquery.
INDEX(status, botId), in this order, would benefit the other NOT IN subquery.
More on index creation: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
I would replace the NOT IN subquery with a NOT EXISTS in this case, as it can perform better.
Switch the ORDER BY to either all DESC or all ASC
So to optimize the query, first, add these indexes:
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_status_botid_chatid_priori_id` (`status`,`botId`,`chatId`,`priority`,`id`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_priority_id` (`priority`,`id`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_botid_status` (`botId`,`status`);
ALTER TABLE `bot_message_queue` ADD INDEX `bot_message_queue_idx_chatid_status` (`chatId`,`status`);
ALTER TABLE `bot_message_queue_wait` ADD INDEX `bot_message_queue_wa_idx_chatid_botid` (`chatId`,`botId`);
Now, you can try to run this query (please note I changed the order by to all DESC, so you can change it to ASC if that's a requirement):
SELECT
q.*
FROM
bot_message_queue q
LEFT JOIN
bot_message_queue_wait w
ON q.botId = w.botId
AND q.chatId = w.chatId
LEFT JOIN
bot_message_queue_wait w2
ON q.botId = w2.botId
AND w2.chatId = 0
WHERE
q.status = 0
AND NOT EXISTS (
SELECT
1
FROM
bot_message_queue AS q21
WHERE
q21.status = 1
AND q.botId = q21.botId
GROUP BY
q21.botId
HAVING
COUNT(q21.botId) > ?
ORDER BY
NULL
)
AND NOT EXISTS (
SELECT
1
FROM
bot_message_queue AS q32
WHERE
q32.status = 1
AND q.chatId = q32.chatId
GROUP BY
q32.chatId
ORDER BY
NULL
)
AND (
w.retry_after IS NULL
OR w.retry_after <= ?
)
AND (
w2.retry_after IS NULL
OR w2.retry_after <= ?
)
ORDER BY
q.priority DESC,
q.id DESC LIMIT 1
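As a rough check that swapping NOT IN for NOT EXISTS keeps the same result, here is a minimal sketch for the chatId condition only. It uses Python's sqlite3 module as a stand-in for MySQL, with a reduced column list and invented rows, so it says nothing about MySQL's execution plan. The two forms are interchangeable here only because chatId is NOT NULL; with NULLs in the subquery result, NOT IN behaves differently.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; columns and rows are a reduced, invented sample.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bot_message_queue ("
    " id INTEGER PRIMARY KEY, botId INTEGER NOT NULL, chatId TEXT NOT NULL,"
    " status INTEGER NOT NULL, priority INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO bot_message_queue VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "a", 1, 5),  # being sent: blocks chat "a"
        (2, 1, "a", 0, 5),  # waiting, but chat "a" is busy
        (3, 1, "b", 0, 5),  # waiting, chat "b" is free
        (4, 2, "c", 0, 4),  # waiting, lower priority
    ],
)

# Original style: uncorrelated NOT IN subquery (GROUP BY dropped, it is not needed).
not_in = """
    SELECT q.id FROM bot_message_queue q
    WHERE q.status = 0
      AND q.chatId NOT IN (SELECT q3.chatId FROM bot_message_queue q3
                           WHERE q3.status = 1)
    ORDER BY q.priority DESC, q.id ASC
"""

# Rewritten style: correlated NOT EXISTS.
not_exists = """
    SELECT q.id FROM bot_message_queue q
    WHERE q.status = 0
      AND NOT EXISTS (SELECT 1 FROM bot_message_queue q3
                      WHERE q3.status = 1 AND q3.chatId = q.chatId)
    ORDER BY q.priority DESC, q.id ASC
"""

not_in_ids = [r[0] for r in conn.execute(not_in)]
not_exists_ids = [r[0] for r in conn.execute(not_exists)]
print(not_in_ids, not_exists_ids)
```

Whether NOT EXISTS is actually faster on the real table is something to confirm with EXPLAIN on the MySQL server itself.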

Fetch data from two tables and differentiate between them

I have two tables and want to display rows from both of them on the same page, ordered by creation date.
Here my query:
SELECT R.*, R.id as id_return
FROM return R
UNION
ALL
SELECT A.*, A.id as id_buy
FROM buy A
WHERE
R.id_buyer = '$user' AND R.id_buyer = A.id_buyer AND (R.stats='1' OR R.stats='3') OR A.stats='4'
ORDER
BY R.date, A.date DESC LIMIT $from , 20
With this query I get this error message:
Warning: mysqli_fetch_array() expects parameter 1 to be mysqli_result, boolean given in ...
And here how i think i can differentiate between the results: (Knowing if the result is from the table RETURN or from the table BUY)
if(isset($hist_rows["id_return"])) {
// show RETURN rows
} else {
// show BUY rows
}
What is wrong with the query, and is this method of differentiating between the tables correct?
EDIT
Here my tables sample:
CREATE TABLE IF NOT EXISTS `return` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_buyer` INT(12) NOT NULL,
`id_seller` INT(12) NOT NULL,
`message` TEXT NOT NULL,
`stats` INT(1) NOT NULL,
`date` varchar(30) NOT NULL,
`update` varchar(30),
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `buy` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_buyer` INT(12) NOT NULL,
`product` INT(12) NOT NULL,
`title` VARCHAR(250) NOT NULL,
`stats` INT(1) NOT NULL,
`date` varchar(30) NOT NULL,
PRIMARY KEY (`id`)
);
Make sure the two SELECTs on return and buy produce the same number of columns, with matching types in the same sequence; if not, the query fails.
Try selecting only the columns you need from both tables, and make sure they correspond in number and type:
SELECT R.col1, R.col2, R.id as id_return
FROM `return` R
UNION ALL
SELECT A.col1, A.col2, A.id as id_buy
FROM buy A
WHERE
........
Looking at your code, you should select the same number and type of columns from both tables, e.g. the sample below
(where I have added the columns that differ, selecting NULL from the table where they are not present).
I have also attached the proper WHERE condition to each table. Note that return is a reserved word in MySQL, so the table name needs backticks.
SELECT
  'from return' AS `source_table`
, R.`id`
, R.`id_buyer`
, NULL AS product
, NULL AS title
, R.`id_seller` AS id_seller
, R.`message`
, R.`stats`
, R.`date`
, R.`update`
FROM `return` R
WHERE R.id_buyer = '$user'
AND (R.stats='1' OR R.stats='3')
UNION ALL
SELECT
  'from buy'
, A.`id`
, A.`id_buyer`
, A.`product`
, A.`title`
, NULL
, NULL
, A.`stats`
, A.`date`
, NULL
FROM buy A
WHERE
A.id_buyer = '$user'
AND A.stats='4'
ORDER BY `source_table`, `date` DESC LIMIT $from , 20
To retrieve the value of the first column, in your case you should use
echo $hist_rows["source_table"];
Otherwise, if the two tables are related in some way, you should look at a join (LEFT JOIN) to link the two tables and select the related columns
(but that is another question).
If you do need a LEFT JOIN, you can try:
SELECT
R.`id`
, R.`id_buyer`
, R.`id_seller` as id_seller
, R.`message`
, R.`stats`
, R.`date`
, R.`update`
, A.`id` as buy_id
, A.`id_buyer` as buy_id_buyer
, A.`product`
, A.`title`
, A.`stats` as buy_stats
, A.`date` as buy_date
FROM `return` R
LEFT JOIN buy A ON R.id_buyer = A.id_buyer
AND A.stats='4'
WHERE R.id_buyer = '$user'
AND (R.stats='1' OR R.stats='3')
ORDER BY R.date DESC LIMIT $from , 20
UNION ALL requires both queries to return the same number of columns, in the same order. If the types do not quite match, they are converted to a common type.
So, you don't want UNION ALL. I'm guessing you want a join. Something like this:
SELECT r.col1, r.col2, . . ., r.id as id_return,
b.col1, b.col2, . . ., b.id as id_buy
FROM `return` r JOIN
buy b
ON r.id_buyer = b.id_buyer
WHERE r.id_buyer = '$user' and
(r.stats in (1, 3) OR b.stats = 4)
ORDER BY r.date, b.date DESC
LIMIT $from, 20;
This query is only a guess as to what you might want.
Since you're using a union, select a string that you set identifying each query:
SELECT 'R', R.*, R.id as id_return
FROM `return` R
UNION ALL
SELECT 'A', A.*, A.id as id_buy
This way your string 'R' or 'A' is the first column, showing you where it came from. We can't really know why it's failing without the full query, but I'd guess your $from might be empty?
As for your
Warning: mysqli_fetch_array() expects parameter 1 to be mysqli_result, boolean given in ...
Run the query directly first to get the sql sorted out before putting it into your PHP script. The boolean false indicates the query failed.
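To illustrate the UNION ALL approach with a literal tag column, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL and PHP. The reduced column lists, the sample rows and the parameter names `user` and `offset` are assumptions made for the example; the bound parameters also replace the `'$user'` string interpolation from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; tables are cut down and rows invented.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access by column name, like an assoc array
conn.executescript("""
    CREATE TABLE "return" (id INTEGER PRIMARY KEY, id_buyer INTEGER,
                           message TEXT, stats INTEGER, date TEXT);
    CREATE TABLE buy (id INTEGER PRIMARY KEY, id_buyer INTEGER,
                      title TEXT, stats INTEGER, date TEXT);
    INSERT INTO "return" VALUES (1, 7, 'broken', 1, '2020-01-03'),
                                (2, 7, 'ignored', 2, '2020-01-05');
    INSERT INTO buy VALUES (1, 7, 'lamp', 4, '2020-01-04'),
                           (2, 8, 'desk', 4, '2020-01-06');
""")

# Both branches return the same number of columns in the same order;
# columns missing from one table are padded with NULL, and each branch
# carries its own WHERE clause.
query = """
    SELECT 'return' AS source_table, id, id_buyer,
           NULL AS title, message, stats, date
    FROM "return"
    WHERE id_buyer = :user AND stats IN (1, 3)
    UNION ALL
    SELECT 'buy', id, id_buyer, title, NULL, stats, date
    FROM buy
    WHERE id_buyer = :user AND stats = 4
    ORDER BY date DESC
    LIMIT 20 OFFSET :offset
"""
rows = conn.execute(query, {"user": 7, "offset": 0}).fetchall()
result = [(r["source_table"], r["id"], r["date"]) for r in rows]
print(result)
```

Each row can then be routed by checking the `source_table` value instead of testing whether an `id_return` key exists.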

Enum type and MySQL

I have a simple task and I got stuck on it.
I have table login_history
`login_history_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`login_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`login_action` enum('login','logout') NOT NULL,
`user_id` int(11) unsigned NOT NULL, (this one is foreign key)
TASK: Write a query which will find a user who had most logouts on Wednesdays in September 2012.
As you can see, login_action is an enum, and I need to find which user had the most logouts on a specific day of the week. This is what I have done so far; I just need a little push in the right direction and someone to tell me where I am going wrong.
SELECT fullname FROM user WHERE user_id = (
SELECT user_id FROM login_history WHERE (user_id,login_action) = (
SELECT user_id, COUNT(login_action) FROM login_history WHERE login_action = 'logout' AND login_time = (
SELECT login_time FROM login_history WHERE YEAR(login_time) = 2012 AND MONTH(login_time) = 9 AND DAYOFWEEK(login_time) = 3)));
Try this:
select u.fullname
from (select user_id, count(*) as n
      from login_history
      where login_time >= '2012-09-01' and login_time < '2012-10-01'
        and dayofweek(login_time) = 4 -- 1 = Sunday, so Wednesday is 4
        and login_action = 'logout'
      group by user_id order by n desc limit 1) a
join user u on a.user_id = u.user_id
For good performance, make sure you have a key on login_time column.
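The weekday numbering is easy to get wrong, so here is a minimal sketch that checks it. The helper `mysql_dayofweek` is a hypothetical Python re-implementation of MySQL's DAYOFWEEK() numbering (1 = Sunday through 7 = Saturday), and the query runs on Python's sqlite3 module as a stand-in, where the closest equivalent is `strftime('%w', ...)` with 0 = Sunday. The users and login rows are invented.

```python
import sqlite3
from datetime import date


def mysql_dayofweek(d: date) -> int:
    """Mimic MySQL DAYOFWEEK(): 1 = Sunday, ..., 7 = Saturday."""
    return d.isoweekday() % 7 + 1


# 2012-09-05 was a Wednesday, 2012-09-04 a Tuesday.
wednesday_number = mysql_dayofweek(date(2012, 9, 5))
tuesday_number = mysql_dayofweek(date(2012, 9, 4))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, fullname TEXT);
    CREATE TABLE login_history (login_history_id INTEGER PRIMARY KEY,
        login_time TEXT, login_action TEXT, user_id INTEGER);
    INSERT INTO user VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO login_history (login_time, login_action, user_id) VALUES
        ('2012-09-05 09:00:00', 'logout', 1),  -- Wednesday
        ('2012-09-12 09:00:00', 'logout', 1),  -- Wednesday
        ('2012-09-19 09:00:00', 'logout', 2),  -- Wednesday
        ('2012-09-04 09:00:00', 'logout', 2),  -- Tuesday
        ('2012-09-11 09:00:00', 'logout', 2),  -- Tuesday
        ('2012-09-18 09:00:00', 'logout', 2),  -- Tuesday
        ('2012-09-26 09:00:00', 'login', 2);   -- Wednesday, but a login
""")

# SQLite's strftime('%w') is 0-based (0 = Sunday), so Wednesday is '3' here;
# the MySQL equivalent of this filter is DAYOFWEEK(login_time) = 4.
top_user = conn.execute("""
    SELECT u.fullname
    FROM (SELECT user_id, COUNT(*) AS n
          FROM login_history
          WHERE login_time >= '2012-09-01' AND login_time < '2012-10-01'
            AND strftime('%w', login_time) = '3'
            AND login_action = 'logout'
          GROUP BY user_id ORDER BY n DESC LIMIT 1) a
    JOIN user u ON u.user_id = a.user_id
""").fetchone()[0]
print(wednesday_number, tuesday_number, top_user)
```

In this sample Bob has more logouts overall, but most of them fall on Tuesdays, so filtering on the wrong weekday number would pick the wrong user.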

How do I combine these two Select queries with an OR case

I want to select all rows WHERE (uid = {$uid} OR uid = **HERE** ), where **HERE** is the set of cids retrieved by Query 2 below.
Query 1:
SELECT * FROM `t_activities`
WHERE (`uid` = {$uid} OR `uid` = **HERE** )
AND `del` = 0
GROUP BY `fid`
ORDER BY `time` DESC
LIMIT 10
And Query 2:
SELECT `cid` FROM `t_con` WHERE `uid` = {$uid} AND `flag` = 1
SELECT * FROM `t_activities`
WHERE (`uid` = {$uid} OR `uid` in (SELECT `cid`
FROM `t_con`
WHERE `uid` = {$uid} AND `flag` = 1))
AND `del` = 0
GROUP BY `fid`
ORDER BY `time` DESC
LIMIT 10
You can do this as a join as well:
SELECT ta.*
FROM `t_activities` ta left outer join
(SELECT `cid`
FROM `t_con`
WHERE `uid` = {$uid} AND `flag` = 1
) tc
on ta.`uid` = tc.`cid`
WHERE (ta.`uid` = {$uid} OR tc.`cid` is not null) AND ta.`del` = 0
GROUP BY ta.`fid`
ORDER BY ta.`time` DESC
LIMIT 10
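By the way, as a SQL statement the "GROUP BY fid" looks very strange: with SELECT *, the columns that are not grouped get their values from an arbitrary row of each group. This is allowed in MySQL (unless ONLY_FULL_GROUP_BY is enabled), but I think it is a bad practice. It is much better to be explicit about what you are doing: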
SELECT fid, min(<field1>) as Field1, . . .
This helps prevent mistakes when you go back to the query or try to modify it.

Count enumerated values?

If my table looks like this:
CREATE TABLE `daily_individual_tracking` (
`daily_individual_tracking_id` int(10) unsigned NOT NULL auto_increment,
`daily_individual_tracking_date` date NOT NULL default '0000-00-00',
`sales` enum('no','yes') NOT NULL COMMENT 'no',
`repairs` enum('no','yes') NOT NULL COMMENT 'no',
`shipping` enum('no','yes') NOT NULL COMMENT 'no',
PRIMARY KEY (`daily_individual_tracking_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
basically the fields can be either yes or no.
How can I count how many yes values there are for each column over a date range?
Thanks!!
You can either run three queries like this:
SELECT COUNT(*)
FROM daily_individual_tracking
WHERE sales = 'YES'
AND daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
Or if you want you can get all three at once like this:
SELECT (
SELECT COUNT(*)
FROM daily_individual_tracking
WHERE sales = 'YES'
AND daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
) AS sales_count, (
SELECT COUNT(*)
FROM daily_individual_tracking
WHERE repairs = 'YES'
AND daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
) AS repairs_count, (
SELECT COUNT(*)
FROM daily_individual_tracking
WHERE shipping = 'YES'
AND daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
) AS shipping_count
Another way to do it is to use SUM instead of COUNT. You could try this too to see how it affects the performance:
SELECT
SUM(sales = 'YES') AS sales_count,
SUM(repairs = 'YES') AS repairs_count,
SUM(shipping = 'YES') AS shipping_count
FROM daily_individual_tracking
WHERE daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
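The SUM approach works because a comparison evaluates to 1 or 0, so summing it counts the matching rows in a single pass. Here is a minimal sketch of that idea using Python's sqlite3 module as a stand-in for MySQL; the enum columns are modelled as plain text and the rows are invented. SQLite compares strings case-sensitively by default, so the literal is written as 'yes' here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; enum columns become TEXT, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_individual_tracking (
        daily_individual_tracking_id INTEGER PRIMARY KEY,
        daily_individual_tracking_date TEXT NOT NULL,
        sales TEXT NOT NULL, repairs TEXT NOT NULL, shipping TEXT NOT NULL);
    INSERT INTO daily_individual_tracking VALUES
        (1, '2010-01-15', 'yes', 'no',  'yes'),
        (2, '2010-02-01', 'yes', 'yes', 'no'),
        (3, '2010-03-31', 'no',  'no',  'yes'),
        (4, '2010-04-01', 'yes', 'yes', 'yes');  -- outside the date range
""")

# Each comparison yields 1 or 0, so SUM() counts the rows where it is true.
counts = conn.execute("""
    SELECT SUM(sales = 'yes')    AS sales_count,
           SUM(repairs = 'yes')  AS repairs_count,
           SUM(shipping = 'yes') AS shipping_count
    FROM daily_individual_tracking
    WHERE daily_individual_tracking_date BETWEEN '2010-01-01' AND '2010-03-31'
""").fetchone()
print(counts)
```

If no rows fall inside the date range, SUM returns NULL rather than 0, which may need handling in the calling code.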