MySQL subquery optimization - count

I have these subqueries in a main query used to fetch some events:
SELECT [...],
(SELECT COUNT(*) FROM WEventUser WHERE WEventUser.eID=e.eID AND favorited=1) as numfavorited,
(SELECT COUNT(*) FROM WEventUser WHERE WEventUser.eID=e.eID AND subscribed=1) as numsubscribed,
(SELECT COUNT(*) FROM WEventUser WHERE eID=e.eID AND WEventUser.uID=2 AND favorited=1) as favorited,
(SELECT COUNT(*) FROM WEventUser WHERE eID=e.eID AND WEventUser.uID=2 AND subscribed=1) as subscribed,
[...] WHERE...etc.
The structure of WEventUser is quite simple:
CREATE TABLE IF NOT EXISTS `WEventUser` (
`eID` int(10) unsigned NOT NULL auto_increment,
`uID` int(10) unsigned NOT NULL,
`favorited` int(1) unsigned default '0',
`subscribed` int(1) unsigned default '0',
PRIMARY KEY (`eID`,`uID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
These subqueries are really expensive. Can you help me find an alternative (like a single join)?
Thanks in advance!
EDIT:
I'm selecting from a main WEvents table that is:
CREATE TABLE IF NOT EXISTS `wevents` (
`eID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`uID` int(10) unsigned DEFAULT NULL,
`ecID` int(10) unsigned NOT NULL,
`eName` varchar(64) NOT NULL,
`eDescription` longtext,
`eIsActive` varchar(1) NOT NULL DEFAULT '0',
`eIsValidated` tinyint(4) NOT NULL DEFAULT '-1',
`eDateAdded` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`eDateModified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`eID`,`ecID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

You shouldn't use subqueries here; it is enough to count conditional values inside the COUNT function, e.g.:
SELECT [...],
COUNT(IF(wu.favorited = 1, 1, NULL)) as numfavorited,
COUNT(IF(wu.subscribed = 1, 1, NULL)) as numsubscribed,
COUNT(IF(wu.uID=2 AND wu.favorited=1, 1, NULL)) as favorited,
COUNT(IF(wu.uID=2 AND wu.subscribed = 1, 1, NULL)) as subscribed,
[...]
FROM
WEventUser wu
WHERE...etc.
You can use the same form when WEventUser is joined to another table such as wevents; in that case, group by the event's eID so that the counts are computed per event.
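To make the idea concrete, here is a minimal sketch of the conditional-count approach joined to the events table. It runs against an in-memory SQLite database as a stand-in for MySQL (an assumption made so the example is self-contained), uses the portable CASE WHEN form in place of MySQL's IF(), and the sample rows are invented for illustration. The LEFT JOIN and GROUP BY are my assumptions about how the elided parts of the query fit together.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question,
# the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wevents (eID INTEGER PRIMARY KEY, eName TEXT NOT NULL);
CREATE TABLE WEventUser (
    eID INTEGER NOT NULL,
    uID INTEGER NOT NULL,
    favorited INTEGER DEFAULT 0,
    subscribed INTEGER DEFAULT 0,
    PRIMARY KEY (eID, uID)
);
INSERT INTO wevents VALUES (1, 'concert'), (2, 'lecture'), (3, 'empty');
INSERT INTO WEventUser VALUES
    (1, 2, 1, 0), (1, 3, 1, 1), (1, 4, 0, 1),
    (2, 2, 0, 1), (2, 5, 1, 0);
""")

# COUNT ignores NULL, so each CASE counts only the rows matching its condition.
# CASE WHEN cond THEN 1 END is equivalent to MySQL's IF(cond, 1, NULL).
QUERY = """
SELECT e.eID,
       COUNT(CASE WHEN wu.favorited = 1 THEN 1 END)                 AS numfavorited,
       COUNT(CASE WHEN wu.subscribed = 1 THEN 1 END)                AS numsubscribed,
       COUNT(CASE WHEN wu.uID = ? AND wu.favorited = 1 THEN 1 END)  AS favorited,
       COUNT(CASE WHEN wu.uID = ? AND wu.subscribed = 1 THEN 1 END) AS subscribed
FROM wevents e
LEFT JOIN WEventUser wu ON wu.eID = e.eID
GROUP BY e.eID
ORDER BY e.eID
"""

def event_counts(conn, user_id):
    """Return one row of counts per event, for the given user."""
    return conn.execute(QUERY, (user_id, user_id)).fetchall()

for row in event_counts(conn, 2):
    print(row)
```

The LEFT JOIN keeps events that nobody has favorited or subscribed to; they come back with all four counts at zero, which matches what the original subqueries returned.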

Related

How can I reduce execution time of a query

When I run this query, it takes around 30 minutes to complete. How can I reduce the execution time?
INSERT INTO vfusion.attendance_report_data_2
SELECT
CONCAT(attendance_checkin.userid, UNIX_TIMESTAMP(DATE(IFNULL(attendance_checkin.work_date, 0)))) AS id,
attendance_checkin.userid,
attendance_checkin.work_date,
attendance_checkin.checkintime_data as in_time,
attendance_checkout.checkouttime_data as out_time,
IFNULL(attendance_checkin.work_shift,0) as work_shift
FROM
vfusion.attendance_checkin
INNER JOIN
vfusion.attendance_checkout ON attendance_checkin.userid = attendance_checkout.userid
AND attendance_checkin.work_date = attendance_checkout.work_date
ON DUPLICATE KEY
UPDATE
in_time = in_time,
out_time = out_time,
work_shift = attendance_checkin.work_shift
These are my tables; they hold a lot of data:
CREATE TABLE attendance_checkout
(
id_attendance_checkout BIGINT(11) NOT NULL,
userid INT(11) DEFAULT NULL,
work_date DATE DEFAULT NULL,
checkouttime_data DATETIME DEFAULT NULL,
work_shift INT(11) DEFAULT NULL,
PRIMARY KEY (id_attendance_checkout)
) ENGINE=INNODB DEFAULT CHARSET=LATIN1;
CREATE TABLE attendance_checkin
(
id_attendance_checkin BIGINT(11) NOT NULL,
userid INT(11) DEFAULT NULL,
work_date DATE DEFAULT NULL,
checkintime_data DATETIME DEFAULT NULL,
work_shift INT(11) DEFAULT NULL,
PRIMARY KEY (id_attendance_checkin)
) ENGINE=INNODB DEFAULT CHARSET=LATIN1;
CREATE TABLE attendance_report_data_2
(
id_attendance_report_data BIGINT(11) NOT NULL,
userid INT(11) NOT NULL DEFAULT '0',
work_date DATE NOT NULL DEFAULT '0000-00-00',
in_time DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
out_time DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
work_shift INT(11) NOT NULL DEFAULT '0',
PRIMARY KEY (id_attendance_report_data , in_time , out_time , work_date , userid , work_shift)
) ENGINE=INNODB DEFAULT CHARSET=LATIN1
I need to run this query at arbitrary times, but because it takes so long I can't: while it runs, everything else gets stuck.
Create a composite index on the columns userid and work_date in the table vfusion.attendance_checkout.
This lets the JOIN look up matching checkout rows through the index instead of scanning the table.
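As a rough illustration of that suggestion, the sketch below builds the two tables in an in-memory SQLite database (a stand-in for MySQL, so the planner output differs from MySQL's EXPLAIN), adds the composite index, and prints the query plan for the join. The index name idx_checkout_user_date and the sample rows are hypothetical. In MySQL the corresponding statement would be something like ALTER TABLE vfusion.attendance_checkout ADD INDEX idx_checkout_user_date (userid, work_date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attendance_checkin (
    id_attendance_checkin INTEGER PRIMARY KEY,
    userid INTEGER, work_date TEXT, checkintime_data TEXT, work_shift INTEGER
);
CREATE TABLE attendance_checkout (
    id_attendance_checkout INTEGER PRIMARY KEY,
    userid INTEGER, work_date TEXT, checkouttime_data TEXT, work_shift INTEGER
);
-- Hypothetical index name; the column pair is the one the JOIN compares.
CREATE INDEX idx_checkout_user_date ON attendance_checkout (userid, work_date);
INSERT INTO attendance_checkin VALUES (1, 7, '2014-03-01', '2014-03-01 09:00:00', 1);
INSERT INTO attendance_checkout VALUES (1, 7, '2014-03-01', '2014-03-01 17:00:00', 1);
""")

JOIN_SQL = """
SELECT ci.userid, ci.work_date, ci.checkintime_data, co.checkouttime_data
FROM attendance_checkin ci
INNER JOIN attendance_checkout co
        ON ci.userid = co.userid AND ci.work_date = co.work_date
"""

def query_plan(conn, sql):
    """Return the human-readable steps (fourth column) of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Each step shows whether a table is scanned in full or searched via an index.
for step in query_plan(conn, JOIN_SQL):
    print(step)
```

On MySQL, running EXPLAIN on the SELECT part of the INSERT before and after adding the index is the equivalent check.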

group_concat() on bit fields returns garbage in Mysql

Table Structure
CREATE TABLE `academicyears` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`campusid` int(11) DEFAULT NULL,
`academicyear` text NOT NULL,
`month` tinyint(4) DEFAULT NULL,
`flag` bit(1) DEFAULT b'1',
PRIMARY KEY (`id`)
);
My Query
SELECT GROUP_CONCAT(ay.`flag`)
FROM academicyears ay
GROUP BY ay.`campusid`
Result
Try this; adding 0 converts the BIT value to an integer before it is concatenated:
SELECT GROUP_CONCAT(ay.`flag` + 0)
FROM academicyears ay
GROUP BY ay.`campusid`

How can I combine these two queries into one?

I am writing a script for my custom forums that determines whether new threads or replies have been posted in a certain board since the user's last visit to that board (board_marks). I have two tables: one that stores threads, another that stores replies. I can easily find out whether there are any new replies in a board by using a left join. I want to add a bit that finds out whether there are new threads in that board. How would I do this?
My Tables:
CREATE TABLE IF NOT EXISTS `forum_threads` (
`thread_id` int(15) NOT NULL AUTO_INCREMENT,
`board_id` int(15) NOT NULL,
`author_id` int(15) NOT NULL,
`updater_id` int(15) NOT NULL,
`title` text NOT NULL,
`content` text NOT NULL,
`date_posted` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`date_updated` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`views` int(15) NOT NULL,
`status` tinyint(1) NOT NULL,
`type` tinyint(1) NOT NULL COMMENT '0 normal, 1 sticky, 2 global.',
PRIMARY KEY (`thread_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `forum_messages` (
`message_id` int(15) NOT NULL AUTO_INCREMENT,
`thread_id` int(15) NOT NULL,
`author_id` int(15) NOT NULL,
`modifier_id` int(15) DEFAULT NULL,
`content` text NOT NULL,
`date_posted` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`date_modified` timestamp NULL DEFAULT NULL,
`status` tinyint(1) NOT NULL,
PRIMARY KEY (`message_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
My Script:
$this->db->select('m.message_id, m.thread_id, m.date_posted, t.thread_id, t.board_id');
$this->db->from('forum_messages AS m');
$this->db->join('forum_threads AS t', 'm.thread_id = t.thread_id', 'left');
$this->db->where('t.board_id', $board_id);
$this->db->where('m.date_posted > "'.$last_mark['date_marked'].'"');
if($query_new_messages = $this->db->get()){
if($query_new_messages->num_rows() > 0){
$contains_new_posts = TRUE;
} else {
$contains_new_posts = FALSE;
}
}
The actual query:
SELECT
m.message_id, m.thread_id, m.date_posted, t.thread_id, t.board_id
FROM (forum_messages AS m)
LEFT JOIN forum_threads AS t ON m.thread_id = t.thread_id
WHERE `t`.`board_id` = '10' AND `m`.`date_posted` > "2014-03-02 06:01:31"
PS: I am doing this in CodeIgniter.
An OR condition will get you the data you want:
SELECT
m.message_id, m.thread_id, m.date_posted, t.thread_id, t.board_id
FROM (forum_messages AS m)
LEFT JOIN forum_threads AS t ON m.thread_id = t.thread_id
WHERE `t`.`board_id` = '10'
AND (`m`.`date_posted` > "2014-03-02 06:01:31"
OR `t`.`date_posted` > "2014-03-02 06:01:31")
You should benchmark this, however, because the OR condition is probably not well optimized. Compare it against running two fast queries, one for posts and one for threads, instead of one potentially slower query.
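One thing to watch with the query above: it selects from forum_messages, so a brand-new thread that has no replies yet produces no row at all. If that can happen in your schema (an assumption on my part, since the threads table carries its own content column), a variant that starts from forum_threads may be closer to what you want. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, trimmed-down tables, and invented sample rows; the function name board_has_new_activity is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forum_threads (
    thread_id INTEGER PRIMARY KEY, board_id INTEGER NOT NULL, date_posted TEXT NOT NULL
);
CREATE TABLE forum_messages (
    message_id INTEGER PRIMARY KEY, thread_id INTEGER NOT NULL, date_posted TEXT NOT NULL
);
-- Board 10: an old thread with a new reply.
-- Board 11: a new thread without replies.
-- Board 12: nothing new.
INSERT INTO forum_threads VALUES
    (1, 10, '2014-01-01 00:00:00'),
    (2, 11, '2014-03-05 12:00:00'),
    (3, 12, '2014-01-01 00:00:00');
INSERT INTO forum_messages VALUES
    (1, 1, '2014-03-03 08:00:00'),
    (2, 3, '2014-02-01 08:00:00');
""")

# Start from threads so that threads without replies still take part;
# EXISTS lets the database stop at the first matching row.
NEW_ACTIVITY_SQL = """
SELECT EXISTS (
    SELECT 1
    FROM forum_threads t
    LEFT JOIN forum_messages m ON m.thread_id = t.thread_id
    WHERE t.board_id = ?
      AND (t.date_posted > ? OR m.date_posted > ?)
)
"""

def board_has_new_activity(conn, board_id, last_marked):
    """True if the board has a thread or reply posted after last_marked."""
    row = conn.execute(NEW_ACTIVITY_SQL, (board_id, last_marked, last_marked)).fetchone()
    return bool(row[0])

for board in (10, 11, 12):
    print(board, board_has_new_activity(conn, board, "2014-03-02 06:01:31"))
```

The values are bound as parameters rather than concatenated into the SQL string; in CodeIgniter, passing the column and the value as separate arguments to where() should give you the same escaping.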

Advanced query running slowly

Whenever I run this query, it takes about 25-30 seconds. As you can see, the most complex part is computing two COALESCE expressions inside subqueries.
SELECT
g.name,
g.id,
(
SELECT
COALESCE (
SUM(result2 / result1) * (
SUM(IF(result2 != 0, 1, 0)) * 0.1
),
0
) AS res
FROM
gumb.war gwr
WHERE
started = 1
AND (UNIX_TIMESTAMP(time) + 7 * 24 * 60 * 60) > UNIX_TIMESTAMP()
AND gwr.guild1 = g.id
AND gwr.winner = g.id
) + (
SELECT
COALESCE (
SUM(result1 / result2) * (
SUM(IF(result1 != 0, 1, 0)) * 0.1
),
0
) AS res1
FROM
gumb.war gwr
WHERE
started = 1
AND (UNIX_TIMESTAMP(time) + 7 * 24 * 60 * 60) > UNIX_TIMESTAMP()
AND gwr.guild2 = g.id
AND gwr.winner = g.id
) AS avg
FROM
gumb.guild g
ORDER BY
avg DESC,
g.point DESC,
g.experience DESC LIMIT 10;
Table structures/schemas:
CREATE TABLE `guild` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(12) NOT NULL DEFAULT '',
`owner` int(10) unsigned NOT NULL DEFAULT '0',
`level` tinyint(2) DEFAULT NULL,
`experience` int(11) DEFAULT NULL,
`win` int(11) NOT NULL DEFAULT '0',
`draw` int(11) NOT NULL DEFAULT '0',
`loss` int(11) NOT NULL DEFAULT '0',
`point` int(11) NOT NULL DEFAULT '0',
`account` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=latin1;
CREATE TABLE `war` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`guild1` int(10) unsigned NOT NULL DEFAULT '0',
`guild2` int(10) unsigned NOT NULL DEFAULT '0',
`time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`type` tinyint(2) unsigned NOT NULL DEFAULT '0',
`price` int(10) unsigned NOT NULL DEFAULT '0',
`score` int(10) unsigned NOT NULL DEFAULT '0',
`started` tinyint(1) NOT NULL DEFAULT '0',
`winner` int(11) NOT NULL DEFAULT '-1',
`result1` int(11) NOT NULL DEFAULT '0',
`result2` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=latin1;
Indexes will definitely help. Indexing the fields used in JOIN criteria and in WHERE clauses has the most impact.
Generic syntax example:
CREATE INDEX idx_col1col2 ON tbl_Test (Col1, Col2)
You likely don't want to cram every field used into one index, and you likely shouldn't create a separate index for each field either.
There are many resources to help you understand how to build your indexes; here are a couple:
MySQL CREATE INDEX Syntax
MySQL Index Optimization
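One related point that is easy to miss: an index on a column can only narrow the rows if the WHERE clause compares the bare column. The question's filter wraps time in UNIX_TIMESTAMP(), so an index on time could not be range-searched for it; in MySQL a form like time > NOW() - INTERVAL 7 DAY expresses the same filter on the bare column. The sketch below shows the effect using an in-memory SQLite database as a stand-in (so julianday() plays the role of UNIX_TIMESTAMP(), and the plan text is SQLite's, not MySQL's). The index name idx_war_time and the fixed dates are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE war (
    id INTEGER PRIMARY KEY,
    guild1 INTEGER NOT NULL, guild2 INTEGER NOT NULL,
    time TEXT NOT NULL, started INTEGER NOT NULL, winner INTEGER NOT NULL
);
-- Hypothetical index name, chosen for this illustration.
CREATE INDEX idx_war_time ON war (time);
""")

def plan(conn, sql, params):
    """Return the human-readable steps (fourth column) of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

now = datetime(2014, 3, 9)  # fixed "current time" so the example is repeatable
cutoff = (now - timedelta(days=7)).strftime("%Y-%m-%d %H:%M:%S")

# Column wrapped in an expression: every row has to be visited and evaluated.
wrapped = plan(
    conn,
    "SELECT COUNT(*) FROM war WHERE julianday(time) + 7 > julianday(?)",
    (now.strftime("%Y-%m-%d %H:%M:%S"),),
)

# Bare column compared with a precomputed cutoff: the index can be range-searched.
bare = plan(conn, "SELECT COUNT(*) FROM war WHERE time > ?", (cutoff,))

print("wrapped:", wrapped)
print("bare:   ", bare)
```

For the correlated subqueries in the question, a composite index that leads with the equality columns (for example guild1 and winner) and ends with time would be a reasonable first candidate to test, though EXPLAIN on your own data is the only way to confirm it.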

MySQL Extremely Slow Query

I am trying to learn about optimizations to MySQL, table engines and when to use them, etc.
I have a query that is running up against the time-out limit of 10 minutes and which needs to complete in seconds because its function is a user-generated report.
The Query:
SELECT em.employeeId, tsk.taskId
FROM employee em INNER JOIN
task tsk
ON tsk.employeeId = em.employeeId
WHERE em.employeeId <> 'Not Done'
AND tsk.employeeId IN (
SELECT employeeId FROM task
WHERE templateId
IN ( '5', '6', '7', '8' )
AND tsk.status = 'Done'
)
AND tsk.employeeId IN
(
SELECT employeeId FROM task
WHERE templateId IN
( '55', '56', '57', '58' )
AND status = 'Not Done'
)
Explain:
# id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, PRIMARY, tsk, ALL, , , , , 61326, Using where
1, PRIMARY, em, eq_ref, PRIMARY, PRIMARY, 4, newhire.tsk.employeeId, 1, Using index
3, DEPENDENT SUBQUERY, task, ALL, , , , , 61326, Using where
2, DEPENDENT SUBQUERY, task, ALL, , , , , 61326, Using where
The DB server uses MyISAM as default, so most schemas including this one are MyISAM.
I also realize that the text searches (status=Done or status LIKE 'Done') are adding a lot to the query.
EDIT1:
# Table, Create Table
employee, CREATE TABLE `employee` (
`employeeId` int(10) unsigned NOT NULL AUTO_INCREMENT,
`lastName` varchar(255) NOT NULL,
`firstName` varchar(255) NOT NULL,
`applicantId` varchar(255) NOT NULL,
`fEmployeeId` varchar(255) DEFAULT NULL,
`opId` varchar(255) DEFAULT NULL,
`rehire` tinyint(3) unsigned NOT NULL DEFAULT '0',
`sDate` date DEFAULT NULL,
`oDate` date DEFAULT NULL,
`cDate` date DEFAULT NULL,
`additionalDate` date DEFAULT NULL,
`additionalType` varchar(255) DEFAULT NULL,
`processingDate` date DEFAULT NULL,
`created` datetime NOT NULL,
`recruiterId` int(10) unsigned NOT NULL,
`processorId` int(10) unsigned DEFAULT NULL,
`position` tinyint(3) unsigned NOT NULL DEFAULT '1',
`status` varchar(255) NOT NULL,
`campus` varchar(255) DEFAULT NULL,
`phone` varchar(255) DEFAULT NULL,
`email` varchar(255) DEFAULT NULL,
`requisition` varchar(255) DEFAULT NULL,
`Position` varchar(255) DEFAULT NULL,
`department` varchar(255) DEFAULT NULL,
`jobClass` varchar(255) DEFAULT NULL,
`hiringManager` varchar(255) DEFAULT NULL,
`badge` varchar(255) DEFAULT NULL,
`currentAddress` varchar(255) DEFAULT NULL,
`holding` tinyint(3) unsigned DEFAULT '0',
PRIMARY KEY (`employeeId`)
) ENGINE=MyISAM AUTO_INCREMENT=3959 DEFAULT CHARSET=latin1
EDIT 2:
# Table, Create Table
task, CREATE TABLE `task` (
`taskId` int(10) unsigned NOT NULL AUTO_INCREMENT,
`templateId` int(10) unsigned NOT NULL,
`employeeId` int(10) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`description` text,
`naAvailable` tinyint(3) unsigned DEFAULT '0',
`fileRequired` tinyint(3) unsigned DEFAULT '0',
`fileHrCatalog` int(10) unsigned DEFAULT NULL,
`quickFileName` varchar(255) DEFAULT NULL,
`fileUploaded` tinyint(3) unsigned DEFAULT '0',
`fileExt` varchar(255) DEFAULT NULL,
`level` tinyint(3) unsigned NOT NULL,
`status` varchar(255) NOT NULL,
`due` date DEFAULT NULL,
`daysDue` int(10) unsigned DEFAULT NULL,
`routeIncentives` tinyint(3) unsigned DEFAULT '0',
`requiresAudit` tinyint(3) unsigned DEFAULT '0',
`auditStatus` varchar(255) DEFAULT NULL,
`auditUser` int(10) unsigned DEFAULT NULL,
`auditDate` datetime DEFAULT NULL,
`stampOption` tinyint(3) unsigned DEFAULT '0',
`done` tinyint(3) unsigned DEFAULT '0',
`doneBy` int(10) unsigned DEFAULT NULL,
`doneWhen` datetime DEFAULT NULL,
`sortOrder` tinyint(3) unsigned NOT NULL DEFAULT '255',
PRIMARY KEY (`taskId`),
KEY `status` (`status`,`templateId`)
) ENGINE=MyISAM AUTO_INCREMENT=176802 DEFAULT CHARSET=latin1
I would write the query as below, but to help the optimizer, have covering indexes on your tables:
Employee table -- index on ( status, employeeID )
Task table -- index on ( employeeid, templateid, status )
By the first join, you are prequalifying to get the first task as a "Done" status.
The second join is looking for the OTHER task you are interested in that is NOT Done.
Subqueries (especially correlated subqueries) can be harder on performance. With a JOIN, the matching row is either there or it's not...
SELECT
em.employeeId,
tsk1.taskId
FROM
employee em
INNER JOIN task tsk1
ON em.employeeId = tsk1.employeeId
AND tsk1.templateID in ( '5', '6', '7', '8' )
AND tsk1.status = 'Done'
INNER JOIN task tsk2
ON em.employeeId = tsk2.employeeId
AND tsk2.templateID in ( '55', '56', '57', '58' )
AND tsk2.status = 'Not Done'
WHERE
em.status <> 'Not Done'
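Here is a small runnable sketch of that double self-join, using an in-memory SQLite database as a stand-in for MySQL with cut-down tables and invented rows. The aliases done_task and open_task and the index names are hypothetical. I added DISTINCT because each employee produces one joined row per combination of matching "Done" and "Not Done" tasks, which is an assumption about what the report needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employeeId INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE task (
    taskId INTEGER PRIMARY KEY, templateId INTEGER NOT NULL,
    employeeId INTEGER NOT NULL, status TEXT NOT NULL
);
-- Hypothetical index names; the column order follows the suggestion above.
CREATE INDEX idx_employee_status ON employee (status, employeeId);
CREATE INDEX idx_task_emp_tpl_status ON task (employeeId, templateId, status);
INSERT INTO employee VALUES (1, 'Active'), (2, 'Active'), (3, 'Not Done');
INSERT INTO task VALUES
    (1, 5, 1, 'Done'), (2, 6, 1, 'Done'), (3, 55, 1, 'Not Done'),
    (4, 5, 2, 'Done'), (5, 55, 2, 'Done'),
    (6, 5, 3, 'Done'), (7, 55, 3, 'Not Done');
""")

# One join requires a finished task from the first template group,
# the other requires an unfinished task from the second group.
SQL = """
SELECT DISTINCT em.employeeId
FROM employee em
INNER JOIN task done_task
        ON done_task.employeeId = em.employeeId
       AND done_task.templateId IN (5, 6, 7, 8)
       AND done_task.status = 'Done'
INNER JOIN task open_task
        ON open_task.employeeId = em.employeeId
       AND open_task.templateId IN (55, 56, 57, 58)
       AND open_task.status = 'Not Done'
WHERE em.status <> 'Not Done'
ORDER BY em.employeeId
"""

def employees_with_done_and_open_tasks(conn):
    """Employees having a Done task in group one and a Not Done task in group two."""
    return [row[0] for row in conn.execute(SQL)]

print(employees_with_done_and_open_tasks(conn))
```

templateId is an integer column, so the sketch compares it with integer literals instead of quoted strings. If you need the task ids as well, select them from the alias you care about and expect one row per matching pair.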
Your first change should be to create an index on task that covers both the status and templateId columns:
ALTER TABLE task ADD INDEX (status, templateId);
That'll prevent the full-table scans of 61326 rows each time that table is accessed in your query.
Also, it looks like you might have made a typo here:
SELECT employeeId FROM task
WHERE templateId
IN ( '5', '6', '7', '8' )
AND tsk.status = 'Done'
That tsk.status should be just status, as in the second subquery.