Make MySQL query more performant

I have two tables, users and points. Currently users has 84,263 rows, while points has 1,636,119 rows. Each user can have zero or more points, and for each user I need the creation time of their most recent point.
show create table users
CREATE TABLE `users` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`password` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`remember_token` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`role` varchar(15) COLLATE utf8_unicode_ci DEFAULT 'consument',
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`updated_at` timestamp NOT NULL DEFAULT current_timestamp(),
`deleted_at` timestamp NULL DEFAULT NULL,
`email_verified_at` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`email_verify_token` text COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `users_email_unique` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=84345 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
show create table points
CREATE TABLE `points` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`tablet_id` int(10) unsigned DEFAULT NULL,
`parent_company` int(10) unsigned NOT NULL,
`company_id` int(10) unsigned NOT NULL,
`points` int(10) unsigned NOT NULL,
`mutation_type` tinyint(3) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`updated_at` timestamp NOT NULL DEFAULT current_timestamp(),
PRIMARY KEY (`id`),
KEY `points_user_id_foreign` (`user_id`),
KEY `points_company_id_foreign` (`company_id`),
KEY `points_parent_company_index` (`parent_company`),
KEY `points_tablet_id_index` (`tablet_id`),
KEY `points_mutation_type_company_id_created_at_index` (`mutation_type`,`company_id`,`created_at`),
KEY `created_at_user_id` (`created_at`,`user_id`),
CONSTRAINT `points_company_id_foreign` FOREIGN KEY (`company_id`) REFERENCES `companies` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `points_parent_company_foreign` FOREIGN KEY (`parent_company`) REFERENCES `parent_company` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `points_tablet_id_foreign` FOREIGN KEY (`tablet_id`) REFERENCES `tablets` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
CONSTRAINT `points_user_id_foreign` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1798627 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Queries I tried, but they are taking too long (we're talking minutes, not seconds):
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
max(pt.created_at) as `last_transaction`
from `users`
left join points as pt on pt.user_id = users.id
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
group by users.id
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
pt.created_at as `last_transaction`
from `users`
left join (select points.user_id, points.created_at from points order by points.created_at desc) as pt on pt.user_id = users.id
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
group by users.id
Why am I not limiting the results and returning only 100 at a time? Because I am using Yajra DataTables for Laravel, and when I limit the results it only returns that limited set and does not recognize that there are more rows. So instead of 84,263 rows, I only get 100 rows and that's it.

Basically your "users" table has a "role" column. It is not indexed, so your queries are doing a full table scan on the "users" table, which has 84,263 rows. One way to optimize it would be to add an index on the "role" column. But I can see "consument" is the default value and you are querying by that value. Now suppose 95% of users have the "consument" role; then even adding an index on "role" won't help much. You would have to add more conditions to narrow the query, and have an index for those conditions.
Your first query is better, as it avoids the unnecessary inner query of the second one.
If you need to return 84,263 rows, then that is a separate issue. Somehow you would have to introduce pagination, breaking your query into multiple queries. Suppose each call returns 500 users, sorted by id. In each subsequent call, you ask for the next 500 where id is greater than the last id returned by the previous call (for the very first call, the last id value would be 0). Then the queries can use the index on "id".
You can check the query plan with the EXPLAIN keyword to better understand what the optimizer is doing.
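The keyset pagination described above might look like this (a sketch; the 500-row page size and the :last_id placeholder are illustrative):

```sql
-- First call: :last_id = 0; subsequent calls pass the largest id
-- from the previous page. The PRIMARY KEY on id makes the range
-- scan cheap no matter how deep into the result set you are.
SELECT id, email, role, created_at, updated_at
FROM users
WHERE role = 'consument'
  AND deleted_at IS NULL
  AND id > :last_id
ORDER BY id
LIMIT 500;
```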

Edit
I tried adding an index on role on the users table with 1,000 users and 50,000 points; your first query took ~4 seconds, which is way too long.
So I tried this query, which took ~0.5 seconds, still too long:
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
pt.created_at as `last_transaction`
from `users`
left join points pt on pt.id = (select pt2.id from points pt2 WHERE pt2.user_id = users.id ORDER BY pt2.created_at DESC limit 1)
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
So I added an index on points.created_at, and now the query takes 0.05 seconds, which is acceptable.
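The index described above can be added as follows (the index names are mine; a composite index on (user_id, created_at) is an alternative worth benchmarking, since the correlated subquery filters on user_id and sorts on created_at):

```sql
-- The index the edit above reports adding:
ALTER TABLE points ADD INDEX points_created_at_index (created_at);

-- Alternative to benchmark: lets the per-user subquery
-- (WHERE user_id = ? ORDER BY created_at DESC LIMIT 1)
-- be answered by reading a single index entry.
ALTER TABLE points ADD INDEX points_user_created (user_id, created_at);
```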

It looks like you want a result set with some columns from your users table, and the most recent created_at value from the points table for each user.
So-called compound covering indexes usually help speed these sorts of queries. So, let's start with what you need from points. This subquery gets it.
SELECT user_id, MAX(created_at) last_transaction
FROM points
GROUP BY user_id
This gives you a virtual table with each user_id and the created_at value you want. The following index
CREATE INDEX points_maxcreated ON points (user_id, created_at DESC);
will let MySQL satisfy the subquery with an almost miraculously fast loose index scan. (Before MySQL 8.0, DESC in an index definition is parsed but ignored; the ascending index still supports this query.)
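You can verify that the loose index scan is actually being used by running EXPLAIN on the subquery; exact plan output varies by MySQL version and table statistics, but the telltale sign is in the Extra column:

```sql
EXPLAIN
SELECT user_id, MAX(created_at) AS last_transaction
FROM points
GROUP BY user_id;
-- Look for "Using index for group-by" in the Extra column;
-- that is MySQL's name for the loose index scan.
```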
Then, let's consider the rest of your query.
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`
from `users`
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
For this you want the following index
CREATE INDEX users_del_role_etc
ON users
(deleted_at, role, id, email, created_at, updated_at);
MySQL can satisfy your query directly from this index. Think of these indexes as being stored in order. MySQL random accesses the index to the first eligible row (null deleted_at, role = 'consument') and then reads the index, not the table, row by row to get the data you want.
Putting it all together, you get
select
`users`.`id`,
`users`.`email`,
`users`.`role`,
`users`.`created_at`,
`users`.`updated_at`,
`subquery`.`last_transaction`
from `users`
left join (
SELECT user_id, MAX(created_at) last_transaction
FROM points
GROUP BY user_id
) subquery ON users.id = subquery.user_id
where `users`.`role` = 'consument' and `users`.`deleted_at` is null
This should be reasonably speedy for the query you gave us. Nevertheless, a query that you expect to return tens of thousands of rows you also should expect to take some time. There's no magic that makes SQL handle very large result sets fast. It's designed to retrieve small result sets fast from vast tables.
With respect, your understanding of how to paginate rows from your result set isn't quite right. It's hard to believe your user will actually examine tens of thousands of rows. Without an ORDER BY operation in your query, LIMIT is a very inexpensive operation. If you need ORDER BY ... LIMIT to paginate your results, ask another question, because that performance can also be managed.

Related

Concurrent queries on composite index with order by id drastically slow

I have a table defined as follows:
| book | CREATE TABLE `book` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`provider_id` int(10) unsigned DEFAULT '0',
`source_id` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`description` longtext COLLATE utf8_unicode_ci,
PRIMARY KEY (`id`),
UNIQUE KEY `provider` (`provider_id`,`source_id`),
KEY `idx_source_id` (`source_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1605425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
when there are about 10 concurrent read with following sql:
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '1037122800') ORDER BY `book`.`id` ASC LIMIT 1
it becomes slow, it takes about 100 ms.
however if I changed it to
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '221630001') LIMIT 1
then it is normal, it takes several ms.
I don't understand why adding ORDER BY id makes the query much slower. Could anyone explain?
Try selecting only the columns you need (SELECT column_name, ...) instead of *.
I'm not a mysql expert, and not able to perform a detailed analysis, but my guess would be that because you are providing values for the UNIQUE KEY in the WHERE clause, the engine can go and fetch that row directly using an index.
However, when you ask it to ORDER BY the id column, which is a PRIMARY KEY, that changes the access path. The engine now guesses that since it has an index on id, and you want to order by id, it is better to fetch the data in primary-key order, which avoids a sort. In this case, though, that leads to a slower result, as it has to compare every row against the criteria (a table scan).
Note that this is just conjecture. You would need to EXPLAIN both statements to see what is going on.
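One way to test that conjecture is to force the composite index while keeping the ORDER BY (FORCE INDEX is standard MySQL index-hint syntax; the index name `provider` comes from the table definition above):

```sql
-- Forces the (provider_id, source_id) unique index; the single
-- matching row is then sorted, which is trivial for one row.
SELECT *
FROM `book` FORCE INDEX (`provider`)
WHERE provider_id = '1'
  AND source_id = '1037122800'
ORDER BY `book`.`id` ASC
LIMIT 1;
```

If this variant runs in milliseconds again, the slow plan really was the optimizer choosing a primary-key-order scan to avoid the sort.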

In MySQL is it faster to execute one JOIN + one LIKE statement or two JOINs?

I have to create a cron job, which is simple in itself, but because it will run every minute I'm worried about performance. I have two tables: one has user names and the other has details about their network. Most of the time a user will belong to just one network, but it is theoretically possible that they might belong to more; even then it would be very few, maybe two or three. So, in order to reduce the number of JOINs, I saved the network ids separated by | in a field in the user table, e.g.
|1|3|9|
The (simplified for this question) user table structure is
TABLE `users` (
`u_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
`userid` VARCHAR(500) NOT NULL UNIQUE,
`net_ids` VARCHAR(500) NOT NULL DEFAULT '',
PRIMARY KEY (`u_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The (also simplified) network table structure is
CREATE TABLE `network` (
`n_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
`netname` VARCHAR(500) NOT NULL UNIQUE,
`login_time` DATETIME DEFAULT NULL,
`timeout_mins` TINYINT UNSIGNED NOT NULL DEFAULT 10,
PRIMARY KEY (`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I have to send a warning when timeout occurs, my query is
SELECT N.netname, N.timeout_mins, N.n_id, U.userid FROM
(SELECT netname, timeout_mins, n_id FROM network
WHERE is_open = 1 AND notify = 1
AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, timeout_mins, login_time), NOW()) < 60) AS N
INNER JOIN users AS U ON U.net_ids LIKE CONCAT('%|', N.n_id, '|%');
I made N a subquery to reduce the number of rows joined. But I would like to know: would it be faster to add a third table with u_id and n_id as columns, remove the net_ids column from users, and then join all three tables? Because I read that using LIKE slows things down.
Which is the most efficient query to use in this case? One JOIN and a LIKE, or two JOINs?
P.S. I did some experimentation, and the initial timings for two JOINs are higher than for a JOIN plus a LIKE. However, repeated runs of the same query speed up a lot; I suspect something is cached somewhere, either in my app or in the database, and the two become comparable, so I did not find this data satisfactory. It also contradicts what I was expecting based on what I have been reading.
I used this table:
TABLE `user_net` (
`u_id` BIGINT UNSIGNED NOT NULL,
`n_id` BIGINT UNSIGNED NOT NULL,
INDEX `u_id` (`u_id`),
FOREIGN KEY (`u_id`) REFERENCES `users`(`u_id`),
INDEX `n_id` (`n_id`),
FOREIGN KEY (`n_id`) REFERENCES `network`(`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
and this query:
SELECT N.netname, N.timeout_mins, N.n_id, U.userid FROM
(SELECT netname, timeout_mins, n_id FROM network
WHERE is_open = 1 AND notify = 1
AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, timeout_mins, login_time), NOW()) < 60) AS N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id;
You should define composite indexes for the user_net table. One of them can (and should) be the primary key.
TABLE `user_net` (
`u_id` BIGINT UNSIGNED NOT NULL,
`n_id` BIGINT UNSIGNED NOT NULL,
PRIMARY KEY (`u_id`, `n_id`),
INDEX `nid_uid` (`n_id`, `u_id`),
FOREIGN KEY (`u_id`) REFERENCES `users`(`u_id`),
FOREIGN KEY (`n_id`) REFERENCES `network`(`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I would also rewrite your query to:
SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM network N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id
WHERE N.is_open = 1
AND N.notify = 1
AND TIMESTAMPDIFF(SECOND, TIMESTAMPADD(MINUTE, N.timeout_mins, N.login_time), NOW()) < 60
While your subquery will probably not hurt much, there is no need for it.
Note that the last condition cannot use an index, because you have to combine two columns. If your MySQL version is at least 5.7.6 you can define an indexed virtual (calculated) column.
CREATE TABLE `network` (
`n_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE,
`netname` VARCHAR(500) NOT NULL UNIQUE,
`login_time` DATETIME DEFAULT NULL,
`timeout_mins` TINYINT UNSIGNED NOT NULL DEFAULT 10,
`is_open` TINYINT UNSIGNED,
`notify` TINYINT UNSIGNED,
`timeout_dt` DATETIME AS (`login_time` + INTERVAL `timeout_mins` MINUTE),
PRIMARY KEY (`n_id`),
INDEX (`timeout_dt`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Now change the query to:
SELECT N.netname, N.timeout_mins, N.n_id, U.userid
FROM network N
INNER JOIN user_net AS UN ON N.n_id = UN.n_id
INNER JOIN users AS U ON UN.u_id = U.u_id
WHERE N.is_open = 1
AND N.notify = 1
AND N.timeout_dt < NOW() + INTERVAL 60 SECOND
and it will be able to use the index.
You can also try to replace
INDEX (`timeout_dt`)
with
INDEX (`is_open`, `notify`, `timeout_dt`)
and see if it is of any help.
Reformulate to avoid hiding columns inside functions. I can't grok your date expression, but note this:
login_time < NOW() - INTERVAL timeout_mins MINUTE
If you can achieve something like that, then this index should help:
INDEX(is_open, notify, login_time)
If that is not good enough, let's see the other formulation so we can compare them.
Having stuff separated by comma (or |) is likely to be a really bad idea.
Bottom line: Assume that JOINs are not a performance problem, write the queries with as many JOINs as needed. Then let's optimize that.
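If you do normalize, the existing |1|3|9| strings can be backfilled into the junction table with a one-time query (a sketch; it reuses the user_net table defined in the question and assumes net_ids always wraps every id in | delimiters):

```sql
-- One-time backfill: cross-check every user against every network
-- and insert a junction row wherever the delimited string matches.
INSERT INTO user_net (u_id, n_id)
SELECT u.u_id, n.n_id
FROM users u
JOIN network n
  ON u.net_ids LIKE CONCAT('%|', n.n_id, '|%');
```

After verifying the junction rows, the net_ids column can be dropped and the LIKE join replaced by the two-JOIN query.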

MySQL update column with value from a different table

I have two tables with the following structure and example content. Table one has the membership_no set to the correct values, but table two has some incorrect values in the membership_no column. I am needing to query both tables and check to see when the membership_no values are not equal, then update table two's membership_no column with the value from table one.
Table One:
id membership_no
====================
800960 800960
800965 800965
Table Two:
id membership_no
====================
800960 800970
800965 800975
Update query so far. It is not catching all of the incorrect values from table two.
UPDATE
tabletwo
INNER JOIN
tableone ON tabletwo.id = tableone.id
SET
tabletwo.membership_no = tableone.membership_no;
EDIT: Including SHOW CREATE and SELECT queries for unmatched membership_no column values.
Table One SHOW:
CREATE TABLE `n2z7m3_kiduka_accounts_j15` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`membership_no` int(11) NOT NULL,
...
`membershipyear` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=800987 DEFAULT CHARSET=utf8
Table Two SHOW:
CREATE TABLE `n2z7m3_kiduka_accounts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`membership_no` int(11) NOT NULL,
...
`membershipyear` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=801072 DEFAULT CHARSET=utf8
SELECT query for unmatched membership_no column values:
SELECT
u.name,
a.membership_no as 'Joomla 1.5 accounts table',
j.membership_no as 'Joomla 3.0 accounts table'
FROM
n2z7m3_kiduka_accounts_j15 AS a
INNER JOIN n2z7m3_users AS u ON a.user_id = u.id
INNER JOIN n2z7m3_kiduka_accounts AS j ON a.user_id = j.membership_no
and a.membership_no != j.membership_no
ORDER BY u.name;
While Tim's Answer is perfectly valid, another variation is to add the filter qualifier to the ON clause such that:
UPDATE tabletwo
INNER JOIN
tableone ON tabletwo.id = tableone.id AND tabletwo.membership_no <> tableone.membership_no
SET
tabletwo.membership_no = tableone.membership_no;
This means that you don't have a WHERE filter, so it will process all rows, but will act only on those with differing membership_no values. Because it is an INNER JOIN, rows that don't satisfy both conditions are simply skipped.
EDIT:
If you still suspect a problem, what does MySQL respond with, and do you get a specific error notice? With ~80k rows, it may take a while for the command to actually process, so are you giving it time to complete, or is PHP or the system aborting the command because the execution time expired? (Increase the execution time limits in PHP and MySQL and rerun the query to see if it then completes successfully.)
Suggestion
As another suggestion, I think your UNIQUE KEY should also be your auto-increment key, so for both tables:
DROP INDEX `user_id` ON <table> # removes the current unique index
then
CREATE UNIQUE INDEX `id` ON <table> (`id`) # adds a unique index on the AUTO_INCREMENT column
You just need to add a WHERE clause:
UPDATE
tabletwo
INNER JOIN
tableone
ON tabletwo.id = tableone.id
SET
tabletwo.membership_no = tableone.membership_no
WHERE tabletwo.membership_no <> tableone.membership_no

Ordering in MySQL Bogs Down

I've been working on a small Perl program that works with a table of articles, displaying them to the user if they have not been already read. It has been working nicely and it has been quite speedy, overall. However, this afternoon, the performance has degraded from fast enough that I wasn't worried about optimizing the query to a glacial 3-4 seconds per query. To select articles, I present this query:
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
WHERE ciid NOT
IN (
SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)
AND (
cid =117
OR cid =308
OR cid =310
)
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
The list of possible cid's varies and could be quite a bit more. In any case, I noted that about 2-3 seconds of the total time to make the query is devoted to "ORDER BY." If I remove that, it only takes about a half second to give me the query back. If I drop the subquery, the performance goes back to normal... but the subquery didn't seem to be problematic until just this afternoon, after working fine for a week or so.
Any ideas what could be slowing it down so much? What might I do to try to get the performance back up to snuff? The table being queried has 45,000 rows. The subquery's table has fewer than 3,000 rows at present.
Update: Incidentally, if anyone has suggestions on how to do multiple queries or some other technique that would be more efficient to accomplish what I am trying to do, I am all ears. I'm really puzzled how to solve the problem at this point. Can I somehow apply the order by before the join to make it apply to the real table and not the derived table? Would that be more efficient?
Here is the latest version of the query, derived from suggestions from #Gordon, below
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
LEFT JOIN (
SELECT ciid, dateRead
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)alreadyRead ON channelitem.ciid = alreadyRead.ciid
WHERE (
alreadyRead.ciid IS NULL
)
AND `cid`
IN ( 6648, 329, 323, 6654, 6647 )
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
Also, I should mention what my db structure looks like with regards to these two tables -- maybe someone can spot something odd about the structure:
CREATE TABLE IF NOT EXISTS `channelitem` (
`newsversion` int(11) NOT NULL DEFAULT '0',
`cid` int(11) NOT NULL DEFAULT '0',
`ciid` int(11) NOT NULL AUTO_INCREMENT,
`description` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`url` varchar(222) DEFAULT NULL,
`creationdate` datetime DEFAULT NULL,
`urgent` varchar(10) DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`lastchanged` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`author` varchar(255) NOT NULL,
PRIMARY KEY (`ciid`),
KEY `newsversion` (`newsversion`),
KEY `cid` (`cid`),
KEY `creationdate` (`creationdate`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1638554365 ;
CREATE TABLE IF NOT EXISTS `uninet_channelitem_read` (
`ciid` int(11) NOT NULL,
`uid` int(11) NOT NULL,
`dateRead` datetime NOT NULL,
PRIMARY KEY (`ciid`,`uid`),
KEY `ciid` (`ciid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
It never hurts to try the left outer join version of such a query:
SELECT ci.ciid, ci.cid, ci.name, ci.description, ci.url, ci.creationdate, ci.author
FROM `channelitem` ci left outer join
(SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
) cr
on ci.ciid = cr.ciid
where cr.ciid is null and
ci.cid in (117, 308, 310)
ORDER BY ci.`creationdate` DESC
LIMIT 0 , 100
This query will be faster with an index on uninet_channelitem_read(ciid) and probably on channelitem(cid, ciid, creationdate).
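An equivalent anti-join formulation uses NOT EXISTS; it is often worth comparing both forms with EXPLAIN (a sketch using the same tables and uid as above):

```sql
-- Keep only channel items with no matching read-marker row
-- for this user; same semantics as the LEFT JOIN ... IS NULL.
SELECT ci.ciid, ci.cid, ci.name, ci.description, ci.url,
       ci.creationdate, ci.author
FROM `channelitem` ci
WHERE ci.cid IN (117, 308, 310)
  AND NOT EXISTS (
        SELECT 1
        FROM `uninet_channelitem_read` r
        WHERE r.uid = '1030'
          AND r.ciid = ci.ciid
      )
ORDER BY ci.creationdate DESC
LIMIT 0, 100;
```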
The problem could be that you need an index on the channelitem table for the creationdate column; indexes help the database run queries faster.

MySQL optimization - large table joins

To start out here is a simplified version of the tables involved.
tbl_map has approx 4,000,000 rows, tbl_1 has approx 120 rows, and tbl_2 contains approx 5,000,000 rows. I know the data shouldn't be considered that large, given that Google, Yahoo!, etc. use much larger datasets, so I'm just assuming that I'm missing something.
CREATE TABLE `tbl_map` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`tbl_1_id` bigint(20) DEFAULT '-1',
`tbl_2_id` bigint(20) DEFAULT '-1',
`rating` decimal(3,3) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tbl_1_id` (`tbl_1_id`),
KEY `tbl_2_id` (`tbl_2_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `tbl_1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `tbl_2` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`data` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The query of interest (note: instead of ORDER BY RAND(), I use ORDER BY t.id DESC). The query takes as much as 5-10 seconds and causes a considerable wait when users view this page.
EXPLAIN SELECT t.data, t.id , tm.rating
FROM tbl_2 AS t
JOIN tbl_map AS tm
ON t.id = tm.tbl_2_id
WHERE tm.tbl_1_id =94
AND tm.rating IS NOT NULL
ORDER BY t.id DESC
LIMIT 200
1 SIMPLE tm ref tbl_1_id, tbl_2_id tbl_1_id 9 const 703438 Using where; Using temporary; Using filesort
1 SIMPLE t eq_ref PRIMARY PRIMARY 8 tm.tbl_2_id 1
I would just liked to speed up the query, ensure that I have proper indexes, etc.
I appreciate any advice from DB Gurus out there! Thanks.
SUGGESTION : Index the table as follows:
ALTER TABLE tbl_map ADD INDEX (tbl_1_id,rating,tbl_2_id);
As per Rolando, yes, you definitely need an index on the map table, but I would expand it to ALSO include tbl_2_id, which is what your ORDER BY clause sorts on (Table 2's id is mirrored in the map table, so the sort can be satisfied from that index). Also, since the index now holds all three fields, with equality on tbl_1_id and the rating IS NOT NULL criterion, the third column already delivers the rows in order for your ORDER BY clause.
INDEX (tbl_1_id,rating, tbl_2_id);
Then, I would just have the query as
SELECT STRAIGHT_JOIN
t.data,
t.id ,
tm.rating
FROM
tbl_map tm
join tbl_2 t
on tm.tbl_2_id = t.id
WHERE
tm.tbl_1_id = 94
AND tm.rating IS NOT NULL
ORDER BY
tm.tbl_2_id DESC
LIMIT 200