Multiple indexing in a table mysql - mysql

I have a table structure like this
CREATE TABLE `like_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sender_id` int(11) NOT NULL,
`receiver_id` int(11) NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `index_user` (`sender_id`,`receiver_id`)
);
I have a composite index on (sender_id, receiver_id). If I run this query:
Select * from like_user where sender_id = 10
the index is used, but for the reverse lookup it is not:
Select * from like_user where receiver_id = 11
How can I make both queries use an index?
The use case: sender_id is the user who likes someone, and receiver_id is the user being liked. When a sender wants to see all the users they like, the index is used; when a receiver wants to see who has liked them, it is not. How can we resolve this?

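MySQL can use only a leftmost prefix of a composite index, not a trailing column on its own. The index (sender_id, receiver_id) can therefore serve a lookup by sender_id, but not a lookup by receiver_id alone. I think two separate indexes, one on sender and another on receiver, would be reasonable: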
CREATE TABLE `like_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sender_id` int(11) NOT NULL,
`receiver_id` int(11) NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY (`sender_id`),
KEY (`receiver_id`)
);
Each reference to the table in a query can then use whichever of these indexes suits it. For example, for
SELECT *
FROM like_user t1
JOIN like_user t2 ON t1.sender_id = t2.receiver_id;
the first reference (t1) can use KEY (`sender_id`), whereas the second (t2) can use KEY (`receiver_id`).
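As a minimal sketch of the same idea applied to the existing table rather than a rebuilt one: the composite index from the question already serves lookups by sender_id, so adding one index that starts with receiver_id should be enough. The index name index_receiver is made up for illustration, and EXPLAIN is only a way to check which key the optimizer actually picks.

```sql
-- Hypothetical index name; any index whose first column is receiver_id would do.
ALTER TABLE like_user ADD INDEX index_receiver (receiver_id);

-- Check the `key` column of the output: the first query is expected to be able
-- to use index_user, and the second index_receiver, if the optimizer chooses them.
EXPLAIN SELECT * FROM like_user WHERE sender_id = 10;
EXPLAIN SELECT * FROM like_user WHERE receiver_id = 11;
```

Whether the optimizer uses an index for a given query still depends on the data, so the EXPLAIN output is worth checking on the real table.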

Related

How to optimize a MySQL select with rows that do not have matching values in the other table

This question is more or less the same as this one: MySQL select rows that do not have matching column in other table; however, the solution there is not practical for large data sets.
This table has ~120,000 rows.
CREATE TABLE `tblTimers` (
`TimerID` int(11) NOT NULL,
`TaskID` int(11) NOT NULL,
`UserID` int(11) NOT NULL,
`StartDateTime` datetime NOT NULL,
`dtStopTime` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `tblTimers`
ADD PRIMARY KEY (`TimerID`);
ALTER TABLE `tblTimers`
MODIFY `TimerID` int(11) NOT NULL AUTO_INCREMENT;
This table has ~70,000 rows.
CREATE TABLE `tblWorkDays` (
`WorkDayID` int(11) NOT NULL,
`TaskID` int(11) NOT NULL,
`UserID` int(11) NOT NULL,
`WorkDayDate` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `tblWorkDays`
ADD PRIMARY KEY (`WorkDayID`);
ALTER TABLE `tblWorkDays`
MODIFY `WorkDayID` int(11) NOT NULL AUTO_INCREMENT;
tblWorkDays should have one line per TaskID per UserID per WorkDayDate, but due to a bug, a few work days are missing despite there being timers for those days; so, I am trying to create a report that shows any timer that does not have a work day associated with it.
SELECT A.TimerID FROM tblTimers A
LEFT JOIN tblWorkDays B ON A.TaskID = B.TaskID AND A.UserID = B.UserID AND DATE(A.StartDateTime) = B.WorkDayDate
WHERE B.WorkDayID IS NULL
Running this causes the server to time out, so I am looking for a more efficient way to do it.
You don't have any indexes on the columns you're joining on, so it has to do full scans of both tables. Try adding the following:
ALTER TABLE tblTimers ADD INDEX (TaskID, UserID);
ALTER TABLE tblWorkDays ADD INDEX (TaskID, UserID);
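Going one step further than the answer above, here is a sketch under some assumptions. The join also compares DATE(A.StartDateTime) with B.WorkDayDate. Since the DATE() call is applied to the timer row, not to the work-day row, an index on tblWorkDays that also includes WorkDayDate should let each lookup be resolved from the index alone. The index name idx_task_user_date is hypothetical, and the NOT EXISTS form is just an alternative way to write the same anti-join.

```sql
-- Hypothetical index name; covers all three columns used to match a work day.
ALTER TABLE tblWorkDays
  ADD INDEX idx_task_user_date (TaskID, UserID, WorkDayDate);

-- Timers with no matching work day, written as NOT EXISTS.
SELECT A.TimerID
FROM tblTimers A
WHERE NOT EXISTS (
    SELECT 1
    FROM tblWorkDays B
    WHERE B.TaskID = A.TaskID
      AND B.UserID = A.UserID
      AND B.WorkDayDate = DATE(A.StartDateTime)
);
```

tblTimers is still read in full here, because every timer has to be checked; the index only makes each check against tblWorkDays cheap.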

MARIADB: Index not used for a select with join on a range

I have a first table containing my IPs stored as integers (500k rows), and a second one containing ranges of blacklisted IPs and the reason for blacklisting (10M rows).
Here is the table structure:
CREATE TABLE `black_lists` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`ip_start` INT(11) UNSIGNED NOT NULL,
`ip_end` INT(11) UNSIGNED NULL DEFAULT NULL,
`reason` VARCHAR(3) NOT NULL,
`excluded` TINYINT(1) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `ip_range` (`ip_end`, `ip_start`),
INDEX `ip_start` ( `ip_start`),
INDEX `ip_end` (`ip_end`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=10747741
;
CREATE TABLE `ips` (
`id` INT(11) NOT NULL AUTO_INCREMENT COMMENT 'Id ips',
`idhost` INT(11) NOT NULL COMMENT 'Id Host',
`ip` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Ip',
`ipint` INT(11) UNSIGNED NULL DEFAULT NULL COMMENT 'Int ip',
`type` VARCHAR(45) NULL DEFAULT NULL COMMENT 'Type',
PRIMARY KEY (`id`),
INDEX `host` (`idhost`),
INDEX `index3` (`ip`),
INDEX `index4` (`idhost`, `ip`),
INDEX `ipsin` (`ipint`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=675651;
My problem is that when I run this query, no index is used and it takes an eternity to finish:
select i.ip,s1.reason
from ips i
left join black_lists s1 on i.ipint BETWEEN s1.ip_start and s1.ip_end;
I'm using MariaDB 10.0.16
True.
The optimizer does not know that the start..end ranges are non-overlapping, nor anything else obvious about them. So, the best it can do is decide between
s1.ip_start <= i.ipint -- and use INDEX(ip_start), or
s1.ip_end >= i.ipint -- and use INDEX(ip_end)
Either of those could result in upwards of half the table being scanned.
In 2 steps you could achieve the desired goal for one IP; let's say @ip:
SELECT ip_start, reason
FROM black_lists
WHERE ip_start <= @ip
ORDER BY ip_start DESC
LIMIT 1
But after that, you need to check that the ip_end corresponding to that ip_start is >= @ip before deciding whether you have a blacklisted item.
SELECT reason
FROM ( ... ) a -- fill in the above query
JOIN black_lists b USING(ip_start)
WHERE b.ip_end >= @ip
That will return either the reason or no rows.
In spite of the complexity, it will be very fast. But, you seem to have a set of IPs to check. That makes it more complex.
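For the set-of-IPs case, one possible shape is sketched below. It assumes the ranges in black_lists do not overlap and that ip_start is unique; neither is stated in the question. It applies the same "largest ip_start not above the address, then check ip_end" lookup once per row of ips. The alias names x and candidate_start are made up for illustration.

```sql
-- For each address, find the closest range start at or below it
-- (one short index probe on ip_start per row), then confirm the range end.
SELECT x.ip, b.reason
FROM (
    SELECT i.ip,
           i.ipint,
           (SELECT s.ip_start
              FROM black_lists s
             WHERE s.ip_start <= i.ipint
             ORDER BY s.ip_start DESC
             LIMIT 1) AS candidate_start
    FROM ips i
) AS x
LEFT JOIN black_lists b
       ON b.ip_start = x.candidate_start
      AND b.ip_end >= x.ipint;
```

Addresses that fall in no range come back with a NULL reason, matching the LEFT JOIN in the original query. If ranges can overlap, this returns at most the range with the nearest start, so it would not be a drop-in replacement.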
For black_lists, there seems to be no need for id. I suggest you replace the 4 indexes with only 2:
PRIMARY KEY(ip_start, ip_end),
INDEX(ip_end)
In ips, isn't ip unique? If so, get rid of id and change 5 indexes to 3:
PRIMARY KEY(ipint),
INDEX(idhost, ip),
INDEX(ip)
You have allowed more than enough in the VARCHAR for IPv6, but not in INT UNSIGNED.

MySQL Query Optimization for large tables

I have a query that takes 50 seconds:
SELECT `security_tasks`.`itemid` AS `itemid`
FROM `security_tasks`
INNER JOIN `relations` ON (`relations`.`user_id` = `security_tasks`.`user_id` AND `relations`.`relation_type_id` = `security_tasks`.`relation_type_id` AND `relations`.`relation_with` = 3001 )
Records in security_tasks = 841321 || Records in relations = 234254
CREATE TABLE `security_tasks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`itemid` int(11) DEFAULT NULL,
`relation_type_id` int(11) DEFAULT NULL,
`Task_id` int(2) DEFAULT '0',
`job_id` int(2) DEFAULT '0',
`task_type_id` int(2) DEFAULT '0',
`name` int(2) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `itemid` (`itemid`),
KEY `relation_type_id` (`relation_type_id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1822995 DEFAULT CHARSET=utf8;
CREATE TABLE `relations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`relation_with` int(11) DEFAULT NULL,
`relation_type_id` int(11) DEFAULT NULL,
`manager_level` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `relation_with` (`relation_with`),
KEY `relation_type_id` (`relation_type_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1082882 DEFAULT CHARSET=utf8;
What can I do to make it fast, ideally 1 or 2 seconds?
EXPLAIN :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE relations ref user_id,relation_with,relation_type_id relation_with 5 const 169 Using where
1 SIMPLE security_tasks ref relation_type_id,user_id user_id 5 transparent.relations.user_id 569 Using where
UPDATE:
Adding composite keys reduced the time to 20 seconds:
ALTER TABLE security_tasks ADD INDEX (user_id, relation_type_id);
ALTER TABLE relations ADD INDEX (user_id, relation_type_id);
ALTER TABLE relations ADD INDEX (relation_with);
The problem occurs when the relations table has a lot of data for the selected user (relations.relation_with = 3001).
Any ideas?
Adjust your compound index slightly: include all three columns rather than just two.
ALTER TABLE relations ADD INDEX (user_id, relation_type_id, relation_with);
An index does not have to be limited to the joined columns. It should cover the joined columns plus anything else that makes sense as filtering criteria (within reason; it takes time to learn the finer points). In this case you join on user and type but also filter on a specific relation_with, so that column goes into the same index.
Additionally, on your security_tasks table you could add itemid to the index to make it a covering index, that is, one that covers the join conditions and the column(s) you want to retrieve. This technique should not be stretched to include every column in a query, but since this is a single column it may make sense for your scenario. Because the index already contains itemid, the engine does not have to go back to the table's data pages to fetch that one column; it reads the value from the same index entry that satisfied the join.
ALTER TABLE security_tasks ADD INDEX (user_id, relation_type_id, itemid);
For readability, especially with long table names, it's good to use aliases:
SELECT
st.itemid
FROM
security_tasks st
INNER JOIN relations r
ON st.user_id = r.user_id
AND st.relation_type_id = r.relation_type_id
AND r.relation_with = 3001
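A possible variation on the indexes above, offered as a sketch and not a measured result: the EXPLAIN in the question shows the optimizer starting from relations filtered by the constant relation_with, so an index that leads with that column may suit this access path. The index names below are made up for illustration, and whether this beats the ordering suggested above depends on the data.

```sql
-- Hypothetical names. Constant filter first, then the join columns,
-- so the relations side can be resolved from the index alone.
ALTER TABLE relations
  ADD INDEX idx_with_user_type (relation_with, user_id, relation_type_id);

-- Join columns first, itemid last so the index covers the SELECT list.
ALTER TABLE security_tasks
  ADD INDEX idx_user_type_item (user_id, relation_type_id, itemid);

-- Re-check the plan; "Using index" in the Extra column would indicate
-- that a covering index is being used for that table.
EXPLAIN
SELECT st.itemid
FROM security_tasks st
INNER JOIN relations r
        ON st.user_id = r.user_id
       AND st.relation_type_id = r.relation_type_id
       AND r.relation_with = 3001;
```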

Is it possible to merge two tables by primary key?

I have two tables, which I need to merge, and they are:
CREATE TABLE IF NOT EXISTS `legacy_bookmarks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`url` text,
`title` text,
`snippet` text,
`datetime` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `datetime` (`datetime`),
FULLTEXT KEY `title` (`title`,`snippet`)
)
And:
CREATE TABLE IF NOT EXISTS `legacy_links` (
`id` mediumint(11) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(11) NOT NULL,
`bookmark_id` int(11) NOT NULL,
`status` enum('public','private') NOT NULL DEFAULT 'public',
UNIQUE KEY `id` (`id`),
KEY `bookmark_id` (`bookmark_id`)
)
As you can see, "legacy_links" contains the ID for "legacy_bookmarks". Am I able to merge the two, based on this relationship?
I can easily change the name of the ID column in "legacy_bookmarks" to "bookmark_id", if that makes things any easier.
Just so you know, the order of the columns, and their types, must be exact, because the data from this combined table is then to be imported into the new "bookmarks" table.
Also, I'd need to able to include additional columns (a "modification" column, populated with the "datetime" values), and change the order of the ones I have.
Any takers?
[Up to you to change the order of the columns]
CREATE TABLE `legacy_linkss` AS
SELECT l.id, b.url, b.title, b.snippet, b.datetime AS modification, l.user_id, l.status
FROM
`legacy_links` l
JOIN `legacy_bookmarks` b ON b.id = l.bookmark_id
;
Afterwards, once you have checked the data for consistency and manually added the constraints, you may:
DROP TABLE `legacy_links`;
DROP TABLE `legacy_bookmarks`;
RENAME TABLE `legacy_linkss` TO `legacy_links`;
Yes, it's called a join, and you would do it like so:
SELECT *
FROM legacy_bookmarks lb
INNER JOIN legacy_links ll ON ll.bookmark_id = lb.id

Mysql query optimisation, EXPLAIN and slow execution

I'm having some real issues with a few queries, this one in particular. Info below.
tgmp_games, about 20k rows
CREATE TABLE IF NOT EXISTS `tgmp_games` (
`g_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`g_name` varchar(255) NOT NULL,
`g_link` varchar(255) NOT NULL,
`g_url` varchar(255) NOT NULL,
`g_platforms` varchar(128) NOT NULL,
`g_added` datetime NOT NULL,
`g_cover` varchar(255) NOT NULL,
`g_impressions` int(8) NOT NULL,
PRIMARY KEY (`g_id`),
KEY `g_platforms` (`g_platforms`),
KEY `site_id` (`site_id`),
KEY `g_link` (`g_link`),
KEY `g_release` (`g_release`),
KEY `g_genre` (`g_genre`),
KEY `g_name` (`g_name`),
KEY `g_impressions` (`g_impressions`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
tgmp_reviews - about 200k rows
CREATE TABLE IF NOT EXISTS `tgmp_reviews` (
`r_id` int(8) NOT NULL AUTO_INCREMENT,
`site_id` int(6) NOT NULL,
`r_source` varchar(128) NOT NULL,
`r_date` date NOT NULL,
`r_score` int(3) NOT NULL,
`r_copy` text NOT NULL,
`r_link` text NOT NULL,
`r_int_link` text NOT NULL,
`r_parent` int(8) NOT NULL,
`r_platform` varchar(12) NOT NULL,
`r_impressions` int(8) NOT NULL,
PRIMARY KEY (`r_id`),
KEY `site_id` (`site_id`),
KEY `r_parent` (`r_parent`),
KEY `r_platform` (`r_platform`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
Here is the query; it takes about 3 seconds:
SELECT * FROM tgmp_games g
RIGHT JOIN tgmp_reviews r ON g_id = r.r_parent
WHERE g.site_id = '34'
GROUP BY g_name
ORDER BY g_impressions DESC LIMIT 15
EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE r ALL r_parent NULL NULL NULL 201133 Using temporary; Using filesort
1 SIMPLE g eq_ref PRIMARY,site_id PRIMARY 4 engine_comp.r.r_parent 1 Using where
I am just trying to grab the 15 most viewed games, then grab a single review for each game (it doesn't really matter which; I guess the highest rated by r_score would be ideal).
Can someone help me figure out why this is so horribly inefficient?
I don't understand the purpose of the GROUP BY g_name in your query, but it forces MySQL to aggregate over the selected columns, which here means every column from both tables. Try removing it and check whether that helps.
Also, RIGHT JOIN makes the database read tgmp_reviews first, which is not what you want. I suppose LEFT JOIN is a better choice here; try changing the join type.
If neither of these helps, you need to redesign your query. Since you need the 15 most viewed games for the site, the query would be:
SELECT g_id
FROM tgmp_games g
WHERE site_id = 34
ORDER BY g_impressions DESC
LIMIT 15;
This is the part the database should execute first, as it is the most selective. Then you can get the desired reviews for those games:
SELECT r_parent, max(r_score)
FROM tgmp_reviews r
WHERE r_parent IN (/*1st query*/)
GROUP BY r_parent;
This construct forces the database to execute the first query first and gives you the maximum score for each of the wanted games. I hope you will be able to use the results for your purpose.
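One caveat when combining the two queries into a single statement: as far as I know, MySQL does not accept LIMIT directly inside an IN (...) subquery, so pasting the first query into the second verbatim may be rejected. A sketch that keeps the same idea uses a derived table instead. The index name idx_site_impressions is made up for illustration.

```sql
-- Optional, hypothetical index: lets the top-15 lookup filter by site and
-- read rows in g_impressions order without sorting the whole table.
ALTER TABLE tgmp_games
  ADD INDEX idx_site_impressions (site_id, g_impressions);

-- Pick the 15 most viewed games first, then attach the best review score.
SELECT g.g_id, g.g_name, g.g_impressions, MAX(r.r_score) AS best_score
FROM (
    SELECT g_id, g_name, g_impressions
    FROM tgmp_games
    WHERE site_id = 34
    ORDER BY g_impressions DESC
    LIMIT 15
) AS g
LEFT JOIN tgmp_reviews AS r ON r.r_parent = g.g_id
GROUP BY g.g_id, g.g_name, g.g_impressions
ORDER BY g.g_impressions DESC;
```

Games with no reviews still appear, with a NULL best_score, because of the LEFT JOIN.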
Your MyISAM table is small, so you could try converting it to InnoDB to see whether that resolves the issue. Do you have a reason for using MyISAM instead of InnoDB for that table?
You can also try running ANALYZE TABLE on each table to update the statistics and see whether the optimizer chooses a different plan.