Sql - query taking long - mysql

I have two tables. One is called map_life, and the second one is called scripts. The map_life table has a lot of rows, which are identified by a column called lifeid. The rows in the scripts table are identified by the column objectid. I want to create a query that gets all the rows from map_life and also adds the script column from the scripts table where lifeid matches objectid and script_type is npc.
I created the following query.
SELECT id
,lifeid
,x_pos
,y_pos
,foothold
,min_click_pos
,max_click_pos
,respawn_time
,flags
,script.script
FROM map_life life
LEFT JOIN scripts script
ON script.objectid = life.lifeid
AND script.script_type = 'npc'
However, that query takes a lot of time. Any way I can tune it? Thanks.
EDIT: I have run the EXPLAIN command; here are the results.
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
1,"SIMPLE","life","ALL","","","","",47600,""
1,"SIMPLE","script","ref","PRIMARY","PRIMARY","1","const",1834,"Using where"
EDIT 2: Here are the create statements of each table.
map_life
DROP TABLE IF EXISTS `mcdb`.`map_life`;
CREATE TABLE `mcdb`.`map_life` (
`id` bigint(21) unsigned NOT NULL AUTO_INCREMENT,
`mapid` int(11) NOT NULL,
`life_type` enum('npc','mob','reactor') NOT NULL,
`lifeid` int(11) NOT NULL,
`life_name` varchar(50) DEFAULT NULL COMMENT 'For reactors, specifies a handle so scripts may interact with them; for NPC/mob, this field is useless',
`x_pos` smallint(6) NOT NULL DEFAULT '0',
`y_pos` smallint(6) NOT NULL DEFAULT '0',
`foothold` smallint(6) NOT NULL DEFAULT '0',
`min_click_pos` smallint(6) NOT NULL DEFAULT '0',
`max_click_pos` smallint(6) NOT NULL DEFAULT '0',
`respawn_time` int(11) NOT NULL DEFAULT '0',
`flags` set('faces_left') NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`lifeid`) USING BTREE,
KEY `lifetype` (`mapid`,`life_type`)
) ENGINE=InnoDB AUTO_INCREMENT=47557 DEFAULT CHARSET=latin1;
scripts
DROP TABLE IF EXISTS `mcdb`.`scripts`;
CREATE TABLE `mcdb`.`scripts` (
`script_type` enum('npc','reactor','quest','item','map','map_enter','map_first_enter') NOT NULL,
`helper` tinyint(3) NOT NULL DEFAULT '-1' COMMENT 'Represents the quest state for quests, and the index of the script for NPCs (NPCs may have multiple scripts).',
`objectid` int(11) NOT NULL DEFAULT '0',
`script` varchar(40) NOT NULL DEFAULT '',
PRIMARY KEY (`script_type`,`helper`,`objectid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Lists all the scripts that belong to NPCs/reactors/etc. ';

You should probably add an index on the script_type column, depending on its type. If the column uses a type that cannot be indexed, change the type if possible and then index it.
Here is a link that discusses more about indexes with MySQL, http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html

Your primary key on scripts is:
PRIMARY KEY (`script_type`,`helper`,`objectid`)
The order of multi-column keys is important.
From the docs:
Any index that does not span all AND levels in the WHERE clause is not
used to optimize the query. In other words, to be able to use an
index, a prefix of the index must be used in every AND group.
Your primary key on scripts does include both the script_type and objectid columns, which are both used in the join's ON clause:
ON script.objectid = life.lifeid
AND script.script_type = 'npc'
but the primary key also includes the helper column between those two, so MySQL can only use the primary key index for searching using the first column (script_type).
So, for every join, MySQL must search through all scripts records where script_type is 'npc' to find the particular objectid record to join on.
MySQL could fully utilize the primary key index if your ON clause included all three columns like this:
ON script.objectid = life.lifeid
AND script.helper = 1
AND script.script_type = 'npc'
If you often query the scripts table without specifying the helper column, consider changing the order of the columns in the primary key to put the helper column last:
PRIMARY KEY (`script_type`,`objectid`,`helper`)
Then, your original ON clause is appropriate for the index because the index prefix includes all of the columns in your predicate (script_type,objectid):
ON script.objectid = life.lifeid
AND script.script_type = 'npc'
Alternatively, add an additional index with just the two columns mentioned in the ON clause:
KEY `script_type_objectid` (`script_type`,`objectid`)
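As a rough sketch of that last option, the extra index could be added and then checked with EXPLAIN. The index name idx_scripts_type_objectid is made up for illustration; the columns are the two from the ON clause. If the optimizer picks the new index, the script row of the EXPLAIN output should name it in the key column and show a much smaller rows estimate than 1834.

```sql
-- Hypothetical index name; columns follow the ON clause of the original query.
ALTER TABLE scripts
  ADD INDEX idx_scripts_type_objectid (script_type, objectid);

-- Re-run EXPLAIN to see whether the join on scripts now uses the new index.
EXPLAIN
SELECT life.id, life.lifeid, script.script
FROM map_life life
LEFT JOIN scripts script
  ON script.objectid = life.lifeid
 AND script.script_type = 'npc';
```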

Related

MySQL composite index effect on joins

I have the following SQL query (DB is MySQL 5):
select
event.full_session_id,
DATE(min(event.date)),
event_exe.user_id,
COUNT(DISTINCT event_pat.user_id)
FROM
event AS event
JOIN event_participant AS event_pat ON
event.pat_id = event_pat.id
JOIN event_participant AS event_exe on
event.exe_id = event_exe.id
WHERE
event_pat.user_id <> event_exe.user_id
GROUP BY
event.full_session_id;
"SHOW CREATE TABLE event":
CREATE TABLE `event` (
`id` int(12) NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`session_id` varchar(64) DEFAULT NULL,
`full_session_id` varchar(72) DEFAULT NULL,
`pat_id` int(12) DEFAULT NULL,
`exe_id` int(12) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `SESSION_IDX` (`full_session_id`),
KEY `PAT_ID_IDX` (`pat_id`),
KEY `DATE_IDX` (`date`),
KEY `SESSLOGPATEXEC_IDX` (`full_session_id`,`date`,`pat_id`,`exe_id`)
) ENGINE=MyISAM AUTO_INCREMENT=371955 DEFAULT CHARSET=utf8
"SHOW CREATE TABLE event_participant":
CREATE TABLE `event_participant` (
`id` int(12) NOT NULL AUTO_INCREMENT,
`user_id` varchar(64) NOT NULL,
`alt_user_id` varchar(64) NOT NULL,
`username` varchar(128) NOT NULL,
`usertype` varchar(32) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ALL_UNQ` (`user_id`,`alt_user_id`,`username`,`usertype`),
KEY `USER_ID_IDX` (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=5397 DEFAULT CHARSET=utf8
Also, the query itself seems ugly, but this is legacy code on a production system, so we are not expected to change it (at least for now).
The problem is that there are around 36 million records in the event table (in the production system), so the DB machine has crashed frequently due to "Using temporary; Using filesort" processing (they provided these EXPLAIN outputs; unfortunately, I don't have them right now. I'll try to add them to this post later.)
The customer asks for a "quick fix" by adding indices. Currently we have indices on full_session_id, pat_id, date (separately) on event and user_id on event_participant.
Thus I'm thinking of creating a composite index (pat_id, exe_id, full_session_id, date) on event. This index consists of the fields in the join (equivalent to where?), then the group by, then the aggregate (min) parts.
This is just an idea because we currently don't have that kind of data volume to test, so we are doing the best we can first.
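As a sketch only, the proposed index and a check of the query plan could look like this. The index name is made up for illustration, and whether the optimizer actually uses the index would have to be confirmed with EXPLAIN on production-sized data.

```sql
-- Hypothetical index name; column order follows the idea described above.
ALTER TABLE event
  ADD INDEX idx_event_pat_exe_session_date (pat_id, exe_id, full_session_id, `date`);

-- The legacy query, unchanged, prefixed with EXPLAIN to inspect the plan.
EXPLAIN
SELECT event.full_session_id,
       DATE(MIN(event.date)),
       event_exe.user_id,
       COUNT(DISTINCT event_pat.user_id)
FROM event AS event
JOIN event_participant AS event_pat ON event.pat_id = event_pat.id
JOIN event_participant AS event_exe ON event.exe_id = event_exe.id
WHERE event_pat.user_id <> event_exe.user_id
GROUP BY event.full_session_id;
```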
My question is:
Could the index above help performance? (The effect is quite confusing because I have found two contrasting results: https://dba.stackexchange.com/questions/158385/compound-index-on-inner-join-table
versus "Separate Join clause in a Composite Index". The former suggests that a composite index on joins will work, and the latter that it won't.)
Does this path (adding indices) have any hope? Or should we forget it and just try to optimize the query instead?
Thanks in advance for your help :)
Update:
I have updated the full table description for the two related tables.
MySQL version is 5.1.69. But I think we don't need to worry about the ambiguous data issue mentioned in the comments, because it seems there won't be any ambiguity in our data. Specifically, for each full_session_id, there is only one "event_exe.user_id" returned (it's just business logic in the application).
So, what do you think about my 2 questions ?

MySQL query with multiple joins taking too long to execute

I have 3 tables. The first one is called map_life, the second one is called scripts and the third one is called npc_data.
I'm running the following query to get all the properties from map_life while also getting the script column from scripts and the storage_cost column from npc_data if the ids match.
SELECT life.*
, script.script
, npc.storage_cost
FROM map_life life
LEFT
JOIN scripts script
ON script.objectid = life.lifeid
AND script.script_type = 'npc'
LEFT
JOIN npc_data npc
ON npc.npcid = life.lifeid
As you can see, map_life id is lifeid, while scripts id is objectid and npc_data id is npcid.
This query is taking about 5 seconds to execute, and I have no idea why. Here are the CREATE statements for all 3 tables; maybe I'm missing something?
CREATE TABLE `mcdb83`.`map_life` (
`id` bigint(21) unsigned NOT NULL AUTO_INCREMENT,
`mapid` int(11) NOT NULL,
`life_type` enum('npc','mob','reactor') NOT NULL,
`lifeid` int(11) NOT NULL,
`life_name` varchar(50) DEFAULT NULL COMMENT 'For reactors, specifies a handle so scripts may interact with them; for NPC/mob, this field is useless',
`x_pos` smallint(6) NOT NULL DEFAULT '0',
`y_pos` smallint(6) NOT NULL DEFAULT '0',
`foothold` smallint(6) NOT NULL DEFAULT '0',
`min_click_pos` smallint(6) NOT NULL DEFAULT '0',
`max_click_pos` smallint(6) NOT NULL DEFAULT '0',
`respawn_time` int(11) NOT NULL DEFAULT '0',
`flags` set('faces_left') NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `lifetype` (`mapid`,`life_type`)
) ENGINE=InnoDB AUTO_INCREMENT=32122 DEFAULT CHARSET=latin1;
CREATE TABLE `mcdb83`.`scripts` (
`script_type` enum('npc','reactor','quest','item','map_enter','map_first_enter') NOT NULL,
`helper` tinyint(3) NOT NULL DEFAULT '-1' COMMENT 'Represents the quest state for quests, and the index of the script for NPCs (NPCs may have multiple scripts).',
`objectid` int(11) NOT NULL DEFAULT '0',
`script` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`script_type`,`helper`,`objectid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Lists all the scripts that belong to NPCs/reactors/etc. ';
CREATE TABLE `mcdb83`.`npc_data` (
`npcid` int(11) NOT NULL,
`storage_cost` int(11) NOT NULL DEFAULT '0',
`flags` set('maple_tv','is_guild_rank') NOT NULL DEFAULT '',
PRIMARY KEY (`npcid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
For this query:
SELECT l.*, s.script, npc.storage_cost
FROM map_life l LEFT JOIN
scripts s
ON s.objectid = l.lifeid AND
s.script_type = 'npc' LEFT JOIN
npc_data npc
ON npc.npcid = l.lifeid;
You want indexes on: scripts(objectid, script_type, script) and npc_data(npcid, storage_cost). The order of the columns in these indexes is important.
map_life.lifeid does not have any indexes defined, so the joins will result in full table scans. Define an index on the map_life.lifeid field.
In the scripts table, the primary key is defined on the following fields in this order: script_type, helper, objectid. The join is done on objectid and there is a constant filter criterion on script_type. Because the order of the fields in the index is wrong, MySQL cannot use the primary key for this query. For this query the order of the fields in the index should be: objectid, script_type, helper.
The above will significantly speed up the joins. You may further increase the speed of the query if your indexes cover all the fields that are in the query, because in that case MySQL does not even have to touch the tables.
Consider adding an index on objectid, script_type, script (in that order) to the scripts table, and an index on npcid, storage_cost to the npc_data table. However, these indexes may slow down insert / update / delete statements, so do some performance testing before adding them to a production environment.
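The suggestions above could be sketched as the statements below. All three index names are made up for illustration, and the column names are taken from the CREATE statements in the question. If the covering indexes are used, the Extra column of EXPLAIN should show "Using index" for the scripts and npc_data rows.

```sql
-- Hypothetical index names; columns come from the CREATE TABLE statements above.
ALTER TABLE map_life ADD INDEX idx_map_life_lifeid (lifeid);
ALTER TABLE scripts  ADD INDEX idx_scripts_cover (objectid, script_type, script);
ALTER TABLE npc_data ADD INDEX idx_npc_data_cover (npcid, storage_cost);

-- Check the plan after adding the indexes.
EXPLAIN
SELECT l.*, s.script, npc.storage_cost
FROM map_life l
LEFT JOIN scripts s
  ON s.objectid = l.lifeid
 AND s.script_type = 'npc'
LEFT JOIN npc_data npc
  ON npc.npcid = l.lifeid;
```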

MySQL Query Optimization for large tables

I have a query that takes 50 seconds:
SELECT `security_tasks`.`itemid` AS `itemid`
FROM `security_tasks`
INNER JOIN `relations` ON (`relations`.`user_id` = `security_tasks`.`user_id` AND `relations`.`relation_type_id` = `security_tasks`.`relation_type_id` AND `relations`.`relation_with` = 3001 )
Records in security_tasks = 841321 || Records in relations = 234254
CREATE TABLE `security_tasks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`itemid` int(11) DEFAULT NULL,
`relation_type_id` int(11) DEFAULT NULL,
`Task_id` int(2) DEFAULT '0',
`job_id` int(2) DEFAULT '0',
`task_type_id` int(2) DEFAULT '0',
`name` int(2) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `itemid` (`itemid`),
KEY `relation_type_id` (`relation_type_id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1822995 DEFAULT CHARSET=utf8;
CREATE TABLE `relations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`relation_with` int(11) DEFAULT NULL,
`relation_type_id` int(11) DEFAULT NULL,
`manager_level` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `relation_with` (`relation_with`),
KEY `relation_type_id` (`relation_type_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1082882 DEFAULT CHARSET=utf8;
What can I do to make it fast, like 1 or 2 seconds fast?
EXPLAIN :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE relations ref user_id,relation_with,relation_type_id relation_with 5 const 169 Using where
1 SIMPLE security_tasks ref relation_type_id,user_id user_id 5 transparent.relations.user_id 569 Using where
UPDATE :
Adding a composite key reduced the time to 20 seconds:
ALTER TABLE security_tasks ADD INDEX (user_id, relation_type_id);
ALTER TABLE relations ADD INDEX (user_id, relation_type_id);
ALTER TABLE relations ADD INDEX (relation_with);
The problem occurs when the relations table has a lot of data for the selected user (`relations`.`relation_with` = 3001).
Any ideas?
Adjust your compound index slightly: don't index just two columns, but all three.
ALTER TABLE relations ADD INDEX (user_id, relation_type_id, relation_with)
The index does not have to be on the joined columns only. It SHOULD be based on the joined columns PLUS anything else that makes sense as querying criteria (within reason; it takes time to learn where the efficiencies are). So, in the case suggested, you join on the user and type, but you also filter on a specific relation_with value, so that column is added to the same index.
Additionally, on your security_tasks table, you could add itemid to the index to make it a covering index (i.e., one that covers the join conditions AND the data element(s) you want to retrieve). This is a technique too, and the index should NOT include every other column in a query, but since this is a single column it might make sense for your scenario. So, look into "covering indexes". In essence, a covering index qualifies the join, and since it also contains itemid, the engine does not have to go back to the raw data pages of the entire security_tasks table to get that one column. The column is part of the index, so it comes along for the ride with whatever qualified the join, and you are done.
ALTER TABLE security_tasks ADD INDEX (user_id, relation_type_id, itemid) ;
And for readability purposes, especially with long table names, it's good to use aliases
SELECT
st.itemid
FROM
security_tasks st
INNER JOIN relations r
ON st.user_id = r.user_id
AND st.relation_type_id = r.relation_type_id
AND r.relation_with = 3001
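To check whether the suggested indexes are actually picked up, the aliased query can be prefixed with EXPLAIN. This is only a sketch of the verification step. If the three-column index on security_tasks is working as a covering index, the Extra column for the st row should include "Using index".

```sql
-- Inspect the plan after adding the indexes suggested above.
EXPLAIN
SELECT st.itemid
FROM security_tasks st
INNER JOIN relations r
   ON st.user_id = r.user_id
  AND st.relation_type_id = r.relation_type_id
  AND r.relation_with = 3001;
```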

Unique (multiple columns) and null in one column

I have a simple categories table. A category can have a parent category (the par_cat column), or null if it is a main category, and under the same parent category there shouldn't be 2 or more categories with the same name or url.
Code for this table:
CREATE TABLE IF NOT EXISTS `categories` (
`id` int(10) unsigned NOT NULL,
`par_cat` int(10) unsigned DEFAULT NULL,
`lang` varchar(2) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'pl',
`name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`url` varchar(120) COLLATE utf8_unicode_ci NOT NULL,
`active` tinyint(3) unsigned NOT NULL DEFAULT '1',
`accepted` tinyint(3) unsigned NOT NULL DEFAULT '1',
`priority` int(10) unsigned NOT NULL DEFAULT '1000',
`entries` int(10) unsigned NOT NULL DEFAULT '0',
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=3 ;
ALTER TABLE `categories`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `categories_name_par_cat_unique` (`name`,`par_cat`),
ADD UNIQUE KEY `categories_url_par_cat_unique` (`url`,`par_cat`),
ADD KEY `categories_par_cat_foreign` (`par_cat`);
ALTER TABLE `categories`
MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=3;
ALTER TABLE `categories` ADD CONSTRAINT `categories_par_cat_foreign`
FOREIGN KEY (`par_cat`) REFERENCES `categories` (`id`);
The problem is that even though I have unique keys, it doesn't work. If I try to insert 2 categories that have par_cat set to null and the same name and url, both can be inserted into the database without a problem (and they shouldn't be). However, if I choose another par_cat for those categories (for example 1, assuming a category with id 1 exists), only the first record will be inserted (and that's the desired behaviour).
Question: how do I handle this case? I read that:
A UNIQUE index creates a constraint such that all values in the index
must be distinct. An error occurs if you try to add a new row with a
key value that matches an existing row. This constraint does not apply
to NULL values except for the BDB storage engine. For other engines, a
UNIQUE index permits multiple NULL values for columns that can contain
NULL. If you specify a prefix value for a column in a UNIQUE index,
the column values must be unique within the prefix.
However, with a unique key on multiple columns I expected this not to be the case (only par_cat can be null; name and url cannot be null). Because par_cat references the id of the same table, but some categories don't have a parent category, it has to allow null values.
This works as defined by the SQL standard. NULL means unknown. If you have two records of par_cat = NULL and name = 'X', then the two NULLs are not regarded to hold the same value. Thus they don't violate the unique key constraint. (Well, one could argue that the NULLs still might mean the same value, but applying this rule would make working with unique indexes and nullable fields almost impossible, for NULL could as well mean 1, 2 or whatever other value. So they did well to define it such as they did in my opinion.)
As MySQL does not support functional indexes where you could have an index on IFNULL(par_cat,-1), name, your only option is to make par_cat a NOT NULL column with 0 or -1 or whatever for "no parent", if you want your constraints to work.
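The NULL behaviour described above can be reproduced with a small throwaway table. The table name demo_unique_null is made up for this sketch and is not part of the original schema.

```sql
-- Hypothetical demo table with the same kind of composite unique key.
CREATE TABLE demo_unique_null (
  name    VARCHAR(100) NOT NULL,
  par_cat INT UNSIGNED DEFAULT NULL,
  UNIQUE KEY demo_name_par_cat (name, par_cat)
) ENGINE=InnoDB;

-- Both rows are accepted: the two NULLs are not considered equal.
INSERT INTO demo_unique_null (name, par_cat) VALUES ('X', NULL);
INSERT INTO demo_unique_null (name, par_cat) VALUES ('X', NULL);

-- With a non-NULL parent the constraint applies:
-- the second insert fails with a duplicate-entry error.
INSERT INTO demo_unique_null (name, par_cat) VALUES ('X', 1);
INSERT INTO demo_unique_null (name, par_cat) VALUES ('X', 1);
```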
I see that this was asked in 2014.
However it is often requested from MySQL: https://bugs.mysql.com/bug.php?id=8173 and https://bugs.mysql.com/bug.php?id=17825 for example.
People can click "Affects Me" on those reports to try to get attention from MySQL.
Since MySQL 5.7 we can now use the following workaround:
ALTER TABLE categories
ADD generated_par_cat INT UNSIGNED AS (ifNull(par_cat, 0)) NOT NULL,
ADD UNIQUE INDEX categories_name_generated_par_cat (name, generated_par_cat),
ADD UNIQUE INDEX categories_url_generated_par_cat (url, generated_par_cat);
generated_par_cat is a virtual generated column, so it takes no storage space in the row. When a user inserts (or updates) a row, the unique indexes cause the value of generated_par_cat to be computed on the fly, which is a very quick operation.
Just in case you come from Laravel...
This is the Laravel migration version of the virtual column that works around the UNIQUE issue when one of the columns is NULL:
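A quick way to see the workaround in action, assuming the ALTER TABLE above has been applied to the categories table from the question (the sample values are made up):

```sql
-- The first top-level category is accepted.
INSERT INTO categories (par_cat, name, url) VALUES (NULL, 'Books', 'books');

-- The second one should now be rejected with a duplicate-entry error,
-- because generated_par_cat evaluates to 0 for both rows.
INSERT INTO categories (par_cat, name, url) VALUES (NULL, 'Books', 'books');
```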
$table->integer('generated_par_cat')->virtualAs('ifNull(par_cat, 0)');
$table->unique(['name', 'generated_par_cat'], 'name_par_cat_unique');

Increase SELECT performance for table with only primary keys

I have a table with 8 columns, as shown below in the create statement.
Rows have to be unique, that is, no two rows can have exactly the same value in every column. To this end I made every column part of the primary key.
However, performing a select as shown below takes extremely long because, I suppose, MySQL has to scan every row to find results. As the table is pretty large, this takes a lot of time.
Do you have any suggestions for how I could increase performance?
EDIT create statement:
CREATE TABLE `volatilities` (
`instrument` varchar(45) NOT NULL DEFAULT '',
`baseCurrencyId` int(11) NOT NULL,
`tenor` varchar(45) NOT NULL DEFAULT '',
`tenorUnderlying` varchar(45) NOT NULL DEFAULT '',
`strike` double NOT NULL DEFAULT '0',
`evalDate` date NOT NULL DEFAULT '0000-00-00',
`volatility` double NOT NULL DEFAULT '0',
`underlying` varchar(45) NOT NULL DEFAULT '',
PRIMARY KEY (`instrument`,`baseCurrencyId`,`tenor`,`tenorUnderlying`,`strike`,`evalDate`,`volatility`,`underlying`)) ENGINE=InnoDB DEFAULT CHARSET=utf8
Select statement:
SELECT evalDate,
max(case when strike = 0.25 then volatility end) as '0.25'
FROM niels_testdb.volatilities
WHERE
instrument = 'Swaption' and tenor = '3M'
and tenorUnderlying = '3M' and strike = 0.25
GROUP BY
evalDate
One of your requirements is that all the rows need to have unique values, which is why you created the table with a composite primary key over all columns. But your table WOULD still allow duplicated values in every individual column, as long as the rows themselves are unique.
Take a look at this SQL Fiddle: http://sqlfiddle.com/#!2/85ae6
In there you'll see that the columns instrument and tenor do have duplicate values.
I'd say you need to look further into how unique keys work and what primary keys are.
My suggestion is to re-think your requirements, work out what needs to be unique and why, and choose a different structure to support that decision. A composite primary key over every column is, in this case, not the way to go.
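One possible direction, which goes beyond what the answer above states, is a secondary index whose leading columns match the WHERE clause of the query. The existing primary key has baseCurrencyId in second position, and the query does not filter on it, so only the instrument prefix of the key can be used for the lookup. The sketch below uses a made-up index name, and whether it helps would need to be checked with EXPLAIN on the real data.

```sql
-- Hypothetical index: equality columns from the WHERE clause first,
-- then the GROUP BY column, then the aggregated value.
ALTER TABLE volatilities
  ADD INDEX idx_vol_lookup (instrument, tenor, tenorUnderlying, strike, evalDate, volatility);

-- Check whether the query now uses the new index.
EXPLAIN
SELECT evalDate,
       MAX(CASE WHEN strike = 0.25 THEN volatility END) AS '0.25'
FROM volatilities
WHERE instrument = 'Swaption' AND tenor = '3M'
  AND tenorUnderlying = '3M' AND strike = 0.25
GROUP BY evalDate;
```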