Related
Below is the query that will going to run on two tables with 60+ million and 400+ million records. Only the table name will be different, otherwise query is same for both the tables.
SELECT * FROM
(
SELECT A.CUSIP, A.ISIN, A.SEDOL, A.LocalCode, A.MIC, A.ExchgCD, A.PrimaryExchgCD, A.Currency, A.Open, A.High, A.Low, A.Close, A.Mid, A.Ask, A.Last,
A.Bid, A.Bidsize, A.Asksize, A.TradedVolume, A.SecID, A.PriceDate, A.MktCloseDate, A.VolFlag, A.IssuerName, A.TotalTrades, A.CloseType, A.SectyCD,
row_number() OVER (partition by A.CUSIP order by A.MktCloseDate desc) as 'rank'
from EDI_Price04 A
WHERE A.CUSIP IN (
"91879Q109", "583840509", "583840608", "59001A102", "552848103") AND (A.PrimaryExchgCD = A.ExchgCD) AND A.CloseType='CC'
) t WHERE t.rank <= 3;
When A.CUSIP IN () condition have 10-15 values, the query complete in 2-3sec. With 400 values it took 28sec. But I want to make A.CUSIP IN () take 2k-3k value at a time.
This is my table structure.
CREATE TABLE `EDI_Price04` (
`MIC` varchar(6) NOT NULL DEFAULT '',
`LocalCode` varchar(60) NOT NULL DEFAULT '' COMMENT 'PricefileSymbol',
`ISIN` varchar(12) DEFAULT NULL,
`Currency` varchar(3) NOT NULL DEFAULT '',
`PriceDate` date DEFAULT NULL,
`Open` double DEFAULT NULL,
`High` double DEFAULT NULL,
`Low` double DEFAULT NULL,
`Close` double DEFAULT NULL,
`Mid` double DEFAULT NULL,
`Ask` double DEFAULT NULL,
`Last` double DEFAULT NULL,
`Bid` double DEFAULT NULL,
`BidSize` int(11) DEFAULT NULL,
`AskSize` int(11) DEFAULT NULL,
`TradedVolume` bigint(20) DEFAULT NULL,
`SecID` int(11) NOT NULL DEFAULT '0',
`MktCloseDate` date NOT NULL DEFAULT '0000-00-00',
`Volflag` char(1) DEFAULT NULL,
`IssuerName` varchar(255) DEFAULT NULL,
`SectyCD` varchar(3) DEFAULT NULL,
`SecurityDesc` varchar(255) DEFAULT NULL,
`SEDOL` varchar(7) DEFAULT NULL,
`CUSIP` varchar(9) DEFAULT NULL COMMENT 'USCode',
`PrimaryExchgCD` varchar(6) DEFAULT NULL,
`ExchgCD` varchar(6) NOT NULL DEFAULT '',
`TradedValue` double DEFAULT NULL,
`TotalTrades` int(11) DEFAULT NULL,
`Comment` varchar(255) DEFAULT NULL,
`Repush` tinyint(4) NOT NULL DEFAULT '0',
`CloseType` varchar(2) NOT NULL DEFAULT '',
PRIMARY KEY (`MIC`,`LocalCode`,`Currency`,`SecID`,`MktCloseDate`,`ExchgCD`,`Repush`,`CloseType`),
KEY `idx_EDI_Price04_0` (`MIC`),
KEY `idx_EDI_Price04_1` (`LocalCode`),
KEY `idx_EDI_Price04_2` (`ISIN`),
KEY `idx_EDI_Price04_3` (`PriceDate`),
KEY `idx_EDI_Price04_4` (`SEDOL`),
KEY `idx_EDI_Price04_5` (`CUSIP`),
KEY `idx_EDI_Price04_6` (`PrimaryExchgCD`),
KEY `idx_EDI_Price04_7` (`ExchgCD`),
KEY `idx_EDI_Price04_8` (`CloseType`),
KEY `idx_EDI_Price04_9` (`MktCloseDate`),
KEY `idx_EDI_Price04_CUSIP_ExchgCD_CloseType_MktCloseDate` (`CUSIP`,`ExchgCD`,`CloseType`,`MktCloseDate`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
For this query:
SELECT *
FROM (SELECT a.*
ROW_NUMBER() OVER (PARTITION BY A.CUSIP ORDER BY A.MktCloseDate DESC) as rank
FROM EDI_Price04 A
WHERE A.CUSIP IN ('91879Q109', '583840509', '583840608', '59001A102', '552848103') AND
A.PrimaryExchgCD = A.ExchgCD AND
A.CloseType = 'CC'
) t
WHERE t.rank <= 3;
The place to start is with an index. For this query, you want an index on EDI_Price04(CloseType, CUSIP, ExchgCD, MktCloseDate).
Unfortunately, the condition A.PrimaryExchgCD = A.ExchgCD prevents index seeks. If you were to make changes to the query/data, then one approach would be to add a flag when these are the same, rather then looking at the values separately. That would allow an index on:
EDI_Price04(CloseType, IsPrimary, CUSIP, PrimaryExchgCD, ExchgCD, MktCloseDate)
PRIMARY KEY (id),
UNIQUE(`MIC`,`LocalCode`,`Currency`,`SecID`,`MktCloseDate`,
`ExchgCD`,`Repush`,`CloseType`),
-- KEY `idx_EDI_Price04_0` (`MIC`),
KEY `idx_EDI_Price04_1` (`LocalCode`),
KEY `idx_EDI_Price04_2` (`ISIN`),
KEY `idx_EDI_Price04_3` (`PriceDate`),
KEY `idx_EDI_Price04_4` (`SEDOL`),
-- KEY `idx_EDI_Price04_5` (`CUSIP`),
KEY `idx_EDI_Price04_6` (`PrimaryExchgCD`),
KEY `idx_EDI_Price04_7` (`ExchgCD`),
KEY `idx_EDI_Price04_8` (`CloseType`),
KEY `idx_EDI_Price04_9` (`MktCloseDate`),
KEY `idx_EDI_Price04_CUSIP_ExchgCD_CloseType_MktCloseDate` (`CUSIP`,
`ExchgCD`, `CloseType`, `MktCloseDate`)
KEY (CUSIP, MktCloseDate)
Having so many columns in the PK costs in space and insert time. So, I added an id, which needs to be AUTO_INCREMENT.
Keys 0 and 5 are redundant because of the rule "If you have INDEX(a,b), INDEX(a) redundant.
I added (CUSIP, MktCloseDate) in hopes that it will optimize the RANK expression.
Got this confusing issue where I need to get data from several tables and join them into one result.
This query was working fine:
select simcard.*,customer.name as cuname,mi,ma,co, tot from simcard, customer,
(SELECT s.sim_id AS ssim_id, min(datetime) AS mi, max(datetime) AS ma, FLOOR(sum(t.WEIGHT) / 1000) AS tot,
(SELECT count(datetime)
FROM transactions t
WHERE (t.SIM_ID = ssim_id) AND t.ROWTYPE LIKE '$D' GROUP BY t.sim_id) as co
FROM simcard s, transactions t
WHERE (s.sim_id = t.sim_id) AND t.ROWTYPE LIKE '$Z'
AND ( s.customer_id =1 ) GROUP BY s.sim_id) as T
WHERE (sim_id = ssim_id) AND (simcard.customer_id = customer.id) GROUP BY simcard.SIM_ID
But when I add a new customer and new simcard to that customer, it will return empty. I'm guessing this is because there is not transaction with that sim_id in the transaction able.
I've tried to left join but I'm just getting errors.
I've narrowed it down to the first subquery, removing the second one did not result in an non-empty result, but because both subqueries uses the transaction table I'm guessing that both will need some left join.
Structure:
DROP TABLE IF EXISTS `customer`;
CREATE TABLE IF NOT EXISTS `customer` (
`ID` int(10) NOT NULL AUTO_INCREMENT,
`NAME` varchar(40) DEFAULT NULL,
`API` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`ID`),
KEY `CUSTOMER_ID` (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- --------------------------------------------------------
--
-- Tabellstruktur `simcard`
--
DROP TABLE IF EXISTS `simcard`;
CREATE TABLE IF NOT EXISTS `simcard` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`SIM_ID` char(20) DEFAULT NULL,
`CUSTOMER_ID` char(10) DEFAULT NULL,
`SCALES_ID` char(10) DEFAULT NULL,
`NAME` varchar(40) DEFAULT NULL,
`ACTIVE` char(1) DEFAULT 'N',
`EMAIL` varchar(255) DEFAULT NULL,
`TARGET_WEIGHT` varchar(255) DEFAULT NULL,
`LICENSE` tinyint(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`ID`),
KEY `SIM_ID` (`SIM_ID`),
KEY `CUSTOMER_ID` (`CUSTOMER_ID`),
KEY `SCALES_ID` (`SCALES_ID`),
KEY `SIM_ID_2` (`SIM_ID`,`CUSTOMER_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- --------------------------------------------------------
--
-- Tabellstruktur `transactions`
--
DROP TABLE IF EXISTS `transactions`;
CREATE TABLE IF NOT EXISTS `transactions` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`SCALES_ID` char(10) DEFAULT NULL,
`SIM_ID` char(20) DEFAULT NULL,
`ROWTYPE` char(2) DEFAULT NULL,
`DATETIME` datetime DEFAULT NULL,
`TRANSACTIONTIME` int(11) DEFAULT NULL,
`NAME` varchar(255) DEFAULT NULL,
`TRANSACTIONNUMBER` int(11) DEFAULT NULL,
`DATA0` varchar(255) DEFAULT NULL,
`DATA1` varchar(255) DEFAULT NULL,
`DATA2` varchar(255) DEFAULT NULL,
`DATA3` varchar(255) DEFAULT NULL,
`DATA4` varchar(255) DEFAULT NULL,
`DATA5` varchar(255) DEFAULT NULL,
`DATA6` varchar(255) DEFAULT NULL,
`DATA7` varchar(255) DEFAULT NULL,
`DATA8` varchar(255) DEFAULT NULL,
`MEMORY` varchar(255) DEFAULT NULL,
`MATERIAL` varchar(255) DEFAULT NULL,
`DENSITY` int(11) DEFAULT NULL,
`VEHICLETYPE` char(1) DEFAULT NULL,
`TOOLNAME` varchar(255) DEFAULT NULL,
`WEIGHT` int(11) DEFAULT NULL,
`TRANSACTIONID` int(11) DEFAULT NULL,
`group_id` int(11) NOT NULL DEFAULT 0,
PRIMARY KEY (`ID`),
KEY `SIM_ID` (`SIM_ID`),
KEY `TRANSACTIONID` (`TRANSACTIONID`),
KEY `SCALES_ID` (`SCALES_ID`),
KEY `SIM_ID_2` (`SIM_ID`,`TRANSACTIONID`),
KEY `idx_transactions_ROWTYPE` (`ROWTYPE`),
KEY `idx_transactions_DATETIME` (`DATETIME`),
KEY `idx_transactions_DATA0` (`DATA0`),
KEY `idx_transactions_DATA1` (`DATA1`),
KEY `idx_transactions_DATA2` (`DATA2`),
KEY `idx_transactions_DATA3` (`DATA3`),
KEY `idx_transactions_DATA4` (`DATA4`),
KEY `idx_transactions_DATA5` (`DATA5`),
KEY `idx_transactions_DATA6` (`DATA6`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
COMMIT;
Expected result:
ID SIM_ID CUSTOMER_ID SCALES_ID NAME ACTIVE EMAIL TARGET_WEIGHT LICENSE cuname mi ma co tot
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
8 10100421287868 1 61 M-61M Y NULL NULL 0 testcustomer 2018-04-26 00:00:00 2018-08-08 00:00:00 14908 529446
I am trying to execute following query
SELECT
a.sessionID AS `sessionID`,
firstSeen, birthday, gender,
isAnonymous, LanguageCode
FROM transactions AS trx
INNER JOIN actions AS a ON a.sessionID = trx.SessionID
WHERE a.ActionType = 'PURCHASE'
GROUP BY trx.TransactionNumber
Explain provides the following output
1 SIMPLE trx ALL TransactionNumber,SessionID NULL NULL NULL 225036 Using temporary; Using filesort
1 SIMPLE a ref sessionID sessionID 98 infinitiExport.trx.SessionID 1 Using index
The problem is that I am trying to use one field for join and different field for GROUP BY.
How can I tell MySQL to use different indices for same table?
CREATE TABLE `transactions` (
`SessionID` varchar(32) NOT NULL DEFAULT '',
`date` datetime DEFAULT NULL,
`TransactionNumber` varchar(32) NOT NULL DEFAULT '',
`CustomerECommerceTrackID` int(11) DEFAULT NULL,
`SKU` varchar(45) DEFAULT NULL,
`AmountPaid` double DEFAULT NULL,
`Currency` varchar(10) DEFAULT NULL,
`Quantity` int(11) DEFAULT NULL,
`Name` tinytext NOT NULL,
`Category` varchar(45) NOT NULL DEFAULT '',
`customerInfoXML` text,
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
KEY `TransactionNumber` (`TransactionNumber`),
KEY `SessionID` (`SessionID`)
) ENGINE=InnoDB AUTO_INCREMENT=212007 DEFAULT CHARSET=utf8;
CREATE TABLE `actions` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`sessionActionDate` datetime DEFAULT NULL,
`actionURL` varchar(255) DEFAULT NULL,
`sessionID` varchar(32) NOT NULL DEFAULT '',
`ActionType` varchar(64) DEFAULT NULL,
`CustomerID` int(11) DEFAULT NULL,
`IPAddressID` int(11) DEFAULT NULL,
`CustomerDeviceID` int(11) DEFAULT NULL,
`customerInfoXML` text,
PRIMARY KEY (`id`),
KEY `ActionType` (`ActionType`),
KEY `CustomerDeviceID` (`CustomerDeviceID`),
KEY `sessionID` (`sessionID`)
) ENGINE=InnoDB AUTO_INCREMENT=15042833 DEFAULT CHARSET=utf8;
Thanks
EDIT 1: My indexes were broken, I had to add (SessionID, TransactionNumber) index to transactions table, however now, when I try to include trx.customerInfoXML table mysql stops using index
EDIT 2 Another answer does not really solved my problem because it's not standard sql syntax and generally not a good idea to force indices.
For ORM users such syntax is a unattainable luxury.
EDIT 3 I updated my indices and it solved the problem, see EDIT 1
I'm not sure why this query is taking 4 minutes to complete:
SELECT
su.sid,u.uid,u.display_name,u.locale
FROM user u
LEFT JOIN subscription_user su ON su.uid = u.uid
ORDER BY u.display_name DESC
LIMIT 0,25;
Well, I know it's due to the order, remove it and it's very fast. If I change to using INNER JOIN instead it's fast but the issue is not all users may be in the subscription_user table.
CREATE TABLE `user` (
`uid` int(11) NOT NULL AUTO_INCREMENT,
`password` varchar(100) DEFAULT NULL,
`user_type` varchar(10) NOT NULL DEFAULT 'user',
`display_name` varchar(50) NOT NULL,
`email` varchar(100) NOT NULL,
`locale` varchar(8) DEFAULT 'en',
`last_login` datetime DEFAULT NULL,
`auth_type` varchar(10) DEFAULT NULL,
`auth_data` varchar(500) DEFAULT NULL,
`inactive` tinyint(4) NOT NULL DEFAULT '0',
`receive_email` tinyint(4) NOT NULL DEFAULT '1',
`stateid` int(10) DEFAULT NULL,
`owner_group_id` int(11) DEFAULT NULL,
`signature` varchar(500) DEFAULT NULL,
`raw_signature` varchar(500) DEFAULT NULL,
`round_robin` smallint(5) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`uid`),
UNIQUE KEY `email` (`email`),
KEY `stateid` (`stateid`) USING BTREE,
KEY `user_type` (`user_type`) USING BTREE,
KEY `name` (`display_name`)
) ENGINE=InnoDB AUTO_INCREMENT=28343 DEFAULT CHARSET=latin1;
CREATE TABLE `subscription_user` (
`sid` varchar(50) NOT NULL,
`uid` int(11) NOT NULL,
`deleted` tinyint(4) NOT NULL DEFAULT '0',
`forum_user` varchar(50) NOT NULL,
PRIMARY KEY (`sid`,`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
When you have an SQL query, the index can only really help you if the first column in the index is part of the query.
Your query joins su.uid = u.uid and the optimizer will not be able to use that to reference the first column in the subscription primary key index.
You should either reverse the order of the columns in the primary key, or alternatively, you should add a foreign key index, or an independent index on the uid
I have 3 tables in which I'm trying to preform joins on, and inserting the resulting data into another table. The query is taking anywhere between 15-30 mins depending on the dataset. The tables I'm selecting from and joining on are at least 25k records each but will quickly grow to be 500k+.
I tried adding indexes on the fields but still isn't helping that much. Are there any other things I can try or are joins on this scale just going to take this long?
Here is the query I'm trying to perform:
INSERT INTO audience.topitem
(runs_id, total_training_count, item, standard_index_value, significance, seed_count, nonseed_count, prod, model_type, level_1, level_2, level_3, level_4, level_5)
SELECT 5, seed_count + nonseed_count AS total_training_count,
ii.item, standard_index_value, NULL, seed_count, nonseed_count,
standard_index_value * seed_count AS prod, 'site', topic_L1, topic_L2, topic_L3, topic_L4, topic_L5
FROM audience.item_indexes ii
LEFT JOIN audience.usercounts uc ON ii.item = uc.item AND ii.runs_id = uc.runs_id
LEFT JOIN categorization.categorization at on ii.item = at.url
WHERE ii.runs_id = 5
Table: audience.item_indexes
CREATE TABLE `item_indexes` (
`item` varchar(1024) DEFAULT NULL,
`standard_index_value` float DEFAULT NULL,
`runs_id` int(11) DEFAULT NULL,
`model_type` enum('site','term','combo') DEFAULT NULL,
KEY `item_idx` (`item`(333))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Table: audience.usercounts
CREATE TABLE `usercounts` (
`item` varchar(1024) DEFAULT NULL,
`seed_count` int(11) DEFAULT NULL,
`nonseed_count` int(11) DEFAULT NULL,
`significance` float(19,6) DEFAULT NULL,
`runs_id` int(11) DEFAULT NULL,
`model_type` enum('site','term','combo') DEFAULT NULL,
KEY `item_idx` (`item`(333))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Table: audience.topitem
CREATE TABLE `topitem` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`total_training_count` int(11) DEFAULT NULL,
`item` varchar(1024) DEFAULT NULL,
`standard_index_value` float(19,6) DEFAULT NULL,
`significance` float(19,6) DEFAULT NULL,
`seed_count` int(11) DEFAULT NULL,
`nonseed_count` int(11) DEFAULT NULL,
`prod` float(19,6) DEFAULT NULL,
`cat_type` varchar(32) DEFAULT NULL,
`cat_level` int(11) DEFAULT NULL,
`conf` decimal(19,9) DEFAULT NULL,
`level_1` varchar(64) DEFAULT NULL,
`level_2` varchar(64) DEFAULT NULL,
`level_3` varchar(64) DEFAULT NULL,
`level_4` varchar(64) DEFAULT NULL,
`level_5` varchar(64) DEFAULT NULL,
`runs_id` int(11) DEFAULT NULL,
`model_type` enum('site','term','combo') DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=825 DEFAULT CHARSET=utf8;
Table: categorization.categorization
CREATE TABLE `AT_categorization` (
`url` varchar(760) NOT NULL ,
`language` varchar(10) DEFAULT NULL,
`category` text,
`entity` text,
`source` varchar(255) DEFAULT NULL,
`topic_L1` varchar(45) NOT NULL DEFAULT '',
`topic_L2` varchar(45) NOT NULL DEFAULT '',
`topic_L3` varchar(45) NOT NULL DEFAULT '',
`topic_L4` varchar(45) NOT NULL DEFAULT '',
`topic_L5` varchar(45) NOT NULL DEFAULT '',
`last_refreshed` datetime DEFAULT NULL,
PRIMARY KEY (`url`,`topic_L1`,`topic_L2`,`topic_L3`,`topic_L4`,`topic_L5`),
UNIQUE KEY `inx_url` (`url`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
If you add the following indexes, your query will run faster:
CREATE INDEX runs_idx ON audience.item_indexes (runs_id);
ALTER TABLE audience.usercounts
DROP INDEX item_idx,
ADD INDEX item_idx (runs_id, item(333));
Also, item_indexes is utf8, but AT_categorization is latin1, which keeps any indexes from being used. To address this issue, change AT_categorization to utf8:
ALTER TABLE AT_categorization CHARSET=utf8;
Lastly, for the AT_categorization table, the two indexes
PRIMARY KEY (`url`,`topic_L1`,`topic_L2`,`topic_L3`,`topic_L4`,`topic_L5`),
UNIQUE KEY `inx_url` (`url`)
are redundant. So you could DROP these, and simply have the url field be the primary key:
ALTER TABLE AT_categorization
DROP PRIMARY KEY,
DROP KEY `inx_url`,
ADD PRIMARY KEY (url);