query slows down if add field to where condition - mysql

I have a table Mysql fiddle with about 500k records.
CREATE TABLE IF NOT EXISTS `p_transactions` (
`transaction_id` bigint(10) unsigned NOT NULL,
`amount` decimal(19,2) NOT NULL,
`dt` bigint(1) NOT NULL,
`transaction_status` int(1) NOT NULL,
`transaction_type` varchar(15) NOT NULL,
`payment_method` varchar(25) NOT NULL,
`notes` text NOT NULL,
`member_id` int(10) unsigned NOT NULL,
`new_amount` decimal(19,2) NOT NULL,
`paid_amount` decimal(19,2) NOT NULL,
`secret_code` char(40) NOT NULL,
`internal_status` varchar(40) NOT NULL,
`ip_addr` varchar(15) NOT NULL,
`description` text NOT NULL,
`seller_transaction_id` varchar(50) DEFAULT NULL,
`return_url` varchar(255) DEFAULT NULL,
`fail_url` varchar(255) DEFAULT NULL,
`success_url` varchar(255) DEFAULT NULL,
`result_url` varchar(255) DEFAULT NULL,
`user_fee` decimal(19,3) DEFAULT '0.000',
`currency` char(255) DEFAULT 'USD',
`gateway_transaction_id` char(255) DEFAULT NULL,
`load_amount` decimal(19,2) NOT NULL,
`transaction_mode` varchar(1) NOT NULL DEFAULT '',
`p_fee` decimal(19,2) NOT NULL,
`country` varchar(2) NOT NULL,
`email` varchar(255) NOT NULL,
`vat` decimal(19,2) NOT NULL DEFAULT '0.00',
`name` varchar(255) NOT NULL,
`bdate` varchar(255) NOT NULL,
`child_method` varchar(255) NOT NULL,
`processing_fee` decimal(19,2) NOT NULL DEFAULT '0.00',
`flat_fee` varchar(1) NOT NULL DEFAULT 'n',
`user_fee_sum` decimal(19,2) NOT NULL DEFAULT '0.00',
`p_fee_sum` decimal(19,2) NOT NULL DEFAULT '0.00',
`dt_open` bigint(1) NOT NULL DEFAULT '0',
`user_fee_type` varchar(1) NOT NULL DEFAULT 'r',
`custom_gateway_fee` decimal(19,2) NOT NULL DEFAULT '0.00',
`paid_currency` varchar(3) NOT NULL DEFAULT 'USD',
`paid_microtime` bigint(10) unsigned NOT NULL,
`check_ballance` varchar(1) NOT NULL DEFAULT 'n',
PRIMARY KEY (`transaction_id`),
KEY `member_id` (`member_id`),
KEY `payment_method` (`payment_method`),
KEY `child_method` (`child_method`),
KEY `check_ballance` (`check_ballance`),
KEY `dt` (`dt`),
KEY `transaction_type` (`transaction_type`),
KEY `paid_microtime` (`paid_microtime`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
When I execute query
SELECT *
FROM `p_transactions`
WHERE dt >= 1517443200
AND dt <= 1523404799
AND member_id = 2051
ORDER BY `paid_microtime` DESC
LIMIT 50;
it runs 0,000 sec. (+ 0,016 sec. network)
but if I add to query this condition AND transaction_status = 7
SELECT *
FROM `p_transactions`
WHERE dt >= 1517443200
AND dt <= 1523404799
AND member_id = 2051
AND transaction_status = 7
ORDER BY `paid_microtime` DESC
LIMIT 50
query run 12,938 sec. (+ 0,062 sec. network)
Please help me to find out the reason of such behavior
PS. There was index on transaction_status and it increased execution time even more.

Add a suitable index, such as:
ON payzoff_transactions (member_id, dt)
or
ON payzoff_transactions (member_id, dt, transaction_status)
We want member_id column as the leading column in the index, because of the equality comparison, and we expect the result to be a substantially smaller subset of the entire table. We want dt column after that, because of the "range scan" on that.
Including additional columns in the index may allow MySQL to check that condition using values from the index, without a visit/lookup of the row in the underlying table pages.
Either of these indexes would be suitable for both of the queries shown in the question.
Use EXPLAIN to see the execution plan... which index is being used.
There's really no getting around the "Using filesort" operation, since we're pulling a small subset of the entire table.
(If we were pulling the entire table (or a huge subset), we might be able to avoid an expensive sort operation with an access plan that pulls rows in reverse index order, with that has an index with leading column of paid_microtime.)

For the original query have these
INDEX(member_id, dt)
INDEX(member_id, paid_microtime)
For the secondary query, have
INDEX(transaction_status, member_id, dt)
INDEX(transaction_status, member_id, paid_microtime)
Without getting into the details of the distribution of the data values, we cannot explain why one query so much slower; however, my 4 indexes should make both queries run faster most of the time.
More discussion of how I came up with those indexes (and why (member_id, dt, transaction_status) is not so good): http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Related

Mysql Query taking long time where using variable

I had an mysql event and runs eery day at 9:45 AM.
Begin
SET #v_ym :=(SELECT extract(year_month from DATE_SUB(SYSDATE(),INTERVAL 1 DAY)));
SELECT CAST(#ym AS CHAR);
select ssaname,extract(year_month from date_sub(sysdate(),interval 1 day)) ym,
omcr.btscount_ssa(ssaname) btscount,sum(case when duration>30 then duration else 0 end) dur_30 from
btsoutage.bts_faults
where ym=#v_ym and ssaname is not null
group by ssaname;
END;
in the query [ym is yearmonth and ym is indexed] when i substitute with variable #v_ym it is taking full table scan and the table is locked for further inserts. where as when i given the value directly it is using index and the output is fast.
The table contains more than 10 million records.
Create table is
CREATE TABLE `bts_faults` (
`bts_name` varchar(250) DEFAULT NULL,
`make` varchar(10) DEFAULT NULL,
`occuredtime` datetime DEFAULT NULL,
`clearedtime` datetime DEFAULT NULL,
`duration` int(10) DEFAULT NULL,
`reason` varchar(100) DEFAULT NULL,
`site_type` varchar(10) DEFAULT NULL,
`tech` varchar(5) DEFAULT NULL,
`fault_id` bigint(20) NOT NULL AUTO_INCREMENT,
`ssaname` varchar(20) DEFAULT NULL,
`fault_type` int(1) DEFAULT '0',
`remarks` varchar(250) DEFAULT NULL,
`bts_section` varchar(100) DEFAULT NULL,
`vendor` varchar(50) DEFAULT NULL,
`occureddate` date DEFAULT NULL,
`cleareddate` date DEFAULT NULL,
`ym` varchar(6) DEFAULT NULL,
`updatedate` datetime DEFAULT NULL,
`USERNAME` varchar(100) DEFAULT NULL,
`mask` int(1) DEFAULT '0',
`mask_cat` varchar(10) DEFAULT NULL,
`outage_cat` varchar(20) DEFAULT NULL,
`site_category` varchar(50) DEFAULT NULL,
`escalated_time` datetime DEFAULT NULL,
`zone` varchar(20) DEFAULT NULL,
`zone_fault_reason` varchar(500) DEFAULT NULL,
`zone_fault_remarks` varchar(500) DEFAULT NULL,
`zone_username` varchar(20) DEFAULT NULL,
`zone_updatetime` datetime DEFAULT NULL,
`zone_fault_duration` int(11) DEFAULT NULL,
`fault_category` varchar(250) DEFAULT NULL,
`remarks_1` varchar(2500) DEFAULT NULL,
PRIMARY KEY (`fault_id`),
UNIQUE KEY `UIDX_BTS_FAULTS` (`bts_name`,`occuredtime`),
KEY `indx_btsfaults_ym` (`ym`),
KEY `indx_btsfaults_cleareddate` (`cleareddate`),
KEY `Index_btsfaults_btsname` (`bts_name`),
KEY `index_btsfaults_ssaname` (`ssaname`),
KEY `indx_btsfaults_occureddate` (`occureddate`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3807469710 DEFAULT CHARSET=latin1
The Explain Plan for the 2 type are
What percentage of the table is in the "current month"? If that is more than something like 20%, then there is no fix -- a table scan is likely to be faster. If it is less than 20%, then, as you suspect, #variables may be the villain. In that case, change the test to be
WHERE ym = CAST(
extract(year_month from DATE_SUB(SYSDATE(),INTERVAL 1 DAY))
AS CHAR)
AND ...
Much faster would be to build and maintain a Summary Table with a PRIMARY KEY of day and ssaname. This would have the subtotals for each day. It would be maintained either as the data is INSERTed or each night after midnight.
Then the 9:45 query becomes very fast. Maybe so fast that you don't even need to do it just once a day, but instead "on-demand".
More discussion: http://mysql.rjweb.org/doc.php/summarytables
I suggest you use NOW() instead of SYSDATE() -- The former is constant throughout a statement; the latter is not.
bts_faults looks like it might be a terabyte in size. If so, you probably don't want to here ways to make is smaller.
If the Auto_inc value is at 3.8B, yet there are only 10M rows, does this mean that you are purging 'old' data? Do you want to discuss speeding up the Deletes? (Start a new Question if you do.)

Optimizing MySQL query to reduced runtime

Below is the query that will going to run on two tables with 60+ million and 400+ million records. Only the table name will be different, otherwise query is same for both the tables.
SELECT * FROM
(
SELECT A.CUSIP, A.ISIN, A.SEDOL, A.LocalCode, A.MIC, A.ExchgCD, A.PrimaryExchgCD, A.Currency, A.Open, A.High, A.Low, A.Close, A.Mid, A.Ask, A.Last,
A.Bid, A.Bidsize, A.Asksize, A.TradedVolume, A.SecID, A.PriceDate, A.MktCloseDate, A.VolFlag, A.IssuerName, A.TotalTrades, A.CloseType, A.SectyCD,
row_number() OVER (partition by A.CUSIP order by A.MktCloseDate desc) as 'rank'
from EDI_Price04 A
WHERE A.CUSIP IN (
"91879Q109", "583840509", "583840608", "59001A102", "552848103") AND (A.PrimaryExchgCD = A.ExchgCD) AND A.CloseType='CC'
) t WHERE t.rank <= 3;
When A.CUSIP IN () condition have 10-15 values, the query complete in 2-3sec. With 400 values it took 28sec. But I want to make A.CUSIP IN () take 2k-3k value at a time.
This is my table structure.
CREATE TABLE `EDI_Price04` (
`MIC` varchar(6) NOT NULL DEFAULT '',
`LocalCode` varchar(60) NOT NULL DEFAULT '' COMMENT 'PricefileSymbol',
`ISIN` varchar(12) DEFAULT NULL,
`Currency` varchar(3) NOT NULL DEFAULT '',
`PriceDate` date DEFAULT NULL,
`Open` double DEFAULT NULL,
`High` double DEFAULT NULL,
`Low` double DEFAULT NULL,
`Close` double DEFAULT NULL,
`Mid` double DEFAULT NULL,
`Ask` double DEFAULT NULL,
`Last` double DEFAULT NULL,
`Bid` double DEFAULT NULL,
`BidSize` int(11) DEFAULT NULL,
`AskSize` int(11) DEFAULT NULL,
`TradedVolume` bigint(20) DEFAULT NULL,
`SecID` int(11) NOT NULL DEFAULT '0',
`MktCloseDate` date NOT NULL DEFAULT '0000-00-00',
`Volflag` char(1) DEFAULT NULL,
`IssuerName` varchar(255) DEFAULT NULL,
`SectyCD` varchar(3) DEFAULT NULL,
`SecurityDesc` varchar(255) DEFAULT NULL,
`SEDOL` varchar(7) DEFAULT NULL,
`CUSIP` varchar(9) DEFAULT NULL COMMENT 'USCode',
`PrimaryExchgCD` varchar(6) DEFAULT NULL,
`ExchgCD` varchar(6) NOT NULL DEFAULT '',
`TradedValue` double DEFAULT NULL,
`TotalTrades` int(11) DEFAULT NULL,
`Comment` varchar(255) DEFAULT NULL,
`Repush` tinyint(4) NOT NULL DEFAULT '0',
`CloseType` varchar(2) NOT NULL DEFAULT '',
PRIMARY KEY (`MIC`,`LocalCode`,`Currency`,`SecID`,`MktCloseDate`,`ExchgCD`,`Repush`,`CloseType`),
KEY `idx_EDI_Price04_0` (`MIC`),
KEY `idx_EDI_Price04_1` (`LocalCode`),
KEY `idx_EDI_Price04_2` (`ISIN`),
KEY `idx_EDI_Price04_3` (`PriceDate`),
KEY `idx_EDI_Price04_4` (`SEDOL`),
KEY `idx_EDI_Price04_5` (`CUSIP`),
KEY `idx_EDI_Price04_6` (`PrimaryExchgCD`),
KEY `idx_EDI_Price04_7` (`ExchgCD`),
KEY `idx_EDI_Price04_8` (`CloseType`),
KEY `idx_EDI_Price04_9` (`MktCloseDate`),
KEY `idx_EDI_Price04_CUSIP_ExchgCD_CloseType_MktCloseDate` (`CUSIP`,`ExchgCD`,`CloseType`,`MktCloseDate`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
For this query:
SELECT *
FROM (SELECT a.*
ROW_NUMBER() OVER (PARTITION BY A.CUSIP ORDER BY A.MktCloseDate DESC) as rank
FROM EDI_Price04 A
WHERE A.CUSIP IN ('91879Q109', '583840509', '583840608', '59001A102', '552848103') AND
A.PrimaryExchgCD = A.ExchgCD AND
A.CloseType = 'CC'
) t
WHERE t.rank <= 3;
The place to start is with an index. For this query, you want an index on EDI_Price04(CloseType, CUSIP, ExchgCD, MktCloseDate).
Unfortunately, the condition A.PrimaryExchgCD = A.ExchgCD prevents index seeks. If you were to make changes to the query/data, then one approach would be to add a flag when these are the same, rather then looking at the values separately. That would allow an index on:
EDI_Price04(CloseType, IsPrimary, CUSIP, PrimaryExchgCD, ExchgCD, MktCloseDate)
PRIMARY KEY (id),
UNIQUE(`MIC`,`LocalCode`,`Currency`,`SecID`,`MktCloseDate`,
`ExchgCD`,`Repush`,`CloseType`),
-- KEY `idx_EDI_Price04_0` (`MIC`),
KEY `idx_EDI_Price04_1` (`LocalCode`),
KEY `idx_EDI_Price04_2` (`ISIN`),
KEY `idx_EDI_Price04_3` (`PriceDate`),
KEY `idx_EDI_Price04_4` (`SEDOL`),
-- KEY `idx_EDI_Price04_5` (`CUSIP`),
KEY `idx_EDI_Price04_6` (`PrimaryExchgCD`),
KEY `idx_EDI_Price04_7` (`ExchgCD`),
KEY `idx_EDI_Price04_8` (`CloseType`),
KEY `idx_EDI_Price04_9` (`MktCloseDate`),
KEY `idx_EDI_Price04_CUSIP_ExchgCD_CloseType_MktCloseDate` (`CUSIP`,
`ExchgCD`, `CloseType`, `MktCloseDate`)
KEY (CUSIP, MktCloseDate)
Having so many columns in the PK costs in space and insert time. So, I added an id, which needs to be AUTO_INCREMENT.
Keys 0 and 5 are redundant because of the rule "If you have INDEX(a,b), INDEX(a) redundant.
I added (CUSIP, MktCloseDate) in hopes that it will optimize the RANK expression.

Mysql slow query on large table

With Mysql 5.6.10, I have a table like this:
CREATE TABLE `es_user_action` (
`id` bigint(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(11) unsigned NOT NULL,
`company_id` bigint(11) unsigned NOT NULL,
`work_id` bigint(11) unsigned NOT NULL,
`action` tinyint(2) NOT NULL COMMENT '10, 20, 30, 40',
`action_id` bigint(11) unsigned DEFAULT NULL,
`apply_id` bigint(11) unsigned DEFAULT '0',
`apply_display_id` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`action_time` datetime NOT NULL,
`scout_time` datetime NOT NULL,
`register_datetime_sorting` datetime DEFAULT NULL,
`score` tinyint(2) DEFAULT '0',
`is_pending` tinyint(2) DEFAULT '0',
`apply_status` tinyint(2) DEFAULT '2' COMMENT '1: paid, 2: free',
`has_response` tinyint(2) DEFAULT '0',
`response_time` datetime DEFAULT NULL,
`is_shown` tinyint(2) DEFAULT '1',
`source` varchar(15) COLLATE utf8mb4_unicode_ci NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_company_work` (`user_id`,`work_id`,`company_id`,`apply_id`,`source`),
KEY `IDX_2` (`company_id`,`is_shown`,`apply_status`,`apply_id`,`work_id`,`action_time`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=436896779 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
The full query I want to perform on the table would be the following:
SELECT
`id` AS `seqNo`,
`company_id` AS `companyId`,
`work_id` AS `workId`,
`action` AS `userActionType`,
`action_time` AS `userActionDatetime`,
`is_pending`,
`apply_status`,
`action_id`,
`source`,
`user_id` AS `userId`,
`apply_display_id`
FROM
`es_user_action`
WHERE
`company_id` = 449664
AND `is_shown` = 1
AND `apply_status` = 1
AND `apply_id` = 0
AND `action_time` >= '2021-01-05 15:56:14'
ORDER BY
is_pending ASC,
action ASC,
score DESC,
CASE
source
WHEN "entenshoku" THEN
action_time
END DESC,
CASE
WHEN source <> "entenshoku" THEN
action_time
END ASC
LIMIT 100 OFFSET 0;
The table has around 15 million rows and the following query takes around 15 seconds. The query becomes very slow.
Can anyone help out? Thanks in advance.
UPDATED:
This is the explain query result:
KEY `IDX_2` (`company_id`,`is_shown`,`apply_status`,`apply_id`,`work_id`,`action_time`)
Work_id is in the way of that index being effective for that SELECT. Remove it. That should speed up the query some. Still the are too many columns and expressions involved in the WHER and ORDER BY to do much more to help.
Changing BIGINTs to smaller INT types will help some (by shrinking the table).
I'm pretty sure that what you have cannot work.
You have is a dynamically changing ORDER BY, depending on the value of source. But which row's source?
If you could decide before issuing the query whether to scan action_time ascending or descending, which value of source would you pick?
Delete this Question, rewrite the query, then open another Question if you still have an issue.

Mysql JOIN query apparently slow

I have 2 tables. The first, called stazioni, where I store live weather data from some weather station, and the second called archivio2, where are stored archived day data. The two tables have in common the ID station data (ID on stazioni, IDStazione on archvio2).
stazioni (1,743 rows)
CREATE TABLE `stazioni` (
`ID` int(10) NOT NULL,
`user` varchar(100) NOT NULL,
`nome` varchar(100) NOT NULL,
`email` varchar(50) NOT NULL,
`localita` varchar(100) NOT NULL,
`provincia` varchar(50) NOT NULL,
`regione` varchar(50) NOT NULL,
`altitudine` int(10) NOT NULL,
`stazione` varchar(100) NOT NULL,
`schermo` varchar(50) NOT NULL,
`installazione` varchar(50) NOT NULL,
`ubicazione` varchar(50) NOT NULL,
`immagine` varchar(100) NOT NULL,
`lat` double NOT NULL,
`longi` double NOT NULL,
`file` varchar(255) NOT NULL,
`url` varchar(255) NOT NULL,
`temperatura` decimal(10,1) DEFAULT NULL,
`umidita` decimal(10,1) DEFAULT NULL,
`pressione` decimal(10,1) DEFAULT NULL,
`vento` decimal(10,1) DEFAULT NULL,
`vento_direzione` decimal(10,1) DEFAULT NULL,
`raffica` decimal(10,1) DEFAULT NULL,
`pioggia` decimal(10,1) DEFAULT NULL,
`rate` decimal(10,1) DEFAULT NULL,
`minima` decimal(10,1) DEFAULT NULL,
`massima` decimal(10,1) DEFAULT NULL,
`orario` varchar(16) DEFAULT NULL,
`online` int(1) NOT NULL DEFAULT '0',
`tipo` int(1) NOT NULL DEFAULT '0',
`webcam` varchar(255) DEFAULT NULL,
`webcam2` varchar(255) DEFAULT NULL,
`condizioni` varchar(255) DEFAULT NULL,
`Data2` datetime DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
archivio2 (2,127,347 rows)
CREATE TABLE `archivio2` (
`ID` int(10) NOT NULL,
`IDStazione` int(4) NOT NULL DEFAULT '0',
`localita` varchar(100) NOT NULL,
`temp_media` decimal(10,1) DEFAULT NULL,
`temp_minima` decimal(10,1) DEFAULT NULL,
`temp_massima` decimal(10,1) DEFAULT NULL,
`pioggia` decimal(10,1) DEFAULT NULL,
`pressione` decimal(10,1) DEFAULT NULL,
`vento` decimal(10,1) DEFAULT NULL,
`raffica` decimal(10,1) DEFAULT NULL,
`records` int(10) DEFAULT NULL,
`Data2` datetime DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
The indexes that I set
-- Indexes for table `archivio2`
--
ALTER TABLE `archivio2`
ADD PRIMARY KEY (`ID`),
ADD KEY `IDStazione` (`IDStazione`),
ADD KEY `Data2` (`Data2`);
-- Indexes for table `stazioni`
--
ALTER TABLE `stazioni`
ADD PRIMARY KEY (`ID`),
ADD KEY `Tipo` (`Tipo`);
ALTER TABLE `stazioni` ADD FULLTEXT KEY `localita` (`localita`);
On a map, I call by a calendar the date to search data on archive2 table, by this INNER JOIN query (I put an example date):
SELECT *, c.pioggia AS rain, c.raffica AS raff, c.vento AS wind, c.pressione AS press
FROM stazioni as o
INNER JOIN archivio2 as c ON o.ID = c.IDStazione
WHERE c.Data2 LIKE '2019-01-01%'
All works fine, but the time needed to show result are really slow (4/5 seconds), even if the query execution time seems to be ok (about 0.5s/1.0s).
I tried to execute the query on PHPMyadmin, and the results are the same. Execution time quickly, but time to show result extremely slow.
EXPLAIN query result
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE o ALL PRIMARY,ID NULL NULL NULL 1743 NULL
1 SIMPLE c ref IDStazione,Data2 IDStazione 4 sccavzuq_rete.o.ID 1141 Using where
UPDATE: the query goes fine if I remove the index from 'IDStazione'. But in this way I lost all advantages and speed on other queries... why only that query become slow if I put index on that field?
In your WHERE clause
WHERE c.Data2 LIKE '2019-01-01%'
the value of Data2 must be casted to a string. No index can be used for that condition.
Change it to
WHERE c.Data2 >= '2019-01-01' AND c.Data2 < '2019-01-01' + INTERVAL 1 DAY
This way the engine should be able to use the index on (Data2).
Now check the EXPLAIN result. I would expect, that the table order is swapped and the key column will show Data2 (for c) and ID (for o).
(Fixing the DATE is the main performance solution; here is a less critical issue.)
The tables are much bigger than necessary. Size impacts disk space and, to some extent, speed.
You have 1743 stations, yet the datatype is a 32-bit (4-byte) number (INT). SMALLINT UNSIGNED would allow for 64K stations and use only 2 bytes.
Does it get really, really, hot there? Like 999999999.9 degrees? DECIMAL(10.1) takes 5 bytes; DECIMAL(4,1) takes only 3 and allows up to 999.9 degrees. DECIMAL(3,1) has a max of 99.9 and takes only 2 bytes.
What is "localita varchar(100)" doing in the big table? Seems like you could JOIN to the stations table when you need it? Removing that might cut the table size in half.

Huge speed difference in two similar queries (MySQL ORDER clause)

Important updade (explanation):
I realized that my query having single DESC order is 10 times slower that the same query with ASC order. The ordered field has an index. Is it normal behavior?
Original question with queries:
I have a mysql table with a few hundred of product items. It's suprising (for me) how 2 similar sql queries differs in terms of performance. I don't know why. Can you please give me a hint or explain why the difference is so huge?
This query takes 3ms:
SELECT
*
FROM
`product_items`
WHERE
(product_items.shop_active = 1)
AND (product_items.active = 1)
AND (product_items.active_category_id is not null)
AND (has_picture is not null)
AND (price_orig is not null)
AND (category_min_discount IS NOT NULL)
AND (product_items.slug is not null)
AND `product_items`.`active_category_id` IN (6797, 5926, 5806, 6852)
ORDER BY
price asc
LIMIT 1
But the following query takes already 169ms... Only difference is that the order clause contains 2 columns. "Price" value has each product, while "price top" has roughly only 1% of products.
SELECT
*
FROM
`product_items`
WHERE
(product_items.shop_active = 1)
AND (product_items.active = 1)
AND (product_items.active_category_id is not null)
AND (has_picture is not null)
AND (price_orig is not null)
AND (category_min_discount IS NOT NULL)
AND (product_items.slug is not null)
AND `product_items`.`active_category_id` IN (6797, 5926, 5806, 6852)
ORDER BY
price asc,
price_top desc
LIMIT 1
The table structure looks like this:
CREATE TABLE `product_items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`shop_id` int(11) DEFAULT NULL,
`item_id` varchar(255) DEFAULT NULL,
`productname` varchar(255) DEFAULT NULL,
`description` text,
`url` text,
`url_hash` varchar(255) DEFAULT NULL,
`img_url` text,
`price` decimal(10,2) DEFAULT NULL,
`price_orig` decimal(10,2) DEFAULT NULL,
`discount` decimal(10,2) DEFAULT NULL,
`discount_percent` decimal(10,2) DEFAULT NULL,
`manufacturer` varchar(255) DEFAULT NULL,
`delivery_date` varchar(255) DEFAULT NULL,
`categorytext` text,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`active_category_id` int(11) DEFAULT NULL,
`shop_active` int(11) DEFAULT NULL,
`active` int(11) DEFAULT '0',
`price_top` decimal(10,2) NOT NULL DEFAULT '0.00',
`attention_priority` int(11) DEFAULT NULL,
`attention_priority_over` int(11) DEFAULT NULL,
`has_picture` varchar(255) DEFAULT NULL,
`size` varchar(255) DEFAULT NULL,
`category_min_discount` int(11) DEFAULT NULL,
`slug` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_product_items_on_url_hash` (`url_hash`),
KEY `index_product_items_on_shop_id` (`shop_id`),
KEY `index_product_items_on_active_category_id` (`active_category_id`),
KEY `index_product_items_on_productname` (`productname`),
KEY `index_product_items_on_price` (`price`),
KEY `index_product_items_on_discount_percent` (`discount_percent`),
KEY `index_product_items_on_price_top` (`price_top`)
) ENGINE=InnoDB AUTO_INCREMENT=1715708 DEFAULT CHARSET=utf8;
UPDATE
I realized that the difference is mainly in the type of ordering: if I use asc+asc for both columns the query takes around 6ms, if I use asc+desc or desc+asc, the query takes around 160ms..
Thank you.
If creating an index to help ORDER BY doesn't help, try creating an index that helps both WHERE and ORDER BY:
CREATE INDEX product_items_i1 ON product_items (
shop_active,
active,
active_category_id,
has_picture,
price_orig,
category_min_discount,
slug,
price,
price_top DESC
)
Obviously, this is a bit clunky, and you'll have to balance the performance gain for the query with the price of maintaining the index.