MySQL Array Variable (No stored procedure, No temporary table) - mysql

As mentioned in the title, I would like to know of any solution for this that does not use a stored procedure, a temporary table, etc.
Comparing Query #1 and Query #3, Query #3 performs worse. Is there any workaround that puts the list in a variable without hurting performance?
Schema (MySQL v5.7)
create table `order`(
id BIGINT(20) not null auto_increment,
transaction_no varchar(20) not null,
primary key (`id`),
unique key uk_transaction_no (`transaction_no`)
);
insert into `order` (`id`,`transaction_no`)
value (1,'10001'),(2,'10002'),(3,'10003'),(4,'10004');
Query #1
explain select * from `order` where transaction_no in ('10001','10004');
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE order NULL range uk_transaction_no uk_transaction_no 22 NULL 2 100 Using where; Using index
Query #2
set @transactionNos = "10001,10004";
Query #3
explain select * from `order` where find_in_set(transaction_no, @transactionNos);
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE order NULL index NULL uk_transaction_no 22 NULL 4 100 Using where; Using index

Short Answer: See Sargability
Long Answer:
MySQL makes no attempt to use an index on a column when that column is hidden inside a function call such as FIND_IN_SET(), DATE(), etc. Your Query 3 will always be performed as a full table scan or a full index scan.
So the only way to optimize what you are doing is to construct
IN ('10001','10004')
That is often difficult to achieve when "binding" a list of values. A single placeholder binds exactly one value, so IN (?) will not accept a whole list in most APIs to MySQL.
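One common workaround on the client side is to generate one placeholder per value, so the server still sees a plain IN (...) list. The following is a minimal sketch, assuming a Python driver that uses %s placeholders (PyMySQL and mysql-connector-python do); the helper name build_in_query is hypothetical and not part of any library.

```python
def build_in_query(values):
    """Return (sql, params) with one %s placeholder per value.

    Hypothetical helper for illustration; only the placeholders are
    interpolated into the SQL text, never the values themselves.
    """
    if not values:
        # "IN ()" is a syntax error in MySQL, so refuse an empty list.
        raise ValueError("need at least one value for IN (...)")
    placeholders = ", ".join(["%s"] * len(values))
    sql = f"SELECT * FROM `order` WHERE transaction_no IN ({placeholders})"
    return sql, tuple(values)


sql, params = build_in_query(["10001", "10004"])
# With a DB-API cursor from the assumed driver:
# cursor.execute(sql, params)
```

Because the values travel as bound parameters, the optimizer gets the same sargable IN list as in Query #1, and the list length can vary per call.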

Related

How to reduce rows lookup when using LIMIT MySQL

I have the following table with Index on id and Foreign Key on activityID:
comment (id, activityID, text)
and the following query:
SELECT <cols> FROM `comment` WHERE `comment`.`activityID` = 1257 ORDER BY `id` DESC LIMIT 20;
I basically want to get only the first 20 comments for this activity, which has 1165 comments. However, this is the result of a describe:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE comment ref activityID activityID 4 const 1165 NULL
Essentially, it is looking through all comments for this activity before deciding to limit it.
We tested this query under high load: when an activity has 200,000 comments the query takes 5+ seconds, whereas under the same load an activity with 30 comments takes a couple of ms.
PS: If I remove the WHERE clause, EXPLAIN says it will only look up a single row (I don't know if that's really the case):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE comment index NULL PRIMARY 4 NULL 1 NULL
Is it possible to optimize this kind of query in any way?
Thank you.
The ordering is causing the slowness.
The query uses the activityID index to find all the rows with that ID. Then it has to read all 200,000 comments and sort them by id to find the last 20.
Add a composite index so it can use an index for the ordering:
ALTER TABLE comment ADD INDEX (activityID, id);
Note that you will no longer need the index on activityID by itself, since it's a prefix of this new index.
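To see why the composite index helps, it can be modelled as a sorted list of (activityID, id) pairs. This is only a toy sketch in Python, not how InnoDB is implemented, and the function name latest_comment_ids is hypothetical. It shows that the newest entries for one activity sit next to each other, so reading 20 of them needs one seek and a short backward walk, regardless of how many comments the activity has.

```python
import bisect


def latest_comment_ids(index, activity_id, limit=20):
    """index: sorted list of (activityID, id) pairs, a stand-in for the
    composite index. Returns up to `limit` ids in descending order."""
    # Seek to just past the last entry for this activity.
    end = bisect.bisect_right(index, (activity_id, float("inf")))
    ids = []
    pos = end - 1
    # Walk backwards; stop at the limit or when the activity changes.
    while pos >= 0 and len(ids) < limit and index[pos][0] == activity_id:
        ids.append(index[pos][1])
        pos -= 1
    return ids


index = sorted([(1256, 4), (1257, 3), (1257, 7), (1257, 9), (1258, 1)])
print(latest_comment_ids(index, 1257, limit=2))  # [9, 7]
```

With an index on activityID alone, the model would hold no usable ordering on id within an activity, which corresponds to the read-everything-then-sort behaviour described above.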
Use offset
SELECT <cols> FROM `comment` WHERE `comment`.`activityID` = 1257 ORDER BY `id` DESC LIMIT 0,20;
In limit clause add 0 as an offset to get only the first 20 comments
Just add two separate indexes on activityID and id. That should help you with ORDER BY too. There are no hard-and-fast rules in optimization; you need to try various methods.
Do it this way:
ALTER TABLE comment ADD INDEX (id);
ALTER TABLE comment ADD INDEX (activityID);
I think this will help.

Query is too slow, and not using index

Here is the "explain" of my query:
explain
select eil.sell_fmt, count(sell_fmt) as itemCount
from table_items eil
where eil.cl_Id=123 and eil.si_Id='0'
and start_date <= now() and end_date is not null and end_date < NOW()
group by eil.sell_fmt
Without date (start_date, end_date) filters:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE eil ref table_items_clid_siid_sellFmt 39 const,const 7393 Using where; Using index
With date filters:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE eil ref table_items_clid_siid_sellFmt 39 const,const 8400 Using where
possible_keys are:
table_items_clid_siid, table_items_clid_siid_itemId, table_items_clid_siid_startDate_endDate, table_items_clid_siid_sellFmt
The query without date filters is very fast (0.4 sec), but with date filters it takes about 30 seconds. There are only 14K records in total.
Table field types:
`cl_Id` int(11) NOT NULL,
`si_Id` varchar(11) NOT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`sell_fmt` varchar(20) DEFAULT NULL
I concatenated field-names to give index names, so you can estimate combined fields available in the index.
Can somebody guide me here? What's going on? What is the best course of action I should take, or where am I going wrong?
I need one more suggestion: in another query on the same table, a user can filter on up to 10 fields, in no definite order (a random number of fields in random order). This type of search would then be too slow again. What's the best strategy? One covering index with all possible searchable fields? If yes, does the order of fields in the index matter (i.e., if that order differs from the order of fields in the query, will the index still be used)?
First, without seeing your create table statement, I can offer the following: create a composite (multiple-field) index that best matches the common querying elements in your WHERE clause, starting with the smaller nominal count basis. You are explicitly looking for a "cl_ID" and "si_ID" plus start and end dates, so those belong in the index. Since you have a group by, I would add that column to the index as well, making it a completely COVERING index so the engine does not need to go back to the raw data to complete the query. It can resolve the query from the fields in the index directly.
I would have an index on
( cl_id, si_id, start_date, end_date, sell_fmt )
Finally, change your count from count(sell_fmt) to just count(*), which indicates "I don't care about a specific field; as long as a record is found, count it".

What's the difference between "Using index" and "Using where; Using index" in the EXPLAIN

In the extra field of the explain in mysql you can get:
Using index
Using where; Using index
What's the difference between the two?
To explain my question better I'm going to use the following table:
CREATE TABLE `test` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`another_field` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8;
INSERT INTO test() VALUES(),(),(),(),();
Which ends up with the content like:
SELECT * FROM `test`;
id another_field
1 0
2 0
3 0
4 0
5 0
On my research I found
Why is this query using where instead of index?
The output of EXPLAIN can sometimes be misleading.
For instance, filesort has nothing to do with files, using where
does not mean you are using a WHERE clause, and using index can
show up on the tables without a single index defined.
Using where just means there is some restricting clause on the table
(WHERE or ON), and not all record will be returned. Note that
LIMIT does not count as a restricting clause (though it can be).
Using index means that all information is returned from the index,
without seeking the records in the table. This is only possible if all
fields required by the query are covered by the index.
Since you are selecting *, this is impossible. Fields other than
category_id, board_id, display and order are not covered by
the index and should be looked up.
and I also found
https://dev.mysql.com/doc/refman/5.1/en/explain-output.html#explain-extra-information
Using index
The column information is retrieved from the table using only
information in the index tree without having to do an additional seek
to read the actual row. This strategy can be used when the query uses
only columns that are part of a single index.
If the Extra column also says Using where, it means the index is being
used to perform lookups of key values. Without Using where, the
optimizer may be reading the index to avoid reading data rows but not
using it for lookups. For example, if the index is a covering index
for the query, the optimizer may scan it without using it for lookups.
For InnoDB tables that have a user-defined clustered index, that index
can be used even when Using index is absent from the Extra column.
This is the case if type is index and key is PRIMARY.
(Look at the second paragraph)
My problem with this:
First: I didn't understand the second paragraph the way it's written.
Second:
The following query returns
EXPLAIN SELECT id FROM test WHERE id = 5;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE test const PRIMARY PRIMARY 8 const 1 Using index
(Scroll to the right)
And this other query returns:
EXPLAIN SELECT id FROM test WHERE id > 5;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE test range PRIMARY PRIMARY 8 NULL 1 Using where; Using index
(Scroll to the right)
Other than the fact that one query uses a range search and another uses the constant search, both queries are using some restricting clause on the table (WHERE or ON), and not all record will be returned.
What does Using where mean on the second query, and what does its absence from the first query mean?
EXTRA
What is the difference with Using index condition; Using where?
(I'm not adding an example of this because I have not been able to reproduce it in a small self-contained piece of code)
When you see Using index in the Extra part of an explain, it means that the (covering) index is adequate for the query.
In your example, SELECT id FROM test WHERE id = 5;, the server doesn't need to access the actual table, as it can satisfy the query (you only access id) using only the index (as the explain says). In case you are not aware, the PK is implemented via a unique index.
When you see Using where; Using index, it means that first the index is used to retrieve the records (an actual access to the table is not needed), and then the filtering of the WHERE clause is done on top of this result set.
In this example, SELECT id FROM test WHERE id > 5;, you still fetch id from the index and then apply the greater-than condition to filter out the records that do not match it.

Mysql converting subquery into dependent subquery

Hi, I do not understand why the subquery of the given query is being converted into a dependent subquery,
although the subquery does not depend on the main query (it does not use the primary query's table).
I know that this query can be optimized using joins, but here I just want to know the reason for this.
MySQL version 5.5
EXPLAIN SELECT id FROM `cab_request_histories`
WHERE cab_request_histories.id = any(SELECT id
FROM cab_requests
WHERE cab_requests.request_type = 'pickup')
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY cab_request_histories index NULL PRIMARY 4 NULL 20
2 DEPENDENT SUBQUERY cab_requests unique_subquery PRIMARY PRIMARY 4 func 1
I suspect that the ANY keyword will require MySQL to pass the values from outside the subquery to inside it to evaluate whether the result is true.
The MySQL optimizer uses the EXISTS strategy for this query, effectively changing it to something like:
SELECT id FROM cab_request_histories
WHERE EXISTS
( SELECT 'this one is dependent' FROM cab_requests
WHERE cab_requests.request_type = 'pickup'
AND cab_requests.id = cab_request_histories.id )
You can see what optimizer does with your query using EXPLAIN EXTENDED your_query followed by SHOW WARNINGS.
This type of optimization is described in http://dev.mysql.com/doc/refman/5.5/en/subquery-optimization-with-exists.html.

Limited SQL query returns all rows instead of one

I tried the SQL code:
explain SELECT * FROM myTable LIMIT 1
As a result I got:
id select_type table type possible_keys key key_len ref **rows**
1 SIMPLE myTable ALL NULL NULL NULL NULL **32117**
Do you know why the query would run through all rows instead of simply picking the first row?
What can I change in the query (or in my table) to reduce the number of rows examined for a similar result?
The rows count shown is only an estimate of the number of rows to examine. It is not always equal to the actual number of rows examined when you run the query.
In particular:
LIMIT is not taken into account while estimating number of rows. Even if you have LIMIT which restricts how many rows will be examined, MySQL will still print full number.
Source
When the query actually runs only one row will be examined.
Edited for use of subselect:
Assuming the primary key is "my_id" , use WHERE. For instance:
select * from mytable
where my_id = (
select max(my_id) from mytable
)
While this seems less efficient at first, EXPLAIN gives the result below: just one row is returned, and the maximum is found by a read on the index. I do not suggest doing this against partitioned tables in MySQL:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY mytable const PRIMARY PRIMARY 4 const 1
2 SUBQUERY NULL NULL NULL NULL NULL NULL NULL Select tables optimized away