Two Left Joins, only one is using an index - mysql

I'll show my query first, then the tables. I've tried forcing the index, but MySQL just won't use it. When that part of the query is run on its own, it uses the index and is fast, but since it won't in my full query, the full query is incredibly slow or never completes.
SELECT p.*, INET_NTOA(p.ip) AS ipStr, qs.score, db.*
FROM proxies p
LEFT JOIN ipdb db FORCE INDEX FOR JOIN (`iprange`)
ON db.ipStart <= p.ip AND db.ipEnd >= p.ip
LEFT JOIN ipqs qs ON qs.ip = p.ip
WHERE expiration_date < '2021-09-18'
ORDER BY expiration_date
LIMIT 500
'iprange' is an index on ipStart + ipEnd.
There are indexes on p.ip and expiration_date
Explain results:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p range expirationdate expirationdate 4 NULL 2547 Using index condition
1 SIMPLE db ALL iprange NULL NULL NULL 8334413 Range checked for each record (index map: 0x2)
1 SIMPLE qs eq_ref PRIMARY PRIMARY 4 adscend_Aff.p.ip 1 NULL
The query on ipdb, run by itself, sometimes uses the index and sometimes doesn't. When it doesn't, it takes 17 seconds; when it does, it takes 0.4 seconds.
explain SELECT * FROM ipdb db WHERE db.ipStart <= 785476891 AND db.ipEnd >= 785476891;
explain SELECT * from ipdb db where db.ipStart <= 16941057 AND db.ipEnd >= 16941057;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE db ALL iprange NULL NULL NULL 8334413 Using where
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE db range iprange iprange 4 NULL 86 Using index condition
When I force the index:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE db range iprange iprange 4 NULL 1818402 Using index condition
and takes 1.8 seconds.
I tried FORCE INDEX instead of FORCE INDEX FOR JOIN in the larger query, but it made no difference. I'm not sure how to address this. I also tried splitting this into two steps and doing the second step within a PHP loop, but it's still extremely slow that way.

If ipStart is at the wrong end of the table, forcing that index will make MySQL go through most of the table. You can't win.
Start over on designing the table and the queries. Here is a technique that runs O(1) instead of O(N): http://mysql.rjweb.org/doc.php/ipranges
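As a rough sketch of the idea behind that link (not a copy of it): if the ranges in ipdb never overlap, the only row that can contain a given address is the one with the largest ipStart that does not exceed it, so the lookup becomes a single index probe followed by a check against ipEnd. The non-overlap assumption and the names r, p0 and rangeStart are mine, not from the question.

```sql
-- Assumes ranges in ipdb do not overlap and that an index starts with ipStart
-- (the existing `iprange` index on (ipStart, ipEnd) qualifies).

-- Single address: walk the index backwards from the address, read one row,
-- then verify that the address really falls inside that range.
SELECT r.*
FROM (
    SELECT *
    FROM ipdb
    WHERE ipStart <= 785476891
    ORDER BY ipStart DESC
    LIMIT 1
) r
WHERE r.ipEnd >= 785476891;

-- Same idea inside the original query: find the candidate ipStart per proxy
-- with a correlated MAX(), then join on equality instead of on a range.
SELECT p.*, INET_NTOA(p.ip) AS ipStr, qs.score, db.*
FROM (
    SELECT p0.*,
           (SELECT MAX(r.ipStart) FROM ipdb r WHERE r.ipStart <= p0.ip) AS rangeStart
    FROM proxies p0
    WHERE p0.expiration_date < '2021-09-18'
    ORDER BY p0.expiration_date
    LIMIT 500
) p
LEFT JOIN ipdb db ON db.ipStart = p.rangeStart AND db.ipEnd >= p.ip
LEFT JOIN ipqs qs ON qs.ip = p.ip
ORDER BY p.expiration_date;
```

If the ranges can overlap, this returns at most one of the matching ranges per address, so the table would need to be restructured first, as the linked page describes.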

Related

Very slow query when using `id in (max(id))` in subquery

We recently moved our database from MariaDB to AWS Amazon Aurora RDS (MySQL). We observed something strange in a set of queries. We have two queries that are very quick on their own, but when combined as a nested subquery they take ages to finish.
Here id is the primary key of the table
SELECT * FROM users where id in(SELECT max(id) FROM users where id = 1);
execution time is ~350ms
SELECT * FROM users where id in(SELECT id FROM users where id = 1);
execution time is ~130ms
SELECT max(id) FROM users where id = 1;
execution time is ~130ms
SELECT id FROM users where id = 1;
execution time is ~130ms
We believe it has something to do with the type of the value returned by max, which causes the index to be ignored when the outer query runs against the results of the subquery.
All the above queries are simplified for illustration of the problem. The original queries have more clauses as well as 100s of millions of rows. The issue did not exist prior to the migration and worked fine in MariaDB.
--- RESULTS FROM MariaDB ---
MySQL seems to optimize less efficiently than MariaDB (in this case).
When doing this in MySQL (see: DBFIDDLE1), the execution plans look like:
For the query without MAX:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE integers null const PRIMARY PRIMARY 4 const 1 100.00 Using index
1 SIMPLE integers null const PRIMARY PRIMARY 4 const 1 100.00 Using index
For the query with MAX:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY integers null index null PRIMARY 4 null 1000 100.00 Using where; Using index
2 DEPENDENT SUBQUERY null null null null null null null null null Select tables optimized away
MariaDB (see: DBFIDDLE2), on the other hand, has a better-looking plan when using MAX:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY <subquery2> system null null null null 1 100.00
1 PRIMARY integers const PRIMARY PRIMARY 4 const 1 100.00 Using index
2 MATERIALIZED null null null null null null null null Select tables optimized away
EDIT: Because of time (some lack of it 😉) I now add some info
A suggestion to fix this:
SELECT *
FROM integers
WHERE i IN (select * from (SELECT MAX(i) FROM integers WHERE i=1)x);
Looking at the execution plan from MariaDB, which has one extra step, I tried to do the same in MySQL. The query above has an even bigger execution plan, but tests show that it performs better. (For the explain plans, see: DBFIDDLE1a)
"the question is Mariadb that much faster? it uses a step more that mysql"
One step more does not mean that things get slower.
MySQL takes about 2-3 seconds on the query using the MAX, and MariaDB does execute the same in under 10 msecs. But this is performance, and time may vary on different systems.
SELECT max(id) FROM users where id = 1
is strange. Since it looks only at rows where id = 1, the "max" is obviously 1. So are the min and the average.
Perhaps you wanted:
SELECT max(id) FROM users
Is there an index on id? Perhaps the PRIMARY KEY? If not, then that might explain the sluggishness.
This can be done much faster (again assuming an index):
SELECT * FROM users
ORDER BY id DESC
LIMIT 1
Does that give you what you want?
To discuss this further, please provide SHOW CREATE TABLE users

MySQL Array Variable (No stored procedure, No temporary table)

As mentioned in the title, I would like to know of any solution for this that does not use a stored procedure, a temporary table, etc.
Compare Query #1 and Query #3: Query #3 performs worse. Is there a workaround that keeps the values in a variable without hurting performance?
Schema (MySQL v5.7)
create table `order`(
id BIGINT(20) not null auto_increment,
transaction_no varchar(20) not null,
primary key (`id`),
unique key uk_transaction_no (`transaction_no`)
);
insert into `order` (`id`,`transaction_no`)
value (1,'10001'),(2,'10002'),(3,'10003'),(4,'10004');
Query #1
explain select * from `order` where transaction_no in ('10001','10004');
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE order NULL range uk_transaction_no uk_transaction_no 22 NULL 2 100 Using where; Using index
Query #2
set @transactionNos = "10001,10004";
There are no results to be displayed.
Query #3
explain select * from `order` where find_in_set(transaction_no, @transactionNos);
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE order NULL index NULL uk_transaction_no 22 NULL 4 100 Using where; Using index
Short Answer: See Sargability
Long Answer:
MySQL makes no attempt to use an index on a column when that column is hidden inside a function call such as FIND_IN_SET(), DATE(), etc. Your Query 3 will always be performed as a full table scan or a full index scan.
So the only way to optimize what you are doing is to construct
IN ('10001','10004')
That is often difficult to achieve when "binding" a list of values. IN(?) will not work in most APIs to MySQL.
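One way to get a real IN list without a stored procedure or temporary table is a server-side prepared statement with one placeholder per value. This is only a sketch: the variable names @t1 and @t2 and the statement name stmt are made up for illustration, and the number of ? placeholders has to match the number of values.

```sql
-- One user variable per value; each ? below binds to one of them.
SET @t1 = '10001', @t2 = '10004';

-- The statement text contains a plain IN list, so the optimizer can still
-- do a range lookup on uk_transaction_no.
PREPARE stmt FROM
    'SELECT * FROM `order` WHERE transaction_no IN (?, ?)';
EXECUTE stmt USING @t1, @t2;
DEALLOCATE PREPARE stmt;
```

For a list whose length is not known in advance, the statement text itself would have to be assembled (for example with CONCAT) before PREPARE. In that case the values must be escaped or bound rather than pasted in, otherwise the query is open to SQL injection.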

Confused about why index is not being used [duplicate]

When running this EXPLAIN query without an index
EXPLAIN SELECT exec_date,
100 * SUM(CASE WHEN cached = 'no' THEN 1 ELSE 0 END) / SUM(1) cached_no,
100 * SUM(CASE WHEN cached != 'no' THEN 1 ELSE 0 END) / SUM(1) cached_yes
FROM requests
GROUP BY exec_date
This is the output
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE requests ALL NULL NULL NULL NULL 478619 Using temporary; Using filesort
If I create an index
ALTER TABLE requests ADD INDEX exec_date(exec_date);
The output is
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE requests index NULL exec_date 4 NULL 497847
Since the value of Extra is blank, does that mean the key exec_date is not being used?
On a test server, the execution time of the actual (not the EXPLAIN statement) query with and without the index is the same.
Using index doesn't mean what you think it means. If it is present in the Extra column, it indicates that the optimizer isn't actually reading the entire rows, it is using the index (exclusively) to find column information.
The key could still be in use for other things, for example to perform lookups if you have a WHERE clause etc. In your specific scenario, for example, the disappearance of the Using temporary; actually does mean that your index is being utilized, since MySQL no longer needs to rearrange the contents of your table into a new temporary table to perform the GROUP BY.
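For Using index to appear for this particular query, the index would have to contain every column the query reads, here exec_date and cached. The following is a sketch of that, assuming the index name exec_date_cached is unused and that cached is a short column that is reasonable to index.

```sql
-- Hypothetical covering index: both columns the query touches are in the index,
-- so the GROUP BY can be answered from the index without reading table rows.
ALTER TABLE requests ADD INDEX exec_date_cached (exec_date, cached);

EXPLAIN SELECT exec_date,
       100 * SUM(CASE WHEN cached = 'no' THEN 1 ELSE 0 END) / SUM(1) cached_no,
       100 * SUM(CASE WHEN cached != 'no' THEN 1 ELSE 0 END) / SUM(1) cached_yes
FROM requests
GROUP BY exec_date;
```

With such an index in place, Extra would be expected to show Using index, although the optimizer makes the final choice and the result depends on the table definition.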

Using an index with multiple OR statements

I'm trying to get this query to use an index.
explain SELECT SQL_NO_CACHE * FROM products WHERE (stock_disabled = 1 OR negative_stock_allowed = 1 OR stock > 0)
This is what it returns:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p ALL multi_index NULL NULL NULL 2890 Using where
The multi_index is an index that contains stock_disabled, negative_stock_allowed and stock, in that order. I think the index is not working because of the multiple OR conditions. What can I do here?
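Add a "USE INDEX" hint if you explicitly want the optimizer to consider a particular index:
SELECT * FROM products USE INDEX (`multi_index`) WHERE (stock_disabled = 1 OR negative_stock_allowed = 1 OR stock > 0)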
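A hint alone may not help here, because a single composite index can only be range-scanned on its leading columns, and the three OR branches each filter on a different column. An alternative that is not part of the answer above is to give each branch its own index and combine the branches with UNION. The index names below are hypothetical, and the sketch assumes products has a primary key (so no two rows are identical).

```sql
-- Hypothetical single-column indexes, one per OR branch.
ALTER TABLE products
    ADD INDEX idx_stock_disabled (stock_disabled),
    ADD INDEX idx_negative_stock_allowed (negative_stock_allowed),
    ADD INDEX idx_stock (stock);

-- UNION (not UNION ALL) removes rows matched by more than one branch.
SELECT * FROM products WHERE stock_disabled = 1
UNION
SELECT * FROM products WHERE negative_stock_allowed = 1
UNION
SELECT * FROM products WHERE stock > 0;
```

Whether this is faster depends on how selective each condition is. If most rows satisfy one of the branches, or the table is as small as the 2890 rows shown in the EXPLAIN output, a full scan may still be the cheapest plan.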

optimizing less than greater than query of mysql

I am facing heavy load from one query. I have 10 million rows in my table and have indexed both the pid and fid columns, yet MySQL EXPLAIN shows that it is not using either of them.
Here is my query:
select * from tableA where pid < '94898' and fid='37' order by id desc limit 1;
My MySQL EXPLAIN output says this:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tableA index fid PRIMARY 4 NULL 152 Using where
but the MySQL slow log shows it is scanning millions of rows.