I must run this query with MySQL:
select requests.id, requests.id_temp, categories.id
from opadithree.requests inner join
opadi.request_detail_2
on substring(requests.id_sub_temp, 3) = request_detail_2.id inner join
opadithree.categories
on request_detail_2.theme = categories.cu_code
where categories.atc = false and id_sub_temp like "2_%";
However, for some reason the query is too slow. The table requests has 15583 rows, request_detail_2 has 66469 rows, and categories has 13452 rows.
The most problematic column, id_sub_temp, holds strings in one of two formats: "2_number" or "3_number".
Do you know some trick to make the query faster?
Here are the indexes I'd try:
First, you need an index so your WHERE condition on id_sub_temp can find the rows it needs efficiently. Then add the column id_temp so the query can read that column from the index instead of being forced to read the row.
CREATE INDEX bk1 ON requests (id_sub_temp, id_temp);
Next I'd like the join to categories to filter by atc=false and then match the cu_code. I tried reversing the order of these columns so cu_code was first, but that resulted in an expensive index-scan instead of a lookup. Maybe that was only because I was testing with empty tables. Anyway, I don't think the column order is important in this case.
CREATE INDEX bk2 ON categories (atc, cu_code);
The join to request_detail_2 is currently by primary key, which is already pretty efficient.
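To check whether the new indexes are actually picked up, it may help to run EXPLAIN on the query once bk1 and bk2 exist. The sketch below is the same query with table aliases added (the aliases r, d and c are mine, not from the question). It also escapes the underscore in the LIKE pattern, because an unescaped _ matches any single character. With the data formats described in the question the result should be the same, and the escape assumes the default sql_mode, where backslash is the escape character.

```sql
-- Assumes the bk1 and bk2 indexes above have already been created.
EXPLAIN
SELECT r.id, r.id_temp, c.id
FROM opadithree.requests AS r
INNER JOIN opadi.request_detail_2 AS d
        ON SUBSTRING(r.id_sub_temp, 3) = d.id
INNER JOIN opadithree.categories AS c
        ON d.theme = c.cu_code
WHERE c.atc = FALSE
  AND r.id_sub_temp LIKE '2\_%';  -- "\_" matches a literal underscore
```

In the output, the key column shows which index (if any) was chosen for each table, and the rows column gives the optimizer's estimate of how many rows it expects to examine.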
I'm trying to figure out what I should do to improve my query's performance.
Right now it's 47.55.
So, should I create indexes on any of the columns? Please tell me.
SELECT bw.workloadId, lrer.lecturerId, lrer.lastname, lrer.name, lrer.fathername, bt.title, ac.activityname, cast(bw.exactday as char(45)) as "date", bw.exacttime as "time" FROM base_workload as bw
right join unioncourse as uc on uc.idunioncourse = bw.idunioncourse
right join basecoursea as bc on bc.idbasecoursea = uc.idbasecourse
right join lecturer as lrer on lrer.lecturerId = uc.lecturerId
right join basetitle as bt on bt.idbasetitle = bc.idbasetitle
right join activity as ac on ac.activityId = bc.activityId
where lrer.lecturerId is not null AND bc.idbasecoursea is not null and bw.idunioncourse != ""
ORDER BY bw.exactday, bw.exacttime ASC;
From MySQL 8.0 documentation:
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
MySQL uses indexes for these operations:
To find the rows matching a WHERE clause quickly.
To eliminate rows from consideration.
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows.
To retrieve rows from other tables when performing joins.
To find the MIN() or MAX() value for a specific indexed column key_col.
To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2).
In some cases, a query can be optimized to retrieve values without consulting the data rows.
As for your requirements, you could add an index on the columns used in the WHERE clause for faster data retrieval.
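The leftmost-prefix rule quoted above is easy to misread, so here is a minimal sketch of it. The table demo_people, its columns and the index name idx_name are all hypothetical and exist only for this illustration.

```sql
-- Hypothetical table, used only to illustrate the leftmost-prefix rule.
CREATE TABLE demo_people (
  id         INT NOT NULL PRIMARY KEY,
  last_name  VARCHAR(50) NOT NULL,
  first_name VARCHAR(50) NOT NULL,
  city       VARCHAR(50) NOT NULL,
  INDEX idx_name (last_name, first_name)
);

-- Can use the leftmost prefix (last_name) of idx_name.
EXPLAIN SELECT * FROM demo_people WHERE last_name = 'Jones';

-- Can use both columns of idx_name.
EXPLAIN SELECT * FROM demo_people
WHERE last_name = 'Jones' AND first_name = 'John';

-- first_name alone is not a leftmost prefix, so idx_name cannot be used for a lookup here.
EXPLAIN SELECT * FROM demo_people WHERE first_name = 'John';
```

The plans are only meaningful once the table holds a realistic amount of data; on an empty table the optimizer may choose differently.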
I think you can get rid of
lrer.lecturerId is not null
AND bc.idbasecoursea is not null
By changing the first 3 RIGHT JOINs to JOINs.
What is the datatype of exactday? What is the purpose of
cast(bw.exactday as char(45)) as "date"
The CAST may be unnecessary.
Re bw.exactday, bw.exacttime: It is usually better to use a single column for DATETIME instead of two columns (DATE and TIME).
What are the PRIMARY KEYs of the tables?
Please convert to LEFT JOIN if possible; I can't wrap my head around RIGHT JOINs.
This index on bw may help: INDEX(exactday, exacttime).
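Putting these suggestions together, the query might look like the sketch below. It rests on an assumption that should be verified against the real data: because the WHERE clause rejects rows where bw, bc or lrer are NULL, every RIGHT JOIN appears to behave like an inner join, so all of them are written as plain JOINs here. The index name idx_exactday_exacttime is hypothetical, and the CAST is kept so the output stays the same.

```sql
-- Hypothetical index name; supports the ORDER BY on (exactday, exacttime).
ALTER TABLE base_workload
  ADD INDEX idx_exactday_exacttime (exactday, exacttime);

SELECT bw.workloadId,
       lrer.lecturerId, lrer.lastname, lrer.name, lrer.fathername,
       bt.title,
       ac.activityname,
       CAST(bw.exactday AS CHAR(45)) AS `date`,
       bw.exacttime AS `time`
FROM base_workload AS bw
JOIN unioncourse AS uc   ON uc.idunioncourse = bw.idunioncourse
JOIN basecoursea AS bc   ON bc.idbasecoursea = uc.idbasecourse
JOIN lecturer    AS lrer ON lrer.lecturerId  = uc.lecturerId
JOIN basetitle   AS bt   ON bt.idbasetitle   = bc.idbasetitle
JOIN activity    AS ac   ON ac.activityId    = bc.activityId
-- The IS NOT NULL checks are implied by the inner joins and can be dropped.
WHERE bw.idunioncourse != ''
ORDER BY bw.exactday, bw.exacttime;
```

Comparing the row counts of the old and new queries is a quick way to confirm that the rewrite did not change the result.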
I was running a query of this kind:
SELECT
-- fields
FROM
table1 JOIN table2 ON (table1.c1 = table2.c1 OR table1.c2 = table2.c2)
WHERE
-- conditions
But the OR made it very slow, so I split it into 2 queries:
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c1 = table2.c1
WHERE
-- conditions
UNION
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c2 = table2.c2
WHERE
-- conditions
This works much better, but now I am going through the tables twice, so I was wondering whether any further optimization is possible, for instance getting the set of entries that satisfies the condition (table1.c1 = table2.c1 OR table1.c2 = table2.c2) and then querying on it. That would bring me back to what I was doing at first, but maybe there is another solution I don't have in mind. So is there anything more to do, or is it already optimal?
Splitting the query into two separate ones is usually better in MySQL, since it rarely uses an "index OR" operation (Index Merge in MySQL lingo).
There are a few items I would concentrate on for further optimization, all related to indexing:
1. Filter the rows faster
The predicates in the WHERE clause should be optimized to retrieve as few rows as possible. They should also be analyzed in terms of selectivity, so you can create indexes that produce the data with as little filtering as possible (fewer reads).
2. Join access
Retrieving related rows should be optimized as well. According to selectivity you need to decide which table is more selective and use it as a driving table, and consider the other one as the nested loop table. Now, for the latter, you should create an index that will retrieve rows in an optimal way.
3. Covering Indexes
Last but not least, if your query is still slow, there's one more thing you can do: use covering indexes. That is, expand your indexes to include all the columns the query needs from the driving and/or secondary tables. This way the InnoDB engine won't need to read two indexes per table (the secondary index and then the clustered primary key index), but a single one.
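As a rough illustration of the three points above, here is a sketch using the table1/table2 pattern from the question. The columns status and payload and both index names are hypothetical, since the question only shows placeholders for the fields and conditions.

```sql
-- Hypothetical columns: table1.status is the WHERE filter, table2.payload is selected.
-- Filter column first, then the join column, so the index also covers the query.
CREATE INDEX idx_t1_status_c1 ON table1 (status, c1);

-- Join column first for the lookup, then the selected column to make it covering.
CREATE INDEX idx_t2_c1_payload ON table2 (c1, payload);

EXPLAIN
SELECT t1.c1, t2.payload
FROM table1 AS t1
JOIN table2 AS t2 ON t1.c1 = t2.c1
WHERE t1.status = 'open';
```

If both indexes are covering for this query, the Extra column of the EXPLAIN output should show "Using index" for each table.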
Test
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c1 = table2.c1
WHERE
-- conditions
UNION ALL
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c2 = table2.c2
WHERE
-- conditions
/* add one more condition which eliminates the rows selected by 1st subquery */
AND table1.c1 != table2.c1
Copied from the comments:
Nico Haase > What do you mean by "test"?
The OP shows query patterns only, so I cannot predict whether the technique will be effective. I suggest that the OP test my variant on their own structure and data.
Nico Haase > what you've changed
I have added one more condition to 2nd subquery - see added comment in the code.
Nico Haase > and why?
This replaces UNION DISTINCT with UNION ALL, which eliminates the sorting of the combined rowset that would otherwise be needed to remove duplicates.
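One caveat worth testing: if c1 can be NULL, table1.c1 != table2.c1 evaluates to NULL and the row is dropped from the second branch even though the first branch never returned it. A possible variant, sketched below with hypothetical id columns in the select list, uses IS NOT TRUE so that such rows are kept.

```sql
-- t1.id and t2.id are hypothetical columns standing in for "-- fields".
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM table1 AS t1
JOIN table2 AS t2 ON t1.c1 = t2.c1

UNION ALL

SELECT t1.id AS t1_id, t2.id AS t2_id
FROM table1 AS t1
JOIN table2 AS t2 ON t1.c2 = t2.c2
-- Keep rows the first branch did not return, including rows where c1 is NULL.
WHERE (t1.c1 = t2.c1) IS NOT TRUE;
```

Note that UNION also removes duplicates produced within a single branch, which UNION ALL does not, so the two forms only match when each branch returns distinct rows on its own.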
When I execute a SQL statement like this:
SELECT *
FROM table_foo
JOIN table_bar
ON table_foo.foo_id = table_bar.bar_id
do I need an index just on table_foo.foo_id ?
Or does MySQL use both indices, on table_foo.foo_id and table_bar.bar_id?
The result of EXPLAIN is like this.
There are multiple possible execution plans for this query:
SELECT f.*, b.*
FROM table_foo f JOIN
table_bar b
ON f.foo_id = b.bar_id;
Here are some examples:
The one you want to avoid (presumably) is a nested loop join that loops through one table -- row by row -- and then for each row loops through the second one.
Scan foo and look up each value in bar, using an index on table_bar(bar_id). From the row id in the bar index, get the associated columns for each matching row.
Scan bar and look up each value in foo, using an index on table_foo(foo_id). From the row id in the foo index, get the associated columns for each matching row.
Scan both indexes using a merge join and look up the associated rows in each of the tables.
This leaves out other options, such as a hash join, which would not normally use indexes.
So, either or both indexes might be used, depending on which algorithms the optimizer implements. That is, one index is often going to be good enough to get the performance you want. But, you give the optimizer more options if you have an index on both tables.
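A minimal sketch of giving the optimizer both options and then checking which one it picks is shown below. The index names are hypothetical. As far as I know, MySQL itself implements nested-loop joins and, from 8.0.18, hash joins, but not a merge join, so in practice the choice is mostly about which table is scanned and which is looked up.

```sql
-- Hypothetical index names; skip either statement if the column is already a key.
CREATE INDEX idx_foo_id ON table_foo (foo_id);
CREATE INDEX idx_bar_id ON table_bar (bar_id);

EXPLAIN
SELECT f.*, b.*
FROM table_foo AS f
JOIN table_bar AS b
  ON f.foo_id = b.bar_id;
```

The first table listed in the EXPLAIN output is the one being scanned; the second should show the index used for the lookup in its key column.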
We are facing some performance issues in some reports that work on millions of rows. I tried optimizing the SQL queries, but that only cut the execution time in half.
The next step is to analyse and modify or add some indexes, so I have some questions:
1- The SQL queries contain a lot of joins: do I have to create an index for each foreign key?
2- Imagine the query SELECT * FROM A LEFT JOIN B on a.b_id = b.id where a.attribute2 = 'someValue', and we have an index on table A based on b_id and attribute2: does my query use this index for the WHERE part? (I know that if the two conditions were both in the WHERE clause, the index would be used.)
3- If an index is based on columns C1, C2 and C3, and I decide to add an index based on C2, do I need to remove C2 from the first index?
Thanks for your time
You can use EXPLAIN on the query to see what MySQL will do when executing it. This helps a LOT when trying to figure out why it's slow.
JOIN-ing happens one table at a time, and the order is determined by MySQL analyzing the query and trying to find the fastest order. You will see it in the EXPLAIN result.
1. Only one index can be used per JOIN and it has to be on the table being joined. In your example the index used will be the id (primary key) on table B. Creating an index on every FK will give MySQL more options for the query plan, which may help in some cases.
2. There is only a difference between WHERE and JOIN conditions when there are NULL (missing rows) for the joined table (there is no difference at all for INNER JOIN). For your example the index on b_id does nothing. If you change it to an INNER JOIN (e.g. by adding b.something = 42 in the where clause), then it might be used if MySQL determines that it should do the query in reverse (first b, then a).
3. No. It is 100% OK to have a column in multiple indexes. If you have an index on (A,B,C) and you add another one on (A) that will be redundant and pointless (because it is a prefix of another index). An index on B is perfectly fine.
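For the query in the second question, an index that starts with the filtered column is the one that can serve the WHERE clause. The sketch below assumes tables named A and B as in the question; the index name idx_a_attribute2 is hypothetical.

```sql
-- Hypothetical index name. attribute2 must be the leftmost column
-- for the WHERE condition to use the index.
CREATE INDEX idx_a_attribute2 ON A (attribute2);

EXPLAIN
SELECT *
FROM A AS a
LEFT JOIN B AS b ON a.b_id = b.id
WHERE a.attribute2 = 'someValue';
```

In the plan, table a should show idx_a_attribute2 in the key column, and table b should be reached through its primary key.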
I have the following query:
SELECT region.id, region.world_id, min_x, min_y, min_z, max_x, max_y, max_z, version, mint_version
FROM minecraft_worldguard.region
LEFT JOIN minecraft_worldguard.region_cuboid
ON region.id = region_cuboid.region_id
AND region.world_id = region_cuboid.world_id
LEFT JOIN minecraft_srvr.lot_version
ON id=lot
WHERE region.world_id = 10
AND region_cuboid.world_id=10;
The MySQL slow query log tells me that it takes more than 5 seconds to execute and returns 2300 rows, but examines 15'404'545 rows to do so.
The three tables each have only about 6500 rows, with unique keys on the id and lot fields as well as keys on the world_id fields. I tried to minimize the number of rows examined by filtering both cuboid and world by their ID with the double WHERE on world_id, but it did not seem to help.
Any idea how I can optimize this query?
Here is the sqlfiddle with the indexes as of current status.
MySQL can't use an index in this case because the joined columns have different definitions (note the different collations):
`lot` varchar(20) COLLATE utf8_unicode_ci NOT NULL
`id` varchar(128) COLLATE utf8_bin NOT NULL
If you change these columns to a common type and collation (for example, region.id to utf8_unicode_ci), MySQL uses the primary key (fiddle).
According to docs:
Comparison of dissimilar columns (comparing a string column to a
temporal or numeric column, for example) may prevent use of indexes if
values cannot be compared directly without conversion.
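A sketch of the change suggested above, converting region.id to the collation used by lot_version.lot, could look like this. It assumes the column definition shown earlier (varchar(128), NOT NULL) and the utf8 character set implied by the collation names. Try it on a copy first: switching from a binary to a case-insensitive collation can make previously distinct ids compare as equal, and any foreign keys that reference this column may need the same change.

```sql
-- Assumption: id is currently varchar(128) utf8_bin NOT NULL, as shown above.
ALTER TABLE minecraft_worldguard.region
  MODIFY id VARCHAR(128)
  CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;

-- Afterwards, check that the join to lot_version now uses an index.
EXPLAIN
SELECT r.id, lv.version
FROM minecraft_worldguard.region AS r
LEFT JOIN minecraft_srvr.lot_version AS lv ON r.id = lv.lot
WHERE r.world_id = 10;
```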
You have already joined the two tables "minecraft_worldguard.region" and "minecraft_worldguard.region_cuboid" on region.world_id = region_cuboid.world_id, so the WHERE clause doesn't need both conditions.
Because the JOIN condition already equates the two columns, checking both of them in the WHERE clause is redundant. Remove one of them and add an index on the column that remains in the WHERE clause.
In your example, leave the WHERE clause as below:
WHERE region.world_id = 10
and add an index on the region.world_id column; that should improve the performance a bit.
NOTE: observe that I am suggesting you to discard "AND region_cuboid.world_id=10;" part of the WHERE clause.
Hope that helps.
First, when writing queries that have multiple tables, it is a very good thing to get used to "alias" references to the tables so you don't have to retype the entire long name throughout. Also, it is a really good idea to identify which tables the columns are coming from to allow users to better understand what is where which can also help improve performance (such as suggesting a covering index).
That said, I have applied aliases to your original query, but AM GUESSING the table per the respective columns, but you can obviously identify quickly and adjust.
SELECT
R.id,
R.world_id,
RC.min_x,
RC.min_y,
RC.min_z,
RC.max_x,
RC.max_y,
RC.max_z,
LV.version,
LV.mint_version
FROM
minecraft_worldguard.region R
LEFT JOIN minecraft_worldguard.region_cuboid RC
ON R.id = RC.region_id
AND R.world_id = RC.world_id
LEFT JOIN minecraft_srvr.lot_version LV
ON R.id = LV.lot
WHERE
R.world_id = 10
I also removed from the where clause your "region_cuboid.world_id = 10" as that is redundant as a result of the JOIN clause based on region AND world.
For suggestion of indexes, and if I have the proper alias references to the columns, I would suggest a covering index on the region table of
( world_id, id ). The "World_id" in the first position quickly qualifies the WHERE clause, and the "id" is there for the RC and LV tables.
For the region_cuboid table, I would also have an index on ( world_id, region_id) to match the region table being joined to it.
For the lot_version table, an index on (lot) or a covering index on (lot, version, mint_version).
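Written out as statements, the three suggested indexes might look like the sketch below. The index names are hypothetical, and any of these can be skipped if an equivalent key already exists on the table.

```sql
-- Hypothetical index names.
-- region: world_id first for the WHERE clause, id for the joins.
CREATE INDEX idx_region_world_id
  ON minecraft_worldguard.region (world_id, id);

-- region_cuboid: matches both join columns.
CREATE INDEX idx_cuboid_world_region
  ON minecraft_worldguard.region_cuboid (world_id, region_id);

-- lot_version: covering index, so the selected columns come straight from the index.
CREATE INDEX idx_lot_version_cover
  ON minecraft_srvr.lot_version (lot, version, mint_version);
```

Running EXPLAIN on the aliased query afterwards should show whether the rows estimate for each table has dropped from the full table size.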