Optimizing MySQL Left join query between 3 tables to reduce execution time - mysql

I have the following query:
SELECT region.id, region.world_id, min_x, min_y, min_z, max_x, max_y, max_z, version, mint_version
FROM minecraft_worldguard.region
LEFT JOIN minecraft_worldguard.region_cuboid
ON region.id = region_cuboid.region_id
AND region.world_id = region_cuboid.world_id
LEFT JOIN minecraft_srvr.lot_version
ON id=lot
WHERE region.world_id = 10
AND region_cuboid.world_id=10;
The Mysql slow query log tells me that it takes more than 5 seconds to execute, returns 2300 rows but examines 15'404'545 rows to return it.
The three tables each have bout 6500 rows only with unique keys on the id and lot fields as well as keys on the world_id fields. I tried to minimize the amount of rows examined by filtering both cuboid and world by their ID and the double WHERE on world_id, but it did not seem to help.
Any idea how I can optimize this query?
Here is the sqlfiddle with the indexes as of current status.

MySQL can't use index in this case because joined fields has different data types:
`lot` varchar(20) COLLATE utf8_unicode_ci NOT NULL
`id` varchar(128) COLLATE utf8_bin NOT NULL
If you change types of this fields to general type (for example, region.id to utf8_unicode_ci), MySQL uses primary key (fiddle).
According to docs:
Comparison of dissimilar columns (comparing a string column to a
temporal or numeric column, for example) may prevent use of indexes if
values cannot be compared directly without conversion.

You have joined the two tables "minecraft_worldguard.region" and "minecraft_worldguard.region_cuboid", on region.world_id and region_cuboid.world_id. So WHERE clause wouldn't require two conditions.
The two columns in the WHERE clause have been equated in the JOIN condition, hence you wouldn't require checking both the conditions in the WHERE clause. Remove one of them in the WHERE clause and add an index on the column that is remaining on the WHERE condition.
In your example, leave the WHERE clause as below:
WHERE region.world_id = 10
and add an index on the region.world_id column, that would improve the performance a bit.
NOTE: observe that I am suggesting you to discard "AND region_cuboid.world_id=10;" part of the WHERE clause.
Hope that helps.

First, when writing queries that have multiple tables, it is a very good thing to get used to "alias" references to the tables so you don't have to retype the entire long name throughout. Also, it is a really good idea to identify which tables the columns are coming from to allow users to better understand what is where which can also help improve performance (such as suggesting a covering index).
That said, I have applied aliases to your original query, but AM GUESSING the table per the respective columns, but you can obviously identify quickly and adjust.
SELECT
R.id,
R.world_id,
RC.min_x,
RC.min_y,
RC.min_z,
RC.max_x,
RC.max_y,
RC.max_z,
LV.version,
LV.mint_version
FROM
minecraft_worldguard.region R
LEFT JOIN minecraft_worldguard.region_cuboid RC
ON R.id = RC.region_id
AND R.world_id = RC.world_id
LEFT JOIN minecraft_srvr.lot_version LV
ON R.id = LV.lot
WHERE
R.world_id = 10
I also removed from the where clause your "region_cuboid.world_id = 10" as that is redundant as a result of the JOIN clause based on region AND world.
For suggestion of indexes, and if I have the proper alias references to the columns, I would suggest a covering index on the region table of
( world_id, id ). The "World_id" in the first position quickly qualifies the WHERE clause, and the "id" is there for the RC and LV tables.
For the region_cuboid table, I would also have an index on ( world_id, region_id) to match the region table being joined to it.
For the lot_version table, and index on (lot) or a covering index on (lot, version, mint_version)

Related

Should i create any indexes here to optimize my query?

now i'm trying to figure out, what should i do, to improve my query result.
Now, it's 47.55.
So, should i create any indexes for columns? Tell me please
SELECT bw.workloadId, lrer.lecturerId, lrer.lastname, lrer.name, lrer.fathername, bt.title, ac.activityname, cast(bw.exactday as char(45)) as "date", bw.exacttime as "time" FROM base_workload as bw
right join unioncourse as uc on uc.idunioncourse = bw.idunioncourse
right join basecoursea as bc on bc.idbasecoursea = uc.idbasecourse
right join lecturer as lrer on lrer.lecturerId = uc.lecturerId
right join basetitle as bt on bt.idbasetitle = bc.idbasetitle
right join activity as ac on ac.activityId = bc.activityId
where lrer.lecturerId is not null AND bc.idbasecoursea is not null and bw.idunioncourse != ""
ORDER BY bw.exactday, bw.exacttime ASC;
From MySQL 8.0 documentation:
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
MySQL use indexes for these operations:
To find the rows matching a WHERE clause quickly.
To eliminate rows from consideration.
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows.
To retrieve rows from other tables when performing joins.
To find the MIN() or MAX() value for a specific indexed column key_col.
To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2).
In some cases, a query can be optimized to retrieve values without consulting the data rows.
As of your requirements, you could use index on the WHERE clause for faster data retrieval.
I think you can get rid of
lrer.lecturerId is not null
AND bc.idbasecoursea is not null
By changing the first 3 RIGHT JOINs to JOINs.
What is the datatype of exactday? What is the purpose of
cast(bw.exactday as char(45)) as "date"
The CAST may be unnecessary.
Re bw.exactday, bw.exacttime: It is usually better to use a single column for DATETIME instead of two columns (DATE and TIME).
What are the PRIMARY KEYs of the tables?
Please convert to LEFT JOIN if possible; I can't wrap my head around RIGHT JOINs.
This index on bw may help: INDEX(exactday, exacttime).

Limiting the number of rows joined in SQL while using inner join,limit and order by

I have many tables which require joins before fetching data to display.
Also I am using inner join and I just want to show first 20 results.
So the problem is that at present the query is scanning the whole table because it will sort first and then extract the top 20 rows, and I can't make the column I want to sort as primary index though it is indexed.
The second thing is that also the query uses Joins so is there a way that I get the top 20 results from parent table(sorted column is present in parent table itself) and then reduce the rows being joined and thus reducing the join time?
Query is like:
select distinct patientFirstName as patientFirstName, patientLastName
as patientLastName from EncounterInformationBean inner join LastName
ON
EncounterInformationBean.id=LastName.uid
where
patientFirstName like key and isActive=1 order by patientFirstName
LIMIT 20
key is a variable name which is of string type.
Any optimization strategy for the problem is appreciated.
I already indexed the column to be sorted(though not as primary key) but the main problem is join.
One way can be denormalization but there are many columns to be displayed.
Don't use DISTINCT
Spell ORDER BY correctly
What the heck is like key
Have INDEX(isActive, patientFirstName)

Need some clarification on indexes (WHERE, JOIN)

We are facing some performance issues in some reports that work on millions of rows. I tried optimizing sql queries, but it only reduces the time of execution to half.
The next step is to analyse and modify or add some indexes, therefore i have some questions:
1- the sql queries contain a lot of joins: do i have to create an index for each foreignkey?
2- Imagine the request SELECT * FROM A LEFT JOIN B on a.b_id = b.id where a.attribute2 = 'someValue', and we have an index on the table A based on b_id and attribute2: does my request use this index for the where part ( i know if the two conditions were on the where clause the index will be used).
3- If an index is based on columns C1, C2 and C3, and I decided to add an index based on C2, do i need to remove the C2 from the first index?
Thanks for your time
You can use EXPLAIN query to see what MySQL will do when executing it. This helps a LOT when trying to figure out why its slow.
JOIN-ing happens one table at a time, and the order is determined by MySQL analyzing the query and trying to find the fastest order. You will see it in the EXPLAIN result.
Only one index can be used per JOIN and it has to be on the table being joined. In your example the index used will be the id (primary key) on table B. Creating an index on every FK will give MySQL more options for the query plan, which may help in some cases.
There is only a difference between WHERE and JOIN conditions when there are NULL (missing rows) for the joined table (there is no difference at all for INNER JOIN). For your example the index on b_id does nothing. If you change it to an INNER JOIN (e.g. by adding b.something = 42 in the where clause), then it might be used if MySQL determines that it should do the query in reverse (first b, then a).
No.. It is 100% OK to have a column in multiple indexes. If you have an index on (A,B,C) and you add another one on (A) that will be redundant and pointless (because it is a prefix of another index). An index on B is perfectly fine.

Mysql, PHP Laravel

I have following tables on Production with the respective counts of records,
people count= '565367'
donors count= '556325'
telerec_recipients count= '115147'
person_addresses count= '563183'
person_emails count= '106958'
person_phones count= '676474'
person_sms count= '22275'
On the UI end I want to display some data by applying some joins or left joins, and at the end by grouping, ordering and paginating I'm generating a json response.
So for achieving this I tried 2 methods, one by creating the view file of the join queries and apply where, group by and order by clauses inside controller, and second by directly firing the laravel syntax of joins in my controller.
Because I have a large data in all the tables, The join query works good but when it comes to group by statement it takes much time to execute.
Help me out in order to optimize or faster this process.
My Example Query is:
SELECT people.id as person_id,donors.donor_id,telerec_recipients.id as telerec_recipient_id,people.first_name,people.last_name,people.birth_date,
person_addresses.address_id,person_emails.email_id,person_phones.phone_id,person_sms.sms_id
FROM `people`
inner join `donors`
on `people`.`id` = `donors`.`person_id`
left join `telerec_recipients`
on `people`.`id` = `telerec_recipients`.`person_id`
left join `person_addresses`
on `people`.`id` = `person_addresses`.`person_id`
left join `person_emails`
on `people`.`id` = `person_emails`.`person_id`
left join `person_phones`
on `people`.`id` = `person_phones`.`person_id`
left join `person_sms`
on `people`.`id` = `person_sms`.`person_id`
GROUP BY `people`.`id`, `telerec_recipients`.`id`
ORDER BY `last_name` ASC LIMIT 25 offset 0;
Result of the explain:
Donors Table Indexes
The problem is that MySQL uses the wrong index on the donors table, you need to instruct MySQL using force index index hint to use the primary key instead.
Explanation
From the output of the explain it is clear that sg really goes wrong with donors table, since you can see using temporary, using filesort in the extra column.
The key column tells you that MySQL uses a unique index on the donors.donor_id field. However, donor_id field is not used anywhere in the query for joining / grouping / filtering purposes. donors.person_id field is used in the join. Since the primary key of donors table is on the person_id field, myql should use that for the join. force index index hint tells MySQL that you firmly believe that a certain index should be used.
Note 1:
You have several duplicate indexes. Remove all duplicates, they are only slowing your system own.
Note 2:
I think you query is against the sql standards because you have fields in the select list that are neither in the group by list, nor are subject of an aggregate function such as sum(), nor are functionally dependent on the grouping fields. In MySQL under certain sql mode settings such queries are allowed to run, but they may not produce entirely the same output as you would like.

I'm not sure if I have the correct indexes or if I can improve the speed of my query in MySQL?

My query has a join, and it looks like it's using two indexes which makes it more complicated. I'm not sure if I can improve on this, but I thought I'd ask.
The query produces a list of records with similar keywords the record being queried.
Here's my query.
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE match_keywords.word IS NOT NULL
AND current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
The EXPLAIN
Word is varchar(40).
You can start by trying to remove the IS NOT NULL test, which is implicitly removed by COUNT on the field. It also looks like you would want to omit 25695 from match_keywords, otherwise 25695 (or other) would surely show up as the "best" match within your 11 row limit?
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
Next, consider how you would do it as a person.
You would to start with a padid (25695) and retrieve all the words for that padid
From those list of words, go back into the table again and for each matching word,
get their padid's (assumed to have no duplicate on padid + word)
group the padid's together and count them
order the counts and return the highest 11
With your list of 3 separate single-column indexes, the first two steps (both involve only 2 columns) will always have to jump from index back to data to get the other column. Covering indexes may help here - create two composite indexes to test
create index ix_keyword_pw on keyword(padid, word);
create index ix_keyword_wp on keyword(word, padid);
With these composite indexes in place, you can remove the single-column indexes on padid and word since they are covered by these two.
Note: You always have to temper SELECT performance against
size of indexes (the more you create the more to store)
insert/update performance (the more indexes, the longer it takes to commit since it has to update the data, then update all indexes)
Try the following... ensure index on PadID, and one on WORD. Then, by changing the order of the SELECT WHERE qualifier should optimize on the PADID of the CURRENT keyword first, then join to the others... Exclude a join to itself. Also, since you were checking on equality on the inner join to matching keywords... if the current keyword is checked for null, it should never join to a null value, thus eliminating a compare on the MATCH keywords alias as looking at every comparison as looking for NULL...
SELECT STRAIGHT_JOIN
match_keywords.padid,
COUNT(*) AS matching_words
FROM
keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
and match_keywords.padid <> 25695
WHERE
current_program_keywords.padid = 25695
AND current_program_keywords.word IS NOT NULL
GROUP BY
match_keywords.padid
ORDER BY
matching_words DESC
LIMIT
0, 11
You should index the following fields (check to what table corresponds)
match_keyword.padid
current_program_keywords.padid
match_keyword.words
current_program_keywords.words
Hope it helps accelerate