I have three tables:
Orders
OrdersPromotions
Promotions
Most of my queries are of this kind:
SELECT `promotions`.*
FROM `promotions`
INNER JOIN `orders_promotions` ON `promotions`.`id` = `orders_promotions`.`promotions_id`
WHERE `orders_promotions`.`orders_id` = 3
AND `promotions`.`code` = 'my_promotion_code'
So I never fetch promotions directly, only within the scope of an order. An order won't have many promotions. I am wondering whether it would be useful to put an INDEX on the code column of promotions, given that the INNER JOIN already leaves only a few rows, so it would be fine to go through all of them to find the promotion whose code is the given one.
Would an index make sense in my previous query, knowing that just this query:
SELECT `promotions`.*
FROM `promotions`
INNER JOIN `orders_promotions` ON `promotions`.`id` = `orders_promotions`.`promotions_id`
WHERE `orders_promotions`.`orders_id` = 3
Would return no more than 20 rows?
You should almost always use an index on any fields you are going to use for joins, sorts, grouping, or filtering in where clauses. I would say ALWAYS, but there could be exceptions to the rule (like if you had a very heavy write load on a table that was very infrequently used for reads where indexes would be useful).
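To see which index actually does the work for the query in the question, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL (so the plan text differs from MySQL's EXPLAIN output), and the index name idx_op_orders_id and the sample rows are made up for illustration. The index on orders_id is what narrows the join to a handful of rows; whether an extra index on code pays off after that is something to check against your own data.

```python
import sqlite3

# SQLite stands in for MySQL here. Table and column names follow the question;
# the index name and the sample rows are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE promotions (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE orders_promotions (orders_id INTEGER, promotions_id INTEGER);
    CREATE INDEX idx_op_orders_id ON orders_promotions (orders_id);
    INSERT INTO promotions (id, code) VALUES (1, 'my_promotion_code'), (2, 'other_code');
    INSERT INTO orders_promotions (orders_id, promotions_id) VALUES (3, 1), (3, 2), (4, 2);
""")

query = """
    SELECT promotions.*
    FROM promotions
    INNER JOIN orders_promotions ON promotions.id = orders_promotions.promotions_id
    WHERE orders_promotions.orders_id = ?
      AND promotions.code = ?
"""
params = (3, "my_promotion_code")
rows = conn.execute(query, params).fetchall()

# Each plan row is (id, parent, notused, detail); the detail text names any index used.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]

print(rows)
for step in plan:
    print(step)
```

In MySQL the equivalent check would be to prefix the query with EXPLAIN and look at the key column for each table.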
I have 2 tables: Articles and Comments;
"Comments.articleID" is a foreign key.
I want to query the database to compose a website that shows the article text of a certain article (given an articleID) and all the article's comments.
I can think of 2 ways to query the data:
Use 2 separate queries:
SELECT articles.text FROM articles where id = givenArticleID
SELECT comments.* FROM comments where comments.articleID = givenArticleID
Use an Inner join:
SELECT articles.text, comments.*
FROM articles
INNER JOIN comments on articles.id = comments.articleID
WHERE articles.id = givenArticleID
The first option only returns the data I am interested in - that is good.
The second option returns all data I am interested in, but much more data than necessary. Every row in the result set contains the article.text column, that could be a lot of (unnecessary) data.
I think that the join would be better for certain queries, that do not require a WHERE condition (thus containing different articles).
Which way would you generally prefer in the situation above?
Or is there an even better alternative...?
Option 2 is probably better, because it is only one client-server round trip.
Also don't forget that each query has to be parsed by the database server.
I'd recommend that you benchmark both versions and see which one performs better.
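Before benchmarking, it can help to see what each option really returns. The following is a small sketch using Python's built-in sqlite3 as a stand-in for your database; the comments.body column and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for the real database; comments.body is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, articleID INTEGER, body TEXT);
    INSERT INTO articles VALUES (1, 'A long article text ...');
    INSERT INTO comments VALUES (10, 1, 'first'), (11, 1, 'second'), (12, 1, 'third');
""")
given_article_id = 1

# Option 1: two round trips, the article text is transferred once.
article_text = conn.execute(
    "SELECT text FROM articles WHERE id = ?", (given_article_id,)
).fetchone()[0]
comments = conn.execute(
    "SELECT * FROM comments WHERE articleID = ? ORDER BY id", (given_article_id,)
).fetchall()

# Option 2: one round trip, the article text is repeated on every joined row.
joined = conn.execute("""
    SELECT articles.text, comments.*
    FROM articles
    INNER JOIN comments ON articles.id = comments.articleID
    WHERE articles.id = ?
    ORDER BY comments.id
""", (given_article_id,)).fetchall()

# Count how many joined rows carry a copy of the article text.
text_copies = sum(1 for row in joined if row[0] == article_text)
print(len(comments), text_copies)
```

One thing to keep in mind with the join: an article that has no comments yet produces no rows at all with INNER JOIN, so the article text would be missing too unless you switch to a LEFT JOIN.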
Below is my query
SELECT COUNT(t.prid)
FROM (
SELECT pr.prid
FROM jcp
INNER JOIN pr ON pr.prid = jcp.prid
WHERE jcp.custid = 123
UNION
SELECT pr.prid
FROM jcl
INNER JOIN pr ON pr.prid = jcl.prid
WHERE jcl.custid = 123
) AS t
Is there any way to make it more efficient? This query is inside a function that is executed thousands of times, which makes it slow.
First of all, your query appears to be combining two very different types of data in your 'union' - the first part being the count of an ID, and the second being the literal ID - so I would question whether this is really doing what you intend it to do as written. However, just taking it at face value, you could eliminate the subquery in the first part as follows:
SELECT COUNT(pr.prid)
FROM jcp
INNER JOIN pr
ON pr.prid = jcp.prid
WHERE jcp.custid = 123
I can't say how much that would help your performance without knowing the context of your data, but it certainly wouldn't hurt.
Given the difference in the two data sets, it doesn't appear possible to avoid the union if you want to force these two different bits of data into the same column. If you were to put them into different columns, you could probably avoid the union.
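For comparison, here is a small sketch of what the query computes when read exactly as printed in the question: the outer COUNT runs over the UNION, and UNION removes duplicates, so the result is the number of distinct prid values across jcp and jcl for that customer. It uses Python's built-in sqlite3 as a stand-in for the real database, and the sample rows are made up, so treat it as something to run against your own data before swapping one form for the other.

```python
import sqlite3

# SQLite stands in for the real database; the rows below are invented test data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pr  (prid INTEGER PRIMARY KEY);
    CREATE TABLE jcp (custid INTEGER, prid INTEGER);
    CREATE TABLE jcl (custid INTEGER, prid INTEGER);
    INSERT INTO pr  VALUES (1), (2), (3);
    INSERT INTO jcp VALUES (123, 1), (123, 2);
    INSERT INTO jcl VALUES (123, 2), (123, 3), (999, 1);
""")

# The query as printed in the question: count over the de-duplicated union.
union_count = conn.execute("""
    SELECT COUNT(t.prid)
    FROM (SELECT pr.prid FROM jcp INNER JOIN pr ON pr.prid = jcp.prid
          WHERE jcp.custid = 123
          UNION
          SELECT pr.prid FROM jcl INNER JOIN pr ON pr.prid = jcl.prid
          WHERE jcl.custid = 123) AS t
""").fetchone()[0]

# The single-table count, which looks only at jcp.
jcp_only_count = conn.execute("""
    SELECT COUNT(pr.prid)
    FROM jcp
    INNER JOIN pr ON pr.prid = jcp.prid
    WHERE jcp.custid = 123
""").fetchone()[0]

print(union_count, jcp_only_count)
```

With these sample rows the two counts differ, because prid 3 appears only in jcl and prid 2 appears in both tables.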
Let's say I have the following query:
SELECT occurs.*, events.*
FROM occurs
INNER JOIN events ON (events.event_id = occurs.event_id)
WHERE events.event_state = 'visible'
Another way to do the same query and get the same results would be:
SELECT occurs.*, events.*
FROM occurs
INNER JOIN events ON (events.event_id = occurs.event_id
AND events.event_state = 'visible')
My question. Is there a real difference? Is one way faster than the other? Why would I choose one way over the other?
For an INNER JOIN, there's no conceptual difference between putting a condition in ON and in WHERE. It's a common practice to use ON for conditions that connect a key in one table to a foreign key in another table, such as your event_id, so that other people maintaining your code can see how the tables relate.
If you suspect that your database engine is mis-optimizing a query plan, you can try it both ways. Make sure to time the query several times to isolate the effect of caching, and make sure to run ANALYZE TABLE occurs and ANALYZE TABLE events to provide more info to the optimizer about the distribution of keys. If you do find a difference, have the database engine EXPLAIN the query plans it generates. If there's a gross mis-optimization, you can create an Oracle account and file a feature request against MySQL to optimize a particular query better.
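If you want to convince yourself that the two INNER JOIN forms return the same rows, here is a minimal sketch. It uses Python's built-in sqlite3 instead of MySQL, and the occurs.occur_id column and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; occur_id and the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, event_state TEXT);
    CREATE TABLE occurs (occur_id INTEGER PRIMARY KEY, event_id INTEGER);
    INSERT INTO events VALUES (1, 'visible'), (2, 'hidden');
    INSERT INTO occurs VALUES (10, 1), (11, 1), (12, 2);
""")

# Filter condition placed in WHERE.
filter_in_where = conn.execute("""
    SELECT occurs.*, events.*
    FROM occurs
    INNER JOIN events ON (events.event_id = occurs.event_id)
    WHERE events.event_state = 'visible'
    ORDER BY occurs.occur_id
""").fetchall()

# Same filter condition placed in ON.
filter_in_on = conn.execute("""
    SELECT occurs.*, events.*
    FROM occurs
    INNER JOIN events ON (events.event_id = occurs.event_id
                          AND events.event_state = 'visible')
    ORDER BY occurs.occur_id
""").fetchall()

print(filter_in_where == filter_in_on)
```

This only checks the result rows, not the timing; for that you still need EXPLAIN and repeated measurements on your own server.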
But for a LEFT JOIN, there's a big difference. A LEFT JOIN is often used to add details from a separate table if the details exist or return the rows without details if they do not. This query will return result rows with NULL values for b.* if no row of b matches both conditions:
SELECT a.*, b.*
FROM a
LEFT JOIN b
ON (condition_one
AND condition_two)
WHERE condition_three
Whereas this one will completely omit results that do not match condition_two:
SELECT a.*, b.*
FROM a
LEFT JOIN b ON condition_one
WHERE condition_two
AND condition_three
Code in this answer is dual licensed: CC BY-SA 3.0 or the MIT License as published by OSI.
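The LEFT JOIN difference described above is easy to reproduce. Here is a small sketch using Python's built-in sqlite3 in place of MySQL; the columns of a and b and the concrete conditions (b.a_id = a.id standing in for condition_one, b.status = 'active' for condition_two) are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; tables, columns and rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (a_id INTEGER, status TEXT);
    INSERT INTO a VALUES (1, 'has active detail'), (2, 'has inactive detail'), (3, 'no detail');
    INSERT INTO b VALUES (1, 'active'), (2, 'inactive');
""")

# Both conditions in ON: every row of a survives, b.* is NULL where nothing matches.
both_in_on = conn.execute("""
    SELECT a.id, b.status
    FROM a
    LEFT JOIN b ON (b.a_id = a.id AND b.status = 'active')
    ORDER BY a.id
""").fetchall()

# Second condition in WHERE: rows whose b.status is NULL or not 'active' are dropped.
second_in_where = conn.execute("""
    SELECT a.id, b.status
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    WHERE b.status = 'active'
    ORDER BY a.id
""").fetchall()

print(both_in_on)
print(second_in_where)
```

In the second query the WHERE clause effectively turns the LEFT JOIN back into an INNER JOIN, which is usually not what was intended.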
I'm working through the JOIN tutorial on SQL zoo.
Let's say I'm about to execute the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM game a
JOIN goal g
ON g.matchid = a.id
GROUP BY a.stadium
As it happens, it produces the same output as the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM goal g
JOIN game a
ON g.matchid = a.id
GROUP BY a.stadium
So then, when does it matter which table you assign at FROM and which one you assign at JOIN?
When you are using an INNER JOIN like you are here, the order doesn't matter. That is because an inner join keeps only the rows that have a match in both tables, so the result is the same whichever table you list first. You should pick the order that is most logical to you and easiest to read. A habit of mine is to put the table I'm selecting from first. In your case, you're selecting information about a stadium, which comes from the game table, so my preference would be to put that first.
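Here is a quick sketch that checks this with the two queries from the question. It uses Python's built-in sqlite3 rather than the SQLZoo database, and the stadium names and goal rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for the tutorial database; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (id INTEGER PRIMARY KEY, stadium TEXT);
    CREATE TABLE goal (matchid INTEGER, player TEXT);
    INSERT INTO game VALUES (1001, 'National Stadium'), (1002, 'City Arena');
    INSERT INTO goal VALUES (1001, 'p1'), (1001, 'p2'), (1002, 'p3');
""")

# game in FROM, goal in JOIN.
game_first = conn.execute("""
    SELECT a.stadium, COUNT(g.matchid)
    FROM game a
    JOIN goal g ON g.matchid = a.id
    GROUP BY a.stadium
    ORDER BY a.stadium
""").fetchall()

# goal in FROM, game in JOIN.
goal_first = conn.execute("""
    SELECT a.stadium, COUNT(g.matchid)
    FROM goal g
    JOIN game a ON g.matchid = a.id
    GROUP BY a.stadium
    ORDER BY a.stadium
""").fetchall()

print(game_first == goal_first)
```

The ORDER BY is only there so that the two result lists can be compared directly.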
In other joins, however, such as LEFT OUTER JOIN and RIGHT OUTER JOIN the order will matter. That is because these joins will select all rows from one table. Consider for example I have a table for Students and a table for Projects. They can exist independently, some students may have an associated project, but not all will.
If I want to get all students and project information while still seeing students without projects, I need a LEFT JOIN:
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id;
Note here that the LEFT JOIN refers to the table in the FROM clause, which means that ALL students are selected. This also means that p.project will be null for some rows. Order matters here.
If I took the same concept with a RIGHT JOIN, it will select all rows from the table in the join clause. So if I changed the query to this:
SELECT s.name, p.project
FROM student s
RIGHT JOIN project p ON p.student_id = s.id;
This will return all rows from the project table, regardless of whether or not it has a match for students. This means that in some rows, s.name will be null. Similar to the first example, because I've made project the outer joined table, p.project will never be null (assuming it isn't in the original table). In the first example, s.name should never be null.
In the case of outer joins, order will matter. Thankfully, you can think intuitively with LEFT and RIGHT joins. A left join will return all rows in the table to the left of that statement, while a right join returns all rows from the right of that statement. Take this as a rule of thumb, but be careful. You might want to develop a pattern to be consistent with yourself, as I mentioned earlier, so these queries are easier for you to understand later on.
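To see the effect of order with outer joins, here is a minimal sketch of the student/project example using Python's built-in sqlite3. One assumption to flag: RIGHT JOIN is only available in newer SQLite versions, so the second query is written as a LEFT JOIN with the tables swapped, which preserves the same table (project) as the RIGHT JOIN above would. The sample rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (student_id INTEGER, project TEXT);
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO project VALUES (1, 'Robot'), (NULL, 'Orphan project');
""")

# All students are kept; p.project is NULL for students without a project.
students_preserved = conn.execute("""
    SELECT s.name, p.project
    FROM student s
    LEFT JOIN project p ON p.student_id = s.id
    ORDER BY s.id
""").fetchall()

# All projects are kept (same rows as student RIGHT JOIN project);
# s.name is NULL for projects without a student.
projects_preserved = conn.execute("""
    SELECT s.name, p.project
    FROM project p
    LEFT JOIN student s ON p.student_id = s.id
    ORDER BY p.project
""").fetchall()

print(students_preserved)
print(projects_preserved)
```

Bob shows up only in the first result and the orphan project only in the second, which is exactly the asymmetry described above.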
When you only JOIN 2 tables, usually the order does not matter: MySQL scans the tables in the optimal order.
When you join more than 2 tables, the order could matter:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
MySQL tries to join the tables in the order that its optimizer estimates to be cheapest. But if a join is slow, it is possible that MySQL has picked a non-optimal order. You can verify this with EXPLAIN. In this case, you can force the join order by adding the STRAIGHT_JOIN keyword.
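The idea of inspecting and then pinning the join order can be tried out locally. The sketch below is an analogy rather than MySQL itself: it uses Python's built-in sqlite3, where EXPLAIN QUERY PLAN plays the role of EXPLAIN and CROSS JOIN (which SQLite does not reorder) plays the role of STRAIGHT_JOIN. The tables big and small are hypothetical.

```python
import sqlite3

# SQLite analogy only: MySQL's STRAIGHT_JOIN is not available here, so
# SQLite's CROSS JOIN (never reordered by its planner) is swapped in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE small (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE big (id INTEGER PRIMARY KEY, small_id INTEGER);
""")

def first_plan_step(sql):
    """Return the detail text of the first plan row, i.e. the outermost loop."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]

# The table written on the left of CROSS JOIN is forced to be the outer loop.
big_outer = first_plan_step(
    "SELECT big.id, small.label FROM big CROSS JOIN small ON small.id = big.small_id"
)
small_outer = first_plan_step(
    "SELECT big.id, small.label FROM small CROSS JOIN big ON small.id = big.small_id"
)

print(big_outer)
print(small_outer)
```

In MySQL you would run EXPLAIN on the original query first and only reach for STRAIGHT_JOIN if the reported order is clearly the problem.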
The order doesn't always matter. I usually just order the tables in a way that makes sense to someone reading the query.
Sometimes order does matter. Try it with LEFT JOIN and RIGHT JOIN.
In this instance you are using an INNER JOIN. If you're expecting a match on a common ID or foreign key, it probably doesn't matter too much.
You would however need to specify the tables the correct way round if you were performing an OUTER JOIN, as not all records in this type of join are guaranteed to match via the same field.
Yes, it will matter when you use another kind of join, such as LEFT JOIN or RIGHT JOIN.
Currently you are using an INNER JOIN (a plain JOIN), which returns only the rows that match in both tables; if a row has no match in the joined table, it is excluded from the result.
If you use a LEFT / RIGHT [OUTER] JOIN, the result will be different.
MySQL setup: step by step.
programs -> linked to --> speakers (by program_id)
At this point, it's easy for me to query all the data:
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
Nice and easy.
The trick for me is this. My speakers table is also linked to a third table, "books." So in the "speakers" table, I have "book_id" and in the "books" table, the book_id is linked to a name.
I've tried this (including a WHERE you'll notice):
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
LIMIT 5
No results.
My questions:
What am I doing wrong?
What's the most efficient way to make this query?
Basically, I want to get back all the programs data and the books data, but instead of the book_id, I need it to come back as the book name (from the 3rd table).
Thanks in advance for your help.
UPDATE:
(rather than opening a brand new question)
The left join worked for me. However, I have a new problem: multiple books can be assigned to a single speaker.
Using the left join returns two rows! What do I need to add to return only a single row that still lists the two books separately?
Is there any chance that the books table doesn't have any matching rows for speakers.book_id?
Try using a left join which will still return the program/speaker combinations, even if there are no matches in books.
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
LEFT JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
LIMIT 5
Btw, could you post the table schemas for all tables involved, and exactly what output (or reasonable representation) you'd expect to get?
Edit: Response to op author comment
You can use GROUP BY and GROUP_CONCAT to put all the books on one row.
e.g.
SELECT speakers.speaker_id,
speakers.speaker_name,
programs.program_id,
programs.program_name,
group_concat(books.book_name)
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
LEFT JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
GROUP BY speakers.speaker_id
LIMIT 5
Note: since I don't know the exact column names, these may be off
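Here is a runnable sketch of the GROUP BY / group_concat idea, using Python's built-in sqlite3 (which also has group_concat) in place of MySQL. The schema is a guess: column names follow the query above, and I am assuming the speakers table holds one row per speaker and book pairing, which is one way a speaker could end up with two books.

```python
import sqlite3

# SQLite stands in for MySQL; the schema and rows are assumptions, since the
# real table definitions were not posted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE programs (program_id INTEGER PRIMARY KEY, program_name TEXT, category_id INTEGER);
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_name TEXT);
    -- Assumption: one speakers row per (speaker, book) pairing.
    CREATE TABLE speakers (speaker_id INTEGER, speaker_name TEXT, program_id INTEGER, book_id INTEGER);
    INSERT INTO programs VALUES (1, 'Morning Show', 1);
    INSERT INTO books VALUES (1, 'First Book'), (2, 'Second Book');
    INSERT INTO speakers VALUES (7, 'Alice', 1, 1), (7, 'Alice', 1, 2), (8, 'Bob', 1, NULL);
""")

rows = conn.execute("""
    SELECT speakers.speaker_id,
           speakers.speaker_name,
           programs.program_id,
           programs.program_name,
           group_concat(books.book_name) AS book_names
    FROM programs
    JOIN speakers ON programs.program_id = speakers.program_id
    LEFT JOIN books ON speakers.book_id = books.book_id
    WHERE programs.category_id = 1
    GROUP BY speakers.speaker_id, speakers.speaker_name,
             programs.program_id, programs.program_name
    ORDER BY speakers.speaker_id
    LIMIT 5
""").fetchall()

for row in rows:
    print(row)
```

Listing every non-aggregated column in GROUP BY keeps the query valid when MySQL runs with ONLY_FULL_GROUP_BY enabled. A speaker without any book still gets a row, with NULL in place of the book list.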
That's typically efficient. There is some kind of assumption you are making that isn't true. Do your speakers have books assigned? If they don't, that last JOIN should be a LEFT JOIN.
This kind of query is typically pretty efficient, since you almost certainly have primary keys as indexes. The main issue would be whether your indexes are covering (which is more likely to occur if you don't use SELECT *, but instead select only the columns you need).
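To illustrate the point about covering indexes, here is a small sketch using Python's built-in sqlite3 in place of MySQL. The index idx_speakers_program_book and the column layout of speakers are made up for illustration; the plan text is SQLite's, whereas MySQL's EXPLAIN would show "Using index" in the Extra column for the covered case.

```python
import sqlite3

# SQLite stands in for MySQL; the index and column layout are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE speakers (speaker_id INTEGER PRIMARY KEY, speaker_name TEXT,
                           program_id INTEGER, book_id INTEGER);
    CREATE INDEX idx_speakers_program_book ON speakers (program_id, book_id);
""")

def plan_details(sql):
    """Return the detail text of every EXPLAIN QUERY PLAN row for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Only indexed columns are selected, so the index alone can answer the query.
narrow_plan = plan_details("SELECT program_id, book_id FROM speakers WHERE program_id = 1")

# SELECT * also needs speaker_name, which forces a lookup in the table itself.
star_plan = plan_details("SELECT * FROM speakers WHERE program_id = 1")

print(narrow_plan)
print(star_plan)
```

Both queries use the index to find the rows; the difference is only whether the table has to be visited afterwards to fetch the remaining columns.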