MySQL Left Join (I think) Difficulty - mysql

I have the following query:
SELECT
u.username as username,
s.campaignno as campaign,
if(f.hometeamscore>f.awayteamscore,1,0) as Win,
if(f.hometeamscore=f.awayteamscore,1,0) as Draw,
if(f.hometeamscore<f.awayteamscore,1,0) as Loss,
f.hometeamscore as Goals,
ss.seasonid as Season,
av.avatar as Avatar
FROM
avatar_avatar av,
straightred_fixture f,
straightred_userselection s,
auth_user u,
straightred_season ss
WHERE
av.user_id = u.id
AND ss.seasonid = 1025
AND f.soccerseasonid = ss.seasonid
AND s.fixtureid = f.fixtureid
AND s.teamselectionid = f.hometeamid
AND s.user_id = u.id;
This query works as expected, but I have now realised that a user may not have uploaded a profile picture, so the condition av.user_id = u.id excludes anyone who has NOT uploaded one. After reading https://www.w3schools.com/sql/sql_join.asp I feel I need to use a left join, but I just keep going around in circles and getting nowhere.
Any guidance on this would be greatly appreciated, many thanks, Alan.

First and foremost: avoid implicit JOINs. With explicit JOINs it is much clearer which table relates to which, and you will never forget one of the AND conditions in your WHERE and end up with a cartesian product.
Second: try to put your tables in the FROM using an order that follows a certain logic. In your case, you seem to start looking for ss.seasonid = 1025... (it's the only condition on the WHERE having a constant). Then, your list of conditions produces a certain logical order... Each table in the FROM has a relationship with the previous one...
That said, I think you need this query:
SELECT
u.username as username,
s.campaignno as campaign,
if(f.hometeamscore>f.awayteamscore,1,0) as Win,
if(f.hometeamscore=f.awayteamscore,1,0) as Draw,
if(f.hometeamscore<f.awayteamscore,1,0) as Loss,
f.hometeamscore as Goals,
ss.seasonid as Season,
av.avatar as Avatar
FROM
straightred_season ss
JOIN straightred_fixture f
ON f.soccerseasonid = ss.seasonid
JOIN straightred_userselection s
ON s.fixtureid = f.fixtureid AND s.teamselectionid = f.hometeamid
JOIN auth_user u
ON u.id = s.user_id
-- This last table is the one that needs to be LEFT-joined
-- if the avatar is *optional*. If it isn't there, av.avatar will just
-- be shown as NULL
LEFT JOIN avatar_avatar av
ON av.user_id = u.id
WHERE
ss.seasonid = 1025 ;
If the content of more tables is optional, you may need more than one LEFT JOIN. In order to find out what makes sense, we would need to have the full data model, or the ERD, that represents your scenario. That is, which relationships are 1 to 1, which are 1 to Many, which are 1 to (0 or 1), which are Many-to-Many, etc.
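The difference the LEFT JOIN makes can be checked on a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than MySQL, and the two users and the single avatar row are made-up sample data; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE avatar_avatar (user_id INTEGER, avatar TEXT);
    INSERT INTO auth_user VALUES (1, 'alan'), (2, 'beth');
    -- Assumption for the demo: only alan has uploaded a picture.
    INSERT INTO avatar_avatar VALUES (1, 'alan.png');
""")

# Inner join: users without an avatar row disappear.
inner_rows = conn.execute("""
    SELECT u.username, av.avatar
    FROM auth_user u
    JOIN avatar_avatar av ON av.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Left join: every user is kept, missing avatars come back as NULL (None).
left_rows = conn.execute("""
    SELECT u.username, av.avatar
    FROM auth_user u
    LEFT JOIN avatar_avatar av ON av.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(inner_rows)  # [('alan', 'alan.png')]
print(left_rows)   # [('alan', 'alan.png'), ('beth', None)]
```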

I'm a fan of using JOINs, so I rewrote your query like this.
Be advised, however, that I use SQL Server / Oracle and not MySQL, so I am not sure my syntax is correct. I use the IFNULL function because, at least in my world, referencing a column whose row is missing can cause the entire result row to be filtered out.
Also, by moving ss.seasonid = 1025 into the join rather than leaving it in the WHERE, you should get results regardless of whether a matching ss record exists.
That said, this should resolve your issues:
EDIT - replace ISNULL with IFNULL
select
u.username as username
,s.campaignno as campaign
,if(ifnull(f.hometeamscore,0)>ifnull(f.awayteamscore,0),1,0) as Win
,if(ifnull(f.hometeamscore,0)=ifnull(f.awayteamscore,-1),1,0) as Draw
,if(ifnull(f.hometeamscore,0)<ifnull(f.awayteamscore,0),1,0) as Loss
,f.hometeamscore as Goals
,ss.seasonid as Season
,av.avatar as Avatar
from
auth_user u
Left Join
avatar_avatar av on u.id = av.user_id
Left Join
straightred_userselection s on u.id = s.user_id
Left Join
straightred_fixture f on f.hometeamid = s.teamselectionid
and f.fixtureid = s.fixtureid
Left Join
straightred_season ss on f.soccerseasonid = ss.seasonid
and ss.seasonid = 1025
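One thing worth checking before adopting this version: a filter placed in the ON clause of a LEFT JOIN does not remove rows from the left-hand table, it only decides which right-hand rows get attached. The sketch below shows the difference using Python's sqlite3 module; the `selection` table and its rows are a simplified, hypothetical stand-in for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    -- Hypothetical, simplified table: one selection per user per season.
    CREATE TABLE selection (user_id INTEGER, seasonid INTEGER);
    INSERT INTO auth_user VALUES (1, 'alan'), (2, 'beth');
    INSERT INTO selection VALUES (1, 1025), (2, 1024);
""")

# Filter in ON: beth is kept, but her season column is NULL.
filter_in_on = conn.execute("""
    SELECT u.username, s.seasonid
    FROM auth_user u
    LEFT JOIN selection s ON s.user_id = u.id AND s.seasonid = 1025
    ORDER BY u.id
""").fetchall()

# Filter in WHERE: rows without a season-1025 match are dropped.
filter_in_where = conn.execute("""
    SELECT u.username, s.seasonid
    FROM auth_user u
    LEFT JOIN selection s ON s.user_id = u.id
    WHERE s.seasonid = 1025
    ORDER BY u.id
""").fetchall()

print(filter_in_on)     # [('alan', 1025), ('beth', None)]
print(filter_in_where)  # [('alan', 1025)]
```

Which of the two behaviours is wanted depends on whether users with nothing in season 1025 should appear in the report at all.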

Related

how do i join third table values into main join?

I have this query (fiddle: https://www.db-fiddle.com/f/32Kc3QisUEwmSM8EmULpgd/1):
SELECT p.prank, d.dare
FROM dares d
INNER JOIN pranks p ON p.id = d.prank_id
WHERE d.condo_id = 1;
I have one condo with id 1. It is linked to dares (which in turn link to pranks) and, separately, to condos_pranks.
I want all unique pranks from both tables. The query above gives me the dares-to-pranks relation, and its result (L, M, N with Yes, No, Maybe) is correct, but I also want the pranks in condos_pranks, whose ids are 1, 4, 5, 6 = L, O, P, Q.
So I tried adding that table with a LEFT JOIN, because there might not be a condos_pranks row:
SELECT p.prank, d.dare
FROM dares d
INNER JOIN pranks p ON p.id = d.prank_id
LEFT JOIN condos_pranks pd ON pd.condo_id = d.condo_id AND pd.prank_id = p.id
WHERE d.condo_id = 1;
But the result is the same as for the first query. What I want is:
prank | dare
L     | Yes
M     | No
N     | Maybe
O     | No
P     | No
Q     | No
with the default being No (= 2) if the prank_id from condos_pranks is not in dares.
How do I connect it?
This seems like an exercise in identifying extraneous information more than anything. You cannot join to a table on a key it does not have. However, if you know your default, you can use something like COALESCE: wherever the LEFT JOIN found no matching row the column is NULL, and COALESCE replaces that NULL with your default.
I mentioned in a comment above that this table schema makes little sense. You have keys all over the place, with all sorts of circular references. If this is your own schema, consider stopping here and revisiting the relationships. If it is not, and it is an educational exercise, which I suspect it is, disregard that advice but recognize the logical flaws in what you are working with. Perhaps consider taking the data provided and creating a new, more normalized table schema that uses separate tables to handle the many-to-many and one-to-many relationships.
dbfiddle
SELECT
pranks.prank,
COALESCE(dares.dare, 'No')
FROM pranks LEFT OUTER JOIN
dares ON pranks.id = dares.prank_id
ORDER BY pranks.prank ASC;
clearlyclueless gave correct explanations
To achieve the result, the following SELECT can also be used:
SELECT
pranks.prank,
case
when dare is null then 'No'
else dare
end
FROM pranks LEFT OUTER JOIN
dares ON pranks.id = dares.prank_id
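Both variants can be tried out with Python's sqlite3 module. The rows below are an assumption reconstructed from the description in the question (the fiddle data itself is not shown here), and the condo_id columns are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pranks (id INTEGER PRIMARY KEY, prank TEXT);
    CREATE TABLE dares (prank_id INTEGER, dare TEXT);
    -- Assumed sample data: six pranks, only L, M and N have a dare.
    INSERT INTO pranks VALUES (1,'L'), (2,'M'), (3,'N'), (4,'O'), (5,'P'), (6,'Q');
    INSERT INTO dares VALUES (1,'Yes'), (2,'No'), (3,'Maybe');
""")

coalesce_rows = conn.execute("""
    SELECT pranks.prank, COALESCE(dares.dare, 'No') AS dare
    FROM pranks
    LEFT OUTER JOIN dares ON pranks.id = dares.prank_id
    ORDER BY pranks.prank ASC
""").fetchall()

case_rows = conn.execute("""
    SELECT pranks.prank,
           CASE WHEN dares.dare IS NULL THEN 'No' ELSE dares.dare END AS dare
    FROM pranks
    LEFT OUTER JOIN dares ON pranks.id = dares.prank_id
    ORDER BY pranks.prank ASC
""").fetchall()

print(coalesce_rows)
# [('L', 'Yes'), ('M', 'No'), ('N', 'Maybe'), ('O', 'No'), ('P', 'No'), ('Q', 'No')]
```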

SQL JOIN query needs over 15s to run

I have a pretty big SQL query to get data from multiple database tables. I use the ON condition to check that the guild_ids are always the same, and in some cases it also checks for a user_id.
That is my query:
SELECT
SUM( f.guild_id = 787672220503244800 AND f.winner_id LIKE '%841827102331240468%' ) AS guild_winner,
SUM( f.winner_id LIKE '%841827102331240468%' ) AS win_sum,
m.message_count,
r.bypass_role_id,
i.real_count,
i.total_count,
i.bonus_count,
i.left_count
FROM
guild_finished_giveaways AS f
JOIN guild_message_count AS m
JOIN guild_role_settings AS r
JOIN guild_invite_count AS i ON m.guild_id = f.guild_id
AND m.user_id = 841827102331240468
AND r.guild_id = f.guild_id
AND i.guild_id = f.guild_id
AND i.user_id = m.user_id
But it runs pretty slow, with over 15s. I can't see why it needs so long.
I figured out that if I remove the "guild_invite_count" JOIN, it's pretty fast again. Do I have some simple error here that I don't see? Or what could be the issue?
Each JOIN expression needs its own ON. Don't wait until the end for this. As it was, the server was forced to build up a cartesian product of all those tables before narrowing them down again, and I'm surprised the query ran at all (I'd expect a syntax error for missing ON clauses).
FROM guild_finished_giveaways AS f
JOIN guild_message_count AS m ON m.guild_id = f.guild_id
JOIN guild_role_settings AS r ON r.guild_id = f.guild_id
JOIN guild_invite_count AS i ON i.guild_id = f.guild_id
AND i.user_id = m.user_id
WHERE m.user_id = 841827102331240468
It's also more than a little odd to use SUM() or any other aggregate function in the same query as non-aggregated values without a GROUP BY clause.
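To get a feel for how quickly a join without a condition grows, here is a small sketch using Python's sqlite3 module (SQLite, like MySQL, accepts a JOIN with no ON and treats it as a cross join). The table names f, m and r mirror the aliases in the question; the three rows per table are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f (guild_id INTEGER);
    CREATE TABLE m (guild_id INTEGER, user_id INTEGER);
    CREATE TABLE r (guild_id INTEGER);
    INSERT INTO f VALUES (1), (2), (3);
    INSERT INTO m VALUES (1, 100), (2, 100), (3, 100);
    INSERT INTO r VALUES (1), (2), (3);
""")

# No join conditions at all: every row is paired with every other row.
no_conditions = conn.execute(
    "SELECT COUNT(*) FROM f JOIN m JOIN r"
).fetchone()[0]

# One ON per JOIN: only rows of the same guild are paired.
with_conditions = conn.execute("""
    SELECT COUNT(*)
    FROM f
    JOIN m ON m.guild_id = f.guild_id
    JOIN r ON r.guild_id = f.guild_id
""").fetchone()[0]

print(no_conditions)    # 27  (3 * 3 * 3)
print(with_conditions)  # 3
```

With three rows per table the difference is harmless; with thousands of rows per table a forgotten condition multiplies out very quickly.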
Are you using InnoDB?
Does every table have a PRIMARY KEY?
These may help:
m: PRIMARY KEY(user_id) -- assuming that is unique in that table
f: INDEX(guild_id, winner_id)
r: INDEX(guild_id, bypass_role_id)
i: INDEX(user_id,)
It looks like some tables should not be separate -- perhaps r,i,f could be combined? (I need to see SHOW CREATE TABLE to say more.)
Do NOT store a comma-separated list in winner_id. Instead, have another table with one row per winner per game (or whatever it is a winner of), perhaps with just two columns, like a many-to-many mapping table.
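A rough sketch of that mapping-table idea, using Python's sqlite3 module. The table names `giveaways_packed` and `giveaway_winner` and the short ids are hypothetical; the point is only that an equality test on its own column is exact and indexable, while LIKE '%...%' on a packed list can match a fragment of a different id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Current shape: several winners packed into one text column.
    CREATE TABLE giveaways_packed (giveaway_id INTEGER, winner_id TEXT);
    INSERT INTO giveaways_packed VALUES (1, '23,77'), (2, '123,45');

    -- Hypothetical mapping table: one row per (giveaway, winner).
    CREATE TABLE giveaway_winner (giveaway_id INTEGER, winner_id INTEGER);
    CREATE INDEX idx_giveaway_winner ON giveaway_winner (winner_id);
    INSERT INTO giveaway_winner VALUES (1, 23), (1, 77), (2, 123), (2, 45);
""")

# Substring match: also hits '123,45', although user 23 did not win giveaway 2.
like_wins = conn.execute(
    "SELECT COUNT(*) FROM giveaways_packed WHERE winner_id LIKE '%23%'"
).fetchone()[0]

# Exact match on the mapping table.
exact_wins = conn.execute(
    "SELECT COUNT(*) FROM giveaway_winner WHERE winner_id = 23"
).fetchone()[0]

print(like_wins)   # 2
print(exact_wins)  # 1
```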
Noting that the execution is likely to start with m and then go next to i let's improve on Joel's suggestion:
FROM guild_message_count AS m
JOIN guild_invite_count AS i ON i.user_id = m.user_id
JOIN guild_finished_giveaways AS f ON f.guild_id = m.guild_id
JOIN guild_role_settings AS r ON r.guild_id = m.guild_id
WHERE m.user_id = 841827102331240468
Note that 3 tables are joined on guild_id, but only 2 equality (=) conditions are needed for that.
SUM without GROUP BY sums up the entire resultset (after JOINing). But you have 6 non-aggregates, so you need to GROUP BY all 6.
But that may lead to grossly inflated sums. Maybe you need to do the aggregation just over f first since that is where you are summing. Then JOIN to the rest??
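The inflation, and the "aggregate first, then JOIN" fix, can be sketched with Python's sqlite3 module. The `won` column and all rows are invented for the demo; in the real query the SUM is over the LIKE expressions on f.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f (guild_id INTEGER, won INTEGER);          -- finished giveaways
    CREATE TABLE r (guild_id INTEGER, bypass_role_id INTEGER);
    INSERT INTO f VALUES (1, 1), (1, 1);                     -- two wins in guild 1
    INSERT INTO r VALUES (1, 501), (1, 502), (1, 503);       -- three role rows
""")

# Summing after the join counts each win once per matching r row.
inflated = conn.execute("""
    SELECT SUM(f.won)
    FROM f
    JOIN r ON r.guild_id = f.guild_id
""").fetchone()[0]

# Aggregate f on its own first, then join the one-row-per-guild result.
pre_aggregated = conn.execute("""
    SELECT fa.wins, r.bypass_role_id
    FROM (SELECT guild_id, SUM(won) AS wins FROM f GROUP BY guild_id) AS fa
    JOIN r ON r.guild_id = fa.guild_id
    ORDER BY r.bypass_role_id
""").fetchall()

print(inflated)        # 6
print(pre_aggregated)  # [(2, 501), (2, 502), (2, 503)]
```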

How the SQL join actually works?

Suppose I have two tables, a Gardners table and a Plantings table.
Suppose my query is:
SELECT gid, first_name, last_name, pid, gardener_id, plant_name
FROM Gardners
INNER JOIN Plantings
ON gid = gardener_id
I want to know how exactly it works internally?
As per my understanding in every join condition:
Each row from Gardner Table will be compared with each row of Plantings Table. If the condition is matched then it will print out. Is my understanding correct?
In terms of program if you think:
for i in [Gardners Table]:
    for j in [Plantings Table]:
        if i.gid == j.gardener_id:
            print <>
Now suppose if you query is something like:
User(uid,uname,ucity,utimedat)
Follows(uid,followerid) // uid and followerid are pointing to `uid` of `User`.
SELECT U.uid, U.uname FROM User U1 JOIN Follows F,User U2 WHERE
F.followerid = U2.uiddate AND U2.ucity = 'D'
How the join condition will work internally here? Is it equivalent to:
for i in [USER]:
    for j in [Follows]:
        for k in [USER]:
            if condition:
                print <>
Your example with the Gardners and Plantings tables is correct, but the example with users is not so obvious.
I think what you want to get is a user's followers from some city.
Assuming correct query is:
SELECT U1.uid, U2.uname
FROM User U1
JOIN Follows F ON U1.uid = F.uid
JOIN User U2 ON F.followerid = U2.uid
WHERE U2.ucity = 'D'
Then in pseudo code it'll look like this:
for i in [User Table]:
    for j in [Follows Table]:
        if i.uid = j.uid:
            for k in [User Table]:
                if j.followerid = k.uid and k.ucity = 'D':
                    print <>
SQL Fiddle for this: http://sqlfiddle.com/#!9/caeb1e/5
A very good picture of how joins actually work can be found here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
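The nested-loop pseudocode from this answer can be made runnable as plain Python over lists of dicts. This is only a conceptual model of the three-table join, not a description of what MySQL does internally, and the sample rows are invented.

```python
# Toy data standing in for the User and Follows tables (made-up rows).
users = [
    {"uid": 1, "uname": "ann", "ucity": "A"},
    {"uid": 2, "uname": "bob", "ucity": "D"},
    {"uid": 3, "uname": "cy", "ucity": "D"},
]
follows = [
    {"uid": 1, "followerid": 2},
    {"uid": 1, "followerid": 3},
    {"uid": 2, "followerid": 1},
]


def followers_from_city(users, follows, city):
    """Naive nested-loop version of:
    SELECT U1.uid, U2.uname
    FROM User U1
    JOIN Follows F ON U1.uid = F.uid
    JOIN User U2 ON F.followerid = U2.uid
    WHERE U2.ucity = city
    """
    result = []
    for u1 in users:
        for f in follows:
            if u1["uid"] == f["uid"]:
                for u2 in users:
                    if f["followerid"] == u2["uid"] and u2["ucity"] == city:
                        result.append((u1["uid"], u2["uname"]))
    return result


rows = followers_from_city(users, follows, "D")
print(rows)  # [(1, 'bob'), (1, 'cy')]
```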
In your second query, it's not clear what you're trying to do exactly as the syntax is erroneous; but if I were to guess, it seems like your intention is to join User U1 with a sub query of (implicit) join between Followers F and User U2.
If my guess is correct, the query would properly look more like this:
SELECT U1.uid, U1.uname
FROM User U1 JOIN
(SELECT U2.uid
FROM Followers F,User U2
WHERE F.followerid = U2.uiddate AND U2.ucity = 'D') T
WHERE u1.uid = T.uid
Which is not a 'best practice' way of writing the query either (you should use explicit joins, there's no need for a sub-query but you can just join three times, and so on)
But I wrote it this way to keep it closest to your original query.
And if my guess is correct, then your pseudo code would be more like:
for u2 in [User 2 where condition]:
    for f in [Follows]:
        if f.uid == u2.uid:
            SELECT uid AS T
for u1 in [User 1]:
    if u1.uid == T.uid:
        print <>
However, this is not a complete picture. One key to understanding SQL is to think in terms of sets of data being filtered rather than objects being selected one at a time, because SQL operates on whole sets, which you might not be used to.
So several of the steps above are conceptually executed in one go rather than sequentially. Other than that, look at the answer given by Yuri Tkachenko above for how to view joins; the internals come second to writing correct joins.
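As an illustration that the engine is free to pick a strategy other than row-by-row nested loops, here is a sketch of a hash join next to a nested-loop join for the Gardners/Plantings example. Both produce the same rows; which strategy a particular database actually chooses depends on the engine, the indexes and the query, so treat this purely as a model. The sample tuples are invented.

```python
from collections import defaultdict

# Made-up rows: (gid, first_name) and (pid, gardener_id, plant_name).
gardeners = [(1, "Ada"), (2, "Bo"), (3, "Cy")]
plantings = [(10, 1, "rose"), (11, 1, "tulip"), (12, 3, "fern")]


def nested_loop_join(gardeners, plantings):
    """Compare every gardener with every planting."""
    out = []
    for gid, first_name in gardeners:
        for pid, gardener_id, plant_name in plantings:
            if gid == gardener_id:
                out.append((gid, first_name, pid, plant_name))
    return out


def hash_join(gardeners, plantings):
    """Build a lookup table on one side, then probe it once per row of the other."""
    by_gardener = defaultdict(list)
    for pid, gardener_id, plant_name in plantings:  # build phase
        by_gardener[gardener_id].append((pid, plant_name))
    out = []
    for gid, first_name in gardeners:               # probe phase
        for pid, plant_name in by_gardener.get(gid, []):
            out.append((gid, first_name, pid, plant_name))
    return out


loop_rows = nested_loop_join(gardeners, plantings)
hash_rows = hash_join(gardeners, plantings)
print(loop_rows == hash_rows)  # True
print(hash_rows)
# [(1, 'Ada', 10, 'rose'), (1, 'Ada', 11, 'tulip'), (3, 'Cy', 12, 'fern')]
```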
Yes, your understanding is correct, as long as you are talking about a plain (inner) join and not the other join types, such as outer joins.

Why use letters in front of each value in MySQL query?

Why would I use letters in front of each value in my query like this?
In the database, each of these values is WITHOUT the letter in front.
SELECT c.client_id, c.client_name, c.contactperson, c.internal_comment,
IFNULL(r.region, 'Alle byer') as region, c.phone, c.email,
uu.fullname as changed_by,
(select count(p.project_id)
from projects p
where p.client_id = c.client_id and (p.is_deleted != 1 or p.is_deleted is null)
) as numProjects
FROM clients c LEFT JOIN users uu ON c.db_changed_by = uu.id
LEFT JOIN regions r ON c.region_id = r.region_id
WHERE (c.is_deleted != 1 or c.is_deleted is null)
I have tried looking it up, but I can't find it anywhere.
When in SQL you need to use more than one table for a query, you can do this:
SELECT person.name, vehicle.id FROM person, vehicle;
Or you can shorten it by giving each table an alias, like this:
SELECT p.name, v.id FROM person p, vehicle v;
It only reduces the length of the query, which is convenient.
By "letters in front", I assume you mean the qualifiers on the columns c., uu. and so on. They indicate the table where the column comes from. In a sense, they are part of the definition of the column.
This is your query:
SELECT c.client_id, c.client_name, c.contactperson, c.internal_comment,
IFNULL(r.region, 'Alle byer') as region, c.phone, c.email,
uu.fullname as changed_by,
(select count(p.project_id)
from projects p
where p.client_id = c.client_id and (p.is_deleted != 1 or p.is_deleted is null)
) as numProjects
FROM clients c LEFT JOIN
users uu
ON c.db_changed_by = uu.id LEFT JOIN
regions r
ON c.region_id = r.region_id
WHERE (c.is_deleted != 1 or c.is_deleted is null)
In some cases, these are needed. Consider the on clause:
ON c.region_id = r.region_id
If you leave them out, you have:
ON region_id = region_id
The SQL compiler cannot interpret this, because it does not know where region_id comes from. Is it from clients or regions? If you used this in the select, you would have the same issue -- and it makes a difference because of the left join. This is also true in the correlated subquery.
In general, it is good practice to qualify column names for several reasons:
The query is unambiguous.
You (and others) readily know where columns are coming from.
If you modify the query and add a new table/subquery, you don't have to worry about naming conflicts.
If the underlying tables are modified to have new column names that are shared with other tables, then the query will still compile.
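The ambiguity is easy to reproduce. The sketch below uses Python's sqlite3 module instead of MySQL, so the exact error text differs from what MySQL prints, and the two tables are cut down to a few columns with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER, client_name TEXT, region_id INTEGER);
    CREATE TABLE regions (region_id INTEGER, region TEXT);
    INSERT INTO clients VALUES (1, 'Acme', 10), (2, 'Bolt', NULL);
    INSERT INTO regions VALUES (10, 'Oslo');
""")

# Unqualified region_id exists in both tables, so the engine refuses to guess.
try:
    conn.execute("""
        SELECT region_id
        FROM clients c
        LEFT JOIN regions r ON c.region_id = r.region_id
    """)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# With qualifiers the same query is unambiguous.
rows = conn.execute("""
    SELECT c.client_name, IFNULL(r.region, 'Alle byer') AS region
    FROM clients c
    LEFT JOIN regions r ON c.region_id = r.region_id
    ORDER BY c.client_id
""").fetchall()

print(error)  # SQLite reports an "ambiguous column name" error here
print(rows)   # [('Acme', 'Oslo'), ('Bolt', 'Alle byer')]
```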
Suppose you are accessing 2 tables and both have a column with the same name, say 'Id'. In the query you can easily tell those columns apart using the letters, for example a.Id = b.Id if the first table has the alias 'a' and the second 'b'. Otherwise it would be very difficult to tell which column belongs to which table, especially when the tables share column names.

Optimizing a where statement MYSQL

I'm writing this complex query to return a large dataset of about 100,000 records. The query runs fine until I add this OR condition to the WHERE clause:
AND (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
Now, I understand that putting the OR condition in there adds a lot of overhead.
Without that statement and just:
AND responses.StrategyFk = strategies.Id
The query runs within 15 seconds, but doesn't return any records that have no FK linking to a strategy.
I would like those records as well. Is there an easier way to find both kinds of record with a simple WHERE clause? I can't just add another AND condition for the NULL records, because that would break the previous condition. I'm unsure where to go from here.
Here's the lower half of my query.
FROM
responses, subtestinstances, students, schools, items,
strategies, subtests
WHERE
subtestinstances.Id = responses.SubtestInstanceFk
AND subtestinstances.StudentFk = students.Id
AND students.SchoolFk = schools.Id
AND responses.ItemFk = items.Id
AND (responses.StrategyFk = strategies.Id Or responses.StrategyFk IS Null)
AND subtests.Id = subtestinstances.SubtestFk
try:
SELECT ... FROM
responses
JOIN subtestinstances ON subtestinstances.Id = responses.SubtestInstanceFk
JOIN students ON subtestinstances.StudentFk = students.Id
JOIN schools ON students.SchoolFk = schools.Id
JOIN items ON responses.ItemFk = items.Id
JOIN subtests ON subtests.Id = subtestinstances.SubtestFk
LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
That's it. No OR condition is really needed, because that's what a LEFT JOIN does in this case. Wherever responses.StrategyFk IS NULL there is no match in the strategies table, and the LEFT JOIN still returns the responses row, with NULLs in the strategies columns.
See this link for a simple explanation of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
After that, if you're still having performance issues then you can start looking at the EXPLAIN SELECT ... ; output and looking for indexes that may need to be added. Optimizing Queries With Explain -- MySQL Manual
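A likely reason the OR version is so heavy is that, in the comma-style join, a response with a NULL StrategyFk satisfies the OR for every row of strategies, so it is repeated once per strategy. The sketch below shows that on two tiny tables with Python's sqlite3 module; the rows are invented and the other tables of the query are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strategies (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE responses (Id INTEGER PRIMARY KEY, StrategyFk INTEGER);
    INSERT INTO strategies VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO responses VALUES (1, 1), (2, NULL);
""")

# Comma join + OR: response 2 (NULL FK) pairs with all three strategies.
or_rows = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses, strategies
    WHERE (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
    ORDER BY responses.Id, strategies.Id
""").fetchall()

# LEFT JOIN: exactly one row per response, NULL where there is no strategy.
left_rows = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses
    LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
    ORDER BY responses.Id
""").fetchall()

print(or_rows)    # [(1, 'a'), (2, 'a'), (2, 'b'), (2, 'c')]
print(left_rows)  # [(1, 'a'), (2, None)]
```

So the two forms are not just different in speed: the OR version also returns duplicated rows for every response without a strategy.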
Try using explicit JOINs:
...
FROM responses a
INNER JOIN subtestinstances b
ON b.id = a.subtestinstancefk
INNER JOIN students c
ON c.id = b.studentfk
INNER JOIN schools d
ON d.id = c.schoolfk
INNER JOIN items e
ON e.id = a.itemfk
INNER JOIN subtests f
ON f.id = b.subtestfk
LEFT JOIN strategies g
ON g.id = a.strategyfk