LEFT JOIN - narrow things down - mysql

I'm currently having a problem with a legacy app I just inherited at my new job. I have a SQL query that takes far too long to respond, and I need to find a way to speed it up.
This query acts on 3 tables:
SESSION contains all users visits
CONTACT contains all the messages people have been sending through a form and contains a "session_id" field that links back to the SESSION id field
ACCOUNT contains user accounts (people who registered on the website); its "id" field is referenced from SESSION (through the "SESSION.account_id" field). ACCOUNT and CONTACT are not linked in any way other than through the SESSION table (legacy app...).
I can't change this structure unfortunately.
My query tries to retrieve ALL the interesting sessions to serve to the administrator. I need to find all sessions that link back to an account OR a contact form.
Currently, the query is structured like this:
SELECT s.id
/* a few fields from ACCOUNT and CONTACT tables */
FROM session s
LEFT JOIN account act ON act.id = s.account_id
LEFT JOIN contact c on c.session_id = s.id
WHERE s.programme_id = :program_id
AND (
c.id IS NOT NULL
OR
act.id IS NOT NULL
)
The problem is that the SESSION table is growing pretty fast (as you can expect), and with 400k records it slows things down for some programs (:program_id in the query).
I tried to use a UNION query with two INNER JOIN queries, one between SESSION and ACCOUNT and the other between SESSION and CONTACT, but it doesn't give me the same number of records and I don't really understand why.
Can somebody help me find a better way to write this query?
Thanks a lot in advance.

I think you just need indexes. For this query:
SELECT s.id
/* a few fields from ACCOUNT and CONTACT tables */
FROM session s LEFT JOIN
account act
ON act.id = s.account_id LEFT JOIN
contact c
ON c.session_id = s.id
WHERE s.programme_id = :program_id AND
(c.id IS NOT NULL OR act.id IS NOT NULL);
You want indexes on session(programme_id, account_id, id), account(id) and contact(session_id).
It is important that programme_id be the first column in the index on session.
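As a rough illustration of the indexes suggested above, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL, so it can be run anywhere. The table definitions are a cut-down guess at the schema, and the index names (idx_session_prog_acct, idx_contact_session) are made up for the example. The CREATE INDEX statements themselves use syntax that MySQL also accepts; account(id) is assumed to be the primary key and therefore already indexed.

```python
import sqlite3

# Hypothetical minimal schema: only the columns named in the question are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY);
CREATE TABLE session (id INTEGER PRIMARY KEY, programme_id INTEGER, account_id INTEGER);
CREATE TABLE contact (id INTEGER PRIMARY KEY, session_id INTEGER);

-- programme_id comes first so the WHERE filter can use the index;
-- account_id and id follow so the joins can read them from the same index.
CREATE INDEX idx_session_prog_acct ON session (programme_id, account_id, id);
CREATE INDEX idx_contact_session ON contact (session_id);
""")

# List the indexes that were just created.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
)
print(index_names)
```

On MySQL, running EXPLAIN on the original query before and after adding the indexes is the usual way to check whether they are actually being used.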

@Gordon already suggested adding an index, which is generally the easy and effective solution, so I'm going to answer a different part of your question.
I tried to use an UNION query with two INNER JOIN query, one between
SESSION and ACCOUNT and the other one between SESSION and CONTACT, but
it doesn't give me the same number of records and I don't really
understand why.
That part is rather simple: the JOIN returns a result set that contains the rows of both tables joined together. So in the first case you would end up with a result that looks like
session.id, session.column2, session.column3, ..., account.id, account.column2, account.column3, ....
and a second where
session.id, session.column2, session.column3, ..., contact.id, contact.column2, contact.column3, ....
A UNION will then fail unless both SELECTs return the same number of columns, and the corresponding columns should have matching types, which is unlikely for the contact and account tables. From the docs (emphasis mine):
The column names from the first SELECT statement are used as the column names for the results returned. Selected columns listed in corresponding positions of each SELECT statement should have the same data type. (For example, the first column selected by the first statement should have the same type as the first column selected by the other statements.)
Just perform both INNER JOINs separately and compare the results if you're unsure.
If you want to stick to a UNION solution, make sure each SELECT lists only corresponding columns: selecting just s.id would be trivial, for instance, but it should work.
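Even when both SELECTs list the same columns, the two approaches can still return different row counts. The LEFT JOIN version returns one row per matching contact message, while UNION (as opposed to UNION ALL) removes duplicate rows. The sketch below shows this with Python's built-in sqlite3 module instead of MySQL; the tables are cut down to the columns named in the question and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY);
CREATE TABLE session (id INTEGER PRIMARY KEY, programme_id INTEGER, account_id INTEGER);
CREATE TABLE contact (id INTEGER PRIMARY KEY, session_id INTEGER);

INSERT INTO account (id) VALUES (10);
-- session 1: an account and two messages; session 2: one message; session 3: neither
INSERT INTO session (id, programme_id, account_id) VALUES (1, 7, 10), (2, 7, NULL), (3, 7, NULL);
INSERT INTO contact (id, session_id) VALUES (100, 1), (101, 1), (102, 2);
""")

# Original query shape: one row per (session, contact) combination.
left_join_rows = conn.execute("""
    SELECT s.id
    FROM session s
    LEFT JOIN account act ON act.id = s.account_id
    LEFT JOIN contact c ON c.session_id = s.id
    WHERE s.programme_id = ? AND (c.id IS NOT NULL OR act.id IS NOT NULL)
    ORDER BY s.id
""", (7,)).fetchall()

# UNION of two inner joins: duplicate session ids are collapsed.
union_rows = conn.execute("""
    SELECT s.id FROM session s
    JOIN account act ON act.id = s.account_id
    WHERE s.programme_id = ?
    UNION
    SELECT s.id FROM session s
    JOIN contact c ON c.session_id = s.id
    WHERE s.programme_id = ?
    ORDER BY 1
""", (7, 7)).fetchall()

print(left_join_rows)  # session 1 appears twice, once per contact message
print(union_rows)      # each session id appears once
```

Whether the duplicates are wanted depends on whether the administrator should see one line per session or one line per message.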

Related

Laravel Left Join with a tricky filter

I am assigning unique users to vouchers on my website. One user may have more than one voucher assigned to them but cannot be assigned the same voucher twice. I have 2 MySQL tables that I am fetching data from.
tbl_users
tbl_voucher_users
When a user clicks a button on my website, they pass along a voucher_id, which I use to display the eligible users that can be assigned this voucher (i.e. users that have not been assigned this voucher).
Below is how I am getting the users where voucher_id = 8
$user_data = DB::table('users')
->leftJoin('voucher_users', 'users.id', '=', 'voucher_users.user_id')
->where('voucher_users.voucher_id','!=',8) //User not assigned this voucher
->select('users.*','users.id as userID','voucher_users.*')
->get();
My problem
I am able to left join without the where clause and get results from both Users table and Voucher_users table having eliminated all users assigned voucher_id=8.
However, the results also include users who are assigned other vouchers in addition to the voucher I am assigning.
i.e.
Expected results would be users: 8, 11, 12, 13, 14, having eliminated users: 1, 4
But my current results are: 4, 8, 11, 12, 13, 14
How do I get rid of user 4 to prevent double assignment?
Thanks to the suggestion above by @Kevin Lynch to use NOT EXISTS, I simplified the code to:
SELECT users.*
FROM
users
WHERE
NOT EXISTS(SELECT user_id FROM voucher_users WHERE voucher_users.user_id = users.id AND voucher_id=9)
It works so far; I can then convert it to Laravel style.
NOT EXISTS would be a good solution if this were to remain a small project (or if you could put more constraints on the users to keep the dataset small perhaps by limiting based on user created date or similar).
However, if you can't do such a thing, this query may eventually give you problems because, under the hood, MySQL can end up running that subquery for each record returned by the main query. You can check https://dev.mysql.com/doc/refman/5.6/en/subquery-materialization.html for more information.
A different solution which would handle the scaling quite a bit better would be to generate a temporary table of users that have the voucher you are looking to remove...
create temporary table tmp_voucher_user (user_id int not null, primary key (user_id)) as
select distinct user_id from voucher_users where voucher_id = 8;
Now that we have a table of users we wish to exclude, all we need to do is worry about a simple left join...
select users.*, voucher_users.*
from users
inner join voucher_users on users.id = voucher_users.user_id
left join tmp_voucher_user on users.id = tmp_voucher_user.user_id
where tmp_voucher_user.user_id is null -- this part is important: it only grabs users that have no match in tmp_voucher_user
Unfortunately, this isn't as clean as just doing a NOT EXISTS, and I don't believe Laravel supports a way to build temporary tables outside of just writing a raw query, but it should scale quite a bit better.
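To see why the original `!=` filter lets user 4 through while the exclusion approaches do not, here is a small sketch using Python's built-in sqlite3 module rather than MySQL or Laravel. The sample rows are invented. The last query is a variant not shown in the answers above: it puts the voucher condition in the ON clause of the LEFT JOIN instead of building a temporary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE voucher_users (user_id INTEGER, voucher_id INTEGER);

INSERT INTO users (id) VALUES (1), (4), (8), (11);
-- users 1 and 4 hold voucher 8; user 4 also holds voucher 9; user 11 holds nothing
INSERT INTO voucher_users (user_id, voucher_id) VALUES (1, 8), (4, 8), (4, 9), (8, 9);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Filtering joined rows: user 4 survives through the voucher 9 row,
# and user 11 is lost because NULL != 8 is not true.
filtered_join = ids("""
    SELECT DISTINCT users.id FROM users
    LEFT JOIN voucher_users ON voucher_users.user_id = users.id
    WHERE voucher_users.voucher_id != 8
    ORDER BY users.id
""")

# NOT EXISTS: keep users with no voucher 8 row at all.
not_exists = ids("""
    SELECT users.id FROM users
    WHERE NOT EXISTS (
        SELECT 1 FROM voucher_users
        WHERE voucher_users.user_id = users.id AND voucher_users.voucher_id = 8
    )
    ORDER BY users.id
""")

# Anti-join: match only voucher 8 rows, then keep the users that found no match.
anti_join = ids("""
    SELECT users.id FROM users
    LEFT JOIN voucher_users vu ON vu.user_id = users.id AND vu.voucher_id = 8
    WHERE vu.user_id IS NULL
    ORDER BY users.id
""")

print(filtered_join, not_exists, anti_join)
```

The first query filters individual joined rows, whereas the other two ask whether a voucher 8 row exists for the user at all, which is what "not yet assigned this voucher" means.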

Mysql JOIN syntax error getting specified columns data

I have 2 tables, one is setting and one is accounts
setting has columns of: isVerified, customMessage, user
accounts has columns of: id, fullName, password, address, phone
I know I have to do join but how do I get only fullName from accounts table?
I did this
SELECT isVerified, customMessage, fullName
FROM setting FULL OUTER JOIN
accounts
ON setting.user = accounts.id;
but got error near the JOIN. What's wrong?
An inner join should suffice:
SELECT s.isVerified, s.customMessage, a.fullName
FROM setting s INNER JOIN
accounts a
ON s.user = a.id;
MySQL does not support FULL OUTER JOIN. Presumably, all accounts have settings and vice versa.
Note that I introduced table aliases so the query is easier to write and to read. And, in this query, all column names specify the table they come from.
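If unmatched rows from both tables really are needed, a full outer join can be emulated with two LEFT JOINs combined by UNION ALL. The sketch below demonstrates the idea with Python's built-in sqlite3 module rather than MySQL; the sample rows are invented, and the SQL uses only constructs that MySQL also supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, fullName TEXT);
CREATE TABLE setting (isVerified INTEGER, customMessage TEXT, user INTEGER);

INSERT INTO accounts (id, fullName) VALUES (1, 'Ann'), (2, 'Bob');
-- user 3 has a setting row but no account; account 2 has no setting row
INSERT INTO setting (isVerified, customMessage, user) VALUES (1, 'hi', 1), (0, 'orphan', 3);
""")

rows = conn.execute("""
    -- every setting row, with its account when there is one
    SELECT s.isVerified, s.customMessage, a.fullName
    FROM setting s
    LEFT JOIN accounts a ON s.user = a.id
    UNION ALL
    -- plus the accounts that have no setting row at all
    SELECT s.isVerified, s.customMessage, a.fullName
    FROM accounts a
    LEFT JOIN setting s ON s.user = a.id
    WHERE s.user IS NULL
""").fetchall()

for row in rows:
    print(row)
```

The second branch keeps only accounts without a setting row, so rows matched in the first branch are not repeated.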
I know I have to do join but how do I get only fullName from accounts
table?
If you only want fullName, specify only the fullName column in your SELECT statement.
Select fullname FROM ....
As others have pointed out MySQL doesn't support FULL OUTER JOIN so change that to simply JOIN as Gordon Linoff has mentioned above.
Normally when you do a join you want the rows that match in both tables (setting and accounts in your case). Based on the columns you've described, and depending on how you've designed your schema, it's either a one-to-one or a one-to-many relationship between the two tables. Your case sounds like one-to-one, as each user's account will have a setting.
You're joining on s.user = a.id, but I don't see where you mention that s.user is actually the same as a.id. What is the user field? Perhaps you need to name this better, e.g. s.id, if it's actually an id. As others have pointed out, please include your actual table definitions so it's easier to figure out why you get the SQL error when running your query.
Good luck.

SQL JOIN Query's to collect data for two columns

One of my tables has a column named "t_name" which supplies an exact name (ie: Google)
And another table has two columns named
m_team_home and m_team_away
Both m_team_home and m_team_away would be INT's in the database but would grab the name from the first table. My joined query is only able to grab the home's team name, and I don't know how to get the away team's name because it will output the same thing.
I know it may be hard to explain, but much help would be appreciated.
Sounds like you want to join on the table twice:
SELECT home.t_name AS home_team, away.t_name AS away_team
FROM matches m -- "matches" and "teams" are placeholder table names
JOIN teams home ON home.id = m.m_team_home -- teams.id, or whatever column in that table the home / away INTs map to
JOIN teams away ON away.id = m.m_team_away
That way you can get the name of both the home team and the away team. You may want to consider making those LEFT joins, just in case one of the teams doesn't exist and the row would otherwise be filtered out.
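Here is a runnable sketch of joining the same table twice under two aliases, using Python's built-in sqlite3 module. The table names teams and matches, the id columns, and the sample rows are all assumptions; only t_name, m_team_home and m_team_away come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: table names and id columns are placeholders
CREATE TABLE teams (id INTEGER PRIMARY KEY, t_name TEXT);
CREATE TABLE matches (id INTEGER PRIMARY KEY, m_team_home INTEGER, m_team_away INTEGER);

INSERT INTO teams (id, t_name) VALUES (1, 'Google'), (2, 'Yahoo');
-- match 2 has no away team yet
INSERT INTO matches (id, m_team_home, m_team_away) VALUES (1, 1, 2), (2, 2, NULL);
""")

rows = conn.execute("""
    SELECT m.id, home.t_name AS home_team, away.t_name AS away_team
    FROM matches m
    LEFT JOIN teams home ON home.id = m.m_team_home  -- first copy of teams: home side
    LEFT JOIN teams away ON away.id = m.m_team_away  -- second copy of teams: away side
    ORDER BY m.id
""").fetchall()

for row in rows:
    print(row)
```

Each alias acts as an independent copy of the teams table, which is why the two name columns no longer come out identical. With LEFT JOIN, match 2 is still returned even though its away team is missing.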

How to combine 5 tables together with same ID in a query?

I have 5 different tables T_DONOR, T_RECIPIENT_1, T_RECIPIENT_2, T_RECIPIENT_3, and T_RECIPIENT_4. All 5 tables have the same CONTACT_ID.
This is the T_DONOR table:
T_RECIPIENT_1:
T_RECIPIENT_2:
This is what I want the final table to look like with more recipients and their information to the right.
T_RECIPIENT_3 and T_RECIPIENT_4 are the same as T_RECIPIENT_1 and T_RECIPIENT_2 except that they have different RECIPIENT IDs and different names. I want to combine all 5 of these tables so that on one line I have the DONOR_CONTACT_ID with the donor's information, followed by all of the recipients' information.
The problem is that when I try to run a query, it does not work because not all of the Donors have all of the recipient fields filled, so the query will run and give a blank table. Some instances I have a Donor with 4 Recipients and other times I have a Donor with only 1 Recipient so this causes a problem. I've tried running queries where I connect them with the DONOR_CONTACT_ID but this will only work if all of the RECIPIENT fields are filled. Any suggestions on what to do? Is there a way I could manipulate this in VBA? I only know some VBA, I'm not an expert.
First I think you want all rows from T_DONOR. And then you want to pull in information from the recipient tables when they include DONOR_CONTACT_ID matches. If that is correct, LEFT JOIN T_DONOR to the other tables.
Start with a simpler set of fields; you can add in the "name" fields after you get the joins set to correctly return the rest of the data you need.
SELECT
d.DONOR_CONTACT_ID,
r1.RECIPIENT_1,
r2.RECIPIENT_1
FROM
(T_DONOR AS d
LEFT JOIN T_RECIPIENT_1 AS r1
ON d.ORDER_NUMBER = r1.ORDER_NUMBER)
LEFT JOIN T_RECIPIENT_2 AS r2
ON d.ORDER_NUMBER = r2.ORDER_NUMBER;
Notice the parentheses in the FROM clause. The db engine requires them for any query which includes more than one join. If possible, set up your joins in Design View of the query designer. The query designer knows how to add parentheses to keep the db engine happy.
Here is a version without aliased table names in case it's easier to understand and set up in the query designer ...
SELECT
T_DONOR.DONOR_CONTACT_ID,
T_RECIPIENT_1.RECIPIENT_1,
T_RECIPIENT_2.RECIPIENT_1
FROM
(T_DONOR
LEFT JOIN T_RECIPIENT_1
ON T_DONOR.ORDER_NUMBER = T_RECIPIENT_1.ORDER_NUMBER)
LEFT JOIN T_RECIPIENT_2
ON T_DONOR.ORDER_NUMBER = T_RECIPIENT_2.ORDER_NUMBER;
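The difference between an inner join and the LEFT JOIN approach above can be checked on a tiny data set. This sketch uses Python's built-in sqlite3 module rather than Access, so the parentheses that Access requires are omitted. The columns are reduced to a guess at the schema (including a RECIPIENT_2 column on the second recipient table) and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical cut-down tables; the real ones have more columns
CREATE TABLE T_DONOR (ORDER_NUMBER INTEGER PRIMARY KEY, DONOR_CONTACT_ID INTEGER);
CREATE TABLE T_RECIPIENT_1 (ORDER_NUMBER INTEGER, RECIPIENT_1 INTEGER);
CREATE TABLE T_RECIPIENT_2 (ORDER_NUMBER INTEGER, RECIPIENT_2 INTEGER);

INSERT INTO T_DONOR (ORDER_NUMBER, DONOR_CONTACT_ID) VALUES (1, 500), (2, 501);
INSERT INTO T_RECIPIENT_1 (ORDER_NUMBER, RECIPIENT_1) VALUES (1, 9001), (2, 9002);
-- order 2 has no second recipient
INSERT INTO T_RECIPIENT_2 (ORDER_NUMBER, RECIPIENT_2) VALUES (1, 9101);
""")

query = """
    SELECT d.DONOR_CONTACT_ID, r1.RECIPIENT_1, r2.RECIPIENT_2
    FROM T_DONOR AS d
    {join} JOIN T_RECIPIENT_1 AS r1 ON d.ORDER_NUMBER = r1.ORDER_NUMBER
    {join} JOIN T_RECIPIENT_2 AS r2 ON d.ORDER_NUMBER = r2.ORDER_NUMBER
    ORDER BY d.ORDER_NUMBER
"""

# INNER JOIN drops any donor that is missing a recipient in either table.
inner_rows = conn.execute(query.format(join="INNER")).fetchall()
# LEFT JOIN keeps every donor and fills the missing recipient with NULL.
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

With four recipient tables the effect compounds: an inner join returns a donor only if all four recipients exist, which would explain the blank result described in the question.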
SELECT T_DONOR.ORDER_NUMBER, T_DONOR.DONOR_CONTACT_ID, T_DONOR.FIRST_NAME, T_DONOR.LAST_NAME, T_RECIPIENT_1.RECIPIENT_1, T_RECIPIENT_1.FIRST_NAME, T_RECIPIENT_1.LASTNAME
FROM T_DONOR
JOIN T_RECIPIENT_1
ON T_DONOR.DONOR_CONTACT_ID = T_RECIPIENT_1.DONOR_CONTACT_ID
This shows you how to JOIN the first recipient table; you should be able to follow the same structure for the other three.

MySQL -- joining then joining then joining again

MySQL setup: step by step.
programs -> linked to --> speakers (by program_id)
At this point, it's easy for me to query all the data:
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
Nice and easy.
The trick for me is this. My speakers table is also linked to a third table, "books." So in the "speakers" table, I have "book_id" and in the "books" table, the book_id is linked to a name.
I've tried this (including a WHERE you'll notice):
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
LIMIT 5
No results.
My questions:
What am I doing wrong?
What's the most efficient way to make this query?
Basically, I want to get back all the programs data and the books data, but instead of the book_id, I need it to come back as the book name (from the 3rd table).
Thanks in advance for your help.
UPDATE:
(rather than opening a brand new question)
The left join worked for me. However, I have a new problem: multiple books can be assigned to a single speaker.
Using the left join returns two rows! What do I need to add to return only a single row, with the two books kept separate?
Is there any chance that the books table doesn't have any matching rows for speakers.book_id?
Try using a left join, which will still return the program/speaker combinations even if there are no matches in books.
SELECT *
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
LEFT JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
LIMIT 5
Btw, could you post the table schemas for all tables involved, and exactly what output (or reasonable representation) you'd expect to get?
Edit: Response to op author comment
You can use GROUP BY and GROUP_CONCAT to put all the books on one row.
e.g.
SELECT speakers.speaker_id,
speakers.speaker_name,
programs.program_id,
programs.program_name,
group_concat(books.book_name)
FROM programs
JOIN speakers on programs.program_id = speakers.program_id
LEFT JOIN books on speakers.book_id = books.book_id
WHERE programs.category_id = 1
GROUP BY speakers.speaker_id, speakers.speaker_name, programs.program_id, programs.program_name
LIMIT 5
Note: since I don't know the exact column names, these may be off
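Here is a runnable sketch of the GROUP BY / GROUP_CONCAT idea, using Python's built-in sqlite3 module (SQLite also provides group_concat) instead of MySQL. The column names follow the guesses in the query above, and the sample rows are invented. Since book_id lives on the speakers table, a speaker with two books is modelled as two speakers rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema based on the column names guessed in the answer
CREATE TABLE programs (program_id INTEGER PRIMARY KEY, program_name TEXT, category_id INTEGER);
CREATE TABLE speakers (speaker_id INTEGER, speaker_name TEXT, program_id INTEGER, book_id INTEGER);
CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_name TEXT);

INSERT INTO programs (program_id, program_name, category_id) VALUES (1, 'Keynotes', 1);
INSERT INTO books (book_id, book_name) VALUES (10, 'Alpha'), (11, 'Beta');
-- Ann has two books, Bob has none
INSERT INTO speakers (speaker_id, speaker_name, program_id, book_id)
VALUES (1, 'Ann', 1, 10), (1, 'Ann', 1, 11), (2, 'Bob', 1, NULL);
""")

rows = conn.execute("""
    SELECT speakers.speaker_name,
           programs.program_name,
           group_concat(books.book_name) AS book_names
    FROM programs
    JOIN speakers ON programs.program_id = speakers.program_id
    LEFT JOIN books ON speakers.book_id = books.book_id
    WHERE programs.category_id = 1
    GROUP BY speakers.speaker_id, speakers.speaker_name,
             programs.program_id, programs.program_name
    ORDER BY speakers.speaker_id
""").fetchall()

# The order of names inside the concatenated string is not guaranteed,
# so split it into a set before comparing or displaying.
books_by_speaker = {
    name: set(book_names.split(",")) if book_names else set()
    for name, _program, book_names in rows
}
print(books_by_speaker)
```

In MySQL, GROUP_CONCAT also accepts an ORDER BY and a SEPARATOR inside the call if the order of the names or the delimiter matters.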
There is some kind of assumption you are making that isn't true. Do your speakers have books assigned? If they don't, that last JOIN should be a LEFT JOIN.
This kind of query is typically pretty efficient, since you almost certainly have primary keys as indexes. The main issue would be whether your indexes are covering (which is more likely to occur if you don't use SELECT *, but instead select only the columns you need).