MySQL - Working Back Through Multiple Tables by FK

I have a database which isn't of my own creation. I need to extract some specific data from it but I'm struggling to get my head around how to get the data back without doing multiple queries and looping over the result set in my code. I've looked around at other questions but haven't been able to get very far.
My data structure is (very condensed with non-relevant rows and columns omitted):
MyDb.Source
+--------+-------------+--------------------------------------+
| ID | SOURCE_TYPE | URL |
+--------+-------------+--------------------------------------+
| 10 | 3 | https://en.wikipedia.org |
+--------+-------------+--------------------------------------+
MyDb.Resource
+--------+------------+--------------------------------------+
| ID | SOURCE_FK | IDENTIFIER |
+--------+------------+--------------------------------------+
| 1 | 10 | All_Saints_Church,_Marple |
+--------+------------+--------------------------------------+
MyDb.Item_Base
+--------+-------------+--------------------------------------+
| ID | RESOURCE_FK | ITEM_TITLE |
+--------+-------------+--------------------------------------+
| 55 | 1 | All Saints Church, Marple |
+--------+-------------+--------------------------------------+
MyDb.Item
+--------+-------------+--------------------------------------+
| ID | BASE_FK | ITEM_DESCRIPTION |
+--------+-------------+--------------------------------------+
| 120 | 55 | Foo bar |
+--------+-------------+--------------------------------------+
Source - Resource is 1 to many.
Resource - Item_Base is 1 to 1.
Item_Base - Item is 1 to 1.
What am I trying to do?
I want as few queries as possible to work back from MyDb.Source to find all items related to it. The only information I have in my hand is the ID for the source, which is 10. I want to end up with a result set of Item.ID which contains only those where Source.ID is 10.

I think you can just inner join the four tables together in a single query. This should be safe, because in order for a relationship to exist between a source and an item, the latter must be reachable via a key relationship.
SELECT
    t1.ID AS source_id,
    t4.*
FROM Source t1
INNER JOIN Resource t2
    ON t1.ID = t2.SOURCE_FK
INNER JOIN Item_Base t3
    ON t2.ID = t3.RESOURCE_FK
INNER JOIN Item t4
    ON t3.ID = t4.BASE_FK
WHERE
    t1.ID = 10;

You need INNER JOINs in your query. Then it's possible in one simple query:
SELECT
    i.ID
FROM
    Source s
INNER JOIN Resource r
    ON s.ID = r.SOURCE_FK
INNER JOIN Item_Base b
    ON r.ID = b.RESOURCE_FK
INNER JOIN Item i
    ON b.ID = i.BASE_FK
WHERE
    s.ID = 10;
See https://dev.mysql.com/doc/refman/5.7/en/join.html and http://www.mysqltutorial.org/mysql-inner-join.aspx for more info and examples relating to joins and how to use them.
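Since the source ID is already stored on Resource as SOURCE_FK, the Source table may not need to be joined at all when only Item.ID is wanted. The following is a minimal sketch under the assumption (not stated in the question) that every Resource.SOURCE_FK points at an existing Source row, for example because a foreign key constraint is enforced:

```sql
-- Sketch: filter on the foreign key column directly instead of joining Source.
-- Assumes Resource.SOURCE_FK always references a valid Source.ID.
SELECT i.ID
FROM Resource r
INNER JOIN Item_Base b ON b.RESOURCE_FK = r.ID
INNER JOIN Item i      ON i.BASE_FK = b.ID
WHERE r.SOURCE_FK = 10;
```

For the sample rows in the question this returns the single Item.ID 120, the same as the four-table join.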

Related

How to select and join into multiple tables with row names

Billing table
| mainid | subID | subid_name | subid_value |
|----------|---------|-------------|--------------|
| 100 | 3478 | name | Ali Baba |
| 100 | 2373 | school_type | ghetto |
| 100 | 2989 | school_loc | 41000 |
| 100 | 9824 | fee_sem | 40 |
| 100 | 283 | desc | thieves |
| 100 | 32383 | CGPA_grade | 2.9 |
Hi all,
I'm trying to work from this table (let's call it the billing table) and inner/left join into other tables based on the subid_values obtained from multiple where conditions.
For example, using subid_name = school_loc returns a subid_value of 41000, and from there I can get more information about the school location by joining to the location_info table.
However, the problem comes when I also need to return values like fee_sem, CGPA_grade and school_type from the original billing table as part of the query result, because I have already used "where subid_name = school_loc".
I'd also like to join to other tables based on the subid_values of different subid_names, such as school_type, fee_sem and CGPA_grade.
I have tried and cried trying to self-join the table on the main ID, and I also tried nested selects without much success. I was told that this is how the data will be and that there is no chance of the developer correcting the table structure, which should have been transposed into columns in the first place. Usually MS SQL and better-structured databases pose no issue for me, but this is a MySQL database, which I'm not that good with, and with row data that should have been in columns, I need to ask for help.
select * from billing
where mainid = 100
and subid_name = 'school_loc'
so this will return one line only with subid_value = 41000
I would inner join this subID value with location_info table and select more columns (location_name, location_state) from location_info table by using the value of billing.subid_value = 41000
select billing.mainID
,billing.subID
,billing.subid_value as School_Postcode
,location_info.location_name
,location_info.location_state
from dbo.billing
inner join dbo.location_info on billing.subid_value = location_info.ID
where billing.subid_name = 'school_loc'
OK first inner join is done (based on where billing.subid_name = school_loc).
I'm stuck here: I need to combine the query above with the one below, which filters on a different subid_name, returns a different subid_value, and uses that subid_value in its own inner join.
select billing.mainID
,billing.subID
,billing.subid_value as Fee_Class
,fee_ranking.FeeAffordable
,fee_ranking.FeeSubsidy
from dbo.billing
inner join dbo.fee_ranking on billing.subid_value = fee_ranking.class
where billing.mainID = 100
and billing.subid_name = 'fee_sem'
so this will return one line only (the one with fee_sem = 40)
(combining with another inner join into fee_ranking_table)
I would also like to combine this with a third query that filters on yet another subid_name and uses the resulting subid_value in another inner join:
select billing.mainID
,billing.subID
,billing.subid_value as CGPA_Score
,CGPA_ranking.IsSmart
,CGPA_ranking.IsHardToEnter
from dbo.billing
inner join dbo.CGPA_ranking on billing.subid_value = CGPA_ranking.score
where billing.mainID = 100
and billing.subid_name = 'CGPA_grade'
so this will return one line only (the one with CGPA_grade = 2.9)
I'm trying to achieve the output of
| mainid (from billing) | school_postcode | location_name (from location_info) | location_state (from location_info) | Fee_Class | FeeAffordable (from fee_ranking) | CGPA_Score | IsSmart (from CGPA_ranking) |
|-----------------------|-----------------|------------------------------------|-------------------------------------|-----------|----------------------------------|------------|-----------------------------|
| 100                   | 41000           | Boston                             | MA                                  | 40        | False                            | 2.9        | False                       |
It's a bit tedious but you just have to have as many joins (or possibly left joins) as there are subid_names (or possibly subids) in your data assigning an alias to each so that you can add data from other tables.
with cte as
(select mainid from t where subid_name = 'school_loc')
select cte.mainid,t1.subid_value,t2.subid_value,t3.subid_value,
t4.subid_value,t5.subid_value,t6.subid_value
from cte
join t t1 on t1.mainid = cte.mainid and t1.subid_name = 'name'
join t t2 on t2.mainid = cte.mainid and t2.subid_name = 'school_type'
join t t3 on t3.mainid = cte.mainid and t3.subid_name = 'school_loc'
join t t4 on t4.mainid = cte.mainid and t4.subid_name = 'fee_sem'
join t t5 on t5.mainid = cte.mainid and t5.subid_name = 'desc'
join t t6 on t6.mainid = cte.mainid and t6.subid_name = 'cgpa_grade';
+--------+-------------+-------------+-------------+-------------+-------------+-------------+
| mainid | subid_value | subid_value | subid_value | subid_value | subid_value | subid_value |
+--------+-------------+-------------+-------------+-------------+-------------+-------------+
| 100 | Ali Baba | ghetto | 41000 | 40 | thieves | 2.9 |
+--------+-------------+-------------+-------------+-------------+-------------+-------------+
1 row in set (0.001 sec)
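The same self-join idea can be extended so that each aliased copy of the billing table also feeds its own lookup join, which is roughly what the desired output in the question calls for. This is only a sketch: the lookup table and column names (location_info.ID, fee_ranking.class, CGPA_ranking.score and so on) are taken from the question's own queries, and the contents of those tables are unknown, so the output is not shown.

```sql
-- Sketch: one alias of billing per attribute, each followed by its lookup table.
-- LEFT JOINs keep the row even if an attribute or a lookup match is missing.
SELECT loc.mainid,
       loc.subid_value AS School_Postcode,
       li.location_name,
       li.location_state,
       fee.subid_value AS Fee_Class,
       fr.FeeAffordable,
       gpa.subid_value AS CGPA_Score,
       cr.IsSmart
FROM billing AS loc
LEFT JOIN location_info AS li
       ON li.ID = loc.subid_value
LEFT JOIN billing AS fee
       ON fee.mainid = loc.mainid AND fee.subid_name = 'fee_sem'
LEFT JOIN fee_ranking AS fr
       ON fr.class = fee.subid_value
LEFT JOIN billing AS gpa
       ON gpa.mainid = loc.mainid AND gpa.subid_name = 'CGPA_grade'
LEFT JOIN CGPA_ranking AS cr
       ON cr.score = gpa.subid_value
WHERE loc.mainid = 100
  AND loc.subid_name = 'school_loc';
```

Because subid_value holds mixed data, it is presumably a string column, so the comparisons against numeric lookup keys rely on MySQL's implicit type conversion. An explicit CAST on subid_value may be safer, particularly for the decimal CGPA score.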

Mysql select rows with same id's (3 tables)

I have the following tables:
'blog_content'
'blog_media'
'blog_media_content'
| blog_id | media_id |
========================
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 3 | 4 |
I want to select all blog_media.uri's where blog_media.media_id equals blog_media_content.blog_id.
Please help me to achieve my aim.
An inner join between blog_media and blog_media_content tables would suffice.
SELECT
bm.uri
FROM blog_media bm
INNER JOIN blog_media_content bmc ON bm.media_id = bmc.media_id
WHERE bmc.blog_id =3;
Note:
If you need any additional information from blog table then you need an additional inner join like below:
...INNER JOIN blog_table b ON bmc.blog_id = b.blog_id...
EDIT:
In order to get records for all blog_ids :
SELECT
bm.uri
FROM blog_media bm
INNER JOIN blog_media_content bmc ON bm.media_id = bmc.media_id
ORDER BY bmc.blog_id;

Select rows from two tables and exclude primary keys that exist in both tables

I have two tables of service providers, providers and providers_clean. providers contains many thousands of providers with very poorly formatted data, providers_clean only has a few providers which still exist in the 'dirty' table as well.
I want the system using this data to remain functional while the user is 'cleaning' the data up, so I'd like to be able to select all of the rows that have already been 'cleaned' and the rows that are still 'dirty' while excluding any 'dirty' results that have the same id as the 'clean' ones.
How can I select all of the providers from the providers_clean table merged with all of the providers from the providers table, and EXCLUDE the ones that have already been 'cleaned'
I've tried:
SELECT * FROM providers WHERE NOT EXISTS (SELECT * FROM providers_clean WHERE providers_clean.id = providers.id)
which gives me all of the 'dirty' results from providers EXCLUDING the 'clean' ones, but how can I rewrite the query to now merge all of the 'clean' ones from providers_clean?
Here's a visual representation of what I'm trying to do:
Clean Table
+----+-------------------+
| ID | Name |
+----+-------------------+
| 1 | Clean Provider 1 |
| 4 | Clean Provider 4 |
| 5 | Clean Provider 5 |
+----+-------------------+
Dirty Table
+----+------------------+
| ID | Name |
+----+------------------+
| 1 | Dirty Provider 1 |
| 2 | Dirty Provider 2 |
| 3 | Dirty Provider 3 |
| 4 | Dirty Provider 4 |
| 5 | Dirty Provider 5 |
+----+------------------+
Desired Result
+----+------------------+
| ID | Name |
+----+------------------+
| 1 | Clean Provider 1 |
| 2 | Dirty Provider 2 |
| 3 | Dirty Provider 3 |
| 4 | Clean Provider 4 |
| 5 | Clean Provider 5 |
+----+------------------+
Thanks
UPDATE
This is working; however, is there a more efficient way to write this query?
SELECT providers.id AS id,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.provider_name
ELSE
providers.provider_name
END AS pname,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.phone
ELSE
providers.phone
END AS pphone,
CASE
WHEN
providers_clean.id IS NOT NULL
THEN
providers_clean.website
ELSE
providers.website
END AS pwebsite
FROM providers
LEFT JOIN providers_clean ON providers_clean.id = providers.id
ORDER BY providers.id asc
You need to do an outer join from Dirty to Clean (since Dirty has all rows Clean has, but not vice versa)
SELECT dirty.id AS id,
CASE
WHEN clean.id IS NOT NULL THEN clean.name
ELSE dirty.name
END AS new_name
FROM dirty
LEFT JOIN clean ON clean.id = dirty.id
ORDER BY dirty.id asc
Seems like a LEFT JOIN is what you need:
SELECT COALESCE(pc.ID, p.ID), COALESCE(pc.Name, p.Name)
FROM providers AS p
LEFT JOIN providers_clean AS pc ON p.ID = pc.ID
What this query essentially does: if the record exists in the 'clean' table then select this one, otherwise select the one from the 'dirty' table.
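Applied to the columns from the asker's UPDATE query (provider_name, phone, website), the COALESCE approach might look like the sketch below. One behavioural difference is worth knowing: COALESCE falls back to the 'dirty' value whenever the individual 'clean' column is NULL, whereas the CASE version always takes the 'clean' column once a clean row exists, even if that column is NULL.

```sql
-- Sketch: prefer the cleaned value per column, fall back to the dirty one.
-- Column names are taken from the asker's UPDATE query.
SELECT p.id,
       COALESCE(pc.provider_name, p.provider_name) AS pname,
       COALESCE(pc.phone,         p.phone)         AS pphone,
       COALESCE(pc.website,       p.website)       AS pwebsite
FROM providers AS p
LEFT JOIN providers_clean AS pc ON pc.id = p.id
ORDER BY p.id;
```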
I love and often refer to visual explanations of how JOINs work.
According to them, you need a FULL OUTER JOIN excluding the items that are in both tables (an "Outer Excluding JOIN"):
SELECT *
FROM providers p
FULL OUTER JOIN providers_clean pc
ON pc.id = p.id
WHERE p.id IS NULL OR pc.id IS NULL;
Update: Unfortunately, there's no FULL OUTER JOIN in MySQL, so you have to emulate it with a UNION of a LEFT JOIN and a RIGHT JOIN:
select p.*
from providers p left join providers_clean pc on pc.id = p.id
where pc.id is null
union all
select pc.*
from providers p right join providers_clean pc on pc.id = p.id;
The first SELECT returns the dirty rows that have no clean counterpart, and the second SELECT simply returns the clean ones.
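The asker's original NOT EXISTS attempt can also be extended directly: take every row from providers_clean and append the providers rows that have no cleaned counterpart. A minimal sketch, assuming both tables share the columns used in the asker's UPDATE query (id, provider_name, phone, website):

```sql
-- Sketch: all clean rows, plus the dirty rows that have not been cleaned yet.
SELECT pc.id, pc.provider_name, pc.phone, pc.website
FROM providers_clean AS pc
UNION ALL
SELECT p.id, p.provider_name, p.phone, p.website
FROM providers AS p
WHERE NOT EXISTS (SELECT 1
                  FROM providers_clean AS pc
                  WHERE pc.id = p.id)
ORDER BY id;  -- a trailing ORDER BY sorts the combined result
```

UNION ALL is enough here because the two branches cannot return the same id, so no duplicate elimination is needed.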

MySQL selective GROUP BY, using the maximal value

I have the following (simplified) three tables:
user_reservations:
id | user_id |
1 | 3 |
1 | 3 |
user_kar:
id | user_id | szak_id |
1 | 3 | 1 |
2 | 3 | 2 |
szak:
id | name |
1 | A |
2 | B |
Now I would like to count the reservations of the user by the 'szak' name, but I want to have every user counted for only one szak. In this case, user_id 3 has 2 'szak', and if I write a query something like:
SELECT sz.name, COUNT(*) FROM user_reservations r
LEFT JOIN user_kar k ON k.user_id = r.user_id
LEFT JOIN szak sz ON sz.id = k.szak_id
GROUP BY sz.name
It will return two rows:
A | 2 |
B | 2 |
However, I want every reservation to be counted for only one szak (let's say the one with the highest id). I tried MAX(k.id) with HAVING, but it seems ineffective.
I would like to know if there is a supported method for that in MySQL, or should I first pick all the user IDs on the backend side, check their maximum kar.user_id, and then count only with those, removing them from the id list when the given szak is counted, and then build the data back together on the backend side?
Thanks for the help - I was googling around for like 2 hours, but so far, I found no solution, so maybe you could help me.
Something like this?
SELECT sz.name,
Count(*)
FROM (SELECT r.user_id,
Ifnull(Max(k.szak_id), -1) AS max_szak_id
FROM user_reservations r
LEFT OUTER JOIN user_kar k
ON k.user_id = r.user_id
GROUP BY r.user_id) t
LEFT OUTER JOIN szak sz
ON sz.id = t.max_szak_id
GROUP BY sz.name;
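Note that the subquery above groups by r.user_id, so the outer COUNT(*) counts users per szak rather than reservations. If the goal is to count reservations, one option is to reduce user_kar to a single row per user first and then join the reservations to it. A sketch under that reading of the question:

```sql
-- Sketch: pick the highest szak_id per user, then count reservations against it.
SELECT sz.name,
       COUNT(*) AS reservations
FROM user_reservations AS r
LEFT JOIN (SELECT user_id,
                  MAX(szak_id) AS max_szak_id
           FROM user_kar
           GROUP BY user_id) AS k
       ON k.user_id = r.user_id
LEFT JOIN szak AS sz
       ON sz.id = k.max_szak_id
GROUP BY sz.name;
```

For the sample rows in the question this returns a single row, B with a count of 2. Reservations whose user has no user_kar row end up in a group with a NULL name.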

When to use LEFT JOIN and when to use INNER JOIN?

I feel like I was always taught to use LEFT JOINs and I often see them mixed with INNERs to accomplish the same type of query throughout several pieces of code that are supposed to do the same thing on different pages. Here goes:
SELECT ac.reac, pt.pt_name, soc.soc_name, pt.pt_soc_code
FROM
AECounts ac
INNER JOIN 1_low_level_term llt on ac.reac = llt.llt_name
LEFT JOIN 1_pref_term pt ON llt.pt_code = pt.pt_code
LEFT JOIN 1_soc_term soc ON pt.pt_soc_code = soc.soc_code
LIMIT 100,10000
That's the one I am working on.
I see a lot like:
SELECT COUNT(DISTINCT p.`case`) as count
FROM FDA_CaseReports cr
INNER JOIN ae_indi i ON i.isr = cr.isr
LEFT JOIN ae_case_profile p ON cr.isr = p.isr
It seems like the LEFT may as well be INNER. Is there any catch?
Is there any catch? Yes there is -- left joins are a form of outer join, while inner joins are a form of, well, inner join.
Here are examples that show the difference. We'll start with the base data:
mysql> select * from j1;
+----+------------+
| id | thing |
+----+------------+
| 1 | hi |
| 2 | hello |
| 3 | guten tag |
| 4 | ciao |
| 5 | buongiorno |
+----+------------+
mysql> select * from j2;
+----+-----------+
| id | thing |
+----+-----------+
| 1 | bye |
| 3 | tschau |
| 4 | au revoir |
| 6 | so long |
| 7 | tschuessi |
+----+-----------+
And here we'll see the difference between an inner join and a left join:
mysql> select * from j1 inner join j2 on j1.id = j2.id;
+----+-----------+----+-----------+
| id | thing | id | thing |
+----+-----------+----+-----------+
| 1 | hi | 1 | bye |
| 3 | guten tag | 3 | tschau |
| 4 | ciao | 4 | au revoir |
+----+-----------+----+-----------+
Hmm, 3 rows.
mysql> select * from j1 left join j2 on j1.id = j2.id;
+----+------------+------+-----------+
| id | thing | id | thing |
+----+------------+------+-----------+
| 1 | hi | 1 | bye |
| 2 | hello | NULL | NULL |
| 3 | guten tag | 3 | tschau |
| 4 | ciao | 4 | au revoir |
| 5 | buongiorno | NULL | NULL |
+----+------------+------+-----------+
Wow, 5 rows! What happened?
Outer joins such as left join preserve rows that don't match -- so rows with id 2 and 5 are preserved by the left join query. The remaining columns are filled in with NULL.
In other words, left and inner joins are not interchangeable.
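A related catch, sketched here with the same j1 and j2 tables: a condition on the right-hand table placed in WHERE silently turns a left join back into an inner join, because the NULL-filled rows fail the comparison. Putting the condition in the ON clause keeps the unmatched rows.

```sql
-- Filter in WHERE: unmatched rows have j2.thing = NULL, so they are discarded.
-- Returns only the row for id 1.
SELECT j1.id, j1.thing, j2.thing AS j2_thing
FROM j1
LEFT JOIN j2 ON j1.id = j2.id
WHERE j2.thing = 'bye';

-- Filter in ON: all five j1 rows are kept; only id 1 picks up 'bye',
-- the other four rows get NULL in j2_thing.
SELECT j1.id, j1.thing, j2.thing AS j2_thing
FROM j1
LEFT JOIN j2 ON j1.id = j2.id AND j2.thing = 'bye';
```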
Here's a rough answer, that is sort of how I think about joins. Hoping this will be more helpful than a very precise answer due to the aforementioned math issues... ;-)
Inner joins narrow down the set of rows returned. Outer joins (left or right) don't drop rows from the preserved table; they just "pick up" additional columns where a match exists (and can add rows when one row has several matches).
In your first example, the result will be rows from AECounts that match the conditions specified to the 1_low_level_term table. Then for those rows, it tries to join to 1_pref_term and 1_soc_term. But if there's no match, the rows remain and the joined in columns are null.
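One caveat to the rule of thumb that an outer join leaves the row count unchanged: when a row on the left has more than one match on the right, it is repeated once per match. A small self-contained sketch using inline derived tables (the names a, b and val are made up for illustration):

```sql
-- Left side has ids 1 and 2; right side has two rows for id 1 and none for id 2.
SELECT a.id, b.val
FROM (SELECT 1 AS id UNION ALL SELECT 2) AS a
LEFT JOIN (SELECT 1 AS id, 'x' AS val
           UNION ALL
           SELECT 1, 'y') AS b
       ON b.id = a.id;
-- Three rows come back: (1, 'x'), (1, 'y') and (2, NULL).
```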
An INNER JOIN will only return the rows where there are matching values in both tables, whereas a LEFT JOIN will return ALL the rows from the LEFT table even if there is no matching row in the RIGHT table.
A quick example
TableA
ID Value
1 TableA.Value1
2 TableA.Value2
3 TableA.Value3
TableB
ID Value
2 TableB.ValueB
3 TableB.ValueC
An INNER JOIN produces:
SELECT a.ID,a.Value,b.ID,b.Value
FROM TableA a INNER JOIN TableB b ON b.ID = a.ID
a.ID a.Value b.ID b.Value
2 TableA.Value2 2 TableB.ValueB
3 TableA.Value3 3 TableB.ValueC
A LEFT JOIN produces:
SELECT a.ID,a.Value,b.ID,b.Value
FROM TableA a LEFT JOIN TableB b ON b.ID = a.ID
a.ID a.Value b.ID b.Value
1 TableA.Value1 NULL NULL
2 TableA.Value2 2 TableB.ValueB
3 TableA.Value3 3 TableB.ValueC
As you can see, the LEFT JOIN includes the row from TableA where ID = 1 even though there's no matching row in TableB where ID = 1, whereas the INNER JOIN excludes the row specifically because there's no matching row in TableB
HTH
Use an inner join when you want only the results that appear in both tables and match the join condition.
Use a left join when you want all the results from Table A, but if Table B has data relevant to some of Table A's records, then you also want to use that data in the same query.
Use a full join when you want all the results from both Tables.
For newbies, because it helped me when I was one: an INNER JOIN is always a subset of a LEFT or RIGHT JOIN, and all of these are always subsets of a FULL JOIN. It helped me understand the basic idea.