I have multiple (3+) Excel tables (only the first row of each of the first 3 tables is shown here). What I am trying to achieve is to show all rows from all the tables, joined wherever a common value exists, i.e. I am trying to simulate a FULL OUTER JOIN.
TBL1
IDno IDname d10 d30 datecreated
12-778 Robetrt 62 72 9/25/2020
TBL2
IDno IDname result3 datecreated
12-124 Sam2 text22 9/25/2018
TBL3
IDno IDname Area datecreated
12-324 Trinton 6.25 9/25/2013
Sometimes the IDno values in different tables are the same, but in the majority of cases they are different.
I made a UNION query to return all the IDno from the tables:
SELECT TBL1.IDno, TBL1.IDname FROM TBL1 UNION SELECT TBL2.IDno, TBL2.IDname FROM TBL2 UNION SELECT TBL3.IDno, TBL3.IDname FROM TBL3;
I named the query IDs. Then, I would like to get a table which would show all the rows and also combine the rows where the IDno is the same:
RESULT
IDno IDname d10 d30 result3 Area
I used:
SELECT
IDs.IDno, IDs.IDname,
TBL1.d10, TBL1.d30,
TBL2.result3,
TBL3.Area
FROM ((IDs
left join TBL1 on IDs.IDno = TBL1.IDno)
left join TBL2 on TBL1.IDno = TBL2.IDno)
left join TBL3 on TBL1.IDno = TBL3.IDno
UNION
SELECT
IDs.IDno, IDs.IDname,
TBL1.d10, TBL1.d30,
TBL2.result3,
TBL3.Area
FROM ((IDs
right join TBL1 on IDs.IDno = TBL1.IDno)
right join TBL2 on TBL1.IDno = TBL2.IDno)
right join TBL3 on TBL1.IDno = TBL3.IDno
WHERE
IDs.IDno IS NULL;
The end result is the LEFT JOINed tables plus some additional rows that show IDno and IDname but no data from TBL2 or TBL3. I cannot figure out how to UNION the left and right JOINs to achieve a FULL OUTER JOIN in this case. It works with 2 tables but not with three or more in Access. Is there a way to join multiple tables in Excel (Power Query) or elsewhere without the need for SQL Server?
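One way to read the problem: the query above joins TBL2 and TBL3 to TBL1.IDno, so any IDno that is missing from TBL1 can never pick up its TBL2 or TBL3 data. A minimal sketch of the alternative, joining every table to the IDs list instead, is below. It uses Python's built-in sqlite3 module purely as a stand-in for "elsewhere"; the second TBL2 row (sharing IDno 12-778) is made-up sample data added to show a match, and in Access the three joins would still need the nested parentheses used in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TBL1 (IDno TEXT, IDname TEXT, d10 INTEGER, d30 INTEGER);
CREATE TABLE TBL2 (IDno TEXT, IDname TEXT, result3 TEXT);
CREATE TABLE TBL3 (IDno TEXT, IDname TEXT, Area REAL);
INSERT INTO TBL1 VALUES ('12-778', 'Robetrt', 62, 72);
-- the second TBL2 row is hypothetical, added so one IDno appears in two tables
INSERT INTO TBL2 VALUES ('12-124', 'Sam2', 'text22'), ('12-778', 'Robetrt', 'shared');
INSERT INTO TBL3 VALUES ('12-324', 'Trinton', 6.25);
""")

# IDs is the UNION of all keys; every table is LEFT JOINed to IDs, never to TBL1.
rows = conn.execute("""
SELECT IDs.IDno, IDs.IDname, TBL1.d10, TBL1.d30, TBL2.result3, TBL3.Area
FROM (
    SELECT IDno, IDname FROM TBL1
    UNION SELECT IDno, IDname FROM TBL2
    UNION SELECT IDno, IDname FROM TBL3
) AS IDs
LEFT JOIN TBL1 ON IDs.IDno = TBL1.IDno
LEFT JOIN TBL2 ON IDs.IDno = TBL2.IDno
LEFT JOIN TBL3 ON IDs.IDno = TBL3.IDno
ORDER BY IDs.IDno
""").fetchall()

for row in rows:
    print(row)
```

Because IDs already contains every key, no RIGHT JOIN half is needed. One caveat: if the same IDno carries different IDname values in different tables, the UNION keeps both spellings and that IDno shows up twice.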
Hello (first, apologies for my English).
I have two tables, tbl1 and tbl2; both have the same structure, but their data comes from different sources and I can't mix them.
In another table, tbl3, I have dataid and datasource.
What I want to do is select data from the table named in datasource.
The pseudocode I am trying to produce:
SELECT
tbl3.dataid,
tbl3.datasource,
SEL_TBL.important_data,
SEL_TBL.another_thing,
SEL_TBL.something_completly_different
FROM
tbl3
LEFT JOIN
( SWITCH tbl3.datasource
CASE 'tbl1':
tbl1 AS >> SEL_TBL
CASE 'tbl2':
tbl2 AS >> SEL_TBL
)
ON
SEL_TBL.dataid = tbl3.dataid
I need a result that contains important_data, another_thing and of course something_completly_different from the table selected in the "switch statement".
What works right now:
SELECT
tbl3.dataid,
tbl3.datasource,
(
CASE
WHEN tbl3.datasource ='tbl1'
THEN
tbl1.important_data
ELSE
tbl2.important_data
END
) important_data
FROM
tbl3
LEFT JOIN
tbl2
ON
tbl2.dataid = tbl3.dataid
LEFT JOIN
tbl1
ON
tbl1.dataid = tbl3.dataid
The result of this query contains dataid, datasource and important_data. I can of course repeat the whole CASE block for every single field, but perhaps there is a more civilized method.
Oh, and one more thing: tbl1.dataid and tbl2.dataid can hold the same value (that's why I can't mix the tables).
The best solution is to LEFT JOIN both tables and put the datasource check in each join condition, so that each row of tbl3 matches at most one of them.
COALESCE then gives you the result for each field.
Since this is standard practice for star data models, SQL optimizers tend to make it run quite fast.
SELECT
tbl3.dataid,
tbl3.datasource,
COALESCE(tbl1.important_data, tbl2.important_data) important_data,
COALESCE(tbl1.another_thing, tbl2.another_thing) another_thing,
COALESCE(tbl1.something_completly_different, tbl2.something_completly_different) something_completly_different
FROM tbl3
LEFT JOIN tbl2 ON tbl2.dataid = tbl3.dataid AND tbl3.datasource = 'tbl2'
LEFT JOIN tbl1 ON tbl1.dataid = tbl3.dataid AND tbl3.datasource = 'tbl1'
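To see why the datasource check in the join condition matters when tbl1.dataid and tbl2.dataid collide, here is a small sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, and only two of the three data columns are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl1 (dataid INTEGER, important_data TEXT, another_thing TEXT);
CREATE TABLE tbl2 (dataid INTEGER, important_data TEXT, another_thing TEXT);
CREATE TABLE tbl3 (dataid INTEGER, datasource TEXT);
-- hypothetical rows: the same dataid exists in both source tables
INSERT INTO tbl1 VALUES (1, 'one-from-tbl1', 'a1');
INSERT INTO tbl2 VALUES (1, 'one-from-tbl2', 'a2');
INSERT INTO tbl3 VALUES (1, 'tbl1'), (1, 'tbl2');
""")

# Each tbl3 row can match only the table named in its datasource column,
# so at most one side of every COALESCE is non-NULL.
rows = conn.execute("""
SELECT tbl3.dataid,
       tbl3.datasource,
       COALESCE(tbl1.important_data, tbl2.important_data) AS important_data,
       COALESCE(tbl1.another_thing, tbl2.another_thing) AS another_thing
FROM tbl3
LEFT JOIN tbl2 ON tbl2.dataid = tbl3.dataid AND tbl3.datasource = 'tbl2'
LEFT JOIN tbl1 ON tbl1.dataid = tbl3.dataid AND tbl3.datasource = 'tbl1'
ORDER BY tbl3.datasource
""").fetchall()

for row in rows:
    print(row)
```

Without the datasource conditions, both joins would match dataid 1 and COALESCE would always return the tbl1 value, whichever source the row asked for.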
Say I need to pull data from several tables like so:
item 1 - from table 1
item 2 - from table 1
item 3 - from table 1 - but select only max value of item 3 from table 1
item 4 - from table 2 - but select only max value of item 4 from table 2
My query is pretty simple:
select
a.item 1,
a.item 2,
b.item 3,
c.item 4
from table 1 a
left join (select b.key_item, max(item 3) from table 1, group by key_item) b on a.key_item = b.key_item
left join (select c.key_item, max(item 4) from table 2, group by key_item) c on c.key_item = a.key_item
I am not sure if my method of pulling just a single max item from a table is the most efficient. Assume both tables are over a million rows. My actual SQL runs forever using this setup.
EDIT: I changed the group by clause to reflect comments made. I hope it makes a bit of sense now?
Your best bet is to add an index on table1 and table2, as follows:
ALTER TABLE table1
ADD INDEX `GoodIndexName1` (`key_item`,`item3`);
ALTER TABLE table2
ADD INDEX `GoodIndexName2` (`key_item`,`item4`);
This will allow you to use queries as described in the MySQL documentation for finding the rows holding the group-wise maximum, which appears to be what you are looking for.
Your original (edited) query should work:
select
a.item1,
a.item2,
b.item3,
c.item4
from table1 a
LEFT OUTER JOIN (
SELECT
key_item,
MAX(item3) AS item3
FROM table1
GROUP BY key_item
) b
ON a.key_item = b.key_item
LEFT OUTER JOIN (
SELECT
key_item,
MAX(item4) AS item4
FROM table2
GROUP BY key_item
) c
ON c.key_item = a.key_item
and if that performs slowly after adding the indexes, try the following too:
SELECT
a.item1,
a.item2,
b.item3,
c.item4
FROM table1 a
LEFT OUTER JOIN table1 b
ON b.key_item = a.key_item
LEFT OUTER JOIN table1 larger_b
ON larger_b.key_item = b.key_item
AND larger_b.item3 > b.item3
LEFT OUTER JOIN table2 c
ON c.key_item = a.key_item
LEFT OUTER JOIN table2 larger_c
ON larger_c.key_item = c.key_item
AND larger_c.item4 > c.item4
WHERE larger_b.key_item IS NULL
AND larger_c.key_item IS NULL
(I have modified the table and column names only slightly, so that they conform to correct MySQL syntax.)
I work with queries that use the above structure all the time, and they perform very efficiently with indexes like the one I provided.
That said, usually I am using INNER JOINs on the b and c tables, but I don't see why your query should have any issues.
If you still experience performance problems, report the data types of the key_item columns in each table: joining on columns of different data types generally performs poorly.
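As a quick sanity check that the derived-table form and the self-join form return the same rows, here is a small sketch using Python's built-in sqlite3 module. The sample rows and index names are invented, and SQLite's CREATE INDEX stands in for MySQL's ALTER TABLE ... ADD INDEX; it says nothing about timing on million-row tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (key_item TEXT, item1 TEXT, item2 TEXT, item3 INTEGER);
CREATE TABLE table2 (key_item TEXT, item4 INTEGER);
-- composite indexes on (join key, aggregated column), as suggested above
CREATE INDEX idx_t1_key_item3 ON table1 (key_item, item3);
CREATE INDEX idx_t2_key_item4 ON table2 (key_item, item4);
INSERT INTO table1 VALUES ('k1', 'a', 'x', 5), ('k1', 'b', 'y', 9), ('k2', 'c', 'z', 3);
INSERT INTO table2 VALUES ('k1', 10), ('k1', 40), ('k2', 7);
""")

# Form 1: aggregate each table in a derived table, then join on the key.
derived = conn.execute("""
SELECT a.item1, a.item2, b.item3, c.item4
FROM table1 a
LEFT JOIN (SELECT key_item, MAX(item3) AS item3
           FROM table1 GROUP BY key_item) b ON a.key_item = b.key_item
LEFT JOIN (SELECT key_item, MAX(item4) AS item4
           FROM table2 GROUP BY key_item) c ON c.key_item = a.key_item
ORDER BY a.item1
""").fetchall()

# Form 2: keep only rows for which no larger value exists (anti-join).
anti_join = conn.execute("""
SELECT a.item1, a.item2, b.item3, c.item4
FROM table1 a
LEFT JOIN table1 b ON b.key_item = a.key_item
LEFT JOIN table1 larger_b ON larger_b.key_item = b.key_item
                         AND larger_b.item3 > b.item3
LEFT JOIN table2 c ON c.key_item = a.key_item
LEFT JOIN table2 larger_c ON larger_c.key_item = c.key_item
                         AND larger_c.item4 > c.item4
WHERE larger_b.key_item IS NULL
  AND larger_c.key_item IS NULL
ORDER BY a.item1
""").fetchall()

print(derived)
print(anti_join)
```

One difference worth knowing: if several rows tie for the maximum within a key, the anti-join form returns one output row per tied row, while the GROUP BY form always returns one.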
Here's my problem:
I have 2 big tables; call them A and B.
If I join these 2 tables with a very simple query like this example:
SELECT COUNT(*) FROM lib_judul, lib_buku
then the MySQL process never finishes, and I don't know why. Table A has 158,670 records (33.6 MB) and table B has 130,028 records (34.6 MB). I think my query is right, because I have already tried joining table A with table C (a much smaller table) and it ran well.
What should I do to make this work?
You have an implicit CROSS JOIN in your code, which creates the full Cartesian product of the two tables. It produces an intermediate result with 158,670 times 130,028 rows, which is more than 20 billion (20,631,542,760) records.
This happens because the query has no join condition relating the two tables. Try an explicit join on a common field, for example (assuming both tables share an id column):
SELECT
COUNT(*)
FROM lib_judul A
JOIN lib_buku B ON A.id=B.id
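The row-count arithmetic is easy to confirm on toy data. The sketch below uses Python's built-in sqlite3 module with tiny stand-in tables; the id column and the sample values are assumptions, since the real schema was not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lib_judul (id INTEGER);
CREATE TABLE lib_buku (id INTEGER);
-- hypothetical data: 3 rows and 4 rows
INSERT INTO lib_judul VALUES (1), (2), (3);
INSERT INTO lib_buku VALUES (1), (2), (2), (5);
""")

# A comma join without a condition pairs every row with every row: 3 * 4.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM lib_judul, lib_buku"
).fetchone()[0]

# With a join condition, only matching pairs are counted.
join_count = conn.execute(
    "SELECT COUNT(*) FROM lib_judul A JOIN lib_buku B ON A.id = B.id"
).fetchone()[0]

print(cross_count, join_count)

# The same multiplication at the sizes from the question.
full_size = 158670 * 130028
print(full_size)
```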
The cost of your query is probably too large: 158,670 x 130,028 = 20,631,542,760 row combinations.
The query execution plan performs the join first and only then selects the columns.
Know what you need. You may be able to apply a WHERE condition before the join. For example, this query:
SELECT
COUNT(*)
FROM lib_judul A, lib_buku B
WHERE B.id = 1 AND B.id = A.id
can be rewritten so that the filter is applied before the join:
SELECT COUNT(*) FROM
(SELECT * FROM lib_judul) A
JOIN
(SELECT * FROM lib_buku WHERE lib_buku.id = 1) B
ON B.id = A.id
I have 3 queries, which take data from 3 different tables (with joins), and their column names are pretty much the same (or I made them the same by using the AS keyword). Once the 3 queries are completed, I want to combine their results so that they look like they come from one table. Please have a look at the code below.
1st Query
SELECT Client_Portfolio.*,
Client.Name,
Provider.Name,
"One" AS Income_Type,
One.`One_Gross_Fee` AS "Gross_Fee",
One.`One_V_Fee` AS "V_Fee",
One.`One_E_Fee` AS "E_Fee",
One.`One_I_Fee` AS "I_Fee",
One.`One_Tax_Provision` AS "Tax_Provision",
One.`One_Net_Income` AS "Net_Income",
"N/A" AS VAT,
One.`Updated_Date`
FROM Client_Portfolio
INNER JOIN Portfolio ON Portfolio.`idPortfolio` = Client_Portfolio.`idPortfolio`
INNER JOIN Client ON Client.idClient = Client_Portfolio.idClient
JOIN Provider ON Provider.idProvider = Portfolio.idProvider
INNER JOIN One ON One.idPortfolio = Portfolio.idPortfolio
2nd Query
SELECT Client_Portfolio.*,
Client.Name,
Provider.Name,
"Two" AS Income_Type,
Two.`Two_Gross_Fee` AS "Gross_Fee",
Two.`Two_V_Fee` AS "V_Fee",
Two.`Two_E_Fee` AS "E_Fee",
Two.`Two_I_Fee` AS "I_Fee",
Two.`Two_Tax_Provision` AS "Tax_Provision",
Two.`Two_Net_Income` AS "Net_Income",
Two.`Two_Vat` AS VAT,
Two.`Updated_Date`
FROM Client_Portfolio
INNER JOIN Portfolio ON Portfolio.`idPortfolio` = Client_Portfolio.`idPortfolio`
INNER JOIN Client ON Client.idClient = Client_Portfolio.idClient
JOIN Provider ON Provider.idProvider = Portfolio.idProvider
INNER JOIN Two ON Two.idPortfolio = Portfolio.idPortfolio
3rd Query
SELECT Client_Portfolio.*,
Client.Name,
Provider.Name,
"Three" AS Income_Type,
Three.`Three_Gross_Fee` AS "Gross_Fee",
"N/A" AS "V_Fee",
Three.`Three_E_Fee` AS "E_Fee",
"N/A" AS "I_Fee",
Three.`Three_Tax_Provision` AS "Tax_Provision",
Three.`Three_Net_Income` AS "Net_Income",
Three.`Three_Vat` AS VAT,
Three.`Updated_Date`
FROM Client_Portfolio
INNER JOIN Portfolio ON Portfolio.`idPortfolio` = Client_Portfolio.`idPortfolio`
INNER JOIN Client ON Client.idClient = Client_Portfolio.idClient
JOIN Provider ON Provider.idProvider = Portfolio.idProvider
INNER JOIN Three ON Three.idPortfolio = Portfolio.idPortfolio
Once these queries are done, I want to combine their results, meaning that the rows returned by the 2nd query are appended after the rows returned by the 1st query, and the rows returned by the 3rd query are appended after those of the 2nd. Finally, I want to sort the final result by Updated_Date.
How can I do this?
Use UNION to combine the queries (or UNION ALL to keep duplicate rows, since plain UNION removes them):
SELECT one_fields FROM Client_Portfolio ...
UNION
SELECT two_fields FROM Client_Portfolio ...
UNION
SELECT three_fields FROM Client_Portfolio ...
Sorting can be done by appending an ORDER BY clause after the last query, as follows:
SELECT one_fields FROM Client_Portfolio ...
UNION
SELECT two_fields FROM Client_Portfolio ...
UNION
SELECT three_fields FROM Client_Portfolio ...
ORDER BY field1, field2, field3...;
Note that field1, field2... can be field names or field numbers (starting from 1).
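A cut-down sketch of the same idea, runnable with Python's built-in sqlite3 module: two of the fee tables reduced to two columns each, with invented sample rows and the Client/Portfolio joins left out. It uses UNION ALL so that no rows are dropped as duplicates, and shows that ordering by name and by column position give the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE One (One_Gross_Fee REAL, Updated_Date TEXT);
CREATE TABLE Two (Two_Gross_Fee REAL, Updated_Date TEXT);
-- hypothetical rows
INSERT INTO One VALUES (100.0, '2015-03-01'), (50.0, '2015-01-15');
INSERT INTO Two VALUES (75.0, '2015-02-10');
""")

# The single ORDER BY after the last SELECT sorts the combined result.
by_name = conn.execute("""
SELECT 'One' AS Income_Type, One_Gross_Fee AS Gross_Fee, Updated_Date FROM One
UNION ALL
SELECT 'Two', Two_Gross_Fee, Updated_Date FROM Two
ORDER BY Updated_Date
""").fetchall()

# Same query, ordering by column position (3 = Updated_Date).
by_position = conn.execute("""
SELECT 'One' AS Income_Type, One_Gross_Fee AS Gross_Fee, Updated_Date FROM One
UNION ALL
SELECT 'Two', Two_Gross_Fee, Updated_Date FROM Two
ORDER BY 3
""").fetchall()

print(by_name)
```

The column names of the combined result come from the first SELECT, which is why only that branch needs the AS aliases.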
Sorry if my title is odd, but I'm not even sure how to word my problem in this description let alone a short title.
We have 1000 users. 400 of them are new. 500 of them have updated their profiles with the new fields we added. 100 have not updated their profiles.
When I try to pull data on a specific field I get 900 results.
Select j1.question, j1.response
FROM table1 t1
JOIN table2 j1 on t1.id_user = j1.iduser AND j1.idquestion IN (26)
This is missing the 100 users that haven't updated their profile using the new profile questions.
When I try to pull data on that specific field to include the old profile question that was similar I get 1500 results.
Select j1.question, j1.response
FROM table1 t1
JOIN table2 j1 on t1.id_user = j1.iduser AND (j1.idquestion IN (26) OR j1.idquestion IN (8))
This pulls the 900 results from 26 as well as the results for the original 600 users from 8.
So my question is, how do I only get the data of the idquestion IN (26) and then the 100 left over from idquestion IN (8)?
This will get you the 100 users that were 'missing' in your first query. I am not sure I understand what you want with idquestion 8.
SELECT t1.id_user
FROM table1 t1
LEFT OUTER JOIN table2 j1 on t1.id_user = j1.iduser
AND j1.idquestion IN (26)
WHERE j1.iduser IS NULL
SELECT IF(q26.question IS NOT NULL, q26.question, q8.question) AS question,
IF(q26.response IS NOT NULL, q26.response, q8.response) AS response
FROM table1 t1
LEFT JOIN table2 as q26
ON
(t1.id_user = q26.iduser
AND q26.idquestion = 26)
LEFT JOIN table2 as q8
ON
(t1.id_user = q8.iduser
AND q8.idquestion = 8);
This should work. Starting with table1 and left joining ensures you get one answer for each user. Joining q26 will join the q26 values if they exist and null otherwise. Joining q8 will do the same in additional columns.
You end up with table1 plus some columns that only apply to question 26 (or NULL), followed by columns that only apply to question 8 (or NULL). Then, using IF() in the SELECT list, you can pick the right columns.
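Here is a small runnable sketch of that double left join, using Python's built-in sqlite3 module. The three users and their answers are invented, id_user is added to the output for readability, and COALESCE(x, y) is used in place of IF(x IS NOT NULL, x, y) because it gives the same result and also works outside MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id_user INTEGER);
CREATE TABLE table2 (iduser INTEGER, idquestion INTEGER, question TEXT, response TEXT);
INSERT INTO table1 VALUES (1), (2), (3);
-- hypothetical answers: user 1 is new (26 only), user 2 updated (8 and 26),
-- user 3 never updated (8 only)
INSERT INTO table2 VALUES
    (1, 26, 'New Q', 'n1'),
    (2, 8,  'Old Q', 'o2'),
    (2, 26, 'New Q', 'n2'),
    (3, 8,  'Old Q', 'o3');
""")

# Prefer the question 26 answer and fall back to question 8 when it is missing.
rows = conn.execute("""
SELECT t1.id_user,
       COALESCE(q26.question, q8.question) AS question,
       COALESCE(q26.response, q8.response) AS response
FROM table1 t1
LEFT JOIN table2 AS q26 ON t1.id_user = q26.iduser AND q26.idquestion = 26
LEFT JOIN table2 AS q8  ON t1.id_user = q8.iduser  AND q8.idquestion = 8
ORDER BY t1.id_user
""").fetchall()

for row in rows:
    print(row)
```

Each user appears exactly once here because each has at most one row per idquestion; a user with several answers to the same question would produce several rows.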