SQL Optimization - Join different tables based on column value - mysql

I have a table that contains a column that acts as a "flag" which is used to decide which table to pull additional information from (i.e. the value 1 pulls from table1, 2 from table2, etc). Usually I would just join the table using indexes/keys. However, the table that I could join contained information that could be normalized into separate tables, which leads me to this situation of using a column to decide which table to join.
So here is my question: what is the most efficient way to join different tables based on the value in this column?
Here are the two ways I currently know to accomplish this task. I am pretty sure neither is the optimal solution:
1. Pull the information from my main table (containing the column value that decides which table to join), and then through code in my application send additional queries to get the rest of the information.
2. Go join crazy, return the columns of every table (even if unused). Then, through my code, ignore the nulls of the tables not needed.

Definitely not option 2. If you don't need the data, don't retrieve it. Simple. It would be incredibly inefficient to join on tables (especially large ones) when you don't need the data. You could go with option 1 or use dynamic SQL to build up the query. I would then put some test cases together and look at the execution plan to see how your query is performing.

Depending on the content of the other tables, I'd suggest a UNION - the columns returned need to be the same from each query. So you can do something like:
SELECT table1.title, table2.text FROM table1 INNER JOIN table2 ON table1.id=table2.id WHERE table1.key='2'
UNION
SELECT table1.title, table3.text FROM table1 INNER JOIN table3 ON table1.id=table3.id WHERE table1.key='3'
(Tweak the SQL so that it matches your schema, and to fix any mistakes I've introduced.)
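A minimal runnable sketch of the UNION idea above, using Python's built-in sqlite3 module as a stand-in for MySQL. The column name `flag` (in place of `key`), the column aliases, and the sample rows are assumptions made up for illustration. It uses UNION ALL rather than UNION: each branch filters on a different flag value, so the branches cannot overlap and the duplicate-removal step is not needed. Keep plain UNION if you also rely on it to collapse duplicate rows within a branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: table1.flag says which table holds the extra text.
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT, flag INTEGER);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, text TEXT);
    INSERT INTO table1 VALUES (1, 'first', 2), (2, 'second', 3);
    INSERT INTO table2 VALUES (1, 'from table2'), (2, 'unused');
    INSERT INTO table3 VALUES (1, 'unused'), (2, 'from table3');
""")

# One branch per flag value; each branch joins only the table it needs.
rows = conn.execute("""
    SELECT t1.title AS title, t2.text AS text
    FROM table1 t1 JOIN table2 t2 ON t2.id = t1.id
    WHERE t1.flag = 2
    UNION ALL
    SELECT t1.title AS title, t3.text AS text
    FROM table1 t1 JOIN table3 t3 ON t3.id = t1.id
    WHERE t1.flag = 3
    ORDER BY 1
""").fetchall()

print(rows)
```

Each row picks up its text only from the table its flag points at; the rows marked 'unused' never appear in the result.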

I think it is possible:
create table a (id integer, flag boolean);
create table b (id integer, value_b varchar(30));
create table c (id integer, value_c varchar(30));
insert into a values (1, true), (2, false);
insert into b values (1, 'Val 1'), (2, 'Val 2');
insert into c values (1, 'C 1'), (2, 'C 2');
select a.id,
case when a.flag then b.value_b else c.value_c end AS value
from a
left join b using (id)
left join c using (id);
You can try it out.
Of course there are limitations:
number of columns is fixed, so you should go for NULLs if some values should be omitted;
you'll have to write a CASE ... END for each column;
you should know all joined tables in advance;
performance might not be the best.
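A variation on the query above, offered as a sketch rather than a benchmarked improvement: the flag test can go into each join condition, so a row from `b` or `c` is only matched when the flag selects that table. The example runs on Python's built-in sqlite3 module as a stand-in for MySQL, and the flag is stored as 1/0 instead of a boolean (an assumption for portability). Note that COALESCE is not strictly equivalent to the CASE version: if `value_b` is NULL for a flagged row, COALESCE falls through to `c.value_c`, which is also NULL here because that join did not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, flag INTEGER);   -- 1 = use b, 0 = use c
    CREATE TABLE b (id INTEGER, value_b TEXT);
    CREATE TABLE c (id INTEGER, value_c TEXT);
    INSERT INTO a VALUES (1, 1), (2, 0);
    INSERT INTO b VALUES (1, 'Val 1'), (2, 'Val 2');
    INSERT INTO c VALUES (1, 'C 1'), (2, 'C 2');
""")

# The flag is part of each ON clause, so at most one of the two
# joined tables contributes a non-NULL value for any row of a.
result = conn.execute("""
    SELECT a.id, COALESCE(b.value_b, c.value_c) AS value
    FROM a
    LEFT JOIN b ON b.id = a.id AND a.flag = 1
    LEFT JOIN c ON c.id = a.id AND a.flag = 0
    ORDER BY a.id
""").fetchall()

print(result)
```

Whether this beats the CASE form depends on the optimizer and the indexes, so compare execution plans on real data before choosing.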

Related

Does select * always return the same columns order?

E.g., table t1 has columns c1 & c2, is there any chance that select * from t1 returns tuples (c2, c1) instead of (c1, c2)?
And is there any chance that select * from (select * from t1) returns tuples (c2, c1) instead of (c1, c2)?
It will always return the columns in the same order unless you drop a column and re-add it. Then it will appear at the end. To be safe I would list the columns rather than using select *.
In a simple case, selecting from one table, it should be consistent.
When the from includes multiple tables, it can be more complex.
In the following "NATURAL JOIN" example from Oracle, the first column(s) are the columns common to both tables, followed by the columns from the first table specified in the join (other than the common ones), and then those from the second table.
You can get into more complex situations where there are multiple common columns in the tables, in different order in each table, and with more than two tables used in the source....
create table tab_a
(id number(2,0) primary key,
value_a varchar2(20));
create table tab_b
(value_b varchar2(20),
id number(2,0) primary key
);
insert into tab_a values (10,'blue');
insert into tab_a values (20,'red');
insert into tab_b values ('square',10);
insert into tab_b values ('oval',20);
select * from tab_b natural join tab_a
ID VALUE_B VALUE_A
10.00 square blue
20.00 oval red
PS. Edit to add - Natural join syntax is a stupid idea, and shouldn't be used in practice. This is one reason why. More generally, if audit style columns get added to tables, such as CREATED_BY, it confuses the heck out of the SQL.

SQL split column by comma in where clause

I am trying to display records that have multiple categories, but my query appears to show only the first category. I need the query to display the domain once for each category it appears in.
The SQL statement I have is
SELECT domains.*, category.*
FROM domains,category
WHERE category.id IN (domains.category_id)
Which gives me the below results
You should not store numeric values in a string. Bad, bad idea. You should use a proper junction table and the right SQL constructs.
Sometimes, we are stuck with other people's bad design decisions. MySQL offers find_in_set() to help in this situation:
where find_in_set(category.id, domains.category_id) > 0
Use find_in_set().
SELECT domains.*, category.*
FROM domains,category
WHERE find_in_set (category.id ,domains.category_id)
But it is very bad db design to store fk as a csv.
As others have pointed out, you can use FIND_IN_SET() as a workaround to solve your problem.
My suggestion is that you refactor your code and database a bit. Storing values in CSV format in a single column is almost always a bad idea.
As Gordon Linoff correctly points out you'd be better off if you'd create an additional table to store the category_id values:
CREATE TABLE domain_categories (domain_id INT, category_id INT, PRIMARY KEY (domain_id, category_id));
That's assuming you want to enforce that each category is only stored once against each domain. If you don't, just drop the primary key.
You'd then insert your IDs into this new table:
INSERT INTO domain_categories (domain_id, category_id) VALUES (2,6),(2,8);
or
INSERT INTO domain_categories (domain_id, category_id) VALUES (4,3);
INSERT INTO domain_categories (domain_id, category_id) VALUES (4,11);
INSERT INTO domain_categories (domain_id, category_id) VALUES (20,3);
Now that you have properly stored the data you can easily query as needed:
SELECT domains.*, category.*
FROM domains
JOIN domain_categories ON (domain_categories.domain_id=domains.id)
JOIN category ON (category.id=domain_categories.category_id);
Using MySQL quirks you can even show the category_id column in CSV format.
SELECT domains.*, GROUP_CONCAT(DISTINCT domain_categories.category_id)
FROM domains
JOIN domain_categories ON (domain_categories.domain_id=domains.id)
GROUP BY domains.id;

MySQL - Taking DISTINCT values from table A and turning into PRIMARY KEY on table B

Basically trying to fill a table with product codes from another table. The codes can be repeated in the Products table, as it has data about product sizes, however this value needs to be unique in ProductDescriptions.
I need there to be a new entry into the ProductDescriptions table every time there is a new DISTINCT product code in the Products table.
I've tried all sorts of things with no success. It doesn't seem that complex, but it has baffled me.
I have tried this:
INSERT INTO ProductDescriptionsbackup(ProductID) SELECT DISTINCT `PRODUCT CODE` FROM Products
However it doesn't run as it can't just take the DISTINCT values, and in the long run it doesn't update automatically, however that would be a secondary issue.
The main issue is getting the DISTINCT values into the ProductDescriptions table.
I've had a look around for answers but there's been nothing that stood out and made sense to me. Thanks in advance for any help received.
If you are only trying to insert new values, this should work. You could make a left join to the target table and only include records that don't exist (WHERE pbu.ProductID IS NULL).
INSERT INTO ProductDescriptionsbackup(ProductID)
SELECT DISTINCT p.`PRODUCT CODE`
FROM Products as p
LEFT JOIN ProductDescriptionsbackup as pbu
ON p.`PRODUCT CODE` = pbu.ProductID
WHERE pbu.ProductID IS NULL;
Let's break your question into two parts. The first is how to guarantee that ProductDescriptions is always updated when you insert a new row into Products; the second is how to insert the new values.
For the first part there are two options: either your code is responsible for explicitly inserting the product description every time and everywhere it inserts into the Products table, or you create an after-insert trigger to do that:
create trigger ai_eav
after insert on eav
for each row
(...)
I recommend this second option so you won't forget to insert, but there are some folks that don't like triggers since they can become "invisible" to the programmer as they are usually forgotten.
For the second part, you can do an insert-if-not-exists, which can be achieved with an insert that uses a left join:
insert into ProductDescriptions(ProductID)
select distinct p.`PRODUCT CODE`
from Products p
left join ProductDescriptions pd on p.`PRODUCT CODE` = pd.ProductID
where pd.ProductID is null
If you opt for the trigger, you can even take advantage of the pseudo-row NEW and make the insert faster, since you'll be working with a single row instead of the whole Products table.
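To illustrate the trigger route described above, here is a minimal sketch run on Python's built-in sqlite3 module as a stand-in for MySQL. It is not the elided trigger body from the answer: the trigger name `ai_products`, the `Size` column, and the sample rows are assumptions, and it presumes ProductID is the primary key of ProductDescriptions. SQLite spells the duplicate-skipping insert `INSERT OR IGNORE`; MySQL spells it `INSERT IGNORE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema; only the product code column comes from the question.
    CREATE TABLE Products ("PRODUCT CODE" TEXT, Size TEXT);
    CREATE TABLE ProductDescriptions (ProductID TEXT PRIMARY KEY);

    -- After each inserted product row, add its code to ProductDescriptions
    -- unless that code is already present (the primary key rejects repeats).
    CREATE TRIGGER ai_products AFTER INSERT ON Products
    FOR EACH ROW
    BEGIN
        INSERT OR IGNORE INTO ProductDescriptions (ProductID)
        VALUES (NEW."PRODUCT CODE");
    END;

    INSERT INTO Products VALUES ('A1', 'S'), ('A1', 'M'), ('B2', 'L');
""")

codes = [row[0] for row in conn.execute(
    "SELECT ProductID FROM ProductDescriptions ORDER BY ProductID")]
print(codes)
```

The trigger only covers rows inserted after it exists, so the left-join insert is still needed once to backfill codes already in Products.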

Inserting values into a third table by CROSS JOINing two tables

I want to insert values into a Table_C by cross joining two tables Table_A and Table_B.
Table_A contains two attributes ID and item.
Table_B contains two attributes ID and color.
Table_C contains four attributes ID,item,color,quantity.
All IDs have AUTO INCREMENT.
Suppose each item can come in every color and I need to create a relation for that.
How should I write a query for this? How could I reference a relation cross joining item and color?
My current solution is to create an intermediate third table joining these two tables and then use that table to insert values into Table_C. But I am pretty sure that there is a better optimized solution for this.
Thanks in advance.
No need for a temp table... You can do:
insert into ...
select ... from ...
Write the query you'd need to "fill" that temp table you mention, and insert the rows directly into your final table.
Here is the query which worked for me.
INSERT INTO Table_C (SELECT null, Table_A.item, Table_B.color, null FROM
Table_A CROSS JOIN Table_B);

MySQL creating a table which automatically updates with data from other tables

I have two tables:
Table1 - Sales
id, Timestamp, Notes, CSR, Date, ExistingCustomer, PubCode, Price, Pub, Expiry, Size, KeyCode, UpSell, CustName, CustAddress, CustCity, CustState, CustPostal, CustCountry, CustPhone, CustEmail, CustCardName, CustCardNumber, CustCardType, CustCardExpiry, CustCardCode
Table2 - Refunds
id,Timestamp,CSR,Date,OrderId,Refunded,Saved,Publication
Basically, I want to create a table (MySQL) which will have some columns that are common to the two tables and which will update automatically with the values from these two tables.
ie. Table3
Timestamp, CSR, Date, Publication
And this table would automatically update whenever a new record is posted into either of the other two tables, so it would essentially be a merged table.
Because there's nothing to join these two tables, I don't think the JOIN function would work here. Is there any way I can do this?
You can use a trigger which activates on insert on both tables to make it update automatically.
As for combining tables with no common columns, view this question.
You need to use a stored procedure and a trigger on insert/update of the non-merged tables.
There's got to be some way to join it, and in fact you mention Timestamp, CSR, Date, Publication.
You could join on them in a view. You could add table three and then add triggers though that would be an awful mess.
Why do you want to denormalise in this way?
How about making Table3 hold a unique key to use as a surrogate plus your 4 join fields, then taking those fields out of Table 1 and 2 and replacing them with the surrogate key from Table 3?
Then it's a simple join query with no data duplication.
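One way to read the view suggestion above, sketched with Python's built-in sqlite3 module as a stand-in for MySQL: instead of joining, a view can stack the shared columns of both tables with UNION ALL, which is a different technique from the join the answer mentions. The view always reflects the current contents of Sales and Refunds, so no triggers are needed. The tables are trimmed to the shared columns, the sample rows are invented, and treating `Pub` in Sales as the counterpart of `Publication` in Refunds is an assumption about the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed-down versions of the two tables from the question.
    CREATE TABLE Sales (id INTEGER PRIMARY KEY, Timestamp TEXT, CSR TEXT,
                        Date TEXT, Pub TEXT);
    CREATE TABLE Refunds (id INTEGER PRIMARY KEY, Timestamp TEXT, CSR TEXT,
                          Date TEXT, Publication TEXT);

    -- The "merged table" is a view, so it is never out of date.
    CREATE VIEW Table3 AS
        SELECT Timestamp, CSR, Date, Pub AS Publication FROM Sales
        UNION ALL
        SELECT Timestamp, CSR, Date, Publication FROM Refunds;

    -- Rows added after the view was created still show up in it.
    INSERT INTO Sales (Timestamp, CSR, Date, Pub)
        VALUES ('2013-01-01 09:00', 'ann', '2013-01-01', 'Daily');
    INSERT INTO Refunds (Timestamp, CSR, Date, Publication)
        VALUES ('2013-01-02 10:00', 'bob', '2013-01-02', 'Weekly');
""")

merged = conn.execute(
    "SELECT CSR, Publication FROM Table3 ORDER BY Timestamp").fetchall()
print(merged)
```

If the combined data has to be a real table (for indexing, say), the trigger approach from the other answers still applies; the view is simply the option with no copying.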