Can't understand. Is this a subquery? - mysql

I have something in a query that I have to edit, that I don't understand.
There are 4 tables that are joined: tickets, tasks, tickets_users, users. The whole query is not important, but you have an example at the end of the post. What bugs me is this kind of code used many times in relation to other tables:
(SELECT name
FROM users
WHERE users.id=tickets_users.users_id
) AS RequesterName,
Is this a subquery with the tables users and tickets_users joined? What is this?
WHERE users.id=tickets_users.users_id
If this was a join I would have expected to see:
ON users.id = tickets_users.users_id
And how is this different from a typical join? Couldn't I just select the same column, users.name, and join with the users table?
Can anyone enlighten me on the advanced SQL querying prowess of the original author?
The query looks like this:
SELECT
description,
(SELECT name
FROM users
WHERE users.id = tickets_users.users_id) AS RequesterName,
(SELECT description
FROM tickets
WHERE tickets.id = ticket_tasks.tickets_id) AS TicketDescription,
ticket_tasks.content AS TaskDescription
FROM
ticket_tasks
RIGHT JOIN
tickets ON ticket_tasks.tickets_id = tickets.id
INNER JOIN
tickets_users ON tickets_users.tickets_id = ticket_tasks.tickets_id
Thanks,

This is what is called a correlated subquery. To describe it in simple terms, it's doing a select inside a select.
However, doing this more than once in ANY query is not recommended AT ALL. The performance cost can be huge.
A correlated subquery is evaluated once for each row of the outer select. If that doesn't make sense, then think of it this way:
SELECT
id,
(SELECT MIN(ta.id) FROM tableA AS ta WHERE ta.id > t.id) AS next_id -- aggregated, because a subquery in the select list may return only one value
FROM
tableB AS t;
For each row in tableB, the rows in tableA are read and compared against that row's id.
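To tie this back to the question, here is a minimal sketch of the RequesterName subquery next to the LEFT JOIN the asker expected. It runs on an in-memory SQLite database from Python rather than on MySQL, and the sample rows are made up. The two forms should agree as long as users.id is unique, which is an assumption here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the
# question, the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tickets_users (tickets_id INTEGER, users_id INTEGER);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO tickets_users VALUES (10, 1), (11, 2), (12, 99);  -- 99 has no user
""")

# Correlated scalar subquery: the inner SELECT refers to the outer row.
subquery_rows = conn.execute("""
    SELECT tickets_users.tickets_id,
           (SELECT name FROM users
             WHERE users.id = tickets_users.users_id) AS RequesterName
      FROM tickets_users
     ORDER BY tickets_users.tickets_id
""").fetchall()

# The same result written as a LEFT JOIN (LEFT, so unmatched rows keep a NULL name).
join_rows = conn.execute("""
    SELECT tickets_users.tickets_id, users.name AS RequesterName
      FROM tickets_users
      LEFT JOIN users ON users.id = tickets_users.users_id
     ORDER BY tickets_users.tickets_id
""").fetchall()

print(subquery_rows)
print(subquery_rows == join_rows)
```

Note that the subquery behaves like a LEFT JOIN, not an INNER JOIN: ticket 12 still appears, with a NULL name.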
NOTE:
If you have 100 rows in all 4 tables and you nest a correlated subquery for each one, then in the worst case (no usable indexes) you are doing 100*100*100*100 row comparisons. That's 100,000,000 (one hundred million) comparisons!
A correlated subquery is NOT a join, but rather a subquery. A plain (uncorrelated) subquery looks like this:
SELECT *
FROM
(SELECT id FROM t -- this is a subquery
) AS temp
However, JOINs are different. Generally you can write one in either of these two ways.
This is the preferred way:
SELECT *
FROM t
JOIN t1 ON t1.id = t.id
This is the older, more error-prone way:
SELECT *
FROM t, t1
WHERE t1.id = t.id
Logically, the second form builds the Cartesian product of the two tables and then filters out the extra rows in the WHERE clause, whereas the first form states the join condition as part of the join. MySQL's optimizer will normally execute both the same way, but with the comma form a forgotten WHERE condition silently gives you the full Cartesian product.
There are a few different types of joins, and each is useful in its respective place:
INNER JOIN (same as JOIN)
LEFT JOIN (same as LEFT OUTER JOIN)
RIGHT JOIN (same as RIGHT OUTER JOIN)
In MySQL, FULL JOIN or FULL OUTER JOIN does not exist, so in order to do a FULL join you need to combine a LEFT and a RIGHT join with UNION. See this link for a better understanding of what joins do, with Venn diagrams: LINK
REMEMBER that the link covers SQL in general, so it includes the FULL joins as well. Those don't work in MySQL.
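As a rough sketch of the LEFT plus RIGHT combination, the snippet below emulates a FULL join with UNION. It runs on in-memory SQLite from Python, not on MySQL, and the tables t and t1 and their rows are made up. The second branch is the RIGHT join written as a LEFT join with the tables swapped, because older SQLite versions lack RIGHT JOIN; in MySQL you could write RIGHT JOIN directly.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; t and t1 are hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t  (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO t  VALUES (1, 'a'), (2, 'b');
    INSERT INTO t1 VALUES (2, 'x'), (3, 'y');
""")

# Branch 1: every row of t, matched or not.
# Branch 2: every row of t1, matched or not (a RIGHT JOIN with the tables swapped).
# UNION removes the matched rows that both branches produce.
full_rows = set(conn.execute("""
    SELECT t.id, t.label, t1.id, t1.note
      FROM t LEFT JOIN t1 ON t1.id = t.id
    UNION
    SELECT t.id, t.label, t1.id, t1.note
      FROM t1 LEFT JOIN t ON t.id = t1.id
""").fetchall())

for row in full_rows:
    print(row)
```

One caveat: UNION also collapses rows that are genuine duplicates in the data, so this is only a faithful FULL join when the joined rows are distinct.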

Related

What other approach can I use to select * while inner joining multiple tables?

I want to create a view by selecting all data from multiple tables, but I get an error saying that I have duplicate columns.
CREATE VIEW All_Data AS
SELECT *
FROM table1 tb1
INNER JOIN table2 tb2 ON tb1.ID = tb2.ID
INNER JOIN table3 tb3 ON tb2.ID = tb3.ID
INNER JOIN table4 tb4 ON tb3.ID = tb4.ID
INNER JOIN table5 tb5 ON tb4.ID = tb5.ID
INNER JOIN table6 tb6 ON o.SpecialID = tb6.ID
INNER JOIN table7 tb7 ON tb6.ID = tb7.ID
LEFT JOIN table8 tb8 ON tb7.ID = tb8.ID
However, I am still having the same problem. I want to know whether there is a faster way to do this than aliasing and selecting each column one by one.
DISTINCT won't help, as the error is about duplicate columns, not duplicate rows.
It's the * that causes the columns from all tables, including those with duplicate names, to be returned. You'll need to replace the * with explicit columns and alias them like below if both are needed.
SELECT p.created_date AS product_created_date, `order`.created_date AS order_created_date, ...
Note that a view isn't a required pattern here. Selecting exactly the result you need is normally sufficient. Selects on views can suffer in performance, as they are more complicated for MySQL to optimize. Views are useful, however, if you need an explicit GRANT on them for a specific user.
You need to provide the full list of columns that you want to be part of the view. There is no shortcut.
But yes, to ease your work: if you want a comma-separated list of a table's column names, you can run the following query for each table and paste the columns into your select query. You will then just need to alias the columns that share a name. (The query below uses Oracle's LISTAGG and USER_TAB_COLUMNS; in MySQL, GROUP_CONCAT over information_schema.columns plays the same role.)
Select listagg('pd.'|| column_name, ',')
Within group (order by column_id)
From user_tab_columns
Where table_name = 'your_table_name_in_capital';
Note that you need to replace 'pd.' for each table with the table alias in your query.
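The same idea can be scripted. Below is a minimal sketch that reads each table's column names from the catalog and builds an aliased select list for the view. It uses Python with an in-memory SQLite database, so PRAGMA table_info replaces the catalog query above; the tables product and orders, the helper aliased_columns, and the table-name prefix convention are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the real database; product and orders are
# hypothetical tables that share the column names id and created_date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, created_date TEXT);
    CREATE TABLE orders  (id INTEGER PRIMARY KEY, product_id INTEGER, created_date TEXT);
""")

def aliased_columns(conn, table, alias):
    """Return 'alias.col AS table_col' for every column of a trusted table name."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return [f"{alias}.{col} AS {table}_{col}" for col in cols]

select_list = ", ".join(
    aliased_columns(conn, "product", "p") + aliased_columns(conn, "orders", "o")
)

# Every output column now has a unique name, so the view can be created.
conn.execute(
    f"CREATE VIEW all_data AS SELECT {select_list} "
    "FROM product p JOIN orders o ON o.product_id = p.id"
)

cursor = conn.execute("SELECT * FROM all_data")
view_columns = [d[0] for d in cursor.description]
print(view_columns)
```

Prefixing every column with its table name is a blunt choice; in practice you would probably alias only the columns that actually collide.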

MySQL performance comparison between joining table and derived table

I have the following two queries and noticed they have a huge performance difference.
Query1
SELECT count(distinct b.id) FROM tableA as a
LEFT JOIN tableB as b on a.id = b.aId
GROUP BY a.id
Query2
SELECT count(distinct b.id) FROM tableA as a
LEFT JOIN (SELECT * FROM tableB) as b on a.id = b.aId
GROUP BY a.id
Both queries basically join one table to another, and I noticed that Query1 takes about 80ms whereas Query2 takes about 2sec with thousands of rows in my system. Could anyone explain why this happens? Is it wise to use the Query2 style only when I am forced to? Or is there a better way to do the same thing than Query2?
When you replace tableB with (SELECT * FROM tableB) you are forcing the query engine to materialize a subquery, or intermediate table result (newer MySQL versions can often merge such a simple derived table back into the outer query, but older ones cannot). In other words, in the second query, you aren't actually joining directly to tableB, you are joining to some intermediate table. As a result, any indexes that exist on tableB to make the query faster are not available. Based on your current example, I see no reason to use the second version.
Under certain conditions you might be forced to use the second version though. For example, if you needed to transform tableB in some way, you might need a subquery to do that.
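As an illustration of a transformation that does need a derived table, the sketch below pre-aggregates tableB before joining and checks that it gives the same counts as the direct join from Query1. It runs on in-memory SQLite from Python, not MySQL, so it says nothing about timing; the sample rows and the index name are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows and the index are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, aId INTEGER);
    CREATE INDEX idx_b_aid ON tableB(aId);
    INSERT INTO tableA VALUES (1), (2), (3);
    INSERT INTO tableB VALUES (10, 1), (11, 1), (12, 2);
""")

# Query1 style: join straight to tableB, so its index stays usable.
direct = conn.execute("""
    SELECT a.id, COUNT(DISTINCT b.id)
      FROM tableA AS a
      LEFT JOIN tableB AS b ON a.id = b.aId
     GROUP BY a.id
     ORDER BY a.id
""").fetchall()

# Derived-table style, justified here because tableB is aggregated first.
derived = conn.execute("""
    SELECT a.id, COALESCE(b.n, 0)
      FROM tableA AS a
      LEFT JOIN (SELECT aId, COUNT(*) AS n
                   FROM tableB
                  GROUP BY aId) AS b ON a.id = b.aId
     ORDER BY a.id
""").fetchall()

print(direct)
print(derived)
```

Which form is faster depends on the engine and the data, so compare the EXPLAIN output on your own tables before choosing.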

How to do a join on 2 tables, but only return the data for one table?

I am not sure if this is possible. But is it possible to do a join on 2 tables, but return the data for only one of the tables. I want to join the two tables based on a condition, but I only want the data for one of the tables. Is this possible with SQL, if so how? After reading the docs, it seems that when you do a join you get the data for both tables. Thanks for any help!
You get data from both tables because join is based on "Cartesian Product" + "Selection". But after the join, you can do a "Projection" with desired columns.
SQL has an easy syntax for this:
Select t1.* --taking data just from one table
from one_table t1
inner join other_table t2
on t1.pk = t2.fk
You can choose the table through the alias: t1.* or t2.*. The symbol * means "all fields".
You can also include a where clause, an order by, or other join types like outer join or cross join.
A typical SQL query has multiple clauses.
The SELECT clause mentions the columns you want in your result set.
The FROM clause, which includes JOIN operations, mentions the tables from which you want to retrieve those columns.
The WHERE clause filters the result set.
The ORDER BY clause specifies the order in which the rows in your result set are presented.
There are a few other clauses like GROUP BY and LIMIT. You can read about those.
To do what you ask, select the columns you want, then mention the tables you want. Something like this.
SELECT t1.id, t1.name, t1.address
FROM t1
JOIN t2 ON t2.t1_id = t1.id
This gives you data from t1 for the rows that have a match in t2.
Pro tip: Avoid the use of SELECT *. Instead, mention the columns you want.
This would typically be done using exists (or in, if you prefer):
select t1.*
from table1 t1
where exists (select 1 from table2 t2 where t2.x = t1.y);
Although you can use join, it runs the risk of multiplying the number of rows in the result set -- if there are duplicate matches in table2. There is no danger of such duplicates using exists (or in). I also find the logic to be more natural.
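The row-multiplication risk is easy to see on a tiny example. This is a sketch using Python with in-memory SQLite rather than MySQL; table1, table2 and the columns x and y follow the query above, while the name column and the rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (y INTEGER, name TEXT);
    CREATE TABLE table2 (x INTEGER);
    INSERT INTO table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table2 VALUES (1), (1), (2);  -- the value 1 matches twice
""")

# JOIN: a table1 row is repeated once per matching table2 row.
join_rows = conn.execute("""
    SELECT t1.y, t1.name
      FROM table1 t1
      JOIN table2 t2 ON t2.x = t1.y
     ORDER BY t1.y
""").fetchall()

# EXISTS: a table1 row is returned at most once, however many matches exist.
exists_rows = conn.execute("""
    SELECT t1.y, t1.name
      FROM table1 t1
     WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.x = t1.y)
     ORDER BY t1.y
""").fetchall()

print(join_rows)    # [(1, 'one'), (1, 'one'), (2, 'two')]
print(exists_rows)  # [(1, 'one'), (2, 'two')]
```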
If you join 2 tables, you can use SELECT to pick the data you want.
If you want the data from only one table, select just that table's columns:
SELECT b.title
FROM blog b
JOIN type t ON b.type_id=t.id;
If you want the data from both tables, select columns from both:
SELECT b.title,t.type_name
FROM blog b
JOIN type t ON b.type_id=t.id;

Query performance difference

select * from table t inner join table_3 t3 on (t3.t_id=t.id) where t3.k_id IN(2,3,5);
select * from table t inner join table_3 t3 on (t3.t_id=t.id) where t3.k_id IN(select id from table_2);
How do these two statements differ as performance in big tables? In 2nd statement, is the inner "select" queried again and again or is it queried once only? Thanks
No, these two queries are quite different. Compare the two query execution plans with EXPLAIN; it will show a DEPENDENT SUBQUERY select type for the second query. You can, however, easily optimize your query by turning the dependent subquery into a static subquery. It'll be something like
select *
from table t
inner join table_3 t3 on (t3.t_id=t.id)
where exists (select 1 from table_2 where t3.k_id = table_2.id);
Didn't try this out so please verify that both queries are equivalent.
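One way to do that verification is to run both forms on a small data set and compare the rows. This is a sketch in Python on in-memory SQLite, not MySQL, so it checks results only and not the execution plan. The main table is called t_main here because table is a reserved word; that name and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL. t_main is a hypothetical name for the
# question's "table"; the rows are invented and include a NULL k_id on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_main  (id INTEGER PRIMARY KEY);
    CREATE TABLE table_2 (id INTEGER PRIMARY KEY);
    CREATE TABLE table_3 (t_id INTEGER, k_id INTEGER);
    INSERT INTO t_main  VALUES (1), (2), (3);
    INSERT INTO table_2 VALUES (2), (3), (5);
    INSERT INTO table_3 VALUES (1, 2), (2, 4), (3, 5), (3, NULL);
""")

# Original form: filter with IN (subquery).
in_rows = conn.execute("""
    SELECT t.id, t3.k_id
      FROM t_main t
      JOIN table_3 t3 ON t3.t_id = t.id
     WHERE t3.k_id IN (SELECT id FROM table_2)
     ORDER BY t.id, t3.k_id
""").fetchall()

# Rewritten form: filter with EXISTS.
exists_rows = conn.execute("""
    SELECT t.id, t3.k_id
      FROM t_main t
      JOIN table_3 t3 ON t3.t_id = t.id
     WHERE EXISTS (SELECT 1 FROM table_2 WHERE t3.k_id = table_2.id)
     ORDER BY t.id, t3.k_id
""").fetchall()

print(in_rows)
print(in_rows == exists_rows)
```

Agreement on one data set is evidence, not proof, so it is worth repeating the check against a copy of the real data.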

MYSQL select query based on another tables entries

I am stumped on this, as I am a total beginner in MySQL.
Here is the basic layout of the two tables:
Table 1
id,product_id, product_name
Table 2
id,product_id,active
Now I know how to do a select statement to query the results from one table, but when I have to involve two, I am lost. I'm not sure if I have to use inner join, left join, etc.
So how can I return the product_id from table 1 only if the product is active in table 2?
You could use JOIN (as Fosco pointed out), but you can do the same thing in the WHERE clause. I've found it a slightly more intuitive method than JOIN, especially for someone who's learning SQL. This query joins the two tables on product_id and returns those products that are active. I'm assuming "active" is a boolean type.
SELECT t1.*
FROM Table1 t1, Table2 t2
WHERE t1.product_id = t2.product_id AND t2.active = TRUE
W3Schools has a good basic level tutorial of different kinds of JOINs. See INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN.
It's pretty simple to join two tables:
select t1.*
from Table1 t1
join Table2 t2 on t1.product_id = t2.product_id
where t2.active = 'Y'
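If you want to try both answers side by side, here is a small sketch in Python on in-memory SQLite rather than MySQL. The sample products are made up, and active is stored as 1/0, which is an assumption; adjust the comparison if your column holds 'Y'/'N'.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are invented, active is 1 or 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, product_id INTEGER, product_name TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, product_id INTEGER, active INTEGER);
    INSERT INTO Table1 VALUES (1, 100, 'Widget'), (2, 200, 'Gadget'), (3, 300, 'Gizmo');
    INSERT INTO Table2 VALUES (1, 100, 1), (2, 200, 0);  -- product 300 has no Table2 row
""")

# Join condition written in the WHERE clause.
where_rows = conn.execute("""
    SELECT t1.product_id, t1.product_name
      FROM Table1 t1, Table2 t2
     WHERE t1.product_id = t2.product_id AND t2.active = 1
""").fetchall()

# The same query with an explicit JOIN.
join_rows = conn.execute("""
    SELECT t1.product_id, t1.product_name
      FROM Table1 t1
      JOIN Table2 t2 ON t1.product_id = t2.product_id
     WHERE t2.active = 1
""").fetchall()

print(where_rows)
print(join_rows)
```

Only the Widget comes back from either query: the Gadget is inactive and the Gizmo has no row in Table2 at all.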