mySQL JOIN usage [duplicate] - mysql

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
INNER JOIN ON vs WHERE clause
I started as web developer about 1 year ago. Since then I've gone through PHP, mySQL, Javascript, jQuery, etc. I think I've learned some things in this time. But when it comes to SQL I'm a real noob. I just do a couple of SELECT, INSERT, UPDATE, and using some functions like SUM or UNIX_TIMESTAMP, etc.
I've come across the JOIN functions, and it seems to be pretty useful, but I don't see the difference of using JOIN or some WHERE clauses to "join" the data betwwen tables.
I read this article (spanish) about join. and I can't really see the usefulness of it. For example:
Assuming this dataset:
id nombre id nombre
-- ---- -- ----
1 Pirata 1 Rutabaga
2 Mico 2 Pirata
3 Ninja 3 Darth Vader
4 Spaghetti 4 Ninja
Wouldn't this query: SELECT * FROM TablaA INNER JOIN TablaB ON TablaA.name = TablaB.name produce the same results as SELECT * FROM TablaA, TablaB WHERE TablaA.name = TablaB.name ?

Yes, they will produce the same result, however, SELECT * FROM TablaA, TablaB WHERE TablaA.name = TablaB.name is an older method of performing a join that does not meet the SQL-92 standard. It is also a lot more difficult to read once your queries start to become very complex. It is better to stick with the explicitly declared INNER JOIN format.

Your example happens to yield the same results, but realize there are other types of joins - LEFT JOIN being the only other one I usually use - and the concept of joining is separate from the concept of filtering the results with the WHERE clause. Sometimes WHERE clauses can get complicated enough without all the join conditions mixed in there as well. Keep your SQL easy to maintain and join in the JOIN clause, not in the WHERE clause.

Related

SQL Return All Rows Where 2 Values Are Repeated

I am sure this question has already been answered, but I can't find it or the answer was too complicated. I am new to SQL and am not sure how to word this generically.
I have a mySQL database of software installed on devices. My query to pull all the data has more fields and more joins, but for brevity I just included a few. I need to add another dimension to create a report that lists every case where a device has more than one installation of software from the same product family.
sample
Right now I have code kind of like this and it is not doing what I need. I have seen some info on exists but the examples didn't account for multiple joins so the syntax escapes me. Help?
select
devices.name,
sw_inventory.product,
products.family_name,
sw_inventory.ignore_usage,
from sw_inventory
inner join products
on sw_inventory.product=products.product_name
inner join devices
on sw_inventory.device_name=devices.name
where sw_inventory.ignore=0
group by devices.name, products.family_name
There are plenty of answers out there on this topic but I definitely understand not always knowing terminology. you are looking for how to find duplicates values.
Basically this is a two step process. 1 find the duplicates 2 relate that back to the original records if you want those. Note the second part is optional.
So to literally find all of the duplicates of the query you provided
ADD HAVING COUNT(*) > 1 after group by statements. If you want to know how many duplicates add a calculated column to count them.
select
devices.name,
sw_inventory.product,
products.family_name,
sw_inventory.ignore_usage,
NumberOfDuplicates = COUNT(*)
from sw_inventory
inner join products
on sw_inventory.product=products.product_name
inner join devices
on sw_inventory.device_name=devices.name
where sw_inventory.ignore=0
group by devices.name, products.family_name
HAVING COUNT(*) > 1

Is this a MySQL JOIN? [duplicate]

This question already has answers here:
Explicit vs implicit SQL joins
(12 answers)
Closed 8 years ago.
SELECT
dim_date.date, dim_locations.city, fact_numbers.metric
FROM
dim_date, fact_numbers
WHERE
dim_date.id = fact_numbers.fk_dim_date_id
AND
dim_locations.city = "Toronto"
AND
dim_date.date = 2010-04-13;
Since I'm not using the JOIN command, I'm wondering if this is indeed a JOIN (and if not, what to call it)?
This is for a dimensional model by the way which is using surrogate and primary keys to match up details
Yes, it is joining the tables together so it will provide related information from all of the tables.
The one major downfall of linking tables via WHERE instead of JOIN is not being able to use LEFT, RIGHT, and Full to show records from one table even if missing in the other.
However, your SQL statement is invalid. You are not linking dim_locations to either of the other tables and it is missing in the FROM clause.
Your query using an INNER JOIN which is comparable to your WHERE clause may look something like the following:
SELECT DD.date, DL.city, FN.metric
FROM dim_date AS DD
JOIN fact_numbers AS FN ON DD.id = FN.fk_dim_date_id
JOIN dim_locations AS DL ON DL.id = FN.fk_dim_locations_id
WHERE DL.city = 'Toronto'
AND DD.date = 2010-04-13
yes, it is joining tables. It is called non-ANSI JOIN syntax when join clause is not used explicitly. And when join clause is used, it is ANSI JOIN
If you reference two different tables in a where clause and compare their referential IDs, then yes, it acts the same as a JOIN.
Note however that this can be very inefficient if the optimizer doesn't optimize it properly see: Is there something wrong with joins that don't use the JOIN keyword in SQL or MySQL? and INNER JOIN keywords | with and without using them

Select taking too long. Need advice for a better performance

Ok, here we go. There's this messy SELECT crossing other tables and ordering to get the one desired row. Basically I do the "math" inside the ORDER BY.
1 base table.
7 JOINS poiting to local tables.
WHERE with 2 clauses and a NOT IN crossing another table.
You'll see in the code the ORDER BY is pretty damn big/ugly, it sums the result of 5 different calculations. I need that result to order by those calculations in order to get the worst row-case.
The problem is once I execute the Stored Procedure it takes up to 8 seconds to run. That's kind of non-acceptable. So, I'm starting to check Indexes.
So, I'm looking for advices on how to make this query run faster.
I'm indexing the WHERE clauses and the field LINEA, Should I index something else? Like the rows Im crossing for the JOINs? or should I approach the query differently?
Query:
SET #LINEA = (
SELECT TOP 1
BOA.LIN
FROM
BAND_BA BOA
LEFT JOIN
TEL PAR
ON REPLACE(BOA.Lin,'-','') = SUBSTRING(PAR.Te,2,10)
LEFT JOIN
TELP CLP
ON REPLACE(BOA.Lin,'-','') = SUBSTRING(CLP.Numtel,2,10)
LEFT JOIN
CA C
ON REPLACE(BOA.Lin,'-','') = C.An
LEFT JOIN
RE R
ON REPLACE(BOA.Lin,'-','') = R.Lin
LEFT JOIN
PRODUCTOS2 P2
ON BOA.PRODUCTO = P2.codigo
LEFT JOIN
EN
ON REPLACE(BOA.Lin,'-','') = EN.G
LEFT JOIN
TIP ID
ON TIPID = ID.ID
WHERE
BOA.EST = 'C' AND
ID.SE = 'boA' AND
BOA.LIN NOT IN (
SELECT
LIN
FROM
BAN
)
ORDER BY (EN.VALUE + ANT.VALUE + REIT.VAL + C.VALUE + TEL.VALUE
) DESC,
I'll be frank, this is some pretty terrible SQL. Without seeing all your table structures, advice here will be incomplete. That being said, please don't post all your table structures because you are already very close to "hire a consultant" territory with this.
All the REPLACE logic should be done away with. If you need to JOIN on these fields, then add comparable fields to the tables so you don't need to manipulate the data. Every single JOIN that uses a REPLACE or SUBSTRING is a table or index scan - those are non-SARGable and a definite anti-pattern.
The ORDER BY is probably the most convoluted ORDER BY I have ever seen. Some major issues there:
Subqueries should all be eliminated and materialized either in the outer query or as variables
String manipulation should be eliminated (see item 1 above)
The entire query is basically a code smell. If you need to write code like this to meet business requirements then you either have a terribly inappropriate design or some other much larger issue in the organization or data.
One thing that can kill performance is using a lot of LEFT JOINs. To improve performance of LEFT JOIN, you might want to make sure that the column(s) to which you join have an index - that can have a huge impact on performance.

what is difference between 'where' and different 'joins' in mysql? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
In MySQL queries, why use join instead of where?
Joins are always confusing for me..can anyone tell me what is difference between different joins and which one is quickest and when to use which join and what is difference between join and where clause ?? Plz give ur answer in detail as i already read about joins on some websites but didn't get the concept properly.
Rather than quote an entire Wikipedia article, I'm going to suggest your their article on SQL Joins.
An SQL JOIN clause combines records
from two or more tables in a
database. It creates a set that
can be saved as a table or used as is.
A JOIN is a means for combining
fields from two tables by using values
common to each. ANSI standard SQL
specifies four types of JOINs: INNER,
OUTER, LEFT, and RIGHT. In special
cases, a table (base table, view, or
joined table) can JOIN to itself in a
self-join.
A programmer writes a JOIN predicate
to identify the records for joining.
If the evaluated predicate is true,
the combined record is then produced
in the expected format, a record set
or a temporary table, for example.
There is an older syntax that uses WHERE clauses to imply INNER JOINs. While it works, and will produce queries that run exactly like queries specified with the INNER JOIN syntax, it is deprecated because most people find it more confusing.
Here's the documentation for the MySQL JOIN syntax.
This query:
SELECT c.customer_name, o.order_id, o.total
FROM customers c, orders o
WHERE c.id = o.customer_id
and this
SELECT c.customer_name, o.order_id, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
make no difference on what is executed on the mysql server but the join is (imho) way more readable. Expecially for more complex joins:
Here is an article about the topic: http://www.mysqlperformanceblog.com/2010/04/14/is-there-a-performance-difference-between-join-and-where/

Whats the advantage of using INNER JOIN? [duplicate]

This question already has answers here:
INNER JOIN ON vs WHERE clause
(12 answers)
Closed 7 years ago.
I see that this example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID;
produces the same output as this example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders, Customers
WHERE Orders.CustomerID=Customers.CustomerID;
Is there any advantage in the INNER JOIN command?
Your former example is standard ANSI/ISO SQL. Your latter example is deprecated SQL. With the former syntax (ANSI/ISO) , the join criteria are clearly delineated; with the latter (deprecated) syntax, the join criteria are intermixed with the where clause filters making things much more difficult to sort out.
It doesn't matter so much for simply queries like yours, but when you have to deal with much more complicated queries (try 12 or 15 tables with a mixture of left, right and inner joins!), you will understand why the old syntax is deprecated.
INNER JOIN is for clarity. Using WHERE as in your second example can get lost in more complex queries, making it difficult to determine how your tables are joined and make it more difficult to troubleshoot problems.