when to use join or simple 2 table condition? [duplicate] - mysql

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
SQL left join vs multiple tables on FROM line?
I think this is much clearer:
"SELECT * FROM t1,t2 WHERE t2.foreignID = t1.id "
than a query with a JOIN.
are there any specs on when to use one or another?
Thanks

Personally, I prefer the explicitness of
SELECT t1.*, t2.*
FROM t1
JOIN t2 ON t1.id = t2.foreignID
I like to see exactly what my JOIN conditions are, and the I use WHERE to further filter the results (current year, only a certain user, etc). It shouldn't matter in a simple query like this, but will definitely help with longer, more complex queries.
It doesn't make sense to me to have one style for shorter queries and a different for longer ones. Usually the simple ones turn into complex queries soon enough.

This is a cartesian join which is actually very, very slightly different from an inner join. More than 99.99% of the time, you will get identical results. But there are edge cases that will run much slower, or possibly give extra rows. But you are more likely to run into parsing problems when you add left joins. The MySQL documentation gives a little bit of explanation, but not much. The differences lie in how the query optimiser chooses how to scan the tables.
This format also gets messy when you start adding other conditions in your WHERE clause.

Related

Want to Understand the performance of join in MYSQL

There are different types of join in mysql like below:
1.JOIN
2.LEFT JOIN
3.RIGHT JOIN
4.INNER JOIN
5.LEFT OUTER JOIN
6.RIGHT OUTER JOIN
And i want to know which one perform better in query. And how we decide that this one suitable for this query. As JOIN and INEER JOIN fetch same data.In this case which one suitable.
I will repeat what #Dai said in the comments, joins should be used based on which operation you need, not on performance. The answers to this question cover what the different types of joins are. In particular I like this visual explanation.
Analyzing why a query is slow is usually done with EXPLAIN. It will tell you the plan for the query and you can determine things like if its doing a full table scan and what rows might need to be indexed. Here is a good writeup of how to use an EXPLAIN.

MySQL - SELECT, JOIN

Few months ago I was programming a simple application with som other guy in PHP. There we needed to preform a SELECT from multiple tables based on a userid and another value that you needed to get from the row that was selected by userid.
My first idea was to create multiple SELECTs and parse all the output in the PHP script (with all that mysql_num_rows() and similar functions for checking), but then the guy told me he'll do that. "Okay no problem!" I thought, just much more less for me to write. Well, what a surprise when i found out he did it with just one SQL statement:
SELECT
d.uid AS uid, p.pasmo_cas AS pasmo, d.pasmo AS id_pasmo ...
FROM
table_values AS d, sectors AS p
WHERE
d.userid='$userid' and p.pasmo_id=d.pasmo
ORDER BY
datum DESC, p.pasmo_id DESC
(shortened piece of the statement (...))
Mostly I need to know the differences between this method (is it the right way to do this?) and JOIN - when should I use which one?
Also any references to explanations and examples of these two would come in pretty handy (not from the MySQL ref though - I'm really a novice in this kind of stuff and it's written pretty roughly there.)
, notation was replaced in ANSI-92 standard, and so is in one sense now 20 years out of date.
Also, when doing OUTER JOINs and other more complex queries, the JOIN notation is much more explicit, readable, and (in my opinion) debuggable.
As a general principle, avoid , and use JOIN.
In terms of precedence, a JOIN's ON clause happens before the WHERE clause. This allows things like a LEFT JOIN b ON a.id = b.id WHERE b.id IS NULL to check for cases where there is NOT a matching row in b.
Using , notation is similar to processing the WHERE and ON conditions at the same time.
This definitely looks like the ideal scenario for a join so you can avoid returning more data then you actually need. This: http://www.w3schools.com/sql/sql_join.asp or this: http://en.wikipedia.org/wiki/Join_(SQL) should help you get started with joins. I'm also happy to help you write the statement if you can give me a brief outline of the columns / data in each table (primarily I need two matching columns to join on).
The use of the WHERE clause is a valid approach, but as #Dems noted, has been superseded by the use of the JOINS syntax.
However, I would argue that in some cases, use of the WHERE clauses to achieve joins can be more readable and understandable than using JOINs.
You should make yourself familiar with both methods of joining tables.

Is nested select clause decreases the database performance??

I used to write select clause in side select clause to avoid joins in from clause. But I am afraid that is it a good coading practice or it will degrade database performance. Below is the query which contains multiple tables but I have written it using nested select clause without any join statement. Please let me know if I am making any mistake or it is ok. At this moment, I am getting accurate result.
SELECT * ,
(select POrderNo from PurchaseOrderMST POM
where POM.POrderID=CET.POrderID)as POrderNo,
(select SiteName from SiteTRS ST where ST.SiteID=CET.SiteID)as SiteName,
(select ParticularName from ParticularMST PM where
PM.ParticularID=CET.ParticularID)as ParticulerName
FROM ClaimExpenseTRS CET
WHERE ClaimID=#ClaimID
I'd use joins for this because it is best practice to do so and will be better for the query optimizer.
But for the learning just try to execute the script with join and without and see what happens on the query plan and the execution time. Usually this answers your questions right away.
Your solution is just fine.
As long as you are only using 1 column for each "joined" table, and has no multiple matching rows, it is fine. In some cases, even better than joining.
(the db engine could anytime change the direction of a join, if you are not using tricks to force a given direction, which could cause performance suprises. It is called query optimiyation, but as far as you really know your database, you should be the one to decide how the query should run).
I think you should JOIN indeed.
Now your creating your own JOIN with where and select statements.

What's better: joins or multiple sub-select statements as part of one query

Performance wise, what is better?
If I have 3 or 4 join statements in my query or use embedded select statements to pull the same information from my database as part of one query?
I would say joins are better because:
They are easier to read.
You have more control over whether you want to do an inner, left/right outer join or full outer join
join statements cannot be so easily abused to create query abominations
with joins it is easier for the query optimizer to create a fast query (if the inner select is simple, it might work out the same, but with more complicated stuff joins will work better).
embedded select's can only simulate left/right outer join.
Sometimes you cannot do stuff using joins, in that case (and only then) you'll have to fall back on an inner select.
It rather depends on your database: sizes of tables particularly, but also the memory parameters and sometimes even how the tables are indexed.
On less than current versions of MySQL, there was a real possibility of a query with a sub-select being considerably slower than a query that would return the same results structured with a join. (In the MySQL 4.1 days, I have seen the difference to be greater than an order of magnitude.) As a result, I prefer to build queries with joins.
That said, there are some types of queries that are extremely difficult to build with a join and a sub-select is the only way to really do it.
Assuming the database engine does absolutely no optimization, I would say it depends on how consistent you need your data to be. If you're doing multiple SELECT statements on a busy database, where the data you are looking at may change rapidly, you may run into issues where your data does not match up, between queries.
Assuming your data contains no inter-dependencies, then multiple queries will work fine. However, if your data requires consistency, use a single query.
This viewpoint boils down to keeping your data transactionally safe. Consider the situation where you have to pull a total of all accounts receivable, which is kept in a separate table from the monetary transaction amounts. If someone were to add another transaction in between your two queries, the accounts receivable total would not match the sum of the transaction amounts.
Most databases will optimize both queries below into the same plan, so whether you do:
select A.a1, B.b1 from A left outer join B on A.id = B.a_id
or
select A.a1, (select B.b1 from B where B.a_id = A.id) as b1 from A
It ends up being the same. However, in most cases for non-trivial queries you'd better stick with joins whenever you can, especially since some types of joins (such as an inner join) are not possible to achieve using sub-selects.

MySQL FULLTEXT Search Across >1 Table

As a more general case of this question because I think it may be of interest to more people...What's the best way to perform a fulltext search on two tables? Assume there are three tables, one for programs (with submitter_id) and one each for tags and descriptions with object_id: foreign keys referring to records in programs. We want the submitter_id of programs with certain text in their tags OR descriptions. We have to use MATCH AGAINST for reasons that I won't go into here. Don't get hung up on that aspect.
programs
id
submitter_id
tags_programs
object_id
text
descriptions_programs
object_id
text
The following works and executes in a 20ms or so:
SELECT p.submitter_id
FROM programs p
WHERE p.id IN
(SELECT t.object_id
FROM titles_programs t
WHERE MATCH (t.text) AGAINST ('china')
UNION ALL
SELECT d.object_id
FROM descriptions_programs d
WHERE MATCH (d.text) AGAINST ('china'))
but I tried to rewrite this as a JOIN as follows and it runs for a very long time. I have to kill it after 60 seconds.
SELECT p.id
FROM descriptions_programs d, tags_programs t, programs p
WHERE (d.object_id=p.id AND MATCH (d.text) AGAINST ('china'))
OR (t.object_id=p.id AND MATCH (t.text) AGAINST ('china'))
Just out of curiosity I replaced the OR with AND. That also runs in s few milliseconds, but it's not what I need. What's wrong with the above second query? I can live with the UNION and subselects, but I'd like to understand.
Join after the filters (e.g. join the results), don't try to join and then filter.
The reason is that you lose use of your fulltext index.
Clarification in response to the comment: I'm using the word join generically here, not as JOIN but as a synonym for merge or combine.
I'm essentially saying you should use the first (faster) query, or something like it. The reason it's faster is that each of the subqueries is sufficiently uncluttered that the db can use that table's full text index to do the select very quickly. Joining the two (presumably much smaller) result sets (with UNION) is also fast. This means the whole thing is fast.
The slow version winds up walking through lots of data testing it to see if it's what you want, rather than quickly winnowing the data down and only searching through rows you are likely to actually want.
Just in case you don't know: MySQL has a built in statement called EXPLAIN that can be used to see what's going on under the surface. There's a lot of articles about this, so I won't be going into any detail, but for each table it provides an estimate for the number of rows it will need to process. If you look at the "rows" column in the EXPLAIN result for the second query you'll probably see that the number of rows is quite large, and certainly a lot larger than from the first one.
The net is full of warnings about using subqueries in MySQL, but it turns out that many times the developer is smarter than the MySQL optimizer. Filtering results in some manner before joining can cause major performance boosts in many cases.
If you join both tables you end up having lots of records to inspect. Just as an example, if both tables have 100,000 records, fully joining them give you with 10,000,000,000 records (10 billion!).
If you change the OR by AND, then you allow the engine to filter out all records from table descriptions_programs which doesn't match 'china', and only then joining with titles_programs.
Anyway, that's not what you need, so I'd recommend sticking to the UNION way.
The union is the proper way to go. The join will pull in both full text indexes at once and can multiple the number of checks actually preformed.