I often write a SELECT clause inside the SELECT clause to avoid joins in the FROM clause. But I am not sure whether this is good coding practice or whether it will degrade database performance. Below is a query that involves multiple tables, written with nested SELECT clauses and no JOIN statements. Please let me know whether I am making a mistake or whether this is OK. At the moment I am getting accurate results.
SELECT *,
    (SELECT POrderNo FROM PurchaseOrderMST POM
     WHERE POM.POrderID = CET.POrderID) AS POrderNo,
    (SELECT SiteName FROM SiteTRS ST WHERE ST.SiteID = CET.SiteID) AS SiteName,
    (SELECT ParticularName FROM ParticularMST PM
     WHERE PM.ParticularID = CET.ParticularID) AS ParticulerName
FROM ClaimExpenseTRS CET
WHERE ClaimID = @ClaimID
I'd use joins for this because it is best practice to do so and will be better for the query optimizer.
But for the sake of learning, try executing the script both with and without joins and compare the query plans and the execution times. Usually this answers your question right away.
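As a rough sketch of that comparison, the snippet below runs both forms against an in-memory SQLite database and prints each query plan. The table and column names come from the question; the `ExpenseID` key column, the sample rows, and the use of SQLite (rather than the asker's actual engine) are assumptions made only so the example is self-contained. Plans and timings on a real server will differ.

```python
import sqlite3

# In-memory SQLite database; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PurchaseOrderMST (POrderID INTEGER PRIMARY KEY, POrderNo TEXT);
CREATE TABLE SiteTRS (SiteID INTEGER PRIMARY KEY, SiteName TEXT);
CREATE TABLE ParticularMST (ParticularID INTEGER PRIMARY KEY, ParticularName TEXT);
CREATE TABLE ClaimExpenseTRS (
    ExpenseID INTEGER PRIMARY KEY,  -- hypothetical key column
    ClaimID INTEGER, POrderID INTEGER, SiteID INTEGER, ParticularID INTEGER
);
INSERT INTO PurchaseOrderMST VALUES (1, 'PO-001');
INSERT INTO SiteTRS VALUES (1, 'North site');
INSERT INTO ParticularMST VALUES (1, 'Travel');
INSERT INTO ClaimExpenseTRS VALUES (1, 10, 1, 1, 1), (2, 10, NULL, 1, 1);
""")

# Nested-select form, as in the question.
subquery_sql = """
SELECT CET.ExpenseID,
       (SELECT POrderNo FROM PurchaseOrderMST POM
         WHERE POM.POrderID = CET.POrderID) AS POrderNo,
       (SELECT SiteName FROM SiteTRS ST
         WHERE ST.SiteID = CET.SiteID) AS SiteName,
       (SELECT ParticularName FROM ParticularMST PM
         WHERE PM.ParticularID = CET.ParticularID) AS ParticularName
FROM ClaimExpenseTRS CET
WHERE CET.ClaimID = ?
ORDER BY CET.ExpenseID
"""

# Join form. LEFT JOIN keeps rows without a match, like the nested selects do.
join_sql = """
SELECT CET.ExpenseID, POM.POrderNo, ST.SiteName, PM.ParticularName
FROM ClaimExpenseTRS CET
LEFT JOIN PurchaseOrderMST POM ON POM.POrderID = CET.POrderID
LEFT JOIN SiteTRS ST ON ST.SiteID = CET.SiteID
LEFT JOIN ParticularMST PM ON PM.ParticularID = CET.ParticularID
WHERE CET.ClaimID = ?
ORDER BY CET.ExpenseID
"""

subquery_rows = conn.execute(subquery_sql, (10,)).fetchall()
join_rows = conn.execute(join_sql, (10,)).fetchall()

# Print the plan the engine chose for each form (last column is the description).
for label, sql in (("subquery", subquery_sql), ("join", join_sql)):
    print(label)
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql, (10,)):
        print("   ", row[-1])
```

Note that the join form uses LEFT JOIN, not INNER JOIN: an inner join would drop the expense row that has no purchase order, which the nested-select version keeps.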
Your solution is just fine.
As long as you select only one column from each "joined" table and there are never multiple matching rows, it is fine. In some cases it is even better than joining.
(Unless you use tricks to force a given direction, the database engine may change the direction of a join at any time, which can cause performance surprises. This is called query optimization, but if you really know your database, you should be the one to decide how the query should run.)
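The "no multiple matching rows" condition matters because the two forms stop being equivalent once it is violated. The sketch below uses SQLite with made-up data in which `SiteID` is deliberately not unique. SQLite quietly uses a single row for the nested select, whereas SQL Server and MySQL reject a scalar subquery that returns more than one row, so treat the behaviour shown here as engine-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- SiteID is deliberately NOT unique here (made-up data).
CREATE TABLE SiteTRS (SiteID INTEGER, SiteName TEXT);
CREATE TABLE ClaimExpenseTRS (ClaimID INTEGER, SiteID INTEGER);
INSERT INTO SiteTRS VALUES (1, 'North site'), (1, 'North site (duplicate)');
INSERT INTO ClaimExpenseTRS VALUES (10, 1);
""")

# The join returns one output row per matching SiteTRS row.
join_rows = conn.execute("""
SELECT CET.ClaimID, ST.SiteName
FROM ClaimExpenseTRS CET
LEFT JOIN SiteTRS ST ON ST.SiteID = CET.SiteID
""").fetchall()

# The nested select must yield a single value per expense row.
subquery_rows = conn.execute("""
SELECT CET.ClaimID,
       (SELECT SiteName FROM SiteTRS ST WHERE ST.SiteID = CET.SiteID) AS SiteName
FROM ClaimExpenseTRS CET
""").fetchall()

print(len(join_rows), len(subquery_rows))
```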
I think you should JOIN indeed.
Right now you're creating your own JOIN out of WHERE and SELECT statements.
Sorry about the poorly worded question.. but I don't know how else to explain this...
MySQL... I have a query with several extremely complex subqueries in it. I am selecting from a table and I need to find out what "place" each record is in according to a variety of criteria. So I have this:
Select record.id, record.title,
(select count(*) from (complex-query-that-returns-newer-records)) as agePlace,
(select count(*) from (complex-query-that-returns-records-with-better-ROI)) as ROIPlace...
From record...
Now the issue is that the query is slow, as I had expected given the amount of crunching required. But I realized that there are situations where the results of two subqueries will be the same, and there is no need for me to run the subquery twice (or have it in my code twice). So I would like to wrap one of the subqueries in an IF statement: if the criteria are met, use the value from another column that has already calculated that data; otherwise, run the subquery as normal.
I have tried just putting the other subquery's alias, but it says unknown column totalSales because the field is in the query, not one of the tables.
Is there any way around this?
UPDATE: I have reposted this as a query refactoring question. Thanks for the suggestions. How to refactor select subqueries into joins?
There really isn't a way around this. The SQL engine compiles the query to run the whole query, not just part of it. During compile time, the query engine does not know that the results will be the same.
More likely, you can move the subqueries to the from clause and find optimizations there.
If that is of interest, you should write another question with the actual queries you are using. That is a different question from this one ("how to rephrase this query" rather than "how can I conditionally make this run").
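A minimal sketch of moving a counting subquery into the FROM clause is shown below, using SQLite. The `created` and `roi` columns and the "newer records" rule are hypothetical stand-ins for the elided complex queries. The point is only that a value computed in a derived table gets an alias the outer query can refer to more than once, which a select-list alias does not allow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical columns; the real queries in the question are elided.
CREATE TABLE record (id INTEGER PRIMARY KEY, title TEXT, created TEXT, roi REAL);
INSERT INTO record VALUES
  (1, 'a', '2020-01-01', 0.10),
  (2, 'b', '2021-01-01', 0.30),
  (3, 'c', '2022-01-01', 0.20);
""")

# Correlated subquery in the select list: evaluated once per row of record.
correlated = conn.execute("""
SELECT r.id,
       (SELECT COUNT(*) FROM record n WHERE n.created > r.created) AS agePlace
FROM record r
ORDER BY r.id
""").fetchall()

# Same count computed in a derived table in the FROM clause.
# The outer query can then reuse the agePlace alias as often as it likes.
reused = conn.execute("""
SELECT p.id, p.agePlace, p.agePlace + 1 AS ageRank
FROM (
    SELECT r.id, COUNT(n.id) AS agePlace
    FROM record r
    LEFT JOIN record n ON n.created > r.created
    GROUP BY r.id
) AS p
ORDER BY p.id
""").fetchall()

print(correlated)
print(reused)
```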
I have read that creating a temporary table is best if the number of parameters passed in the IN criteria is large. This is for select queries. Does this hold true for update queries as well? I have an update query which joins 3 tables (inner joins) and passes 1000 parameters in the IN criteria, and this query runs in a loop 200 or more times. What is the best approach to execute this query?
IN operations are usually slow. Passing 1000 parameters to any query sounds awful. If you can avoid that, do it. Now, I'd really have a go with the temp table. You can even play with the indexing of the table. I mean, instead of just putting values in it, play with the indexes that would help you optimize your searches.
On the other hand, inserting into a table with indexes is slower than inserting into one without. Go for an empirical test there. Now, what I think is a must: bear in mind that when using the other table you don't need the IN clause, because you can use the EXISTS clause, which usually results in better performance. For example:
select * from yourTable yt
where exists (
select * from yourTempTable ytt
where yt.id = ytt.id
)
I don't know your query, nor data, but that would give you an idea about how to do it. Note the inner select * is as fast as select aSingleField, as the database engine optimizes it.
Those are all my thoughts. But remember, to be 100% sure of what is best for your problem, there is nothing like performing both tests and timing them :) Hope this helps.
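Applied to an UPDATE, the same idea might look like the sketch below. It uses SQLite, the placeholder table names from the answer, and a made-up `status` column. The ids are loaded into an indexed temporary table once, and the UPDATE then filters with EXISTS instead of a 1000-item IN list. Whether this beats IN on your engine is something to measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourTable (id INTEGER PRIMARY KEY, status TEXT);  -- status is hypothetical
-- The primary key gives the temp table an index on id.
CREATE TEMP TABLE yourTempTable (id INTEGER PRIMARY KEY);
""")

# 2000 made-up rows, of which 1000 (the odd ids) should be updated.
conn.executemany("INSERT INTO yourTable VALUES (?, 'old')",
                 [(i,) for i in range(1, 2001)])
ids_to_update = [(i,) for i in range(1, 2001, 2)]

# Load the ids once instead of passing them as 1000 IN parameters.
conn.executemany("INSERT INTO yourTempTable VALUES (?)", ids_to_update)

cur = conn.execute("""
UPDATE yourTable
SET status = 'new'
WHERE EXISTS (SELECT 1 FROM yourTempTable ytt WHERE ytt.id = yourTable.id)
""")
updated = cur.rowcount  # number of rows the UPDATE changed
conn.commit()
print(updated)
```

If the query runs in a loop, the temp table can be emptied and refilled on each iteration rather than recreated.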
A few months ago I was programming a simple application in PHP with another guy. We needed to perform a SELECT from multiple tables based on a userid and another value that had to be taken from the row selected by userid.
My first idea was to create multiple SELECTs and parse all the output in the PHP script (with mysql_num_rows() and similar functions for checking), but then the guy told me he would do it. "Okay, no problem!" I thought, that is just less for me to write. Well, what a surprise when I found out he did it with just one SQL statement:
SELECT
d.uid AS uid, p.pasmo_cas AS pasmo, d.pasmo AS id_pasmo ...
FROM
table_values AS d, sectors AS p
WHERE
d.userid='$userid' and p.pasmo_id=d.pasmo
ORDER BY
datum DESC, p.pasmo_id DESC
(shortened piece of the statement (...))
Mostly I need to know the differences between this method (is it the right way to do this?) and JOIN - when should I use which one?
Also any references to explanations and examples of these two would come in pretty handy (not from the MySQL ref though - I'm really a novice in this kind of stuff and it's written pretty roughly there.)
The comma (,) notation was superseded by the explicit JOIN syntax in the ANSI-92 standard, so in one sense it is now 20 years out of date.
Also, when doing OUTER JOINs and other more complex queries, the JOIN notation is much more explicit, readable, and (in my opinion) debuggable.
As a general principle, avoid the comma notation and use JOIN.
In terms of precedence, a JOIN's ON clause happens before the WHERE clause. This allows things like a LEFT JOIN b ON a.id = b.id WHERE b.id IS NULL to check for cases where there is NOT a matching row in b.
Using the comma notation is similar to processing the WHERE and ON conditions at the same time.
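To make the comparison concrete, here is a small sketch using SQLite. The table and column names come from the question, but the sample rows are made up and placing `datum` in `table_values` is an assumption. It runs the comma form, the equivalent INNER JOIN form, and the LEFT JOIN ... IS NULL pattern described above. The user id is passed as a bound parameter instead of being interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sectors (pasmo_id INTEGER PRIMARY KEY, pasmo_cas TEXT);
-- Assumption: datum lives in table_values.
CREATE TABLE table_values (uid INTEGER PRIMARY KEY, userid TEXT, pasmo INTEGER, datum TEXT);
INSERT INTO sectors VALUES (1, '08:00'), (2, '09:00');
INSERT INTO table_values VALUES
  (1, 'u1', 1, '2012-01-01'),
  (2, 'u1', 2, '2012-01-02'),
  (3, 'u1', 99, '2012-01-03');  -- pasmo 99 has no matching sector
""")

# Comma notation: the join condition sits in WHERE next to the filter.
comma_rows = conn.execute("""
SELECT d.uid, p.pasmo_cas AS pasmo, d.pasmo AS id_pasmo
FROM table_values AS d, sectors AS p
WHERE d.userid = ? AND p.pasmo_id = d.pasmo
ORDER BY d.datum DESC, p.pasmo_id DESC
""", ("u1",)).fetchall()

# JOIN notation: join condition in ON, filter in WHERE.
join_rows = conn.execute("""
SELECT d.uid, p.pasmo_cas AS pasmo, d.pasmo AS id_pasmo
FROM table_values AS d
INNER JOIN sectors AS p ON p.pasmo_id = d.pasmo
WHERE d.userid = ?
ORDER BY d.datum DESC, p.pasmo_id DESC
""", ("u1",)).fetchall()

# LEFT JOIN plus IS NULL: rows in table_values with no matching sector.
unmatched = conn.execute("""
SELECT d.uid
FROM table_values AS d
LEFT JOIN sectors AS p ON p.pasmo_id = d.pasmo
WHERE d.userid = ? AND p.pasmo_id IS NULL
""", ("u1",)).fetchall()

print(comma_rows == join_rows, unmatched)
```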
This definitely looks like the ideal scenario for a join, so you can avoid returning more data than you actually need. This: http://www.w3schools.com/sql/sql_join.asp or this: http://en.wikipedia.org/wiki/Join_(SQL) should help you get started with joins. I'm also happy to help you write the statement if you can give me a brief outline of the columns / data in each table (primarily I need two matching columns to join on).
Using the WHERE clause is a valid approach, but as @Dems noted, it has been superseded by the JOIN syntax.
However, I would argue that in some cases, use of the WHERE clauses to achieve joins can be more readable and understandable than using JOINs.
You should make yourself familiar with both methods of joining tables.
I am trying to query data from two tables into one table using an OUTER JOIN. The thing is that three fields are needed to uniquely identify the rows. This leads me to a query containing this expression:
FROM Data1 DB
RIGHT OUTER JOIN Data2 FT on (DB.field1 = FT.Value1
and DB.field2 = FT.field2
and DB.field3 = FT.field3)
However, the query runs for pretty much forever. To test the whole thing I used WHERE conditions and FULL OUTER JOIN and in the case of WHERE conditions it is done almost instantly whereas using the FULL OUTER JOIN I had the same trouble and usually ended up cancelling the whole thing after 5 minutes or so.
Can anyone see what I am doing wrong with my query? Thanks for any help!
Do you really need all the records back from the query? Some WHERE criteria could cut execution time down considerably.
Yes, and indexes. Check the plan and create the recommended indexes.
Your best bet is to view the execution plan (and if you are comfortable with it, post a screenshot of it in your question). That'll tell you where the most expensive portion of the query is happening.
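As an illustration of the index advice, the sketch below uses SQLite to create a composite index on the three join columns and then checks whether the plan uses it. The `payload` columns and the index name `idx_data1_key` are made up, and the RIGHT OUTER JOIN is written as the equivalent LEFT OUTER JOIN with the tables swapped because older SQLite versions lack RIGHT JOIN. Your own engine's plan output and index recommendations will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- payload columns are hypothetical; the key columns come from the question.
CREATE TABLE Data1 (field1 INTEGER, field2 INTEGER, field3 INTEGER, payload TEXT);
CREATE TABLE Data2 (Value1 INTEGER, field2 INTEGER, field3 INTEGER, payload TEXT);
-- One composite index covering all three join columns (name is made up).
CREATE INDEX idx_data1_key ON Data1 (field1, field2, field3);
""")

# Data1 RIGHT OUTER JOIN Data2 is the same as Data2 LEFT OUTER JOIN Data1.
join_sql = """
SELECT FT.payload, DB.payload
FROM Data2 FT
LEFT OUTER JOIN Data1 DB
  ON DB.field1 = FT.Value1
 AND DB.field2 = FT.field2
 AND DB.field3 = FT.field3
"""

# The last column of each EXPLAIN QUERY PLAN row describes one step of the plan.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + join_sql)]
for step in plan:
    print(step)
```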
Performance wise, what is better?
If I have 3 or 4 join statements in my query or use embedded select statements to pull the same information from my database as part of one query?
I would say joins are better because:
They are easier to read.
You have more control over whether you want an inner join, a left/right outer join, or a full outer join.
Join statements cannot be so easily abused to create query abominations.
With joins it is easier for the query optimizer to create a fast query (if the inner select is simple, it might work out the same, but with more complicated stuff joins will work better).
Embedded selects can only simulate a left/right outer join.
Sometimes you cannot do stuff using joins, in that case (and only then) you'll have to fall back on an inner select.
It rather depends on your database: sizes of tables particularly, but also the memory parameters and sometimes even how the tables are indexed.
On older versions of MySQL, there was a real possibility of a query with a sub-select being considerably slower than a query structured with a join that returned the same results. (In the MySQL 4.1 days, I saw differences greater than an order of magnitude.) As a result, I prefer to build queries with joins.
That said, there are some types of queries that are extremely difficult to build with a join and a sub-select is the only way to really do it.
Assuming the database engine does absolutely no optimization, I would say it depends on how consistent you need your data to be. If you're doing multiple SELECT statements on a busy database, where the data you are looking at may change rapidly, you may run into issues where your data does not match up, between queries.
Assuming your data contains no inter-dependencies, then multiple queries will work fine. However, if your data requires consistency, use a single query.
This viewpoint boils down to keeping your data transactionally safe. Consider the situation where you have to pull a total of all accounts receivable, which is kept in a separate table from the monetary transaction amounts. If someone were to add another transaction in between your two queries, the accounts receivable total would not match the sum of the transaction amounts.
Many databases will optimize both queries below into the same plan, provided B has at most one matching row for each row of A, so whether you do:
select A.a1, B.b1 from A left outer join B on A.id = B.a_id
or
select A.a1, (select B.b1 from B where B.a_id = A.id) as b1 from A
the result ends up being the same. However, for most non-trivial queries you are better off sticking with joins whenever you can, especially since a sub-select in the select list behaves like a left outer join and so cannot, on its own, drop unmatched rows the way an inner join does.
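A quick way to see both points is the sketch below, which runs the two queries above plus an inner join against made-up rows in SQLite. An ORDER BY is added only to make the output deterministic. The left join and the sub-select return the same rows, including the unmatched row of A, while the inner join drops it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, a1 TEXT);
CREATE TABLE B (a_id INTEGER, b1 TEXT);
INSERT INTO A VALUES (1, 'matched'), (2, 'unmatched');  -- made-up rows
INSERT INTO B VALUES (1, 'b-row');
""")

left_rows = conn.execute("""
SELECT A.a1, B.b1 FROM A LEFT OUTER JOIN B ON A.id = B.a_id ORDER BY A.id
""").fetchall()

sub_rows = conn.execute("""
SELECT A.a1, (SELECT B.b1 FROM B WHERE B.a_id = A.id) AS b1 FROM A ORDER BY A.id
""").fetchall()

# An inner join returns only rows of A that have a match in B.
inner_rows = conn.execute("""
SELECT A.a1, B.b1 FROM A INNER JOIN B ON A.id = B.a_id ORDER BY A.id
""").fetchall()

print(left_rows)
print(sub_rows)
print(inner_rows)
```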