With a single query, I need to get a list of ALL objects A and an additional column returning "1" when there is an association to an Object C in Table B, or returning "0" when there is no associaton to an Object C in Table B.
Table A holds all objects A
Table B holds all objects A associated to another object C.
I know the ID of object C.
Currently I am using a Query with LEFT JOIN and two conditions with AND in the JOIN.
For the return value column I am using "(TableB.id IS NOT NULL) as associated".
Table A will likely hold only between dozens up to a hundred records.
Table B will likely hold between thousands to hundreds of thousands records.
Table C will likely hold between thousands to hundreds of thousands records.
TableA.id is index
TableB.tablea_id is index
TableB.id is index
TableB.tablec_id is index
TableC.id is index
My query currently looks like this:
SELECT TableA.name, TableA.code, (TableB.id IS NOT NULL) AS associated
FROM TableA
LEFT JOIN TableB ON TableA.id=TableB.tablea_id AND TableB.tablec_id = $input
Is there any performance concern about the method I use for the SQL query or a better way of achieving the desired result ?
Your query looks fine as far as I can see.
However, you should keep in mind that conditions on join should be for matching foreign keys. Following this practice at least makes your queries more readable and maintainable.
Related
I have a query to get data of friends of user. I have 3 tables, one is user table, second is a user_friend table which has user_id and friend_id (both are foreign key to user table) and 3rd table is feed table which has user_id and feed content. Feed can be shown to friends. I can query in two ways either by join or by using IN clause (I can get all the friends' ids by graph database which I am using for networking).
Here are two queries:
SELECT
a.*
FROM feed a
INNER JOIN user_friend b ON a.user_id = b.friend_id
WHERE b.user_id = 1;
In this query I get friend ids from graph database and will pass to this query:
SELECT
a.*
FROM feed a
WHERE a.user_id IN (2,3,4,5)
Which query runs faster and good for performance when I have millions of records?
With suitable indexes, a one-query JOIN (Choice 1) will almost always run faster than a 2-query (Choice 2) algorithm.
To optimize Choice 1, b needs this composite index: INDEX(user_id, friend_id). Also, a needs an index (presumably the PRIMARY KEY?) starting with user_id.
This depends on your desired result when you have a compared big data in your subquery their always a join is much preferred for such conditions. Because subqueries can be slower than LEFT [OUTER] JOINS / INNER JOIN [LEft JOIN is faster than INNER JOIN], but in my opinion, their strength is slightly higher readability.
So if your data have fewer data to compare then why you chose a complete table join so that depends on how much data you have.
In my opinion, if you have a less number of compared data in IN than it's good but if you have a subquery or big data then you must go for a join...
I'm trying to see if my understanding of JOINs is correct.
For the following query:
SELECT * FROM tableA
join tableB on tableA.someId = tableB.someId
join tableC on tableA.someId = tableC.someId;
Does the RDMS basically execute similar pseudocode as follows:
List tempResults
for each A_record in tableA
for each B_record in tableB
if (A_record.someId = B_record.someId)
tempResults.add(A_record)
List results
for each Temp_Record in tempResults
for each C_record in tableC
if (Temp_record.someId = C_record.someId)
results.add(C_record)
return results;
So basically the more records with the same someId tableA has with tableB and tableC, the more records the RDMS have the scan? If all 3 tables have records with same someId, then essentially a full table scan is done on all 3 tables?
Is my understanding correct?
Each vendor's query processor is of course written (coded) slightly differently, but they probably share many common techniques. Implementing a join can be done in a variety of ways, and which one is chosen, in any vendor's implementation, will be dependent on the specific situation, but factors that will be considered include whether the data is already sorted by the join attribute, the relative number of records in each table (a join between 20 records in one set of data with a million records in the other will be done differently than one where each set of records is of comparable size). I do not know the internals for MySQL, but for SQL server, there are three different join techniques, a Merge Join, a Loop Join, and a Hash Join. Take a look at this.
i have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id", (26 orders)
I need to display "Cust_id" "first_name" "last_name" "order_id"
to where i need count of order_id group by cust_id like list total number of orders placed by each customer.
I am running below query, however, it is counting all the 26 orders and applying that 26 orders to each of the customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
Could you please suggest/advice what is wrong in the query?
You issue here is that you have told the database how these two tables are 'connected', or what they should be connected by:
Have a look at this image:
~IMAGE SOURCE
This effectively allows you to 'join' two tables together, and use a query between them.
so you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B //this is using a left join, but an inner may be appropriate also
ON (A.cust_id= B.Cust_id) //what links them together
GROUP BY A.cust_id; // the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
Execution Time
While the accepted answer in this case may work well in small datasets, it may however become 'heavy' in larger databases. This is because it was not actually designed for this type of operation.
This was the purpose of Joins to be introduced.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
AN INNER JOIN will return data from both tables where the keys in each table match
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are much faster than other ways of querying >=2 tables (speed can be seen much better on larger datasets).
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cust_id;
Maybe an left join will help you
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name ]
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id= cus.Cust_id )
GROUP BY cus.cust_id;
Why I am getting so many records for this
SELECT e.OneColumn, fb.OtherColumn
FROM dbo.TABLEA FB
INNER JOIN dbo.TABLEB eo ON Fb.Primary = eo.foregin
INNER JOIN dbo.TABLEC e ON eo.Primary =e.Foreign
WHERE FB.SomeOtherColumn = 0
When I am running this I am getting Millions of records which is not the correct case, all tables has less number of records.
I need to get the columns from TableA and TableC and because they are not joined logically so I have to use TableB to act as bridge
EDIT
Below is the count:
TABLEA = 273551
TABLEB = 384412
TABKEC = 13046
Above Query = After 2 minutes I have forcefully canceled the query.. till that time the count was 11437613
Any suggestion?
To figure out what is going on in such a query where the results are not as expected, I tend to do this. First I change to a SELECT * (Note this is only for figuring out the problem, do not use SELECT * on production, ever!) Then I add an order by for the ID frield from tableA if there is not one in the query.
So now I run the query up to the first table including any where conditions that are from the first table. I comment out the rest. I note the number of records returned.
Now I add in the second table and any where conditions from it. If I am expecting a one to relationship, and if this query doesn't return the smae number of records, then I look at the data that is being returned to see if I can figure out why. Since the contents are ordered by the table1 ID, you can ususally see examples of some records that are duplicated fairly easily and then scroll over until you find the field that causes the differnce. Often this means that you need some sort of addtional where clause or aggregation on the fields in the next table to limit to only one record. JUSt note down the problem at this point though as you may be able tomake the change more effectively in the next join.
So add inteh the third table and again, not the number of records and then look closely at the data where the id from A is repeated. LOok at the columns you intend to return, are they always teh same for an id? If they are differnt then you do not havea one-one relationship and you need to understand that either theri is a data integrity problem or you are mistaken in thinking there is a one-to-one. If tehy are the same, then a derived table may be in order. You only need the ids from tableb so the join could look something like this:
JOIN (SELECT MIn(Primary), foreign FROM TABLEB GROUP BY foreign) EO ON Fb.Primary = eo.foreign
Hope this helps.
I want to create a query in MS Access that will display information from two tables based on the values in one table. Both of these tables have the same exact columns. One has set records and the other one has records a visitor can insert/edit/delete. For the purpose of this question I will call the tables TableA and TableB. TableA has the predetermined records and can not be changed. Multiple users will be using these records. Visitors would add records to TableB. I need a query that will display the records from TableA unless a visitor adds a record to TableB and then it displays that record. The field I need to join on is CategoryID. So what I need is basically like this;
If TableB.CategoryID Is Not Null Then
Select * From TableB
Else
Select * From TableA
End If
Thanks for any assistance anyone can provide.
JW
You get part of the way there by unioning the individual table queries; that works if there's nothing in B, but shows the A records if there is.
So suppose we created a table just like A, say A2, but with an added column: the number of records in B. And then we select all of the records in A2 where this new column 0, and only the columns originally in A; call this A3.
Now consider the union of A3 & B. If B is empty, we get A. If B is not empty, then none of the records from A2 will be chosen for A3, and we'e left with just B.
That is easier than it seems at first. You'll have to join both tables on CategoryID and then conditionally select the right item like this:
SELECT tA.CategoryID, IIF(tB.CategoryID IS NULL, tA.txtEntry, tB.txtEntry) AS EntryText,
tB.CategoryID IS NULL AS bOriginalEntry
FROM TableA AS tA LEFT JOIN TableB AS tB ON tA.CategoryID=tB.CategoryID
However there is one caveat: If TableB is empty then the join is producing an empty set! Just populate TableB with at least one record (preferably one with an invalid CategoryID so it won't join with a valid record in TableA.
The bOriginalEntry is just a boolean expression to show whether the EntryText stems from TableA or TableB.
I found this thread searching for a similar problem. Note to self and others.
You can use the Join Types to cope with potential different values in a conditional select,
MS Access doesn't have the full range of JOIN that MS SQL has, but you can "fudge" it.
eg
Full outer joins: all the data, combined where feasible
In some systems, an outer join can include all rows from both tables, with rows combined when they correspond. This is called a full outer join, and Access doesn’t explicitly support them. However, you can use a cross join and criteria to achieve the same effect.
https://support.office.com/en-us/article/join-tables-and-queries-3f5838bd-24a0-4832-9bc1-07061a1478f6#typesofjoins