I have a SELECT query that returns the response based on an unique ID, so I always get just one row.
I thought that I could save my machine an extra SELECT query if I simply added the prices table to the result, and read them to memory later on.
Would that be a good approach or am I missing something ?
(I tried it out and seems to get the job done)
SELECT *
FROM subscriptions
LEFT JOIN prices ON 1=1
WHERE subscriptions.ID = 100
edit: The prices table has no ID. I just need to get the complete table, I used to have a different SELECT just for that
This looks like a terrible idea... you should join the subscriptions table to the prices table using the foreign key that you (supposedly/should) have.
Assuming your prices table has a subscription ID column then your query should look something like this:
SELECT *
FROM subscriptions LEFT JOIN prices ON subscriptions.ID=prices.ID
WHERE subscriptions.ID=100
What this will do is produce a cartesian join - not too bad since you're limiting the 'subscriptions' side of things to a single record, but will still produce as many rows as there's records in the price side. Where this gets bad is when you've got multiple rows on both sides. Then you get n x m results - think of how big the result set would be if you had 50,000 subscriptions joined against 1000 prices: 50,000 x 1,000 = 50 million result rows.
First off, this approach is going to be much less clear what you're doing than two SELECT statements unless there is an actual relation between the tables. Second, it's probably going to be slower, because you're transferring much more data (each row of prices additionally gets all the fields from subscriptions copied).
If subscriptions and prices are related, you want to change that ON condition to use the relation, so you're only pulling the data you need.
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (s.subscription_id = p.subscription_id)
WHERE s.subscription_id = 100
One thing you definitely don't want to do is this:
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (1=1)
as that'd pull the full Cartesian product. Once your tables get sufficiently large, that will run you out of temporary table space.
why your condition have 1=1 ?
I thing that is's must something like this:
SELECT s.*,p.*
FROM subscriptions as s
LEFT JOIN prices as p ON p.product_id=s.product_id
WHERE s.ID = 100
show me your full fields of tables subscriptions and prices to help for you
This?
SELECT *
FROM subscriptions, prices
WHERE subscriptions.ID = 100
You'll get horrible results like this, but it seems this is what you wanted.
The table with less rows will have its rows repeating. Again, this is not a good practice.
Use two SELECTs.
This is a cross join http://en.wikipedia.org/wiki/Join_(SQL)#Cross_join
which means your resultset will contain as many rows as you have in the prices table.
So I guess it is not a good idea
Related
I have got a somewhat complicated problem. This is my situation (ERD).
For a dashboard i need to create a pivot table that shows me the total amount of competences used by the vacancies. Therefore I need to:
Count the amount of vacancies per template
Count the amount of templates per competence
and last: multiply these numbers to get the total amount of comps used.
I have the first query:
SELECT vacancytemplate_id, count(id)
FROM vacancies
group by vacancytemplate_id;
And the second query isn't that difficult either, but I don't know what the right solution will be. I'm literally brainstuck. My mind can't comprehend how I can achieve the next step and put it down in a query. Please kind stranger, help me out :)
EDIT: my desired result is something like this
NameOfComp, NrOfTimesUsed
Leading, 17
Inspiring, 2
EDIT2: the meta query it should look like:
SELECT NameOfComp, (count of the competences used by templates) * (number of vacancies per template)
EDIT3: http://sqlfiddle.com/#!9/2773ca SQLFiddle
Thanks a lot!
If I am understanding your request correctly, you are wanting a count of competences per vacancy. This can be done very simply due to your table structure:
Select v.ID, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID group by v.ID;
The reason you can do only one join is because there will be a record in the CompTemplate_Table for every competency in each template. Additionally, the same key is used to join vacancy to templates as is used to join templates to CompTemplate_Table, so they represent the same key value (and you can skip joining the Templates table if you don't need data from there).
If you are wanting to add this data to a pivot table, I will leave that exercise to you. There are a number of tutorials available if you do a quick google search and it should not be that hard.
UPDATE: For the second query you are looking at something like:
Select cp.NameOfComp, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID inner join competencies as CP
on CP.ID = CT.Comp_ID
group by CP.NameOfComp
The differences here are you are adding in the comptetencies table, as you need data from that, and grouping by the CP.NameOfComp instead of the vacancy id. You can also restrict this to specific templates, competencies, or vacancies by adding in search conditions (e.g. where CP.ID = 12345)
I have this 2 two views. I made a view out of them because they have dependencies on other tables. I then created a query joining them together. Two queries separately has over 30k plus data.
Sample Query would be like:
Select main.*, detail.*
FROM main_data as main
INNER JOIN detail_data AS details ON (main.id = detail.main_id)
WHERE main.column_data LIKE '%something%'
This is slow but if I comment out the "detail.*" on the column section with the join being there is returns the data very fast. What is wrong here?
The expected amount of data is only 20 rows. Still takes very long to perform the select. What makes it slow?
I'd like to add that the joined tables on the view are correct. It is joined accordingly. Like I said If queried separately it returns a lot of data in a very short time.
For more details, actually there is a third table here where both table are a child to. I changed it up to something like this. It improves the performance a bit.
Select main.*, detail.*
FROM parent_data AS par
INNER JOIN main_data as main ON (par.id = detail.par_id)
INNER JOIN detail_data AS details ON (par.id = detail.par_id)
WHERE main.column_data LIKE '%something%'
I made a EXPLAIN SELECT on the exact tables and this is there result:
Thank you.
The only reason the query slows down if you include fields in the resultset from detail table is that detail.main_id is indexed, therefore if no fields are selected from detail, then mysql does not have to touch detail table, it is enough to use the index only. If you do include some fields from that table, then simple index lookup will not be enough, mysql will have to open detail table and get the corresponding records from there.
Try using left join cause you need only connected records from right table:
Select main.*, detail.*
FROM main_data as main
LEFT JOIN detail_data AS details
ON (main.id = detail.main_id AND main.column_data LIKE '%something%')
i have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id", (26 orders)
I need to display "Cust_id" "first_name" "last_name" "order_id"
to where i need count of order_id group by cust_id like list total number of orders placed by each customer.
I am running below query, however, it is counting all the 26 orders and applying that 26 orders to each of the customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
Could you please suggest/advice what is wrong in the query?
You issue here is that you have told the database how these two tables are 'connected', or what they should be connected by:
Have a look at this image:
~IMAGE SOURCE
This effectively allows you to 'join' two tables together, and use a query between them.
so you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B //this is using a left join, but an inner may be appropriate also
ON (A.cust_id= B.Cust_id) //what links them together
GROUP BY A.cust_id; // the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
Execution Time
While the accepted answer in this case may work well in small datasets, it may however become 'heavy' in larger databases. This is because it was not actually designed for this type of operation.
This was the purpose of Joins to be introduced.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
AN INNER JOIN will return data from both tables where the keys in each table match
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are much faster than other ways of querying >=2 tables (speed can be seen much better on larger datasets).
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cust_id;
Maybe an left join will help you
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name ]
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id= cus.Cust_id )
GROUP BY cus.cust_id;
Why I am getting so many records for this
SELECT e.OneColumn, fb.OtherColumn
FROM dbo.TABLEA FB
INNER JOIN dbo.TABLEB eo ON Fb.Primary = eo.foregin
INNER JOIN dbo.TABLEC e ON eo.Primary =e.Foreign
WHERE FB.SomeOtherColumn = 0
When I am running this I am getting Millions of records which is not the correct case, all tables has less number of records.
I need to get the columns from TableA and TableC and because they are not joined logically so I have to use TableB to act as bridge
EDIT
Below is the count:
TABLEA = 273551
TABLEB = 384412
TABKEC = 13046
Above Query = After 2 minutes I have forcefully canceled the query.. till that time the count was 11437613
Any suggestion?
To figure out what is going on in such a query where the results are not as expected, I tend to do this. First I change to a SELECT * (Note this is only for figuring out the problem, do not use SELECT * on production, ever!) Then I add an order by for the ID frield from tableA if there is not one in the query.
So now I run the query up to the first table including any where conditions that are from the first table. I comment out the rest. I note the number of records returned.
Now I add in the second table and any where conditions from it. If I am expecting a one to relationship, and if this query doesn't return the smae number of records, then I look at the data that is being returned to see if I can figure out why. Since the contents are ordered by the table1 ID, you can ususally see examples of some records that are duplicated fairly easily and then scroll over until you find the field that causes the differnce. Often this means that you need some sort of addtional where clause or aggregation on the fields in the next table to limit to only one record. JUSt note down the problem at this point though as you may be able tomake the change more effectively in the next join.
So add inteh the third table and again, not the number of records and then look closely at the data where the id from A is repeated. LOok at the columns you intend to return, are they always teh same for an id? If they are differnt then you do not havea one-one relationship and you need to understand that either theri is a data integrity problem or you are mistaken in thinking there is a one-to-one. If tehy are the same, then a derived table may be in order. You only need the ids from tableb so the join could look something like this:
JOIN (SELECT MIn(Primary), foreign FROM TABLEB GROUP BY foreign) EO ON Fb.Primary = eo.foreign
Hope this helps.
I have a table structure like the following:
user
id
name
profile_stat
id
name
profile_stat_value
id
name
user_profile
user_id
profile_stat_id
profile_stat_value_id
My question is:
How do I evaluate a query where I want to find all users with profile_stat_id and profile_stat_value_id for many stats?
I've tried doing an inner self join, but that quickly gets crazy when searching for many stats. I've also tried doing a count on the actual user_profile table, and that's much better, but still slow.
Is there some magic I'm missing? I have about 10 million rows in the user_profile table and want the query to take no longer than a few seconds. Is that possible?
Typically databases are able to handle 10 million records in a decent manner. I have mostly used oracle in our professional environment with large amounts of data (about 30-40 million rows also) and even doing join queries on the tables has never taken more than a second or two to run.
On IMPORTANT lessson I realized whenever query performance was bad was to see if the indexes are defined properly on the join fields. E.g. Here having index on profile_stat_id and profile_stat_value_id (user_id I am assuming is the primary key) should have indexes defined. This will definitely give you a good performance increaser if you have not done that.
After defining the indexes do run the query once or twice to give DB a chance to calculate the index tree and query plan before verifying the gain
Superficially, you seem to be asking for this, which includes no self-joins:
SELECT u.name, u.id, s.name, s.id, v.name, v.id
FROM User_Profile AS p
JOIN User AS u ON u.id = p.user_id
JOIN Profile_Stat AS s ON s.id = p.profile_stat_id
JOIN Profile_Stat_Value AS v ON v.id = p.profile_stat_value_id
Any of the joins listed can be changed to a LEFT OUTER JOIN if the corresponding table need not have a matching entry. All this does is join the central User_Profile table with each of the other three tables on the appropriate joining column.
Where do you think you need a self-join?
[I have not included anything to filter on 'the many stats'; it is not at all clear to me what that part of the question means.]