Basic problem is - I'm trying to use a sub-query's count as a variable in my WHERE CASE statement, but it doesn't appear to let me use it. I've got it somewhat working when I put the SELECT COUNT(id)... statement in the WHERE area, but - if I do that, I'll have to include it 3-4 different times instead of just once if I can put it in the SELECT.
Below is a modified example query that explains my problem. It's not the exact query I'm using, but it gives the same error as my much-longer/more complicated query:
Error:
[Err] 1054 - Unknown column 'matched_sections' in 'where clause'
Query:
SELECT
articles.id,
(SELECT COUNT(id) FROM site_areas_site_sections WHERE
site_areas_site_sections.site_area_id = 8) AS matched_sections
FROM
articles
LEFT JOIN
articles_site_sections ON articles_site_sections.article_id = articles.id
LEFT JOIN
site_areas_site_sections ON articles_site_sections.site_section_id =
site_areas_site_sections.site_section_id
WHERE
(CASE
WHEN
matched_sections > 0
THEN
site_sections.id = site_areas_site_sub_sections.site_sub_section_id
END
)
The problem is when MySQL starts to parse out your query, the SELECT clause is the last bit to be analyzed. So when it's going through the WHERE clause, matched_sections doesn't yet exists (because the SELECT clause has yet to be looked at.
Just a quick look at it, you can try something like this (although I think someone will be able to come up with something a little more elegant):
SELECT
articles.id,
matched_sections.count
FROM
articles
LEFT JOIN
articles_site_sections ON articles_site_sections.article_id = articles.id
LEFT JOIN
site_areas_site_sections ON articles_site_sections.site_section_id =
site_areas_site_sections.site_section_id,
(SELECT COUNT(id) as count FROM site_areas_site_sections WHERE
site_areas_site_sections.site_area_id = 8) matched_sections
WHERE
(CASE
WHEN
matched_sections.count > 0
THEN
site_sections.id = site_areas_site_sub_sections.site_sub_section_id
END
)
Due to the order in which the elements of a SQL statement are processed, matched_sections is not defined at the time the WHERE clause is evaluated.
You might want to prepopulate the count into a variable, and use that in your query.
DECLARE matched_sections INT;
SET matched_sections = (SELECT COUNT(id)
FROM site_areas_site_sections
WHERE site_areas_site_sections.site_area_id = 8);
As you have already been told, the error is related to the fact that the WHERE clause knows nothing about the values you are actually selecting, because it is evaluated before the SELECT clause.
In general, you can solve this problem by using the entire query as a derived table, in which case the matched_sections alias becomes available to the outer query:
SELECT
id,
matched_section
FROM (
SELECT
articles.id,
(SELECT COUNT(id) FROM site_areas_site_sections WHERE
site_areas_site_sections.site_area_id = 8) AS matched_sections
FROM
articles
LEFT JOIN
articles_site_sections ON articles_site_sections.article_id = articles.id
LEFT JOIN
site_areas_site_sections ON articles_site_sections.site_section_id =
site_areas_site_sections.site_section_id
) s
WHERE
CASE
WHEN matched_sections > 0
THEN …
END
If your original condition contains references to columns that are not meant to be retrieved, you will need to add them to the derived table's selection list so you can use them in the outer query's condition.
In cases where the subquery is not correlated with the main query (and it isn't in your example), the solutions by #Joe and by #Dirk are possibly better options than the above suggestion.
Related
I have a populated table with a 'Number' as PK. I would like a query which searches for a specific number, and if it is not found then it would return "NULL" not a no value.
I have managed to do it for one return:
SELECT (SELECT Risk_Impact FROM [dbo].[RFC] WHERE Number = 'RFC-018345')
However I would like to select multiple columns like:
SELECT (SELECT Risk_Impact, Risk_Impact, BI_TestingOutcome FROM [dbo].[RFC] WHERE Number = 'RFC-018345')
However it is giving me an error:
"Msg 116, Level 16, State 1, Line 1
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."
Can you please assist?
Thank you in advance
Try
select p.* from (select 1 as t ) v
left join (select * from [dbo].[RFC] WHERE Number = 'RFC-018345') p
on 1=1
In sql server should be use isnull function like this
SELECT isnull((SELECT Risk_Impact FROM [dbo].[RFC] WHERE Number = 'RFC-018345'),0)
Refined #Sagar query:
And better indentation would probably be more helpful for people to learn from.
Ex:
SELECT
[dbo].[RFC].*
FROM
(SELECT 1 AS Col1) AS V
CROSS JOIN
[dbo].[RFC]
WHERE Number = 'RFC-018345'
Edit: I got caught up in removing the subquery. Working query:
SELECT
[dbo].[RFC].*
FROM
(SELECT 1 AS Col1) AS V
LEFT JOIN
[dbo].[RFC]
ON Number = 'RFC-018345'
(I would also advocate to not SELECT *, anywhere near production, but to always be explicit about the columns in the result set. Ex:
SELECT
[dbo].[RFC].[Risk_Impact],
[dbo].[RFC].[BI_TestingOutcome]
FROM
(SELECT 1 AS Col1) AS V
LEFT JOIN
[dbo].[RFC]
ON Number = 'RFC-018345'
)
I have SQL query with LEFT JOIN:
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON
(stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
AND stn.stocksEndDate >= UNIX_TIMESTAMP() AND stn.stocksStartDate <= UNIX_TIMESTAMP())
These query I want to select one row from table stocks by conditions and with field equal value a.MedicalFacilitiesIdUser.
I get always count_stocks = 0 in result. But I need to get 1
The count(...) aggregate doesn't count null, so its argument matters:
COUNT(stn.stocksId)
Since stn is your right hand table, this will not count anything if the left join misses. You could use:
COUNT(*)
which counts every row, even if all its columns are null. Or a column from the left hand table (a) that is never null:
COUNT(a.ID)
Your subquery in the on looks very strange to me:
on stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
This is comparing MedicalFacilitiesIdUser to stocksIdMF. Admittedly, you have no sample data or data layouts, but the naming of the columns suggests that these are not the same thing. Perhaps you intend:
on stn.stocksIdMF = ( SELECT b.stocksId
-----------------------------^
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY b.stocksId DESC
LIMIT 1)
Also, ordering by stn.stocksid wouldn't do anything useful, because that would be coming from outside the subquery.
Your subquery seems redundant and main query is hard to read as much of the join statements could be placed in where clause. Additionally, original query might have a performance issue.
Recall WHERE is an implicit join and JOIN is an explicit join. Query optimizers
make no distinction between the two if they use same expressions but readability and maintainability is another thing to acknowledge.
Consider the revised version (notice I added a GROUP BY):
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
WHERE stn.stocksEndDate >= UNIX_TIMESTAMP()
AND stn.stocksStartDate <= UNIX_TIMESTAMP()
GROUP BY stn.stocksId
ORDER BY stn.stocksId DESC
LIMIT 1
I have a mysql query and it works fine when i use where clause, but when i donot use
where clause it gone and never gives the output and finally timeout.
Actually i have used Explain command to check the performance of the query and in both cases the Explain gives the same number of rows used in joining.
I have attached the image of output got with Explain command.
Below is the query.
I couldn't figure whats the problem here.
Any help is highly appreciated.
Thanks.
SELECT
MCI.CLIENT_ID AS CLIENT_ID, MCI.NAME AS CLIENT_NAME, MCI.PRIMARY_CONTACT AS CLIENT_PRIMARY_CONTACT,
MCI.ADDED_BY AS SP_ID, CONCAT(MUD_SP.FIRST_NAME, ' ', MUD_SP.LAST_NAME) AS SP_NAME,
MCI.FK_PROSPECT_ID AS PROSPECT_ID, MCI.DATE_ADDED AS ADDED_ON,
(SELECT GROUP_CONCAT(LT.TAG_TEXT SEPARATOR ', ')
FROM LK_TAG LT
INNER JOIN M_OBJECT_TAG_MAPPING MOTM
ON LT.PK_ID = MOTM.FK_TAG_ID
WHERE MOTM.FK_OBJECT_ID = MCI.FK_PROSPECT_ID
AND MOTM.OBJECT_TYPE = 1
AND MOTM.IS_ACTIVE = 1
) AS TAGS,
IFNULL(SUM(GET_DIGITS(MMR.RCP_AMOUNT)), 0) AS REVENUE_SO_FAR,
IFNULL(SUM(GET_DIGITS(MMR.RCP_RUPEES)), 0) AS REVENUE_INR,
COUNT(DISTINCT PMI_MONTHLY.PROJECT_ID) AS MONTHLY,
COUNT(DISTINCT PMI_FIXED.PROJECT_ID) AS FIXED,
COUNT(DISTINCT PMI_HOURLY.PROJECT_ID) AS HOURLY,
COUNT(DISTINCT PMI_ANNUAL.PROJECT_ID) AS ANNUAL,
COUNT(DISTINCT PMI_CURRENTLY_RUNNING.PROJECT_ID) AS CURRENTLY_RUNNING_PROJECTS,
COUNT(DISTINCT PMI_YET_TO_START.PROJECT_ID) AS YET_TO_START_PROJECTS,
COUNT(DISTINCT PMI_TECH_SALES_CLOSED.PROJECT_ID) AS TECH_SALES_CLOSED_PROJECTS
FROM
M_CLIENT_INFO MCI
INNER JOIN M_USER_DETAILS MUD_SP
ON MCI.ADDED_BY = MUD_SP.PK_ID
LEFT OUTER JOIN M_MONTH_RECEIPT MMR
ON MMR.CLIENT_ID = MCI.CLIENT_ID
LEFT OUTER JOIN M_PROJECT_INFO PMI_FIXED
ON PMI_FIXED.CLIENT_ID = MCI.CLIENT_ID AND PMI_FIXED.PROJECT_TYPE = 1
LEFT OUTER JOIN M_PROJECT_INFO PMI_MONTHLY
ON PMI_MONTHLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_MONTHLY.PROJECT_TYPE = 2
LEFT OUTER JOIN M_PROJECT_INFO PMI_HOURLY
ON PMI_HOURLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_HOURLY.PROJECT_TYPE = 3
LEFT OUTER JOIN M_PROJECT_INFO PMI_ANNUAL
ON PMI_ANNUAL.CLIENT_ID = MCI.CLIENT_ID AND PMI_ANNUAL.PROJECT_TYPE = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_CURRENTLY_RUNNING
ON PMI_CURRENTLY_RUNNING.CLIENT_ID = MCI.CLIENT_ID AND PMI_CURRENTLY_RUNNING.STATUS = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_YET_TO_START
ON PMI_YET_TO_START.CLIENT_ID = MCI.CLIENT_ID AND PMI_YET_TO_START.STATUS < 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_TECH_SALES_CLOSED
ON PMI_TECH_SALES_CLOSED.CLIENT_ID = MCI.CLIENT_ID AND PMI_TECH_SALES_CLOSED.STATUS > 4
WHERE YEAR(MCI.DATE_ADDED) = '2012'
GROUP BY MCI.CLIENT_ID ORDER BY CLIENT_NAME ASC
Yes, as many people have said, the key is that when you have the where clause, mysql engine filters the table M_CLIENT_INFO --probably drammatically--.
A similar result as removing the where clause is to to add this where clause:
where 1 = 1
You will see that the performance is degraded also because mysql will try to get all the data.
Remove the where clause and all columns from select and add a count to see how many records you get. If it is reasonable, say up to 10k, then do the following,
put back the select columns related to M_CLIENT_INFO
do not include the nested one "TAGS"
remove all your joins
run your query without where clause and gradually include the joins
this way you'll find out when the timeout is caused.
I would try the following. First, MySQL has a keyword "STRAIGHT_JOIN" which tells the optimizer to do the query in the table order you've specified. Since all you left-joins are child-related (like a lookup table), you don't want MySQL to try and interpret one of those as a primary basis of the query.
SELECT STRAIGHT_JOIN ... rest of query.
Next, your M_PROJECT_INFO table, I dont know how many columns of data are out there, but you appear to be concentrating on just a few columns on your DISTINCT aggregates. I would make sure you have a covering index on these elements to help the query via an index on
( Client_ID, Project_Type, Status, Project_ID )
This way the engine can apply the criteria and get the distinct all out of the index instead of having to go back to the raw data pages for the query.
Third, your M_CLIENT_INFO table. Ensure that has an index on both your criteria, group by AND your Order By, and change your order by from the aliased "CLIENT_NAME" to the actual column of the SQL table so it matches the index
( Date_Added, Client_ID, Name )
I have "name" in ticks as it is also a reserved word and helps clarify the column, not the keyword.
Next, the WHERE clause. Whenever you apply a function to an indexed column name, it doesn't work the greatest, especially on date/time fields... You might want to change your where clause to
WHERE MCI.Date_Added between '2012-01-01' and '2012-12-31 23:59:59'
so the BETWEEN range is showing the entire year and the index can better be utilized.
Finally, if the above do not help, I would consider splitting your query some. The GROUP_CONCACT inline select for the TAGS might be a bit of a killer for you. You might want to have all the distinct elements first for the grouping per client, THEN get those details.... Something like
select
PQ.*,
group_concat(...) tags
from
( the entire primary part of the query ) as PQ
Left join yourGroupConcatTableBasis on key columns
In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
the query delay 0.0057seg and show 1011 records.
Because I have to filter the sales by the name of the state as it would have to repeat the subquery in a where clause, I have decided to change the same query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query shows the same results (1011). But the problem is it takes 0.0753 sec
Reviewing the possibilities I have found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries the same time delay... Why it works better? Is there any way to use this clause in the joins? I hope your help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because Where statements applied prior to subSelects in the select statement, so if there is no record to go with the sale, then it doesn't bother processing the subSelect. Which is netting you some time. So that's why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against but you can always just apply it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get a plan of execution which will help you optimize things. Probably the best way to get better results in your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in <8 hundredths of a second. Unless this query is getting ran thousands of times an hour, this is not really taxing your DB at all and your time is probably better spent making improvements elsewhere in your application.
The following query hangs: (although subqueries perfomed separately are fine)
I don't know how to make the explain table look ok. If someone tells me, I'll clean it up.
select
sum(grades.points)) as p,
from assignments
left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID
from assignments
left join grades using (assignmentID)
where ... grades.date <= '1255503600' AND grades.date >= '984902400'
group by assignmentID order by grades.date DESC);
I think the problem is with the first grades table... the type ALL with that many rows seems to be the cause.. Everything is indexed.
I uploaded the table as an image. Couldn't get the formatting right:
http://imgur.com/AjX34.png
A commenter wanted the full where clause:
explain extended select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC);
See "The unbearable slowness of IN":
http://www.artfulsoftware.com/infotree/queries.php#568
Super messy, but: (thanks for everyone's help)
SELECT *
FROM grades
LEFT JOIN assignments ON grades.assignmentID = assignments.assignmentID
RIGHT JOIN (
SELECT g.gradeID
FROM assignments a
LEFT JOIN grades g
USING ( assignmentID )
WHERE a.classID = '7815'
AND (
a.type =30170
)
AND g.contactID =7141
g.points
REGEXP '^[-]?[0-9]+[-]?'
AND g.points != '-'
AND g.points != ''
AND (
g.pointsposs IS NULL
OR g.pointsposs = ''
)
AND g.date <= '1255503600'
AND g.date >= '984902400'
GROUP BY assignmentID
ORDER BY g.date DESC
) AS t1 ON t1.gradeID = grades.gradeID
Suppose you use a Real Database (ie, any database except MySQL, but I'll use Postgres as an example) to do this query :
SELECT * FROM ta WHERE aid IN (SELECT subquery)
a Real Database would look at the subquery and estimate its rowcount :
If the rowcount is small (say, less than a few millions)
It would run the subquery, then build an in-memory hash of ids, which also makes them unique, which is a feature of IN().
Then, if the number of rows pulled from ta is a small part of ta, it would use a suitable index to pull the rows. Or, if a major part of the table is selected, it would just scan it entirely, and lookup each id in the hash, which is very fast.
If however the subquery rowcount is quite large
The database would probably rewrite it as a merge JOIN, adding a Sort+Unique to the subquery.
However, you are using MySQL. In this case, it will not do any of this (it is gonna re-execute the subquery for each row of your table) so it will take 1000 years. Sorry.
If your subquery performs fine when it is executed separately, then try using a JOIN rather than IN, like this:
select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
join
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC) using (gradeID);
There really isn't enough information to answer your question, and you've put a ... in the middle of the where clause which is weird. How big are the tables involved and what are the indexes?
Having said that, if there are too many terms in an in clause, you can see seriously degraded performance. Replace the use of in with a right join.
For starters, the table as_types in the in clause is not used. Left joining it serves no purpose so get rid of it.
That leaves the in clause having only the assignments and grades table from the outer query. Clearly the wheres the modify assignments belong in the where clause for the outer query. You should move all of the where grades=whatever into the on clause of the left join to grades.
The query is a little tough to follow, but I suspect that the subquery isn't necessary at all.
It seems like your query is basically thus:
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE gradeID IN
(
SELECT grades.gradeID
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE your_conditions = TRUE
);
But, you're not doing anything really fancy in the where clause in the subquery.
I suspect something more like
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
GROUP BY groupings
WHERE your_conditions_with_some_tweaks = TRUE;
would work just as well.
If I'm missing some key logic here please comment back and I'll edit/delete this post.