Why is order by not working in my Union query? - mysql

I'm trying to order by month with this query across 3 tables:
SELECT NULL AS `inState`, NULL AS `outState`, mb.`isDuplicate`, mb.`questStatus`, mb.state, mb.`subState`, mb.`recomputedOn`, c.`TSsubmitOn`, c.`submittedOn`, mb.week, mb.month
FROM metric_backlog mb INNER JOIN `CR` c ON c.crdbid = mb.crdbid
WHERE (mb.`productName` = 'ecc' AND mb.`releaseName`
IN ('6.7.3', '6.5.0', '6.7.0', '6.7.1', '6.6.0', '6.7.2', '6.2.0', '6.1.0')) AND mb.month = '1101'
UNION ALL
SELECT mi.`inState`, mi.`outState`, NULL AS sq, NULL AS ee, NULL AS yy, NULL AS qq, NULL AS xx, NULL AS mer, NULL AS yi, mi.week, mi.month as monthh
FROM metric_inout mi INNER JOIN `CR` c ON c.crdbid = mi.crdbid
WHERE mi.month = '1101' AND mi.month != "NULL" AND mi.month IS NOT NULL AND
mi.`productName` = 'ecc' AND mi.`releaseName`
IN ('6.7.3', '6.5.0', '6.7.0', '6.7.1', '6.6.0', '6.7.2', '6.2.0', '6.1.0')
ORDER BY mi.month
I get the error : Unknown column mi.month in order clause
thanks!

Try this in the ORDER BY:
ORDER BY month
rather than referring to the month column of the second query. Give the column that name in the first query:
SELECT NULL AS `inState`, NULL AS `outState`, mb.`isDuplicate`, mb.`questStatus`, mb.state, mb.`subState`, mb.`recomputedOn`, c.`TSsubmitOn`, c.`submittedOn`, mb.week, mb.month as month

With UNION ALL, the result columns take their names from the first SELECT statement. There are no longer any table names associated with these names. So you select: inState, outState, isDuplicate, questStatus, state, subState, recomputedOn, TSsubmitOn, submittedOn, week, and month. (Names from the second and any further SELECT statements are completely irrelevant, by the way.)
Hence you cannot order by mi.month. After UNION ALL is applied, the table-qualified field "mi.month" is no longer available. Only the unioned field "month" is, so you can only order by month.
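To see this in runnable form, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained; the cut-down tables and sample rows are invented for illustration). It shows that the result columns take their names from the first SELECT and that ORDER BY month sorts the combined rows.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric_backlog (state TEXT, month TEXT);
CREATE TABLE metric_inout (inState TEXT, month TEXT);
INSERT INTO metric_backlog VALUES ('open', '1102'), ('closed', '1101');
INSERT INTO metric_inout VALUES ('in', '1103'), ('out', '1100');
""")

cur = conn.execute("""
SELECT NULL AS inState, mb.state AS state, mb.month AS month
FROM metric_backlog mb
UNION ALL
SELECT mi.inState, NULL, mi.month AS monthh
FROM metric_inout mi
ORDER BY month          -- the name defined by the first SELECT
""")

# Column names of the combined result come from the first SELECT only.
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names)
print([r[2] for r in rows])
```

The alias monthh from the second SELECT never shows up in the result, which is why only the unqualified name month is a safe sort key.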

Related

Adding null values for records not existing in joined table

I have a table, called top_trends which has the following schema:
id int(11) AI PK
criteria_id int(11)
value varchar(255)
created_at timestamp
updated_at timestamp
Then I have another table, called search_criterias which has the following schema:
id int(11) AI PK
title varchar(255)
created_at timestamp
updated_at timestamp
Here is a query that picks, for each criterion, the value that occurs most often. For example, if top_trends has two records with criteria_id 1 and value 3, plus a single record with criteria_id 1 and value 2, the query returns the value 3 for criteria 1, because 3 occurs more often than any other value among the rows with criteria_id 1.
select
x.value as `values`
, sc.id as id
, sc.title
, sc.created_at
, sc.updated_at
, x.criteria_id as search_category_id
from
(
select
criteria_id
, `value`
from
top_trends
group by
`criteria_id`
order by
`value`
) x
left join search_criterias sc
on sc.id = x.criteria_id
group by
criteria_id
My issue is that right now we don't have data for all possible search_criterias, so some of the records in the search_criterias table are not included in my query.
For example, we have a search_criterias record for city, with an id of 5, but no records in the top_trends table with a criteria_id of 5, so the query above does not include that search criterion.
What I'd like to do is include those records from the search_criterias table that are not in the top_trends table, with the values attribute as null.
LEFT JOIN is used when there are rows in the left table that have no match in the right table. Since the missing rows in your case are in top_trends, that should be the right table of the LEFT JOIN. So switch the order of the join.
select
x.value as `values`
, sc.id as id
, sc.title
, sc.created_at
, sc.updated_at
, x.criteria_id as search_category_id
from search_criterias sc
left join
(
select
criteria_id
, `value`
from
top_trends
group by
`criteria_id`
order by
`value`
) x
on sc.id = x.criteria_id
group by sc.id
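A small runnable sketch of the join direction, using Python's sqlite3 module in place of MySQL (an assumption), with invented sample rows. The derived table uses MAX(value) as a simplified stand-in aggregate, so it illustrates only which side of the LEFT JOIN keeps its unmatched rows, not the "most frequent value" logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE search_criterias (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE top_trends (id INTEGER PRIMARY KEY, criteria_id INTEGER, value TEXT);
INSERT INTO search_criterias VALUES (1, 'country'), (5, 'city');
-- No top_trends rows exist for criteria_id 5.
INSERT INTO top_trends (criteria_id, value) VALUES (1, '3'), (1, '3'), (1, '2');
""")

rows = conn.execute("""
SELECT sc.id, sc.title, x.value
FROM search_criterias sc            -- left table: every criterion is kept
LEFT JOIN (
    SELECT criteria_id, MAX(value) AS value   -- simplified stand-in aggregate
    FROM top_trends
    GROUP BY criteria_id
) x ON sc.id = x.criteria_id
ORDER BY sc.id
""").fetchall()

print(rows)
```

The city row survives with a NULL value (None in Python) because search_criterias is now on the left side of the join.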

Selecting distinct count in a group with only null values for a specific column

I have 2 columns like this - id and val.
I need the distinct ids for which a NULL value is present in val.
Is it plausible to use "group by" by id and then use "having" clause where null is there?
I would use NOT EXISTS :
SELECT COUNT(DISTINCT id)
FROM table t
WHERE NOT EXISTS (SELECT 1 FROM table t1 WHERE t1.id = t.id AND t1.val IS NOT NULL);
Other option uses the GROUP BY :
SELECT COUNT(id)
FROM table t
GROUP BY id
HAVING SUM(CASE WHEN val IS NOT NULL THEN 1 ELSE 0 END) = 0;
To get ids that have a NULL value, I would be inclined to start with this:
select id
from t
group by id
having count(*) <> count(val);
This structure allows you to check for other values, such as a non-NULL value.
The simplest method to get the distinct ids with NULL values is:
select distinct id
from t
where val is null;
If you only want the count:
select count(distinct id)
from t
where val is null;
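The difference between "has at least one NULL" and "has only NULLs" is easy to miss, so here is a minimal sketch with Python's sqlite3 module (an assumption; any SQL engine should behave the same for these queries) and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, val TEXT);
-- id 1: mixed, id 2: only NULLs, id 3: no NULLs
INSERT INTO t VALUES (1, NULL), (1, 'a'), (2, NULL), (2, NULL), (3, 'b');
""")

# COUNT(*) counts rows, COUNT(val) skips NULLs, so they differ when a NULL exists.
ids_with_some_null = [r[0] for r in conn.execute(
    "SELECT id FROM t GROUP BY id HAVING COUNT(*) <> COUNT(val) ORDER BY id")]

# COUNT(val) = 0 means every val in the group is NULL.
ids_with_only_null = [r[0] for r in conn.execute(
    "SELECT id FROM t GROUP BY id HAVING COUNT(val) = 0 ORDER BY id")]

count_some_null = conn.execute(
    "SELECT COUNT(DISTINCT id) FROM t WHERE val IS NULL").fetchone()[0]

print(ids_with_some_null, ids_with_only_null, count_some_null)
```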

Avoid Subquery returned more than 1 value error in a table valued function

Is there a way to rewrite this query without getting the error "Subquery returned more than 1 value"?
This query is used in a LEFT JOIN in a table-valued function. Per requirement, I need to pull two scenario IDs by default (if the parameter value is NULL or empty).
DECLARE @pScenarioName AS VARCHAR(30)
select
externalID,
PropertyAssetId,
LeaseID,
BeginDate
from ae11.dbo.ivw_Leases
WHERE PropertyAssetID IN
(select ID from AE11.dbo.PropertyAssets where scenarioID IN
(CASE WHEN isnull(@pScenarioName, '') = ''
THEN (select top 2 ID from rvw_Scenarios where Name like '[0-9][0-9][0-9][0-9]%'
AND LEN(Name) = 8
order by Name desc)
ELSE
(select ID from aex.dbo.rvw_Scenarios
where [Name] IN (@pScenarioName))
END)
)
I haven't tested this, but I use a similar approach when dealing with parameters. Of course, this won't necessarily work if the order of the ID is crucial in your second subquery.
SELECT ExternalID
,PropertyAssetId
,LeaseID
,BeginDate
FROM ae11.dbo.ivw_Leases
WHERE PropertyAssetID IN
(SELECT ID
FROM AE11.dbo.PropertyAssets
WHERE scenarioID IN
(SELECT TOP 2 ID
FROM rvw_Scenarios
WHERE (ISNULL(@pScenarioName,'') = ''
AND Name LIKE '[0-9][0-9][0-9][0-9]%'
AND LEN(Name) = 8)
ORDER BY Name DESC
UNION ALL
SELECT ID FROM aex.dbo.rvw_Scenarios
WHERE (@pScenarioName IS NOT NULL)
AND [Name] IN (@pScenarioName)))
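The idea of replacing the CASE with two guarded branches combined by UNION ALL can be sketched in a runnable form. This is an illustration only, using Python's sqlite3 module rather than SQL Server (an assumption): LIMIT replaces TOP, GLOB replaces the bracketed LIKE pattern, and the scenarios table, its rows, and the scenario_ids helper are hypothetical. The limited branch is wrapped in a derived table so that its ORDER BY and LIMIT apply to that branch alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scenarios (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO scenarios VALUES
    (1, '2019Base'), (2, '2020Base'), (3, '2021Base'), (4, 'Custom');
""")

QUERY = """
SELECT id FROM (
    -- Default branch: only produces rows when the parameter is NULL or empty.
    SELECT id FROM scenarios
    WHERE COALESCE(:name, '') = ''
      AND name GLOB '[0-9][0-9][0-9][0-9]*'
      AND LENGTH(name) = 8
    ORDER BY name DESC
    LIMIT 2
)
UNION ALL
-- Named branch: only produces rows when a parameter value was supplied.
SELECT id FROM scenarios
WHERE COALESCE(:name, '') <> ''
  AND name = :name
"""

def scenario_ids(name):
    """Hypothetical helper: return the matching scenario ids, sorted."""
    return sorted(r[0] for r in conn.execute(QUERY, {"name": name}))

print(scenario_ids(None), scenario_ids(""), scenario_ids("Custom"))
```

Because each branch carries its own guard, exactly one of them contributes rows for any given parameter value, and the outer IN list never has to choose between two subqueries.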

SQL Query Still having duplicates after group by

SELECT *
FROM `eBayorders`
WHERE (`OrderIDAmazon` IS NULL
OR `OrderIDAmazon` = "null")
AND `Flag` = "True"
AND `TYPE` = "GROUP"
AND (`Carrier` IS NULL
OR `Carrier` = "null")
AND LEFT(`SKU`, 1) = "B"
AND datediff(now(), `TIME`) < 4
AND (`TrackingInfo` IS NULL
OR `TrackingInfo` = "null")
AND `STATUS` = "PROCESSING"
GROUP BY `Name`,
`SKU`
ORDER BY `TIME` ASC LIMIT 7
I am trying to make sure that no name or SKU shows up more than once in the result. I am trying to group by name and then SKU, but I ran into the problem where a result showed up that has the same name and different SKUs, which I don't want to happen. How can I fix this query to make sure that there are always distinct names and SKUs in the result set?
For example say I have an Order:
Name: Ben Z, SKU : B000334, oldest
Name: Ben Z, SKU : B000333, second oldest
Name: Will, SKU: B000334, third oldest
Name: John, SKU: B000036, fourth oldest
The query should return only:
Name: Ben Z, SKU : B000334, oldest
Name: John, SKU: B000036, fourth oldest
This is because all of the Names should only have one entry in the set along with SKU.
There are two problems here.
The first is that the ANSI standard says that if you have a GROUP BY clause, the only things you can put in the SELECT clause are items listed in the GROUP BY or items that use an aggregate function (SUM, COUNT, MAX, etc.). The query in your question selects all the columns in the table, even those not in the GROUP BY. If multiple records match a group, the database doesn't know which record to use for those extra columns.
MySQL is dumb about this. A sane database server would throw an error and refuse to run that query. SQL Server, Oracle and PostgreSQL will all do that. MySQL will make a guess about which data you want, unless the ONLY_FULL_GROUP_BY SQL mode is enabled (it is on by default from MySQL 5.7.5 onward). It's not usually a good idea to let your DB server make guesses about data.
But that doesn't explain the duplicates, just why the bad query runs at all. The reason you have duplicates is that you group on both Name and SKU. So, for example, for Ben Z's record you want to see just the oldest SKU. But when you group on both Name and SKU, you get a separate group for { Ben Z, B000334 } and { Ben Z, B000333 }. That's two rows for Ben Z, but it's what the query asked for, since SKU is also part of what determines a group.
If you only want to see one record per person, you need to group by just the person fields. This may mean building that part of the query first, to determine the base record set you need, and then JOINing to this original query as part of your full solution.
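A minimal sketch of that difference, using Python's sqlite3 module as a stand-in for MySQL (an assumption) and the four sample orders from the question, with an integer time column invented for ordering. It first counts the groups produced by GROUP BY name, sku, then builds the per-name base set and joins back to fetch the oldest row for each name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (name TEXT, sku TEXT, time INTEGER);
INSERT INTO orders VALUES
    ('Ben Z', 'B000334', 1),
    ('Ben Z', 'B000333', 2),
    ('Will',  'B000334', 3),
    ('John',  'B000036', 4);
""")

# Grouping on both columns yields one group per (name, sku) pair.
pair_groups = conn.execute(
    "SELECT name, sku FROM orders GROUP BY name, sku").fetchall()

# Group on name alone, then join back to pick the oldest row per name.
oldest_per_name = conn.execute("""
SELECT o.name, o.sku
FROM orders o
JOIN (
    SELECT name, MIN(time) AS first_time
    FROM orders
    GROUP BY name
) f ON o.name = f.name AND o.time = f.first_time
ORDER BY o.time
""").fetchall()

print(len(pair_groups), oldest_per_name)
```

Note that this only guarantees one row per name. Will still appears with a SKU that Ben Z already has, so excluding repeated SKUs as well would need an additional step.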
SELECT T1.*
FROM eBayorders T1
JOIN
( SELECT `Name`,
`SKU`,
max(`TIME`) AS MAX_TIME
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL OR `OrderIDAmazon` = "null") AND `Flag` = "True" AND `TYPE` = "GROUP" AND (`Carrier` IS NULL OR `Carrier` = "null") AND LEFT(`SKU`, 1) = "B" AND datediff(now(), `TIME`) < 4 AND (`TrackingInfo` IS NULL OR `TrackingInfo` = "null") AND `STATUS` = "PROCESSING"
GROUP BY `Name`,
`SKU`) AS dedupe ON T1.`Name` = dedupe.`Name`
AND T1.`SKU` = dedupe.`SKU`
AND T1.`Time` = dedupe.`MAX_TIME`
ORDER BY `TIME` ASC LIMIT 7
Your database platform should have complained because your original query had items in the select list which were not present in the group by (generally not allowed). The above should resolve it.
An even better option would be the following, if your database supports window functions (MySQL only added them in version 8.0):
SELECT *
FROM
( SELECT *,
row_number() over (partition BY `Name`, `SKU`
ORDER BY `TIME` ASC) AS dedupe_rank
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL OR `OrderIDAmazon` = "null") AND `Flag` = "True" AND `TYPE` = "GROUP" AND (`Carrier` IS NULL OR `Carrier` = "null") AND LEFT(`SKU`, 1) = "B" AND datediff(now(), `TIME`) < 4 AND (`TrackingInfo` IS NULL OR `TrackingInfo` = "null") AND `STATUS` = "PROCESSING" ) T
WHERE dedupe_rank = 1
ORDER BY T.`TIME` ASC LIMIT 7
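For a runnable illustration of the ROW_NUMBER() technique, here is a sketch using Python's sqlite3 module (an assumption; it needs an SQLite build of version 3.25 or later for window functions). The sample rows are the ones from the question with an invented integer time column, and the partition is on name alone rather than on name and SKU, so that each name keeps only its oldest row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (name TEXT, sku TEXT, time INTEGER);
INSERT INTO orders VALUES
    ('Ben Z', 'B000334', 1),
    ('Ben Z', 'B000333', 2),
    ('Will',  'B000334', 3),
    ('John',  'B000036', 4);
""")

ranked = conn.execute("""
SELECT name, sku
FROM (
    SELECT name, sku, time,
           -- Number each name's rows from oldest to newest.
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY time ASC) AS dedupe_rank
    FROM orders
) AS t
WHERE dedupe_rank = 1
ORDER BY time ASC
""").fetchall()

print(ranked)
```

Partitioning on both name and SKU, as in the query above, would give every distinct pair rank 1 and leave both Ben Z rows in the result.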
You are trying to obtain a result set which doesn't have repeats in either the SKU or the Name column.
You might have to add a subquery to your query to accomplish that. The inner query would group by Name, and the outer query would group by SKU, so that you won't have repeats in either column.
Try this :
SELECT *
FROM
(SELECT *
FROM eBayorders
WHERE (`OrderIDAmazon` IS NULL
OR `OrderIDAmazon` = "null")
AND `Flag` = "True"
AND `TYPE` = "GROUP"
AND (`Carrier` IS NULL
OR `Carrier` = "null")
AND LEFT(`SKU`, 1) = "B"
AND datediff(now(), `TIME`) < 4
AND (`TrackingInfo` IS NULL
OR `TrackingInfo` = "null")
AND `STATUS` = "PROCESSING"
GROUP BY Name) AS t
GROUP BY `SKU`
ORDER BY `TIME` ASC LIMIT 7
With this approach you just filter out rows that do not contain the largest/latest value for TIME.
SELECT SKU, Name
FROM eBayOrders o
WHERE NOT EXISTS (SELECT 0 FROM eBayOrders WHERE Name = o.name and Time > o.Time)
GROUP BY SKU, Name
Note: If two records have exactly the same Name and Time values, you may still end up getting duplicates, because the logic you have specified does not provide any way to break up a tie.

How to bypass a reference to an outer table in subquery?

I've been dealing with these two tables:
Document
id company_id etc
=======================
1 2 x
2 2 x
Version
id document_id version date_created date_issued date_accepted
==========================================================================
1 1 1 2013-04-29 2013-04-30 NULL
2 2 1 2013-05-01 NULL NULL
3 1 2 2013-05-01 2013-05-01 2013-05-03
There's a page where I want to list all documents with their attributes.
I would like to add a single status for each document.
The status can be derived from the most recent date among the corresponding Versions.
It is possible for an older version to be the one that gets accepted.
The query result I am looking for is like this:
id company_id etc status
==================================
1 2 x accepted
2 2 x created
I started out by making a query which combines all dates and add a status next to it.
It works as expected and when I add the document_id things look alright.
SELECT `status`
FROM (
SELECT max(date_created) as `date`,'created' as `status` FROM version WHERE document_id = 1
UNION
SELECT max(date_issued),'issued' FROM version WHERE document_id = 1
UNION
SELECT max(date_accepted),'accepted' FROM version WHERE document_id = 1
ORDER BY date DESC
LIMIT 1
) as maxi
When I try to incorporate this query as a subquery, I can't make it work.
SELECT *, (
SELECT `status` FROM (
SELECT max(date_created) as `date`,'created' as `status`FROM version WHERE document_id = document.id
UNION
SELECT max(date_issued),'issued' FROM version WHERE document_id = document.id
UNION
SELECT max(date_accepted),'accepted' FROM version WHERE document_id = document.id
ORDER BY date DESC
LIMIT 1
) as maxi
) as `status`
FROM `document`
This gets me the error Unknown column 'document.id' in 'where clause'. So I've read around on SO and figured it simply can't reach the value document.id, since it's a subquery in a subquery. So I tried another approach: get all the statuses at once, to avoid the WHERE clause, and JOIN them. I ended up with the next query.
SELECT MAX(`date`),`status`, document_id
FROM (
SELECT datetime_created as `date`, 'created' as `status`,document_id FROM `version`
UNION
SELECT datetime_issued, 'issued',document_id FROM `version`
UNION
SELECT datetime_accepted, 'accepted',document_id FROM `version`
) as dates
GROUP BY offer_id
No error this time, but I realized that the status couldn't be the correct one, since it got lost during the GROUP BY. I've tried combinations of the two, but both flaws keep hindering me. Could anyone suggest how to do this in a single query without changing my database? (I know that saving the dates in a separate table would simplify things.)
I have not tested this, but you can do it like this (you might need to tweak the details)
It is basically looking at it from a completely different angle.
select
d.*,
(CASE GREATEST(ifnull(v.date_created, 0), ifnull(v.date_issued,0), ifnull(v.date_accepted,0) )
WHEN null THEN 'unknown'
WHEN v.date_accepted THEN 'accepted'
WHEN v.date_issued THEN 'issued'
WHEN v.date_created THEN 'created'
END) as status
from document d
left join version v on
v.document_id = d.id and
not exists (select 1 from (select * from version) x where x.document_id = v.document_id and x.id <> v.id and x.version > v.version)
Can you normalise your table designs to move the status / dates onto a different table from the Versions?
If not, possibly something like this:
SELECT Document.id, Document.company_id, Document.etc, CASE WHEN Sub1.status = 3 THEN 'accepted' WHEN Sub1.status = 2 THEN 'issued' WHEN Sub1.status = 1 THEN 'created' ELSE NULL END AS status
FROM Document
INNER JOIN (
SELECT document_id, MAX(CASE WHEN date_accepted IS NOT NULL THEN 3 WHEN date_issued IS NOT NULL THEN 2 WHEN date_created IS NOT NULL THEN 1 ELSE NULL END) AS status
FROM Version
GROUP BY document_id
) Sub1
ON Document.id = Sub1.document_id
The subselect gets the highest status for any document from the Version table. Each version's highest status is returned as a number, and grouping by the document id with MAX gives the highest status of any version of that document. This is joined back against the Document table, and the status number is converted into its text description.
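Run against the sample rows from the question, the approach can be checked with a short sketch. It uses Python's sqlite3 module in place of MySQL (an assumption) and stores the dates as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Document (id INTEGER PRIMARY KEY, company_id INTEGER, etc TEXT);
CREATE TABLE Version (
    id INTEGER PRIMARY KEY, document_id INTEGER, version INTEGER,
    date_created TEXT, date_issued TEXT, date_accepted TEXT
);
INSERT INTO Document VALUES (1, 2, 'x'), (2, 2, 'x');
INSERT INTO Version VALUES
    (1, 1, 1, '2013-04-29', '2013-04-30', NULL),
    (2, 2, 1, '2013-05-01', NULL,         NULL),
    (3, 1, 2, '2013-05-01', '2013-05-01', '2013-05-03');
""")

statuses = conn.execute("""
SELECT Document.id, Document.company_id, Document.etc,
       CASE Sub1.status WHEN 3 THEN 'accepted'
                        WHEN 2 THEN 'issued'
                        WHEN 1 THEN 'created' END AS status
FROM Document
INNER JOIN (
    -- Rank each version's furthest state, then keep the maximum per document.
    SELECT document_id,
           MAX(CASE WHEN date_accepted IS NOT NULL THEN 3
                    WHEN date_issued   IS NOT NULL THEN 2
                    WHEN date_created  IS NOT NULL THEN 1 END) AS status
    FROM Version
    GROUP BY document_id
) Sub1 ON Document.id = Sub1.document_id
ORDER BY Document.id
""").fetchall()

print(statuses)
```

This reproduces the result the question asks for. It ranks by how far any version has progressed rather than by comparing the dates themselves, which matters only if a later date could belong to an earlier stage.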
select Doc.id,Doc.company_id,Doc.etc,f.status
from Document Doc
inner join
(select Ver.document_id,
case when Ver.date_accepted is not null then 'Accepted'
when Ver.date_issued is not null then 'Issued'
when Ver.date_created is not null then 'Created'
end as status
from version Ver
inner join (
select document_id,MAX(version) VersionId
from version
group by document_id
)t on t.document_id=Ver.document_id
where t.VersionId=Ver.version
)f on Doc.id=f.document_id