I have a query that returns the result below. How can I replace the duplicate values with NULL?
Query:
SELECT
word.lemma,
synset.definition,
synset.pos,
sampletable.sample
FROM
word
LEFT JOIN
sense ON word.wordid = sense.wordid
LEFT JOIN
synset ON sense.synsetid = synset.synsetid
LEFT JOIN
sampletable ON synset.synsetid = sampletable.synsetid
WHERE
word.lemma = 'good'
Result:
Required Result: all the greyed out results as NULL
First, this is the type of transformation that is generally better done at the application level. The reason is that it presupposes that the result set is in a particular order -- and you seem to be assuming this even with no order by clause.
Second, it is often simpler in the application.
However, in MySQL 8+, it is not that hard. You can do:
SELECT w.lemma,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY w.lemma, ss.definition ORDER BY st.sample) = 1
THEN ss.definition
END) as definition,
ss.pos,
st.sample
FROM word w LEFT JOIN
sense s
ON w.wordid = s.wordid LEFT JOIN
synset ss
ON s.synsetid = ss.synsetid LEFT JOIN
sampletable st
ON ss.synsetid = st.synsetid
WHERE w.lemma = 'good'
ORDER BY w.lemma, ss.definition, st.sample;
For this to work reliably, the outer ORDER BY clause needs to be compatible with the ORDER BY for the window function.
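The application-level approach mentioned above can be sketched roughly as follows. This is only an illustration: the helper name `blank_repeated_definitions` and the sample rows are made up, and it assumes the rows arrive already sorted by lemma and definition.

```python
from typing import Iterable, List, Optional, Tuple

# (lemma, definition, pos, sample); hypothetical row shape matching the SELECT list
Row = Tuple[str, Optional[str], Optional[str], Optional[str]]


def blank_repeated_definitions(rows: Iterable[Row]) -> List[Row]:
    """Set definition to None when the row repeats the previous (lemma, definition)."""
    out: List[Row] = []
    previous_key = None
    for lemma, definition, pos, sample in rows:
        key = (lemma, definition)
        shown = definition if key != previous_key else None
        out.append((lemma, shown, pos, sample))
        previous_key = key
    return out


# Made-up rows, already ordered by lemma and definition
rows = [
    ("good", "having desirable qualities", "a", "good news"),
    ("good", "having desirable qualities", "a", "a good report card"),
    ("good", "morally admirable", "a", None),
]
result = blank_repeated_definitions(rows)
```

Only the first row of each run keeps its definition; the other columns pass through unchanged.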
If you are using MySQL 8, try RANK(). I did not have your tables or data, so I could not test this query.
SELECT
t1.lemma
,CASE WHEN t1.r = 1 THEN t1.definition ELSE NULL END AS definition
,t1.pos
,t1.sample
FROM
(
SELECT
t.lemma
,t.definition
,t.pos
,t.sample
-- rank by sample within each definition; ordering by the partition column itself would give every row rank 1
,RANK() OVER (PARTITION BY t.definition ORDER BY t.sample) r
FROM
(
SELECT
word.lemma,
synset.definition,
synset.pos,
sampletable.sample
FROM
word
LEFT JOIN
sense ON word.wordid = sense.wordid
LEFT JOIN
synset ON sense.synsetid = synset.synsetid
LEFT JOIN
sampletable ON synset.synsetid = sampletable.synsetid
WHERE
word.lemma = 'good'
) t
) t1
ORDER BY t1.lemma, t1.definition, t1.sample;
Related
I want to use a query similar to the following to retrieve all rows in events that have at least one corresponding event_attendances row for 'male' and 'female'. The below query returns no rows (where there certainly are some events that have event_attendances from both genders).
Is there a way to do this without a subquery (due to the way the SQL is being generated in my application, a subquery would be considerably more difficult for me to implement)?
SELECT * FROM events e
LEFT JOIN event_attendances ea ON (e.id = ea.event_id)
GROUP BY e.id
HAVING ea.gender = 'female' AND ea.gender = 'male'
Use
HAVING sum(ea.gender = 'female') > 0
AND sum(ea.gender = 'male') > 0
or
HAVING count(distinct ea.gender) = 2
BTW you should use a subquery to get all data when you group.
SELECT *
FROM events
where id in
(
SELECT events.id
FROM events
LEFT JOIN event_attendances ON (events.id = event_attendances.event_id)
GROUP BY events.id
HAVING count(distinct event_attendances.gender) = 2
)
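To see why the conditional sums work where the original HAVING did not, here is a small check using Python's built-in sqlite3 module instead of MySQL. The table contents are invented for the example; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event_attendances (event_id INTEGER, gender TEXT);
    INSERT INTO events VALUES (1, 'mixed'), (2, 'women only'), (3, 'empty');
    INSERT INTO event_attendances VALUES
        (1, 'female'), (1, 'male'), (1, 'male'), (2, 'female');
""")

# Each comparison evaluates to 1 or 0 per row, so SUM counts the matching rows per event.
both = conn.execute("""
    SELECT e.id
    FROM events e
    LEFT JOIN event_attendances ea ON e.id = ea.event_id
    GROUP BY e.id
    HAVING SUM(ea.gender = 'female') > 0
       AND SUM(ea.gender = 'male') > 0
""").fetchall()

# Same result here, but this variant only counts distinct values; it does not check which ones.
two_distinct = conn.execute("""
    SELECT e.id
    FROM events e
    LEFT JOIN event_attendances ea ON e.id = ea.event_id
    GROUP BY e.id
    HAVING COUNT(DISTINCT ea.gender) = 2
""").fetchall()
```

Note that the COUNT(DISTINCT ...) form would also match an event whose two distinct values were something other than 'female' and 'male', so it is only equivalent when the column can hold just those two values.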
HAVING is generally used with aggregate functions.
You should do a self-join to get the desired results, since ea.gender = 'female' AND ea.gender = 'male' is contradictory, which always returns an empty set.
You can try this
SELECT T1.*
FROM events T1
INNER JOIN
(SELECT events.id
FROM events
LEFT JOIN event_attendances ON (events.id = event_attendances.event_id)
GROUP BY events.id
HAVING COUNT(DISTINCT event_attendances.gender) = 2) T2 ON T1.id = T2.id
Hope this helps.
In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Does anyone have any suggestions?
Thanks!
EDIT: First attempt seems to mess up around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
this generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice the sub-query ordering is incorrect, and subq2 somehow is selecting from comment twice. Manually fixing that gives the right results, I am just unsure of how to get SQLAlchemy to get it right.
Try this:
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
sq = (db.session
.query(Thread.id, Thread.creation_time)
.order_by(Thread.creation_time.desc())
.limit(5)
).subquery(name='t')
sq2 = (
db.session.query(db.func.count(1))
.select_from(c2)
.filter(c.thread_id == c2.thread_id)
.filter(c.creation_time <= c2.creation_time)
.correlate(c)
.as_scalar()
)
q = (db.session
.query(
sq.c.id, sq.c.creation_time,
c.id, c.creation_time,
)
.outerjoin(c, c.thread_id == sq.c.id)
.filter(3 >= sq2)
)
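As a sanity check on what the target SQL is supposed to return (the three most recent comments for each of the newest threads), here is a sketch that runs the raw query against an in-memory SQLite database. It does not use SQLAlchemy, the rows are invented, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (id INTEGER PRIMARY KEY, creation_time INTEGER);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, thread_id INTEGER, creation_time INTEGER);
    INSERT INTO thread VALUES (1, 10), (2, 20);
    -- thread 1 has five comments, thread 2 has none
    INSERT INTO comment VALUES (10, 1, 1), (11, 1, 2), (12, 1, 3), (13, 1, 4), (14, 1, 5);
""")

rows = conn.execute("""
    SELECT t.id, c.id
    FROM (SELECT id, creation_time
          FROM thread
          ORDER BY creation_time DESC
          LIMIT 5) t
    LEFT OUTER JOIN comment c ON c.thread_id = t.id
    WHERE 3 >= (SELECT COUNT(1)
                FROM comment c2
                WHERE c.thread_id = c2.thread_id
                  AND c.creation_time <= c2.creation_time)
    ORDER BY t.id, c.creation_time DESC
""").fetchall()
```

The correlated subquery counts how many comments in the same thread are at least as new as the current one, so only the three newest survive. A thread without comments is kept as well, because the count for its NULL comment row is 0.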
I am trying to bring back a string based on an IF statement but it is extremely slow.
It has something to do with the first subquery but I am unsure of how to rearrange this as to bring back the same results but faster.
Here is my SQL:
SELECT IF
(
(
SELECT COUNT(*)
FROM
(
SELECT DISTINCT enquiryId, type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id
) AS parts
WHERE parts.enquiryId = enquiries.id
) > 1, 'Mixed',
(
SELECT DISTINCT type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id AND enquiryId = enquiries.id
)
) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
How can I make it faster?
I have modified my original query below, but I am getting the error that subquery returns more than one row:
SELECT
(SELECT
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
Please have a look if this query yields the same results:
SELECT
enquiryId,
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId
But N.B.'s comment is still valid. To see whether an index is used, and for other information, we need to see the EXPLAIN output and the table definitions.
This should get you what you want.
I would first pre-query your parts enquiries and parts service types, looking for both the distinct count and the MINIMUM of the part 'type', grouped by the enquiry ID.
Then, run your IF() against that result. If the distinct count is > 1, then 'Mixed'. If there is only one, since I used MIN(), it holds the description of that one value, which is what you want anyhow.
SELECT
E.ID,
IF ( PreQuery.DistTypes > 1, 'Mixed', PreQuery.FirstType ) as PartType
from
Enquiries E
JOIN ( SELECT
PE.EnquiryID,
COUNT( DISTINCT PE.ServiceTypeID ) as DistTypes,
MIN( PST.Type ) as FirstType
from
Parts_Enquiries PE
JOIN Parts_Service_Types PST
ON PE.ServiceTypeID = PST.ID
group by
PE.EnquiryID ) as PreQuery
ON E.ID = PreQuery.EnquiryID
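Here is a quick way to try the pre-aggregation idea on toy data, using Python's sqlite3 module rather than MySQL. The rows and the 'Repair' / 'Service' type names are invented, and CASE is used in place of MySQL's IF() so the statement stays portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enquiries (id INTEGER PRIMARY KEY);
    CREATE TABLE parts_service_types (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE parts_enquiries (enquiryId INTEGER, serviceTypeId INTEGER);
    INSERT INTO enquiries VALUES (1), (2);
    INSERT INTO parts_service_types VALUES (1, 'Repair'), (2, 'Service');
    -- enquiry 1 has two different types, enquiry 2 has the same type twice
    INSERT INTO parts_enquiries VALUES (1, 1), (1, 2), (2, 2), (2, 2);
""")

rows = conn.execute("""
    SELECT e.id,
           CASE WHEN pre.dist_types > 1 THEN 'Mixed' ELSE pre.first_type END AS part_type
    FROM enquiries e
    JOIN (SELECT pe.enquiryId,
                 COUNT(DISTINCT pe.serviceTypeId) AS dist_types,
                 MIN(pst.type) AS first_type
          FROM parts_enquiries pe
          JOIN parts_service_types pst ON pe.serviceTypeId = pst.id
          GROUP BY pe.enquiryId) pre ON e.id = pre.enquiryId
    ORDER BY e.id
""").fetchall()
```

The grouped subquery is evaluated once and then joined, instead of running a correlated subquery for every enquiry row, which is where the speed-up is expected to come from.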
I have the following SQL query, but it is not quite what I want:
SELECT
TOP (20) Attribs.ImageID AS ItemID
FROM
Attribs
LEFT OUTER JOIN
Items ON Attribs.ImageID = Items.ImageID
WHERE
(attribID IN ('a','b','c','d','e'))
AND (deleted NOT IN (1,2))
AND Attribs.attribID = 'a' AND Attribs.attribID = 'b'
GROUP BY
Attribs.ImageID
ORDER BY
COUNT(DISTINCT attribID) DESC
What I need is to query
AND Attribs.attribID = 'a' AND Attribs.attribID = 'b'
first, then apply the rest of the WHERE clause to the results of that query.
Is this possible to achieve using sub query?
I'm using SQL Server 2008
Thank you
I'm not totally getting the reason why you want to do this one query first before the other.... but you could use a Common Table Expression (CTE) - something like this:
;WITH FirstQuery AS
(
SELECT a.ImageId
FROM dbo.Attribs a
WHERE a.attribID = 'a' AND a.attribID = 'b'
)
SELECT
TOP (20) a.ImageID AS ItemID
FROM
dbo.Attribs a
INNER JOIN
FirstQuery fq ON a.ImageId = fq.ImageId
LEFT OUTER JOIN
dbo.Items i ON a.ImageID = i.ImageID
WHERE
(attribID IN ('a','b','c','d','e'))
AND (deleted NOT IN (1,2))
GROUP BY
a.ImageID
ORDER BY
COUNT(DISTINCT attribID) DESC
With this, you first select the ImageID values from your dbo.Attribs table in the CTE, then join that result set back to dbo.Attribs and on to the Items table.
Do you want to do that for performance reasons? Because splitting this up won't change the results.
Anyway, you can do this like:
SELECT TOP (20) rn_Attribs.ImageID AS ItemID
FROM (SELECT *
FROM Attribs
WHERE Attribs.attribID = '123' AND Attribs.attribID = '456') rn_Attribs
LEFT OUTER JOIN Items ON rn_Attribs.ImageID = Items.ImageID
WHERE(attribID IN ('a','b','c'))
AND (deleted NOT IN (1,2))
GROUP BY rn_Attribs.ImageID
ORDER BY COUNT(DISTINCT attribID) DESC
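One thing worth checking in all of the queries above: a single row can never satisfy attribID = 'a' AND attribID = 'b' at the same time, so that filter returns nothing on its own. If the intent is "images that have both attributes" (an assumption, the question does not say so explicitly), a grouped filter is one way to express it. A small sketch with sqlite3 and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attribs (ImageID INTEGER, attribID TEXT);
    INSERT INTO Attribs VALUES (1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (3, 'b');
""")

# Row-level AND on the same column: no single row can match both values.
per_row = conn.execute("""
    SELECT ImageID FROM Attribs
    WHERE attribID = 'a' AND attribID = 'b'
""").fetchall()

# Group-level check: keep images that have both 'a' and 'b' among their rows.
per_group = conn.execute("""
    SELECT ImageID FROM Attribs
    WHERE attribID IN ('a', 'b')
    GROUP BY ImageID
    HAVING COUNT(DISTINCT attribID) = 2
""").fetchall()
```

The grouped form could be dropped into the CTE or the derived table in place of the row-level condition.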
I have a correlated subquery that will return a list of quantities, but I need the highest quantity, and only the highest. So I tried to introduce an order by and a LIMIT of 1 to achieve this, but MySQL throws an error stating it doesn't yet support limits in subqueries. Any thoughts on how to work around this?
SELECT Product.Name, ProductOption.Name, a.Qty, a.Price, SheetSize.UpgradeCost,
FinishType.Name, FinishOption.Name, FinishTierPrice.Qty, FinishTierPrice.Price
FROM `Product`
JOIN `ProductOption`
ON Product.idProduct = ProductOption.Product_idProduct
JOIN `ProductOptionTier` AS a
ON a.ProductOption_idProductOption = ProductOption.idProductOption
JOIN `PaperSize`
ON PaperSize.idPaperSize = ProductOption.PaperSize_idPaperSize
JOIN `SheetSize`
ON SheetSize.PaperSize_idPaperSize = PaperSize.idPaperSize
JOIN `FinishOption`
ON FinishOption.Product_idProduct = Product.idProduct
JOIN `FinishType`
ON FinishType.idFinishType = FinishOption.Finishtype_idFinishType
JOIN `FinishTierPrice`
ON FinishTierPrice.FinishOption_idFinishOption = FinishOption.idFinishOption
WHERE Product.idProduct = 1
AND FinishTierPrice.idFinishTierPrice IN (SELECT FinishTierPrice.idFinishTierPrice
FROM `FinishTierPrice`
WHERE FinishTierPrice.Qty <= a.Qty
ORDER BY a.Qty DESC
LIMIT 1)
This is a variation of the greatest-n-per-group problem that comes up frequently.
You want the single row from FinishTierPrice (call it p1), matching the FinishOption and with the greatest Qty, but still less than or equal to the Qty of the ProductOptionTier.
One way to do this is to try to match a second row (p2) from FinishTierPrice that would have the same FinishOption and a greater Qty. If no such row exists (use an outer join and test that it's NULL), then the row found by p1 is the greatest.
SELECT Product.Name, ProductOption.Name, a.Qty, a.Price, SheetSize.UpgradeCost,
FinishType.Name, FinishOption.Name, p1.Qty, p1.Price
FROM `Product`
JOIN `ProductOption`
ON Product.idProduct = ProductOption.Product_idProduct
JOIN `ProductOptionTier` AS a
ON a.ProductOption_idProductOption = ProductOption.idProductOption
JOIN `PaperSize`
ON PaperSize.idPaperSize = ProductOption.PaperSize_idPaperSize
JOIN `SheetSize`
ON SheetSize.PaperSize_idPaperSize = PaperSize.idPaperSize
JOIN `FinishOption`
ON FinishOption.Product_idProduct = Product.idProduct
JOIN `FinishType`
ON FinishType.idFinishType = FinishOption.Finishtype_idFinishType
JOIN `FinishTierPrice` AS p1
ON p1.FinishOption_idFinishOption = FinishOption.idFinishOption
AND p1.Qty <= a.Qty
LEFT OUTER JOIN `FinishTierPrice` AS p2
ON p2.FinishOption_idFinishOption = FinishOption.idFinishOption
AND p2.Qty <= a.Qty AND (p2.Qty > p1.Qty OR p2.Qty = p1.Qty
AND p2.idFinishTierPrice > p1.idFinishTierPrice)
WHERE Product.idProduct = 1
AND p2.idFinishTierPrice IS NULL
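The outer-join trick is easier to follow on a cut-down schema. The sketch below uses sqlite3 with two made-up tables: `tier` stands in for ProductOptionTier and `price` for FinishTierPrice. The FinishOption match is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tier (qty INTEGER);
    CREATE TABLE price (id INTEGER PRIMARY KEY, qty INTEGER, price REAL);
    INSERT INTO tier VALUES (100), (250), (1000);
    INSERT INTO price VALUES (1, 50, 9.0), (2, 200, 8.0), (3, 500, 7.0);
""")

rows = conn.execute("""
    SELECT a.qty, p1.qty, p1.price
    FROM tier a
    JOIN price p1
      ON p1.qty <= a.qty
    LEFT JOIN price p2
      ON p2.qty <= a.qty
     AND (p2.qty > p1.qty OR (p2.qty = p1.qty AND p2.id > p1.id))
    WHERE p2.id IS NULL   -- no better candidate exists, so p1 is the greatest
    ORDER BY a.qty
""").fetchall()
```

For each tier quantity, p1 ranges over every price row at or below it, and the row kept is the one for which no larger p2 can be found. The id comparison only breaks ties between price rows with equal quantities.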