Count if avg is below/above X - mysql

I am trying to get the number of 'critics' and 'promoters' from the average of the ratings in a joined table, for a specific group of questions:
SELECT category
, SUM( IF( round(avg(items.value) ) <= 6, 1, 0) ) AS critics
, SUM( IF( round(avg(items.value) ) >= 9, 1, 0) ) AS promoters
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id in (1, 2, 4)
GROUP BY category
However I get the error:
General error: 1111 Invalid use of group function

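Aggregate functions cannot be nested, but you can filter on an aggregate with HAVING. Something like the query below returns only the categories whose rounded average is 6 or lower: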
SELECT
category,
COUNT(items.id) AS critics
FROM reviews
INNER JOIN items ON reviews.id = items.review_id AND
items.question_id IN (1, 2, 4)
GROUP BY category
HAVING ROUND(AVG(items.value)) <= 6

First compute the rounded average per category, then apply the critic/promoter conditions to that value.
-- MySQL
SELECT t.category
, CASE WHEN t.avg_value <= 6
THEN 1
ELSE 0
END critics
, CASE WHEN t.avg_value >= 9
THEN 1
ELSE 0
END promoters
FROM (SELECT category
, ROUND(AVG(items.value)) avg_value
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id IN (1, 2, 4)
GROUP BY category) t
See this fiddle for a working demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=2679b2be50c3059c73ab9754c612179c
To count reviews rather than categories, first compute the rounded average per category and review_id, then sum the critic/promoter flags per category.
SELECT t.category
, SUM(CASE WHEN t.avg_value <= 6
THEN 1
ELSE 0
END) critics
, SUM(CASE WHEN t.avg_value >= 9
THEN 1
ELSE 0
END) promoters
FROM (SELECT category
, items.review_id
, ROUND(AVG(items.value)) avg_value
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id IN (1, 2, 4)
GROUP BY category
, items.review_id) t
GROUP BY t.category
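The two-level aggregation can be tried end to end with a small script. This is only a sketch: SQLite (through Python's sqlite3 module) stands in for MySQL, and the sample rows are invented for illustration, so treat the numbers as a sanity check rather than as the asker's data.

```python
import sqlite3

# SQLite stands in for MySQL here; all sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, review_id INTEGER,
                    question_id INTEGER, value INTEGER);
INSERT INTO reviews VALUES (1, 'A'), (2, 'A'), (3, 'B'), (4, 'B');
INSERT INTO items (review_id, question_id, value) VALUES
  (1, 1, 5), (1, 2, 6), (1, 4, 7),    -- avg 6.0, critic
  (1, 3, 10),                         -- question 3 is filtered out
  (2, 1, 9), (2, 2, 10), (2, 4, 9),   -- avg 9.33, rounds to 9, promoter
  (3, 1, 7), (3, 2, 8), (3, 4, 8),    -- avg 7.67, rounds to 8, neither
  (4, 1, 10), (4, 2, 9), (4, 4, 10);  -- avg 9.67, rounds to 10, promoter
""")

NPS_SQL = """
SELECT t.category,
       SUM(CASE WHEN t.avg_value <= 6 THEN 1 ELSE 0 END) AS critics,
       SUM(CASE WHEN t.avg_value >= 9 THEN 1 ELSE 0 END) AS promoters
FROM (SELECT reviews.category, items.review_id,
             ROUND(AVG(items.value)) AS avg_value   -- inner level: one row per review
      FROM reviews
      INNER JOIN items
              ON reviews.id = items.review_id
             AND items.question_id IN (1, 2, 4)
      GROUP BY reviews.category, items.review_id) AS t
GROUP BY t.category                                  -- outer level: one row per category
ORDER BY t.category
"""

rows = conn.execute(NPS_SQL).fetchall()
for row in rows:
    print(row)
```

The inner query does the AVG and the outer query does the SUM, so no aggregate is ever nested inside another one, which is what error 1111 complains about.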

Related

Count number of ratings

I want to count each rating value per day over a given date range. I wrote the following query, which works perfectly:
SELECT c.day,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 1 AND r.campaign_id = 2) AS rating1s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 2 AND r.campaign_id = 2) AS rating2s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 3 AND r.campaign_id = 2) AS rating3s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 4 AND r.campaign_id = 2) AS rating4s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 5 AND r.campaign_id = 2) AS rating5s
FROM calendar c
WHERE c.day >= '2018-08-01'
GROUP BY c.day
ORDER BY c.day
LIMIT 0, 31
But this is not optimized: it uses 5 subqueries and takes almost 2 minutes on my localhost. How can I optimize this query? The sample output is attached and I need the same output.
You can rephrase this as conditional aggregation:
SELECT DATE(r.created_at),
COUNT(DISTINCT CASE WHEN r.rating = 1 THEN r.user_id END) as rating_1,
COUNT(DISTINCT CASE WHEN r.rating = 2 THEN r.user_id END) as rating_2,
COUNT(DISTINCT CASE WHEN r.rating = 3 THEN r.user_id END) as rating_3,
COUNT(DISTINCT CASE WHEN r.rating = 4 THEN r.user_id END) as rating_4,
COUNT(DISTINCT CASE WHEN r.rating = 5 THEN r.user_id END) as rating_5
FROM ratings r
WHERE r.campaign_id = 2 AND
r.created_at >= '2018-08-01'
GROUP BY DATE(r.created_at);
COUNT(DISTINCT) can be expensive. Remove it if you can.
Otherwise, it might be faster to do the DISTINCT once:
SELECT dte,
SUM( urd.rating = 1 ) as rating_1,
SUM( urd.rating = 2 ) as rating_2,
SUM( urd.rating = 3 ) as rating_3,
SUM( urd.rating = 4 ) as rating_4,
SUM( urd.rating = 5 ) as rating_5
FROM (SELECT DISTINCT user_id, rating, DATE(r.created_at) as dte
FROM ratings r
WHERE r.campaign_id = 2 AND
r.created_at >= '2018-08-01'
) urd
GROUP BY dte;
This returns rows for each day that has at least one rating. If some days would have all zeroes, then you'll need an outer join of some sort. That adds almost nothing to the performance, so it can be tacked on if one of the above solutions works.
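One detail of conditional COUNT(DISTINCT ...) is worth checking on a toy table: the CASE should yield NULL for non-matching rows, because COUNT skips NULL but treats a literal 0 as one more distinct value. The sketch below uses SQLite through Python's sqlite3 as a stand-in for MySQL, with invented rows; the helper name counts is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (user_id INTEGER, rating INTEGER,
                      campaign_id INTEGER, created_at TEXT);
INSERT INTO ratings VALUES
  (1, 1, 2, '2018-08-01 09:00:00'),
  (2, 1, 2, '2018-08-01 10:00:00'),
  (2, 1, 2, '2018-08-01 11:00:00'),  -- same user gives rating 1 twice
  (3, 5, 2, '2018-08-01 12:00:00');
""")

def counts(else_branch):
    """Run the conditional aggregation with the given ELSE branch ('' or 'ELSE 0')."""
    sql = f"""
    SELECT DATE(r.created_at) AS day,
           COUNT(DISTINCT CASE WHEN r.rating = 1 THEN r.user_id {else_branch} END) AS rating_1,
           COUNT(DISTINCT CASE WHEN r.rating = 5 THEN r.user_id {else_branch} END) AS rating_5
    FROM ratings r
    WHERE r.campaign_id = 2 AND r.created_at >= '2018-08-01'
    GROUP BY DATE(r.created_at)
    """
    return conn.execute(sql).fetchall()

with_null = counts("")        # non-matching rows become NULL and are skipped
with_zero = counts("ELSE 0")  # non-matching rows become 0 and add a distinct value

print(with_null)
print(with_zero)
```

With these rows the NULL variant reports 2 users for rating 1 and 1 user for rating 5, while the ELSE 0 variant reports one more in each column.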
Here is a query I made using @Gordon's answer:
SELECT DATE(r.created_at),
COUNT(
DISTINCT
CASE
WHEN r.rating = 1
THEN user_id
ELSE NULL -- NULL, not 0: COUNT(DISTINCT) would count 0 as an extra user
END
) as rating1s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 2
THEN user_id
ELSE NULL
END
) as rating2s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 3
THEN user_id
ELSE NULL
END
) as rating3s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 4
THEN user_id
ELSE NULL
END
) as rating4s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 5
THEN user_id
ELSE NULL
END
) as rating5s
FROM ratings r
WHERE r.campaign_id = 2 AND
DATE(r.created_at) >= '2018-08-01'
GROUP BY DATE(r.created_at)
This is still not optimized but much better than my initial solution.

Mysql query subselect getting NULL

I'm trying to join three tables and get the active plan of a vendor. A vendor may have had many plans in the past, but only the active one counts.
The whole query is bigger (it also counts the vendor's items and so on), which is why I did it with a subselect, but this example should be enough.
I always get NULL for plantitle and planstatus. How can I fix this?
Query
SELECT v.title
, plans.title AS plantitle
, uplans.status AS planstatus
, uplans.uid
, COUNT(DISTINCT obs.id) AS obj_count
, sum(case when obs.published = -1 then 1 else 0 end) trash
, sum(case when obs.published = 1 then 1 else 0 end) published
, sum(case when obs.published = 0 then 1 else 0 end) unpublished
FROM `vendors` AS v
LEFT JOIN objects AS obs ON obs.vid = v.id
LEFT JOIN `userplans` AS uplans ON uplans.uid = (
SELECT up.id
FROM `userplans`AS up
WHERE up.uid=v.uid AND status = "ACTIVE" LIMIT 1)
LEFT JOIN `plans` AS plans ON plans.id=uplans.pid
GROUP BY v.id
ORDER BY v.id asc
Tables
Vendors
id, uid, title
10, 1, Name 1
20, 4, Name 2
30, 5, Name 3
Plans
id, title
40, Plan 1
50, Plan 2
Userplans
id, uid, pid, status
1, 1, 40, CANCELED
2, 1, 50, CANCELED
3, 1, 40, CANCELED
4, 4, 50, CANCELED
5, 4, 50, CANCELED
6, 4, 50, ACTIVE
7, 1, 40, ACTIVE
Let's get the object counts first, since the associations to the other tables may be one-to-many, which would inflate the counts. Then join to the other information needed.
This still assumes that a combination of user and plan in userplans can have only one active record. If it can have more than one, I still need to know which active userplan to select.
Also, why the left joins? Are you after all vendors regardless of plans, objects and userplans? Is it possible that a vendor has no active plan, in which case the title would be NULL?
-- Table names and aliases are case-sensitive on many MySQL installs, so casing is kept consistent.
SELECT v.title
, p.title AS plantitle
, up.status AS planstatus
, up.uid
, o.obj_count
, o.trash
, o.published
, o.unpublished
FROM vendors v
LEFT JOIN userplans up
ON v.uid = up.uid
AND up.status = 'ACTIVE'
LEFT JOIN (SELECT obs.vid
,COUNT(DISTINCT obs.id) AS obj_count
,sum(case when obs.published = -1 then 1 else 0 end) trash
,sum(case when obs.published = 1 then 1 else 0 end) published
,sum(case when obs.published = 0 then 1 else 0 end) unpublished
FROM objects obs
GROUP BY obs.vid) o
ON o.vid = v.id
LEFT JOIN `plans` p
ON p.id = up.pid
ORDER BY v.id asc
And to address the comment, here is how to get the "latest" plan regardless of status (assuming the latest plan has the highest id in the userplans table):
SELECT v.title
, p.title AS plantitle
, up.status AS planstatus
, up.uid
, o.obj_count
, o.trash
, o.published
, o.unpublished
FROM vendors v
LEFT JOIN (SELECT u1.id, u1.uid, u1.pid, u1.status -- only the columns needed; SELECT * would duplicate column names
FROM userplans u1
INNER JOIN (SELECT MAX(id) AS id -- highest id per user = latest userplan
FROM userplans
GROUP BY uid) u2
ON u1.id = u2.id) up
ON v.uid = up.uid
LEFT JOIN (SELECT obs.vid
,COUNT(DISTINCT obs.id) AS obj_count
,sum(case when obs.published = -1 then 1 else 0 end) trash
,sum(case when obs.published = 1 then 1 else 0 end) published
,sum(case when obs.published = 0 then 1 else 0 end) unpublished
FROM objects obs
GROUP BY obs.vid) o
ON o.vid = v.id
LEFT JOIN `plans` p
ON p.id = up.pid
ORDER BY v.id asc
In your join of userplans you are comparing uplans.uid with an id selected from the same table. You need to join on the same column, so change the line to:
LEFT JOIN `userplans` AS uplans ON uplans.id = (
Something like this might work:
SELECT Vendors.title, Plans.title, Userplans.status, Userplans.uid FROM Vendors, Plans, Userplans
WHERE Vendors.uid = Userplans.uid AND Plans.id = Userplans.pid AND Userplans.status = 'Active'
This assumes that each user can only ever have one active plan. Note that these inner joins leave out vendors that have no active plan.
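The id versus uid mix-up can be reproduced with the sample tables from the question. The following is a rough sketch using SQLite via Python's sqlite3 instead of MySQL; the helper name active_plans is made up, and the object counts are left out to keep it short.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendors (id INTEGER, uid INTEGER, title TEXT);
CREATE TABLE plans (id INTEGER, title TEXT);
CREATE TABLE userplans (id INTEGER, uid INTEGER, pid INTEGER, status TEXT);
INSERT INTO vendors VALUES (10, 1, 'Name 1'), (20, 4, 'Name 2'), (30, 5, 'Name 3');
INSERT INTO plans VALUES (40, 'Plan 1'), (50, 'Plan 2');
INSERT INTO userplans VALUES
  (1, 1, 40, 'CANCELED'), (2, 1, 50, 'CANCELED'), (3, 1, 40, 'CANCELED'),
  (4, 4, 50, 'CANCELED'), (5, 4, 50, 'CANCELED'), (6, 4, 50, 'ACTIVE'),
  (7, 1, 40, 'ACTIVE');
""")

def active_plans(join_column):
    """join_column is 'uid' (as in the original query) or 'id' (the suggested fix)."""
    sql = f"""
    SELECT v.title, p.title AS plantitle, up.status AS planstatus
    FROM vendors v
    LEFT JOIN userplans up ON up.{join_column} = (
        SELECT u.id                      -- the subquery returns a userplans.id
        FROM userplans u
        WHERE u.uid = v.uid AND u.status = 'ACTIVE'
        LIMIT 1)
    LEFT JOIN plans p ON p.id = up.pid
    ORDER BY v.id
    """
    return conn.execute(sql).fetchall()

broken = active_plans("uid")  # compares a user id with a userplans id
fixed = active_plans("id")    # compares like with like

print(broken)
print(fixed)
```

Comparing uplans.uid with a userplans.id finds no row for any vendor here, which matches the all-NULL symptom; comparing uplans.id finds the active plan for the first two vendors and leaves the third NULL because that vendor has no plan at all.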

How to split SQL query results into columns based on two WHERE conditions and two calculated COUNT fields?

I have the following (simplified) database schema:
Persons:
[Id] [Name]
-------------------
1 'Peter'
2 'John'
3 'Anna'
Items:
[Id] [ItemName] [ItemStatus]
-------------------
10 'Cake' 1
20 'Dog' 2
ItemDocuments:
[Id] [ItemId] [DocumentName] [Date]
-------------------
101 10 'CakeDocument1' '2016-01-01 00:00:00'
201 20 'DogDocument1' '2016-02-02 00:00:00'
301 10 'CakeDocument2' '2016-03-03 00:00:00'
401 20 'DogDocument2' '2016-04-04 00:00:00'
DocumentProcessors:
[PersonId] [DocumentId]
-------------------
1 101
1 201
2 301
I have also set up an SQL fiddle to play with: http://www.sqlfiddle.com/#!3/e6082
The relation logic is the following: every Person can work on zero or more ItemDocuments (many-to-many); each ItemDocument belongs to exactly one Item (one-to-many). An Item has a status: 1 - Active, 2 - Closed.
What I need is a report that fulfills the following requirements:
for each person in Persons table, display count of Items that have ItemDocuments related to this person
the counts should be split in two columns by ItemStatus
the query should be filterable by two optional date periods (using two BETWEEN conditions on ItemDocuments.Date field) and the Item counts should also be split into two periods
if a Person does not have any ItemDocuments assigned, it still should be shown in the results with all count values set to 0
if a Person has more than one ItemDocument for an Item, the Item still should be counted only once
Essentially, here is how the results should look if I set both periods to NULL (to read all the data):
[PersonName] [Active Items for period 1] [Closed Items for period 1] [Active Items for period 2] [Closed Items for period 2]
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
'Peter' 1 1 1 1
'John' 1 0 1 0
'Anna' 0 0 0 0
While I can create an SQL query for each requirement separately, I have trouble understanding how to combine all of them into one.
For example, I can split ItemStatus counts in two columns using
COUNT(CASE WHEN t.ItemStatus = 1 THEN 1 ELSE NULL END) AS Active,
COUNT(CASE WHEN t.ItemStatus = 2 THEN 1 ELSE NULL END) AS Closed
and I can filter by two periods (with the max/min date constants from the MS SQL Server specification to avoid NULLs for optional period dates) using
between coalesce(@start1, '1753-01-01') and coalesce(@end1, '9999-12-31')
between coalesce(@start2, '1753-01-01') and coalesce(@end2, '9999-12-31')
but how to combine all of this together, considering also JOINs between tables?
Is there any technique, join or MS SQL Server specific approach to do this in an efficient way?
My first attempt seems to work as required, but it is ugly: the same subquery is duplicated four times:
DECLARE @start1 DATETIME, @start2 DATETIME, @end1 DATETIME, @end2 DATETIME
-- SET @start2 = '2017-01-01'
SELECT
p.Name,
(SELECT COUNT(1)
FROM Items i
WHERE i.ItemStatus = 1 AND EXISTS(
SELECT 1
FROM DocumentProcessors AS dcp
INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
AND idc.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31')
)
) AS Active1,
(SELECT COUNT(*)
FROM Items i
WHERE i.ItemStatus = 2 AND EXISTS(
SELECT 1
FROM DocumentProcessors AS dcp
INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
AND idc.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31')
)
) AS Closed1,
(SELECT COUNT(1)
FROM Items i
WHERE i.ItemStatus = 1 AND EXISTS(
SELECT 1
FROM DocumentProcessors AS dcp
INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
AND idc.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31')
)
) AS Active2,
(SELECT COUNT(*)
FROM Items i
WHERE i.ItemStatus = 2 AND EXISTS(
SELECT 1
FROM DocumentProcessors AS dcp
INNER JOIN ItemDocuments AS idc ON dcp.DocumentId = idc.Id
WHERE dcp.PersonId = p.Id AND idc.ItemId = i.Id
AND idc.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31')
)
) AS Closed2
FROM Persons p
I'm not absolutely sure if I really got what you want, but you might try this
WITH AllData AS
(
SELECT p.Id AS PersonId
,p.Name AS Person
,id.Date AS DocDate
,id.DocumentName AS DocName
,i.Id AS ItemId
,i.ItemName AS ItemName
,i.ItemStatus AS ItemStatus
,CASE WHEN id.Date BETWEEN COALESCE(@start1, '1753-01-01') AND COALESCE(@end1, '9999-12-31') THEN 1 ELSE 0 END AS InPeriod1
,CASE WHEN id.Date BETWEEN COALESCE(@start2, '1753-01-01') AND COALESCE(@end2, '9999-12-31') THEN 1 ELSE 0 END AS InPeriod2
FROM Persons AS p
LEFT JOIN DocumentProcessors AS dp ON p.Id=dp.PersonId
LEFT JOIN ItemDocuments AS id ON dp.DocumentId=id.Id
LEFT JOIN Items AS i ON id.ItemId=i.Id
)
-- COUNT(DISTINCT ItemId) so that an Item with several documents is counted only once
SELECT PersonID
,Person
,COUNT(DISTINCT CASE WHEN ItemStatus = 1 AND InPeriod1 = 1 THEN ItemId ELSE NULL END) AS ActiveIn1
,COUNT(DISTINCT CASE WHEN ItemStatus = 2 AND InPeriod1 = 1 THEN ItemId ELSE NULL END) AS ClosedIn1
,COUNT(DISTINCT CASE WHEN ItemStatus = 1 AND InPeriod2 = 1 THEN ItemId ELSE NULL END) AS ActiveIn2
,COUNT(DISTINCT CASE WHEN ItemStatus = 2 AND InPeriod2 = 1 THEN ItemId ELSE NULL END) AS ClosedIn2
FROM AllData
GROUP BY PersonID,Person
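To check a query of this shape against the sample rows from the question, here is a small sketch that runs it in SQLite through Python's sqlite3 rather than in SQL Server. Several things are assumptions made for the demo: the date column is renamed DocDate, the optional period bounds are passed as named parameters (None means "no bound"), the counts use DISTINCT item ids so that an item with several documents counts once, and the helper name report is hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (Id INTEGER, Name TEXT);
CREATE TABLE Items (Id INTEGER, ItemName TEXT, ItemStatus INTEGER);
CREATE TABLE ItemDocuments (Id INTEGER, ItemId INTEGER, DocumentName TEXT, DocDate TEXT);
CREATE TABLE DocumentProcessors (PersonId INTEGER, DocumentId INTEGER);
INSERT INTO Persons VALUES (1, 'Peter'), (2, 'John'), (3, 'Anna');
INSERT INTO Items VALUES (10, 'Cake', 1), (20, 'Dog', 2);
INSERT INTO ItemDocuments VALUES
  (101, 10, 'CakeDocument1', '2016-01-01 00:00:00'),
  (201, 20, 'DogDocument1',  '2016-02-02 00:00:00'),
  (301, 10, 'CakeDocument2', '2016-03-03 00:00:00'),
  (401, 20, 'DogDocument2',  '2016-04-04 00:00:00');
INSERT INTO DocumentProcessors VALUES (1, 101), (1, 201), (2, 301);
""")

REPORT_SQL = """
WITH AllData AS (
    SELECT p.Id AS PersonId, p.Name AS Person, i.Id AS ItemId, i.ItemStatus,
           CASE WHEN d.DocDate BETWEEN COALESCE(:start1, '1753-01-01')
                                   AND COALESCE(:end1, '9999-12-31')
                THEN 1 ELSE 0 END AS InPeriod1,
           CASE WHEN d.DocDate BETWEEN COALESCE(:start2, '1753-01-01')
                                   AND COALESCE(:end2, '9999-12-31')
                THEN 1 ELSE 0 END AS InPeriod2
    FROM Persons p
    LEFT JOIN DocumentProcessors dp ON dp.PersonId = p.Id  -- keeps persons without documents
    LEFT JOIN ItemDocuments d ON d.Id = dp.DocumentId
    LEFT JOIN Items i ON i.Id = d.ItemId
)
SELECT Person,
       COUNT(DISTINCT CASE WHEN ItemStatus = 1 AND InPeriod1 = 1 THEN ItemId END) AS Active1,
       COUNT(DISTINCT CASE WHEN ItemStatus = 2 AND InPeriod1 = 1 THEN ItemId END) AS Closed1,
       COUNT(DISTINCT CASE WHEN ItemStatus = 1 AND InPeriod2 = 1 THEN ItemId END) AS Active2,
       COUNT(DISTINCT CASE WHEN ItemStatus = 2 AND InPeriod2 = 1 THEN ItemId END) AS Closed2
FROM AllData
GROUP BY PersonId, Person
ORDER BY PersonId
"""

def report(start1=None, end1=None, start2=None, end2=None):
    """Return one row per person; a bound of None leaves that side of the period open."""
    params = {"start1": start1, "end1": end1, "start2": start2, "end2": end2}
    return conn.execute(REPORT_SQL, params).fetchall()

all_time = report()
from_march = report(start2="2016-03-01")
print(all_time)
print(from_march)
```

With both periods left open the rows match the expected table from the question; restricting period 2 to dates from 2016-03-01 onward leaves only John's CakeDocument2 in that period.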

most recent entry made in table bases on one year interval mysql

Using the following sqlfiddle here, how would I find the most recent payment made between 2012-04-01 and 2013-03-31 (one year starting 2012-04-01), using a CASE expression as in the previous queries?
I tried this:
max(case when py.pay_date >= STR_TO_DATE(CONCAT(2012, '-04-01'),'%Y-%m-%d') and py.pay_date <= STR_TO_DATE(CONCAT(2012, '-03-31'), '%Y-%m-%d') + interval 1 year then py.amount end) CURRENT_PAY
However, the answer I am getting is incorrect; the correct answer should be: (12, '2012-12-12', 20, 1)
Please provide some assistance, thank you.
Rather than a CASE inside your MAX() aggregate, that condition belongs in the WHERE clause. This joins against a subquery which pulls the most recent payment per person_id by joining on MAX(pay_date), person_id.
SELECT payment.*
FROM
payment
JOIN (
SELECT MAX(pay_date) AS pay_date, person_id
FROM payment
WHERE pay_date BETWEEN '2012-04-01' AND DATE_ADD('2012-03-31', INTERVAL 1 YEAR)
GROUP BY person_id
) maxp ON payment.person_id = maxp.person_id AND payment.pay_date = maxp.pay_date
Here is an updated fiddle with the ids corrected in your table (since a bunch of them were 15). This returns record 18, for 2013-03-28.
Update
After seeing the correct SQL fiddle... To incorporate the results of this query into your existing one, you can LEFT JOIN against it as a subquery on p.id.
select p.name,
v.v_name,
sum(case when Month(py.pay_date) = 4 then py.amount end) april_amount,
(case when max(py.pay_date)and month(py.pay_date)= 4 then py.amount else 0 end) max_pay_april,
sum(case
when Month(py.pay_date) = Month(curdate())
then py.amount end) current_month_amount,
sum(case
when Month(py.pay_date) = Month(curdate())-1
then py.amount end) previous_month_amount,
maxp.pay_date AS last_pay_date,
maxp.amount AS last_pay_amount
from persons p
left join vehicle v
on p.id = v.person_veh
left join payment py
on p.id = py.person_id
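/* LEFT JOIN against the subquery: one row per person, the latest payment in the window.
   The amount is looked up by joining back to payment; grouping by amount would return
   one row per distinct amount and multiply the outer rows. */
left join (
SELECT pm.person_id, pm.pay_date, pm.amount
FROM payment pm
JOIN (
SELECT person_id, MAX(pay_date) AS pay_date
FROM payment
WHERE pay_date BETWEEN '2012-04-01' AND DATE_ADD('2012-03-31', INTERVAL 1 YEAR)
GROUP BY person_id
) last_pay ON last_pay.person_id = pm.person_id AND last_pay.pay_date = pm.pay_date
) maxp ON maxp.person_id = p.id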
group by p.name,
v.v_name
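The "latest row per person inside a date window" pattern can be checked in isolation. The sketch below runs it in SQLite through Python's sqlite3 rather than MySQL; the payment rows are invented (only the row with id 12 mirrors the expected answer quoted in the question), and the window bounds are passed as named parameters instead of being computed with DATE_ADD.

```python
import sqlite3

# SQLite stands in for MySQL; apart from row 12 the payments are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (id INTEGER, pay_date TEXT, amount INTEGER, person_id INTEGER);
INSERT INTO payment VALUES
  (10, '2012-03-15',  5, 1),   -- before the window
  (11, '2012-05-10', 15, 1),
  (12, '2012-12-12', 20, 1),   -- latest in the window for person 1
  (13, '2013-06-01', 30, 1),   -- after the window
  (18, '2013-03-28', 25, 2);   -- latest in the window for person 2
""")

LATEST_SQL = """
SELECT payment.id, payment.pay_date, payment.amount, payment.person_id
FROM payment
JOIN (SELECT person_id, MAX(pay_date) AS pay_date   -- latest date per person in the window
      FROM payment
      WHERE pay_date BETWEEN :window_start AND :window_end
      GROUP BY person_id) AS maxp
  ON payment.person_id = maxp.person_id
 AND payment.pay_date = maxp.pay_date               -- join back to fetch the full row
ORDER BY payment.person_id
"""

latest = conn.execute(
    LATEST_SQL, {"window_start": "2012-04-01", "window_end": "2013-03-31"}
).fetchall()
print(latest)
```

The date filter sits in the WHERE clause of the inner query, so payments outside the window never take part in the MAX. If a person has two payments on the same latest date, both rows come back.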

how to join after left join complex mysql queries

I have this query
SELECT
currency_code,
SUM(CASE WHEN TYPE = 'buy'THEN to_amount END ) AS BUY,
SUM(CASE WHEN TYPE = 'sell' THEN to_amount END ) AS SELL,
SUM(CASE WHEN TYPE = 'sell' THEN rate END ) AS SELL_RATE,
SUM(CASE WHEN TYPE = 'buy' THEN rate END ) AS BUY_RATE,
AVG(CASE WHEN TYPE = 'buy' THEN rate END ) AS AVG_BUY_RATE,
AVG(CASE WHEN TYPE = 'sell' THEN rate END ) AS AVG_SELL_RATE
FROM tb_currency
LEFT JOIN tb_bill
ON tb_currency.CURRENCY_ID = tb_bill.CURRENCY_ID
AND tb_bill.TYPE IN ('buy', 'sell')
AND date( DATE_TIME ) >= '2011-01-01'
AND date( DATE_TIME ) <= '2011-01-11'
GROUP BY currency_code
that will output this:
Now I want to join this query with another table called tb_user.
tb_user has a PK called user_id, and tb_bill, which is used in the query above, has a foreign key also called user_id.
tb_user
user_id (pk)| user_name | branch_id
tb_bill
bill_id (pk) | user_id (fk)|
The desired result is the output above plus one column, branch_id.
If a row has no branch_id, return NULL.
I tried several times but still can't join it correctly. Hope you guys can help.
Thanks.
The three conditions in the join (the AND clauses) might be giving you trouble. Those three conditions are selection criteria, not join criteria, so the query below moves them into a WHERE clause. Be aware that filtering tb_bill in WHERE also drops currencies that have no matching bills; keep the conditions in the ON clause if those currencies must still appear.
Also, your use of CASE looks odd to me. I'm sure it works, but IF might be better suited for a one-condition function. In the SUM calls below, if the fields are floating point rather than integer then replace the 0 with 0.0. The AVG calls use NULL instead of 0 so that rows of the other type are skipped rather than averaged in as zeros.
SELECT currency_code,
SUM(IF(TYPE = 'buy', to_amount, 0)) AS BUY,
SUM(IF(TYPE = 'sell', to_amount, 0)) AS SELL,
SUM(IF(TYPE = 'sell', rate, 0)) AS SELL_RATE,
SUM(IF(TYPE = 'buy', rate, 0)) AS BUY_RATE,
AVG(IF(TYPE = 'buy', rate, NULL)) AS AVG_BUY_RATE,
AVG(IF(TYPE = 'sell', rate, NULL)) AS AVG_SELL_RATE,
tb_user.branch_id
FROM tb_currency
LEFT JOIN tb_bill ON tb_currency.CURRENCY_ID = tb_bill.CURRENCY_ID
LEFT JOIN tb_user ON tb_bill.user_id = tb_user.user_id
WHERE tb_bill.TYPE IN ('buy', 'sell')
AND date( DATE_TIME ) >= '2011-01-01'
AND date( DATE_TIME ) <= '2011-01-11'
GROUP BY currency_code, tb_user.user_id
Finally, all-cap field names look odd to my eye as well. Whatever works for you though.
Add tb_user.user_id to the SELECT list.
After
LEFT JOIN tb_bill ON tb_currency.CURRENCY_ID = tb_bill.CURRENCY_ID
add
LEFT JOIN tb_user ON tb_user.user_id = tb_bill.user_id
You are also missing a WHERE (use it in place of the first AND).
Finally, group by both columns:
GROUP BY currency_code, tb_user.user_id
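Two details from this thread are easy to check on a toy dataset: whether the bill filters sit in the ON clause or in WHERE, and whether the non-matching branch of a conditional AVG is NULL or 0. The sketch below uses SQLite through Python's sqlite3 as a stand-in for MySQL (so CASE replaces IF); the rows and the helper name run are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; the currencies and bills are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_currency (currency_id INTEGER, currency_code TEXT);
CREATE TABLE tb_bill (bill_id INTEGER, currency_id INTEGER, type TEXT,
                      rate REAL, date_time TEXT);
INSERT INTO tb_currency VALUES (1, 'USD'), (2, 'EUR');   -- EUR has no bills
INSERT INTO tb_bill VALUES
  (1, 1, 'buy',  2.0, '2011-01-02 10:00:00'),
  (2, 1, 'buy',  4.0, '2011-01-03 10:00:00'),
  (3, 1, 'sell', 3.0, '2011-01-04 10:00:00');
""")

BILL_FILTER = ("b.type IN ('buy', 'sell') "
               "AND DATE(b.date_time) BETWEEN '2011-01-01' AND '2011-01-11'")

def run(filter_keyword):
    """filter_keyword is 'AND' (filter stays in the ON clause) or 'WHERE'."""
    sql = f"""
    SELECT c.currency_code,
           AVG(CASE WHEN b.type = 'buy' THEN b.rate END)        AS avg_buy_null,
           AVG(CASE WHEN b.type = 'buy' THEN b.rate ELSE 0 END) AS avg_buy_zero
    FROM tb_currency c
    LEFT JOIN tb_bill b ON c.currency_id = b.currency_id
    {filter_keyword} {BILL_FILTER}
    GROUP BY c.currency_code
    ORDER BY c.currency_code
    """
    return conn.execute(sql).fetchall()

on_rows = run("AND")       # EUR survives with a NULL average
where_rows = run("WHERE")  # EUR is filtered out, as with an inner join

print(on_rows)
print(where_rows)
```

With the filter in ON, EUR still appears even though it has no bills; with the filter in WHERE it disappears. In both cases the ELSE 0 variant reports a buy average of 2.0 for USD instead of 3.0, because the sell row is averaged in as a zero.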