How do I get the MySQL average of a join with conditions? - mysql

I am trying to get the average ratings of a user by project type but include all users that have ratings regardless of type.
SELECT projects.user_id, AVG(ratings.rating) AS avg1
FROM projects
JOIN ratings
ON projects.project_id = ratings.project_id
WHERE projects.type='0'
GROUP BY projects.user_id;
Thanks in advance for the help.
The output I get for type 0 is:
user_id | avg1
-----------------
11 | 2.25
but I am trying to get:
user_id | avg1
-----------------
11 | 2.25
12 | 0
Because user 12 has a project in the ratings table, but none of type 0, I still want that user in the output with avg1 = 0.
The output for type 1 works as expected because all users that have ratings also have type 1:
user_id | avg1
-----------------
11 | 4
12 | 2.5
Projects table is: (only the first 4 projects are in the ratings table)
project_id |user_id | type
--------------------------
51 11 0
52 12 1
53 11 0
54 11 1
55 12 1
56 13 0
57 14 1
Ratings table is:
project_id | rating
-------------------
51 0
51 1
52 4
51 5
52 2
53 3
54 4
52 1.5

Use conditional aggregation:
SELECT p.user_id,
COALESCE(AVG(CASE WHEN p.type = '0' THEN r.rating END), 0) AS avg1
FROM projects p JOIN ratings r
ON p.project_id = r.project_id
GROUP BY p.user_id;
See the demo.
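To check this against the sample data without a MySQL server, here is a minimal sketch that is not part of the original answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the column types and the added ORDER BY are my own choices. It runs the original WHERE version next to the conditional-aggregation version: the WHERE clause drops user 12 before grouping, while the CASE expression keeps the row and lets AVG see only NULLs for that user, which COALESCE then turns into 0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project_id INTEGER, user_id INTEGER, type TEXT);
CREATE TABLE ratings  (project_id INTEGER, rating REAL);
INSERT INTO projects VALUES
  (51, 11, '0'), (52, 12, '1'), (53, 11, '0'), (54, 11, '1'),
  (55, 12, '1'), (56, 13, '0'), (57, 14, '1');
INSERT INTO ratings VALUES
  (51, 0), (51, 1), (52, 4), (51, 5), (52, 2), (53, 3), (54, 4), (52, 1.5);
""")

# Original query: the WHERE clause removes every row of user 12 before grouping.
where_rows = conn.execute("""
SELECT p.user_id, AVG(r.rating) AS avg1
FROM projects p JOIN ratings r ON p.project_id = r.project_id
WHERE p.type = '0'
GROUP BY p.user_id
ORDER BY p.user_id
""").fetchall()

# Conditional aggregation: every rated user stays in the result; ratings of
# other types become NULL, which AVG ignores, and COALESCE maps "no rows" to 0.
case_rows = conn.execute("""
SELECT p.user_id,
       COALESCE(AVG(CASE WHEN p.type = '0' THEN r.rating END), 0) AS avg1
FROM projects p JOIN ratings r ON p.project_id = r.project_id
GROUP BY p.user_id
ORDER BY p.user_id
""").fetchall()

print(where_rows)
print(case_rows)
```

Users 13 and 14 still do not appear, because the inner join only keeps projects that have at least one rating.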

Related

Getting wrong data from DB when joining MySql [duplicate]

I have a table of revenue as
title_id revenue cost
1 10 5
2 10 5
3 10 5
4 10 5
1 20 6
2 20 6
3 20 6
4 20 6
when i execute this query
SELECT SUM(revenue),SUM(cost)
FROM revenue
GROUP BY revenue.title_id
it produces result
title_id revenue cost
1 30 11
2 30 11
3 30 11
4 30 11
which is ok, now i want to combine sum result with another table which has structure like this
title_id interest
1 10
2 10
3 10
4 10
1 20
2 20
3 20
4 20
when i execute join with aggregate function like this
SELECT SUM(revenue),SUM(cost),SUM(interest)
FROM revenue
LEFT JOIN fund ON revenue.title_id = fund.title_id
GROUP BY revenue.title_id,fund.title_id
it doubles the result
title_id revenue cost interest
1 60 22 60
2 60 22 60
3 60 22 60
4 60 22 60
I can't understand why it is doubling, please help.
It's doubling because each title_id is repeated in both the fund and revenue tables. The join pairs every revenue row for a title with every fund row for that title, so two rows on each side become four joined rows and each sum counts every value twice. This is pretty easy to see if you remove the aggregate functions and look at the raw data. See here
The way to get around this is to create inline views of your aggregates and join on those results.
SELECT r.title_id,
       r.revenue,
       r.cost,
       f.interest
FROM (SELECT title_id,
             SUM(revenue) revenue,
             SUM(cost) cost
      FROM revenue
      GROUP BY revenue.title_id) r
LEFT JOIN (SELECT title_id,
                  SUM(interest) interest
           FROM fund
           GROUP BY title_id) f
       ON r.title_id = f.title_id
output
| TITLE_ID | REVENUE | COST | INTEREST |
----------------------------------------
| 1 | 30 | 11 | 30 |
| 2 | 30 | 11 | 30 |
| 3 | 30 | 11 | 30 |
| 4 | 30 | 11 | 30 |
demo
The reason is that you joined the two tables before aggregating, so each revenue row was matched with every fund row for the same title_id. To solve the problem, aggregate each table in its own subquery first, then join the aggregated revenue result to the aggregated fund result using LEFT JOIN.
SELECT b.title_id,
       b.TotalRevenue,
       b.TotalCost,
       d.TotalInterest
FROM
(
    SELECT a.title_id,
           SUM(a.revenue) TotalRevenue,
           SUM(a.cost) TotalCost
    FROM revenue a
    GROUP BY a.title_id
) b LEFT JOIN
(
    SELECT c.title_id,
           SUM(c.interest) TotalInterest
    FROM fund c
    GROUP BY c.title_id
) d ON b.title_id = d.title_id
There are two rows for each title_id in the revenue table.
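The fan-out described in the answers above is easy to reproduce. The following is a rough sketch that is not taken from either answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, with column types I picked myself. It compares the row count of the raw join, the doubled sums, and the sums obtained by aggregating each table before joining.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revenue (title_id INTEGER, revenue INTEGER, cost INTEGER);
CREATE TABLE fund    (title_id INTEGER, interest INTEGER);
INSERT INTO revenue VALUES
  (1, 10, 5), (2, 10, 5), (3, 10, 5), (4, 10, 5),
  (1, 20, 6), (2, 20, 6), (3, 20, 6), (4, 20, 6);
INSERT INTO fund VALUES
  (1, 10), (2, 10), (3, 10), (4, 10),
  (1, 20), (2, 20), (3, 20), (4, 20);
""")

# Raw join without aggregates: 2 revenue rows x 2 fund rows per title.
joined_row_count = conn.execute("""
SELECT COUNT(*)
FROM revenue r LEFT JOIN fund f ON r.title_id = f.title_id
""").fetchone()[0]

# Aggregating after the join sums every value twice.
doubled = conn.execute("""
SELECT r.title_id, SUM(r.revenue), SUM(r.cost), SUM(f.interest)
FROM revenue r LEFT JOIN fund f ON r.title_id = f.title_id
GROUP BY r.title_id
ORDER BY r.title_id
""").fetchall()

# Aggregating each table first leaves one row per title on both sides,
# so the join can no longer multiply rows.
fixed = conn.execute("""
SELECT r.title_id, r.revenue, r.cost, f.interest
FROM (SELECT title_id, SUM(revenue) AS revenue, SUM(cost) AS cost
      FROM revenue GROUP BY title_id) r
LEFT JOIN (SELECT title_id, SUM(interest) AS interest
           FROM fund GROUP BY title_id) f
       ON r.title_id = f.title_id
ORDER BY r.title_id
""").fetchall()

print(joined_row_count)
print(doubled)
print(fixed)
```

With three rows per title in one of the tables the sums would be tripled rather than doubled, which is why pre-aggregating is safer than dividing the result by two.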

MySql select all rows in one table based on MAX value in another table

I want to be able to get all the data from table 1 and table 3 below but in addition to this I also want to get the latest application stage from table 2. The latest application stage is determined by getting the max stage_date for each application.
Table 1: applications
id | applicant_id | col_x | col_y | col_z
-----------------------------------------
10 300 a b c
11 310 a b c
12 320 a b c
13 330 a b c
14 340 a b c
Table 2: application_progress
id | application_id | application_stage | stage_date | stage_notes
------------------------------------------------------------------
1 10 DRAFT 2013-01-01 (NULL)
2 10 APPLICATION 2013-01-14 (NULL)
3 10 PHASE1 2013-01-30 (NULL)
4 11 DRAFT 2013-01-01 (NULL)
4 12 DRAFT 2013-01-01 (NULL)
5 13 DRAFT 2013-01-01 (NULL)
6 14 DRAFT 2013-01-01 (NULL)
7 14 APPLICATION 2013-01-14 (NULL)
EDIT: third table
Table 3: applicants
id | applicant_name | applicant_address | programme_id
------------------------------------------------------
300 Applicant 1 abc 1
310 Applicant 2 xyz 2
320 Applicant 3 xyz 2
330 Applicant 4 xyz 2
340 Applicant 5 xyz 2
Returned data set
applicant_id | applicant_name | current_stage
---------------------------------------------------------
300 Applicant 1 PHASE1
310 Applicant 2 DRAFT
320 Applicant 3 DRAFT
330 Applicant 4 DRAFT
340 Applicant 5 APPLICATION
Am struggling with this one and would appreciate any help.
PS. Tried to put an example of sqlfiddle but it's down at the minute. I'll update this with the sqlfiddle when it's back up if haven't had an answer before this.
You can do this with a correlated subquery:
select a.*,
(select application_stage
from application_progress ap
where ap.application_id = a.id
order by stage_date desc
limit 1
) MostRecentStage
from applications a;
EDIT:
You can join in the applicant data with something like this:
select a.*, aa.*,
       (select application_stage
        from application_progress ap
        where ap.application_id = a.id
        order by stage_date desc
        limit 1
       ) MostRecentStage
from applications a join
     applicants aa
     on a.applicant_id = aa.id;
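Since sqlfiddle was down at the time, here is a small sketch for trying the correlated subquery locally. It is not from the original answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, keeps only the columns the query needs (col_x, col_y, col_z, the address and the notes are left out), and stores stage_date as ISO text so that ordering by it is chronological.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (id INTEGER, applicant_id INTEGER);
CREATE TABLE application_progress (
  application_id INTEGER, application_stage TEXT, stage_date TEXT);
CREATE TABLE applicants (id INTEGER, applicant_name TEXT);
INSERT INTO applications VALUES
  (10, 300), (11, 310), (12, 320), (13, 330), (14, 340);
INSERT INTO application_progress VALUES
  (10, 'DRAFT', '2013-01-01'), (10, 'APPLICATION', '2013-01-14'),
  (10, 'PHASE1', '2013-01-30'), (11, 'DRAFT', '2013-01-01'),
  (12, 'DRAFT', '2013-01-01'), (13, 'DRAFT', '2013-01-01'),
  (14, 'DRAFT', '2013-01-01'), (14, 'APPLICATION', '2013-01-14');
INSERT INTO applicants VALUES
  (300, 'Applicant 1'), (310, 'Applicant 2'), (320, 'Applicant 3'),
  (330, 'Applicant 4'), (340, 'Applicant 5');
""")

# For each application, the subquery picks the stage with the latest stage_date.
rows = conn.execute("""
SELECT aa.id AS applicant_id,
       aa.applicant_name,
       (SELECT ap.application_stage
        FROM application_progress ap
        WHERE ap.application_id = a.id
        ORDER BY ap.stage_date DESC
        LIMIT 1) AS current_stage
FROM applications a
JOIN applicants aa ON a.applicant_id = aa.id
ORDER BY aa.id
""").fetchall()

for row in rows:
    print(row)
```

If two stages of one application share the same stage_date, LIMIT 1 picks one of them arbitrarily, so a tie-breaker column in the ORDER BY may be worth adding.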

SQL: how to select a single id that meets multiple criteria from multiple rows

On a MySQL database, I have the table below
package_content :
id | package_id | content_number | content_name | content_quality
1 99 11 Yellow 1
2 99 22 Red 5
3 101 11 Yellow 5
4 101 33 Green 5
5 101 44 Black 5
6 120 11 Yellow 5
7 120 55 White 5
8 135 66 Pink 5
9 135 99 Orange 5
10 135 11 Yellow 5
and I am looking for a way to run search queries on it:
I would like to select the package_id that contains both content_number 11 AND 22 (in this case it should select only package_id 99).
I really don't know if this is possible in SQL, since an AND condition on a single row will always be false. If I use OR, I also get package_id 99, 101, 120 and 135, and that's not what I want.
Maybe my table is not well designed either, but any suggestions would help!
Thanks in advance
Edit
I added the content_quality column
I used the sql query from juergen, works very well
select package_id
from package_content
where content_number in (11,22)
group by package_id
having count(distinct content_number) = 2
My last question is how I could now add another criterion: select the package_id where content_number is 11 and 22, and content_number 11 has content_quality 1.
Edit 2:
For the 2nd question i use now this query. Thanks to both of you who helped me! :)
SELECT *
FROM (
SELECT package_id
FROM package_content
WHERE
(content_number=11 AND content_quality > 1)
OR (content_number = 33 AND content_quality = 5)
OR (content_number = 44 AND content_quality =5 AND content_name like 'Black')
GROUP BY package_id
HAVING count( DISTINCT content_number) = 3
)t1
LEFT JOIN package_content ON package_content.package_id = t1.package_id
This will output
id | package_id | content_number | content_name | content_quality
3 101 11 Yellow 5
4 101 33 Green 5
5 101 44 Black 5
You need to group by the package_id and then use HAVING to filter on an aggregate over the grouped data:
select package_id
from package_content
where content_number = 22
or
(
content_number = 11 and content_quality = 1
)
group by package_id
having count(distinct content_number) = 2
You could query with a self join for that:
SELECT DISTINCT a.package_id
FROM package_content a, package_content b
WHERE a.package_id = b.package_id
AND a.content_number = 11 AND b.content_number = 22
Edit: For your second question: just add that to the query. The copy of package_content aliased as a is the one that matches content_number 11, so you can also ask whether a has content_quality 1:
SELECT DISTINCT a.package_id
FROM package_content a, package_content b
WHERE a.package_id = b.package_id
AND a.content_number = 11 AND b.content_number = 22
AND a.content_quality = 1
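Both approaches can be compared on the sample table. The sketch below is my own addition, not from either answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and writes the self join with explicit JOIN syntax. It runs the GROUP BY / HAVING query and the self join with the extra content_quality condition, plus the plain "11 and 22" search.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package_content (
  id INTEGER, package_id INTEGER, content_number INTEGER,
  content_name TEXT, content_quality INTEGER);
INSERT INTO package_content VALUES
  (1, 99, 11, 'Yellow', 1), (2, 99, 22, 'Red', 5),
  (3, 101, 11, 'Yellow', 5), (4, 101, 33, 'Green', 5),
  (5, 101, 44, 'Black', 5), (6, 120, 11, 'Yellow', 5),
  (7, 120, 55, 'White', 5), (8, 135, 66, 'Pink', 5),
  (9, 135, 99, 'Orange', 5), (10, 135, 11, 'Yellow', 5);
""")

# Packages that contain both content_number 11 and 22.
both_ids = [row[0] for row in conn.execute("""
SELECT package_id
FROM package_content
WHERE content_number IN (11, 22)
GROUP BY package_id
HAVING COUNT(DISTINCT content_number) = 2
ORDER BY package_id
""")]

# Same search, but content_number 11 must also have content_quality 1.
having_ids = [row[0] for row in conn.execute("""
SELECT package_id
FROM package_content
WHERE content_number = 22
   OR (content_number = 11 AND content_quality = 1)
GROUP BY package_id
HAVING COUNT(DISTINCT content_number) = 2
ORDER BY package_id
""")]

# Self join: alias a is the row for 11, alias b is the row for 22.
self_join_ids = [row[0] for row in conn.execute("""
SELECT DISTINCT a.package_id
FROM package_content a
JOIN package_content b ON a.package_id = b.package_id
WHERE a.content_number = 11 AND a.content_quality = 1
  AND b.content_number = 22
ORDER BY a.package_id
""")]

print(both_ids, having_ids, self_join_ids)
```

The HAVING form scales more easily to many required values (change the list and the count), while the self join needs one extra alias per value.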

Select min/max from multiple items

I'll try to explain it as simply as possible:
First some database structure with dummy data.
Structure
tb_spec_fk
feature value
-----------------
1 1
1 2
1 3
1 4
1 5
2 2
2 3
3 1
3 4
4 2
4 3
4 4
5 1
5 3
5 5
6 3
6 5
tb_spec_feature
feature_id filter
------------------
1 2
2 2
3 2
4 2
5 1
6 0
tb_spec_value
value_id name
----------------
1 10
2 20
3 30
4 40
5 50
Now, what I want is the following result
Result
feature_id min_value max_value
---------------------------------
1 10 50
2 20 30
3 10 40
4 20 40
But how?
Logic
From tb_spec_feature, take the features where "filter" equals 2 and, for each one, get the highest and lowest values present in the tb_spec_value table, linked through the tb_spec_fk table.
My attempts
A lot! But I'll spare you :)
SELECT
    f.feature_id AS feature_id,
    MAX(value.name) AS max_value,
    MIN(value.name) AS min_value
FROM tb_spec_feature AS f
JOIN tb_spec_fk AS fk ON f.feature_id=fk.feature
JOIN tb_spec_value AS value ON fk.value=value.value_id
WHERE f.filter=2
GROUP BY f.feature_id
The two JOIN clauses link each feature to its values. GROUP BY groups all rows with the same feature id, and then you can take the MIN, MAX, or any other aggregate function over those rows.
Demo
Here is another way to do it: find the smallest and largest value id per feature first, then look up their names. Note that this relies on value_id and name sorting in the same order, which holds for the sample data.
select
tsf.feature_id,
tsvl.name as Min_Value,
tsvr.name as Max_Value
from tb_spec_feature as tsf
inner join (
    select feature, MIN(value) MinV, MAX(value) MaxV
    from tb_spec_fk
    group by feature
    order by feature
) as tsfkl on tsfkl.feature = tsf.feature_id
left join tb_spec_value as tsvl on tsvl.value_id = tsfkl.MinV
left join tb_spec_value as tsvr on tsvr.value_id = tsfkl.MaxV
where tsf.filter = 2
group by tsf.feature_id
Output
feature_id | Min_Value | Max_Value
---------------------------------
1 | 10 | 50
2 | 20 | 30
3 | 10 | 40
4 | 20 | 40
Fiddle Demo
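For a quick local check of the join-and-group approach, here is a minimal sketch of my own, not from either answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and stores name as an integer so MIN and MAX compare numerically. The backticks around filter are only a precaution for the stand-in database and are accepted by MySQL as well.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_spec_fk (feature INTEGER, value INTEGER);
CREATE TABLE tb_spec_feature (feature_id INTEGER, `filter` INTEGER);
CREATE TABLE tb_spec_value (value_id INTEGER, name INTEGER);
INSERT INTO tb_spec_fk VALUES
  (1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
  (2, 2), (2, 3),
  (3, 1), (3, 4),
  (4, 2), (4, 3), (4, 4),
  (5, 1), (5, 3), (5, 5),
  (6, 3), (6, 5);
INSERT INTO tb_spec_feature VALUES
  (1, 2), (2, 2), (3, 2), (4, 2), (5, 1), (6, 0);
INSERT INTO tb_spec_value VALUES
  (1, 10), (2, 20), (3, 30), (4, 40), (5, 50);
""")

# Link each feature to its values through tb_spec_fk, keep only filter = 2,
# then take the lowest and highest name per feature.
rows = conn.execute("""
SELECT f.feature_id,
       MIN(v.name) AS min_value,
       MAX(v.name) AS max_value
FROM tb_spec_feature AS f
JOIN tb_spec_fk AS fk ON f.feature_id = fk.feature
JOIN tb_spec_value AS v ON fk.value = v.value_id
WHERE f.`filter` = 2
GROUP BY f.feature_id
ORDER BY f.feature_id
""").fetchall()

for row in rows:
    print(row)
```

Aggregating on name directly, rather than on the value id, keeps the result correct even if ids and names are not in the same order.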
