I have a query that returns a score for each item in a table whose year is earlier than 1980, and a similar one for years after 1980. When I try to compute the average of one minus the average of the other, the two averages come out the same, as if my WHERE clause were being ignored. Am I doing something wrong?
Select avg(stars)
from
(Select stars
from Rating, Movie
where movie.mid = rating.mid and movie.year < 1980);
The first step would be to replace the implicit comma join with an explicit JOIN. We need to see some data to understand the question better, but here's a first pass:
SELECT AVG(stars)
FROM movie
JOIN rating ON rating.mid = movie.mid
WHERE movie.year < 1980
GROUP BY movie.mid
If this doesn't work, please post some data so I can test it.
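To get the difference in a single statement, one option is to compute each average in its own scalar subquery and subtract them. Here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question, but the rows are made up for illustration, and it averages all ratings on each side of 1980 (not per-movie averages).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (mid INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE Rating (mid INTEGER, stars INTEGER);
    -- Made-up sample rows, not taken from the question
    INSERT INTO Movie VALUES (1, 1939), (2, 1977), (3, 1982), (4, 2009);
    INSERT INTO Rating VALUES (1, 4), (1, 2), (2, 5), (3, 2), (4, 3), (4, 5);
""")

# Each scalar subquery has its own WHERE clause, so the two
# averages are computed over different sets of rows.
query = """
SELECT
    (SELECT AVG(r.stars)
       FROM Rating r JOIN Movie m ON m.mid = r.mid
      WHERE m.year < 1980)
  - (SELECT AVG(r.stars)
       FROM Rating r JOIN Movie m ON m.mid = r.mid
      WHERE m.year > 1980) AS diff
"""
diff = conn.execute(query).fetchone()[0]
print(diff)
```

If both subqueries return the same number on your real data, check that the year filter is inside each subquery and not only in an outer query.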
Using the ER diagram of IMDb, I need to find the time period in which each actor was active, by listing the earliest and the latest year in which the actor starred in a film, but only for actors who have starred in at least 10 movies.
I wrote the part regarding the acting period, but I am struggling with the "at least 10 movies" part. I understand I should use HAVING COUNT.
My answer so far is:
SELECT r.actor_id, min(m.year), max(m.year)
FROM roles r
LEFT JOIN movies m ON r.movie_id = m.id
GROUP BY r.actor_id
Try the following. As Barmar pointed out, you don't need the LEFT JOIN; a plain join is enough.
SELECT r.actor_id, min(m.year), max(m.year)
FROM roles r
JOIN movies m ON r.movie_id = m.id
GROUP BY r.actor_id
HAVING count(*) >= 10
If the roles table ever allows a single actor to play multiple roles in the same movie, count distinct movies instead:
SELECT r.actor_id, min(m.year), max(m.year)
FROM roles r
JOIN movies m ON r.movie_id = m.id
GROUP BY r.actor_id
HAVING count(DISTINCT r.movie_id) >= 10
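To see why the two counts can differ, here is a small sketch with Python's built-in sqlite3 module. The `role` column and all rows are hypothetical, added only to show one actor with two roles in the same movie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, year INTEGER);
    -- The role column is an assumption made for this illustration
    CREATE TABLE roles (actor_id INTEGER, movie_id INTEGER, role TEXT);
    INSERT INTO movies VALUES (10, 1990), (11, 1995);
    -- Actor 1 plays two roles in movie 10 and one role in movie 11
    INSERT INTO roles VALUES (1, 10, 'hero'), (1, 10, 'villain'), (1, 11, 'hero');
""")

row = conn.execute("""
    SELECT r.actor_id,
           COUNT(*)                   AS role_rows,
           COUNT(DISTINCT r.movie_id) AS movie_count,
           MIN(m.year), MAX(m.year)
    FROM roles r
    JOIN movies m ON r.movie_id = m.id
    GROUP BY r.actor_id
""").fetchone()
print(row)
```

Here COUNT(*) counts three role rows while COUNT(DISTINCT r.movie_id) counts two movies, so only the second is safe to compare against the "at least 10 movies" threshold.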
I am working on some SQL homework. Could someone explain to me how to answer this question?
Display the average raw scores of team ‘Dolphins’ (1 point)
Here is an image of the data structure.
I need to make a query that returns the average raw score of 4 players.
However, when I try executing the code below it just returns one average.
/* Question 2 */
SELECT AVG(RawScore)
FROM Bowler_Scores
WHERE BowlerID IN
(
SELECT BowlerID
FROM Bowlers
WHERE TeamID =
(
SELECT TeamID
FROM Teams
WHERE TeamName = "Dolphins"));
In bowler scores each bowler id can have multiple scores.
For instance it may have the records - (43,101) (50,301) and (43,106).
I don't know how to write an SQL statement that will get the average raw score for each player on that team out of all of their individual raw scores in the bowler scores table.
If you need the average individual scores for each member of the Dolphins Team you can use this:
Select Teams.TeamName, Bowlers.BowlerID, avg(Bowler_Scores.RawScore)
from Bowlers
inner join Teams
on Bowlers.TeamID = Teams.TeamID
inner join Bowler_Scores
on Bowlers.BowlerID = Bowler_Scores.BowlerID
where Teams.TeamName = 'Dolphins'
group by Teams.TeamName, Bowlers.BowlerID
If you just need one average score for the team then just remove the BowlerID from the SELECT and GROUP BY lines.
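The difference between the two variants can be checked on a tiny data set. This is only a sketch using Python's built-in sqlite3 module: the three Dolphins scores are the ones mentioned in the question, while the team IDs, the second team, and bowler 60 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (TeamID INTEGER PRIMARY KEY, TeamName TEXT);
    CREATE TABLE Bowlers (BowlerID INTEGER PRIMARY KEY, TeamID INTEGER);
    CREATE TABLE Bowler_Scores (BowlerID INTEGER, RawScore INTEGER);
    -- Team 'Sharks' and bowler 60 are hypothetical, added to test the filter
    INSERT INTO Teams VALUES (1, 'Dolphins'), (2, 'Sharks');
    INSERT INTO Bowlers VALUES (43, 1), (50, 1), (60, 2);
    INSERT INTO Bowler_Scores VALUES (43, 101), (50, 301), (43, 106), (60, 200);
""")

base = """
    FROM Bowlers b
    JOIN Teams t ON b.TeamID = t.TeamID
    JOIN Bowler_Scores s ON b.BowlerID = s.BowlerID
    WHERE t.TeamName = 'Dolphins'
"""

# One row per bowler: GROUP BY splits the scores by BowlerID
per_bowler = conn.execute(
    "SELECT b.BowlerID, AVG(s.RawScore)" + base +
    "GROUP BY b.BowlerID ORDER BY b.BowlerID"
).fetchall()

# One row for the whole team: no GROUP BY, all scores in one bucket
team_avg = conn.execute("SELECT AVG(s.RawScore)" + base).fetchone()[0]

print(per_bowler, team_avg)
```

Without the GROUP BY, the aggregate collapses every matching score into a single row, which is why the original subquery version returned just one average.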
Currently I have a simple SQL query to get every group's departure date and the associated group size (teamLength) between 2 dates, but it doesn't work properly.
SELECT `groups`.`departure`, COUNT(`group_users`.`group_id`) as 'teamLength'
FROM `groups`
INNER JOIN `group_users`
ON `groups`.`id` = `group_users`.`group_id`
WHERE departure BETWEEN '2017-03-01' AND '2017-03-31'
In fact, if more than one group falls between the two dates, only one departure date is returned, together with the total teamLength across all groups.
For example, if I have 2 groups in the same interval, with 2 people in group 1 and 1 person in group 2, the result will be:
Here are 2 screenshots of the current state of my groups and group_users tables:
Is it even possible to do what I want in a single SQL query? Thanks
In addition to what jarlh commented (JOIN with ON): never aggregate data without an explicit GROUP BY. I don't know why MySQL still allows this...
Change your query to something like this and you should get the result you are looking for. Currently, the other departure dates get lost in the aggregation.
SELECT
`groups`.departure,
COUNT(1) as team_length
FROM
-- backticks because GROUPS is a reserved word in MySQL 8.0+
`groups`
INNER JOIN group_users
ON `groups`.id = group_users.group_id
WHERE
`groups`.departure BETWEEN '2017-03-01' AND '2017-03-31'
GROUP BY
`groups`.departure
I think you have a syntax issue in your query: you are missing the ON clause, so your database could be computing a Cartesian product because there is no join condition.
SELECT `groups`.`departure`, COUNT(`group_users`.`id`) as 'teamLength'
FROM `groups`
INNER JOIN `group_users` ON `groups`.`id` = `group_users`.`group_id`
WHERE departure BETWEEN '2017-03-01' AND '2017-03-31'
GROUP BY `groups`.`departure`
You are also missing the GROUP BY clause. Not every RDBMS requires it, but it is good practice to include it.
I need to find the average price of a movie by genre.
The tables are Movie (movie_genre) and Price (price_rentfee)
I tried:
select movie_genre, avg(price_rentfee)
from movie, price
group by movie_genre;
It lists the movies by genre with the average rental fee,
but the average rental fee is the same for all of them.
Is there a way where I can average it out by genre?
Your query says
FROM movie, price
That's probably a mistake. It generates every possible combination of movie and price. You probably need something like this instead to get useful results.
FROM movie
JOIN price ON movie.movie_id = price.movie_id
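The effect is easy to reproduce. Below is a small sketch using Python's built-in sqlite3 module; the `movie_id` join key is assumed (as in the suggestion above) and the rows are made up. With the comma join, every genre is paired with every price, so each genre ends up averaging the whole price table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- movie_id as the shared key is an assumption; the rows are invented
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, movie_genre TEXT);
    CREATE TABLE price (movie_id INTEGER, price_rentfee REAL);
    INSERT INTO movie VALUES (1, 'Action'), (2, 'Action'), (3, 'Drama');
    INSERT INTO price VALUES (1, 4.0), (2, 6.0), (3, 2.0);
""")

# Comma join with no condition: every movie row meets every price row
cross = conn.execute("""
    SELECT movie_genre, AVG(price_rentfee)
    FROM movie, price
    GROUP BY movie_genre
    ORDER BY movie_genre
""").fetchall()

# Join on the key: each movie row meets only its own price row
joined = conn.execute("""
    SELECT movie_genre, AVG(price_rentfee)
    FROM movie
    JOIN price ON movie.movie_id = price.movie_id
    GROUP BY movie_genre
    ORDER BY movie_genre
""").fetchall()

print(cross)
print(joined)
```

On this sample data the comma join reports 4.0 for both genres (the average of all three prices), while the keyed join reports 5.0 for Action and 2.0 for Drama.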
Your query needs a key column to join the two tables.
select movie_genre, avg(price_rentfee) as avgPrice
from movie, price
where movie.movie_id = price.movie_id
group by movie_genre;
or
select movie_genre, avg(price_rentfee) as avgPrice
from movie
Left Join price
on price.movie_id = movie.movie_id
group by movie_genre;
It turns out I was missing a join, group by, and having clause. The answer I was looking for was:
select movie_genre, avg(price_rentfee)
from price
right join movie
on price.price_code = movie.price_code
group by movie_genre;
Sorry for the ambiguous question and thanks for the help.
EDIT: This has been solved, requiring a subquery into the appearances table. Here is the working solution.
SELECT concat(m.nameFirst, ' ', m.nameLast) as Name,
m.playerID as playerID,
sum(b.HR) as HR
FROM Master AS m
INNER JOIN Batting AS b
ON m.playerID=b.playerID
WHERE ((m.weight/(m.height*m.height))*703) >= 27.99
AND m.playerID in (SELECT playerID FROM appearances GROUP BY playerID HAVING SUM(G_1b+G_dh)/SUM(G_All) >= .667)
GROUP BY playerID, Name
HAVING HR >= 100
ORDER BY HR desc;
I'm working with the Lahman baseball stat database, if anyone's familiar.
I'm trying to retrieve a list of all large, slugging first basemen, and the data I need is spread across three different tables. The way I'm doing this is finding players of a minimum BMI, who have spent at least 2/3 of their time at first/designated hitter, and have a minimum number of home runs.
'Master' houses player names, height, weight (for BMIs).
'Batting' houses HR.
'Appearances' houses games played at first, games played at DH, and total games.
All three tables are connected by the same 'playerID' value.
Here is my current query:
SELECT concat(m.nameFirst, ' ', m.nameLast) as Name,
m.playerID as playerID,
sum(b.HR) as HR
FROM Master AS m
INNER JOIN Batting AS b
ON m.playerID=b.playerID
INNER JOIN Appearances AS a
ON m.playerID=a.playerID
GROUP BY Name, playerID
HAVING ((m.weight/(m.height*m.height))*703) >= 27.99
AND ((SUM(IFNULL(a.G_1b,0)+IFNULL(a.G_dh,0)))/SUM(IFNULL(a.G_All,0))) >= .667
AND HR >= 200
ORDER BY HR desc;
This appears correct to me, but when I run it, it never returns (it runs forever). I suspect it has something to do with the inner join on the Appearances table. I also feel there's a problem with putting m.weight/m.height in a HAVING clause, but with aggregates involved I can't use WHERE. What should I do?
Thanks for any help!
EDIT: After removing all conditionals, I'm still getting the same (endless) result. This is my simpler query:
SELECT concat(m.nameFirst, ' ', m.nameLast) as Name,
m.playerID as playerID,
sum(b.HR) as HR
FROM Master AS m
INNER JOIN Batting AS b
ON m.playerID=b.playerID
INNER JOIN Appearances AS a
ON m.playerID=a.playerID
GROUP BY playerID, Name
ORDER BY HR desc;
My guess is that the problem with your query is that each player has many rows in Appearances and many rows in Batting. Say a player has been at bat 1000 times in 100 games. Then the join, as you have written it, will produce 100,000 rows just for that player.
This is just a guess because you have provided no sample data to verify if this is the problem.
The solution is to pre-aggregate the Appearances and Batting tables in subqueries (at the playerID level) and then join the results back to Master.
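A rough sketch of that pre-aggregation, using Python's built-in sqlite3 module, might look like the following. The column names follow the question, but the `yearID` column, the player `demo01`, and all numbers are made up, and the BMI filter is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master (playerID TEXT PRIMARY KEY, nameFirst TEXT, nameLast TEXT);
    CREATE TABLE Batting (playerID TEXT, yearID INTEGER, HR INTEGER);
    CREATE TABLE Appearances (playerID TEXT, yearID INTEGER,
                              G_all INTEGER, G_1b INTEGER, G_dh INTEGER);
    -- Invented player with two seasons in each table
    INSERT INTO Master VALUES ('demo01', 'Demo', 'Player');
    INSERT INTO Batting VALUES ('demo01', 2000, 10), ('demo01', 2001, 20);
    INSERT INTO Appearances VALUES ('demo01', 2000, 100, 80, 0),
                                   ('demo01', 2001, 100, 70, 10);
""")

# Naive double join: 2 Batting rows x 2 Appearances rows = 4 rows,
# so every home run is counted twice.
naive_hr = conn.execute("""
    SELECT SUM(b.HR)
    FROM Master m
    JOIN Batting b ON m.playerID = b.playerID
    JOIN Appearances a ON m.playerID = a.playerID
    GROUP BY m.playerID
""").fetchone()[0]

# Pre-aggregated: each subquery yields one row per player before joining.
fixed = conn.execute("""
    SELECT m.playerID, b.HR, a.share_1b_dh
    FROM Master m
    JOIN (SELECT playerID, SUM(HR) AS HR
          FROM Batting GROUP BY playerID) b
      ON b.playerID = m.playerID
    JOIN (SELECT playerID,
                 1.0 * SUM(G_1b + G_dh) / SUM(G_all) AS share_1b_dh
          FROM Appearances GROUP BY playerID) a
      ON a.playerID = m.playerID
""").fetchone()

print(naive_hr, fixed)
```

Because each subquery returns at most one row per player, the outer join no longer multiplies rows, and the home run total is no longer inflated. The HR and first base/DH thresholds can then go in an ordinary WHERE clause on the outer query.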