I have a site like IMDb that provides movie information, and every user can rate any movie on the site.
I have two tables
1. imdb (stores movie details)
id, name, actors, vote
2. ratings (stores users' ratings)
id, rating_id (same as id in the first table), rating_num, IP
What I do now is this: whenever someone rates a movie, I compute that movie's average rating from the ratings table (sum of ratings / number of ratings) and store the value in the "vote" column of the first table. That is a requirement on my side, which is why it is done this way.
Now my problem: I want to fetch the top-rated movies, that is, the movies with the highest value in the vote column. One more condition: a movie should only be listed if it has been rated by at least 10 users (use the ratings table for that).
Thanks in advance.
I don't quite understand how your tables are organized. Is there A) a new row for each rating given by a customer in the ratings table or B) is there only 1 row per movie which is updated?
I am guessing it is A and rating_num is the rating given by the customer.
In this case, a simple MySQL solution could make use of aggregate functions such as COUNT and AVG. Untested example.
EDIT - To get the details from the imdb table you will just need to join them.
SELECT i.id AS `ID`, COUNT(1) AS `Number of ratings`, AVG(r.rating_num) AS `Average rating`, i.name, i.actors, i.vote
FROM ratings r
-- ratings.rating_id holds the movie id from the imdb table
INNER JOIN imdb i ON ( r.rating_id = i.id )
GROUP BY i.id
HAVING `Number of ratings` >= 10
ORDER BY `Average rating` DESC
LIMIT 10
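Since the query above is untested, here is a minimal sketch that tries the same idea against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, the sample rows are invented, and the threshold is lowered from 10 to 2 ratings so the tiny data set returns something. The schema follows the question (ratings.rating_id points at imdb.id).

```python
import sqlite3

# In-memory database; SQLite is used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE imdb (id INTEGER PRIMARY KEY, name TEXT, actors TEXT, vote REAL);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, rating_id INTEGER,
                          rating_num REAL, IP TEXT);
""")

# Hypothetical sample data (not from the original post).
conn.executemany("INSERT INTO imdb (id, name) VALUES (?, ?)",
                 [(1, "Movie A"), (2, "Movie B"), (3, "Movie C")])
conn.executemany(
    "INSERT INTO ratings (rating_id, rating_num, IP) VALUES (?, ?, ?)",
    [(1, 8, "10.0.0.1"), (1, 6, "10.0.0.2"),
     (2, 9, "10.0.0.1"), (2, 10, "10.0.0.2"), (2, 8, "10.0.0.3"),
     (3, 10, "10.0.0.1")])  # Movie C has a single rating only

MIN_RATINGS = 2  # the question asks for 10; lowered for this small sample

top_movies = conn.execute("""
    SELECT i.id, i.name, COUNT(*) AS num_ratings, AVG(r.rating_num) AS avg_rating
    FROM ratings r
    INNER JOIN imdb i ON r.rating_id = i.id
    GROUP BY i.id, i.name
    HAVING COUNT(*) >= ?
    ORDER BY avg_rating DESC
    LIMIT 10
""", (MIN_RATINGS,)).fetchall()

for row in top_movies:
    print(row)
```

Movie C has the highest single rating but is left out because it does not reach the minimum number of ratings, which is the behaviour the HAVING clause is there for.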
Related
I am working on some SQL homework. Could someone explain to me how to solve this question?
Display the average raw scores of team ‘Dolphins’ (1 point)
Here is an image of the data structure.
I need to make a query that returns the average raw score of 4 players.
However, when I try executing the code below it just returns one average.
/* Question 2 */
SELECT AVG(RawScore)
FROM Bowler_Scores
WHERE BowlerID IN (
    SELECT BowlerID
    FROM Bowlers
    WHERE TeamID = (
        SELECT TeamID
        FROM Teams
        WHERE TeamName = "Dolphins"
    )
);
In bowler scores each bowler id can have multiple scores.
For instance it may have the records - (43,101) (50,301) and (43,106).
I don't know how to write an SQL statement that gets the average raw score for each player on that team from all of their individual raw scores in the Bowler_Scores table.
If you need the average individual scores for each member of the Dolphins Team you can use this:
SELECT Teams.TeamName, Bowlers.BowlerID, AVG(Bowler_Scores.RawScore)
FROM Bowlers
INNER JOIN Teams
ON Bowlers.TeamID = Teams.TeamID
INNER JOIN Bowler_Scores
ON Bowlers.BowlerID = Bowler_Scores.BowlerID
WHERE Teams.TeamName = 'Dolphins'
-- BowlerID exists in both Bowlers and Bowler_Scores, so it is qualified here
GROUP BY Teams.TeamName, Bowlers.BowlerID
If you just need one average score for the team then just remove the BowlerID from the SELECT and GROUP BY lines.
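To see the difference between the per-bowler averages and the single team average, here is a small sketch using Python's sqlite3 module. The table and column names follow the question; the teams, bowlers and scores are made up for illustration, and SQLite is only a stand-in for whatever database the homework uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (TeamID INTEGER PRIMARY KEY, TeamName TEXT);
    CREATE TABLE Bowlers (BowlerID INTEGER PRIMARY KEY, TeamID INTEGER);
    CREATE TABLE Bowler_Scores (BowlerID INTEGER, RawScore INTEGER);
    -- Hypothetical sample rows
    INSERT INTO Teams VALUES (1, 'Dolphins'), (2, 'Sharks');
    INSERT INTO Bowlers VALUES (43, 1), (50, 1), (60, 2);
    INSERT INTO Bowler_Scores VALUES (43, 150), (43, 170), (50, 200), (60, 120);
""")

JOINS = """
    FROM Bowlers b
    INNER JOIN Teams t ON b.TeamID = t.TeamID
    INNER JOIN Bowler_Scores s ON s.BowlerID = b.BowlerID
    WHERE t.TeamName = 'Dolphins'
"""

# One row per bowler: GROUP BY splits the team's scores by BowlerID.
per_bowler = conn.execute(
    "SELECT b.BowlerID, AVG(s.RawScore)" + JOINS +
    "GROUP BY b.BowlerID ORDER BY b.BowlerID"
).fetchall()

# No GROUP BY: all of the team's scores collapse into a single average.
team_avg = conn.execute("SELECT AVG(s.RawScore)" + JOINS).fetchone()[0]

print(per_bowler)
print(team_avg)
```

Note that the team average is taken over all individual scores, so a bowler with more recorded games weighs more heavily than in an average of the per-bowler averages.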
I am making a movie rating application where a user can rate a movie as either 'like' or 'dislike'. I have made 3 tables: user, movie and rating. Example rows from the vote table:
userID  movieID  Vote
x       a        li
y       a        dli
y       b        li
w       a        li
The table's schema is :
userID - PrimaryKey
movieID - PrimaryKey
Vote - Enum('li','dli')
I have made userID and movieID a composite primary key, so that if a user changes his/her preference, the existing row for that user and movie gets updated instead of a new one being added.
Edit : Here's the movie table's schema.
mID - PrimaryKey
mName - Varchar
mGenre - Varchar
mDesc - Text
mDateOfRelease - Date
My question: is it possible to select all the columns from the movie table, plus the like and dislike counts of each movie, in one DB call? If yes, how can I do it?
Let's imagine that you have a movie table and it contains the field mID. Here is what you can do:
select m.*,
       sum(case when r.vote like 'li' then 1 else 0 end) likes,
       sum(case when r.vote like 'dli' then 1 else 0 end) dislikes
from movie m
left join rating r
on r.movieID = m.mID
group by m.mID;
The left join is there for movies with no votes: if the rating table has no rows for a movie, the movie is still shown (with 0 likes and 0 dislikes).
Edit:
To explain this I will break the query into pieces.
We need to understand what group by is doing. You can find docs here. In short, it creates groups of rows based on the distinct values of the column(s) listed in the group by clause.
select count(r.movieID) from rating r
group by r.movieID;
This gives us the number of votes each movie got (only movies that got at least one vote, i.e. have at least one row in the rating table, are listed). After this we can do a "smart" count using a conditional SUM. The argument of SUM can be an expression, so case when ... then ... end works as well. For likes we add 1 for every 'li' row and 0 for every 'dli' row, and the opposite for dislikes.
The one big drawback is that we don't get all the movies: a movie with no votes is simply skipped by this select. This is where the LEFT JOIN comes in. The idea is very simple: we include all the rows of the movie table no matter what is in the rating table, and the on clause connects the two tables. Then we do the same summing as before, but change the group by to mID (it always exists and is non-null, so we always have something to group on). This gives you exactly what you want. Also play around with the resulting query: try to understand what happens if you leave the rating table's movieID column in the group by clause. (It's easier to see than to read tons of text :) )
If something is not clear, please let me know and I will try to improve it.
Hope it helps
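To make the LEFT JOIN point concrete, here is a small sketch in Python with sqlite3 (SQLite standing in for MySQL). The four votes are the ones from the question; the movie names and the extra movie 'c' with no votes are assumptions added so that the difference between LEFT JOIN and INNER JOIN shows up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (mID TEXT PRIMARY KEY, mName TEXT);
    CREATE TABLE rating (
        userID TEXT, movieID TEXT,
        vote TEXT CHECK (vote IN ('li', 'dli')),   -- stands in for the MySQL ENUM
        PRIMARY KEY (userID, movieID)
    );
    -- Movie 'c' is hypothetical and has no votes at all.
    INSERT INTO movie VALUES ('a', 'Movie A'), ('b', 'Movie B'), ('c', 'Movie C');
    INSERT INTO rating VALUES ('x', 'a', 'li'), ('y', 'a', 'dli'),
                              ('y', 'b', 'li'), ('w', 'a', 'li');
""")

QUERY = """
    SELECT m.mID,
           SUM(CASE WHEN r.vote = 'li'  THEN 1 ELSE 0 END) AS likes,
           SUM(CASE WHEN r.vote = 'dli' THEN 1 ELSE 0 END) AS dislikes
    FROM movie m
    {join} rating r ON r.movieID = m.mID
    GROUP BY m.mID
    ORDER BY m.mID
"""

# LEFT JOIN keeps every movie; unmatched movies get 0 from the ELSE branch.
with_left_join = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
# INNER JOIN silently drops movies that have no row in rating.
with_inner_join = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()

print(with_left_join)
print(with_inner_join)
```

With the LEFT JOIN, movie 'c' appears with 0 likes and 0 dislikes; with the INNER JOIN it is missing from the result.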
Try this:
select m.*,
sum(case when v.vote = 'li' then 1 else 0 end) li_count,
sum(case when v.vote = 'dli' then 1 else 0 end) dli_count
from movie m left join vote v on m.mID = v.movieID
group by m.mID -- grouping by the primary key is enough in MySQL
Try this:
SELECT DISTINCT -- one row per movie instead of one row per vote
votes.movieID,
likes.like_count,
dislikes.dislike_count
FROM votes
LEFT join (
SELECT movieID, count(Vote) as like_count
FROM votes
WHERE Vote = 'li'
GROUP BY movieID
) likes ON likes.movieID = votes.movieID
LEFT join (
SELECT movieID, count(Vote) as dislike_count
FROM votes
WHERE Vote = 'dli'
GROUP BY movieID
) dislikes ON dislikes.movieID = votes.movieID
This is an oversimplified version of the problem. I have 5 tables: Project (~50K), Organization (~20K), Category, Bidder (~250K), Rating
Project (
id,
owner_id (Organization.id),
title
)
Organization (
id,
name
)
Category (
id,
name
)
Bidder (
id,
organization_id (Organization.id),
project_id (Project.id),
category_id (Category.id),
is_winner
)
Rating (
id,
bidder_a_id (Bidder.id),
bidder_b_id (Bidder.id),
bidder_a_is_winner,
bidder_b_is_winner
)
Organizations bid on a Project in a Category. Bidders can win or lose their bid on a Project, and a Rating is then calculated (number of wins / total). The Rating is calculated between one Organization and one or more others.
For example:
We would like to show the rating for all the bidders of a project, including only the bids on projects where the organization that owns the project was also involved.
We would like to show the rating for all the bidders in a category of a project, including only the bids on projects where other selected organizations were also involved.
I understand that the Rating table would not be necessary to get a result, but because of the amount of data, it would take too much time to execute the query. Therefore, I created the Rating table to hold the association of the bidders working on the same project. There might not be a Rating between two bidders if they never worked together before.
I will try to update with my own take, but I cannot seem to make it work yet... I am losing the bidders in the results when they do not have a Rating and I filter with an IN clause for the selected organizations.
Edit: I found a way around my problem. I added a column to my inner query that returns a boolean indicating whether the current row is IN the selected organizations. When I do the SUM, the ones that are NOT IN are not counted in the calculation of the Ratings.
In summary, when I tried with a HAVING clause, it eliminated the rows that had no Rating, but I still wanted them in the final result. I wanted to see them with a value of 0.
It looked something like this (notice the use of IN clause inside the SELECT):
select
org.id,
sum(sub.nb_wins),
sum(sub.nb_total),
sum(sub.nb_wins) / sum(sub.nb_total)
from
(select
bid.id as bidder_id,
bid.category_id as category_id,
sum(
case when rat.id is not null or rat.bidder_a_id in (...) then bid.is_winner else 0 end
) as nb_wins,
sum(
case when rat.id is not null or rat.bidder_a_id in (...) then 1 else 0 end
) as nb_total
from
Bidder bid
left outer join
Rating rat on rat.bidder_b_id = bid.id
group by
bid.id,
bid.category_id) as sub
inner join
Organization org on org.id = sub.bidder_id
group by
sub.bidder_id,
sub.category_id
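The difference between filtering rows out and moving the IN test inside the SUM can be seen in a heavily reduced sketch. It uses Python's sqlite3 module, keeps only the Bidder and Rating columns needed for the point, and treats bidders 1 and 2 as the "selected" ones. All rows and the selected ids are hypothetical, and this is not the full query from the post, only an illustration of the workaround described in the edit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bidder (id INTEGER PRIMARY KEY, is_winner INTEGER);
    CREATE TABLE Rating (id INTEGER PRIMARY KEY,
                         bidder_a_id INTEGER, bidder_b_id INTEGER);
    -- Hypothetical data: bidder 12 has no Rating row at all.
    INSERT INTO Bidder VALUES (1, 0), (2, 1), (3, 0), (10, 1), (11, 0), (12, 1);
    INSERT INTO Rating (bidder_a_id, bidder_b_id) VALUES (1, 10), (2, 10), (3, 11);
""")

# Variant 1: filter in WHERE. Bidders without a matching Rating disappear.
filtered = conn.execute("""
    SELECT bid.id, SUM(bid.is_winner) AS nb_wins, COUNT(*) AS nb_total
    FROM Bidder bid
    LEFT OUTER JOIN Rating rat ON rat.bidder_b_id = bid.id
    WHERE rat.bidder_a_id IN (1, 2)
    GROUP BY bid.id
    ORDER BY bid.id
""").fetchall()

# Variant 2: the IN test lives inside the SUM, so every bidder keeps a row
# and non-matching rows simply contribute 0.
kept = conn.execute("""
    SELECT bid.id,
           SUM(CASE WHEN rat.bidder_a_id IN (1, 2) THEN bid.is_winner ELSE 0 END) AS nb_wins,
           SUM(CASE WHEN rat.bidder_a_id IN (1, 2) THEN 1 ELSE 0 END) AS nb_total
    FROM Bidder bid
    LEFT OUTER JOIN Rating rat ON rat.bidder_b_id = bid.id
    GROUP BY bid.id
    ORDER BY bid.id
""").fetchall()

print(filtered)
print(kept)
```

In the second variant the bidders without a relevant Rating stay in the result with 0 wins out of 0, so the final division has to guard against a zero total.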
The scenario:
I have a website which lets users vote for the cars they like most. Cars are saved in the table cars, votes are saved in votes, and the column country_id in the table cars references countries (where the car brand comes from).
I want to show the users which country has the most votes. Simple version of the tables:
CARS
id
name
country_id
Countries
id
name
Votes
id
user_id
car_id
Ideally I would like to show the users the top x countries and how many votes each of them has.
Bonus: would it be possible to use this query for a certain user, so they see their own top x of the countries they voted for?
And which indexes would you suggest? The votes table can grow beyond 10 million votes, and the cars table can grow fast too.
I think you can achieve this with LEFT JOINs and an aggregate function with GROUP BY:
SELECT COUNT(a.id) as total_votes, c.name as country_name
FROM Votes a
LEFT JOIN CARS b
ON a.car_id = b.id
LEFT JOIN Countries c
ON b.country_id = c.id
GROUP BY c.id, c.name
ORDER BY total_votes DESC
Indexes on cars.country_id, votes.user_id and votes.car_id would seem reasonable. As mzedler suggested though, once you get up to tens of millions of rows, computing aggregates on the fly can be a bad idea.
There are a number of ways of addressing that: triggers, a cache, or adding a date-voted column to votes so you can break down the number of records you have to count in one go. For example, cache the vote counts daily, then query only the votes made since midnight and add them to the cached totals.
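One possible shape of the daily-cache idea, sketched in Python with sqlite3 (SQLite standing in for the real database). The voted_at column, the country_votes_daily table, the index name and all sample rows are assumptions made for this sketch; the date is fixed so the example is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
    -- voted_at is an assumed extra column (the "date voted" from the answer)
    CREATE TABLE votes (id INTEGER PRIMARY KEY, user_id INTEGER,
                        car_id INTEGER, voted_at TEXT);
    -- assumed cache table: one row per country per day
    CREATE TABLE country_votes_daily (day TEXT, country_id INTEGER, votes INTEGER,
                                      PRIMARY KEY (day, country_id));
    CREATE INDEX idx_votes_voted_at ON votes (voted_at);
    INSERT INTO countries VALUES (1, 'Germany'), (2, 'Italy');
    INSERT INTO cars VALUES (1, 'Car A', 1), (2, 'Car B', 2);
    INSERT INTO votes (user_id, car_id, voted_at) VALUES
        (1, 1, '2024-01-01 09:00:00'), (2, 1, '2024-01-01 10:00:00'),
        (3, 2, '2024-01-01 11:00:00'),
        (1, 2, '2024-01-02 08:00:00'), (2, 2, '2024-01-02 08:30:00');
""")

TODAY = "2024-01-02"  # fixed "midnight" boundary for the example

def refresh_cache(conn, today):
    """Rebuild the per-day counts for all votes cast before `today`.
    A real job would more likely append only the previous day."""
    conn.execute("DELETE FROM country_votes_daily")
    conn.execute("""
        INSERT INTO country_votes_daily (day, country_id, votes)
        SELECT date(v.voted_at), c.country_id, COUNT(*)
        FROM votes v JOIN cars c ON c.id = v.car_id
        WHERE v.voted_at < ?
        GROUP BY date(v.voted_at), c.country_id
    """, (today,))

def country_totals(conn, today):
    """Cached daily counts plus a live count of the votes since midnight."""
    return conn.execute("""
        SELECT co.name, SUM(t.votes) AS total_votes
        FROM (
            SELECT country_id, votes FROM country_votes_daily
            UNION ALL
            SELECT c.country_id, COUNT(*)
            FROM votes v JOIN cars c ON c.id = v.car_id
            WHERE v.voted_at >= ?
            GROUP BY c.country_id
        ) AS t
        JOIN countries co ON co.id = t.country_id
        GROUP BY co.id, co.name
        ORDER BY total_votes DESC
    """, (today,)).fetchall()

refresh_cache(conn, TODAY)
totals = country_totals(conn, TODAY)
print(totals)
```

Only today's votes are counted row by row; everything older comes from the small cache table, which is the point of breaking the count down by date.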
I have a table "Ratings" containing a USER_ID, an Artist_ID and a Rating (int, the score assigned by a user), and of course a table "Artists" with, among others, the fields "ID" and "Name".
I am trying to get a list of artist names sorted in descending order by the AVG score per artist. At first I thought this wouldn't be hard, but I keep on fiddling...
Here's where I'm stuck: I have to figure out how to calculate the AVG() for the IDs that occur multiple times. I've tried a variety of queries using the DISTINCT keyword, but without success.
For example:
SELECT Name, Rating
FROM Artists as a
INNER JOIN Ratings as r
ON a.ID = r.Artists_ID
ORDER BY r.Rating DESC;
Gives me the result:
Name Rating
"The Mars Volta" 9.5
"Simon Felice" 9.0
"Laura Gibson" 8.0
"Calexico" 7.0
"Mira" 7.0
"Guido Belcanto" 6.0
"Guido Belcanto" 1.0
As you can see, the artist "Guido Belcanto" has more than one rating. I have no clue at the moment how to calculate the AVG() for the IDs that occur more than once. It's probably basic MySQL, but maybe I'm searching in the wrong direction.
Your query is missing a GROUP BY, and it never actually calculates the average of the ratings you pull in (with the AVG function):
SELECT Name, AVG(Rating) AS AVGRating
FROM Artists as a
INNER JOIN Ratings as r
ON a.ID = r.Artists_ID
GROUP BY Name
ORDER BY AVGRating DESC;
Though it is probably better to group by the key of the Artists table (with enough data it is not unthinkable to have several artists with the same name).
Group the ratings by artist so you can compute the avg of the ratings for each group, and join that with Artists to get the names associated with each id.
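A minimal sketch of that last suggestion, run through Python's sqlite3 module with SQLite standing in for MySQL. The ratings are the ones shown in the question's output; the ids and USER_ID values are invented, and the column is called Artists_ID as in the posted queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Ratings (USER_ID INTEGER, Artists_ID INTEGER, Rating REAL);
    INSERT INTO Artists VALUES
        (1, 'The Mars Volta'), (2, 'Simon Felice'), (3, 'Laura Gibson'),
        (4, 'Calexico'), (5, 'Mira'), (6, 'Guido Belcanto');
    -- USER_ID values are hypothetical
    INSERT INTO Ratings VALUES
        (1, 1, 9.5), (1, 2, 9.0), (1, 3, 8.0), (1, 4, 7.0),
        (1, 5, 7.0), (1, 6, 6.0), (2, 6, 1.0);
""")

# Aggregate per artist id first, then join to Artists to pick up the names.
# Grouping on the id keeps two artists with the same name apart.
averages = conn.execute("""
    SELECT a.Name, r.avg_rating
    FROM Artists a
    INNER JOIN (
        SELECT Artists_ID, AVG(Rating) AS avg_rating
        FROM Ratings
        GROUP BY Artists_ID
    ) AS r ON r.Artists_ID = a.ID
    ORDER BY r.avg_rating DESC, a.Name
""").fetchall()

for name, avg_rating in averages:
    print(name, avg_rating)
```

"Guido Belcanto" now appears once, with the average of his two ratings, instead of once per rating.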