SQL: Group by in subquery - mysql

I'm trying to list all books that have more than one genre using a GROUP BY statement and a subquery. However, it keeps returning the error "Subquery returns more than 1 row". This is what I have so far:
SELECT title
FROM book
WHERE 1 < (SELECT COUNT(genre) FROM genres GROUP BY book_id);

Here's an example:
SELECT b.title
  FROM ( SELECT g.book_id
           FROM genres g
          GROUP BY g.book_id
         HAVING COUNT(1) > 1
       ) m
  JOIN book b
    ON b.id = m.book_id
The inline view m returns the values of book_id that appear more than once in the genres table. Depending on uniqueness constraints, we might want to count distinct values of genre instead:
HAVING COUNT(DISTINCT g.genre) > 1
or, if we want to find books with exactly three related genres:
HAVING COUNT(DISTINCT g.genre) = 3
Once we have a list of book_id values, we can join to the book table. (The query assumes that book_id in genres is a foreign key reference to the id column in book table.)
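
To see the difference between counting rows and counting distinct genres, here is a minimal sketch. It runs the inline-view query through Python's sqlite3 module, with SQLite standing in for MySQL. The sample rows and the helper name titles_with_many_genres are made up for illustration, and book.id is assumed to be the key that genres.book_id references.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genres (book_id INTEGER REFERENCES book(id), genre TEXT);
    INSERT INTO book VALUES (1, 'Dune'), (2, 'Emma'), (3, 'Ulysses');
    INSERT INTO genres VALUES
        (1, 'sci-fi'), (1, 'adventure'),
        (2, 'romance'), (2, 'romance'),   -- same genre stored twice
        (3, 'modernist');
""")

def titles_with_many_genres(having_clause):
    """Hypothetical helper: run the inline-view query with a given HAVING test."""
    sql = f"""
        SELECT b.title
          FROM (SELECT g.book_id
                  FROM genres g
                 GROUP BY g.book_id
                HAVING {having_clause}) m
          JOIN book b ON b.id = m.book_id
         ORDER BY b.title
    """
    return [row[0] for row in conn.execute(sql)]

by_rows = titles_with_many_genres("COUNT(1) > 1")
by_distinct = titles_with_many_genres("COUNT(DISTINCT g.genre) > 1")
print(by_rows)      # ['Dune', 'Emma']  (Emma has two rows for one genre)
print(by_distinct)  # ['Dune']
```

If (book_id, genre) is guaranteed unique, both HAVING tests return the same books.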

You seem to want a correlated subquery:
SELECT b.title
FROM book b
WHERE 1 < (SELECT COUNT(*) FROM genres g WHERE g.book_id = b.book_id);

SELECT distinct a.title
FROM book a, (select bookid,count(distinct genre)genres from genres group by bookid)b
WHERE a.book_id=b.bookid and b.genres>1
hope it helps!

Related

How to do multiple tasks in a single SQL query

I was given the database below:
movie(movie_id, movie_name, production_year, votes, ranking, rating)
movie_info(movie_id, movie_genre_id, note)
movie_genre(movie_genre_id, genre_name)
person(person_id, person_name, gender)
role(person_id, movie_id, role_name, role_type_id)
role_type(role_type_id, type_name)
I was asked to display the name of the top 7 directors with at least 3 movies in the list, the number of movies they are in and the average rating of their movies, sorted by the average rating. With the query below I managed to get the name of the directors, the number of movies they are in and the average rating, but I'm having issues limiting it to the top 7 and sorting them by the average rating. I tried using LIMIT and ORDER BY, but I'm getting syntax errors.
SELECT
person_name, COUNT(role.movie_id), AVG(rating)
FROM
movie
INNER JOIN
role
ON role.movie_id = movie.movie_id
INNER JOIN
person
ON role.person_id = person.person_id
INNER JOIN
role_type
ON role.role_type_id = role_type.role_type_id
WHERE
type_name = 'director'
GROUP BY
person_name
HAVING
COUNT(role.movie_id) > 2;
I can even order by the number of movies they did and limit it to the top 7, but for the life of me I cannot order it by AVG(rating).
person_name          COUNT(role.movie_id)   AVG(rating)
Hitchcock, Alfred    9                      8.2888890372382
Kubrick, Stanley     8                      8.2999999523163
Wilder, Billy        6                      8.3000000317891
Spielberg, Steven    6                      8.4000000953674
Scorsese, Martin     6                      8.3166666030884
Nolan, Christopher   6                      8.5333331425985
Tarantino, Quentin   6                      8.3666666348775
In MySQL, aliases defined in the SELECT clause can be used in the GROUP BY, ORDER BY and HAVING clauses.
Use ORDER BY ... DESC to sort the result set in descending order and LIMIT 7 to get only 7 rows.
You should use proper aliasing in multi-table queries, to avoid ambiguous and unintended behavior.
You also need to GROUP BY person_id, as two or more directors may share the same name.
If you have duplicate entries in the role table, you will have to use COUNT(DISTINCT ...) to avoid counting duplicate rows.
Try the following query:
SELECT
p.person_id,
p.person_name,
COUNT(r.movie_id) AS movies_count,
AVG(m.rating) AS average_rating
FROM
movie AS m
INNER JOIN
role AS r
ON r.movie_id = m.movie_id
INNER JOIN
person AS p
ON r.person_id = p.person_id
INNER JOIN
role_type AS rt
ON r.role_type_id = rt.role_type_id
WHERE
rt.type_name = 'director'
GROUP BY
p.person_id,
p.person_name
HAVING
movies_count > 2
ORDER BY
movies_count DESC,
average_rating DESC
LIMIT 7
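
As a rough check of the ORDER BY and LIMIT points above, the sketch below runs a cut-down version of the query through Python's sqlite3 module, with SQLite standing in for MySQL. The role_type join is left out for brevity, the rows and director names are invented, and the result is ordered by the average rating alone, which is what the question asks for. The HAVING clause repeats the aggregate rather than the alias, which is the more portable spelling.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, rating REAL);
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT);
    CREATE TABLE role (person_id INTEGER, movie_id INTEGER);
    INSERT INTO person VALUES (1, 'Director A'), (2, 'Director B'), (3, 'Director C');
    INSERT INTO movie VALUES
        (1, 8.0), (2, 9.0), (3, 7.0),    -- Director A: 3 movies, average 8.0
        (4, 9.0), (5, 9.0), (6, 9.0),    -- Director B: 3 movies, average 9.0
        (7, 10.0), (8, 10.0);            -- Director C: only 2 movies
    INSERT INTO role VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 4), (2, 5), (2, 6),
        (3, 7), (3, 8);
""")

sql = """
    SELECT p.person_name,
           COUNT(r.movie_id) AS movies_count,
           AVG(m.rating)     AS average_rating
      FROM movie AS m
      JOIN role AS r   ON r.movie_id = m.movie_id
      JOIN person AS p ON r.person_id = p.person_id
     GROUP BY p.person_id, p.person_name
    HAVING COUNT(r.movie_id) > 2       -- at least 3 movies
     ORDER BY average_rating DESC      -- alias from the SELECT list
     LIMIT ?
"""
top = conn.execute(sql, (7,)).fetchall()
print(top)  # [('Director B', 3, 9.0), ('Director A', 3, 8.0)]
```

Director C has the highest average but is filtered out by HAVING before the sort and limit are applied.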

How to get count of distinct values from two tables in MySQL?

There are two tables: blogCategories and blog.
Table blogCategories has just 2 fields, id and categoryName. Many category names have already been inserted. Table blog has the fields id, blogCatID, header, blog and date.
There are many records in table blog, but not every categoryName has been used.
I am trying to get a list of categoryNames with the number of blogs that use each one. I need 0 (zero) if a categoryName is not used in any blog.
I used the query below, but a categoryName gets a count of 1 even if it has not been used.
SELECT DISTINCT categoryName, COUNT(*) AS totalBlogCount
FROM
(SELECT bc.categoryName
FROM
blogCategories bc
LEFT JOIN blog b ON bc.id=b.blogCatID) AS tot
GROUP BY categoryName
The query below gives you each categoryName and its usage count. If a category is unused, it returns 0, because COUNT ignores NULL values and b.blogCatID is NULL for categories without a matching blog row.
SELECT bc.categoryName, COUNT(b.blogCatID) AS totalBlogCount
FROM blogCategories bc Left JOIN blog b ON bc.id=b.blogCatID
GROUP BY categoryName
The output will look like:
inspirational 5
technical 2
political 0
random 3
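
The difference between COUNT(*) and COUNT(b.blogCatID) after a LEFT JOIN can be reproduced with a small sketch. It uses Python's sqlite3 module with SQLite standing in for MySQL, and the two categories and their rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogCategories (id INTEGER PRIMARY KEY, categoryName TEXT);
    CREATE TABLE blog (id INTEGER PRIMARY KEY, blogCatID INTEGER);
    INSERT INTO blogCategories VALUES (1, 'inspirational'), (2, 'political');
    INSERT INTO blog VALUES (10, 1), (11, 1);   -- 'political' is never used
""")

template = """
    SELECT bc.categoryName, {count_expr} AS totalBlogCount
      FROM blogCategories bc
      LEFT JOIN blog b ON bc.id = b.blogCatID
     GROUP BY bc.categoryName
     ORDER BY bc.categoryName
"""

# COUNT(*) counts the NULL-extended row that the LEFT JOIN produces for an unused category.
with_star = conn.execute(template.format(count_expr="COUNT(*)")).fetchall()
# COUNT(column) skips NULLs, so an unused category gets 0.
with_column = conn.execute(template.format(count_expr="COUNT(b.blogCatID)")).fetchall()

print(with_star)    # [('inspirational', 2), ('political', 1)]
print(with_column)  # [('inspirational', 2), ('political', 0)]
```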
Select bc.CategoryName, Coalesce(bCounts.NumberOfTimesUsed, 0) As NumberOfTimesUsedInBlog
From BlogCategories bc
Left Join
(Select blogCatID, Count(*) as NumberOfTimesUsed
From Blog Group By BlogCatID) bCounts
On bCounts.BlogCatID = bc.ID
SELECT bc.categoryName, COUNT(*) AS totalBlogCount
FROM blogCategories bc INNER JOIN blog b ON bc.id=b.blogCatID
GROUP BY categoryName
First you count the used blogCatID:
select blogCatID, count(*) as Number
from blog
group by blogCatID
Later on you can find out which blogCategories are not used:
select *
from blogCategories as bc
where bc.id not in (select blogCatID
from blog
group by blogCatID)
Next, try a list of blogCategories together with the count of usage in blogs (and 0 if not used):
select ifnull(b.numb, 0) as num, bc.*
from blogCategories as bc left join (select blogCatID, count(*) as numb
from blog
group by blogCatID) as b ON bc.id = b.blogCatID

group by and having clause issue

I have 3 tables: obl_books, obl_authors and the link table books_authors.
The question is:
Write a query to select only those books whose all authors belong to Indian Nationality.
And the query I wrote for this is
SELECT obl_books.*, books_authors.author_id
FROM books_authors,obl_authors,obl_books
WHERE books_authors.author_id = obl_authors.author_id
AND books_authors.book_id=obl_books.book_id
GROUP BY books_authors.book_id
HAVING books_authors.author_id IN (SELECT obl_authors.author_id FROM obl_authors WHERE nationality='Indian')
Nationality is a column of the obl_authors table, and a book can have many authors.
So if book_id (2) has author_id (1) and author_id (2), and both authors are Indian, the query should return that book; if even one of the authors is not Indian, it should not.
But my query returns such a book as well.
I even changed my HAVING clause to use the ALL keyword in place of IN, but then it does not return any rows.
Try:
SELECT obl_books.*, GROUP_CONCAT(books_authors.author_id)
FROM books_authors
JOIN obl_authors ON books_authors.author_id = obl_authors.author_id
JOIN obl_books ON books_authors.book_id=obl_books.book_id
GROUP BY books_authors.book_id
HAVING MIN(obl_authors.nationality)='Indian' AND
MAX(obl_authors.nationality)='Indian'
As I understand your problem, this is what you want; it checks whether the number of Indian authors equals the total number of authors per book, and shows the books where the two are equal.
SELECT b.*, GROUP_CONCAT(a.author_id) authors
FROM obl_books b
JOIN books_authors ba ON b.book_id=ba.book_id
LEFT JOIN obl_authors a ON ba.author_id=a.author_id AND a.nationality = 'Indian'
GROUP BY b.book_id
HAVING COUNT(a.author_id)=COUNT(ba.author_id)
An SQLfiddle to test with.
Note that the GROUP BY on book_id only is a MySQL'ism; you'd normally need to group by all selected fields in obl_books.
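
The count-comparison idea can be tried out with a minimal sketch. It uses Python's sqlite3 module with SQLite standing in for MySQL. The title column and the sample rows are assumptions made for illustration, and the query groups by every selected column so it does not rely on the MySQL-specific GROUP BY behaviour.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE obl_books (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE obl_authors (author_id INTEGER PRIMARY KEY, nationality TEXT);
    CREATE TABLE books_authors (book_id INTEGER, author_id INTEGER);
    INSERT INTO obl_books VALUES (1, 'Book One'), (2, 'Book Two');
    INSERT INTO obl_authors VALUES (1, 'Indian'), (2, 'Indian'), (3, 'British');
    INSERT INTO books_authors VALUES
        (1, 1), (1, 2),    -- Book One: both authors are Indian
        (2, 1), (2, 3);    -- Book Two: one Indian, one British author
""")

sql = """
    SELECT b.title
      FROM obl_books b
      JOIN books_authors ba ON b.book_id = ba.book_id
      LEFT JOIN obl_authors a
             ON ba.author_id = a.author_id AND a.nationality = 'Indian'
     GROUP BY b.book_id, b.title
    -- a.author_id is NULL for non-Indian authors, so it is not counted
    HAVING COUNT(a.author_id) = COUNT(ba.author_id)
     ORDER BY b.title
"""
all_indian = [row[0] for row in conn.execute(sql)]
print(all_indian)  # ['Book One']
```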
Try this one query:
SELECT obl_books.*, books_authors.author_id
FROM books_authors,obl_authors,obl_books
WHERE books_authors.author_id = obl_authors.author_id
AND books_authors.book_id=obl_books.book_id and obl_authors.nationality='Indian'
GROUP BY books_authors.book_id
HAVING count(books_authors.book_id) = (SELECT count(*) FROM books_authors ba2 WHERE ba2.book_id = books_authors.book_id)
Try this
select * from obl_books where id in (select book_id from books_authors where author_id in (select id from obl_authors where nationality='Indian'))
I think we can also do this the other way round: first get the ids of all authors who are not Indian, then select the books of the remaining authors (the Indian authors' books).
SELECT obl_books.*, books_authors.author_id
FROM books_authors,obl_authors,obl_books
WHERE books_authors.author_id = obl_authors.author_id
AND books_authors.book_id=obl_books.book_id AND books_authors.author_id NOT IN (SELECT obl_authors.author_id FROM obl_authors WHERE nationality<>'Indian')

MYSQL, Max,Group by and Max

I have the following two tables.
1.Movie Detail (Movie-ID,Movie_Name,Rating,Votes,Year)
2.Movie Genre (Movie-ID,Genre)
I am using the following query to perform a join and get the movie with the highest rating in each genre.
select Movie_Name,
max(Rating) as Rating,
Genre from movie_test
inner join movie_genre
where movie_test.Movie_ID = movie_genre.Movie_ID
group by Genre
In the output, Rating and Genre are correct but Movie_Name is incorrect.
Can anyone suggest what changes I should make to get the correct movie name along with the rating and genre?
SELECT g.*, d.*
FROM MovieGenre g
INNER JOIN MovieDetail d
ON g.MovieID = d.MovieID
INNER JOIN
(
SELECT a.Genre, MAX(b.Rating) maxRating
FROM MovieGenre a
INNER JOIN MovieDetail b
ON a.MovieID = b.MovieID
GROUP BY a.Genre
) sub ON g.Genre = sub.Genre AND
d.rating = sub.maxRating
There is something wrong with your schema design. If a movie can have many genres and a genre can contain many movies, it should be a three-table design.
MovieDetails Table
MovieID (PK)
MovieName
MovieRating
Genre Table
GenreID (PK)
GenreName
Movie_Genre Table
MovieID (FK) -- compound primary key with GenreID
GenreID (FK)
This is a common MySQL problem: selecting columns that are neither aggregated nor listed in GROUP BY in an aggregate query. Other flavours of SQL do not let you do this and will raise an error.
When you run a query like yours, you are selecting non-aggregate columns in an aggregated group. Since many rows share the same genre, when you select Movie_Name MySQL picks an arbitrary row from each group and displays its value, because there is no general algorithm to guess which row you want.
You might ask 'why does it pick arbitrarily? Couldn't it pick the row that max(Rating) belongs to?' But what about other aggregate columns, like avg(Rating)? Which row would it pick there? And what if two rows have the same max? Therefore it cannot have an algorithm to pick a row.
To solve a problem like this, you have to restructure your query, something like:
select mt.Movie_Name,
mt.Rating,
mg.Genre from movie_test mt
inner join movie_genre mg
on mt.Movie_ID = mg.Movie_ID
where mt.Rating = (select max(mt2.Rating) from movie_test mt2
inner join movie_genre mg2 on mt2.Movie_ID = mg2.Movie_ID
where mg2.Genre = mg.Genre)
This will select the row with the rating being the same as the maximum rating for that genre, using a subquery.
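
A minimal sketch of the "rating equals the maximum for that genre" approach follows. It uses Python's sqlite3 module with SQLite standing in for MySQL. The movie names and ratings are invented, and the table names follow the question's query.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie_test (Movie_ID INTEGER PRIMARY KEY, Movie_Name TEXT, Rating REAL);
    CREATE TABLE movie_genre (Movie_ID INTEGER, Genre TEXT);
    INSERT INTO movie_test VALUES (1, 'Alpha', 9.0), (2, 'Beta', 7.0), (3, 'Gamma', 8.0);
    INSERT INTO movie_genre VALUES
        (1, 'drama'), (2, 'drama'),
        (2, 'comedy'), (3, 'comedy');   -- Beta belongs to two genres
""")

sql = """
    SELECT mt.Movie_Name, mt.Rating, mg.Genre
      FROM movie_test mt
      JOIN movie_genre mg ON mt.Movie_ID = mg.Movie_ID
     WHERE mt.Rating = (SELECT MAX(mt2.Rating)        -- best rating within the outer row's genre
                          FROM movie_test mt2
                          JOIN movie_genre mg2 ON mt2.Movie_ID = mg2.Movie_ID
                         WHERE mg2.Genre = mg.Genre)
     ORDER BY mg.Genre
"""
best = conn.execute(sql).fetchall()
print(best)  # [('Gamma', 8.0, 'comedy'), ('Alpha', 9.0, 'drama')]
```

If two movies tie for the top rating in a genre, this form returns both of them.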
Query:
SELECT t.Movie_Name,
t.Rating,
g.Genre
FROM movie_test t
INNER JOIN movie_genre g ON t.Movie_ID = g.Movie_ID
WHERE t.Movie_ID = (SELECT t1.Movie_ID
FROM movie_test t1
INNER JOIN movie_genre g1 ON t1.Movie_ID = g1.Movie_ID
WHERE g1.Genre = g.Genre
ORDER BY t1.Rating DESC
LIMIT 1)

select all books from one category except books which belong to other category

I have 3 tables: books, book_categories, categories.
book_categories table "joins" books and categories. It contains columns: id,book_id,category_id.
So one book may belong to many categories and one category may have many books.
I need a query which retrieves all books from given_category except books which belong to given_set_of_categories. So for example I want all books from category A but only if they don't also belong to category B or C. I also need to sort (order) the result by the Book.inserted column.
I know how to get all books from given_category with 2 joins, but I can't figure out how to exclude books from other categories in the result. I can't filter books in PHP because I am paginating the search result.
where
category_id = <given category>
and books.book_id not in
(
select book_id from book_categories
where category_id in (<given set of cat>)
)
order by books.inserted
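
The fragment above can be put into a complete query roughly as follows. This is a sketch using Python's sqlite3 module with SQLite standing in for MySQL. The books.id and title columns, the numeric category ids and the sample rows are assumptions for illustration, so adjust the names to the real schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; column names and rows are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, inserted TEXT);
    CREATE TABLE book_categories (id INTEGER PRIMARY KEY, book_id INTEGER, category_id INTEGER);
    INSERT INTO books VALUES
        (1, 'Only A',  '2021-05-01'),
        (2, 'A and B', '2021-01-01'),
        (3, 'Only C',  '2020-01-01'),
        (4, 'Older A', '2019-03-01');
    INSERT INTO book_categories (book_id, category_id) VALUES
        (1, 1), (2, 1), (2, 2), (3, 3), (4, 1);
""")

given_category = 1       # category A
excluded = (2, 3)        # categories B and C
marks = ", ".join("?" for _ in excluded)   # one placeholder per excluded category

sql = f"""
    SELECT b.title
      FROM books b
      JOIN book_categories bc ON bc.book_id = b.id
     WHERE bc.category_id = ?
       AND b.id NOT IN (SELECT book_id
                          FROM book_categories
                         WHERE category_id IN ({marks}))
     ORDER BY b.inserted
"""
titles = [row[0] for row in conn.execute(sql, (given_category, *excluded))]
print(titles)  # ['Older A', 'Only A']
```

Because the filtering happens in SQL, a LIMIT/OFFSET clause for pagination can be appended after ORDER BY.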
So, if you mean it is in one category but not in any other:
AND EXISTS(SELECT * FROM books b JOIN book_categories bc ON b.id = bc.book_id JOIN categories c ON bc.category_id = c.id AND c.id = 'A')
AND NOT EXISTS(SELECT * FROM books b JOIN book_categories bc ON b.id = bc.book_id JOIN categories c ON bc.category_id = c.id AND c.id != 'A')
I think that this can be achieved through counting, provided that book_categories entries are unique, so that no combination of book_id and category_id is repeated. Instead of trying to exclude records directly, we select from the combined set of categories and then count, per book_id, the entries that belong to the given category:
COUNT(IF(category_id = <given_category>, 1, NULL)) as cnt_exists
and after ensuring that it contains the required category, we count the total to see if it belongs to any other category as well:
COUNT(*) AS cnt_total
SELECT * FROM books b JOIN (
SELECT book_id,
COUNT(IF(category_id = <given_category>, 1, NULL)) as cnt_exists,
COUNT(*) AS cnt_total FROM book_categories WHERE
category_id IN(<given_category>, <given_set_of_categories>)
GROUP BY book_id
) bc ON b.id = bc.book_id AND
cnt_exists = 1 AND cnt_total = 1 ORDER BY b.inserted
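
For comparison, the counting approach can be sketched the same way, again using Python's sqlite3 module with SQLite standing in for MySQL. CASE WHEN replaces MySQL's IF(), the inner query groups by book_id, and the column names, category ids and rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL; column names and rows are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, inserted TEXT);
    CREATE TABLE book_categories (id INTEGER PRIMARY KEY, book_id INTEGER, category_id INTEGER);
    INSERT INTO books VALUES
        (1, 'Only A',  '2021-05-01'),
        (2, 'A and B', '2021-01-01'),
        (3, 'Only C',  '2020-01-01'),
        (4, 'Older A', '2019-03-01');
    INSERT INTO book_categories (book_id, category_id) VALUES
        (1, 1), (2, 1), (2, 2), (3, 3), (4, 1);
""")

sql = """
    SELECT b.title
      FROM books b
      JOIN (SELECT book_id,
                   COUNT(CASE WHEN category_id = ? THEN 1 END) AS cnt_exists,
                   COUNT(*) AS cnt_total
              FROM book_categories
             WHERE category_id IN (?, ?, ?)
             GROUP BY book_id) bc ON b.id = bc.book_id
     WHERE bc.cnt_exists = 1   -- the book is in the given category
       AND bc.cnt_total = 1    -- and in none of the excluded ones
     ORDER BY b.inserted
"""
# Parameters: the given category (1), then the full set to scan (1 plus excluded 2 and 3).
counted = [row[0] for row in conn.execute(sql, (1, 1, 2, 3))]
print(counted)  # ['Older A', 'Only A']
```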