I've got 3 tables: book, publisher, book_category
For a particular book category (fantasy), I have to display a list of the publishers supplying that genre.
publisher_name and category_name are linked through book table, so my query is:
SELECT publisher.publisher_name
FROM publisher, book, book_category
WHERE publisher.publisher_id = book.publisher_id
AND book.category_id = book_category.category_id
AND category_name = 'fantasy';
But the result repeats the publisher's name if that publisher supplies more than one fantasy book.
Let's say I've got The Hobbit and The Lord of the Rings; both are fantasy and both are supplied by the same PublisherA.
In that case the result of my query is:
PublisherA
PublisherA
Is it possible to get that result just once, even if there are many more than 2 fantasy books
published by the same publisher?
Just use distinct if you only need publisher_name
SELECT distinct publisher.publisher_name
By the way, try to use explicit JOIN syntax to join tables:
SELECT distinct p.publisher_name
FROM publisher p
join book b on b.publisher_id = p.publisher_id
join book_Category bc on bc.category_id = b.category_id
where bc.category_name = 'fantasy'
Use DISTINCT
SELECT DISTINCT publisher.publisher_name
FROM publisher, book, book_category
WHERE publisher.publisher_id = book.publisher_id
AND book.category_id = book_category.category_id
AND category_name = 'fantasy';
Try adding this to the end of the query: GROUP BY publisher.publisher_name
Everyone is mentioning DISTINCT, which is correct (better than GROUP BY in MySQL, because of the way the optimizer is set up), but I figured I would also add a modification for performance enhancements.
Currently you have implicit cross joins to get to the other tables, and making these explicit INNER JOINs will increase efficiency because of the order of filtering. Example:
SELECT DISTINCT Publisher.publisher_name
FROM publisher Publisher
INNER JOIN book Book ON Publisher.publisher_id = Book.publisher_id
INNER JOIN book_category Book_Category ON Book.category_id = Book_Category.category_id
WHERE Book_Category.category_name = 'fantasy';
In the original query, you bring in the complete record set of all three tables (publisher, book, book_category), and then from that set you join on the respective keys, and then return the result set. In this new query, your join to Book_Category happens based only upon the record set returned from the join between Publisher and Book. If there is filtering that happens based on this join, you will see a performance increase.
You also have the added benefit of being ANSI-compliant, as well as explicit coding to improve ease of maintenance.
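To see the duplicate rows and the effect of DISTINCT and GROUP BY side by side, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows (PublisherB, the history book, the title column) are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (publisher_id INTEGER PRIMARY KEY, publisher_name TEXT);
CREATE TABLE book_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title TEXT,
    publisher_id INTEGER,
    category_id INTEGER
);
INSERT INTO publisher VALUES (1, 'PublisherA'), (2, 'PublisherB');
INSERT INTO book_category VALUES (1, 'fantasy'), (2, 'history');
INSERT INTO book VALUES
    (1, 'The Hobbit', 1, 1),
    (2, 'The Lord of the Rings', 1, 1),
    (3, 'A History Book', 2, 2);
""")

joins = """
FROM publisher p
JOIN book b ON b.publisher_id = p.publisher_id
JOIN book_category bc ON bc.category_id = b.category_id
WHERE bc.category_name = 'fantasy'
"""

# One row per matching book, so PublisherA shows up twice.
without_distinct = [r[0] for r in conn.execute("SELECT p.publisher_name" + joins)]

# DISTINCT collapses identical rows in the result.
with_distinct = [r[0] for r in conn.execute("SELECT DISTINCT p.publisher_name" + joins)]

# GROUP BY on the same column gives the same rows here.
with_group_by = [
    r[0]
    for r in conn.execute("SELECT p.publisher_name" + joins + "GROUP BY p.publisher_name")
]

print(without_distinct, with_distinct, with_group_by)
```

With this sample data both DISTINCT and GROUP BY return PublisherA once; which one performs better on a real MySQL server is something to check with EXPLAIN rather than assume.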
In this database
https://www.databasestar.com/sample-database-movies/
I would like to write the following query: "List the name and gender of all the actors in the film Brazil".
I make this query:
USE movies;
SELECT DISTINCT p.person_name AS 'Nombre', g.gender AS 'Sexo' FROM movie m
JOIN movie_crew mc ON m.movie_id = mc.movie_id
JOIN department d ON mc.department_id = d.department_id
JOIN movie_cast mc2 ON m.movie_id = mc2.movie_id
JOIN person p ON mc2.person_id = p.person_id
JOIN gender g ON mc2.gender_id = g.gender_id
WHERE m.title = 'Brazil' AND d.department_name = 'Actors';
But no results appear, and I don't understand where my mistake is.
Thanks.
I recommend you simplify the schema somewhat.
Just use simple strings for genre, language_role, keyword, gender, person_name
Use iso_code in place of country_id
Perhaps simple abbreviations for department and company
These do need normalizing (as you have done): person, movie, company, but mostly because there is other stuff in those entities.
That is, get rid of most of the tables in the leftmost and rightmost columns.
Once you have made that change, the error may mysteriously go away. (And, when you get more data, the queries will run faster. This does assume you have suitable indexes.)
I have a slightly complex table structure that I'm trying to query for a search function, but my queries keep timing out. Basically, it's a book search, and I'm focusing on the subject portion of that search.
The subjects table is simple (id and title), but there's a link table that refers it back to itself called subjects_subjects, which complicates things.
**subjects_subjects**
id (key)
subject_id (reference to subjects table)
see_subject_id (another reference to subjects table)
The reason for the looping reference is to catch subjects that don't contain any books, but point to subjects that do. For example, there's no books under the 'Travel' subject, so that subject has a link to 'Explorers' and 'Voyages' that do contain books. The point is to make searching easier.
So what I'm trying to do is allow the user to search for 'Travel', but return results from 'Explorers' and 'Voyages'. Here's my query that times out:
SELECT
BK.id,
BK.title
FROM
books BK
LEFT OUTER JOIN
books_subjects BS
ON BS.book_id = BK.id
WHERE
BS.subject_id IN (1639,3173)
OR BS.subject_id IN
(
SELECT
SS.see_subject_id
FROM
subjects_subjects SS
WHERE
SS.subject_id IN (1639,3173)
)
GROUP BY
BK.id
Extra info: There are 17000 books and over 3000 subjects in the database, with roughly 84000 book/subject references.
Can anyone help me figure out where I am going wrong here?
You're doing two things that MySQL optimizes poorly:
OR in the WHERE clause.
IN (SELECT ...)
Instead of OR, use two queries that you combine with UNION. And instead of IN (SELECT ...), use a JOIN.
Also, you shouldn't use LEFT JOIN if you don't need to return rows from the first table that have no matches in the second table; use INNER JOIN instead.
SELECT b.id, b.title
FROM books AS b
JOIN books_subjects AS bs ON bs.book_id = b.id
WHERE bs.subject_id IN (1639, 3173)
UNION
SELECT b.id, b.title
FROM books AS b
JOIN books_subjects AS bs ON bs.book_id = b.id
JOIN subjects_subjects AS ss ON bs.subject_id = ss.see_subject_id
WHERE ss.subject_id IN (1639, 3173)
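A minimal sketch of the UNION approach, run against SQLite through Python's sqlite3 module rather than MySQL. The subject IDs other than 1639, the book titles, and the link rows are invented sample data; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database; all rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subjects (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE books_subjects (book_id INTEGER, subject_id INTEGER);
CREATE TABLE subjects_subjects (
    id INTEGER PRIMARY KEY,
    subject_id INTEGER,
    see_subject_id INTEGER
);
INSERT INTO subjects VALUES
    (1639, 'Travel'), (10, 'Explorers'), (11, 'Voyages'), (12, 'Cooking');
INSERT INTO books VALUES
    (1, 'Polar Journeys'), (2, 'Around the World'), (3, 'Soups');
-- 'Travel' has no books of its own; it only points at other subjects.
INSERT INTO books_subjects VALUES (1, 10), (2, 10), (2, 11), (3, 12);
INSERT INTO subjects_subjects VALUES (1, 1639, 10), (2, 1639, 11);
""")

rows = conn.execute("""
    -- Books filed directly under the searched subject.
    SELECT b.id, b.title
    FROM books AS b
    JOIN books_subjects AS bs ON bs.book_id = b.id
    WHERE bs.subject_id IN (1639)
    UNION
    -- Books filed under any subject the searched subject refers to.
    SELECT b.id, b.title
    FROM books AS b
    JOIN books_subjects AS bs ON bs.book_id = b.id
    JOIN subjects_subjects AS ss ON bs.subject_id = ss.see_subject_id
    WHERE ss.subject_id IN (1639)
    ORDER BY 1
""").fetchall()

print(rows)
```

UNION (as opposed to UNION ALL) removes duplicate rows, so a book that is reachable through both 'Explorers' and 'Voyages' is listed once and no GROUP BY is needed.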
Say I have a database of publishers, who employ authors, who write books.
Or to phrase it another way, each book, is written by an author, who works for a publisher.
publishers: id
authors: id, publisher_id
books: id, author_id
I know how to get a list of publishers with how many authors each employs, from this question.
How do I get a list of publishers with how many books each has published?
How can I get both - publishers, each with number of authors and number of books?
Try this:
SELECT COUNT(DISTINCT b.`id`) noofbooks, COUNT(DISTINCT au.`id`) noofauthors, pub.`id` publisher FROM publishers pub
INNER JOIN authors au ON au.`publisher_id` = pub.`id`
INNER JOIN books b ON b.`author_id` = au.`id` GROUP BY pub.`id`
You need a three-table join. This returns the number of books per publisher:
SELECT publishers.id, COUNT(*) FROM publishers
INNER JOIN authors ON publishers.id = authors.publisher_id
INNER JOIN books ON authors.id = books.author_id GROUP BY publishers.id;
You just need a simple SQL join query for that, as follows. Note the DISTINCT inside each COUNT; without it, both counts would equal the number of joined rows.
SELECT p.id, COUNT(DISTINCT a.id) totalAuthors, COUNT(DISTINCT b.id) totalBooks
FROM publishers AS p, authors AS a, books AS b
WHERE p.id = a.publisher_id
AND a.id = b.author_id
GROUP BY p.id;
I ended up with something similar to bhanu's answer:
SELECT publishers.*,
COUNT(DISTINCT authors.id) AS 'author_count',
COUNT(DISTINCT books.id) AS 'book_count'
FROM publishers
LEFT JOIN authors ON (authors.publisher_id = publishers.id)
LEFT JOIN books ON (books.author_id = authors.id)
GROUP BY publishers.id;
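Here is a small sketch of why COUNT(DISTINCT ...) and LEFT JOIN both matter for this query. It runs on SQLite through Python's sqlite3 module instead of MySQL, and the rows are made-up sample data: publisher 1 has two authors (one with three books, one with none) and publisher 2 has no authors at all.

```python
import sqlite3

# In-memory SQLite database; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publishers (id INTEGER PRIMARY KEY);
CREATE TABLE authors (id INTEGER PRIMARY KEY, publisher_id INTEGER);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER);
INSERT INTO publishers VALUES (1), (2);
INSERT INTO authors VALUES (1, 1), (2, 1);
INSERT INTO books VALUES (1, 1), (2, 1), (3, 1);
""")

# Inner joins with plain COUNT: the author is counted once per book,
# and publishers or authors without books disappear from the result.
naive = conn.execute("""
    SELECT p.id, COUNT(a.id), COUNT(b.id)
    FROM publishers p
    JOIN authors a ON a.publisher_id = p.id
    JOIN books b ON b.author_id = a.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

# LEFT JOINs keep every publisher; COUNT(DISTINCT ...) counts each
# author and each book once, and ignores the NULLs from unmatched rows.
correct = conn.execute("""
    SELECT p.id,
           COUNT(DISTINCT a.id) AS author_count,
           COUNT(DISTINCT b.id) AS book_count
    FROM publishers p
    LEFT JOIN authors a ON a.publisher_id = p.id
    LEFT JOIN books b ON b.author_id = a.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(naive)
print(correct)
```

On this data the naive version reports three authors for publisher 1 and drops publisher 2 entirely, while the LEFT JOIN version reports two authors and three books for publisher 1 and zeros for publisher 2.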
I'm stuck with creating a MySQL query. Below is my database structure.
authors (author_id and author_name)
books (book_id and book_title)
books_authors is the link table (book_id and author_id)
Result of all books and authors:
I need to get all the books for a certain author, but if a book has 2 authors, the second one must be displayed as well. For example, the book "Good Omens" with book_id=2 has two authors. When I run the query I get the books for author_id=1, but I cannot include the second author, "Neil Gaiman", in the result. The query is:
SELECT * FROM books
LEFT JOIN books_authors
ON books.book_id=books_authors.book_id
LEFT JOIN authors
ON books_authors.author_id=authors.author_id
WHERE books_authors.author_id=1
And below is the result:
You need to change the WHERE clause to execute a subselect like this:
SELECT b.*, a.*
FROM books b
LEFT JOIN books_authors ba ON ba.book_id = b.book_id
LEFT JOIN authors a ON a.author_id = ba.author_id
WHERE b.book_id IN (
SELECT book_id
FROM books_authors
WHERE author_id=1)
The problem with your query is that the WHERE clause is not only filtering the books you are getting in the result set, but also the book-author associations.
With this subquery you first use the author id to filter books, and then you use those book ids to fetch all the associated authors.
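The difference between the two WHERE clauses can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL. The author names for IDs 1 and 2 and the books other than "Good Omens" are assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite database; the author IDs and extra titles are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, author_name TEXT);
CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_title TEXT);
CREATE TABLE books_authors (book_id INTEGER, author_id INTEGER);
INSERT INTO authors VALUES (1, 'Terry Pratchett'), (2, 'Neil Gaiman');
INSERT INTO books VALUES (1, 'Mort'), (2, 'Good Omens'), (3, 'Coraline');
INSERT INTO books_authors VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

select_and_joins = """
    SELECT b.book_title, a.author_name
    FROM books b
    LEFT JOIN books_authors ba ON ba.book_id = b.book_id
    LEFT JOIN authors a ON a.author_id = ba.author_id
"""

# Filtering the link rows directly also throws away the co-author's link row.
filtered_join = conn.execute(
    select_and_joins
    + "WHERE ba.author_id = 1 ORDER BY b.book_id, a.author_id"
).fetchall()

# Filtering on book_id keeps every link row of the selected books.
with_subquery = conn.execute(
    select_and_joins
    + """WHERE b.book_id IN (
             SELECT book_id FROM books_authors WHERE author_id = 1)
         ORDER BY b.book_id, a.author_id"""
).fetchall()

print(filtered_join)
print(with_subquery)
```

The first result has one row per book written by author 1; the second has an extra row for "Good Omens" carrying the second author.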
As an aside, I do think that the suggestion to substitute the OUTER JOINs with INNER JOINs in this specific case should apply. The first LEFT OUTER JOIN on books_authors is certainly useless because the WHERE clause guarantees that at least one row exists in that table for each selected book_id. The second LEFT OUTER JOIN is probably useless as I expect the author_id to be primary key of the authors table, and I expect the books_authors table to have a foreign key and a NOT NULL constraint on author_id... which all means you should not have a books_authors row that does not reference a specific authors row.
If this is true and confirmed, then the query should be:
SELECT b.*, a.*
FROM books b
JOIN books_authors ba ON ba.book_id = b.book_id
JOIN authors a ON a.author_id = ba.author_id
WHERE b.book_id IN (
SELECT book_id
FROM books_authors
WHERE author_id=1)
Notice that INNER JOINs may very well be more efficient than OUTER JOINs in most cases (they give the engine more choice in how to execute the statement and fetch the result). So you should avoid OUTER JOINs unless they are strictly necessary.
I added aliases and removed the redundant columns from the result set.
You don't need a subquery for this:
SELECT *
FROM books_authors ba
JOIN books b
ON b.book_id = ba.book_id
JOIN books_authors ba2
ON ba2.book_id = b.book_id
JOIN authors a
ON a.author_id = ba2.author_id
WHERE ba.author_id = 1
You're pretty close... basically you need to identify all unique book ids for which author_id = ?. Then join that with the books_authors table again to get all of the authors associated with those book ids. Then join to books and authors to get your book and author names.
Hopefully the following is very clear in this regard, but if it's not just let me know and I'll help explain it in more detail
SELECT a.*, d.* FROM books as a
INNER JOIN (SELECT book_id FROM books_authors WHERE author_id=?) as b
ON a.book_id=b.book_id
INNER JOIN books_authors as c
ON b.book_id=c.book_id
INNER JOIN authors AS d
ON d.author_id = c.author_id
Btw you could also structure this with a WHERE EXISTS clause. I don't think you'll see much of a performance difference either way, but just FYI you can try that if need be. Use EXPLAIN to view the execution plan for the query. If it's problematic, there are other ways to skin this cat.
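For reference, the WHERE EXISTS form could look roughly like the sketch below. It runs on SQLite through Python's sqlite3 module rather than MySQL, and the author IDs and the titles other than "Good Omens" are invented sample data.

```python
import sqlite3

# In-memory SQLite database; the author IDs and extra titles are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, author_name TEXT);
CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_title TEXT);
CREATE TABLE books_authors (book_id INTEGER, author_id INTEGER);
INSERT INTO authors VALUES (1, 'Terry Pratchett'), (2, 'Neil Gaiman');
INSERT INTO books VALUES (1, 'Mort'), (2, 'Good Omens'), (3, 'Coraline');
INSERT INTO books_authors VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

author_id = 1  # the author whose books we want, together with any co-authors

rows = conn.execute("""
    SELECT b.book_title, a.author_name
    FROM books b
    JOIN books_authors ba ON ba.book_id = b.book_id
    JOIN authors a ON a.author_id = ba.author_id
    -- Keep a book only if the requested author is among its authors.
    WHERE EXISTS (
        SELECT 1
        FROM books_authors x
        WHERE x.book_id = b.book_id
          AND x.author_id = ?
    )
    ORDER BY b.book_id, a.author_id
""", (author_id,)).fetchall()

print(rows)
```

The correlated subquery only decides which books qualify; the outer joins still return every author of each qualifying book.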
Also, make sure you pay attention to indexes. Whether you use the method here or the method described by Frazz, a compound (multi-column) index may make a big difference for you. That is, consider indexing books_authors both by book_id and by (author_id, book_id). Whichever form you pick (an additional join, an IN subquery, or an EXISTS subquery), a multi-column index on books_authors is likely to help, especially if this table is large.
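As a rough illustration of the indexing advice, the sketch below creates the (author_id, book_id) index and asks the engine whether it uses it. It runs on SQLite through Python's sqlite3 module, so the plan output format differs from MySQL's EXPLAIN; the index name idx_ba_author_book and the primary key definition are assumptions for the example.

```python
import sqlite3

# In-memory SQLite database; index name and key layout are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_authors (
    book_id INTEGER NOT NULL,
    author_id INTEGER NOT NULL,
    PRIMARY KEY (book_id, author_id)   -- covers lookups by book_id
);
-- Second index with the columns reversed, for lookups by author_id.
CREATE INDEX idx_ba_author_book ON books_authors (author_id, book_id);
INSERT INTO books_authors VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# EXPLAIN QUERY PLAN is SQLite's counterpart to MySQL's EXPLAIN.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT book_id FROM books_authors WHERE author_id = 1"
).fetchall()

# The human-readable description is the last column of each plan row.
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

Because the index holds both columns, the lookup by author_id can be answered from the index alone. In MySQL the same CREATE INDEX statement applies, and EXPLAIN on the real query shows whether the optimizer picks it.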
Hi I'm trying to write a query to get information about the author, title, category and medium.
However as the items can be in many mediums and categories, I'm getting the results appearing duplicated in the columns. How can I get the results so I don't see medium as book,book,book and category as Horror,Fantasy,Fiction. I'm assuming I will need some sort of subquery - if so how would I do it?
SELECT book.bookid, book.author, book.title, group_concat(category.categorydesc), group_concat(medium.mediumdesc)
FROM book
Inner JOIN bookscategories ON book.bookid = bookscategories.bookid
Inner JOIN category ON bookscategories.categoryid = category.categoryid
Inner JOIN booksmediums ON book.bookid = booksmediums.bookid
Inner JOIN medium ON booksmediums.mediumid = medium.mediumid
GROUP BY book.bookid
Thanks
Tom
As stated in the comments, the solution is to add the DISTINCT keyword inside the GROUP_CONCAT() calls, like this:
... book.title, group_concat(DISTINCT category.categorydesc), group_concat(DISTINCT medium.mediumdesc) ...
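To see where the repeated values come from, here is a minimal sketch using SQLite through Python's sqlite3 module (SQLite's group_concat also accepts DISTINCT when called with a single argument). The book, its three categories, and its two mediums are made-up sample data.

```python
import sqlite3

# In-memory SQLite database; all rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (bookid INTEGER PRIMARY KEY, author TEXT, title TEXT);
CREATE TABLE category (categoryid INTEGER PRIMARY KEY, categorydesc TEXT);
CREATE TABLE medium (mediumid INTEGER PRIMARY KEY, mediumdesc TEXT);
CREATE TABLE bookscategories (bookid INTEGER, categoryid INTEGER);
CREATE TABLE booksmediums (bookid INTEGER, mediumid INTEGER);
INSERT INTO book VALUES (1, 'Some Author', 'Some Title');
INSERT INTO category VALUES (1, 'Horror'), (2, 'Fantasy'), (3, 'Fiction');
INSERT INTO medium VALUES (1, 'book'), (2, 'ebook');
INSERT INTO bookscategories VALUES (1, 1), (1, 2), (1, 3);
INSERT INTO booksmediums VALUES (1, 1), (1, 2);
""")

joins = """
    FROM book
    INNER JOIN bookscategories ON book.bookid = bookscategories.bookid
    INNER JOIN category ON bookscategories.categoryid = category.categoryid
    INNER JOIN booksmediums ON book.bookid = booksmediums.bookid
    INNER JOIN medium ON booksmediums.mediumid = medium.mediumid
    GROUP BY book.bookid
"""

# 3 categories x 2 mediums = 6 joined rows, so each list has 6 entries.
cats_plain, meds_plain = conn.execute(
    "SELECT group_concat(category.categorydesc), group_concat(medium.mediumdesc)"
    + joins
).fetchone()

# DISTINCT removes the repeats produced by the two independent joins.
cats_distinct, meds_distinct = conn.execute(
    "SELECT group_concat(DISTINCT category.categorydesc),"
    " group_concat(DISTINCT medium.mediumdesc)" + joins
).fetchone()

print(cats_plain, "|", meds_plain)
print(cats_distinct, "|", meds_distinct)
```

The two link tables are independent of each other, so joining both multiplies the rows per book; DISTINCT inside the aggregate undoes that multiplication without needing a subquery.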