MySQL limiting rows in a many-to-many relationship - mysql

I have a MySQL database with three tables in a many-to-many relationship: books, authors, and a junction table that joins the two.
A book can have many authors, and an author can write many books.
Now I want to fetch the books (with their authors) 8 books at a time. I wrote this query:
-- first table
SELECT * FROM `books`
-- join the junction
LEFT JOIN books_authors ON books.book_id = books_authors.book_id
-- join the authors
LEFT JOIN authors ON books_authors.author_id = authors.author_id
-- limit to 8, start at S
LIMIT S, 8
This works fine when the relationship is one-to-one. But when each book has several authors, say 3 each, the full result has 8 x 3 rows (one row per book/author pair), and the LIMIT still clips it to 8 rows, so I don't get all the details.
How can I get 8 books with all their details?

You could limit the number of books in a subquery:
select *
from (
select *
from books
limit S, 8
) b
left join
books_authors ba
on b.book_id = ba.book_id
left join
authors a
on ba.author_id = a.author_id
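A minimal sketch of the subquery approach, run against an in-memory SQLite database through Python's sqlite3 module rather than MySQL (that substitution, the sample rows, and the helper name page_of_books are assumptions for illustration). SQLite spells the page as LIMIT count OFFSET start, which corresponds to MySQL's LIMIT start, count. An ORDER BY is added inside the subquery so that the pages are deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books_authors (book_id INTEGER, author_id INTEGER);
    INSERT INTO books VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO books_authors VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (3, 3);
""")


def page_of_books(conn, offset, page_size):
    """Return every (book, author) row for one page of books (hypothetical helper)."""
    return conn.execute("""
        SELECT b.book_id, b.title, a.name
        FROM (SELECT * FROM books ORDER BY book_id LIMIT ? OFFSET ?) AS b
        LEFT JOIN books_authors AS ba ON b.book_id = ba.book_id
        LEFT JOIN authors AS a ON ba.author_id = a.author_id
        ORDER BY b.book_id, a.author_id
    """, (page_size, offset)).fetchall()


# The page holds 2 books, but 4 rows: book 1 has three authors, book 2 has one.
rows = page_of_books(conn, 0, 2)
for row in rows:
    print(row)
```

Because the limit is applied to books before the joins, the number of rows per page varies with the number of authors, while the number of books stays fixed.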

I don't know which SQL features your server supports, but many servers support window functions. Essentially, you assign a sequential number to every book by a given author in a particular ordering, and select only the rows whose number is less than or equal to 8 (for some value of 8). The numbering is provided by the function ROW_NUMBER():
SELECT * FROM (
SELECT a.author_id, b.book_id, a.name, b.title,
ROW_NUMBER() OVER (PARTITION BY a.author_id ORDER BY b.title) book_seq
FROM author a
LEFT JOIN book b on a.author_id = b.author_id) dummy
WHERE book_seq <= 8
Above, ORDER BY b.title defines the ordering in which you select the 8 minimal book records.
EDIT: MySQL versions before 8.0 do not support ROW_NUMBER(), so my answer does not apply there as it is. There is an interesting article with examples on simulating ROW_NUMBER with OVER (PARTITION) in MySQL.
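A small sketch of the ROW_NUMBER() idea, adapted to the junction-table schema from the question and run on SQLite via Python's sqlite3 module (an assumption; SQLite needs version 3.25 or later for window functions). The sample rows and the cap of 2 books per author are hypothetical. Note that this limits books per author, which is a different problem from paging over books.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books_authors (book_id INTEGER, author_id INTEGER);
    INSERT INTO books VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books_authors VALUES (1, 1), (2, 1), (3, 1), (3, 2);
""")

top_n = 2  # hypothetical cap on books per author

# Number each author's books by title, then keep only the first top_n of them.
rows = conn.execute("""
    SELECT name, title, book_seq
    FROM (
        SELECT a.name, b.title,
               ROW_NUMBER() OVER (PARTITION BY a.author_id ORDER BY b.title) AS book_seq
        FROM authors AS a
        JOIN books_authors AS ba ON ba.author_id = a.author_id
        JOIN books AS b ON b.book_id = ba.book_id
    ) AS numbered
    WHERE book_seq <= ?
    ORDER BY name, book_seq
""", (top_n,)).fetchall()

for row in rows:
    print(row)
```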

Off the top of my head, I'd guess you could try something like
select *
from ( select * from books limit s, 8 ) booksLimited
join books_authors on booksLimited.book_id = books_authors.book_id
join authors on books_authors.author_id = authors.author_id
But I'm not going to go to the effort of installing MySQL just to try this out for you. If it doesn't work, comment on it and I'll delete the answer.

Related

How to select all the authors from database with the number of books assigned to them?

I've the following DB structure:
Authors(id,name);
Books(id,title,authorId);
I want to select all fields from authors and the number of books they are assigned to. I've managed to get the result, but only for the authors that are assigned to at least one book, which is not what I want. I tried with the following query:
SELECT books.*,authors.*
FROM authors
FULL OUTER JOIN books
ON authors.id = books.authorId;
but it doesn't work.
I guess that you want a left join and aggregation. Count b.id rather than *, so that authors without books get 0 instead of 1:
select a.id, a.name, count(b.id)
from authors a
left join books b on b.authorId = a.id
group by a.id, a.name
An outer join will bring back authors without books. If you use an inner join instead, your results will only include authors with at least 1 book.
I would recommend a correlated subquery:
SELECT a.*,
(SELECT COUNT(*)
FROM books b
WHERE a.id = b.authorId
) as num_books
FROM authors a;
This allows you to use SELECT a.* from authors. If you put a GROUP BY in the outer query, you either need to list all the columns separately or be using a database that allows you to aggregate by a primary key, while selecting other columns (this is standard functionality but most databases do not support it).
You definitely need LEFT JOIN and GROUP BY, but the details are not clear enough from the task description. Let's try something like
SELECT a.*, ab.count
FROM authors AS a
LEFT JOIN (
SELECT authorId, COUNT(*) AS count
FROM books
GROUP BY authorId
) AS ab ON a.id = ab.authorId;
Also, if you don't want to get NULL for authors without books, you can apply this expression:
IFNULL(ab.count, 0) AS count
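To illustrate why the choice of counted expression matters with a LEFT JOIN, here is a minimal sketch on an in-memory SQLite database via Python's sqlite3 module (SQLite in place of MySQL and the two sample authors are assumptions). COUNT(*) counts the NULL-extended row that the left join produces for an author without books, while COUNT(b.id) ignores it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, authorId INTEGER);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');  -- Bob has no books
    INSERT INTO books VALUES (1, 'X', 1), (2, 'Y', 1);
""")

query = """
    SELECT a.name,
           COUNT(*)    AS star_count,  -- counts the NULL-extended row too
           COUNT(b.id) AS book_count   -- counts only rows with a real book
    FROM authors AS a
    LEFT JOIN books AS b ON b.authorId = a.id
    GROUP BY a.id, a.name
    ORDER BY a.id
"""
counts = conn.execute(query).fetchall()
for name, star_count, book_count in counts:
    print(name, star_count, book_count)
```

For Bob the two columns disagree: star_count is 1 and book_count is 0.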

Will this MySQL query select all of the authors who wrote a book?

I am having a lot of trouble trying to work out this question:
Write a query to show the number of authors who have written a book
Author(AuthorID, AuthorName, Address, TelephoneNo, PublisherCode)
Book (BookID, Name, ReleaseDate, Price, AuthorID)
I have
SELECT a.AuthorName, COUNT(b.*) AS ‘number of books written’
FROM Author a JOIN Book b ON a.AuthorID = b.BookID
GROUP BY a.AuthorName;
which counts the number of books each author has written.
I know this is not correct, but I cannot figure it out.
Assuming the requirement is to count authors that have at least one book, the simplest query to satisfy that would be:
SELECT COUNT(DISTINCT b.authorid)
FROM book b
We probably want to assign an alias (name) to the returned column:
SELECT COUNT(DISTINCT b.authorid) AS `count_of_authors_who_have_at_least_one_book`
FROM book b
We could also do a join to the author table, but that isn't necessary here, unless there are values of authorid in the book table that don't appear in the author table (i.e. there's not a foreign key constraint, or referential integrity is not enforced)
Queries to get authors that have two or more books would be a bit more complicated:
SELECT COUNT(*)
FROM ( -- authors of two or more books
SELECT b.authorid
FROM book b
GROUP
BY b.authorid
HAVING COUNT(1) >= 2
) c
If we want authors that have EXACTLY one book (not two or more) we can tweak the condition in the HAVING clause:
SELECT COUNT(*) AS `count_authors_of_exactly_one_book`
FROM ( -- authors of exactly one book
SELECT b.authorid
FROM book b
GROUP
BY b.authorid
HAVING COUNT(1) = 1
) c
You were pretty close. You need to join on the author ID. You are currently mixing the author and book IDs, which won't match correctly. Also note that COUNT(b.*) is not valid in MySQL (use COUNT(*)), and the alias needs backticks rather than typographic quotes.
SELECT
a.AuthorName,
COUNT(*) AS `number of books written`
FROM Author a
JOIN Book b ON a.AuthorID = b.AuthorID
GROUP BY a.AuthorName;
If you just want a single number, the total count of authors who wrote at least one book, use the query below:
select count(*) as author_count from Author where exists (select 1 from Book where Book.AuthorID = Author.AuthorID)
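The difference between "books per author" and "number of authors with a book" can be checked on a toy data set. This is a sketch on SQLite via Python's sqlite3 module (an assumption; the answers target MySQL), with hypothetical sample rows in which one of three authors has no book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (AuthorID INTEGER PRIMARY KEY, AuthorName TEXT);
    CREATE TABLE Book (BookID INTEGER PRIMARY KEY, Name TEXT, AuthorID INTEGER);
    INSERT INTO Author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');  -- Cy has no book
    INSERT INTO Book VALUES (10, 'X', 1), (11, 'Y', 1), (12, 'Z', 2);
""")

# One row per author: how many books each has written.
per_author = conn.execute("""
    SELECT a.AuthorName, COUNT(*)
    FROM Author AS a
    JOIN Book AS b ON a.AuthorID = b.AuthorID
    GROUP BY a.AuthorName
    ORDER BY a.AuthorName
""").fetchall()

# A single number: how many authors have at least one book.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT AuthorID) FROM Book"
).fetchone()[0]

exists_count = conn.execute("""
    SELECT COUNT(*) FROM Author
    WHERE EXISTS (SELECT 1 FROM Book WHERE Book.AuthorID = Author.AuthorID)
""").fetchone()[0]

print(per_author, distinct_count, exists_count)
```

The two single-number queries agree as long as every AuthorID in Book also exists in Author.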

Query to select random values with inner join on three tables

I have a database with three tables:
person: id, bio, name
book: id, id_person, title, info
file: id, id_book, location
Other information: Book is about ~50,000 rows, File is about ~ 300,000 rows.
What I'm trying to do is to select 12 different authors and select just one book and from that book select location from the table file.
What I tried is the following:
SELECT DISTINCT(`person`.`id`), `person`.`name`, `book`.`id`, `book`.`title`, `book`.`info`, `file`.`location`
FROM `person`
INNER JOIN `book`
ON `book`.`id_person` = `person`.`id`
INNER JOIN `file`
ON `file`.`id_book` = `book`.`id`
LIMIT 12
I have learned that DISTINCT does not work the way one might expect. Or am I missing something? The above code returns several books from the same author before moving on to the next one, which is NOT what I want. I want 1 book from each of 12 different authors.
What would be the correct way to retrieve this information from the database? Also, I would want to retrieve 12 random people, not people that are stored in consecutive order in the database. I could not formulate any query with rand() since I couldn't even get different authors.
I use MariaDB, and I would appreciate any help, especially help that allows me to do this with great performance.
In MySQL, you can do this, in practice, using GROUP BY
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`, f.`location`
FROM `person` p INNER JOIN
`book` b
ON b.`id_person` = p.`id` INNER JOIN
`file` f
ON f.id_book = b.id
GROUP BY p.id
ORDER BY rand()
LIMIT 12;
However, this is not guaranteed to return the non-id values from the same row (although it does in practice). And, although the authors are random, the books and locations are not.
The SQL Query to do this consistently is a bit more complicated:
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`,
(SELECT f.location
FROM file f
WHERE f.id_book = b.id
ORDER BY rand()
LIMIT 1
) as location
FROM (SELECT p.*,
(SELECT b.id
FROM book b
WHERE b.id_person = p.id
ORDER BY rand()
LIMIT 1
) as book_id
FROM person p
ORDER BY rand()
LIMIT 12
) p INNER JOIN
book b
ON b.id = p.book_id ;
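The shape of that query (pick random people, and for each one a random book through a correlated subquery) can be tried out on a small data set. The sketch below uses SQLite via Python's sqlite3 module, which is an assumption: SQLite spells the function random() instead of rand(), the file table is left out for brevity, and the sample rows and the page size of 3 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, id_person INTEGER, title TEXT);
""")
# Five people, each with two books (book ids 10*p and 10*p + 1).
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(p, f"person{p}") for p in range(1, 6)])
conn.executemany("INSERT INTO book VALUES (?, ?, ?)",
                 [(10 * p + k, p, f"book{p}-{k}") for p in range(1, 6) for k in range(2)])

# Inner query: 3 random people, each paired with one random book id of their own.
# Outer query: join back to book to fetch the chosen book's columns.
picks = conn.execute("""
    SELECT p.id, b.id, b.id_person, b.title
    FROM (SELECT person.id,
                 (SELECT book.id
                  FROM book
                  WHERE book.id_person = person.id
                  ORDER BY random()
                  LIMIT 1) AS book_id
          FROM person
          ORDER BY random()
          LIMIT 3) AS p
    JOIN book AS b ON b.id = p.book_id
""").fetchall()

for pick in picks:
    print(pick)
```

Each run returns different rows, but always one book per person and never the same person twice. ORDER BY random() sorts the whole table, so on large tables this is unlikely to meet the performance goal without further work.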

Selecting rows from one table using values gotten from another table MYSQL

I currently have 2 MySQL tables in my db:
Film and Film_Ratings_Report
The primary key for Film is filmid, which is used to identify the film ratings in the Film_Ratings_Report table.
I would like to know if it's possible, using a MySQL query only, to search the ratings table and collect all film ids which fit a certain criterion, then use the selected IDs to get the film titles from the Film table. Below is the MySQL query I'm using, which isn't working:
SELECT *
FROM film
UNION SELECT filmid
FROM film_rating_report
WHERE rating = 'GE'
LIMIT 0,0
I am relatively green to MYSQL and would appreciate any help on this.
Thanks in Advance
SELECT * FROM film WHERE filmid IN
(SELECT filmid FROM film_rating_report WHERE rating = 'GE');
should work
It seems you want a semi-join, i.e. a join where only data from one of the two joined tables is needed. In this case, all rows from film for which there is a matching row in film_rating_report that satisfies the wanted condition (rating = 'GE').
This is not exactly equivalent to a usual join because even if there are 2 (or more) rows in the second table (2 ratings of a film, both with 'GE'), we still want the film to be shown once, not twice (or more times) as it would be shown with a usual join.
There are various ways to write a semi-join; the most popular are:
using an EXISTS correlated subquery (@Justin's answer):
SELECT t1.*
FROM film t1
WHERE EXISTS (SELECT filmid
FROM film_rating_report t2
WHERE t2.rating = 'GE'
AND t2.filmid = t1.filmid);
using an IN (uncorrelated) subquery (@SG 86's answer):
(this should be used with care when the joining columns (the two filmid in this case) are nullable; the NOT IN form in particular may return unexpected results, or none at all)
SELECT *
FROM film
WHERE filmid IN
( SELECT filmid
FROM film_rating_report
WHERE rating = 'GE'
);
using a usual JOIN with a GROUP BY to avoid the duplicate rows in the results (@Tomas' answer):
(and note that this specific use of GROUP BY works in MySQL only and in recent versions of Postgres, if you ever want to write a similar query in other DBMS, you'll have to include all columns: GROUP BY f.filmid, f.title, f.director, ...)
SELECT f.*
FROM film AS f
JOIN film_rating_report AS frr
ON f.filmid = frr.filmid
WHERE frr.rating = 'GE'
GROUP BY f.filmid ;
A variation on @Tomas' answer, where the GROUP BY is done on a derived table and then the JOIN:
SELECT f.*
FROM film AS f
JOIN
( SELECT filmid
FROM film_rating_report
WHERE rating = 'GE'
GROUP BY filmid
) AS frr
ON f.filmid = frr.filmid ;
Which one to use depends on the RDBMS and the specific version you are using (for example, IN subqueries should be avoided in MySQL versions before 5.6, as they may produce inefficient execution plans), your specific table sizes, distribution, indexes, etc.
I usually prefer the EXISTS solution but it never hurts to first test the various queries with the table sizes you have or expect to have in the future and try to find the best query-indexes combination for your case.
Addition: if there is a unique constraint on the film_rating_report (filmid, rating) combination, which means that no film will ever get two same ratings, or if there is an even stricter (but more plausible) unique constraint on film_rating_report (filmid) that means that every film has at most one rating, you can simplify the JOIN solutions to (and get rid of all the other queries):
SELECT f.*
FROM film AS f
JOIN film_rating_report AS frr
ON f.filmid = frr.filmid
WHERE frr.rating = 'GE' ;
Preferred solution for this is to use join, and don't forget group by so that you don't have duplicate lines:
select film.*
from film
join film_rating_report on film.filmid = film_rating_report.filmid
and rating = 'GE'
group by film.filmid
EDIT: as correctly noted by @ypercube, I was wrong claiming that the performance of join & group by is better than using subqueries with exists or in - quite the opposite.
Query:
SELECT t1.*
FROM film t1
WHERE EXISTS (SELECT filmid
FROM film_rating_report t2
WHERE t2.rating = 'GE'
AND t2.filmid = t1.filmid);
I believe this will work, though without knowing your DB structure (consider giving SHOW CREATE TABLE on your tables), I have no way to know for sure:
SELECT film.*
FROM (film)
LEFT JOIN film_rating_report ON film.filmid = film_rating_report.filmid AND film_rating_report.rating = 'GE'
WHERE film_rating_report.filmid IS NOT NULL
GROUP BY film.filmid
(The WHERE film_rating_report.filmid IS NOT NULL prevents rows that don't have the rating you are seeking from sneaking in. I added GROUP BY at the end because film_rating_report might match more than once; I'm not sure, as I have no visibility into the data stored in it.)
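The duplicate-row behaviour that separates a plain join from a semi-join is easy to see on a few rows. This is a sketch on SQLite via Python's sqlite3 module (an assumption, as the thread is about MySQL); the sample films are hypothetical, with film 1 deliberately rated 'GE' twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE film (filmid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_rating_report (filmid INTEGER, rating TEXT);
    INSERT INTO film VALUES (1, 'One'), (2, 'Two'), (3, 'Three');
    INSERT INTO film_rating_report VALUES (1, 'GE'), (1, 'GE'), (2, 'PG'), (3, 'GE');
""")


def titles(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Plain join: one output row per matching rating, so film 1 appears twice.
plain_join = titles("""
    SELECT f.title
    FROM film AS f
    JOIN film_rating_report AS frr ON f.filmid = frr.filmid
    WHERE frr.rating = 'GE'
    ORDER BY f.filmid
""")

# Semi-join with EXISTS: each film appears at most once.
with_exists = titles("""
    SELECT f.title
    FROM film AS f
    WHERE EXISTS (SELECT 1 FROM film_rating_report AS frr
                  WHERE frr.rating = 'GE' AND frr.filmid = f.filmid)
    ORDER BY f.filmid
""")

# Semi-join with IN: same result as EXISTS here.
with_in = titles("""
    SELECT title
    FROM film
    WHERE filmid IN (SELECT filmid FROM film_rating_report WHERE rating = 'GE')
    ORDER BY filmid
""")

print(plain_join, with_exists, with_in)
```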

How to do calculation using many-to-many relationship in MS Access

I have a database having the following structure. Each book can have multiple authors, and each author can write multiple books.
[book:book_id, book_name, book_price]
[author: author_id, author_name]
[link:book_id, author_id]
{book_id's and author_id's are linked. the complete structure is shown here: http://i.stack.imgur.com/vIFNU.png}
Now each book has a price (currency). 30% of the price should be distributed equally among the authors who have contributed to the book.
My question is how to find the total payment for each author for a particular year.
[I thought of a solution my self. I could do only up to step 1. If you can provide me some hints or materials where I can find how do such manipulations, it would be very helpful]
The algorithm of my solution is:
1. For each book_id, find the number of author_id's in the middle table that have the same book_id. (I could do this with a query.)
2. Divide the book_price by number_of_author_in_book and multiply it by 30/100 to get the money for that book that goes to each author's account (say payment_of_one_author_in_book).
3. For each author_id in the middle table, look up the corresponding book_id and add the payment_of_one_author_in_book for that author_id to a new variable (author_payment_this_year) corresponding to the author ID, if the year matches the query year.
Thanks in advance
This example includes aliases and subqueries.
SELECT
a.author_id,
a.author_name,
SUM(share.auth_share) AS author_total
FROM (link l
INNER JOIN (
SELECT
b.book_id,
( b.book_price * 0.3 ) / ac.no_auth AS auth_share
FROM book b
INNER JOIN (
SELECT
l.book_id,
COUNT(l.author_id) AS no_auth
FROM link l
GROUP BY l.book_id) AS ac
ON b.book_id = ac.book_id) AS share
ON l.book_id = share.book_id)
INNER JOIN author a
ON l.author_id = a.author_id
GROUP BY a.author_id,a.author_name
SELECT author.AuthorId, author.Author_name, book.Book_Name, book.Book_Price, [Book_Price]/DCount("[Author_id]","[Link]","[Book_id]=" & [Book_Id]) AS Share
FROM (author INNER JOIN link AS link_1 ON author.AuthorId = link_1.Author_id) INNER JOIN book ON link_1.Book_id = book.Book_Id;
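Basically, it is just a join of the three tables. The only tricky bit is using the DCount function to count how many authors in Link share this Book_Id. Note that Share here is the full price split between the authors; multiply it by 0.3 to get the 30% portion from the question.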
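The three steps from the question (count authors per book, compute each author's share of 30% of the price, sum the shares per author) can be checked with a small worked example. This is a sketch on SQLite via Python's sqlite3 module, not MS Access, and the sample prices are hypothetical. The schema in the question has no date column, so no year filter is applied here; with such a column it would be a WHERE clause on the book table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_name TEXT, book_price REAL);
    CREATE TABLE author (author_id INTEGER PRIMARY KEY, author_name TEXT);
    CREATE TABLE link (book_id INTEGER, author_id INTEGER);
    INSERT INTO book VALUES (1, 'B1', 100.0), (2, 'B2', 50.0);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO link VALUES (1, 1), (1, 2), (2, 1);  -- B1 has two authors, B2 has one
""")

raw = conn.execute("""
    SELECT a.author_name,
           SUM(b.book_price * 0.3 / ac.no_auth) AS author_total
    FROM link AS l
    JOIN book AS b ON b.book_id = l.book_id
    JOIN (SELECT book_id, COUNT(author_id) AS no_auth  -- step 1: authors per book
          FROM link
          GROUP BY book_id) AS ac ON ac.book_id = l.book_id
    JOIN author AS a ON a.author_id = l.author_id
    GROUP BY a.author_id, a.author_name
    ORDER BY a.author_id
""").fetchall()

# Round for display; the sums are floating-point values.
totals = [(name, round(total, 2)) for name, total in raw]
print(totals)
```

With these rows, B1 pays 30 split two ways (15 each) and B2 pays 15 to its single author, so Ann's total is about 30 and Bob's about 15.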