I need to find the author who has written the most volumes in my database. I have three tables: "VOLUMES", "AUTHOR", and "WRITTEN BY". Table VOLUMES has columns such as title, volume_id (primary key), year, edition_year, etc. Table AUTHOR has columns for general information such as name, surname, id, etc. Table WRITTEN BY connects AUTHOR and VOLUMES; it contains the columns volume_id and author_id.
My data can contain many copies of the same volume, so I guess I need to group every copy of the same volume.
The query I have written is:
SELECT A.NAME, A.SURNAME, COUNT(V.ID) no_of_volumes
FROM VOLUMES AS V, AUTHOR AS A JOIN WRITTEN_BY W
WHERE (V.ID = W.VOLUME_ID AND A.ID = W.A_ID)
GROUP BY A.NAME, A.SURNAME
ORDER BY no_of_volumes DESC
LIMIT 1;
Now, this should print just one author, but I want it to print EVERY author that has written the same number of volumes... How do I do that?
If you are allowed to change the schema slightly (use unique ID names), then this could do the trick:
select AUTHOR.NAME
, count(*) VolumeCount
from AUTHOR
inner join WRITTEN_BY using(AUTHOR_ID)
inner join VOLUMES using(VOLUME_ID)
group by AUTHOR.AUTHOR_ID
order by VolumeCount desc
limit 1;
If not, then this should do the trick:
select AUTHOR.NAME
, count(*) VolumeCount
from AUTHOR
inner join WRITTEN_BY on WRITTEN_BY.AUTHOR_ID = AUTHOR.ID
inner join VOLUMES on WRITTEN_BY.VOLUME_ID = VOLUMES.ID
group by AUTHOR.ID
order by VolumeCount desc
limit 1;
Please reserve WHERE for filtering only. Using it for foreign-key join conditions makes the code messy.
(And if your table and field names are uppercase, it makes sense to use lowercase keywords.)
You must aggregate twice inside written_by: once to get the max number of volumes written and then to get the author ids who wrote the max number of volumes.
Then join to authors to get the authors details:
SELECT a.*, t.no_of_volumes
FROM author a INNER JOIN (
SELECT author_id, COUNT(*) no_of_volumes
FROM written_by
GROUP BY author_id
HAVING COUNT(*) = (
SELECT COUNT(*) no_of_volumes
FROM written_by
GROUP BY author_id
ORDER BY no_of_volumes DESC LIMIT 1
)
) t ON t.author_id = a.id
The table volumes is not needed.
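To see the tie handling in action, here is a minimal sketch that runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the trimmed-down author table (only id and name) are assumptions made up for illustration; the original question targets MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE written_by (volume_id INTEGER, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Ann and Bob each wrote two volumes, Cy wrote one (made-up data).
    INSERT INTO written_by VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 3);
""")

# Inner-most subquery: the highest per-author count.
# HAVING keeps every author whose count equals that maximum.
top_authors = conn.execute("""
    SELECT a.name, t.no_of_volumes
    FROM author a
    INNER JOIN (
        SELECT author_id, COUNT(*) AS no_of_volumes
        FROM written_by
        GROUP BY author_id
        HAVING COUNT(*) = (
            SELECT COUNT(*) AS no_of_volumes
            FROM written_by
            GROUP BY author_id
            ORDER BY no_of_volumes DESC
            LIMIT 1
        )
    ) t ON t.author_id = a.id
    ORDER BY a.name
""").fetchall()

print(top_authors)  # [('Ann', 2), ('Bob', 2)]
```

Both tied authors come back, whereas ORDER BY ... LIMIT 1 on its own would return only one of them.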
For MySQL 8.0+, the code can be simplified with the window function RANK():
SELECT a.*, t.no_of_volumes
FROM author a INNER JOIN (
SELECT r.*
FROM (
SELECT author_id, COUNT(*) no_of_volumes,
RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM written_by
GROUP BY author_id
) r
WHERE r.rnk = 1
) t ON t.author_id = a.id
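The reason rnk = 1 keeps ties is that RANK() gives equal counts the same rank and then skips ahead. A small sketch of that behaviour, assuming an in-memory SQLite database (version 3.25 or later for window functions) and made-up sample rows; unlike the query above, it counts in a subquery and ranks in the outer query, which is only a stylistic choice for this sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE written_by (volume_id INTEGER, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Ann and Bob each wrote two volumes, Cy wrote one (made-up data).
    INSERT INTO written_by VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 3);
""")

# Count per author first, then rank the counts from highest to lowest.
ranked = conn.execute("""
    SELECT a.name, c.no_of_volumes,
           RANK() OVER (ORDER BY c.no_of_volumes DESC) AS rnk
    FROM (
        SELECT author_id, COUNT(*) AS no_of_volumes
        FROM written_by
        GROUP BY author_id
    ) c
    INNER JOIN author a ON a.id = c.author_id
    ORDER BY rnk, a.name
""").fetchall()

print(ranked)  # [('Ann', 2, 1), ('Bob', 2, 1), ('Cy', 1, 3)]

# Keeping only rank 1 returns every author tied for the top spot.
top = [(name, n) for name, n, rnk in ranked if rnk == 1]
print(top)  # [('Ann', 2), ('Bob', 2)]
```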
I am working on a project at my university where I need to write a query against the database. I want the query to return the company with the most movies in a given genre. At the moment I have this query, but it only returns one company, and there can probably be more than one.
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC LIMIT 1
I believe I will need a "having" in here.
Please try this. It may be because you added LIMIT 1, which shows only the first retrieved record:
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC
I assume by "category" you mean "genre" -- or that they are the same thing.
Do not use NATURAL JOIN. It does not even use properly declared foreign key relationships, instead relying merely on name similarity between tables. It is dangerous because the columns used are not specified and can introduce hard-to-debug errors. I often refer to it as an "abomination" because it does not take table declarations into account.
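A quick way to see the problem is a sketch like the following, run against an in-memory SQLite database. It assumes, purely for illustration, that both Movie and Company have a Name column (the question does not show the real table definitions). NATURAL JOIN then silently joins on Name as well as CompanyID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: both tables happen to have a Name column.
    CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Movie (MovieID INTEGER PRIMARY KEY, CompanyID INTEGER, Name TEXT);
    INSERT INTO Company VALUES (1, 'Acme Films');
    INSERT INTO Movie VALUES (100, 1, 'First Movie'), (101, 1, 'Second Movie');
""")

# NATURAL JOIN matches on every shared column name: CompanyID AND Name.
natural_count = conn.execute(
    "SELECT COUNT(*) FROM Movie NATURAL JOIN Company"
).fetchone()[0]

# The explicit join states exactly which column links the tables.
explicit_count = conn.execute(
    "SELECT COUNT(*) FROM Movie m JOIN Company c ON c.CompanyID = m.CompanyID"
).fetchone()[0]

print(natural_count, explicit_count)  # 0 2
```

No movie title equals a company name, so the natural join returns nothing, and it does so without any error or warning.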
If you have a given category, then I would expect a WHERE clause:
SELECT c.company_id, COUNT(*) as NumberOfMovies
FROM Movie m JOIN
CategoryFilm cf
ON cf.movie_id = m.movie_id JOIN
Company c
ON c.company_id = m.company_id
WHERE cf.category_id = ?
GROUP BY c.company_id
ORDER BY NumberOfMovies DESC
LIMIT 1;
If you want to allow ties, you can use window function rank():
select *
from (
select
co.companyID,
ca.categoryID,
count(*) NumberOfMovies,
rank() over(partition by ca.categoryID order by count(*) desc) rn
from movie m
inner join categoryFilm cf on cf.movieID = m.movieID
inner join category ca on ca.categoryID = cf.categoryID
inner join company co on co.companyID = m.companyID
group by co.companyID, ca.categoryID
) t
where rn = 1
order by categoryID
This gives you the top company for each and every category, ties included. If you want to filter on a given category, you can just add a where clause to the inner query.
Side note: do not use natural joins: they are error-prone. I rewrote the query to use inner joins instead (I made a few assumptions on the relations).
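As a rough check of the per-category ranking idea, here is a sketch on an in-memory SQLite database (3.25+ for window functions). The table layout follows the assumed column names used above, the rows are made up, and the counting is done in a subquery before ranking, which is just one way to arrange it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movieID INTEGER PRIMARY KEY, companyID INTEGER);
    CREATE TABLE categoryFilm (movieID INTEGER, categoryID INTEGER);
    INSERT INTO movie VALUES (1, 1), (2, 2), (3, 1), (4, 3), (5, 1);
    -- Category 1: companies 1 and 2 tie with one movie each.
    -- Category 2: company 1 has two movies, company 3 has one.
    INSERT INTO categoryFilm VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
""")

top_per_category = conn.execute("""
    SELECT categoryID, companyID, NumberOfMovies
    FROM (
        SELECT categoryID, companyID, NumberOfMovies,
               RANK() OVER (PARTITION BY categoryID
                            ORDER BY NumberOfMovies DESC) AS rn
        FROM (
            SELECT cf.categoryID, m.companyID, COUNT(*) AS NumberOfMovies
            FROM movie m
            INNER JOIN categoryFilm cf ON cf.movieID = m.movieID
            GROUP BY cf.categoryID, m.companyID
        ) counts
    ) t
    WHERE rn = 1
    ORDER BY categoryID, companyID
""").fetchall()

print(top_per_category)  # [(1, 1, 1), (1, 2, 1), (2, 1, 2)]
```

Category 1 returns both tied companies; category 2 returns the single leader. To restrict the result to one category, a WHERE cf.categoryID = ? in the innermost query would be one place to put the filter.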
I have an "Author" table, containing Authors(Nicknames & IDs).
In the Content table, each item has a field "Author" containing the ID of the author who made it.
I want to select all authors using a SELECT query, and to order them by the amount of Content they created.
This is what I tried so far :
SELECT id,Nickname FROM Authors
WHERE 1 ORDER BY (SELECT COUNT(*) FROM Content WHERE Author=id) ASC
It runs, but the output is invalid - it has no specific order...
Any help is greatly appreciated.
You could use:
SELECT a.id,a.Nickname
FROM Authors a
LEFT JOIN Content c
ON c.Author=a.id
GROUP BY a.id,a.Nickname
ORDER BY COUNT(c.Author) DESC
This should do what you want:
SELECT a.id, a.Nickname
FROM Authors a
WHERE 1
ORDER BY (SELECT COUNT(*) FROM Content c WHERE c.Author = a.id) ASC;
This makes the correlation explicit. Your version would produce unsorted results if Content had an id column -- which is likely.
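The scoping issue can be reproduced with a small sketch on an in-memory SQLite database, which resolves unqualified column names from the innermost query outward in the same way. The sample rows, and the assumption that Content has its own id column, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (id INTEGER PRIMARY KEY, Nickname TEXT);
    -- Assumption: Content has its own id column.
    CREATE TABLE Content (id INTEGER PRIMARY KEY, Author INTEGER);
    INSERT INTO Authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO Content VALUES (10, 1), (11, 2), (12, 2);
""")

# Unqualified: inside the subquery, id resolves to Content.id,
# so the condition compares Content.Author with Content.id.
unqualified = conn.execute("""
    SELECT id, (SELECT COUNT(*) FROM Content WHERE Author = id)
    FROM Authors
    ORDER BY id
""").fetchall()

# Qualified: a.id clearly refers to the outer Authors row.
qualified = conn.execute("""
    SELECT a.id, (SELECT COUNT(*) FROM Content c WHERE c.Author = a.id)
    FROM Authors a
    ORDER BY a.id
""").fetchall()

print(unqualified)  # [(1, 0), (2, 0)]
print(qualified)    # [(1, 1), (2, 2)]
```

With the unqualified name every author gets the same count, so sorting by it produces no meaningful order.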
More commonly, you would want the count in the SELECT, and you would do:
SELECT a.id, a.Nickname, COUNT(c.Author) as num_content
FROM Authors a LEFT JOIN
Content c
ON c.Author = a.id
GROUP BY a.id, a.Nickname
ORDER BY num_content ASC;
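One detail worth checking with a LEFT JOIN is which COUNT you use. A sketch on an in-memory SQLite database with made-up rows shows the difference between COUNT(*) and COUNT(c.Author) for an author who has no content:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (id INTEGER PRIMARY KEY, Nickname TEXT);
    CREATE TABLE Content (id INTEGER PRIMARY KEY, Author INTEGER);
    INSERT INTO Authors VALUES (1, 'ann'), (2, 'bob');
    -- ann has two items, bob has none (made-up data).
    INSERT INTO Content VALUES (10, 1), (11, 1);
""")

rows = conn.execute("""
    SELECT a.Nickname,
           COUNT(*)        AS count_star,    -- counts the NULL-padded row too
           COUNT(c.Author) AS count_author   -- ignores NULLs from the LEFT JOIN
    FROM Authors a
    LEFT JOIN Content c ON c.Author = a.id
    GROUP BY a.id, a.Nickname
    ORDER BY a.id
""").fetchall()

print(rows)  # [('ann', 2, 2), ('bob', 1, 0)]
```

COUNT(*) reports 1 for bob because the LEFT JOIN still produces one row for him, so counting a column from Content is the safer choice here.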
I have a database with three tables,
person: id, bio, name
book: id, id_person, title, info
file: id, id_book, location
Other information: Book is about ~50,000 rows, File is about ~ 300,000 rows.
What I'm trying to do is to select 12 different authors and select just one book and from that book select location from the table file.
What I tried is the following:
SELECT DISTINCT(`person`.`id`), `person`.`name`, `book`.`id`, `book`.`title`, `book`.`info`, `file`.`location`
FROM `person`
INNER JOIN `book`
ON `book`.`id_person` = `person`.`id`
INNER JOIN `file`
ON `file`.`id_book` = `book`.`id`
LIMIT 12
I have learned that DISTINCT does not work the way one might expect. Or am I missing something? The above code returns several books from the same author before moving on to the next one, which is NOT what I want. I want 1 book from each of 12 different authors.
What would be the correct way to retrieve this information from the database? Also, I want to retrieve 12 random people, not people who are stored in consecutive order in the database. I could not formulate any query with rand() since I couldn't even get different authors.
I use MariaDB. And I would appreciate any help, especially help that allows to me do this with great performance.
In MySQL, you can do this, in practice, using GROUP BY:
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`, f.`location`
FROM `person` p INNER JOIN
`book` b
ON b.`id_person` = p.`id` INNER JOIN
`file` f
ON f.id_book = b.id
GROUP BY p.id
ORDER BY rand()
LIMIT 12;
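However, this is not guaranteed to return the non-id values from the same row (although it does in practice), and the query is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). And, although the authors are random, the books and locations are not.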
The SQL Query to do this consistently is a bit more complicated:
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`,
(SELECT f.location
FROM file f
WHERE f.id_book = b.id
ORDER BY rand()
LIMIT 1
) as location
FROM (SELECT p.*,
(SELECT b.id
FROM book b
WHERE b.id_person = p.id
ORDER BY rand()
LIMIT 1
) as book_id
FROM person p
ORDER BY rand()
LIMIT 12
) p INNER JOIN
book b
ON b.id = p.book_id ;
Ok, so assume I have the following tables:
recipes
id (pk)
name
added
modified
recipe_versions
id (pk)
recipe_id (fk to recipes.id)
version
content
added
What I want is a query that grabs the latest recipe_versions row (by recipe_versions.added) and joins it with the base recipe data, then sorts all results by recipes.added ASC. I have the following, but the GROUP BY is not selecting the latest recipe_versions row; it seems to be selecting the first.
SELECT r.`id`,
r.name,
rv.version,
rv.content,
r.added,
r.modified
FROM recipes r,
recipe_versions rv
WHERE r.`id` = rv.recipe_id
GROUP BY rv.recipe_id
HAVING max(rv.added)
ORDER BY r.added ASC
Use this solution:
SELECT
c.*, b.*
FROM
(
SELECT recipe_id, MAX(added) AS mostrecent
FROM recipe_versions
GROUP BY recipe_id
) a
INNER JOIN
recipe_versions b ON
a.recipe_id = b.recipe_id AND
a.mostrecent = b.added
INNER JOIN
recipes c ON a.recipe_id = c.id
ORDER BY
c.added
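Here is a small sketch of that "join back on the per-group maximum" pattern, run on an in-memory SQLite database. The rows are made up and the recipes table is trimmed to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT, added TEXT);
    CREATE TABLE recipe_versions (id INTEGER PRIMARY KEY, recipe_id INTEGER,
                                  version INTEGER, content TEXT, added TEXT);
    INSERT INTO recipes VALUES (1, 'soup', '2020-01-02'), (2, 'cake', '2020-01-01');
    INSERT INTO recipe_versions VALUES
        (1, 1, 1, 'soup v1', '2020-02-01'),
        (2, 1, 2, 'soup v2', '2020-03-01'),
        (3, 2, 1, 'cake v1', '2020-02-15');
""")

# Subquery a: newest "added" per recipe.
# Joining back to recipe_versions on (recipe_id, added) fetches the full row.
latest_versions = conn.execute("""
    SELECT c.name, b.version, b.content
    FROM (
        SELECT recipe_id, MAX(added) AS mostrecent
        FROM recipe_versions
        GROUP BY recipe_id
    ) a
    INNER JOIN recipe_versions b
        ON a.recipe_id = b.recipe_id AND a.mostrecent = b.added
    INNER JOIN recipes c ON a.recipe_id = c.id
    ORDER BY c.added
""").fetchall()

print(latest_versions)  # [('cake', 1, 'cake v1'), ('soup', 2, 'soup v2')]
```

Each recipe appears once with its newest version, sorted by when the recipe itself was added.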
I need some help figuring out a query
I have 3 tables
sources
id, name, rank
origin
id, source_id (FK to sources id), name
One source can have many origins
product
id, origin_id (FK to origin id), name, time_added
One origin can have many products
Now, what I want is to select the most recent products per source, ordered by rank descending
Any suggestions?
This should do as you have requested, though without sample output it's hard to be 100% certain. The inner query selects products together with their source id, ordered by the date added from newest to oldest; that result is then joined to sources and sorted by rank.
SELECT
*
FROM sources AS s
INNER JOIN (
SELECT
origin.source_id,
product.*
FROM origin
INNER JOIN product
ON product.origin_id = origin.id
ORDER BY time_added DESC
) AS productsOrdered
ON productsOrdered.source_id = s.id
ORDER BY s.rank DESC, productsOrdered.time_added DESC
This avoids potentially expensive operations, as the inner select should be pretty fast and can be limited as required.
A typical way of doing this is to:
1. Find the MAX(time_added) for each origin
2. Get the product's id for each of these origins
3. Join with the sources and origin tables to retrieve all columns
Note that this fails if there are origins with multiple records with the exact same time_added.
SQL Statement
SELECT *
FROM sources s
INNER JOIN origin o ON o.source_id = s.id
INNER JOIN product p ON p.origin_id = o.id
INNER JOIN (
SELECT p.id
FROM product p
INNER JOIN (
SELECT origin_id
, MAX(time_added) AS time_added
FROM product p
GROUP BY
origin_id
) pmax ON pmax.origin_id = p.origin_id
AND pmax.time_added = p.time_added
) pmax ON pmax.id = p.id
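The caveat about identical time_added values can be seen in a short sketch on an in-memory SQLite database. The rows are made up, and the tie-break shown (keep the highest id among the tied rows) is only one hypothetical choice, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, origin_id INTEGER,
                          name TEXT, time_added TEXT);
    -- Products 1 and 2 share the newest timestamp of origin 1 (made-up data).
    INSERT INTO product VALUES
        (1, 1, 'a', '2020-05-01 10:00:00'),
        (2, 1, 'b', '2020-05-01 10:00:00'),
        (3, 2, 'c', '2020-04-01 09:00:00');
""")

# Newest timestamp per origin, reused by both queries below.
latest = """
    SELECT origin_id, MAX(time_added) AS time_added
    FROM product
    GROUP BY origin_id
"""

# Joining back on (origin_id, time_added) keeps every tied row.
with_ties = conn.execute(f"""
    SELECT p.origin_id, p.id
    FROM product p
    INNER JOIN ({latest}) pmax
        ON pmax.origin_id = p.origin_id AND pmax.time_added = p.time_added
    ORDER BY p.origin_id, p.id
""").fetchall()

# Hypothetical tie-break: keep the highest id among the tied rows.
one_per_origin = conn.execute(f"""
    SELECT p.origin_id, MAX(p.id) AS id
    FROM product p
    INNER JOIN ({latest}) pmax
        ON pmax.origin_id = p.origin_id AND pmax.time_added = p.time_added
    GROUP BY p.origin_id
    ORDER BY p.origin_id
""").fetchall()

print(with_ties)       # [(1, 1), (1, 2), (2, 3)]
print(one_per_origin)  # [(1, 2), (2, 3)]
```

Origin 1 appears twice in the first result, which is the duplicate the note above warns about.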
SELECT o.id, COUNT(p.id) AS numOfProdFromOrig, MAX(p.time_added) AS latest_time_added, s.rank
FROM product AS p
INNER JOIN origin AS o ON o.id = p.origin_id
INNER JOIN sources AS s ON s.id = o.source_id
GROUP BY o.id, s.rank
ORDER BY s.rank DESC
select b.id, (select p.name from origin o inner join product p
on p.origin_id = o.id where o.source_id = b.id order by p.time_added desc limit 1) as product_name
from sources b;
Try this: