I have an "Author" table, containing Authors(Nicknames & IDs).
In the Content table, each item has a field "Author" containing the ID of the author who made it.
I want to select all authors using a SELECT query, and to order them by the amount of Content they created.
This is what I tried so far :
SELECT id,Nickname FROM Authors
WHERE 1 ORDER BY (SELECT COUNT(*) FROM Content WHERE Author=id) ASC
It runs, but the output is invalid - it has no specific order...
Any help is greatly appreciated.
You could use:
SELECT a.id,a.Nickname
FROM Authors a
LEFT JOIN Content c
ON c.Author=a.id
GROUP BY a.id,a.Nickname
ORDER BY COUNT(c.Author) DESC
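One detail worth checking with a LEFT JOIN is which COUNT form is used: COUNT(*) also counts the NULL-extended row produced for an author with no content, while COUNT(c.Author) does not. A minimal sketch, run against an in-memory SQLite database as a stand-in for MySQL, with sample rows invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (id INTEGER PRIMARY KEY, Nickname TEXT);
    CREATE TABLE Content (id INTEGER PRIMARY KEY, Author INTEGER);
    INSERT INTO Authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- Hypothetical data: ann has 1 item, bob has 2, cy has none
    INSERT INTO Content (Author) VALUES (1), (2), (2);
""")

# Compare both COUNT forms side by side on the same LEFT JOIN.
counts = conn.execute("""
    SELECT a.Nickname, COUNT(*) AS star_count, COUNT(c.Author) AS col_count
    FROM Authors a
    LEFT JOIN Content c ON c.Author = a.id
    GROUP BY a.id, a.Nickname
    ORDER BY a.id
""").fetchall()

for nickname, star_count, col_count in counts:
    print(nickname, star_count, col_count)
```

Here cy has no content, yet COUNT(*) reports 1 for that author; COUNT(c.Author) reports 0.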
This should do what you want:
SELECT a.id, a.Nickname
FROM Authors a
WHERE 1
ORDER BY (SELECT COUNT(*) FROM Content c WHERE c.Author = a.id) ASC;
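This makes the correlation explicit. Your version would produce unsorted results if Content had an id column -- which is likely -- because the unqualified id inside the subquery would then refer to Content.id rather than Authors.id.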
More commonly, you would want the count in the SELECT, and you would do:
SELECT a.id, a.Nickname, COUNT(c.Author) as num_content
FROM Authors a LEFT JOIN
Content c
ON c.Author = a.id
GROUP BY a.id, a.Nickname
ORDER BY num_content ASC;
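The effect of the unqualified id can be reproduced in a small test. The sketch below uses an in-memory SQLite database as a stand-in for MySQL and assumes, as suggested above, that Content has its own id column; the rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (id INTEGER PRIMARY KEY, Nickname TEXT);
    CREATE TABLE Content (id INTEGER PRIMARY KEY, Author INTEGER);
    INSERT INTO Authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- Hypothetical data: bob wrote 3 items, ann 1, cy none
    INSERT INTO Content (id, Author) VALUES (1, 2), (2, 2), (3, 1), (4, 2);
""")

# Unqualified: the inner "id" binds to Content.id, so the subquery is not
# correlated with Authors and yields the same number for every author.
unqualified = conn.execute("""
    SELECT Nickname, (SELECT COUNT(*) FROM Content WHERE Author = id) AS n
    FROM Authors
    ORDER BY Authors.id
""").fetchall()

# Qualified: a.id refers to the outer row, so each author gets a real count.
qualified = conn.execute("""
    SELECT a.Nickname,
           (SELECT COUNT(*) FROM Content c WHERE c.Author = a.id) AS n
    FROM Authors a
    ORDER BY n ASC
""").fetchall()

print(unqualified)
print(qualified)
```

With a constant sort key, the database is free to return the rows in any order, which matches the "no specific order" symptom.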
I need to find the author who has written the most volumes in my database. I have three tables: "VOLUMES", "AUTHOR", and "WRITTEN BY". Table VOLUMES has columns such as title, volume_id (primary key), year, edition_year, etc. Table AUTHOR has columns for generic info such as name, surname, id, etc., and table WRITTEN BY connects AUTHOR and VOLUMES; it contains the columns volume_id and author_id.
My data can contain many copies of the same volumes so I guess I need to group every copy of the same volume.
The query I have written is:
SELECT A.NAME, A.SURNAME, COUNT(V.ID) no_of_volumes
FROM VOLUMES AS V, AUTHOR AS A JOIN WRITTEN_BY W
WHERE (V.ID = W.VOLUME_ID AND A.ID = W.A_ID)
GROUP BY A.NAME, A.SURNAME
ORDER BY no_of_volumes DESC
LIMIT 1;
Now, this prints just one author, but I want it to print EVERY author who has written that same (highest) number of volumes. How do I do that?
If you are allowed to change the schema slightly (use unique ID names), then this could do the trick:
select AUTHOR.NAME
, count(*) VolumeCount
from AUTHOR
inner join WRITTEN_BY using(AUTHOR_ID)
inner join VOLUMES using(VOLUME_ID)
group by AUTHOR.AUTHOR_ID
order by VolumeCount desc
limit 1;
If not, then this should do the trick:
select AUTHOR.NAME
, count(*) VolumeCount
from AUTHOR
inner join WRITTEN_BY on WRITTEN_BY.AUTHOR_ID = AUTHOR.ID
inner join VOLUMES on WRITTEN_BY.VOLUME_ID = VOLUMES.ID
group by AUTHOR.ID
order by VolumeCount desc
limit 1;
Please reserve WHERE for filtering only. Using it for foreign-key join conditions makes for messy code.
(And if your table and field names are uppercase, it makes sense to use lowercase keywords.)
You must aggregate twice inside written_by: once to get the max number of volumes written and then to get the author ids who wrote the max number of volumes.
Then join to authors to get the authors details:
SELECT a.*, t.no_of_volumes
FROM author a INNER JOIN (
SELECT author_id, COUNT(*) no_of_volumes
FROM written_by
GROUP BY author_id
HAVING COUNT(*) = (
SELECT COUNT(*) no_of_volumes
FROM written_by
GROUP BY author_id
ORDER BY no_of_volumes DESC LIMIT 1
)
) t ON t.author_id = a.id
The table volumes is not needed.
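To see the tie handling in action, here is a small sketch that runs the same double aggregation against an in-memory SQLite database (a stand-in for MySQL). The author names and rows are invented, and the select list is narrowed to the name for readability:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE written_by (volume_id INTEGER, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Eco'), (2, 'Calvino'), (3, 'Levi');
    -- Hypothetical data: Eco and Calvino tie at 2 volumes, Levi has 1
    INSERT INTO written_by VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 3);
""")

top_authors = conn.execute("""
    SELECT a.name, t.no_of_volumes
    FROM author a
    INNER JOIN (
        SELECT author_id, COUNT(*) AS no_of_volumes
        FROM written_by
        GROUP BY author_id
        -- keep only the authors whose count equals the highest count
        HAVING COUNT(*) = (
            SELECT COUNT(*)
            FROM written_by
            GROUP BY author_id
            ORDER BY COUNT(*) DESC
            LIMIT 1
        )
    ) t ON t.author_id = a.id
    ORDER BY a.name
""").fetchall()

print(top_authors)
```

Both tied authors come back, whereas a plain LIMIT 1 on the outer query would keep only one of them.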
For MySQL 8.0+ the code can be simplified with the use of the window function RANK():
SELECT a.*, t.no_of_volumes
FROM author a INNER JOIN (
SELECT r.*
FROM (
SELECT author_id, COUNT(*) no_of_volumes,
RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM written_by
GROUP BY author_id
) r
WHERE r.rnk = 1
) t ON t.author_id = a.id
I have two tables: one for image records (posts) and one for like records. I made an INNER JOIN from one table to the other because I need to select each image together with the number of likes it has. I also need to order the images by the number of likes so I can show a top 10 of the most voted images on the site. Here is my query:
SELECT
COUNT(DISTINCT B.votes),
A.id_image,
A.image,
A.title
FROM likes_images AS B INNER JOIN images AS A ON A.id_image = B.id_image
GROUP BY A.title
ORDER BY COUNT(DISTINCT B.votes) ASC
LIMIT 10
It works, but it's only ordering the images by title (alphabetically). I want to order them from the most voted to the least voted.
Any ideas?
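ORDER BY is applied after GROUP BY, so it is not being ignored. A more likely cause is COUNT(DISTINCT B.votes): it counts the distinct values in the votes column, so if every like row stores the same value, every image gets a count of 1 and all rows tie. MySQL versions before 8.0 also sorted GROUP BY results implicitly by the grouping column, which would explain the alphabetical order among the ties.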
So you might try this:
SELECT L.id_image, B.image, B.title, L.likes
FROM (
SELECT COUNT(votes) AS likes, id_image
FROM likes_images
GROUP BY id_image
) AS L
JOIN images B ON B.id_image = L.id_image
ORDER BY L.likes DESC
LIMIT 10
Note that I set ORDER BY to DESC: since you want a top 10, I don't understand why you chose ASC.
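A quick way to check this approach is to run it on a few sample rows. The sketch below uses an in-memory SQLite database as a stand-in for MySQL; the id_like column, the file names, and the assumption that every like row stores votes = 1 are invented for illustration, since the real schema is not shown:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id_image INTEGER PRIMARY KEY, image TEXT, title TEXT);
    CREATE TABLE likes_images (
        id_like INTEGER PRIMARY KEY, id_image INTEGER, votes INTEGER
    );
    INSERT INTO images VALUES
        (1, 'z.png', 'zebra'), (2, 'a.png', 'apple'), (3, 'm.png', 'mango');
    -- Assumed data: one row per like, each storing votes = 1
    INSERT INTO likes_images (id_image, votes) VALUES
        (1, 1), (2, 1), (2, 1), (2, 1), (3, 1), (3, 1);
""")

# COUNT(DISTINCT votes) sees only one distinct value per image.
distinct_counts = conn.execute("""
    SELECT COUNT(DISTINCT votes) FROM likes_images GROUP BY id_image
""").fetchall()

# Counting like rows in a derived table, then joining to images.
top_images = conn.execute("""
    SELECT B.title, L.likes
    FROM (
        SELECT id_image, COUNT(votes) AS likes
        FROM likes_images
        GROUP BY id_image
    ) AS L
    JOIN images B ON B.id_image = L.id_image
    ORDER BY L.likes DESC
    LIMIT 10
""").fetchall()

print(distinct_counts)
print(top_images)
```

Under that assumption the DISTINCT count is 1 for every image, so it cannot rank them, while the row count orders the images by likes.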
Having this database schema (just for illustration purpose)
[articles (id_article, title)]
[articles_tags (id_tag, id_article)]
[tags (id_tag, name)]
using MySQL it's possible to do:
SELECT a.title, COUNT(at.id_tag) tag_count FROM articles a
JOIN articles_tags at ON a.id_article = at.id_article
JOIN tags t ON t.id_tag = at.id_tag
GROUP BY a.id_article
ORDER BY tag_count DESC
resulting in a result where you have on each row article's title and article's tag count, e.g.
mysql for beginner | 8
ajax for dummies | 4
Since ORACLE doesn't support non-aggregated columns in SELECT statement, is it possible to do this anyhow in one query? When you fulfill ORACLE's needs by either adding aggregate function to SELECT statement or adding the column to GROUP BY statement you already get different results.
Thanks in advance
Yes, it's possible. Return id_article in the SELECT list instead of title, wrap that whole query in parentheses to make it an inline view, then select from that view and join it to the articles table to get the associated title.
For example:
SELECT b.title
, c.tag_count
FROM ( SELECT a.id_article
, COUNT(at.id_tag) tag_count
FROM articles a
JOIN articles_tags at ON a.id_article = at.id_article
JOIN tags t ON t.id_tag = at.id_tag
GROUP BY a.id_article
) c
JOIN articles b
ON b.id_article = c.id_article
ORDER BY c.tag_count DESC
You can also evaluate whether you really need the articles table included in the inline view. We could do a GROUP BY at.id_article instead.
I think this returns an equivalent result:
SELECT b.title
, c.tag_count
FROM ( SELECT at.id_article
, COUNT(at.id_tag) tag_count
FROM articles_tags at
JOIN tags t ON t.id_tag = at.id_tag
GROUP BY at.id_article
) c
JOIN articles b
ON b.id_article = c.id_article
ORDER BY c.tag_count DESC
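The claim that the two inline-view variants return the same rows can be checked on sample data. This sketch runs both on an in-memory SQLite database (not Oracle, so it only verifies the logic, not Oracle's syntax rules); the rows are invented, and the alias atg is used in place of at purely as a precaution:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id_article INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id_tag INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles_tags (id_tag INTEGER, id_article INTEGER);
    INSERT INTO articles VALUES (1, 'ajax for dummies'), (2, 'mysql for beginner');
    INSERT INTO tags VALUES (1, 'sql'), (2, 'web'), (3, 'intro');
    -- Hypothetical data: article 2 has three tags, article 1 has one
    INSERT INTO articles_tags VALUES (1, 2), (2, 2), (3, 2), (2, 1);
""")

# Variant 1: the inline view joins articles, articles_tags and tags.
with_articles = conn.execute("""
    SELECT b.title, c.tag_count
    FROM ( SELECT a.id_article, COUNT(atg.id_tag) AS tag_count
           FROM articles a
           JOIN articles_tags atg ON a.id_article = atg.id_article
           JOIN tags t ON t.id_tag = atg.id_tag
           GROUP BY a.id_article ) c
    JOIN articles b ON b.id_article = c.id_article
    ORDER BY c.tag_count DESC
""").fetchall()

# Variant 2: the inline view groups on articles_tags.id_article directly.
without_articles = conn.execute("""
    SELECT b.title, c.tag_count
    FROM ( SELECT atg.id_article, COUNT(atg.id_tag) AS tag_count
           FROM articles_tags atg
           JOIN tags t ON t.id_tag = atg.id_tag
           GROUP BY atg.id_article ) c
    JOIN articles b ON b.id_article = c.id_article
    ORDER BY c.tag_count DESC
""").fetchall()

print(with_articles)
```

Every selected column in the inner query is either grouped or aggregated, which is what a strict GROUP BY implementation requires.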
I have three main items I am storing: Articles, Entities, and Keywords. This makes 5 tables:
article { id }
entity {id, name}
article_entity {id, article_id, entity_id}
keyword {id, name}
article_keyword {id, article_id, keyword_id}
I would like to get all articles that contain the TOP X keywords + entities. I can get the top X keywords or entities with a simple group by on the entity_id/keyword_id.
SELECT [entity|keyword]_id, count(*) as num FROM article_entity
GROUP BY entity_id ORDER BY num DESC LIMIT 10
How would I get all articles that have a relation to the top entities and keywords?
This was what I imagined, but I know it doesn't work because of the group by entity limiting the article_id's that return.
SELECT * FROM article
WHERE EXISTS (
[... where article is mentioned in top X entities.. ]
) AND EXISTS (
[... where article is mentioned in top X keywords.. ]
);
If I understand you correctly, the objective of the query is to find the articles that are related both to one of the top 10 entities and to one of the top 10 keywords. If so, the following query should do that by requiring that each returned article has a match in both the set of top 10 entities and the set of top 10 keywords.
Please give it a try.
SELECT a.id
FROM article a
INNER JOIN article_entity ae ON a.id = ae.article_id
INNER JOIN article_keyword ak ON a.id = ak.article_id
INNER JOIN (
SELECT entity_id, COUNT(article_id) AS article_entity_count
FROM article_entity
GROUP BY entity_id
ORDER BY article_entity_count DESC LIMIT 10
) top_ae ON ae.entity_id = top_ae.entity_id
INNER JOIN (
SELECT keyword_id, COUNT(article_id) AS article_keyword_count
FROM article_keyword
GROUP BY keyword_id
ORDER BY article_keyword_count DESC LIMIT 10
) top_ak ON ak.keyword_id = top_ak.keyword_id
GROUP BY a.id;
The downside to using a simple limit 10 in the two subqueries for top entities/keywords is that it won't handle ties, so if the 11th keyword is just as popular as the 10th it still won't get chosen. This can be fixed by using a ranking function, but afaik MySQL before 8.0 doesn't have anything built in (like the RANK() window functions in Oracle or MSSQL).
I set up a sample SQL Fiddle (but using fewer data points and limit 2 as I'm lazy).
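Without a ranking function, one possible workaround for ties is to look up the count of the Nth most popular keyword and keep every keyword with at least that count. This is a sketch of that idea, not part of the query above; it runs on an in-memory SQLite database with invented rows, and top_n is a hypothetical parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article_keyword (
        id INTEGER PRIMARY KEY, article_id INTEGER, keyword_id INTEGER
    );
    -- Hypothetical data: keyword 1 is used 3 times, keywords 2 and 3 twice,
    -- keyword 4 once
    INSERT INTO article_keyword (article_id, keyword_id) VALUES
        (1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (1, 3), (3, 3), (2, 4);
""")

top_n = 2  # hypothetical "top X" value

# Plain LIMIT: cuts off keyword 3 even though it ties with keyword 2.
plain_limit = conn.execute("""
    SELECT keyword_id, COUNT(*) AS n
    FROM article_keyword
    GROUP BY keyword_id
    ORDER BY n DESC, keyword_id
    LIMIT ?
""", (top_n,)).fetchall()

# Tie-aware: keep every keyword whose count reaches the Nth-highest count.
# If there are fewer than top_n keywords the subquery yields NULL and no
# rows are returned, so that edge case would need separate handling.
with_ties = conn.execute("""
    SELECT keyword_id, COUNT(*) AS n
    FROM article_keyword
    GROUP BY keyword_id
    HAVING COUNT(*) >= (
        SELECT COUNT(*)
        FROM article_keyword
        GROUP BY keyword_id
        ORDER BY COUNT(*) DESC
        LIMIT 1 OFFSET ?
    )
    ORDER BY n DESC, keyword_id
""", (top_n - 1,)).fetchall()

print(plain_limit)
print(with_ties)
```

The tie-aware version can return more than top_n rows, which is the intended behaviour when several keywords share the cutoff count.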
Not knowing the volume of data you are working with, I would first recommend that you have two storage columns on your article table for count of entities and keywords respectively. Then via triggers on adding/deleting from each, update the respective counter columns. This way, you don't have to do a burning query each time needed, especially in a web-based interface. Then, you can just select from the articles table ordered by the E+K counts descending and be done with it, instead of constant sub-querying the underlying tables.
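A minimal sketch of the trigger-maintained counter idea, shown for entities only and run on an in-memory SQLite database (MySQL trigger syntax differs slightly). The entity_count column and the trigger names are hypothetical, not part of the original schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- entity_count is a hypothetical counter column added to article
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        entity_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE article_entity (
        id INTEGER PRIMARY KEY, article_id INTEGER, entity_id INTEGER
    );

    -- Keep the counter in step with inserts ...
    CREATE TRIGGER article_entity_after_insert
    AFTER INSERT ON article_entity
    BEGIN
        UPDATE article SET entity_count = entity_count + 1
        WHERE id = NEW.article_id;
    END;

    -- ... and with deletes.
    CREATE TRIGGER article_entity_after_delete
    AFTER DELETE ON article_entity
    BEGIN
        UPDATE article SET entity_count = entity_count - 1
        WHERE id = OLD.article_id;
    END;
""")

conn.execute("INSERT INTO article (id) VALUES (1), (2)")
conn.execute("""
    INSERT INTO article_entity (article_id, entity_id)
    VALUES (1, 100), (1, 101), (1, 102), (2, 100)
""")
conn.execute("DELETE FROM article_entity WHERE article_id = 1 AND entity_id = 101")

# Reading the ranking is now a plain ordered select, with no aggregation.
ranking = conn.execute("""
    SELECT id, entity_count FROM article ORDER BY entity_count DESC
""").fetchall()
print(ranking)
```

A keyword counter would follow the same pattern on article_keyword; an UPDATE that moves a row to another article would need a third trigger, which this sketch leaves out.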
Now, that said, the other suggestions are somewhat similar to what I am posting, but they all appear to apply a limit of 10 records to each set. Let's throw this scenario into the picture. Say articles 1-20 each have 10, 9 or 8 entities and 1-2 keywords. Then articles 21-50 have the reverse: 10, 9 or 8 keywords and 1-2 entities. Now, articles 51-58 have 7 entities AND 7 keywords, for a total of 14 combined points. None of the queries would catch this, as the entities limit would only return records from articles 1-20 and the keywords limit records from articles 21-50. Articles 51-58 would be so far down each list that they would not even be considered, even though their total is 14.
To handle this, each sub-query is a full query specifically on the article ID and its count. Simple order by the article_ID as that is basis of the join to the master article table.
Now, the coalesce() will get the count if so available, otherwise 0 and add the two values together. From that, the results are ordered with the highest counts first (thus getting scenario sample articles 51-58 plus a few of the others) when the limit is applied.
SELECT
a.id,
coalesce( JustE.ECount, 0 ) ECount,
coalesce( JustK.KCount, 0 ) KCount,
coalesce( JustE.ECount, 0 ) + coalesce( JustK.KCount, 0 ) TotalCnt
from
article a
LEFT JOIN ( select article_id, COUNT(*) as ECount
from article_entity
group by article_id
order by article_id ) JustE
on a.id = JustE.article_id
LEFT JOIN ( select article_id, COUNT(*) as KCount
from article_keyword
group by article_id
order by article_id ) JustK
on a.id = JustK.article_id
order by
coalesce( JustE.ECount, 0 ) + coalesce( JustK.KCount, 0 ) DESC
limit 10
I took this in several steps
tl;dr This shows all the articles from the top (4) keywords and entities:
Here's a fiddle
select
distinct article_id
from
(
select
article_id
from
article_entity ae
inner join
(select
entity_id, count(*)
from
article_entity
group by
entity_id
order by
count(*) desc
limit 4) top_entities on ae.entity_id = top_entities.entity_id
union all
select
article_id
from
article_keyword ak
inner join
(select
keyword_id, count(*)
from
article_keyword
group by
keyword_id
order by
count(*) desc
limit 4) top_keywords on ak.keyword_id = top_keywords.keyword_id) as articles
Explanation:
This starts with an effort to find the top X entities. (4 seemed to work for the number of associations I wanted to make in the fiddle.)
I didn't want to select articles here because it skews the group by, you want to focus solely on the top entities. Fiddle
select
entity_id, count(*)
from
article_entity
group by
entity_id
order by
count(*) desc
limit 4
Then I selected all the articles from these top entities. Fiddle
select
*
from
article_entity ae
inner join
(select
entity_id, count(*)
from
article_entity
group by
entity_id
order by
count(*) desc
limit 4) top_entities on ae.entity_id = top_entities.entity_id
Obviously the same logic needs to happen for the keywords. The queries are then unioned together (fiddle) and the distinct article ids are pulled from the union.
This will give you all articles that have a relation to the top (x) entities and keywords.
This gets the top 10 keyword articles that are also a top 10 entity. You may not get 10 records back because it is possible that an article only meets one of the criteria (top entity but not top keyword or top keyword but not top entity)
select *
from article a
inner join
(select count(*),ae.article_id
from article_entity ae
group by ae.article_id
order by count(*) Desc limit 10) e
on a.id = e.article_id
inner join
(select count(*),ak.article_id
from article_keyword ak
group by ak.article_id
order by count(*) Desc limit 10) k
on a.id = k.article_id
I have an article_views table which holds the number of article views for each day. A new record is created to hold the count for each separate day for each article.
The query below gets the article id and total views for the top 5 viewed article id for all time :
SELECT article_id,
SUM(article_count) as cnt
FROM article_views
GROUP BY article_id
ORDER BY cnt DESC
LIMIT 5
I also have a separate articles table which holds all the article fields. I want to amend the query above to join to the articles table and get two fields for each article id. I have tried to do this below, but the count is coming back incorrectly:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
I'm not sure exactly what I'm doing wrong. Do I need to do a subquery?
Add articles.article_title, articles.artcile_url to the GROUP BY clause:
SELECT
article_views.article_id,
articles.article_title,
articles.artcile_url,
SUM( article_views.article_count ) AS cnt
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id,
articles.article_title,
articles.artcile_url
ORDER BY cnt DESC
LIMIT 5;
The reason you were not getting the correct result set is that when you select columns that are neither listed in the GROUP BY clause nor wrapped in an aggregate function, MySQL picks an arbitrary value from each group.
You are using a MySQL (mis) feature called Hidden Columns, because article title is not in the group by. However, this may or may not be causing your problem.
If the counts are wrong, then I think you have duplicate article_id in the article table. You can check this by doing:
select article_id, count(*) as cnt
from articles
group by article_id
having cnt > 1
If any appear, then that is your problem. If they all have different titles, then grouping by the title (as suggested by Mahmoud) would fix the problem.
If not, one way to fix it is the following:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views INNER JOIN
(select * from articles group by article_id) articles
ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
This chooses an arbitrary title for the article.
Your query looks basically right to me...
But the value returned for cnt is going to depend on the article_id column being UNIQUE in the articles table. (We'd assume that it's the primary key, but absent a schema definition, that's only an assumption.)
Also, we're likely to assume there's a foreign key between the tables, that is, there are no values of article_id in the article_views table which don't match a value of article_id on a row from the articles table.
To check for "orphan" article_id values, run a query like:
SELECT v.article_id
FROM article_views v
LEFT
JOIN articles a
ON a.article_id = v.article_id
WHERE a.article_id IS NULL
To check for "duplicate" article_id values in articles, run a query like:
SELECT a.article_id
FROM articles a
GROUP BY a.article_id
HAVING COUNT(1) > 1
If either of those queries returns rows, that could be an explanation for the behavior you observe.
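To illustrate how each condition distorts the totals, here is a small sketch on an in-memory SQLite database (a stand-in for MySQL). The rows are invented: article 1 is deliberately duplicated in articles, and article 9 has views but no articles row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (article_id INTEGER, article_title TEXT);
    CREATE TABLE article_views (article_id INTEGER, article_count INTEGER);
    -- Hypothetical data: article 1 is duplicated, article 9 is an orphan
    INSERT INTO articles VALUES (1, 'first'), (1, 'first copy'), (2, 'second');
    INSERT INTO article_views VALUES (1, 10), (1, 5), (2, 7), (9, 3);
""")

# Totals from article_views alone: the expected numbers.
true_totals = conn.execute("""
    SELECT article_id, SUM(article_count) AS cnt
    FROM article_views
    GROUP BY article_id
    ORDER BY article_id
""").fetchall()

# Totals after the join: each view row for article 1 matches two articles
# rows, so its sum doubles; article 9 drops out of the inner join.
joined_totals = conn.execute("""
    SELECT v.article_id, SUM(v.article_count) AS cnt
    FROM article_views v
    INNER JOIN articles a ON a.article_id = v.article_id
    GROUP BY v.article_id
    ORDER BY v.article_id
""").fetchall()

duplicates = conn.execute("""
    SELECT a.article_id FROM articles a
    GROUP BY a.article_id HAVING COUNT(1) > 1
""").fetchall()

orphans = conn.execute("""
    SELECT DISTINCT v.article_id
    FROM article_views v
    LEFT JOIN articles a ON a.article_id = v.article_id
    WHERE a.article_id IS NULL
""").fetchall()

print(true_totals, joined_totals, duplicates, orphans)
```

Duplicates inflate a sum, while orphans make an article disappear from the joined result; either would show up as a "wrong" count compared with the unjoined query.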