This is really doing my head in; I thought it would be a simple query.
blog table:
blog_id
blog_name
blog_copy
comments table:
comment_id
comment_copy
comment_by
blog_id
I want to show blog items where there are more than 3 comments and also order them by the volume of replies.
I tried many queries including this but it just doesn't work:
SELECT
*, blog_id as BID ,
(SELECT blog_id
FROM comments
WHERE blog = BID HAVING COUNT(*) > 3) AS t2
FROM blog
WHERE mostcomments > '3'
ORDER by mostcomments ASC
It says that mostcomments doesn't exist. I've tried it other ways that do execute, but they count comments across all blogs rather than per blog_id.
You need to use GROUP BY:
select b.blog_id, count(*)
from blog b
join comments c on b.blog_id = c.blog_id
group by b.blog_id
having count(*) > 3
order by count(*) desc
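To check the GROUP BY / HAVING approach on a small data set, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, and the sample rows are made up; the threshold is "more than 3 comments", as in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (blog_id INTEGER PRIMARY KEY, blog_name TEXT, blog_copy TEXT);
    CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, comment_copy TEXT,
                           comment_by TEXT, blog_id INTEGER);
    INSERT INTO blog VALUES (1, 'first', ''), (2, 'second', ''), (3, 'third', '');
""")
# Blog 1 gets 5 comments, blog 2 gets 4, blog 3 gets 3.
for blog_id, n in [(1, 5), (2, 4), (3, 3)]:
    conn.executemany(
        "INSERT INTO comments (comment_copy, comment_by, blog_id) "
        "VALUES ('text', 'someone', ?)",
        [(blog_id,)] * n,
    )

# One group per blog; HAVING filters on the per-blog count.
rows = conn.execute("""
    SELECT b.blog_id, COUNT(*) AS comment_count
    FROM blog b
    JOIN comments c ON b.blog_id = c.blog_id
    GROUP BY b.blog_id
    HAVING COUNT(*) > 3
    ORDER BY comment_count DESC
""").fetchall()
print(rows)
```

Blog 3 has exactly 3 comments, so it is filtered out by HAVING, and the remaining blogs come back with the most commented one first.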
I'm trying to show the most commented posts on my site, but it seems impossible. I keep getting this MySQL error: Can't group on 'comments'.
the relation between tables is:
table: post / columns: id_post, title, id_comment
table: comment / columns: id_comment, text, id_post
and this is the query I'm trying to use:
SELECT p.title AS title, COUNT(c.id_comment) AS comments
FROM post p
INNER JOIN comment c ON p.id_post=c.id_post
GROUP BY comments DESC
Is there any alternative or solution for this?
Why would you want to group by the number of comments? You need to group by post and use an ORDER BY clause to bring the most commented posts to the top:
SELECT p.title AS title, COUNT(c.id_comment) AS comments
FROM post p
INNER JOIN comment c ON p.id_post=c.id_post
GROUP BY p.id_post, p.title
ORDER BY comments DESC
You may want to have a limit clause as well to get the top N commented posts only.
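A minimal sketch of the LIMIT variant, run against an in-memory SQLite database via Python's sqlite3 as a stand-in for MySQL. The sample rows and the top_n value are hypothetical, and the post table here leaves out the id_comment column from the question because the query does not use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id_post INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (id_comment INTEGER PRIMARY KEY, text TEXT, id_post INTEGER);
    INSERT INTO post VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO comment (text, id_post) VALUES
        ('a', 1), ('b', 2), ('c', 2), ('d', 3), ('e', 3), ('f', 3);
""")

top_n = 2  # hypothetical: how many of the most commented posts to keep
top_posts = conn.execute("""
    SELECT p.title AS title, COUNT(c.id_comment) AS comments
    FROM post p
    INNER JOIN comment c ON p.id_post = c.id_post
    GROUP BY p.id_post, p.title
    ORDER BY comments DESC
    LIMIT ?
""", (top_n,)).fetchall()
print(top_posts)
```

The LIMIT is applied after grouping and ordering, so it cuts the already ranked list rather than the raw comment rows.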
There are two tables: posts (~5,000,000 rows) and relations (~8,000 rows).
posts columns:
-------------------------------------------------
| id | source_id | content | date (int) |
-------------------------------------------------
relations columns:
---------------------------
| source_id | user_id |
---------------------------
I wrote a MySQL query for getting 10 most recent rows from posts which are related to a specific user:
SELECT p.id, p.content
FROM posts AS p
LEFT JOIN relations AS r
ON r.source_id = p.source_id
WHERE r.user_id = 1
ORDER BY p.date DESC
LIMIT 10
However, it takes ~30 seconds to execute it.
I already have indexes on relations for (source_id, user_id) and (user_id), and on posts for (source_id), (date) and (date, source_id).
EXPLAIN results:
How can I optimize the query?
Your WHERE clause renders your outer join a mere inner join (because in an outer-joined pseudo record user_id will always be null, never 1).
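A small sketch of that effect, using Python's sqlite3 with an in-memory database as a stand-in for MySQL (the rows are made up). The unfiltered LEFT JOIN keeps the unmatched post with a NULL user_id, but once WHERE r.user_id = 1 is added the result is the same as with an INNER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, source_id INTEGER, content TEXT, date INTEGER);
    CREATE TABLE relations (source_id INTEGER, user_id INTEGER);
    INSERT INTO posts VALUES (1, 10, 'a', 100), (2, 20, 'b', 200), (3, 30, 'c', 300);
    INSERT INTO relations VALUES (10, 1), (20, 2);  -- source 30 has no relation
""")

query = """
    SELECT p.id, r.user_id
    FROM posts AS p
    {join} relations AS r ON r.source_id = p.source_id
    {where}
    ORDER BY p.id
"""
left_all = conn.execute(query.format(join="LEFT JOIN", where="")).fetchall()
left_filtered = conn.execute(
    query.format(join="LEFT JOIN", where="WHERE r.user_id = 1")).fetchall()
inner_filtered = conn.execute(
    query.format(join="INNER JOIN", where="WHERE r.user_id = 1")).fetchall()

print(left_all)        # post 3 is kept, with user_id NULL (None in Python)
print(left_filtered)   # the NULL row cannot satisfy user_id = 1
print(inner_filtered)
```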
If you really want this to be an outer join, then the join is completely superfluous, because every record in posts either has a match in relations or does not, and is returned either way. Your query would then be:
select id, content
from posts
order by `date` desc limit 10;
If you don't actually want an outer join, but only posts that have a match in relations, then you are testing for existence in a table, which calls for an IN or EXISTS clause:
select id, content
from posts
where source_id in
(
select source_id
from relations
where user_id = 1
)
order by `date` desc
limit 10;
There should be an index on relations(user_id, source_id) - in this order, so we can select user_id 1 first and get an array of all desired source_id which we then look up.
Of course you also need an index on posts(source_id) which you probably have already, as source_id is an ID. You can even speed things up with a composite index posts(source_id, date, id, content), so the table itself doesn't have to be read anymore - all the information needed is in the index already.
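As a rough sketch, the two suggested indexes could be created as shown below. This runs on an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL; the index names and sample rows are made up. One caveat to check on a real MySQL server: if content is a TEXT or BLOB column, MySQL only accepts it in an index with a prefix length, so the fully covering variant may not be available there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, source_id INTEGER, content TEXT, date INTEGER);
    CREATE TABLE relations (source_id INTEGER, user_id INTEGER);

    -- user_id first, so all source_id values for one user sit together
    CREATE INDEX idx_relations_user_source ON relations (user_id, source_id);
    -- composite index holding every column the query reads from posts
    CREATE INDEX idx_posts_covering ON posts (source_id, date, id, content);

    INSERT INTO posts VALUES (1, 10, 'old', 100), (2, 10, 'new', 300),
                             (3, 20, 'other user', 400), (4, 30, 'mid', 200);
    INSERT INTO relations VALUES (10, 1), (30, 1), (20, 2);
""")

latest = conn.execute("""
    SELECT id, content
    FROM posts
    WHERE source_id IN (SELECT source_id FROM relations WHERE user_id = 1)
    ORDER BY date DESC
    LIMIT 10
""").fetchall()
# Column 1 of each PRAGMA index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('posts')")]
print(latest)
print(index_names)
```

Whether the optimizer actually uses these indexes depends on the engine and the data, so it is worth confirming with EXPLAIN on the real tables.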
UPDATE: Here is the related EXISTS query:
select id, content
from posts p
where exists
(
select *
from relations r
where r.user_id = 1
and r.source_id = p.source_id
)
order by `date` desc
limit 10;
You could put an index on the date column of the posts table, I believe that will help the order-by speed.
You could also try reducing the number of rows before ordering by adding further WHERE conditions. For example, if you know that there will likely be ten records with the correct user_id today, you could limit the date to just today (or N days back, depending on your actual data).
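A minimal sketch of the date-window idea, assuming the int date column holds Unix timestamps in seconds (the question does not say). It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and a fixed "now" value so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, source_id INTEGER, content TEXT, date INTEGER);
    CREATE TABLE relations (source_id INTEGER, user_id INTEGER);
    INSERT INTO relations VALUES (10, 1);
""")

now = 1_700_000_000   # hypothetical fixed "current time" (Unix seconds)
day = 24 * 60 * 60
conn.executemany(
    "INSERT INTO posts (source_id, content, date) VALUES (10, ?, ?)",
    [("two days ago", now - 2 * day), ("an hour ago", now - 3600), ("just now", now - 5)],
)

# Only rows from the last day are considered before sorting.
recent = conn.execute("""
    SELECT p.id, p.content
    FROM posts AS p
    JOIN relations AS r ON r.source_id = p.source_id
    WHERE r.user_id = 1
      AND p.date >= ?
    ORDER BY p.date DESC
    LIMIT 10
""", (now - day,)).fetchall()
print(recent)
```

The trade-off is that a window that is too narrow can return fewer than 10 rows, so the cutoff has to be chosen from what you know about the data.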
Try this:
SELECT p.id, p.content FROM posts AS p
WHERE p.source_id IN (SELECT source_id FROM relations WHERE user_id = 1)
ORDER BY p.date DESC
LIMIT 10
I'd consider the following:
Firstly, you only want the 10 most recent rows from posts which are related to a user. So, an INNER JOIN should do just fine.
SELECT p.id, p.content
FROM posts AS p
JOIN relations AS r
ON r.source_id = p.source_id
WHERE r.user_id = 1
ORDER BY p.date DESC
LIMIT 10
The LEFT JOIN is only needed if you want to fetch records that have no relations mapping. With the LEFT JOIN, the query can end up doing a full scan of the left table, which by your account contains ~5,000,000 rows. This could be the root cause of your slow query.
For further optimisation, consider moving the WHERE clause into the ON clause.
SELECT p.id, p.content
FROM posts AS p
JOIN relations AS r
ON (r.source_id = p.source_id AND r.user_id = 1)
ORDER BY p.date DESC
LIMIT 10
I would try a composite index on relations:
INDEX source_user (user_id,source_id)
and change the query to this:
SELECT p.id, p.content
FROM posts AS p
INNER JOIN relations AS r
ON ( r.user_id = 1 AND r.source_id = p.source_id )
ORDER BY p.date DESC
LIMIT 10
I am trying to find the top 10 authors who have written the most books.
I have two tables as follows:
Author table:
author_name
publisher_key
Publication table:
publisher_id
publisher_key
title
year
pageno
To find the result, I tried using the following query:
SELECT a.author_name, SUM(p.pageno)
FROM author a JOIN publication p ON a.publisher_key = p.publisher_key
GROUP BY a.author_name
LIMIT 10;
I have no idea why this query takes ages to run when there are only 200 records.
Try:
SELECT a.author_name, count(*)
FROM author a
INNER JOIN publication p ON a.publisher_key = p.publisher_key
GROUP BY a.author_name
ORDER BY 2 desc
LIMIT 10;
You want to know who wrote the most books, so you need to count the number of rows per author, not sum the page numbers.
ORDER BY 2 DESC sorts the result from the largest count to the smallest; the 2 refers to the second column in the select list.
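Here is a small sketch of both points, using Python's sqlite3 with an in-memory database as a stand-in (SQLite also accepts a column position in ORDER BY). The sample rows and the alias total are made up. Ordering by position 2 gives the same result as ordering by the alias, and SUM(p.pageno) ranks the authors differently from COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (author_name TEXT, publisher_key INTEGER);
    CREATE TABLE publication (publisher_id INTEGER, publisher_key INTEGER,
                              title TEXT, year INTEGER, pageno INTEGER);
    INSERT INTO author VALUES ('Ann', 1), ('Bob', 2);
    INSERT INTO publication VALUES (1, 1, 'A1', 2001, 900),
                                   (2, 2, 'B1', 2002, 100),
                                   (3, 2, 'B2', 2003, 100),
                                   (4, 2, 'B3', 2004, 100);
""")

base = """
    SELECT a.author_name, {agg} AS total
    FROM author a
    INNER JOIN publication p ON a.publisher_key = p.publisher_key
    GROUP BY a.author_name
    ORDER BY {order} DESC
    LIMIT 10
"""
by_position = conn.execute(base.format(agg="COUNT(*)", order="2")).fetchall()
by_alias = conn.execute(base.format(agg="COUNT(*)", order="total")).fetchall()
by_pages = conn.execute(base.format(agg="SUM(p.pageno)", order="2")).fetchall()

print(by_position)  # ranked by number of books
print(by_pages)     # ranked by total pages: a different question
```

Using the alias instead of the position is a matter of taste, but it keeps working if the select list is later reordered.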
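If, as ydoow suggested, this may be a locking issue, try running your SELECT without locking reads to confirm. Note that WITH (NOLOCK) is SQL Server syntax; assuming you are on MySQL (as the LIMIT clause and the link below suggest), the closest equivalent is the READ UNCOMMITTED isolation level:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT a.author_name, count(*)
FROM author a
INNER JOIN publication p ON a.publisher_key = p.publisher_key
GROUP BY a.author_name
ORDER BY 2 desc
LIMIT 10;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;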
You can get more info here:
Any way to select without causing locking in MySQL?
I have an article_views table which holds the number of article views for each day. A new record is created to hold the count for each separate day for each article.
The query below gets the article id and total views for the 5 most viewed articles of all time:
SELECT article_id,
SUM(article_count) as cnt
FROM article_views
GROUP BY article_id
ORDER BY cnt DESC
LIMIT 5
I also have a separate articles table which holds all the article fields. I want to amend the query above to join to the articles table and get two fields for each article id. I have tried to do this below, but the count is coming back incorrectly:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
I'm not sure exactly what I'm doing wrong. Do I need to do a subquery?
Add articles.article_title, articles.artcile_url to the GROUP BY clause:
SELECT
article_views.article_id,
articles.article_title,
articles.artcile_url,
SUM( article_views.article_count ) AS cnt
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id,
articles.article_title,
articles.artcile_url
ORDER BY cnt DESC
LIMIT 5;
The reason you were not getting the correct result set is that when you select columns that are neither in the GROUP BY clause nor inside an aggregate function, MySQL picks an arbitrary value for them.
You are using a MySQL (mis)feature called Hidden Columns, because the article title is not in the GROUP BY. However, this may or may not be causing your problem.
If the counts are wrong, then I think you have duplicate article_id values in the articles table. You can check this by doing:
select article_id, count(*) as cnt
from articles
group by article_id
having cnt > 1
If any appear, then that is your problem. If they all have different titles, then grouping by the title (as suggested by Mahmoud) would fix the problem.
If not, one way to fix it is the following:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views INNER JOIN
(select a.* from articles a group by a.article_id) articles
ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
This chooses an arbitrary title for the article.
Your query looks basically right to me...
But the value returned for cnt is going to depend on the article_id column being UNIQUE in the articles table. (We'd assume that it's the primary key, but absent a schema definition, that's only an assumption.)
Also, we're likely to assume there's a foreign key between the tables, that is, there are no values of article_id in the article_views table which don't match a value of article_id on a row from the articles table.
To check for "orphan" article_id values, run a query like:
SELECT v.article_id
FROM article_views v
LEFT
JOIN articles a
ON a.article_id = v.article_id
WHERE a.article_id IS NULL
To check for "duplicate" article_id values in articles, run a query like:
SELECT a.article_id
FROM articles a
GROUP BY a.article_id
HAVING COUNT(1) > 1
If either of those queries returns rows, that could be an explanation for the behavior you observe.
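To illustrate the duplicate case, here is a minimal sketch with made-up rows, using Python's sqlite3 and an in-memory database as a stand-in for MySQL. Article 1 is deliberately stored twice in articles, so the join doubles every one of its view rows and the SUM comes out twice as large.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article_views (article_id INTEGER, article_count INTEGER);
    CREATE TABLE articles (article_id INTEGER, article_title TEXT, artcile_url TEXT);
    INSERT INTO article_views VALUES (1, 10), (1, 5), (2, 7);
    -- article 1 appears twice in articles: the suspected problem
    INSERT INTO articles VALUES (1, 'One', '/one'), (1, 'One', '/one'), (2, 'Two', '/two');
""")

plain_totals = conn.execute("""
    SELECT article_id, SUM(article_count)
    FROM article_views
    GROUP BY article_id
    ORDER BY article_id
""").fetchall()

joined_totals = conn.execute("""
    SELECT v.article_id, SUM(v.article_count)
    FROM article_views v
    INNER JOIN articles a ON a.article_id = v.article_id
    GROUP BY v.article_id
    ORDER BY v.article_id
""").fetchall()

duplicates = conn.execute("""
    SELECT article_id, COUNT(*)
    FROM articles
    GROUP BY article_id
    HAVING COUNT(*) > 1
""").fetchall()

print(plain_totals)   # totals from article_views alone
print(joined_totals)  # article 1 is doubled by the duplicate row
print(duplicates)     # the duplicate check flags article 1
```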
I have a table containing blog posts by many different authors. What I'd like to do is show the most recent post by each of the 10 most recent authors.
Each author's posts are simply added to the table in order, which means there could be runs of posts by a single author. I'm having a heck of a time coming up with a single query to do this.
This gives me the last 10 unique author IDs; can it be used as a sub-select to grab the most recent post by each author?
SELECT DISTINCT userid
FROM posts
ORDER BY postid DESC
LIMIT 10
select userid,postid, win from posts where postid in (
SELECT max(postid) as postid
FROM posts
GROUP BY userid
)
ORDER BY postid desc
limit 10
http://sqlfiddle.com/#!2/09e25/1
You need a subquery for the last postid of every author and order by postid DESC. Then, join that result to the posts table:
SELECT B.* FROM
(
SELECT * FROM
(
SELECT userid,MAX(postid) postid
FROM posts GROUP BY userid
) AA
ORDER BY postid DESC
LIMIT 10
) A INNER JOIN posts B
USING (userid,postid);
Make sure you have this index:
ALTER TABLE posts ADD INDEX user_post_ndx (userid,postid);
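The subquery-then-join idea can be tried out on a few rows as follows. This is only a sketch: it runs on an in-memory SQLite database via Python's sqlite3 instead of MySQL, the body column and the sample rows are hypothetical, the index is created with CREATE INDEX rather than ALTER TABLE, and the limit is 2 instead of 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (postid INTEGER PRIMARY KEY, userid INTEGER, body TEXT);
    CREATE INDEX user_post_ndx ON posts (userid, postid);
    INSERT INTO posts (postid, userid, body) VALUES
        (1, 7, 'u7 first'), (2, 7, 'u7 second'), (3, 8, 'u8 only'),
        (4, 9, 'u9 first'), (5, 7, 'u7 third'), (6, 9, 'u9 second');
""")

latest_per_author = conn.execute("""
    SELECT b.postid, b.userid, b.body
    FROM (
        -- last post of every author, most recent authors first
        SELECT userid, MAX(postid) AS last_postid
        FROM posts
        GROUP BY userid
        ORDER BY last_postid DESC
        LIMIT 2
    ) a
    INNER JOIN posts b ON b.userid = a.userid AND b.postid = a.last_postid
    ORDER BY b.postid DESC
""").fetchall()
print(latest_per_author)
```

User 7 has a run of posts, but only the latest one (postid 5) is returned, and user 8 drops out because two other authors posted more recently.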
SELECT userid
, MAX(postid) AS lastpostid
FROM posts
GROUP BY userid
ORDER BY lastpostid DESC
LIMIT 10