3 second long queries on 5.8 MB database - mysql

I'm running this query:
SELECT p.*,
UNIX_TIMESTAMP(p.upload_date) upload_date_unix,
ph.*,
c.category_name,
c.slug,
(SELECT Count(vote)
FROM picture_votes
WHERE picture_id = p.picture_id) vote_count
FROM pictures p
LEFT JOIN photographers ph
ON ph.photographer_id = p.photographer_id
LEFT JOIN categories c
ON c.category_id = p.category_id
WHERE p.approved = 1
AND ( p.picture_id = p.album_id
OR p.album_id IS NULL )
GROUP BY p.picture_id
ORDER BY p.upload_date DESC
LIMIT 99
And the query takes ~2-3 seconds. If I remove (SELECT count(vote) FROM picture_votes WHERE picture_id = p.picture_id) vote_count, the query takes about 0.01 seconds. Why does it slow the query down so much? picture_votes has only 25,000 rows.
How can I change the query to include the vote count for every picture?
Here's the EXPLAIN for the query.

Remove your subquery and add one more join on picture_votes:
SELECT p.*,
Unix_timestamp(p.upload_date) upload_date_unix,
ph.*,
c.category_name,
c.slug,
Count(vote) vote_count
FROM pictures p
LEFT JOIN picture_votes pv ON ( p.picture_id = pv.picture_id )
LEFT JOIN photographers ph
ON ph.photographer_id = p.photographer_id
LEFT JOIN categories c
ON c.category_id = p.category_id
WHERE p.approved = 1
AND ( p.picture_id = p.album_id
OR p.album_id IS NULL )
GROUP BY p.picture_id
ORDER BY p.upload_date DESC
LIMIT 99

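Based on the EXPLAIN for the query, picture_votes needs an index. The subquery looks votes up by picture_id, so index that column (an index on (picture_id, vote) would also cover the count).
Whenever you see NULL under possible_keys, it means MySQL found no relevant index to use.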
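For what it's worth, here is a minimal sketch of the join-plus-index idea that can be run without a MySQL server. It uses Python's built-in sqlite3 module, a cut-down version of the two tables, and made-up sample rows; the index name idx_votes_picture and the function name vote_counts are assumptions for illustration, not something from the question.

```python
import sqlite3


def vote_counts():
    """Count votes per picture with a LEFT JOIN instead of a correlated subquery."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE pictures (picture_id INTEGER PRIMARY KEY, upload_date TEXT);
        CREATE TABLE picture_votes (vote INTEGER, picture_id INTEGER);
        -- Index the column that is used to look up the votes of a picture.
        CREATE INDEX idx_votes_picture ON picture_votes (picture_id);
        INSERT INTO pictures VALUES
            (1, '2014-01-01'), (2, '2014-01-02'), (3, '2014-01-03');
        INSERT INTO picture_votes VALUES (1, 1), (1, 1), (1, 2);
    """)
    rows = con.execute("""
        SELECT p.picture_id, COUNT(pv.vote) AS vote_count
        FROM pictures p
        LEFT JOIN picture_votes pv ON pv.picture_id = p.picture_id
        GROUP BY p.picture_id
        ORDER BY p.upload_date DESC
    """).fetchall()
    con.close()
    return rows
```

COUNT(pv.vote) only counts non-NULL values, so a picture without votes still appears, with a vote_count of 0.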

Related

MySQL: Sorting results before group by statement

Basically, I have a coppermine gallery and I want to show the last 4 updated albums on the homepage. Here's the query that I've got so far. It basically gets the latest pictures. The subquery works fine on its own, but when it comes to grouping the rows so that each album appears once, it doesn't seem to pick the most recent picture from the list.
SELECT *
FROM (
SELECT c.cid, c.name AS catname, a.aid, a.title AS albumtitle, a.category, p.aid AS albumid,p.filepath,p.filename,p.ctime AS creationtime,p.title AS pictitle,p.approved
FROM cpg145_pictures AS p LEFT JOIN `cpg145_albums` AS a ON p.aid = a.aid LEFT JOIN `cpg145_categories` AS c ON a.category = c.cid
WHERE p.approved='YES' AND a.category IN (47,48)
ORDER BY p.ctime DESC) AS T
GROUP BY albumid
ORDER BY creationtime DESC
LIMIT 4
I figured out the answer. Apparently, in MariaDB you have to give the subquery a limit for it to be sorted correctly. So:
SELECT *
FROM (
SELECT c.cid, c.name AS catname, a.aid, a.title AS albumtitle, a.category, p.aid AS albumid,p.filepath,p.filename,p.ctime AS creationtime,p.title AS pictitle,p.approved
FROM cpg145_pictures AS p LEFT JOIN `cpg145_albums` AS a ON p.aid = a.aid LEFT JOIN `cpg145_categories` AS c ON a.category = c.cid
WHERE p.approved='YES' AND a.category IN (47,48)
ORDER BY p.ctime DESC
LIMIT 200) AS T
GROUP BY albumid
ORDER BY creationtime DESC
LIMIT 4
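Sorting inside a subquery and then grouping relies on which row the server happens to keep per group, which is not guaranteed. A sketch of an alternative that does not depend on it is to join each album to its own MAX(ctime). The example below runs on Python's built-in sqlite3 with a simplified, hypothetical pictures table (pid, aid, filename, ctime) and invented rows, not the real cpg145_ schema; it also assumes ctime is unique within an album.

```python
import sqlite3


def latest_per_album(limit=4):
    """Return the newest picture of each album, newest album first."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE pictures (
            pid INTEGER PRIMARY KEY, aid INTEGER, filename TEXT, ctime INTEGER
        );
        INSERT INTO pictures VALUES
            (1, 10, 'a_old.jpg', 100),
            (2, 10, 'a_new.jpg', 300),
            (3, 20, 'b_old.jpg', 150),
            (4, 20, 'b_new.jpg', 200);
    """)
    rows = con.execute("""
        SELECT p.aid, p.filename, p.ctime
        FROM pictures p
        JOIN (SELECT aid, MAX(ctime) AS max_ctime
              FROM pictures
              GROUP BY aid) latest
          ON latest.aid = p.aid
         AND latest.max_ctime = p.ctime
        ORDER BY p.ctime DESC
        LIMIT ?
    """, (limit,)).fetchall()
    con.close()
    return rows
```

The derived table latest holds one row per album, so the outer join condition picks exactly the newest picture of each album without needing a LIMIT inside the subquery.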

how to sort and group by its count

I have the following:
SELECT DISTINCT s.username, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY (
s.id
)
ORDER BY cnt DESC
Basically I have an instagram_item_viewer with the following structure:
id viewer_id item_id created
It tracks when a user has viewed an item and at what time. So basically I want to find the shops that have the most item views. I tried the query above and it executed fine, but the counts it returns are lower than they should be. What am I doing wrong?
First, with a group by statement, you don't need the DISTINCT clause. The grouping takes care of making your records distinct.
You may want to reconsider the order of your tables. Since you are interested in the shops, start there.
Select s.username, count(v.id)
From instagram_shop s
INNER JOIN instagram_shop_picture p ON p.shop_id = s.id
INNER JOIN instagram_item_viewer v ON v.item_id = p.id
AND v.created >= '2014-08-01'
WHERE s.expirydate IS NULL
AND s.isLocked = 0
GROUP BY s.username
Give that a shot.
As mentioned by @Lennart, some sample data would be helpful; otherwise we have to make assumptions.
Try running this to debug (this is not the answer yet):
SELECT s.username, p.id, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY s.username, p.id
ORDER BY cnt DESC
The problem here is that the store and the item viewer are too far apart (i.e. bridged via shop_picture), so shop_picture needs to be in the SELECT statement.
Your original query only gets the first shop_picture count for that store, which is why it is less than expected.
Ultimately if you still want to achieve your goal, you can expand my SQL above to
SELECT x.username, SUM(x.cnt) -- or COUNT(x.cnt) depending on what you want
FROM
(
SELECT s.username, p.id, COUNT( v.id ) AS cnt
FROM `instagram_item_viewer` v
INNER JOIN `instagram_shop_picture` p ON v.item_id = p.id
INNER JOIN `instagram_shop` s ON p.shop_id = s.id
AND s.expirydate IS NULL
AND s.isLocked =0
AND v.created >= '2014-08-01'
GROUP BY s.username, p.id
ORDER BY cnt DESC
) x
GROUP BY x.username
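To check the counts against known data, a small reproducible sketch helps. The one below uses Python's built-in sqlite3 with a trimmed-down version of the three tables (the expirydate and isLocked columns are left out) and invented rows, so the names of the function and the sample values are assumptions for illustration only.

```python
import sqlite3


def views_per_shop(since="2014-08-01"):
    """Count item views per shop since a given date, most viewed shop first."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE instagram_shop (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE instagram_shop_picture (id INTEGER PRIMARY KEY, shop_id INTEGER);
        CREATE TABLE instagram_item_viewer (
            id INTEGER PRIMARY KEY, viewer_id INTEGER, item_id INTEGER, created TEXT
        );
        INSERT INTO instagram_shop VALUES (1, 'alpha'), (2, 'beta');
        INSERT INTO instagram_shop_picture VALUES (11, 1), (12, 1), (21, 2);
        INSERT INTO instagram_item_viewer VALUES
            (1, 100, 11, '2014-08-02'),
            (2, 101, 11, '2014-08-03'),
            (3, 100, 12, '2014-08-04'),
            (4, 100, 21, '2014-08-05'),
            (5, 102, 21, '2014-07-01');  -- before the cutoff, must not be counted
    """)
    rows = con.execute("""
        SELECT s.username, COUNT(v.id) AS cnt
        FROM instagram_shop s
        JOIN instagram_shop_picture p ON p.shop_id = s.id
        JOIN instagram_item_viewer v ON v.item_id = p.id
        WHERE v.created >= ?
        GROUP BY s.id
        ORDER BY cnt DESC
    """, (since,)).fetchall()
    con.close()
    return rows
```

With this data, shop alpha has three views spread over two pictures and shop beta has one view after the cutoff, so it is easy to see whether a query variant undercounts.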

mysql query optimization steps or how to optimize query

I don't know much about query optimization but I know the order in which queries get executed
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
This is the query I wrote:
SELECT
`main_table`.forum_id,
my_topics.topic_id,
(
SELECT MAX(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id
) AS `maxpostid`,
(
SELECT my_posts.admin_user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `admin_user_id`,
(
SELECT my_posts.user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `user_id`,
(
SELECT COUNT(my_topics.topic_id) FROM my_topics WHERE my_topics.forum_id = main_table.forum_id ORDER BY my_topics.forum_id DESC LIMIT 1
) AS `topicscount`,
(
SELECT COUNT(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_topics.topic_id DESC LIMIT 1
) AS `postcount`,
(
SELECT CONCAT(admin_user.firstname,' ',admin_user.lastname) FROM admin_user INNER JOIN my_posts ON my_posts.admin_user_id = admin_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `adminname`,
(
SELECT forum_user.nick_name FROM forum_user INNER JOIN my_posts ON my_posts.user_id = forum_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `nickname`,
(
SELECT CONCAT(ce1.value,' ',ce2.value) AS fullname FROM my_posts INNER JOIN customer_entity_varchar AS ce1 ON ce1.entity_id = my_posts.user_id INNER JOIN customer_entity_varchar AS ce2 ON ce2.entity_id=my_posts.user_id WHERE (ce1.attribute_id = 1) AND (ce2.attribute_id = 2) AND my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `fullname`
FROM `my_forums` AS `main_table`
LEFT JOIN `my_topics` ON main_table.forum_id = my_topics.forum_id
WHERE (forum_status = '1')
Now I want to know whether there is any way to optimize it. All the logic is written in the SELECT section rather than in FROM, but I don't know how to write the same logic in the FROM section of the query.
Does it make any difference, or are both the same?
Thanks
Correlated subqueries should really be a last resort. They often end up being executed RBAR (row by agonizing row), and since several of your subqueries are very similar, getting the same result with joins should mean far fewer table scans.
The first thing I note is that almost all of your subqueries include the table my_posts, and most contain ORDER BY my_posts.post_id DESC LIMIT 1. Those that don't are counts with no GROUP BY, so their ORDER BY and LIMIT are redundant anyway. My first step would therefore be to join to my_posts:
SELECT *
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
WHERE forum_status = '1';
Here the subquery just ensures you get the latest post per topic_id. I have shortened your table aliases for my convenience; I am not sure why you would use a table alias that is longer than the actual table name.
Now that you have the bulk of your query, you can start adding in your columns. To get the post count, I have added a count to the subquery Maxp, and I have added a few more joins to get details such as names:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
t2.topicscount,
Maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname,
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id,
MAX(post_id) AS post_id,
COUNT(*) AS postcount
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
LEFT JOIN
( SELECT forum_id, COUNT(*) AS topicscount
FROM my_topics
GROUP BY forum_id
) AS t2
ON t2.forum_id = f.forum_id
WHERE forum_status = '1';
I am not familiar with your schema, so the above may need some tweaking, but the principle remains: use JOINs over sub-selects.
The next stage of optimisation I would do is to get rid of your customer_entity_varchar table, or at least stop using it to store things as basic as first name and last name. The Entity-Attribute-Value model is an SQL antipattern. If you added two columns, FirstName and LastName, to your forum_user table, you would immediately lose two joins from your query. I won't get too involved in the EAV vs Relational debate, as this has been extensively discussed a number of times and I have nothing more to add.
The final stage would be to add appropriate indexes. You are in the best position to decide what is appropriate; I'd suggest you probably want indexes on at least the foreign keys in each table, possibly more.
EDIT
To get one row per forum_id you would need to use the following:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
MaxT.topicscount,
MaxT.postCount AS postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname,
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN
( SELECT t.forum_id,
COUNT(DISTINCT t.topic_id) AS topicscount,
COUNT(*) AS postCount,
MAX(t.topic_ID) AS topic_id
FROM my_topics AS t
INNER JOIN my_posts AS p
ON p.topic_id = t.topic_id
GROUP BY t.forum_id
) AS MaxT
ON MaxT.forum_id = f.forum_id
LEFT JOIN my_topics AS t
ON t.topic_id = MaxT.topic_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
WHERE forum_status = '1';
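To illustrate the rewrite on something small, the sketch below compares a correlated-subquery version with a derived-table join on a reduced my_topics / my_posts schema. It runs on Python's built-in sqlite3 rather than MySQL, the sample rows are invented, and the names SCHEMA, CORRELATED, JOINED and compare are assumptions for illustration only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE my_topics (topic_id INTEGER PRIMARY KEY, forum_id INTEGER);
CREATE TABLE my_posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER, user_id INTEGER);
INSERT INTO my_topics VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO my_posts VALUES (10, 1, 7), (11, 1, 8), (12, 2, 9);
"""

# One correlated subquery per output column: each is evaluated per topic row.
CORRELATED = """
SELECT t.topic_id,
       (SELECT MAX(p.post_id) FROM my_posts p
         WHERE p.topic_id = t.topic_id) AS maxpostid,
       (SELECT COUNT(p.post_id) FROM my_posts p
         WHERE p.topic_id = t.topic_id) AS postcount
FROM my_topics t
ORDER BY t.topic_id
"""

# One pass over my_posts, aggregated per topic, then joined back.
JOINED = """
SELECT t.topic_id,
       mp.post_id AS maxpostid,
       COALESCE(mp.postcount, 0) AS postcount
FROM my_topics t
LEFT JOIN (SELECT topic_id, MAX(post_id) AS post_id, COUNT(*) AS postcount
           FROM my_posts
           GROUP BY topic_id) mp
       ON mp.topic_id = t.topic_id
ORDER BY t.topic_id
"""


def compare():
    """Run both variants on the same data and return their result sets."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    correlated = con.execute(CORRELATED).fetchall()
    joined = con.execute(JOINED).fetchall()
    con.close()
    return correlated, joined
```

One difference worth noting: for a topic with no posts, the LEFT JOIN yields NULL for the count, so the sketch wraps it in COALESCE to match the 0 that the correlated COUNT returns.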

Fetching top 3 from another table for every row in a 3rd table [duplicate]

This question already has answers here:
Using LIMIT within GROUP BY to get N results per group?
I have 3 tables, one which stores pictures and one that stores votes for pictures (pictures and picture_votes). The last table is categories, which stores the different categories a picture can belong to.
Here are the tables (non-relevant columns omitted);
- Table `pictures`
picture_id INT
category_id INT
and
- Table `picture_votes`
vote TINYINT
picture_id INT
and finally
- Table `categories`
category_id INT
What I want to do is to select the top 3 most voted pictures for each category.
I'm really lost and don't know how to do this efficiently.
If you can accept them in one row per category as a comma delimited list:
select pv.category_id,
substring_index(group_concat(pv.picture_id order by numvotes desc), ',', 3) as Top3
from (select p.category_id, p.picture_id, count(*) as numvotes
from picture_votes pv join
pictures p
on p.picture_id = pv.picture_id
group by p.category_id, p.picture_id
) pv
group by pv.category_id;
I came up with this;
(SELECT p.*
FROM
pictures p
LEFT JOIN
picture_votes pv
ON pv.picture_id = p.picture_id
WHERE p.category_id = n
GROUP BY p.picture_id
ORDER BY SUM(pv.vote) DESC
LIMIT 3)
UNION
(SELECT ...)
UNION
(SELECT ...)
-- And so on for every category_id (there are 9)
This seems like a veeeery bad solution and the query takes way too long.
sqlFiddle
SELECT category_id,picture_id,ranking FROM
(
select c.category_id,(select p.picture_id
from pictures p, picture_votes pv
where p.picture_id = pv.picture_id
and p.category_id = c.category_id
group by p.picture_id
order by sum(pv.vote) desc
limit 0,1)as picture_id,1 as ranking
from categories c
union
select c.category_id,(select p.picture_id
from pictures p, picture_votes pv
where p.picture_id = pv.picture_id
and p.category_id = c.category_id
group by p.picture_id
order by sum(pv.vote) desc
limit 1,1)as picture_id,2 as ranking
from categories c
union
select c.category_id,(select p.picture_id
from pictures p, picture_votes pv
where p.picture_id = pv.picture_id
and p.category_id = c.category_id
group by p.picture_id
order by sum(pv.vote) desc
limit 2,1)as picture_id,3 as ranking
from categories c
)result
WHERE picture_id is not null
order by category_id asc,ranking asc
or this sqlFiddle
SELECT picture_id,category_id,sumvotes,voteOrder
FROM
(SELECT picture_id,category_id,sumvotes,
IF(@prevCat <> category_id,@voteOrder:=1,@voteOrder:=@voteOrder+1)
as voteOrder,
@prevCat:=category_id
FROM(SELECT p.picture_id,
p.category_id,
SUM(pv.vote) as sumvotes
FROM pictures p
JOIN picture_votes pv
ON p.picture_id = pv.picture_id
GROUP BY p.picture_id,
p.category_id
ORDER BY p.category_id, sumvotes DESC
)as ppv,
(SELECT @prevCat:=0,@voteOrder:=0)pc
)finalTable
WHERE voteOrder BETWEEN 1 AND 3
ORDER BY category_id ASC, voteOrder ASC
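On servers with window functions (MySQL 8.0+, MariaDB 10.2+), the same ranking can be sketched with ROW_NUMBER() instead of user variables. This is a swapped-in technique, not what the answers above use. The example runs on Python's built-in sqlite3 (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added) with invented sample rows; top_n_per_category and vote_order are hypothetical names.

```python
import sqlite3


def top_n_per_category(n=3):
    """Return the n pictures with the highest vote sum in each category."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE pictures (picture_id INTEGER PRIMARY KEY, category_id INTEGER);
        CREATE TABLE picture_votes (vote INTEGER, picture_id INTEGER);
        INSERT INTO pictures VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2);
        INSERT INTO picture_votes VALUES (5, 1), (3, 2), (4, 2), (1, 3), (2, 4), (9, 5);
    """)
    rows = con.execute("""
        SELECT category_id, picture_id, sumvotes, vote_order
        FROM (
            SELECT category_id, picture_id, sumvotes,
                   -- Restart the numbering for every category; ties broken by id.
                   ROW_NUMBER() OVER (PARTITION BY category_id
                                      ORDER BY sumvotes DESC, picture_id) AS vote_order
            FROM (
                SELECT p.category_id, p.picture_id, SUM(pv.vote) AS sumvotes
                FROM pictures p
                JOIN picture_votes pv ON pv.picture_id = p.picture_id
                GROUP BY p.category_id, p.picture_id
            ) totals
        ) ranked
        WHERE vote_order <= ?
        ORDER BY category_id, vote_order
    """, (n,)).fetchall()
    con.close()
    return rows
```

The innermost query sums the votes per picture, the middle one numbers the pictures within each category, and the outer one keeps the first n. Unlike the user-variable version, the result does not depend on the order in which the server evaluates expressions in the SELECT list.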

how do I fix this LEFT JOIN query?

I have three tables:
products (product_id, title)
comments (comment_id, product_id, user_id, comment, post_date)
bookmarks (user_id, product_id, read_date)
For each product_id in the products table, I wish to retrieve the number of comments with the same product_id, and whose post_date value is greater than the read_date value for the row in the bookmarks table that shares this product_id, and has user_id=22.
If such a row does not exist in the bookmarks table, I want to retrieve the total number of comments for that product_id regardless of read_date.
So far I have
SELECT p.product_id, COUNT( c.comment_id ) comment_count
FROM products p
LEFT JOIN bookmarks b, comments c ON b.product_id = c.product_id
AND b.user_id =22
AND (
c.post_date > b.read_date
)
AND p.product_id = c.product_id
GROUP BY c.product_id
ORDER BY comment_count DESC
This does not give me the expected results. How can I modify it to make it do what I want?
Will this work for you?
SELECT p.product_id,
COUNT(CASE
WHEN b.read_date IS NOT NULL AND c.post_date >b.read_date THEN c.comment_id
WHEN b.read_date IS NULL THEN c.comment_id
ELSE NULL -- optional, CASE has default ELSE NULL
END) as comment_count
FROM products p
LEFT JOIN bookmarks b ON (b.product_id = p.product_id AND b.user_id=22)
LEFT JOIN comments c ON (p.product_id = c.product_id)
GROUP BY p.product_id
ORDER BY comment_count DESC
UPDATE
GROUP BY c.product_id changed to GROUP BY p.product_id
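A small runnable check of the conditional count, assuming the three tables from the question and some invented rows. It uses Python's built-in sqlite3 rather than MySQL, stores dates as ISO strings so they compare correctly as text, and folds the two WHEN branches into one condition; unread_comment_counts is a hypothetical name.

```python
import sqlite3


def unread_comment_counts(user_id=22):
    """Count comments newer than the user's bookmark, or all comments if none."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE comments (
            comment_id INTEGER PRIMARY KEY, product_id INTEGER,
            user_id INTEGER, comment TEXT, post_date TEXT
        );
        CREATE TABLE bookmarks (user_id INTEGER, product_id INTEGER, read_date TEXT);
        INSERT INTO products VALUES
            (1, 'bookmarked'), (2, 'not bookmarked'), (3, 'no comments');
        INSERT INTO comments VALUES
            (1, 1, 5, 'old', '2014-01-01'),
            (2, 1, 5, 'new', '2014-03-01'),
            (3, 2, 5, 'a',   '2014-01-01'),
            (4, 2, 6, 'b',   '2014-01-02');
        -- User 99's bookmark must not affect the counts for user 22.
        INSERT INTO bookmarks VALUES (22, 1, '2014-02-01'), (99, 2, '2014-01-01');
    """)
    rows = con.execute("""
        SELECT p.product_id,
               COUNT(CASE WHEN b.read_date IS NULL OR c.post_date > b.read_date
                          THEN c.comment_id END) AS comment_count
        FROM products p
        LEFT JOIN bookmarks b ON b.product_id = p.product_id AND b.user_id = ?
        LEFT JOIN comments c ON c.product_id = p.product_id
        GROUP BY p.product_id
        ORDER BY comment_count DESC, p.product_id
    """, (user_id,)).fetchall()
    con.close()
    return rows
```

Putting the user_id filter in the ON clause rather than in WHERE is what keeps products without a bookmark in the result; their read_date is NULL, so every comment is counted.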
Maybe this will work for you, or at least point you in the right direction.
SELECT p.product_id, COUNT( c.comment_id ) comment_count
FROM products p
LEFT JOIN comments c on c.product_id = p.product_id
LEFT JOIN bookmarks b on b.product_id = c.product_id
AND b.user_id = 22
WHERE (p.product_id IN (
SELECT product_id
FROM bookmarks
WHERE user_id = 22
)
AND
c.post_date > b.read_date
)
OR
p.product_id NOT IN (
SELECT product_id
FROM bookmarks
WHERE user_id = 22
)
GROUP BY p.product_id
ORDER BY comment_count DESC