How do I fix this LEFT JOIN query? - MySQL

I have three tables:
products (product_id, title)
comments (comment_id, product_id, user_id, comment, post_date)
bookmarks (user_id, product_id, read_date)
For each product_id in the products table, I wish to retrieve the number of comments with the same product_id, and whose post_date value is greater than the read_date value for the row in the bookmarks table that shares this product_id, and has user_id=22.
If such a row does not exist in the bookmarks table, I want to retrieve the total number of comments for that product_id regardless of read_date.
So far I have
SELECT p.product_id, COUNT( c.comment_id ) comment_count
FROM products p
LEFT JOIN bookmarks b, comments c ON b.product_id = c.product_id
AND b.user_id =22
AND (
c.post_date > b.read_date
)
AND p.product_id = c.product_id
GROUP BY c.product_id
ORDER BY comment_count DESC
This does not give me the expected results. How can I modify it to make it do what I want?

Will this work for you?
SELECT p.product_id,
COUNT(CASE
WHEN b.read_date IS NOT NULL AND c.post_date > b.read_date THEN c.comment_id
WHEN b.read_date IS NULL THEN c.comment_id
ELSE NULL -- optional, CASE has default ELSE NULL
END) as comment_count
FROM products p
LEFT JOIN bookmarks b ON (b.product_id = p.product_id AND b.user_id=22)
LEFT JOIN comments c ON (p.product_id = c.product_id)
GROUP BY p.product_id
ORDER BY comment_count DESC
UPDATE
GROUP BY c.product_id changed to GROUP BY p.product_id
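To see how the conditional COUNT behaves, here is a small sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the sample rows, the unread_counts helper name and the extra tie-breaker in ORDER BY are my own assumptions, not part of the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, product_id INTEGER,
                       user_id INTEGER, comment TEXT, post_date TEXT);
CREATE TABLE bookmarks (user_id INTEGER, product_id INTEGER, read_date TEXT);

-- Made-up sample data:
-- product 1: three comments, user 22 last read on 2020-01-15
-- product 2: two comments, bookmarked only by another user (99)
-- product 3: no comments at all
INSERT INTO products VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO comments VALUES
  (1, 1, 5, 'old',   '2020-01-10'),
  (2, 1, 6, 'new',   '2020-01-20'),
  (3, 1, 7, 'newer', '2020-01-25'),
  (4, 2, 5, 'x',     '2020-01-05'),
  (5, 2, 6, 'y',     '2020-01-06');
INSERT INTO bookmarks VALUES (22, 1, '2020-01-15'), (99, 2, '2020-02-01');
""")


def unread_counts(conn, user_id):
    """Count comments newer than the user's bookmark, or all comments if none."""
    sql = """
    SELECT p.product_id,
           COUNT(CASE
                   WHEN b.read_date IS NULL THEN c.comment_id
                   WHEN c.post_date > b.read_date THEN c.comment_id
                 END) AS comment_count
    FROM products p
    LEFT JOIN bookmarks b ON b.product_id = p.product_id AND b.user_id = ?
    LEFT JOIN comments c ON c.product_id = p.product_id
    GROUP BY p.product_id
    ORDER BY comment_count DESC, p.product_id
    """
    return conn.execute(sql, (user_id,)).fetchall()


print(unread_counts(conn, 22))
```

Because the user_id filter sits in the ON clause of the bookmarks join rather than in WHERE, products that user 22 never bookmarked still appear, and products without comments come back with a count of 0.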

Maybe this will work for you, or at least point you in the right direction.
SELECT p.product_id, COUNT( c.comment_id ) comment_count
FROM products p
LEFT JOIN comments c on c.product_id = p.product_id
LEFT JOIN bookmarks b on b.product_id = c.product_id AND b.user_id = 22
WHERE (p.product_id IN (
SELECT b2.product_id
FROM bookmarks b2
WHERE b2.user_id = 22
)
AND
c.post_date > b.read_date
)
OR
p.product_id NOT IN (
SELECT b2.product_id
FROM bookmarks b2
WHERE b2.user_id = 22
)
GROUP BY p.product_id
ORDER BY comment_count DESC

Related

How to do GROUP BY and COUNT(*) in JOIN MySQL

I have tables named company, product, purchase_order, skid, process_record and I want MySQL query result as below.
I tried
SELECT s.id as skidId, s.skidBarcode, po.poNumber, s.companyId, c.companyName, p.productId , p.productName, totalProcessed
FROM skid s
INNER JOIN company c ON s.companyId = c.id
INNER JOIN purchase_order po on s.purchaseOrderId = po.id
INNER JOIN product prdct on p.productId = prdct.id
LEFT JOIN (SELECT skidID, productId , COUNT(*) as processedQuantity FROM process_record GROUP BY productId ) p ON p.skidID= s.id
WHERE s.status = 'closed' ORDER By s.companyId,s.id
However, this query returns NULL or seemingly random wrong counts for processedQuantity on some rows.
How can I get the desired MySQL query output as shown in screenshot?
I changed GROUP BY productId to GROUP BY skidID, productId in the subquery, and that resolved the issue.
SELECT s.id as skidId, s.skidBarcode, po.poNumber, s.companyId, c.companyName, p.productId , prdct.productName, p.processedQuantity AS totalProcessed
FROM skid s
LEFT JOIN (SELECT skidID, productId , COUNT(*) as processedQuantity FROM process_record GROUP BY skidID, productId ) p ON p.skidID= s.id
INNER JOIN company c ON s.companyId = c.id
INNER JOIN purchase_order po on s.purchaseOrderId = po.id
INNER JOIN product prdct on p.productId = prdct.id
WHERE s.status = 'closed' ORDER By s.companyId,s.id
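A minimal sketch of why the extra grouping column matters, run against an in-memory SQLite database from Python. Only the process_record table is modelled, and its rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process_record (id INTEGER PRIMARY KEY, skidID INTEGER, productId INTEGER);
-- Made-up rows: product 10 is processed on two different skids.
INSERT INTO process_record (skidID, productId) VALUES
  (1, 10), (1, 10), (1, 11), (2, 10);
""")

# Grouping by both columns gives one count per (skid, product) pair.
per_pair = conn.execute("""
    SELECT skidID, productId, COUNT(*) AS processedQuantity
    FROM process_record
    GROUP BY skidID, productId
    ORDER BY skidID, productId
""").fetchall()

# Grouping by productId alone merges the counts from every skid.
per_product = conn.execute("""
    SELECT productId, COUNT(*) AS processedQuantity
    FROM process_record
    GROUP BY productId
    ORDER BY productId
""").fetchall()

print(per_pair)
print(per_product)
```

With GROUP BY productId only, each product collapses into a single row whose count spans all skids, and the skidID reported for that row is arbitrary. Joining that row back on skidID then explains both the NULLs and the wrong counts.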

LEFT JOIN returns NULL if there is just one row in the table

I am trying to get the most recently created price for each product. Every product is unique but can have several prices. However, my query only works if a product has more than one price row in the product_price table:
This is my query:
SELECT
i.name AS title,
i.id AS product_id,
m.name AS manufacturer,
image,
price_sales,
price_new,
price_used,
price_old
FROM product_info as i
LEFT JOIN product_manufacturer AS m ON i.manufacturer_id = m.id
LEFT JOIN (SELECT * FROM product_price ORDER BY created_at DESC LIMIT 1) AS p ON i.id = p.id_product
WHERE category_id = 2
AND i.is_deactivated IS NULL
LIMIT 0, 20;
I just need the latest created price row.
The problem you have is that the subquery:
(SELECT * FROM product_price ORDER BY created_at DESC LIMIT 1)
Does not get the latest price per product, but simply the latest price, so will only ever return one row, meaning only one of your products will actually have a price.
The way to resolve this is to remove any prices where a newer one exists, so for simplicity if you look just at the price table, the following will give you only the latest product prices:
SELECT p.*
FROM product_price AS p
WHERE NOT EXISTS
( SELECT 1
FROM product_price AS p2
WHERE p2.id_product = p.id_product
AND p2.created_at > p.created_at
);
However, MySQL has historically optimised LEFT JOIN/IS NULL better than NOT EXISTS (although I think the latter conveys intention better), so a more efficient approach would be:
SELECT p.*
FROM product_price AS p
LEFT JOIN product_price AS p2
ON p2.id_product = p.id_product
AND p2.created_at > p.created_at
WHERE p2.id IS NULL;
Finally, introducing this back to your main query, you would end up with:
SELECT i.name AS title,
i.id AS product_id,
m.name AS manufacturer,
i.image,
p.price_sales,
p.price_new,
p.price_used,
p.price_old
FROM product_info as i
LEFT JOIN product_manufacturer AS m
ON m.id = i.manufacturer_id
LEFT JOIN product_price AS p
ON p.id_product = i.id
LEFT JOIN product_price AS p2
ON p2.id_product = p.id_product
AND p2.created_at > p.created_at
WHERE i.category_id = 2
AND i.is_deactivated IS NULL
AND p2.id IS NULL
LIMIT 0, 20;
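Here is a cut-down sketch of the same self-join pattern, run on an in-memory SQLite database from Python so the three cases (one price, several prices, no price) can be checked side by side. The tables are reduced to a few columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_info (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_price (id INTEGER PRIMARY KEY, id_product INTEGER,
                            price_new REAL, created_at TEXT);
-- Made-up sample data covering the three cases.
INSERT INTO product_info VALUES (1, 'one price'), (2, 'two prices'), (3, 'no price');
INSERT INTO product_price (id_product, price_new, created_at) VALUES
  (1, 9.5,  '2021-01-01'),
  (2, 20.0, '2021-01-01'),
  (2, 25.0, '2021-03-01');
""")

# p2 looks for a newer price of the same product; keeping only rows where
# no such p2 exists leaves the latest price per product.
latest = conn.execute("""
    SELECT i.id, i.name, p.price_new
    FROM product_info AS i
    LEFT JOIN product_price AS p
           ON p.id_product = i.id
    LEFT JOIN product_price AS p2
           ON p2.id_product = p.id_product
          AND p2.created_at > p.created_at
    WHERE p2.id IS NULL
    ORDER BY i.id
""").fetchall()

print(latest)
```

A product with no price rows survives as well, because both joined sides are NULL for it and the p2.id IS NULL test still passes. If two prices of one product share the same created_at, both rows are returned, so a tie-breaker on the id column may be needed.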

3 second long queries on 5.8 MB database

I'm running this query:
SELECT p.*,
UNIX_TIMESTAMP(p.upload_date) upload_date_unix,
ph.*,
c.category_name,
c.slug,
(SELECT Count(vote)
FROM picture_votes
WHERE picture_id = p.picture_id) vote_count
FROM pictures p
LEFT JOIN photographers ph
ON ph.photographer_id = p.photographer_id
LEFT JOIN categories c
ON c.category_id = p.category_id
WHERE p.approved = 1
AND ( p.picture_id = p.album_id
OR p.album_id IS NULL )
GROUP BY p.picture_id
ORDER BY p.upload_date DESC
LIMIT 99
The query takes about 2-3 seconds. If I remove (SELECT count(vote) FROM picture_votes WHERE picture_id = p.picture_id) vote_count, the query takes about 0.01 seconds. Why does the subquery slow it down so much? picture_votes has only 25,000 rows.
How can I change the query to include the vote count for every picture?
Here's the explain to the query.
Remove your subquery and add one more join, on picture_votes:
SELECT p.*,
Unix_timestamp(p.upload_date) upload_date_unix,
ph.*,
c.category_name,
c.slug,
Count(pv.vote) vote_count
FROM pictures p
LEFT JOIN picture_votes pv ON ( p.picture_id = pv.picture_id )
LEFT JOIN photographers ph
ON ph.photographer_id = p.photographer_id
LEFT JOIN categories c
ON c.category_id = p.category_id
WHERE p.approved = 1
AND ( p.picture_id = p.album_id
OR p.album_id IS NULL )
GROUP BY p.picture_id
ORDER BY p.upload_date DESC
LIMIT 99
Based on the EXPLAIN output for the query, you need an index on the picture_id column of picture_votes, since that is the column the lookup filters on.
Whenever you see NULL under possible_keys, it means that MySQL found no relevant index to use.
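A rough sketch of both ideas, the joined count and the index, using an in-memory SQLite database from Python. It assumes the index belongs on picture_id (the column the vote lookup filters on). The index name idx_votes_picture and the sample rows are made up, and SQLite's EXPLAIN QUERY PLAN is only an analogue of MySQL's EXPLAIN.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pictures (picture_id INTEGER PRIMARY KEY, approved INTEGER);
CREATE TABLE picture_votes (vote_id INTEGER PRIMARY KEY, picture_id INTEGER, vote INTEGER);
-- Made-up sample data: picture 3 has no votes.
INSERT INTO pictures VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO picture_votes (picture_id, vote) VALUES (1, 5), (1, 4), (2, 3);
-- Hypothetical index name; the indexed column is what matters.
CREATE INDEX idx_votes_picture ON picture_votes (picture_id);
""")

# COUNT(pv.vote) skips the NULLs produced by the LEFT JOIN,
# so pictures without votes get a count of 0.
vote_counts = conn.execute("""
    SELECT p.picture_id, COUNT(pv.vote) AS vote_count
    FROM pictures p
    LEFT JOIN picture_votes pv ON pv.picture_id = p.picture_id
    WHERE p.approved = 1
    GROUP BY p.picture_id
    ORDER BY p.picture_id
""").fetchall()

# Ask the planner how it would run the per-picture lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(vote) FROM picture_votes WHERE picture_id = ?",
    (1,),
).fetchall()
uses_index = any("idx_votes_picture" in row[-1] for row in plan)

print(vote_counts)
print(uses_index)
```

In MySQL the equivalent check would be to run EXPLAIN on the original query again after adding the index and confirm that possible_keys is no longer NULL for picture_votes.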

Query for finding Persons with most points

I am using MySQL and have these tables: (only important columns shown)
Person
id, primary key
Post
id, primary key
points, INT
Visit
id, primary key
person_id, refers to Person
post_id, refers to Post
What I want to find is the top 5 Persons with the most points overall, and the Persons with the most points on each Post.
Can anyone please guide me? Any help is deeply appreciated!
Top 5 persons with most points overall:
SELECT
p.id,
SUM(Post.points) AS total_points
FROM
Person p
INNER JOIN Visit v
ON p.id = v.person_id
INNER JOIN Post
ON v.post_id = Post.id
GROUP BY
p.id
ORDER BY
SUM(Post.points) DESC
LIMIT 5
Top 5 persons with most points in one post:
SELECT
p.id,
MAX(Post.points) AS best_post_points
FROM
Person p
INNER JOIN Visit v
ON p.id = v.person_id
INNER JOIN Post
ON v.post_id = Post.id
GROUP BY
p.id
ORDER BY
MAX(Post.points) DESC
LIMIT 5
Top 5 posts:
SELECT
p.id,
Post.points
FROM
Person p
INNER JOIN Visit v
ON p.id = v.person_id
INNER JOIN Post
ON v.post_id = Post.id
ORDER BY
Post.points DESC
LIMIT 5
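As a quick check of the first query (total points per person), here is a sketch against an in-memory SQLite database from Python. The sample rows, the top_persons helper and the p.id tie-breaker in ORDER BY are assumptions added for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY);
CREATE TABLE Post (id INTEGER PRIMARY KEY, points INTEGER);
CREATE TABLE Visit (id INTEGER PRIMARY KEY, person_id INTEGER, post_id INTEGER);
-- Made-up sample data.
INSERT INTO Person VALUES (1), (2), (3);
INSERT INTO Post VALUES (10, 5), (11, 7), (12, 1);
INSERT INTO Visit (person_id, post_id) VALUES
  (1, 10), (1, 11), (2, 11), (3, 12), (3, 10);
""")


def top_persons(conn, limit):
    """Return (person id, total points) for the highest-scoring persons."""
    return conn.execute("""
        SELECT p.id, SUM(Post.points) AS total_points
        FROM Person p
        INNER JOIN Visit v ON p.id = v.person_id
        INNER JOIN Post ON v.post_id = Post.id
        GROUP BY p.id
        ORDER BY total_points DESC, p.id
        LIMIT ?
    """, (limit,)).fetchall()


print(top_persons(conn, 2))
```

Persons without any Visit rows do not appear at all, because both joins are INNER JOINs.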
For each Post (MySQL does not allow LIMIT directly inside an IN subquery, so the top-5 list is wrapped in a derived table):
SELECT id FROM Person where id in (SELECT person_id FROM Visit where post_id in
(SELECT id FROM (SELECT id FROM Post order by points DESC limit 5) AS top_posts))
Overall (not sure if it will work, not tested)
SELECT id FROM Person where id in (SELECT distinct person_id FROM Visit where post_id in
(SELECT id FROM (SELECT id FROM Post order by points DESC limit 5) AS top_posts))
SELECT *
FROM
(
SELECT P.id , SUM(PP.points) AS total_points
FROM Person P JOIN Visit V ON ( V.person_id = P.id )
JOIN Post PP ON ( PP.id = V.post_id )
GROUP BY P.id
) AS totals
ORDER BY total_points DESC
LIMIT 5;
SELECT *
FROM
(
SELECT P.id , COUNT(*) NUM_OF_POST
FROM Person P JOIN Visit V ON ( V.person_id = P.id )
JOIN Post PP ON ( PP.id = V.post_id )
GROUP BY P.id
) AS post_counts
ORDER BY NUM_OF_POST DESC
LIMIT 5;

MySQL LEFT JOIN, GROUP BY and ORDER BY not working as required

I have a table
'products' => ('product_id', 'name', 'description')
and a table
'product_price' => ('product_price_id', 'product_id', 'price', 'date_updated')
I want to perform a query something like
SELECT `p`.*, `pp`.`price`
FROM `products` `p`
LEFT JOIN `product_price` `pp` ON `pp`.`product_id` = `p`.`product_id`
GROUP BY `p`.`product_id`
ORDER BY `pp`.`date_updated` DESC
As you can probably guess the price changes often and I need to pull out the latest one. The trouble is I cannot work out how to order the LEFT JOINed table. I tried using some of the GROUP BY functions like MAX() but that would only pull out the column not the row.
Thanks.
It appears that it is impossible to use an ORDER BY on a GROUP BY summarisation. My fundamental logic is flawed. I will need to run the following subquery.
SELECT `p`.*, `pp`.`price` FROM `products` `p`
LEFT JOIN (
SELECT `product_id`, `price` FROM `product_price` ORDER BY `date_updated` DESC
) `pp`
ON `p`.`product_id` = `pp`.`product_id`
GROUP BY `p`.`product_id`;
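This will take a performance hit, but as it is the same subquery for each row it shouldn't be too bad. Note that it relies on MySQL keeping the subquery's order and picking the first row of each group, which is not guaranteed.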
You need to set aliases properly I think and also set what you are joining on:
SELECT p.*, pp.price
FROM products AS p
LEFT JOIN product_price AS pp
ON pp.product_id = p.product_id
GROUP BY p.product_id
ORDER BY pp.date_updated DESC
This will give you the last updated price:
select
p.*, pp.price
from
products p
-- left join both of these if products may not have an entry in product_price
-- and you would like to see a null price with the product
join
(
select
product_id,
max(date_updated) as max_date_updated
from product_price
group by product_id
) as pp_max
on p.product_id = pp_max.product_id
join product_price pp on
pp.product_id = pp_max.product_id
and pp.date_updated = pp_max.max_date_updated
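Mysqlism (note that pp.price here comes from an arbitrary row in each group, not necessarily the latest one, and the query is rejected when ONLY_FULL_GROUP_BY is enabled):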
SELECT p.*, MAX(pp.date_updated), pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
GROUP BY p.product_id
Will work on some RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE (p.product_id, pp.date_updated)
in (select product_id, max(date_updated)
from product_price
group by product_id)
Will work on most RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE EXISTS
(
select null -- inspired by Linq-to-SQL style :-)
from product_price
WHERE product_id = p.product_id
group by product_id
HAVING max(date_updated) = pp.date_updated
)
Will work on all RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN
(
select product_id, max(date_updated) as recent
from product_price
group by product_id
) AS latest
ON latest.product_id = p.product_id
LEFT JOIN product_price pp
ON pp.product_id = p.product_id AND pp.date_updated = latest.recent
And if the intent of nate c's code is just to get one row from product_price, there is no need for a derived table (i.e. the join (select ...) as pp_max), and no need to go through the product_price_id surrogate primary key; it can be simplified like the following:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE pp.date_updated = (select max(date_updated) from product_price)
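To compare against the variants above, here is a small sketch of the derived-table approach (latest date per product, then join back for the price), run on an in-memory SQLite database from Python. The sample rows are invented. The price table is joined on both product_id and the latest date, so products without any price are kept with a NULL price.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_price (product_price_id INTEGER PRIMARY KEY, product_id INTEGER,
                            price REAL, date_updated TEXT);
-- Made-up sample data: 'unpriced' has no rows in product_price.
INSERT INTO products VALUES (1, 'widget'), (2, 'gadget'), (3, 'unpriced');
INSERT INTO product_price (product_id, price, date_updated) VALUES
  (1, 1.0, '2020-01-01'),
  (1, 1.5, '2020-06-01'),
  (2, 8.0, '2020-03-01');
""")

# "latest" holds one row per product with its most recent date_updated;
# the second join picks the price row that carries exactly that date.
latest_prices = conn.execute("""
    SELECT p.product_id, p.name, pp.price
    FROM products p
    LEFT JOIN (
        SELECT product_id, MAX(date_updated) AS recent
        FROM product_price
        GROUP BY product_id
    ) AS latest
           ON latest.product_id = p.product_id
    LEFT JOIN product_price pp
           ON pp.product_id = p.product_id
          AND pp.date_updated = latest.recent
    ORDER BY p.product_id
""").fetchall()

print(latest_prices)
```

If two price rows for the same product share the same date_updated, both come back, so a unique timestamp or an extra tie-breaker on product_price_id may be needed.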