MySQL select join where AND where - mysql

I have two tables in my database:
Products
id (int, primary key)
name (varchar)
ProductTags
product_id (int)
tag_id (int)
I would like to select products having all given tags. I tried:
SELECT
*
FROM
Products
JOIN ProductTags ON Products.id = ProductTags.product_id
WHERE
ProductTags.tag_id IN (1, 2, 3)
GROUP BY
Products.id
But it gives me products having any of the given tags, instead of products having all of them. Writing WHERE tag_id = 1 AND tag_id = 2 is pointless, because no rows will be returned.

This type of problem is known as relational division:
SELECT Products.*
FROM Products
JOIN ProductTags ON Products.id = ProductTags.product_id
WHERE ProductTags.tag_id IN (1,2,3)
GROUP BY Products.id /*<--This is OK in MySQL other RDBMSs
would want the whole SELECT list*/
HAVING COUNT(DISTINCT ProductTags.tag_id) = 3 /*Assuming that there is a unique
constraint on product_id,tag_id you
don't need the DISTINCT*/
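The GROUP BY / HAVING COUNT(DISTINCT ...) approach above can be tried on a small data set. This is only a sketch: the sample rows and the helper name products_with_all_tags are made up for illustration, and SQLite (through Python's sqlite3 module) stands in for MySQL so that it runs without a server. The SQL itself should carry over unchanged.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ProductTags (product_id INTEGER, tag_id INTEGER);
    INSERT INTO Products VALUES (1, 'has all three'), (2, 'has two'), (3, 'has one');
    INSERT INTO ProductTags VALUES
        (1, 1), (1, 2), (1, 3), (1, 3),
        (2, 1), (2, 2), (2, 4),
        (3, 3);
""")
# Product 1 carries tag 3 twice on purpose, to show why DISTINCT matters
# when there is no unique constraint on (product_id, tag_id).


def products_with_all_tags(conn, tag_ids):
    """Return (id, name) for products tagged with every id in tag_ids."""
    wanted = sorted(set(tag_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT Products.id, Products.name
        FROM Products
        JOIN ProductTags ON Products.id = ProductTags.product_id
        WHERE ProductTags.tag_id IN ({placeholders})
        GROUP BY Products.id, Products.name
        HAVING COUNT(DISTINCT ProductTags.tag_id) = ?
        ORDER BY Products.id
    """
    # The last parameter is the number of wanted tags, so the HAVING
    # clause always matches the length of the IN list.
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


all_three = products_with_all_tags(conn, [1, 2, 3])
first_two = products_with_all_tags(conn, [1, 2])
print(all_three)
print(first_two)
```

Passing the tag count as a parameter keeps the IN list and the HAVING comparison in sync, which is the part that is easy to get wrong when the list is built by hand.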

You need a GROUP BY with a COUNT to ensure all tags are accounted for:
select Products.*
from Products
join ( SELECT product_id
FROM ProductTags
where ProductTags.tag_id IN (1,2,3)
GROUP BY product_id
having count( distinct tag_id ) = 3 ) PreQuery
on Products.id = PreQuery.product_id

In MySQL, WHERE fieldname IN (1,2,3) is essentially shorthand for WHERE fieldname = 1 OR fieldname = 2 OR fieldname = 3. Rewriting the condition with ORs therefore returns the same rows: products having any of the tags. If that is not the result you want, then WHERE ... IN on its own is not the operator you need.

Related

How to count duplicates by multiple records in subtable

Say I have the following table structure:
products
id | name | price
products_ean
id | product_id | ean
A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.
What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?
I've tried something like the following, but it makes the query really slow:
SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12
Using joins is the simplest way to generate related information. I GROUP BY p.id, which makes ean the aggregated field, because the EANs are the only values that can repeat per product. I've added a HAVING clause after the query to select only those results with 2 or more EANs (it's optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having (product_id, ean) as a composite primary key for the products_ean table is probably most efficient. Since that pair is unique, it's not obvious why the products_ean.id column is needed.
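If the goal is to count, for each product, how many other products share at least one EAN with it, a self-join on products_ean is one way to avoid the correlated subquery from the question. The following is a sketch under assumptions: the sample rows are invented, the aliases mine and other are just illustrative names, and SQLite (via Python's sqlite3) stands in for MySQL so the example can be run as is.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE products_ean (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        ean TEXT
    );
    INSERT INTO products VALUES
        (1, 'A', 1.0), (2, 'B', 2.0), (3, 'C', 3.0), (4, 'D', 4.0);
    INSERT INTO products_ean (product_id, ean) VALUES
        (1, '111'), (1, '222'),
        (2, '111'), (2, '222'),
        (3, '222'), (3, '333'),
        (4, '444');
""")

# "mine" holds the EANs of the product itself, "other" holds rows of
# different products with the same EAN. COUNT(DISTINCT ...) makes a product
# that shares two EANs count once, and LEFT JOINs keep products with no match.
duplicate_counts = conn.execute("""
    SELECT p.id, p.name, COUNT(DISTINCT other.product_id) AS duplicate_products
    FROM products p
    LEFT JOIN products_ean mine
           ON mine.product_id = p.id
    LEFT JOIN products_ean other
           ON other.ean = mine.ean
          AND other.product_id <> p.id
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()

for row in duplicate_counts:
    print(row)
```

An index on products_ean (ean) would presumably matter for the second join on larger tables; that part is not shown here.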

How to count number of distinct values when joining multiple tables?

I have following database tables
categories
id, name
products
category_id, id, product_name, user_id
product_comments
product_id, id, comment_text, user_id
I need a count of the number of different users across both the products and product_comments tables. I have the following query, which selects all categories that have at least one product, where each product may have zero or more comments. What I can't figure out is how to get the total of these different user IDs. If it were just one table I would try COUNT(products.user_id). Here is my query:
SELECT
c.*
FROM categories c
INNER JOIN products p ON p.category_id = c.id
LEFT JOIN product_comments pc ON pc.product_id = p.id
I need the total number of different user IDs from both the products and product_comments tables.
I would expect the result data somewhat like below:
Category_id, Category_name, TotalUsers
1, Test Category, 10
2, Another Category, 5
3, Yet another cat, 3
This will give you an overall count rather than a list of all distinct id's:
SELECT COUNT(DISTINCT user_id)
FROM products
LEFT JOIN product_comments comm ON products.id = comm.product_id
If you want distinct users, then you could try something like:
select distinct user_id from (
select user_id from products
UNION
select user_id from product_comments
) as allusers;
You can then count them:
select count(distinct user_id) from (
select user_id from products
UNION
select user_id from product_comments
) as allusers;
select user_id from products
UNION
select user_id from product_comments
This will give you the distinct list of user ids that exist in the tables. The users could be present in just one of the tables, or both and will still be included in the list.
You can use the DISTINCT keyword and COUNT() function
SELECT DISTINCT
COUNT(c.*)
FROM categories c
INNER JOIN products p ON p.category_id = c.id
LEFT JOIN product_comments pc ON pc.product_id = p.id
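For the per-category totals shown in the question (Category_id, Category_name, TotalUsers), the UNION idea from the answers above can be combined with a GROUP BY on the category. This is a sketch, not a tested MySQL solution: the sample rows are invented, the derived-table alias u is an arbitrary name, and SQLite (via Python's sqlite3) is used only so the example runs on its own.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (
        category_id INTEGER, id INTEGER PRIMARY KEY,
        product_name TEXT, user_id INTEGER
    );
    CREATE TABLE product_comments (
        product_id INTEGER, id INTEGER PRIMARY KEY,
        comment_text TEXT, user_id INTEGER
    );
    INSERT INTO categories VALUES
        (1, 'Test Category'), (2, 'Another Category'), (3, 'No products');
    INSERT INTO products VALUES
        (1, 10, 'p10', 100), (1, 11, 'p11', 101), (2, 12, 'p12', 100);
    INSERT INTO product_comments VALUES
        (10, 1, 'x', 100), (10, 2, 'y', 102), (11, 3, 'z', 102), (12, 4, 'w', 100);
""")

# The derived table lists (category_id, user_id) pairs from both sources:
# users who own a product in the category and users who commented on one.
# UNION already removes duplicate pairs; COUNT(DISTINCT ...) is kept for clarity.
users_per_category = conn.execute("""
    SELECT c.id, c.name, COUNT(DISTINCT u.user_id) AS total_users
    FROM categories c
    JOIN (
        SELECT p.category_id, p.user_id
        FROM products p
        UNION
        SELECT p.category_id, pc.user_id
        FROM products p
        JOIN product_comments pc ON pc.product_id = p.id
    ) u ON u.category_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

for row in users_per_category:
    print(row)
```

Category 3 has no products, so the inner join drops it, which matches the INNER JOIN in the original query.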

MySQL id merge question

Sorry for the vague topic but I'm having a hard time explaining this. What I'm trying to do is fetching an ID representing a post which can be posted in different categories, and I want my post to belong to all three categories to match the criteria. The table looks like this
id category_id
1 3
1 4
1 8
What I wanna do is fetch an id that belongs to all 3 of these categories, but since they're on different rows I can't use
SELECT id FROM table WHERE category_id = '3' AND category_id = '4' AND category_id = '8';
This will of course return nothing at all, since no row matches that criteria. I've also tried with the WHERE IN clause
SELECT id from table WHERE category_id IN (3, 4, 8);
This returns any post in any of these categories, the post I want returned has to be in all three of these categories.
So the question becomes: is there any good way to look for an id that belongs to all three of these categories, or do I have to use the WHERE IN clause and check whether I get 3 rows with the id 1? Then I'd know that it occurred three times and therefore belongs to all three categories, but that seems like a bad solution.
Thanks in advance. I appreciate the help. :)
You could use group_concat to get a comma separated string of your category id's and check if that contains all the categories you're filtering
SELECT id, GROUP_CONCAT(t2.category_id) as categories
FROM table AS t1
INNER JOIN table AS t2 ON t1.id = t2.id
WHERE
FIND_IN_SET('3', categories) IS NOT NULL AND
FIND_IN_SET('4', categories) IS NOT NULL AND
FIND_IN_SET('8', categories) IS NOT NULL
Update
SELECT t1.id, GROUP_CONCAT(t2.category_id) as `categories`
FROM `table` AS t1
INNER JOIN `table` AS t2 ON t1.id = t2.id
GROUP BY t1.id
HAVING
FIND_IN_SET('3', `categories`) > 0 AND
FIND_IN_SET('4', `categories`) > 0 AND
FIND_IN_SET('8', `categories`) > 0
The first query would not have worked: GROUP_CONCAT is a group (aggregate) function, so its value cannot be used in the WHERE clause, but it can be used in the HAVING clause. Note that FIND_IN_SET returns 0, not NULL, when the value is not in the list, so the test has to be > 0.
I think you need to say GROUP BY id like this:
SELECT id FROM `table` WHERE category_id = '3' OR category_id = '4' OR category_id = '8' GROUP BY id;
Hope that works for you.
I believe you need to establish the number of the wanted categories a post appears in.
This can be done by applying COUNT and HAVING to the SQL query as follows:
SELECT id,
COUNT(DISTINCT category_id) AS categories
FROM `table`
WHERE category_id IN (3, 4, 8)
GROUP BY id
HAVING categories = 3
If you want a post that appears in every category, and the number of categories on your site can change dynamically, you can use an inner SELECT statement like so:
SELECT id,
COUNT(category_id) AS categories
FROM `table`
GROUP BY id
HAVING categories >= (
SELECT COUNT(category_id)
FROM `categories`
)
where categories is the table in which you store all your category information.

Mysql: SELECT and GROUP BY

Sorry for the abysmal title - if someone wants to change it for something more self-explanatory, great - I'm not sure how to express the problem. Which is:
I have a table like so:
POST_ID (INT) TAG_NAME (VARCHAR)
1 'tag1'
1 'tag2'
1 'tag3'
2 'tag2'
2 'tag4'
....
What I want to do is count the number of POSTs which have both tag1 AND tag2.
I've messed about with GROUP BY and DISTINCT and COUNT but I can't construct a query which does the trick.
Any suggestions?
Edit: In pseudo sql, the query I want is:
SELECT DISTINCT(POST_ID) WHICH HAS TAG_NAME = 'tag1' AND TAG_NAME = 'tag2';
Thanks
Edit: because 'TABLE' was a poor choice for a missing tablename, I'll suppose your table is called Posts.
Join the table against itself:
SELECT * FROM Posts P1
JOIN Posts P2
ON P1.POST_ID = P2.POST_ID
WHERE P1.TAG_NAME = 'tag1'
AND P2.TAG_NAME = 'tag2'
I'm just leaving this (untested) dependent subquery solution here for reference, even though it'll probably be horribly slow once you get to large data sets. Any solution that does the same thing using joins should be chosen over this.
Assuming you have a posts table with an id field, as well:
SELECT count(*)
FROM posts
WHERE EXISTS (SELECT NULL FROM posts_tags WHERE tag = 'tag1' AND post_id = posts.id)
AND EXISTS (SELECT NULL FROM posts_tags WHERE tag = 'tag2' AND post_id = posts.id)
Try the following query:
SELECT COUNT(*) nb_posts
FROM (
SELECT post_id, COUNT(*) nb_tags
FROM `table`
WHERE tag_name in ('tag1','tag2')
GROUP BY post_id
HAVING COUNT(*) = 2
) t
Edit: based on Konerak's answer, here is the query that handles the case where a post has duplicated tag names:
SELECT DISTINCT t1.post_id
FROM `table` t1
JOIN `table` t2
ON t1.post_id = t2.post_id
AND t2.tag_name = 'tag2'
WHERE t1.tag_name = 'tag1'
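The difference between the COUNT(*) version and the duplicate-safe versions shows up on data where a post carries the same tag twice. A small sketch of that comparison follows. The table name posts_tags, the sample rows, and the helper post_ids are assumptions for illustration, and SQLite (via Python's sqlite3) is used in place of MySQL so the example is runnable.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts_tags (post_id INTEGER, tag_name TEXT);
    INSERT INTO posts_tags VALUES
        (1, 'tag1'), (1, 'tag2'), (1, 'tag3'),
        (2, 'tag2'), (2, 'tag4'),
        (3, 'tag1'), (3, 'tag1'),
        (4, 'tag1'), (4, 'tag2'), (4, 'tag2');
""")
# Post 3 has tag1 twice and no tag2; post 4 has both tags, with tag2 twice.


def post_ids(having_clause):
    """Run the grouped query with the given HAVING condition."""
    sql = f"""
        SELECT post_id
        FROM posts_tags
        WHERE tag_name IN ('tag1', 'tag2')
        GROUP BY post_id
        HAVING {having_clause}
        ORDER BY post_id
    """
    return [row[0] for row in conn.execute(sql)]


# Counting rows is fooled by duplicates in both directions.
naive = post_ids("COUNT(*) = 2")
# Counting distinct tag names is not.
robust = post_ids("COUNT(DISTINCT tag_name) = 2")

# The self-join with DISTINCT gives the same answer as the robust version.
self_join = [row[0] for row in conn.execute("""
    SELECT DISTINCT t1.post_id
    FROM posts_tags t1
    JOIN posts_tags t2
      ON t1.post_id = t2.post_id
     AND t2.tag_name = 'tag2'
    WHERE t1.tag_name = 'tag1'
    ORDER BY t1.post_id
""")]

print(naive, robust, self_join)
```

With a unique constraint on (post_id, tag_name) the duplicates cannot occur and COUNT(*) = 2 is enough.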

Optimizing Multilevel MySQL subqueries (folksonomy and taxonomy)

I was reading the great tagging article by Nitin Borwankar, and it started me thinking about ways to implement different levels of searches using two tables.
tags {
id,
tag
}
post_tags {
id
user_id
post_id
tag_id
}
I started with the simple example of T(U(i)) which means all tags of all users that have an item i. I was able to do it with the following SQL:
/* get all tags from the users found */
SELECT t.*, vt.* FROM verse_tags as vt
LEFT JOIN tags as t ON t.id = vt.tag_id
WHERE user_id in
(
/* Get all user_ids that have tagged this item */
SELECT user_id FROM verse_tags WHERE verse_id = 26046 GROUP BY user_id
)
GROUP BY t.id
Then I started with a slightly harder query, one level deeper: T(U(T(u))), which is the tags of users who use tags like those of user u.
/* Then get the tags of the user with tags like the user 3 */
SELECT t.id FROM post_tags as pt
LEFT JOIN tags as t ON t.id = pt.tag_id
WHERE user_id in
(
/* Then get users with these tags */
SELECT pt.user_id FROM post_tags as pt
LEFT JOIN tags as t on t.id = pt.tag_id
WHERE tag_id in
(
/* get tags of user */
SELECT t.id FROM post_tags as pt
LEFT JOIN tags as t ON t.id = pt.tag_id
WHERE pt.user_id = 3
GROUP BY t.id
)
GROUP BY user_id
)
GROUP BY t.id
However, since I normally use JOINs in my queries, I am not sure how something like this could be optimized or what design flaws need to be avoided when using subqueries. I have even read that JOINs should be used instead, but I have no idea how this would be accomplished with the above queries.
How could these queries be optimized?
UPDATE
1) Replaced GROUP BY with SELECT DISTINCT. (.74 sec)
2) Replace WHERE in with WHERE exists. (.40 sec)
3) Added indexes (oops!) (0.09 sec)
4) Back to WHERE in (0.08 sec)
EXPLAIN SELECT DISTINCT tag_id FROM post_tags WHERE user_id in
(
SELECT DISTINCT user_id FROM post_tags WHERE tag_id in
(
SELECT DISTINCT tag_id FROM post_tags WHERE user_id = 3
)
)
Running EXPLAIN gives me these results:
id | select_type        | table     | type           | possible_keys  | key     | key_len | ref  | rows | Extra
1  | PRIMARY            | post_tags | index          | NULL           | tag_id  | 4       | NULL | 14   | Using where
2  | DEPENDENT SUBQUERY | post_tags | index_subquery | user_id        | user_id | 4       | func | 1    | Using where
3  | DEPENDENT SUBQUERY | post_tags | index_subquery | user_id,tag_id | tag_id  | 4       | func | 1    | Using where
In my opinion, this is the solution:
SELECT DISTINCT(`t`.`id`) FROM `post_tags` as `pt`
left join `tags` as t on `t`.`id` = `pt`.`tag_id`
where `pt`.`user_id` in(
SELECT distinct(`pt`.`user_id`) FROM `post_tags` as `pt`
LEFT JOIN `tags` as `t` on `t`.`id` = `pt`.`tag_id`
WHERE `pt`.`tag_id` in(
SELECT distinct(`tag_id`) FROM `post_tags`
WHERE `user_id` = 3
)
)
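Since the question asks how the nested IN subqueries could be written with JOINs, here is one possible shape for T(U(T(u))): join post_tags to itself twice. It is a sketch under assumptions, not a benchmark. The sample rows and the aliases mine, shared, and theirs are made up, SQLite (via Python's sqlite3) stands in for MySQL, and whether the join form is actually faster depends on the MySQL version and the indexes, so EXPLAIN is still the way to check.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_tags (
        id INTEGER PRIMARY KEY,
        user_id INTEGER, post_id INTEGER, tag_id INTEGER
    );
    INSERT INTO post_tags (user_id, post_id, tag_id) VALUES
        (3, 100, 1), (3, 101, 2),
        (4, 102, 2), (4, 103, 5),
        (5, 104, 6), (5, 105, 7),
        (6, 106, 1), (6, 107, 8);
""")

# Nested form, as in the question's update.
subquery_tags = [row[0] for row in conn.execute("""
    SELECT DISTINCT tag_id FROM post_tags WHERE user_id IN (
        SELECT DISTINCT user_id FROM post_tags WHERE tag_id IN (
            SELECT DISTINCT tag_id FROM post_tags WHERE user_id = 3
        )
    )
    ORDER BY tag_id
""")]

# Join form: mine   = rows tagged by user 3,
#            shared = rows by any user with one of those tags,
#            theirs = every row tagged by those users.
join_tags = [row[0] for row in conn.execute("""
    SELECT DISTINCT theirs.tag_id
    FROM post_tags AS mine
    JOIN post_tags AS shared ON shared.tag_id = mine.tag_id
    JOIN post_tags AS theirs ON theirs.user_id = shared.user_id
    WHERE mine.user_id = 3
    ORDER BY theirs.tag_id
""")]

print(subquery_tags)
print(join_tags)
```

User 5 shares no tag with user 3, so tags 6 and 7 are left out by both forms. The join can produce many intermediate rows before DISTINCT removes them, which is one reason to compare both forms on real data.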