How to search on MySQL using JOINs?

Can anyone tell me ways to do this kind of search in a database?
I got these tables:
posts (id, tags_cache)
tags (id, name)
posts_tags (post_id, tag_id)
The user enters a search query (say "water blue") and I want to show the posts that have both tags.
The only way I can think of to search is using FIND_IN_SET, this way:
SELECT p.*, GROUP_CONCAT(t.name) AS tags_search
FROM posts p
LEFT JOIN posts_tags pt ON p.id = pt.post_id
LEFT JOIN tags t ON pt.tag_id = t.id
GROUP BY p.id
HAVING FIND_IN_SET('water', tags_search) > 0
AND FIND_IN_SET('blue', tags_search) > 0
The posts.tags_cache text column stores the names and id of the tags it belongs to (this way: water:15 blue:20).
To avoid JOINs by using this column for search, I've tried LIKE and INSTR, but these give inexact results: a search for "ter" returns posts tagged 'water' as well as 'termal', for example. I've also tried REGEXP, which gives exact results but is slow.
I can't use MATCH as tables use InnoDB.
So... are there other ways to accomplish this?
[Edit]
I forgot to mention that the user could search for many tags (not just 2), and even exclude tags: search posts tagged 'water' but not 'blue'. With FIND_IN_SET this works for me:
HAVING FIND_IN_SET('water', tags_search) > 0
AND NOT FIND_IN_SET('blue', tags_search) > 0
[Edit2]
I ran some performance tests (i.e. I only checked how long the queries took, cached), as ypercube suggested, and these are the results:
muists | Bill K | ypercu | includes:excludes
--------------------------
0.0137 | 0.0009 | 0.0029 | 2:0
0.0096 | 0.0081 | 0.0033 | 2:1
0.0111 | 0.0174 | 0.0033 | 2:2
0.0281 | 0.0081 | 0.0025 | 5:1
0.0014 | 0.0013 | 0.0015 | 0:2
I don't know if this info is a valid resource... but it shows that ypercube's method with a JOIN per tag is the quickest.

I don't understand why you don't want to use JOINs nor why you're trying to use LEFT JOINs. You're looking for things that are there (rather than might be there) so get rid of the LEFT JOINs and just JOIN. And get rid of the tags_cache column, you're only asking for trouble with that sort of thing.
Something like this is what you're looking for:
select p.id
from posts p
join posts_tags pt on p.id = pt.post_id
join tags t on pt.tag_id = t.id
where t.name in ('water', 'blue')
group by p.id
having count(t.id) = 2
The 2 in the HAVING clause is the number of tags you're looking for.
And if you want to exclude certain tags, you could just add that to the WHERE clause like this:
select p.id
from posts p
join posts_tags pt on p.id = pt.post_id
join tags t on pt.tag_id = t.id
where t.name in ('water', 'blue')
and p.id not in (
select pt.post_id
from posts_tags pt
join tags t on pt.tag_id = t.id
where t.name in ('pancakes', 'eggs') -- Exclude these
)
group by p.id
having count(t.id) = 2
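Since the asker needs an arbitrary number of included and excluded tags, the two queries above can be generated from two lists. What follows is a minimal sketch, not part of the original answer: it uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, and the function name `find_posts` and the sample rows are made up for illustration. It assumes at least one included tag and no duplicate (post_id, tag_id) pairs in posts_tags.

```python
import sqlite3

def find_posts(conn, include, exclude=()):
    """Return ids of posts that carry every tag in `include` and none in `exclude`."""
    inc = ", ".join("?" for _ in include)
    sql = (
        "SELECT p.id FROM posts p "
        "JOIN posts_tags pt ON p.id = pt.post_id "
        "JOIN tags t ON pt.tag_id = t.id "
        f"WHERE t.name IN ({inc}) "
    )
    params = list(include)
    if exclude:
        exc = ", ".join("?" for _ in exclude)
        sql += (
            "AND p.id NOT IN ("
            "SELECT pt2.post_id FROM posts_tags pt2 "
            "JOIN tags t2 ON pt2.tag_id = t2.id "
            f"WHERE t2.name IN ({exc})) "
        )
        params += list(exclude)
    # The HAVING count is the number of distinct tags searched for.
    sql += "GROUP BY p.id HAVING COUNT(DISTINCT t.id) = ? ORDER BY p.id"
    params.append(len(set(include)))
    return [row[0] for row in conn.execute(sql, params)]

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER,
                             PRIMARY KEY (post_id, tag_id));
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO tags VALUES (1, 'water'), (2, 'blue'), (3, 'eggs');
    INSERT INTO posts_tags VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1);
""")

both = find_posts(conn, ["water", "blue"])
without_eggs = find_posts(conn, ["water", "blue"], ["eggs"])
```

Tag names are passed as bound parameters rather than concatenated into the SQL, which matters here because they come straight from the user's search box.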

Finding posts that match all of several conditions on different rows is a common problem.
Here are two ways to do it:
SELECT p.*
FROM posts p
INNER JOIN posts_tags pt ON p.id = pt.post_id
INNER JOIN tags t ON pt.tag_id = t.id
WHERE t.name IN ('water', 'blue')
GROUP BY p.id
HAVING COUNT(DISTINCT t.name) = 2;
Or:
SELECT p.*
FROM posts p
INNER JOIN posts_tags pt1 ON p.id = pt1.post_id
INNER JOIN tags t1 ON pt1.tag_id = t1.id
INNER JOIN posts_tags pt2 ON p.id = pt2.post_id
INNER JOIN tags t2 ON pt2.tag_id = t2.id
WHERE (t1.name, t2.name) = ('water', 'blue');
Re comment and edit:
The problem with the HAVING solution is that it must perform a table-scan, searching every row in the tables. This is often much slower than a JOIN (when you have appropriate indexes).
To support tag exclusion conditions, here's how I'd write it:
SELECT p.*
FROM posts p
INNER JOIN posts_tags pt1 ON p.id = pt1.post_id
INNER JOIN tags t1 ON pt1.tag_id = t1.id AND t1.name = 'water'
LEFT OUTER JOIN (posts_tags pt2
INNER JOIN tags t2 ON pt2.tag_id = t2.id AND t2.name = 'blue')
ON p.id = pt2.post_id
WHERE t2.id IS NULL;
Avoiding using JOINs because you read it somewhere that they are bad is senseless. You must understand that a JOIN is a basic operation in relational databases, and you should use it where the job calls for it.
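The answer does not spell out what "appropriate indexes" means. One plausible reading for this schema is a unique index on the tag name plus indexes on posts_tags in both directions. The sketch below sets that up with Python's sqlite3 module and an in-memory database as a stand-in for MySQL; the index names are hypothetical, and MySQL's DDL and optimizer behave differently in detail, so treat it as an illustration rather than a tuning recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts_tags (
        post_id INTEGER NOT NULL,
        tag_id  INTEGER NOT NULL,
        PRIMARY KEY (post_id, tag_id)   -- post -> its tags
    );
    -- Hypothetical index names.
    CREATE UNIQUE INDEX idx_tags_name ON tags (name);                      -- name -> tag id
    CREATE INDEX idx_posts_tags_tag_post ON posts_tags (tag_id, post_id);  -- tag -> its posts
""")

index_names = sorted(
    row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx_%'"
    )
)

# Ask the engine how it would run a join-per-tag lookup; the exact plan text
# depends on the engine and version, so it is only collected here, not asserted.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT p.id FROM posts p
    JOIN posts_tags pt1 ON p.id = pt1.post_id
    JOIN tags t1 ON pt1.tag_id = t1.id AND t1.name = 'water'
""").fetchall()
```

In MySQL the equivalent check would be running EXPLAIN on the query and confirming that the tags and posts_tags lookups use these indexes.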

For your additional request, excluding some tags, you could use the following approach. It will give you all posts that have both the water and blue tags but none of black, white, or red:
SELECT p.*
FROM posts p
INNER JOIN posts_tags pt1 ON p.id = pt1.post_id
INNER JOIN tags t1 ON pt1.tag_id = t1.id
INNER JOIN posts_tags pt2 ON p.id = pt2.post_id
INNER JOIN tags t2 ON pt2.tag_id = t2.id
WHERE (t1.name, t2.name) = ('water', 'blue') -- include
AND NOT EXISTS
( SELECT *
FROM posts_tags pt
INNER JOIN tags t ON pt.tag_id = t.id
WHERE p.id = pt.post_id
AND t.name IN ('black', 'white', 'red') -- exclude
)
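Because the number of included tags varies per search, the join-per-tag query has to be assembled at run time. The following is a rough sketch of one way to do that, assuming Python's sqlite3 module with an in-memory database in place of MySQL; `build_query` and the sample data are hypothetical. Table aliases are generated from a counter, while tag names are always bound as parameters.

```python
import sqlite3

def build_query(include, exclude):
    """One JOIN pair per included tag, a single NOT EXISTS for excluded tags."""
    joins, conds, params = [], [], []
    for i, name in enumerate(include, start=1):
        joins.append(
            f"INNER JOIN posts_tags pt{i} ON p.id = pt{i}.post_id "
            f"INNER JOIN tags t{i} ON pt{i}.tag_id = t{i}.id"
        )
        conds.append(f"t{i}.name = ?")
        params.append(name)
    if exclude:
        marks = ", ".join("?" for _ in exclude)
        conds.append(
            "NOT EXISTS (SELECT 1 FROM posts_tags pt "
            "INNER JOIN tags t ON pt.tag_id = t.id "
            f"WHERE pt.post_id = p.id AND t.name IN ({marks}))"
        )
        params.extend(exclude)
    sql = "SELECT p.id FROM posts p " + " ".join(joins)
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return sql + " ORDER BY p.id", params

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER,
                             PRIMARY KEY (post_id, tag_id));
    INSERT INTO posts VALUES (1), (2), (3), (4);
    INSERT INTO tags VALUES (1, 'water'), (2, 'blue'), (3, 'red'), (4, 'black');
    INSERT INTO posts_tags VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1), (4, 4);
""")

def search(include, exclude):
    sql, params = build_query(include, exclude)
    return [row[0] for row in conn.execute(sql, params)]

included_only = search(["water", "blue"], [])
with_exclusions = search(["water", "blue"], ["black", "white", "red"])
excluded_only = search([], ["red", "black"])
```

The last call corresponds to the 0:2 case in the asker's timing table: with no included tags the query degenerates to a plain NOT EXISTS filter over posts.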

Related

Select rows that matches multiple and/or conditions

So I created a sql fiddle to explain my problem much clearer:
http://sqlfiddle.com/#!9/3122282/1
As you can see I have 3 tables and 1 of them links the 2 others.
I want to express something like: "give me the products that are (color green OR red) AND pet (dog)".
I tried doing:
select `ptl`.`product_id`
from `tags` inner join `tags` as `ptl`
on `tags`.`id` = `ptl`.`tag_id`
where ((`tags`.`tag` = "color" and `tags.value` in ("green", "red"))
or (`tags`.`tag` = "pet" and `tags.value` in ("dog")))
having count(distinct `ptl.tag_id`) = 2
// 2 in that case is the number of tag "category".
But this doesn't seem to work: since HAVING only checks the count, it also returns products with 2 color tags and no pet tag.
You can join the 3 tables, group by product and set the conditions in the HAVING clause:
SELECT p.id, p.name
FROM products p
INNER JOIN product_tags_link pt ON pt.product_id = p.id
INNER JOIN tags t ON pt.tag_id = t.id
GROUP BY p.id, p.name
HAVING SUM(t.tag = 'color' AND t.value IN ('green', 'red')) > 0
AND SUM(t.tag = 'pet' AND t.value IN ('dog')) > 0
See the demo.
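To see why the conditional sums avoid the "two color tags, no pet" false positive, here is a small reproduction. It is a sketch that assumes Python's sqlite3 module with an in-memory database in place of MySQL (both treat a comparison as 0 or 1 inside SUM); the product names and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT, value TEXT);
    CREATE TABLE product_tags_link (product_id INTEGER, tag_id INTEGER);
    INSERT INTO products VALUES (1, 'leash'), (2, 'ribbon'), (3, 'bowl');
    INSERT INTO tags VALUES (1, 'color', 'green'), (2, 'color', 'red'),
                            (3, 'pet', 'dog'), (4, 'pet', 'cat');
    INSERT INTO product_tags_link VALUES
        (1, 1), (1, 3),   -- leash: green + dog
        (2, 1), (2, 2),   -- ribbon: green + red, no pet tag
        (3, 2), (3, 4);   -- bowl: red + cat
""")

rows = conn.execute("""
    SELECT p.id, p.name
    FROM products p
    INNER JOIN product_tags_link pt ON pt.product_id = p.id
    INNER JOIN tags t ON pt.tag_id = t.id
    GROUP BY p.id, p.name
    HAVING SUM(t.tag = 'color' AND t.value IN ('green', 'red')) > 0
       AND SUM(t.tag = 'pet' AND t.value IN ('dog')) > 0
""").fetchall()
```

The ribbon has two matching color tags, so a bare `COUNT(...) = 2` would accept it, whereas each SUM here has to be satisfied by its own tag category.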
You are not joining the tags table with product_tags_link and products.
Take this query as a base and add your conditions in the WHERE clause:
select *
from products p
inner join product_tags_link ptl on ptl.product_id = p.id
inner join tags t on t.id = ptl.tag_id
where CONDITIONS
An example of what CONDITIONS could be:
p.id = 1 and t.tag = 'color' and t.value = 'green'

MySQL query posts by tag

I'm trying to search for all posts for a specific tag name, whilst still being able to join all tags for the returned posts.
posts
id
...
tags
id
name
slug
posts_tags
id
post_id
tag_id
I'll do a query such as this:
SELECT * FROM posts p
INNER JOIN posts_tags pt ON pt.post_id = p.id
INNER JOIN tags t ON pt.tag_id = t.id
WHERE t.slug = 'foo'
This will return me all posts with the tag foo, but will no longer join the other tags associated with the posts. How can I write it in such a way so I can still get all tags on the posts?
For example, say I have a post which has 3 tags associated with it: cat, dog and chimp. I want to do a query for posts which have the tag dog. How can I construct a query which will fetch me the posts with the tag dog, ensuring that the cat and chimp tags are also retrieved in the result?
If you want all the posts together with all of their tags (foo as well as any others), then I think you can do a left join:
SELECT * FROM posts p
INNER JOIN posts_tags pt ON pt.post_id = p.id
LEFT JOIN tags t
ON pt.tag_id = t.id
The above will give you all the posts and the relevant tags for each post. You can order by slug, or you can add a condition to the left join to filter by the tag you need, like below:
SELECT * FROM posts p
INNER JOIN posts_tags pt ON pt.post_id = p.id
LEFT JOIN tags t
ON pt.tag_id = t.id AND t.slug IN ('foo') -- add other tags if needed
If you want all the posts that have 'foo' as a tag, then you need more complicated logic. For your purposes, I think it is probably sufficient to get the tags as a delimited list:
SELECT p.*, GROUP_CONCAT(t.slug) as tags
FROM posts p INNER JOIN
posts_tags pt
ON pt.post_id = p.id INNER JOIN
tags t
ON pt.tag_id = t.id
GROUP BY p.id
HAVING SUM(t.slug = 'foo') > 0;
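Applied to the asker's cat/dog/chimp example, the query above behaves as follows. This is a sketch assuming Python's sqlite3 module with an in-memory database as a stand-in for MySQL; the `title` column and the sample rows are hypothetical, since the question elides the columns of posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE posts_tags (id INTEGER PRIMARY KEY, post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'zoo trip'), (2, 'cat nap');
    INSERT INTO tags VALUES (1, 'Cat', 'cat'), (2, 'Dog', 'dog'), (3, 'Chimp', 'chimp');
    INSERT INTO posts_tags (post_id, tag_id) VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

rows = conn.execute("""
    SELECT p.id, GROUP_CONCAT(t.slug) AS tags
    FROM posts p
    INNER JOIN posts_tags pt ON pt.post_id = p.id
    INNER JOIN tags t ON pt.tag_id = t.id
    GROUP BY p.id
    HAVING SUM(t.slug = ?) > 0
""", ("dog",)).fetchall()

# GROUP_CONCAT order is not guaranteed without an explicit ordering, so sort client-side.
result = {post_id: sorted(tags.split(",")) for post_id, tags in rows}
```

Post 2 is dropped because none of its rows has the slug dog, while post 1 keeps all three of its tags in the concatenated list.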
If you need all the tag slugs of the posts that are tagged foo, then you could use:
select distinct tags.slug
from posts_tags
inner join (
SELECT pt.post_id from posts_tags pt
INNER JOIN tags t ON pt.tag_id = t.id
WHERE t.slug = 'foo'
) f on f.post_id = posts_tags.post_id
inner join tags on tags.id = posts_tags.tag_id
or, if you need the related posts too:
select posts.*, tags.slug
from posts_tags
inner join (
SELECT pt.post_id from posts_tags pt
INNER JOIN tags t ON pt.tag_id = t.id
WHERE t.slug = 'foo'
) f on f.post_id = posts_tags.post_id
inner join tags on tags.id = posts_tags.tag_id
inner join posts on posts.id = posts_tags.post_id

MySQL Left Join and excluding values

UPDATE
There is a database model on SQL Fiddle: http://sqlfiddle.com/#!2/8dbb0/10
I have also updated the question according to the comments.
Original Post
I have three tables:
posts
tags
tag_to_post
Let's assume tag_id 1 has been used by user 2. Now I want to show user 2 all posts that another user has tagged with tag_id 1, but that user 2 has not yet tagged with tag_id 1.
The query:
SELECT posts.id AS post_id, tags.id AS tag_id, tag_to_post.user_id AS
user_tagged_post
FROM posts
LEFT JOIN tag_to_post ON tag_to_post.post_id = posts.id
LEFT JOIN tags ON tags.id = tag_to_post.tag_id
WHERE tags.id =1
Produces something like:
post_id | tag_id | user_tagged_post
1 | 1 | 1
1 | 1 | 2
2 | 1 | 2
3 | 1 | 1
So only post id 3 should be left.
First I tried a WHERE clause like:
WHERE tags.id = 1 AND tag_to_post.user_id != '2'
But this of course doesn't exclude post_id 1, because it has a second row tagged by user 1. I think there should be a DISTINCT or GROUP BY before the WHERE clause, but that doesn't seem to be allowed. So is a sub-query the only way? I haven't found a solution so far. Any ideas?
If I understand you correctly, it would seem like a straightforward LEFT JOIN:
SELECT t1.post_id, p.title, t1.tag_id, t1.user_id
FROM tag_to_post t1
JOIN posts p ON t1.post_id = p.id
LEFT JOIN tag_to_post t2
ON t1.tag_id = t2.tag_id AND t1.post_id = t2.post_id AND t2.user_id = 2
WHERE t1.user_id <> 2 AND t2.user_id IS NULL
An SQLfiddle to test with.
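The self LEFT JOIN above is an anti-join: t2 looks for user 2's own row for the same post and tag, and only rows where none is found survive. Below is a small sketch against the sample rows from the question, assuming Python's sqlite3 module with an in-memory database in place of MySQL. The post titles are invented, and the extra `t1.tag_id = :tag` filter is an addition so the example targets tag 1 as the question describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag_to_post (post_id INTEGER, tag_id INTEGER, user_id INTEGER);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
    -- Same rows as the result table in the question.
    INSERT INTO tag_to_post VALUES (1, 1, 1), (1, 1, 2), (2, 1, 2), (3, 1, 1);
""")

rows = conn.execute("""
    SELECT t1.post_id, p.title, t1.tag_id, t1.user_id
    FROM tag_to_post t1
    JOIN posts p ON t1.post_id = p.id
    LEFT JOIN tag_to_post t2
      ON t1.tag_id = t2.tag_id AND t1.post_id = t2.post_id AND t2.user_id = :me
    WHERE t1.tag_id = :tag AND t1.user_id <> :me AND t2.user_id IS NULL
""", {"me": 2, "tag": 1}).fetchall()
```

Post 1 is removed even though user 1 tagged it, because the t2 side finds user 2's row for the same post and tag.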
Maybe you need something like this:
SELECT posts.id, posts.title, tags.tag, tag_to_post.user_id
FROM posts
INNER JOIN tag_to_post ON tag_to_post.post_id = posts.id
INNER JOIN tags ON tags.id = tag_to_post.tag_id
WHERE tags.id = 1 AND tag_to_post.user_id <> 2
Based on comments:
SELECT DISTINCT posts.id, posts.title, tags.tag, A.user_id
FROM posts
INNER JOIN tag_to_post A ON A.post_id = posts.id
INNER JOIN tags ON tags.id = A.tag_id
WHERE tags.id = 1
AND A.post_id NOT IN (SELECT post_id FROM tag_to_post WHERE tag_id = 1 AND user_id = 2)

sql many to many select with join

I am trying to write a SELECT statement and I just can't get it to work.
I have 3 tables:
places, tags, places_tags
Places:
- id
- name
Tags:
- id
- name
Places_tags:
- place_id
- tag_id
- order
I am trying to select places and join the first tag that was inserted (using order):
SELECT p.*, t.tag_id AS tag
FROM `places` as p
LEFT JOIN places_tags t ON (t.place_id = p.id)
group by p.id
That's what I have right now.
I need to add something like ORDER BY order DESC...
I think that I'm not doing it right.
Something like the following should work:
SELECT
p.name AS "place",
t.name AS "firstTag"
FROM
places p
LEFT JOIN
places_tags pt1
ON pt1.place_id = p.id
LEFT JOIN
places_tags pt2
ON pt2.place_id = p.id AND pt2.tag_id < pt1.tag_id
LEFT JOIN
tags t
ON t.id = pt1.tag_id
WHERE
pt2.tag_id IS NULL
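The query above picks the tag with the lowest tag_id per place. If "first" should instead mean the lowest value in the `order` column, as the question suggests, the same self-join trick can compare `order`. The sketch below assumes Python's sqlite3 module with an in-memory database in place of MySQL (both accept backticks around the reserved word `order`), assumes `order` values are unique within a place, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE places_tags (place_id INTEGER, tag_id INTEGER, `order` INTEGER);
    INSERT INTO places VALUES (1, 'Cafe'), (2, 'Park'), (3, 'Untagged');
    INSERT INTO tags VALUES (1, 'cozy'), (2, 'green'), (3, 'quiet');
    -- Cafe got 'quiet' first, then 'cozy'; Park got 'green' first, then 'quiet'.
    INSERT INTO places_tags VALUES (1, 3, 1), (1, 1, 2), (2, 2, 1), (2, 3, 2);
""")

rows = conn.execute("""
    SELECT p.name AS place, t.name AS first_tag
    FROM places p
    LEFT JOIN places_tags pt1 ON pt1.place_id = p.id
    LEFT JOIN places_tags pt2
           ON pt2.place_id = p.id AND pt2.`order` < pt1.`order`
    LEFT JOIN tags t ON t.id = pt1.tag_id
    WHERE pt2.place_id IS NULL   -- no row with a smaller order exists
    ORDER BY p.id
""").fetchall()
```

Comparing tag_id instead would return cozy for the Cafe, since cozy has the lower id even though quiet was attached first.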

Searching multiple rows in select with left join

I've got 3 tables, products, products_tags and tags. A product can be connected to multiple tags via the products_tags table.
But if I want to search for a product with multiple tags, I run a query like this:
SELECT
*
FROM
products
LEFT JOIN
products_tags
ON
products_tags.product_id = products.id
LEFT JOIN
tags
ON
products_tags.tag_id = tags.id
WHERE
tags.name = 'test'
AND
tags.name = 'test2'
Which doesn't work :(.
If I remove the AND tags.name = 'test2' it works, so I can only search by one tag. I ran EXPLAIN on the query and it said "Impossible WHERE".
How can i search on multiple tags using a single query?
Thanks!
Have you tried something like:
WHERE
(tags.name = 'test'
OR
tags.name = 'test2')
Or
WHERE
tags.name in( 'test', 'test2')
Because even if you join one product to multiple tags, each tag record only has one value for name.
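A quick way to see the difference is to run both conditions against the same data. The sketch below assumes Python's sqlite3 module with an in-memory database in place of MySQL, and the two sample products are invented. Note that the IN (or OR) form returns products that have either tag, so on its own it does not yet guarantee that both tags are present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products_tags (product_id INTEGER, tag_id INTEGER);
    INSERT INTO products VALUES (1, 'has both tags'), (2, 'has only test');
    INSERT INTO tags VALUES (1, 'test'), (2, 'test2');
    INSERT INTO products_tags VALUES (1, 1), (1, 2), (2, 1);
""")

base = """
    SELECT DISTINCT products.id
    FROM products
    JOIN products_tags ON products_tags.product_id = products.id
    JOIN tags ON products_tags.tag_id = tags.id
    WHERE {cond}
    ORDER BY products.id
"""

# A single joined row holds one tag name, so it can never equal both values.
and_ids = [r[0] for r in conn.execute(
    base.format(cond="tags.name = 'test' AND tags.name = 'test2'"))]
# IN matches rows carrying either name.
in_ids = [r[0] for r in conn.execute(
    base.format(cond="tags.name IN ('test', 'test2')"))]
```

Here `and_ids` is empty, matching the "Impossible WHERE" the asker saw, while `in_ids` also contains product 2, which only has the test tag.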
You need to join twice for test and test2:
select products.*
from products
join products_tags as product_tag1 on ...
join tags as tag1 on ...
join products_tags as product_tag2 on ...
join tags as tag2 on ...
where tag1.name = 'test'
and tag2.name = 'test2'
For test or test2, you need one join, an IN clause and a DISTINCT:
select distinct products.*
from products
join products_tags on ...
join tags as tags on ...
where tags.name IN('test', 'test2')
You'll have to do a GROUP BY and COUNT(*) to ensure that BOTH tags (or however many there are) are ALL found.
The inner query (PreQuery) joins the products_tags table to tags and keeps the products whose number of matching tags equals the number of tags searched for; the result is then joined to products for the final list:
SELECT STRAIGHT_JOIN
p.*
FROM
( select pt.product_id
from products_tags pt
join tags on pt.tag_id = tags.id
where tags.name in ('test', 'test2' )
group by pt.product_id
having count(*) = 2
) PreQuery
join products p on PreQuery.product_id = p.id
If you are searching for products that have BOTH the "test" and "test2" tags, then you will need to join to the products_tags and tags tables twice each.
Also, use inner joins since you only want the products that have these tags.
Example:
SELECT products.*
FROM products
INNER JOIN products_tags pt1 ON pt1.product_id = products.id
INNER JOIN products_tags pt2 ON pt2.product_id = products.id
INNER JOIN tags t1 ON t1.id = pt1.tag_id
INNER JOIN tags t2 ON t2.id = pt2.tag_id
WHERE t1.name = 'test'
AND t2.name = 'test2'