Where in with mysql or better? - mysql

Here is my actual query:
$result = mysqli_query($link,"select ids_billets from pref_tags where id='$_GET[id]'");
$tags_ids = $result->fetch_object();
$result = mysqli_query($link,"select id, html from pref_posts where id in ($tags_ids->ids_billets)");
while($posts = $result->fetch_object()) {
.....
}
I have ids in one varchar field of the pref_tags table (ids_billets) - example : "12,16,158"
Is there a better way to query this?
Thanks

Instead of one row with a comma-separated list, I would create a new table linking posts to tags, with one row per post/tag combo, i.e.:
posts
---------------------
post_id | html | etc.
posts_tags
----------------
post_id | tag_id
tags
------------------------
tag_id | tag_name | etc.
Then do something like this:
SELECT p.post_id, p.html
FROM posts p
INNER JOIN posts_tags pt
ON p.post_id = pt.post_id
INNER JOIN tags t
ON pt.tag_id = t.tag_id
WHERE t.tag_name = ?
Or if you already have the tag_id, like you seem to:
SELECT p.post_id, p.html
FROM posts p
INNER JOIN posts_tags pt
ON p.post_id = pt.post_id
WHERE pt.tag_id = ?
You can also do the same query in a different form, using a subquery:
SELECT post_id, html
FROM posts
WHERE post_id IN (SELECT post_id FROM posts_tags WHERE tag_id = ?)
Also, look at prepared statements, which will make it easy to avoid the serious SQL injection problems your current code has.
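To make the last two points concrete, here is a minimal sketch of the junction-table query with a bound parameter. It is written in Python with the standard-library sqlite3 module standing in for MySQL and mysqli, so the placeholder syntax and driver calls are an assumption for illustration; the sample rows and the helper name posts_for_tag are made up.

```python
import sqlite3

# SQLite stands in for MySQL here; the schema follows the
# posts / posts_tags / tags layout sketched in the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, html TEXT);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag_name TEXT);
    CREATE TABLE posts_tags (
        post_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL,
        PRIMARY KEY (post_id, tag_id)
    );
    INSERT INTO posts VALUES (12, '<p>a</p>'), (16, '<p>b</p>'), (158, '<p>c</p>');
    INSERT INTO tags VALUES (1, 'news'), (2, 'misc');
    INSERT INTO posts_tags VALUES (12, 1), (16, 1), (158, 2);
""")


def posts_for_tag(conn, tag_id):
    """Return (post_id, html) rows for one tag (hypothetical helper).

    tag_id is passed as a bound parameter, never pasted into the SQL text.
    """
    sql = """
        SELECT p.post_id, p.html
        FROM posts p
        INNER JOIN posts_tags pt ON p.post_id = pt.post_id
        WHERE pt.tag_id = ?
        ORDER BY p.post_id
    """
    return conn.execute(sql, (tag_id,)).fetchall()


rows = posts_for_tag(conn, 1)
# A hostile value is compared as plain data and simply matches nothing.
hostile = posts_for_tag(conn, "1 OR 1=1")
```

In PHP the same idea would go through a prepared statement rather than string interpolation of $_GET[id].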

You should not put multiple values in a single column, because doing so breaks first normal form. Reading up on first normal form will show you examples of the problems you're likely to come across, and how to fix them.
Rather, create a separate table where you can have the ID and the Tag ID in separate columns. Then you can pull back the IDs in a subquery in your second example query, and get the benefits of being able to search for and manipulate individual IDs in other queries.

Related

MySQL - Merging two queries that have different conditions and limits

I'm implementing a Tag System for my website, using PHP + MySQL.
In my database, I have three tables:
Posts
---------------------
Id | Title | DateTime
Primary Key: Id
Tags
---------------------
Id | Tag | Slug
1 | First Tag | first-tag
Primary Key: Id | Key: Slug
TagsMap
---------------------
Id | Tag
Primary Key: both
(Id = post's Id in Posts; Tag = Tag's Id in Tags)
Given, for instance, the url www. ... .net/tag/first-tag, I need to show:
the tag's name (in this case: "First Tag");
the last 30 published posts having that tag.
In order to achieve this, I'm using two different queries:
firstly
SELECT Tag FROM Tags WHERE Slug = ? LIMIT 1
then
SELECT p.Title FROM Posts p, Tags t, TagsMap tm
WHERE p.Id = tm.Id
AND p.DateTime <= NOW()
AND t.Id = tm.Tag
AND t.Slug = ?
ORDER BY p.Id DESC
LIMIT 30
But I don't think it's a good solution in terms of performance (please, correct me if I'm wrong).
So, my question is: how (if possible) to merge those two queries into just one?
Thanks in advance for Your suggestions.
The query that you have shown above uses the old comma-separated (implicit) join syntax. Logically it describes a Cartesian product of all the tables that is then filtered by the WHERE conditions. MySQL's optimizer will normally execute it like an inner join, but the syntax hides which condition joins which table, and one forgotten condition really does produce a Cartesian product, which gets slow quickly as the tables grow.
Prefer explicit joins over this approach, e.g. INNER JOIN, LEFT JOIN, RIGHT JOIN.
Try this SQL:
SELECT t.*, p.* FROM Tags t
INNER JOIN TagsMap tm ON (tm.Tag = t.Id )
INNER JOIN Posts p ON (p.Id = tm.Id AND p.DateTime <= NOW())
WHERE t.Slug = 'first-tag'
ORDER BY p.Id DESC
LIMIT 30
Given that you have structured your tables so that the key columns can be matched with their counterparts, you can make use of JOINs in your query.
SELECT
Tags.Tag,
Posts.title
FROM
Tags
LEFT JOIN
TagsMap ON Tags.id = TagsMap.tag
LEFT JOIN
Posts ON TagsMap.id = Posts.id AND
Posts.DateTime <= NOW()
WHERE
Posts.id = TagsMap.id AND
Tags.Slug = ?
ORDER BY
Posts.id DESC
LIMIT 30
The idea is that the query is optimized, but you will need to filter your result set programmatically in the view, in order to display the Tag only once.
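A minimal sketch of that "filter in the view" step, again using Python's sqlite3 as a stand-in for MySQL (so NOW() becomes datetime('now'), which is an assumption of this sketch). The helper name tag_page and the sample rows are hypothetical. Starting the LEFT JOIN chain from Tags means the tag name still comes back when no published post carries the tag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (Id INTEGER PRIMARY KEY, Title TEXT, DateTime TEXT);
    CREATE TABLE Tags (Id INTEGER PRIMARY KEY, Tag TEXT, Slug TEXT UNIQUE);
    CREATE TABLE TagsMap (Id INTEGER, Tag INTEGER, PRIMARY KEY (Id, Tag));
    INSERT INTO Posts VALUES
        (1, 'Old post', '2020-01-01 00:00:00'),
        (2, 'Newer post', '2021-01-01 00:00:00'),
        (3, 'Scheduled post', '2999-01-01 00:00:00');
    INSERT INTO Tags VALUES
        (1, 'First Tag', 'first-tag'),
        (2, 'Empty Tag', 'empty-tag');
    INSERT INTO TagsMap VALUES (1, 1), (2, 1), (3, 1);
""")


def tag_page(conn, slug, limit=30):
    """Return (tag_name, [titles]) for a slug using a single query."""
    sql = """
        SELECT t.Tag, p.Title
        FROM Tags t
        LEFT JOIN TagsMap tm ON tm.Tag = t.Id
        LEFT JOIN Posts p ON p.Id = tm.Id AND p.DateTime <= datetime('now')
        WHERE t.Slug = ?
        ORDER BY p.Id DESC
        LIMIT ?
    """
    rows = conn.execute(sql, (slug, limit)).fetchall()
    if not rows:
        return None, []  # unknown slug
    tag_name = rows[0][0]  # the tag repeats on every row; read it once
    # Rows with a NULL title come from mappings without a published post.
    titles = [title for _, title in rows if title is not None]
    return tag_name, titles
```

Whether one query actually beats two is something to measure; the first lookup on an indexed Slug is already cheap.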
If there is at most one "slug" per "post", include slug as a column in Posts.
If there can be any number of "tags" per "post", then have a table
CREATE TABLE Tags (
post_id ... NOT NULL,
tag VARCHAR(..)... NOT NULL,
post_dt DATETIME NOT NULL,
PRIMARY KEY(post_id, tag),
INDEX(tag, post_dt)
) ENGINE=InnoDB
And you may want to use LEFT JOIN Tags and GROUP_CONCAT(tag).
I don't know what you mean by "first" in "first_tag". Maybe you should get rid of "first"?
The last 30 posts for a given tag:
SELECT p.*,
( SELECT GROUP_CONCAT(tag) FROM Tags WHERE post_id = p.id ) AS tags
FROM ( SELECT post_id FROM Tags WHERE tag = ?
ORDER BY post_dt DESC LIMIT 30 ) AS x
JOIN posts AS p ON p.id = x.post_id

One query for one join that uses first result in second (subquery) to get all rows

I have been struggling with this one for a while. While I could indeed just make two separate queries for this problem, I wonder if it would be possible to do it in one query. I think a SQL pro will certainly know how to get this done. So here is the thing:
We have two tables posts and post_translations. Abbreviated for simplicity.
posts
------
id
post_translations
-----------------
id
post_id (which is the FK on posts...)
locale
slug
content
... (and so on)
It's clear for me that I could now do a very simple INNER JOIN to get all posts translated in a specific language, let's say 'en' or 'de' if you want. So I am not gonna bother you further with this.
But as the table will also hold sub-locales, such as en (for USA), en-GB, en-AU, de (for Germany), de-AT, de-CH, the whole thing becomes a bit more complex for the following scenario.
Let's say some posts are translated in 'de' only and others in both 'de' and 'de-AT'. The table would then look like:
post_translations
id post_id locale content
1 1 de ...
2 1 de-AT ...
3 2 de ...
So post 1 is available in 'de' and 'de-AT'. Post 2 is only available in 'de'.
If I want to have all posts in 'de' that's easy. I just add a WHERE locale = 'de' and I am good. But let's say I want all posts in 'de-AT' and all other posts in 'de' that are not translated in 'de-AT' so I don't get any duplicates - how can I achieve that in one query? As mentioned earlier, I could run two queries here, first I get all the posts in 'de-AT', then I get all the posts in 'de' and use the 'post_id's I got from the first query with a WHERE not IN query, so I don't get any duplicates.
These queries would be then:
SELECT
pt.post_id
FROM
posts AS p
INNER JOIN
post_translations AS pt ON p.id = pt.post_id
WHERE
pt.locale = 'de-AT';
and from this query you would use the post_id in this one:
SELECT
*
FROM
posts AS p
INNER JOIN
post_translations AS pt ON p.id = pt.post_id
WHERE
pt.locale = 'de' AND pt.post_id NOT IN (*post_ids found in first search*);
So staying with the above mentioned post_translations table the desired result would be:
p.id pt.id pt.post_id pt.locale pt.content
1 2 1 de-AT ...
2 3 2 de ...
p stands for posts and pt for post_translations obviously.
The idea behind the query is to show the specific posts for a region, in this case 'de-AT' but also to show the generic posts that were written for 'de' users.
I hope that makes sense. Would appreciate any help on this. Thank You.
One option uses not exists:
select pt.*
from post_translations pt
where
locale = 'de-AT'
or (
locale = 'de'
and not exists (
select 1
from post_translations pt1
where pt1.post_id = pt.post_id and pt1.locale = 'de-AT'
)
)
Alternatively, if you are running MySQL 8.0, you can also use row_number():
select *
from (
select
pt.*,
row_number() over(partition by post_id order by (locale = 'de')) rn
from post_translations pt
where locale in ('de', 'de-AT')
) pt
where rn = 1
You can easily modify the above queries to join the posts table.
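As a quick check of the NOT EXISTS variant against the sample rows from the question, here is a small sketch in Python with sqlite3 standing in for MySQL. The function name translations and the named parameters are illustrative assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_translations (
        id INTEGER PRIMARY KEY, post_id INTEGER, locale TEXT, content TEXT);
    INSERT INTO post_translations VALUES
        (1, 1, 'de', '...'),
        (2, 1, 'de-AT', '...'),
        (3, 2, 'de', '...');
""")


def translations(conn, locale, fallback):
    """One row per post: the regional translation if present, else the fallback."""
    sql = """
        SELECT pt.id, pt.post_id, pt.locale
        FROM post_translations pt
        WHERE pt.locale = :locale
           OR (pt.locale = :fallback
               AND NOT EXISTS (
                   SELECT 1
                   FROM post_translations pt1
                   WHERE pt1.post_id = pt.post_id
                     AND pt1.locale = :locale))
        ORDER BY pt.post_id
    """
    return conn.execute(sql, {"locale": locale, "fallback": fallback}).fetchall()


result = translations(conn, "de-AT", "de")
# result == [(2, 1, 'de-AT'), (3, 2, 'de')], matching the desired output.
```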
"show the specific posts for a region, in this case 'de-AT' but also to show the generic posts that were written for 'de' users"
I believe that a simple WHERE clause with the operator IN would return your expected results.
Then you can use conditional aggregation to flag the posts that have the sub locale that you want and maybe sort these posts first:
SELECT post_id,
MAX(locale = 'de-AT') AS flag
FROM post_translations
WHERE locale IN ('de', 'de-AT')
GROUP BY post_id
ORDER BY flag DESC
You can join the above query to posts to get the details of each post:
SELECT p.*, t.flag
FROM posts AS p INNER JOIN (
SELECT post_id,
MAX(locale = 'de-AT') AS flag
FROM post_translations
WHERE locale IN ('de', 'de-AT')
GROUP BY post_id
) t ON t.post_id = p.id
ORDER BY t.flag DESC
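A small sketch of what the flag looks like in practice, using Python's sqlite3 as a stand-in for MySQL (both evaluate locale = 'de-AT' to 0 or 1). The extra 'en' row and the helper name flagged_posts are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE post_translations (
        id INTEGER PRIMARY KEY, post_id INTEGER, locale TEXT, content TEXT);
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO post_translations VALUES
        (1, 1, 'de', '...'),
        (2, 1, 'de-AT', '...'),
        (3, 2, 'de', '...'),
        (4, 3, 'en', '...');
""")


def flagged_posts(conn, locale, fallback):
    """Return (post_id, flag); flag is 1 when the post has the regional locale."""
    sql = """
        SELECT p.id, t.flag
        FROM posts AS p
        INNER JOIN (
            SELECT post_id, MAX(locale = ?) AS flag
            FROM post_translations
            WHERE locale IN (?, ?)
            GROUP BY post_id
        ) AS t ON t.post_id = p.id
        ORDER BY t.flag DESC, p.id
    """
    return conn.execute(sql, (locale, fallback, locale)).fetchall()


flags = flagged_posts(conn, "de-AT", "de")
# Post 3 only exists in 'en', so it is filtered out by the IN clause.
```

Note that this returns one row per post with a flag, not the translation row itself, so a further join on (post_id, locale) would still be needed to fetch the content.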

mysql - select posts all that are not tagged hidden

So I have three tables. One is posts, with columns id, title, content, timestamp. Another is tags, with columns id, tag. The third, posttags, describes the many-to-many relation between posts and tags and has columns postid, tagid.
Now instead of having columns like hidden,featured etc in the table posts to describe whether a post should be visible to all or should be displayed on a special featured page, I thought why not use tags to save time. So what I decided is that all posts that have a tag #featured will be featured and all posts with tag #hidden will be hidden.
Implementing first one was easy as I could use a join query and in my where clause I could mention WHERE tag='featured' and this would get all the featured posts for me.
But take an example of a post tagged #sports and #hidden if I were to use the query
SELECT * FROM posts
INNER JOIN posttags ON posttags.postid = posts.id
INNER JOIN tags ON posttags.tagid = tags.id
WHERE tag !='hidden'
but that'd still return the post tagged hidden since its also tagged sports
PS my question is different from this question : Select a post that does not have a particular tag since it uses tagid directly and I'm unable to achieve same result using double join to check against tag name instead of tagid. And also I wish to retrieve the other tags of the post in same query which is not possible using the method in that question's answers
Group the tags by post, then use the HAVING clause to filter the groups for those that do not contain a 'hidden' tag. Because of MySQL's implicit type conversion and lack of genuine boolean types, one can do:
SELECT posts.*
FROM posts
JOIN posttags ON posttags.postid = posts.id
JOIN tags ON posttags.tagid = tags.id
GROUP BY posts.id
HAVING NOT SUM(tag='hidden')
You can do this with a NOT EXISTS subquery:
SELECT p.*, t.* -- what columns you need
FROM posts AS p
INNER JOIN posttags AS pt
ON pt.postid = p.id
INNER JOIN tags AS t
ON pt.tagid = t.id
WHERE NOT EXISTS
( SELECT *
FROM posttags AS pt_no
INNER JOIN tags AS t_no
ON pt_no.tagid = t_no.id
WHERE t_no.tag = 'hidden'
AND pt_no.postid = p.id
) ;
or the equivalent LEFT JOIN / IS NULL:
SELECT p.*, t.*
FROM posts AS p
LEFT JOIN posttags AS pt_no
INNER JOIN tags AS t_no
ON t_no.tag = 'hidden'
AND pt_no.tagid = t_no.id
ON pt_no.postid = p.id
INNER JOIN posttags AS pt
ON pt.postid = p.id
INNER JOIN tags AS t
ON pt.tagid = t.id
WHERE pt_no.postid IS NULL ;
This type of query is called an anti-semijoin or just an anti-join. It's slightly more complex in your case because the condition (tag='hidden') is in a 3rd table.
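Here is a small sketch that runs the NOT EXISTS form and the GROUP BY / HAVING form side by side on made-up data, with Python's sqlite3 standing in for MySQL. The helper names and sample posts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE posttags (
        postid INTEGER, tagid INTEGER, PRIMARY KEY (postid, tagid));
    INSERT INTO posts VALUES (1, 'Secret match report'), (2, 'Transfer news');
    INSERT INTO tags VALUES (1, 'sports'), (2, 'hidden'), (3, 'news');
    INSERT INTO posttags VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")


def visible_posts_with_tags(conn, excluded_tag):
    """Anti-join: every (post id, tag) pair of posts lacking excluded_tag."""
    sql = """
        SELECT p.id, t.tag
        FROM posts AS p
        INNER JOIN posttags AS pt ON pt.postid = p.id
        INNER JOIN tags AS t ON pt.tagid = t.id
        WHERE NOT EXISTS (
            SELECT 1
            FROM posttags AS pt_no
            INNER JOIN tags AS t_no ON pt_no.tagid = t_no.id
            WHERE t_no.tag = ?
              AND pt_no.postid = p.id)
        ORDER BY p.id, t.tag
    """
    return conn.execute(sql, (excluded_tag,)).fetchall()


def visible_post_ids(conn, excluded_tag):
    """Same filter via GROUP BY / HAVING; returns post ids only."""
    sql = """
        SELECT posts.id
        FROM posts
        JOIN posttags ON posttags.postid = posts.id
        JOIN tags ON posttags.tagid = tags.id
        GROUP BY posts.id
        HAVING NOT SUM(tags.tag = ?)
        ORDER BY posts.id
    """
    return [row[0] for row in conn.execute(sql, (excluded_tag,))]


pairs = visible_posts_with_tags(conn, "hidden")
ids = visible_post_ids(conn, "hidden")
```

The NOT EXISTS form keeps the other tags of each visible post in the result, which is what the question asked for. Both forms use inner joins, so a post with no tags at all is not returned.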

Select data from the ends of a many-to-many relationship

I have a database of posts, each of which have tags. These tables are named Posts and Tags respectively. I also have a third table, called Posts_Tags which maintains a many-to-many relationship between these two tables.
In order to do this, both my posts and my tags tables have an id column. My Posts_Tags table, therefore, has both a postid and tagid column to store the mappings.
I am querying, for example, all posts with the word "class" in the title. I can do this easily with this query:
SELECT * FROM Posts WHERE title LIKE '%{class}%'
However, now I want to query all posts which not only have "class" in the title, but are also tagged with the "Java" tag. I could do this in two separate queries, where I first get the id of the Java tag:
SELECT id FROM Tags WHERE name='Java'
Then I could plug that into my first query, like this:
SELECT Posts.*
FROM Posts
INNER JOIN Posts_Tags
ON Posts.id=Posts_Tags.postid
WHERE Posts_Tags.tagid='$java_tag_id'
AND title LIKE '%{class}%'
However, I know I can do this in a single query, I just don't know how. I still have to think a lot about joins when doing just one, and doing multiple joins in the same query makes my head spin. How should I structure my query to perform this operation?
SELECT p.*
FROM Posts p
JOIN Posts_Tags pt
ON pt.postid = p.id
JOIN Tags t
ON t.id = pt.tagid
WHERE t.name = 'Java'
AND p.title LIKE '%{class}%';
SELECT
p.*
FROM Posts AS p
INNER JOIN Posts_Tags pt ON pt.postid = p.id
INNER JOIN Tags AS t ON pt.tagid = t.id
WHERE t.name = 'Java'
AND p.title LIKE '%keyword%';
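For reference, a runnable sketch of the single-query version with both the tag name and the title keyword passed as bound parameters. It uses Python's sqlite3 as a stand-in for MySQL, with the column names from the question (postid, tagid, name); the sample rows and the function name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Posts_Tags (
        postid INTEGER, tagid INTEGER, PRIMARY KEY (postid, tagid));
    INSERT INTO Posts VALUES
        (1, 'Abstract class basics'),
        (2, 'Class methods'),
        (3, 'Streams');
    INSERT INTO Tags VALUES (1, 'Java'), (2, 'Python');
    INSERT INTO Posts_Tags VALUES (1, 1), (2, 2), (3, 1);
""")


def posts_by_tag_and_title(conn, tag_name, keyword):
    """Posts carrying tag_name whose title contains keyword."""
    sql = """
        SELECT p.id, p.title
        FROM Posts p
        JOIN Posts_Tags pt ON pt.postid = p.id
        JOIN Tags t ON t.id = pt.tagid
        WHERE t.name = ?
          AND p.title LIKE ?
        ORDER BY p.id
    """
    # The wildcards belong to the bound value, not to the SQL text.
    return conn.execute(sql, (tag_name, f"%{keyword}%")).fetchall()


matches = posts_by_tag_and_title(conn, "Java", "class")
```

Post 2 also has "class" in its title but is tagged Python, and post 3 is tagged Java without the keyword, so only post 1 comes back.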

Join on multiple rows

I'm trying to load rows from a posts table based on whether they have multiple rows in another table. Take the below table structures:
posts
post_id post_title
-------------------
1 My Post
2 Another Post
post_tags
post_tag_id post_tag_name
--------------------------
1 My Tag
2 Another Tag
postTags
postTag_id postTag_tag_id postTag_post_id
------------------------------------------
1 1 1
2 2 1
Unsurprisingly, posts and post_tags store the posts and tags, and postTags records which posts have which tags.
What I'd normally do to join the tables is this:
SELECT * FROM (`posts`)
JOIN `postTags` ON (`postTag_post_id` = `post_id`)
JOIN `post_tags` ON (`post_tag_id` = `postTag_tag_id`)
Then I'd have information on the tags, and can have additional stuff later in the query to search tag names for search terms etc, and then GROUP once I have posts that match the search terms.
What I'm trying to do is only select from posts where a post has both tag 1 AND tag 2, and I can't work out the SQL for it. I think it needs to be done in the actual JOIN rather than having a WHERE clause for it as when I run the join above I'd obviously get two rows back, so I can't have something like
WHERE post_tag_id = 1 AND post_tag_id = 2
as each row will only have one post_tag_id, and I can't check different values for the same column in one row.
What I've tried to do is something like this:
SELECT * FROM (`posts`)
JOIN `postTags` ON (postTag_tag_id = 1 AND postTag_tag_id = 2)
JOIN `post_tags` ON (`post_tag_id` = `postTag_tag_id`)
but this is returning 0 results when I run it; I've put conditions like this in JOINS before for similar things and I'm sure it's close but can't quite work out what to do if this doesn't work.
Am I at least on the right track? Hopefully I'm not missing something obvious.
Thanks.
You are asking a single postTags row to be two things at the same time.
You either need to look up postTags twice, once per tag, so that you get both, or you can accept a post that has either of the two tags and require that the number of matching tags equals two (assuming a post cannot be related to the same tag more than once).
First approach:
SELECT *
FROM `posts` as p
WHERE p.`post_id` IN (SELECT pt.`postTag_post_id`
FROM `postTags` as pt
WHERE pt.`postTag_tag_id` = 1)
AND p.`post_id` IN (SELECT pt.`postTag_post_id`
FROM `postTags` as pt
WHERE pt.`postTag_tag_id` = 2);
Second approach:
SELECT *
FROM posts as p
WHERE p.post_id IN (SELECT pt.postTag_post_id
FROM (SELECT count(0) as c, pt.postTag_post_id
FROM postTags as pt
WHERE pt.postTag_tag_id IN (1, 2)
GROUP BY pt.postTag_post_id
HAVING c = 2) as pt);
I want also to add that if you use IN or EXISTS in the first approach then you won't have multiple lines for the same post row just because you have more than one tag. This way you save one DISTINCT later that would make your query slower.
I've used an IN in the second approach just as a rule of thumb I use: if you don't need to show the data you don't need to do a JOIN in the FROM section.
SELECT p.*, t1.*, t2.* FROM posts p
INNER JOIN postTags pt1 ON pt1.postTag_post_id = p.post_id AND pt1.postTag_tag_id = 1
INNER JOIN postTags pt2 ON pt2.postTag_post_id = p.post_id AND pt2.postTag_tag_id = 2
INNER JOIN post_tags t1 ON t1.post_tag_id = pt1.postTag_tag_id
INNER JOIN post_tags t2 ON t2.post_tag_id = pt2.postTag_tag_id
Without actually building a db the same as yours this is hard to verify but it should work.
Let me start by saying that this type of query is much easier and much more performant in a database that supports analytic queries (Oracle, MS SQL Server, and MySQL from 8.0 on). In older MySQL versions you have to do it the old, crappy, aggregate way.
I also want to say that having a table that stores the names of the tags in post_tags and then the mapping of post tags to posts in postTags is confusing. If it were me, I would change the name of the mapping table to post_tags_map or post_tags_to_post_map. So you would have posts with post_id, post_tags with post_tags_id, and post_tags_map with post_tags_map_id. And those id columns would be named the same in every table. Having the same column that is named differently in other tables is also confusing.
Anyways, let's solve your problem.
First you want a result set that is 1 post id per row, and only the posts that have tags 1 & 2.
select postTag_post_id, count(1) cnt from (
select postTag_post_id from postTags where postTag_tag_id in (1, 2)
) x group by postTag_post_id;
That should give you back data like this:
postTag_post_id | cnt
1 | 2
Then you can join that result set back to your posts table.
select * from posts p,
(
select postTag_post_id, count(1) cnt from (
select postTag_post_id from postTags where postTag_tag_id in (1, 2)
) x group by postTag_post_id
) t
where p.post_id = t.postTag_post_id
and t.cnt >= 2;
If you need to do another join to the post_tags table in order to get the postTag_tag_id from the post_tag_name, your inner most query would change like so:
select postTag_post_id
from postTags a,
post_tags b
where a.postTag_tag_id = b.post_tag_id
and b.post_tag_name in ('tag 1', 'tag 2');
That should do the trick.
Assuming you already know tag IDs (1 and 2), you could do something like this:
SELECT post_id, post_title
FROM posts JOIN postTags ON (postTag_post_id = post_id)
WHERE postTag_tag_id IN (1, 2)
GROUP BY post_id, post_title
HAVING COUNT(DISTINCT postTag_tag_id) = 2
NOTE: DISTINCT is not necessary if there is an alternate key on postTags {postTag_tag_id, postTag_post_id}, as it should be.
NOTE: If you don't have tag IDs (and just have tag names), you'll need another JOIN (towards the post_tags table).
BTW, you should seriously consider ditching the surrogate PK in the junction table (postTags.postTag_id) and just having the natural PK {postTag_tag_id, postTag_post_id}. InnoDB tables are clustered, and secondary indexes in clustered tables are fatter and slower than in heap-based tables. Also, some queries can benefit from storing posts tagged by the same tag physically close together (or storing tags of the same post close together, if you reverse the PK).
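To tie the GROUP BY / HAVING COUNT approach together, here is a minimal sketch generalized to any list of tag IDs, using Python's sqlite3 as a stand-in for MySQL. The table and column names come from the question; the helper name posts_with_all_tags and the third mapping row are assumptions added so the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_title TEXT);
    CREATE TABLE post_tags (post_tag_id INTEGER PRIMARY KEY, post_tag_name TEXT);
    CREATE TABLE postTags (
        postTag_id INTEGER PRIMARY KEY,
        postTag_tag_id INTEGER,
        postTag_post_id INTEGER);
    INSERT INTO posts VALUES (1, 'My Post'), (2, 'Another Post');
    INSERT INTO post_tags VALUES (1, 'My Tag'), (2, 'Another Tag');
    INSERT INTO postTags VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2);
""")


def posts_with_all_tags(conn, tag_ids):
    """Posts that carry every tag in tag_ids (relational division)."""
    tag_ids = list(dict.fromkeys(tag_ids))  # drop duplicate ids, keep order
    placeholders = ", ".join("?" for _ in tag_ids)
    # Only the "?" markers are formatted into the SQL; values stay bound.
    sql = f"""
        SELECT post_id, post_title
        FROM posts
        JOIN postTags ON postTag_post_id = post_id
        WHERE postTag_tag_id IN ({placeholders})
        GROUP BY post_id, post_title
        HAVING COUNT(DISTINCT postTag_tag_id) = ?
        ORDER BY post_id
    """
    return conn.execute(sql, (*tag_ids, len(tag_ids))).fetchall()


both = posts_with_all_tags(conn, [1, 2])
only_first = posts_with_all_tags(conn, [1])
```

Comparing the distinct count against the length of the list is what turns "any of these tags" into "all of these tags".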