How to select parent rows, whose children contain a set of subvalues? - mysql

I need to select rows from images where the set of tags belonging to an image contains at least all of the tags specified in a list of strings.
CREATE TABLE images (
image_checksum varchar(56) NOT NULL,
filename varchar(56),
PRIMARY KEY (image_checksum)
);
CREATE TABLE tags (
id int NOT NULL AUTO_INCREMENT,
name varchar(64),
confidence DECIMAL(5,2),
image varchar(56) NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY (image) REFERENCES images(image_checksum)
);
I have this query, which returns all images that have ANY of the tags specified in the list. The list has variable length, depending on what comes in from the client. My database currently holds two images: one of a dog, one of a cat. With the query I need, I would expect zero results, because neither image is tagged with both a dog AND a cat.
SELECT DISTINCT images.image_checksum, images.filename, tags.name, tags.confidence from images
LEFT OUTER JOIN tags ON (tags.image = images.image_checksum)
WHERE name in ('dog','cat');
Any help is appreciated!

You want window functions to count the number of matching tags. Then use that for filtering:
SELECT it.*
FROM (SELECT i.image_checksum, i.filename, t.name, t.confidence,
COUNT(*) OVER (PARTITION BY i.image_checksum) as num_tags
FROM images i JOIN
tags t
ON t.image = i.image_checksum
WHERE t.name in ('dog', 'cat')
) it
WHERE num_tags = 2;
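The counting idea can also be tried without window functions. The sketch below is not the query from the answer: it swaps in GROUP BY with HAVING COUNT(DISTINCT ...), runs against an in-memory SQLite database as a stand-in for MySQL, and builds the IN list from a variable-length Python list. The helper name images_with_all_tags, the third image ('both.jpg') and the checksums are hypothetical and only there for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema trimmed to what the filter needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (image_checksum TEXT PRIMARY KEY, filename TEXT);
    CREATE TABLE tags (
        id INTEGER PRIMARY KEY,
        name TEXT,
        confidence REAL,
        image TEXT NOT NULL REFERENCES images(image_checksum)
    );
    INSERT INTO images VALUES ('a1', 'dog.jpg'), ('b2', 'cat.jpg'), ('c3', 'both.jpg');
    INSERT INTO tags (name, confidence, image) VALUES
        ('dog', 99.0, 'a1'),
        ('cat', 98.0, 'b2'),
        ('dog', 97.0, 'c3'),
        ('cat', 96.0, 'c3'),
        ('grass', 80.0, 'c3');
""")


def images_with_all_tags(conn, wanted):
    """Return (checksum, filename) for images carrying every tag in `wanted`."""
    wanted = sorted(set(wanted))  # de-duplicate so the count comparison is exact
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT i.image_checksum, i.filename
        FROM images i
        JOIN tags t ON t.image = i.image_checksum
        WHERE t.name IN ({placeholders})
        GROUP BY i.image_checksum, i.filename
        HAVING COUNT(DISTINCT t.name) = ?
        ORDER BY i.image_checksum
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


both = images_with_all_tags(conn, ["dog", "cat"])
only_dog = images_with_all_tags(conn, ["dog"])
```

COUNT(DISTINCT t.name) keeps the comparison correct even if an image carries the same tag twice, which a plain COUNT(*) would over-count.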

You could use group_concat for this particular problem.
SELECT images.image_checksum, images.filename, tags.name,
tags.confidence
from images
LEFT OUTER JOIN tags ON (tags.image = images.image_checksum)
WHERE tags.image in (select t1.image
from tags t1
group by t1.image
having group_concat(t1.name order by t1.name asc) like '%cat,dog%');
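This returns every image that has both tags, together with all the tags belonging to those images.
The tags in the pattern must be listed in alphabetical order, so that they line up with the sorted output of group_concat. Note that the pattern only matches when the searched tags end up next to each other in the sorted list: an image tagged cat, cow and dog produces 'cat,cow,dog', which does not contain 'cat,dog'.
By default, group_concat uses a comma as the separator between values.
This can be overridden with the SEPARATOR keyword: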
group_concat(tags.name SEPARATOR ', ')
More info can be obtained here

Related

SQL select DISTINCT values from multiple columns

I hope I worded the title correctly.
I'm trying to implement tags for posts, as on Instagram, for example.
A user can add up to 3 tags. I save each of them in a separate database column (tag1/tag2/tag3) and later want to display only the distinct values with their total counts, no matter which column they are in.
For example, I have 2 different MySQL rows (posts):
row 1 has: tag1 = house, tag2 = kitchen, tag3 = null
row 2 has: tag1 = home, tag2 = garden, tag3 = house
I want to display house(2)/kitchen(1)/garden(1)/home(1).
The result I get is house(1)/kitchen(1)/garden(1)/home(1)/house(1), because the two house values are in different columns.
I have a database table (diy_posts) with, among others, the columns id, tag1, tag2 and tag3.
My idea of sql query:
SELECT DISTINCT p.tag1 as tag1, p.tag2 as tag2, p.tag3 as tag3,
SUM(CASE WHEN p.tag1=tag1 OR p.tag2=tag2 OR p.tag3=tag3 THEN 1 ELSE 0 END) as count FROM diy_posts p GROUP BY p.id
And displaying them like:
foreach ($diyTags as $tag) {
echo $tag['tag1']; echo $tag['count'];
echo $tag['tag2']; echo $tag['count'];
echo $tag['tag3']; echo $tag['count'];
}
You can do this by unpivoting the table:
select tag, count(*)
from ((select p.tag1 as tag from diy_posts p) union all
(select p.tag2 as tag from diy_posts p) union all
(select p.tag3 as tag from diy_posts p)
) pt
where tag is not null
group by tag;
The need to do this suggests that you may not have the right data model. The more typical model would be:
create table postTags (
postTagid int auto_increment primary key,
postId int,
tag_number int,
tag varchar(255)
);
I do note that this will require a trigger to limit the number of tags to three -- if that is, indeed, desirable. This also makes it possible for the database to prevent duplicate tags, simply by defining a unique constraint or index.
Your desired output is not possible with the DISTINCT keyword.
Also, the table schema is wrong for the desired output: it should be vertical (one row per tag) instead of horizontal (one column per tag).
With the existing table structure you can still get the desired output by doing the comparison and counting on the server side (PHP, it seems, in your case).
As far as I can tell, that is the only way to achieve the output with the existing schema.
However, that will hurt performance as the number of tags and posts grows, so you should change your table schema, or move the tags into a child table.
Turn it into a single column and do the sum:
select tag, count(*) as counts
from
(
SELECT p.tag1 as tag FROM diy_posts
UNION ALL
SELECT p.tag2 as tag FROM diy_posts
UNION ALL
SELECT p.tag3 as tag FROM diy_posts
) tmp
GROUP BY tag;

Filtering MySQL Select based on joined table's fields

I have two tables. Created as follows.
CREATE TABLE item (
id INT AUTO_INCREMENT,
value VARCHAR(64),
PRIMARY KEY(id)
)
CREATE TABLE tag (
name VARCHAR(32),
item_id INT /* id of element in item table */
)
I have a select statement that returns a list of elements in the 'item' table along with all the elements of the 'tag' table linking to that table. It is filtered on the contents of the item.value field.
SELECT id,value,GROUP_CONCAT(tag.name) FROM item
LEFT JOIN tag ON tag.item_id = id
WHERE value LIKE '%test%'
All good so far. Now I want to do the same, but get a list of all the item rows that have a certain tag associated with them. So I replace the WHERE clause with
WHERE tag.name='test'
This gives me a list of all the 'item' rows that have the tag 'test', but the grouped tag list that comes along with each one only includes the tag 'test'.
How do I get a list of all the rows of the 'item' table that have the tag 'test', along with the full grouped tag list?
First, you should have a GROUP BY in your original query:
SELECT i.id, i.value, GROUP_CONCAT(t.name)
FROM item i LEFT JOIN
tag t
ON t.item_id = i.id
WHERE i.value LIKE '%test%'
GROUP BY i.id, i.value
To get only rows that have a certain tag, add:
HAVING SUM(t.name = 'test') > 0
after the GROUP BY.
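A small sketch of why the HAVING filter keeps the full tag list while the WHERE filter does not, again on an in-memory SQLite database as a stand-in for MySQL (both treat the comparison t.name = 'test' as 0 or 1, so SUM counts the matches). The sample rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE tag (name TEXT, item_id INTEGER);
    INSERT INTO item (id, value) VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO tag (name, item_id) VALUES
        ('test', 1), ('red', 1),
        ('blue', 2),
        ('test', 3);
""")

# The filter runs after grouping, so every tag of a matching item is still in the group.
rows = conn.execute("""
    SELECT i.id, i.value, GROUP_CONCAT(t.name) AS tags
    FROM item i
    LEFT JOIN tag t ON t.item_id = i.id
    GROUP BY i.id, i.value
    HAVING SUM(t.name = 'test') > 0
    ORDER BY i.id
""").fetchall()

# GROUP_CONCAT order is not guaranteed here, so compare the tags as sets.
tags_by_item = {item_id: set(tags.split(",")) for item_id, _, tags in rows}
```

Item 1 comes back with both of its tags, item 3 with its single tag, and item 2 is dropped because none of its tags is 'test'.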

SELECT in CONCAT

I need to concatenate two columns and I use concat as I see that this function can help me.
An example for concat function is:
SELECT CONCAT(column1,'SEPARATOR',column2) FROM table
And my query is like this:
SELECT
parent_id AS keep_in_mind_parent_id,
(SELECT name FROM table WHERE id = keep_in_mind_parent_id)
FROM
table
WHERE
id = 3
How should I concatenate those columns? I tried it with CONCAT, but it doesn't seem to work.
I'm assuming the table is a tree modeled with parent links (or an "adjacency" model), something like this:
CREATE TABLE `table` (
id INTEGER UNSIGNED PRIMARY KEY,
parent_id INTEGER UNSIGNED NOT NULL,
name VARCHAR(31) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
UNIQUE (parent_id, name)
);
If you want to find the row of a given table that is the parent of another row in the same table, you can INNER JOIN a table to itself on the parent link. Then you can distinguish values in the parent and child tables by the aliases you give in the JOIN. To concatenate the names of the parent and child rows for the child whose id is 3, the join would look like this:
SELECT child.parent_id, child.id AS id,
CONCAT(parent.name, ':', child.name) AS path,
CONCAT(parent.name, ' (#', parent.id, ')') AS parent_name_id
FROM `table` AS child
INNER JOIN `table` AS parent ON child.parent_id = parent.id
WHERE child.id = 3
Then tweak the CONCAT expressions to suit the exact format you want.
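The self-join can be tried out with the sketch below. It makes two assumptions that are not in the answer: it runs on in-memory SQLite rather than MySQL, so the || operator stands in for CONCAT(), and the table is called node (hypothetical) to avoid quoting the reserved word table. The three sample rows are made up, with the root pointing at itself because parent_id is NOT NULL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `node` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        UNIQUE (parent_id, name)
    );
    INSERT INTO node (id, parent_id, name) VALUES
        (1, 1, 'root'),
        (2, 1, 'docs'),
        (3, 2, 'readme');
""")

# Join the table to itself: `child` is the row asked for, `parent` is the row it points to.
row = conn.execute("""
    SELECT child.parent_id,
           child.id,
           parent.name || ':' || child.name AS path,
           parent.name || ' (#' || parent.id || ')' AS parent_name_id
    FROM node AS child
    INNER JOIN node AS parent ON child.parent_id = parent.id
    WHERE child.id = ?
""", (3,)).fetchone()

print(row)
```

In MySQL the two concatenations would be written with CONCAT(), as in the query above.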
I think you might want group_concat(). Without sample data and desired results, it is hard to tell, but one likely possibility is:
SELECT parent_id AS keep_in_mind_parent_id,
(SELECT group_concat(name) FROM table WHERE id = keep_in_mind_parent_id)
FROM table
WHERE id = 3;

Sql query - Select elements such that a condition is not met

I have this query in sqlite:
SELECT
'L_MEDIA_ARTIST'.'MEDIA_ID'
FROM
'L_MEDIA_ARTIST',
'L_ARTIST_CAT',
'ARTIST_CAT'
WHERE
'L_ARTIST_CAT'.'ART_ID' == 'L_MEDIA_ARTIST'.'ART_ID'
AND
'L_ARTIST_CAT'.'ART_CAT_ID' == 'ARTIST_CAT'.'ID'
AND
('ARTIST_CAT'.'NAME' == 'SINGER' OR 'ARTIST_CAT'.'NAME' == 'ACTOR')
which just selects all the media IDs such that the artist has at least one of the tags 'SINGER' or 'ACTOR'.
How can I change this query in order to obtain the list of all media such that the artist has neither the tag 'SINGER' nor the tag 'ACTOR'?
The tables involved are built as follows:
CREATE TABLE 'L_MEDIA_ARTIST' (
'MEDIA_ID' INTEGER,
'ART_ID' INTEGER,
FOREIGN KEY('MEDIA_ID') REFERENCES MEDIA('ID'),
FOREIGN KEY('ART_ID') REFERENCES ARTIST('ID'),
UNIQUE('MEDIA_ID', 'ART_ID'));
CREATE TABLE 'L_ARTIST_CAT' (
'ART_ID' INTEGER,
'ART_CAT_ID' INTEGER,
FOREIGN KEY('ART_ID') REFERENCES ARTIST('ID'),
FOREIGN KEY('ART_CAT_ID') REFERENCES ARTIST_CAT('ID'),
UNIQUE('ART_ID', 'ART_CAT_ID'));
CREATE TABLE 'ARTIST_CAT' (
'ID' INTEGER PRIMARY KEY,
'NAME' TEXT NOT NULL UNIQUE);
You need an aggregation query for this, because you have to check that none of the values for a media item are in the list. Looking at just one row doesn't provide enough information:
SELECT l.MEDIA_ID
FROM L_MEDIA_ARTIST l JOIN
L_ARTIST_CAT ac
ON l.ART_ID = ac.ART_ID JOIN
ARTIST_CAT c
ON ac.ART_CAT_ID = c.ID
GROUP BY l.MEDIA_ID
HAVING SUM(CASE WHEN c.Name IN ('SINGER', 'ACTOR') THEN 1 ELSE 0 END) = 0;
Note that I also fixed the query:
Introduced proper join syntax. You should learn modern join syntax.
Added table aliases so the query is easier to write and to read.
Removed the single quotes around table and column names, which just cause syntax errors.
The HAVING clause counts the number of times that "SINGER" and "ACTOR" are found in the data. The = 0 ensures there are none for a given media.
The media IDs that you do not want can be retrieved with this query:
SELECT L_Media_Artist.Media_ID
FROM L_Media_Artist
JOIN L_Artist_Cat USING (Art_ID)
JOIN Artist_Cat ON L_Artist_Cat.Art_Cat_ID = Artist_Cat.ID
WHERE Artist_Cat.Name IN ('SINGER', 'ACTOR')
(This is the same as your first query.)
So you want all media that are not one of those:
SELECT ID
FROM Media
WHERE ID NOT IN (SELECT L_Media_Artist.Media_ID
FROM L_Media_Artist
JOIN L_Artist_Cat USING (Art_ID)
JOIN Artist_Cat ON L_Artist_Cat.Art_Cat_ID = Artist_Cat.ID
WHERE Artist_Cat.Name IN ('SINGER', 'ACTOR'))
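The two answers are not guaranteed to return the same rows, and a small experiment shows where they can differ. This is a sketch under assumptions: the MEDIA table is reduced to a bare ID column, foreign keys are left out, and the sample data (three media, one artist without any category) is made up to hit the edge case.

```python
import sqlite3

# Assumption: simplified schema without foreign keys; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MEDIA (ID INTEGER PRIMARY KEY);
    CREATE TABLE L_MEDIA_ARTIST (MEDIA_ID INTEGER, ART_ID INTEGER, UNIQUE (MEDIA_ID, ART_ID));
    CREATE TABLE L_ARTIST_CAT (ART_ID INTEGER, ART_CAT_ID INTEGER, UNIQUE (ART_ID, ART_CAT_ID));
    CREATE TABLE ARTIST_CAT (ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL UNIQUE);

    INSERT INTO ARTIST_CAT VALUES (1, 'SINGER'), (2, 'ACTOR'), (3, 'PAINTER');
    INSERT INTO MEDIA VALUES (10), (20), (30);
    -- artist 100 is a singer, artist 200 a painter, artist 300 has no category at all
    INSERT INTO L_ARTIST_CAT VALUES (100, 1), (200, 3);
    INSERT INTO L_MEDIA_ARTIST VALUES (10, 100), (20, 200), (30, 300);
""")

CATEGORY_FILTER = "c.NAME IN ('SINGER', 'ACTOR')"

# Aggregation variant: inner joins only see media whose artist has some category.
aggregated = [r[0] for r in conn.execute(f"""
    SELECT l.MEDIA_ID
    FROM L_MEDIA_ARTIST l
    JOIN L_ARTIST_CAT ac ON l.ART_ID = ac.ART_ID
    JOIN ARTIST_CAT c ON ac.ART_CAT_ID = c.ID
    GROUP BY l.MEDIA_ID
    HAVING SUM(CASE WHEN {CATEGORY_FILTER} THEN 1 ELSE 0 END) = 0
    ORDER BY l.MEDIA_ID
""")]

# NOT IN variant: starts from MEDIA, so media with uncategorized artists are kept.
excluded = [r[0] for r in conn.execute(f"""
    SELECT m.ID
    FROM MEDIA m
    WHERE m.ID NOT IN (
        SELECT l.MEDIA_ID
        FROM L_MEDIA_ARTIST l
        JOIN L_ARTIST_CAT ac ON l.ART_ID = ac.ART_ID
        JOIN ARTIST_CAT c ON ac.ART_CAT_ID = c.ID
        WHERE {CATEGORY_FILTER}
    )
    ORDER BY m.ID
""")]
```

Media 30 only appears in the NOT IN result, because its artist has no row in L_ARTIST_CAT and the inner joins drop it before grouping. One caveat with NOT IN: MEDIA_ID is nullable in the question's schema, and a NULL in the subquery result would make NOT IN match nothing, so a WHERE l.MEDIA_ID IS NOT NULL in the subquery (or NOT EXISTS) is the safer form.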

Get all posts from a specific category

The Situation
As some of you might already know from my previous questions, I'm currently developing a Blog-system.
This time, I'm stuck at getting all posts from a specific category, with their category.
Database
Here are the SQL-commands to create the three required tables.
Post
create table Post(
headline varchar(100),
date datetime,
content text,
author int unsigned,
public tinyint,
type int,
ID serial,
Primary Key (ID)
)ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
author is the ID of the user who created the post, public determines whether the post can be read by everyone or is just a draft, and type determines whether it's a blog post (0) or something else.
Category
create table Kategorie(
name varchar(30),
short varchar(200),
ID serial,
Primary Key (name)
)ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Post_Kategorie
create table Post_Kategorie(
post_ID bigint unsigned,
kategorie_ID bigint unsigned,
Primary Key (post_ID, kategorie_ID),
Foreign Key (post_ID) references Post(ID),
Foreign Key (kategorie_ID) references Kategorie(ID)
)ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
The Query
This is my current query to get all posts tagged with a specific category, which is determined by the category's ID:
SELECT Post.headline, Post.date, Post.ID,
CONCAT(
"[", GROUP_CONCAT('{"name":"',Kategorie.name,'","id":',Kategorie.ID,'}'), "]"
) as "categorys"
FROM Post
INNER JOIN Post_Kategorie
ON Post.ID = Post_Kategorie.post_ID
INNER JOIN Kategorie
ON Post_Kategorie.kategorie_ID = 2
WHERE Post.public = 1
AND Post.type = 0
GROUP BY Post.headline, Post.date
ORDER BY Post.date DESC
LIMIT 0, 20
The query works for listing all posts tagged with a specific category, but the categorys column gets mixed up: every listed post shows all available categories (every category in the Kategorie table).
I'm sure the problem lies in the INNER JOIN condition, but I have no clue where. Please point me in the right direction.
I suspect there might be issues with your CONCAT function, as it mixes different types of quotation marks. I think "[" and "]" should be respectively '[' and ']'.
Otherwise, the problem does seem to be with one of the joins. In particular, INNER JOIN Kategorie does not specify the joining condition, which, I think, should be Post_Kategorie.Kategorie_ID = Kategorie.ID.
The entire query should thus be something like this:
SELECT Post.headline, Post.date, Post.ID,
CONCAT(
'[', GROUP_CONCAT('{"name":"',Kategorie.name,'","id":',Kategorie.ID,'}'), ']'
) as "categorys"
FROM Post
INNER JOIN Post_Kategorie
ON Post.ID = Post_Kategorie.post_ID
INNER JOIN Kategorie
ON Post_Kategorie.Kategorie_ID = Kategorie.ID
WHERE Post.public = 1
AND Post.type = 0
GROUP BY Post.ID, Post.headline, Post.date
HAVING MAX(CASE Post_Kategorie.kategorie_ID WHEN 2 THEN 1 ELSE 0 END) = 1
ORDER BY Post.date DESC
LIMIT 0, 20
The Post_Kategorie.kategorie_ID = 2 condition has been modified to a CASE expression and moved to the HAVING clause, and it is used together with the MAX() aggregate function. This works as follows:
If a post is tagged with the category Kategorie.ID = 2, the CASE expression returns 1 for that row, and MAX evaluates to 1 too. Consequently, the whole group is valid and remains in the output, including the post's other categories.
If none of the post's categories is the requested one, the CASE expression never evaluates to 1, and neither does MAX. As a result, the entire group is discarded.
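The behaviour described above can be checked with a small sketch. It assumes an in-memory SQLite database in place of MySQL, a schema cut down to the columns the filter touches, and made-up sample rows. The JSON building from the original query is left out so that only the HAVING logic is exercised.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema and rows are simplified samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Post (ID INTEGER PRIMARY KEY, headline TEXT);
    CREATE TABLE Kategorie (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Post_Kategorie (post_ID INTEGER, kategorie_ID INTEGER,
                                 PRIMARY KEY (post_ID, kategorie_ID));
    INSERT INTO Post VALUES (1, 'Hello'), (2, 'Other');
    INSERT INTO Kategorie VALUES (1, 'news'), (2, 'tech'), (3, 'misc');
    INSERT INTO Post_Kategorie VALUES (1, 2), (1, 3), (2, 1);
""")

# Keep a post's whole group if at least one of its rows has the wanted category.
rows = conn.execute("""
    SELECT Post.ID, Post.headline, GROUP_CONCAT(Kategorie.name) AS categories
    FROM Post
    INNER JOIN Post_Kategorie ON Post.ID = Post_Kategorie.post_ID
    INNER JOIN Kategorie ON Post_Kategorie.kategorie_ID = Kategorie.ID
    GROUP BY Post.ID, Post.headline
    HAVING MAX(CASE Post_Kategorie.kategorie_ID WHEN ? THEN 1 ELSE 0 END) = 1
""", (2,)).fetchall()

# GROUP_CONCAT order is not guaranteed here, so compare the categories as sets.
result = {post_id: set(cats.split(",")) for post_id, _, cats in rows}
```

Post 1 is returned with both of its categories (tech and misc), while post 2, which is only in news, is filtered out.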