How to query many-to-many relation with features table (AND condition) - mysql

I guess this is a common setting, but as I don't do that much SQL work, I can't get my head around this one... So, I've got a bunch of songs that have certain features (style of music, mood etc.) and I would like to select the songs that have all of a given set of features (e.g. songs that are happy and euphoric).
SONG
+----+----------+
| id | title |
+----+----------+
| 1 | song #1 |
+----+----------+
| 2 | song #2 |
+----+----------+
FEATURE
+----+-------+----------+
| id | name | value |
+----+-------+----------+
| 1 | mood | sad |
+----+-------+----------+
| 2 | mood | happy |
+----+-------+----------+
| 3 | mood | euphoric |
+----+-------+----------+
| 4 | style | rock |
+----+-------+----------+
| 5 | style | jazz |
+----+-------+----------+
SONG_FEATURE
+---------+------------+
| song_id | feature_id |
+---------+------------+
| 1 | 1 |
+---------+------------+
| 2 | 1 |
+---------+------------+
| 2 | 2 |
+---------+------------+
I would like to select all the songs that have certain features with an AND condition. I would use this query for the OR-case.
SELECT
s.*,
f.*
FROM
song_feature sf
LEFT JOIN song s ON s.id = sf.song_id
LEFT JOIN feature f ON f.id = sf.feature_id
WHERE
(
f.name = 'style'
AND f.value = 'pop'
)
OR /* <-- this works, but I would like an AND condition */
(
f.name = 'style'
AND f.value = 'pop'
)
GROUP BY sf.song_id;
But this obviously does not work for the AND condition, so I guess I'm on the wrong track here... Any hints will be greatly appreciated.

You can do it with aggregation, if you filter the resultset of the joins and set the condition in the HAVING clause:
SELECT s.id, s.title
FROM SONG s
INNER JOIN SONG_FEATURE sf ON sf.song_id = s.id
INNER JOIN FEATURE f ON f.id = sf.feature_id
WHERE (f.name, f.value) IN (('mood', 'sad'), ('mood', 'happy'))
GROUP BY s.id, s.title
HAVING COUNT(DISTINCT f.name, f.value) = 2
Results:
> id | title
> -: | :------
> 2 | song #2
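The same filter-then-count idea can be sketched for an arbitrary list of features. This is only an illustration: it runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, the helper name songs_with_all is made up, and counting DISTINCT f.id assumes that each (name, value) pair appears only once in FEATURE.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); the SQL is kept portable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE feature (id INTEGER PRIMARY KEY, name TEXT, value TEXT);
CREATE TABLE song_feature (song_id INTEGER, feature_id INTEGER);
INSERT INTO song VALUES (1, 'song #1'), (2, 'song #2');
INSERT INTO feature VALUES (1, 'mood', 'sad'), (2, 'mood', 'happy'),
    (3, 'mood', 'euphoric'), (4, 'style', 'rock'), (5, 'style', 'jazz');
INSERT INTO song_feature VALUES (1, 1), (2, 1), (2, 2);
""")


def songs_with_all(conn, wanted):
    """Return (id, title) of songs that carry every (name, value) pair in wanted.

    Hypothetical helper, not part of the original answer.
    """
    wanted = list(dict.fromkeys(wanted))  # drop repeated pairs so the count stays right
    if not wanted:
        return []
    # One "(name = ? AND value = ?)" term per wanted feature, OR-ed together.
    match = " OR ".join("(f.name = ? AND f.value = ?)" for _ in wanted)
    sql = f"""
        SELECT s.id, s.title
        FROM song s
        JOIN song_feature sf ON sf.song_id = s.id
        JOIN feature f ON f.id = sf.feature_id
        WHERE {match}
        GROUP BY s.id, s.title
        HAVING COUNT(DISTINCT f.id) = ?   -- every wanted feature must be present
        ORDER BY s.id
    """
    params = [part for pair in wanted for part in pair] + [len(wanted)]
    return conn.execute(sql, params).fetchall()


both = songs_with_all(conn, [("mood", "sad"), ("mood", "happy")])
only_sad = songs_with_all(conn, [("mood", "sad")])
print(both)
print(only_sad)
```

The WHERE clause keeps the OR logic from the question; it is the HAVING count that turns it into an AND, because a song only survives if it matched as many distinct features as were asked for.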

Related

Grouping results after join and having clause

Let's say I have three tables (mysql):
recipes
+----+----------------+--------------+
| id | title | image |
+----+----------------+--------------+
| 2 | recipe title 1 | banana image |
| 3 | recipe title 2 | potato image |
+----+----------------+--------------+
ingredient
+----+-----------+---------+---------------+
| id | recipe_id | food_id | quantity_kg |
+----+-----------+---------+---------------+
| 1 | 2 | 36 | 2.5 |
| 2 | 3 | 37 | 1.5 |
+----+-----------+---------+---------------+
food
+----+---------+-------+-----------+----------+
| id | name | price | foodType | unitType |
+----+---------+-------+-----------+----------+
| 36 | carrot | 2 | vegetable | kg |
| 37 | chicken | 12 | meat | kg |
+----+---------+-------+-----------+----------+
Now, I want to get all the recipes that are vegetarian, i.e. that don't contain any foods where foodType is 'meat' (or other animal product).
How do I perform such a query?
Here is what I've tried so far:
SELECT
recipe.id as recipeId,
recipe.title as title,
food.type as foodType
FROM recipe
INNER JOIN ingredient on ingredient.recipe_id = recipe.id
INNER JOIN food on food.id = ingredient.aliment_id
HAVING
food.type <> 'meat' AND
food.type <> 'seafood' AND
food.type <> 'fish'
ORDER BY recipeId
This works (I get only vegetarian recipes) BUT it duplicates every recipe that has multiple ingredients, e.g.:
+----------+--------+----------+
| recipeId | title | foodType |
+----------+--------+----------+
| 5 | titleA | type1 |
| 5 | titleA | type2 |
| 5 | titleA | type3 |
| 8 | titleB | type2 |
| 8 | titleB | type5 |
| 8 | titleB | type1 |
| 8 | titleB | type3 |
+----------+--------+----------+
What I want to obtain is:
+----------+--------+
| recipeId | title |
+----------+--------+
| 5 | titleA |
| 8 | titleB |
+----------+--------+
I already tried getting rid of 'foodType' in the SELECT clause, but if I do so, mysql tells me : "Unknown column 'food.type' in 'having clause'"
I already tried to GROUP BY 'recipeId' right before HAVING clause, but I get that error : "Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'myDb.food.type' which is not functionally dependent on columns in GROUP BY clause" (I understand that error).
I guess it has to do with something like "Grouping results after join and having clause", but I might be wrong...
Thanks a lot
You don't have a GROUP BY clause, so you shouldn't have a HAVING clause. Use WHERE instead
Remove the unwanted columns from your SELECT
Because the joins are across 1:many relationships but you're only selecting on the "one" side, you probably also want SELECT DISTINCT instead of just SELECT
Also, you have another issue: The logic of your query isn't actually correct, even though it's returning apparently correct results with such a small set of sample data.
When looking at composition, you probably want to use EXISTS and a subquery. Perhaps something like this (untested):
SELECT
r.id AS recipeId,
r.title AS title
FROM recipe r
WHERE NOT EXISTS
(SELECT 1
FROM ingredient INNER JOIN food ON food.id = ingredient.aliment_id
WHERE ingredient.recipe_id = r.id AND
food.type IN ('meat', 'seafood', 'fish')
)
ORDER BY recipeId
Just exclude all recipes which have at least one meat ingredient
(I did it 10 years ago even without SQL)
SELECT recipe.id, recipe.title FROM recipe
WHERE recipe.id NOT IN (
SELECT
recipe.id
FROM recipe
INNER JOIN ingredient
on ingredient.recipe_id = recipe.id
INNER JOIN food
on food.id = ingredient.aliment_id
AND food.type IN ('meat', 'seafood', 'fish')
)
ORDER BY recipe.id
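To see why filtering ingredient rows is not the same as filtering recipes, here is a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, the column names from the queries above (food.type, ingredient.aliment_id), and one extra recipe (id 4, containing both carrot and chicken) that is invented for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipe (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE food (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE ingredient (id INTEGER PRIMARY KEY, recipe_id INTEGER, aliment_id INTEGER);
-- Recipe 4 is hypothetical: it mixes a vegetable with meat.
INSERT INTO recipe VALUES (2, 'recipe title 1'), (3, 'recipe title 2'),
    (4, 'chicken and carrot stew');
INSERT INTO food VALUES (36, 'carrot', 'vegetable'), (37, 'chicken', 'meat');
INSERT INTO ingredient VALUES (1, 2, 36), (2, 3, 37), (3, 4, 36), (4, 4, 37);
""")

# Row-level filter: keeps any recipe that has at least one non-meat ingredient.
row_filter = conn.execute("""
    SELECT DISTINCT r.id, r.title
    FROM recipe r
    JOIN ingredient i ON i.recipe_id = r.id
    JOIN food f ON f.id = i.aliment_id
    WHERE f.type NOT IN ('meat', 'seafood', 'fish')
    ORDER BY r.id
""").fetchall()

# Recipe-level filter: keeps a recipe only if none of its ingredients is meat.
not_exists = conn.execute("""
    SELECT r.id, r.title
    FROM recipe r
    WHERE NOT EXISTS (
        SELECT 1
        FROM ingredient i
        JOIN food f ON f.id = i.aliment_id
        WHERE i.recipe_id = r.id
          AND f.type IN ('meat', 'seafood', 'fish')
    )
    ORDER BY r.id
""").fetchall()

print(row_filter)   # still contains the stew, because of its carrot row
print(not_exists)   # contains only the recipe without any meat
```

Note that the NOT EXISTS form also returns recipes that have no ingredients at all; whether that is wanted depends on the data.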

Selecting rows whose foreign rows ONLY match a single value

Say I have two tables --people and pets-- where each person may have more than one pet:
people:
+-----------+-------+
| person_id | name |
+-----------+-------+
| 1 | Bob |
| 2 | John |
| 3 | Pete |
| 4 | Waldo |
+-----------+-------+
pets:
+--------+-----------+--------+
| pet_id | person_id | animal |
+--------+-----------+--------+
| 1 | 1 | dog |
| 2 | 1 | dog |
| 3 | 1 | cat |
| 4 | 2 | cat |
| 5 | 3 | dog |
| 6 | 3 | tiger |
| 7 | 3 | tiger |
| 8 | 4 | tiger |
| 9 | 4 | tiger |
| 10 | 4 | tiger |
+--------+-----------+--------+
I'm trying to select the people who ONLY have tigers as pets. Obviously the only one that fits this criteria is Waldo, since Pete has a dog as well... but I'm having some trouble writing the query for this.
The most obvious case is select people.person_id, people.name from people join pets on people.person_id = pets.person_id where pets.animal = "tiger", but this returns Pete and Waldo.
It would be helpful if there was a clause like pets.animal ONLY = "tiger", but as far as I know this doesn't exist.
How could the query be written?
select people.person_id, people.name
from people
join pets on people.person_id = pets.person_id
where pets.animal = "tiger"
AND people.person_id NOT IN (select person_id from pets where animal != 'tiger');
Use group by and having:
select p.person_id
from pets p
group by p.person_id
having max(animal) = 'tiger' and min(animal) = 'tiger';
select distinct person_id
from pets
where animal = "tiger"
intersect
select distinct person_id
from pets
where animal = "tiger"
and person_id not in
(select person_id from pets where animal <> "tiger")
You can use intersect to select a person who only has tiger as his pet.
SELECT *
FROM people pp
WHERE EXISTS (SELECT * FROM pets pt
WHERE pt.person_id = pp.person_id
AND pt.animal = 'tiger'
)
AND NOT EXISTS (SELECT * FROM pets pt
WHERE pt.person_id = pp.person_id
AND pt.animal <> 'tiger'
);
If every person was guaranteed to have at least one pet, then the query could be as simple as:
select name
from people
where not exists (select 1
from pets
where pets.person_id = people.person_id and
pets.animal != 'tiger')
Or: return the people for whom there is no record that is not a tiger.
NOT EXISTS is typically executed as an anti-join, in which each row from people is rejected as soon as a single non-tiger pet is found.
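The "at least one pet" caveat is easy to check with a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL and adds one invented person, Alice (person_id 5), who owns no pets.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pets (pet_id INTEGER PRIMARY KEY, person_id INTEGER, animal TEXT);
-- Alice is hypothetical: she has no rows in pets.
INSERT INTO people VALUES (1, 'Bob'), (2, 'John'), (3, 'Pete'), (4, 'Waldo'), (5, 'Alice');
INSERT INTO pets VALUES (1, 1, 'dog'), (2, 1, 'dog'), (3, 1, 'cat'), (4, 2, 'cat'),
    (5, 3, 'dog'), (6, 3, 'tiger'), (7, 3, 'tiger'),
    (8, 4, 'tiger'), (9, 4, 'tiger'), (10, 4, 'tiger');
""")

# NOT EXISTS alone: "no pet that is not a tiger", which is also true for no pets at all.
not_exists_only = conn.execute("""
    SELECT p.person_id, p.name
    FROM people p
    WHERE NOT EXISTS (SELECT 1 FROM pets t
                      WHERE t.person_id = p.person_id AND t.animal <> 'tiger')
    ORDER BY p.person_id
""").fetchall()

# EXISTS plus NOT EXISTS: at least one tiger and nothing else.
tigers_only = conn.execute("""
    SELECT p.person_id, p.name
    FROM people p
    WHERE EXISTS (SELECT 1 FROM pets t
                  WHERE t.person_id = p.person_id AND t.animal = 'tiger')
      AND NOT EXISTS (SELECT 1 FROM pets t
                      WHERE t.person_id = p.person_id AND t.animal <> 'tiger')
    ORDER BY p.person_id
""").fetchall()

print(not_exists_only)
print(tigers_only)
```

If the schema guarantees that every person has a pet, the two queries return the same rows and the shorter one is enough.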

Complex MySQL query from 4 tables

So far I haven't been successful in my search on combining 4 tables. I have 4 tables:
content
content_id | source_id | content_title | content_date
--------------------------------------------------------
1 | 1 | The factory | 189982398300
2 | 2 | Cold and cloudy | 189982398299
3 | 2 | Green tea | 189982398298
sources
source_id | source_name
-------------------
1 | BBC
2 | Reuters
settings
setting_id | setting_name
-------------------------------
1 | bbc_iplayer_string
2 | reuters_video_id
3 | reuters_category
content_join_settings
content_id | setting_id | setting_data
------------------------------------------
1 | 1 | Js88sdhjsd0gDS09
2 | 2 | video_8AJK3ADJD8
2 | 3 | weather
3 | 2 | video_K7KD8N2ND9
3 | 3 | food and drinks
So as you might have already noticed, every content record has its own source linked to it, and comes with extra settings depending on the source. BBC posts have a bbc_iplayer_string, and posts from Reuters have a reuters_video_id and a reuters_category. Of course this is just an example.
I'd like to have the results combined, and with the setting_name as a column name. I'm not sure how to explain it so I'll show it:
content_id | source_name | content_title | content_date | bbc_iplayer_string | reuters_video_id | reuters_category
--------------------------------------------------------------------------------------------------------
1 | BBC | The factory | 189982398300 | Js88sdhjsd0gDS09 | NULL | NULL
2 | Reuters | Cold and cloudy | 189982398299 | NULL | video_8AJK3ADJD8 | weather
3 | Reuters | Green tea | 189982398298 | NULL | video_K7KD8N2ND9 | food and drinks
I'm not very skilled in complex MySQL queries. I don't know where to start looking, let alone what to search for. I started with this query but I have no idea how to continue. I'm probably not even close.
SELECT
*
FROM
content,
sources,
settings,
content_join_settings
WHERE
content.content_id = content_join_settings.content_id AND
settings.setting_id = content_join_settings.setting_id AND
sources.source_id = content.source_id
GROUP BY
content.content_id
ORDER BY
content.content_date
DESC
You need to join the tables, and then pivot the results to move rows into columns.
SELECT c.content_id, s.source_name, c.content_title, c.content_date,
MAX(CASE setting_name WHEN 'bbc_iplayer_string' THEN setting_data END) AS bbc_iplayer_string,
MAX(CASE setting_name WHEN 'reuters_video_id' THEN setting_data END) AS reuters_video_id,
MAX(CASE setting_name WHEN 'reuters_category' THEN setting_data END) AS reuters_category
FROM content AS c
INNER JOIN sources AS s ON s.source_id = c.source_id
INNER JOIN content_join_settings AS cjs ON cjs.content_id = c.content_id
INNER JOIN settings as st ON st.setting_id = cjs.setting_id
GROUP BY c.content_id
Would this help?
SELECT
*
FROM
content as ct inner join sources as ss on ct.source_id = ss.source_id
inner join content_join_settings as cst on ct.content_id = cst.content_id
inner join settings as st on st.setting_id = cst.setting_id
GROUP BY
ct.content_id
ORDER BY
ct.content_date
DESC
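The MAX(CASE ...) pivot from the first answer needs one expression per setting, so when settings are added the query has to change. One possible way around that is to build the column list from the settings table, sketched here with Python's sqlite3 module and an in-memory SQLite database as a stand-in for MySQL. The function name build_pivot_query is invented, and the sketch assumes that every setting_name is a plain identifier, since it ends up as a column alias.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sources (source_id INTEGER PRIMARY KEY, source_name TEXT);
CREATE TABLE content (content_id INTEGER PRIMARY KEY, source_id INTEGER,
                      content_title TEXT, content_date INTEGER);
CREATE TABLE settings (setting_id INTEGER PRIMARY KEY, setting_name TEXT);
CREATE TABLE content_join_settings (content_id INTEGER, setting_id INTEGER, setting_data TEXT);
INSERT INTO sources VALUES (1, 'BBC'), (2, 'Reuters');
INSERT INTO content VALUES (1, 1, 'The factory', 189982398300),
    (2, 2, 'Cold and cloudy', 189982398299), (3, 2, 'Green tea', 189982398298);
INSERT INTO settings VALUES (1, 'bbc_iplayer_string'), (2, 'reuters_video_id'),
    (3, 'reuters_category');
INSERT INTO content_join_settings VALUES (1, 1, 'Js88sdhjsd0gDS09'),
    (2, 2, 'video_8AJK3ADJD8'), (2, 3, 'weather'),
    (3, 2, 'video_K7KD8N2ND9'), (3, 3, 'food and drinks');
""")


def build_pivot_query(conn):
    """Return (sql, params) for a pivot with one column per row in settings.

    Hypothetical helper; setting names are bound as parameters for the
    comparison, but must be interpolated for the alias, hence the check.
    """
    names = [row[0] for row in conn.execute(
        "SELECT setting_name FROM settings ORDER BY setting_id")]
    for name in names:
        if not name.isidentifier():
            raise ValueError(f"unsafe setting name: {name!r}")
    select_list = [
        "c.content_id AS content_id",
        "s.source_name AS source_name",
        "c.content_title AS content_title",
        "c.content_date AS content_date",
    ] + [
        f"MAX(CASE WHEN st.setting_name = ? THEN cjs.setting_data END) AS {name}"
        for name in names
    ]
    sql = f"""
        SELECT {", ".join(select_list)}
        FROM content AS c
        INNER JOIN sources AS s ON s.source_id = c.source_id
        INNER JOIN content_join_settings AS cjs ON cjs.content_id = c.content_id
        INNER JOIN settings AS st ON st.setting_id = cjs.setting_id
        GROUP BY c.content_id, s.source_name, c.content_title, c.content_date
        ORDER BY c.content_date DESC
    """
    return sql, names


sql, params = build_pivot_query(conn)
cursor = conn.execute(sql, params)
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
for row in rows:
    print(row)
```

Because of the inner joins, a content row without any settings would not appear in the result; a LEFT JOIN on content_join_settings and settings would keep it with NULLs in the pivoted columns.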

Fastest way to select min row with join

In this example, I have a listing of users (main_data), a pass list (pass_list) and a corresponding priority to each pass code type (pass_code). The query I am constructing is looking for a list of users and the corresponding pass code type with the lowest priority. The query below works but it just seems like there may be a faster way to construct it I am missing. SQL Fiddle: http://sqlfiddle.com/#!2/2ec8d/2/0 or see below for table details.
SELECT md.first_name, md.last_name, pl.*
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id
AND
pl.id =
(
SELECT pl2.id
FROM pass_list pl2
JOIN pass_code pc2 on pl2.pass_code_type = pc2.type
WHERE pl2.main_data_id = md.id
ORDER BY pc2.priority
LIMIT 1
)
Results:
+------------+-----------+----+--------------+----------------+
| first_name | last_name | id | main_data_id | pass_code_type |
+------------+-----------+----+--------------+----------------+
| Bob | Smith | 1 | 1 | S |
| Mary | Vance | 8 | 2 | M |
| Margret | Cough | 5 | 3 | H |
| Mark | Johnson | 9 | 4 | H |
| Tim | Allen | 13 | 5 | M |
+------------+-----------+----+--------------+----------------+
users (main_data)
+----+------------+-----------+
| id | first_name | last_name |
+----+------------+-----------+
| 1 | Bob | Smith |
| 2 | Mary | Vance |
| 3 | Margret | Cough |
| 4 | Mark | Johnson |
| 5 | Tim | Allen |
+----+------------+-----------+
pass list (pass_list)
+----+--------------+----------------+
| id | main_data_id | pass_code_type |
+----+--------------+----------------+
| 1 | 1 | S |
| 3 | 2 | E |
| 4 | 2 | H |
| 5 | 3 | H |
| 7 | 4 | E |
| 8 | 2 | M |
| 9 | 4 | H |
| 10 | 4 | H |
| 11 | 5 | S |
| 12 | 3 | S |
| 13 | 5 | M |
| 14 | 1 | E |
+----+--------------+----------------+
Table which specifies priority (pass_code)
+----+------+----------+
| id | type | priority |
+----+------+----------+
| 1 | M | 1 |
| 2 | H | 2 |
| 3 | S | 3 |
| 4 | E | 4 |
+----+------+----------+
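Due to MySQL's non-standard extension to GROUP BY, it's simple:
SELECT * FROM
(SELECT md.first_name, md.last_name, pl.*
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id
JOIN pass_code pc on pc.type = pl.pass_code_type
ORDER BY pc.priority) x
GROUP BY x.main_data_id
This tends to return the first row encountered for each unique value of main_data_id, so by using an inner query to order the rows before applying the group by you usually get the rows you want. Be aware that MySQL does not guarantee which row of each group is returned, and that the query is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5), so treat this as a shortcut rather than something to rely on.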
A version that will get the details as required, and should also work across different flavours of SQL
SELECT md.first_name, md.last_name, MinId, pl.main_data_id, pl.pass_code_type
FROM main_data md
INNER JOIN pass_list pl
ON md.id = pl.main_data_id
INNER JOIN pass_code pc
ON pl.pass_code_type = pc.type
INNER JOIN
(
SELECT pl.main_data_id, pl.pass_code_type, Sub0.MinPriority, MIN(pl.id) AS MinId
FROM pass_list pl
INNER JOIN pass_code pc
ON pl.pass_code_type = pc.type
INNER JOIN
(
SELECT main_data_id, MIN(priority) AS MinPriority
FROM pass_list a
INNER JOIN pass_code b
ON a.pass_code_type = b.type
GROUP BY main_data_id
) Sub0
ON pl.main_data_id = Sub0.main_data_id
AND pc.priority = Sub0.MinPriority
GROUP BY pl.main_data_id, pl.pass_code_type, Sub0.MinPriority
) Sub1
ON pl.main_data_id = Sub1.main_data_id
AND pl.id = Sub1.MinId
AND pc.priority = Sub1.MinPriority
ORDER BY pl.main_data_id
This does not rely on the flexibility of MySQL's GROUP BY functionality.
I'm not familiar with the special behavior of MySQL's group by, but my solution for these types of problems is simply to require that no row with a lower priority exists for the same user. This is standard SQL, so it should work on any DB.
select distinct u.id, u.first_name, u.last_name, pl.pass_code_type, pc.id, pc.priority
from main_data u
inner join pass_list pl on pl.main_data_id = u.id
inner join pass_code pc on pc.type = pl.pass_code_type
where not exists (select 1
from pass_list pl2
inner join pass_code pc2 on pc2.type = pl2.pass_code_type
where pl2.main_data_id = u.id and pc2.priority < pc.priority);
How well this performs is going to depend on having the proper indexes (assuming that main_data and pass_list are somewhat large). In this case indexes on the primary (should be automatically created) and foreign keys should be sufficient. There may be other queries that are faster, I would start by comparing this to your query.
Also, I had to add distinct because you have duplicate rows in pass_list (id 9 & 10), but if you ensure that duplicates can't exist (unique index on main_data_id, pass_code_type) then you will save some time by removing the distinct which forces a final sort of the result set. This savings would be more noticeable the larger the result set is.
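The NOT EXISTS formulation and the effect of the duplicate rows can be checked against the sample data with a short sketch. It runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL; the helper name lowest_priority_pass is invented, and the column list is trimmed to the user and the pass code type.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_data (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE pass_list (id INTEGER PRIMARY KEY, main_data_id INTEGER, pass_code_type TEXT);
CREATE TABLE pass_code (id INTEGER PRIMARY KEY, type TEXT, priority INTEGER);
INSERT INTO main_data VALUES (1, 'Bob', 'Smith'), (2, 'Mary', 'Vance'),
    (3, 'Margret', 'Cough'), (4, 'Mark', 'Johnson'), (5, 'Tim', 'Allen');
INSERT INTO pass_list VALUES (1, 1, 'S'), (3, 2, 'E'), (4, 2, 'H'), (5, 3, 'H'),
    (7, 4, 'E'), (8, 2, 'M'), (9, 4, 'H'), (10, 4, 'H'), (11, 5, 'S'),
    (12, 3, 'S'), (13, 5, 'M'), (14, 1, 'E');
INSERT INTO pass_code VALUES (1, 'M', 1), (2, 'H', 2), (3, 'S', 3), (4, 'E', 4);
""")


def lowest_priority_pass(conn, distinct=True):
    """Per user, the pass code type for which no lower-priority pass exists.

    Hypothetical helper wrapping the NOT EXISTS query from the answer above.
    """
    sql = """
        SELECT {distinct} u.id, u.first_name, u.last_name, pl.pass_code_type
        FROM main_data u
        INNER JOIN pass_list pl ON pl.main_data_id = u.id
        INNER JOIN pass_code pc ON pc.type = pl.pass_code_type
        WHERE NOT EXISTS (
            SELECT 1
            FROM pass_list pl2
            INNER JOIN pass_code pc2 ON pc2.type = pl2.pass_code_type
            WHERE pl2.main_data_id = u.id AND pc2.priority < pc.priority
        )
        ORDER BY u.id
    """.format(distinct="DISTINCT" if distinct else "")
    return conn.execute(sql).fetchall()


with_distinct = lowest_priority_pass(conn, distinct=True)
without_distinct = lowest_priority_pass(conn, distinct=False)
print(with_distinct)
print(len(without_distinct))  # one row more, from the duplicate 'H' pass of user 4
```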

MySQL JOIN - Return NULL for duplicate results in left table

I believe this is a pretty simple thing, and I swear I've done it before but I can't remember how.
So let's say I have a one-to-many relationship. I want to JOIN the two tables, but not allow duplicates for the left table.
SQLFIDDLE
So based on the above SQLFiddle, my results would be:
categories.title | items.NAME | items.category_id
-----------------------------------------------------
red | apple | 1
red | car | 1
red | paper | 1
yellow | lego | 2
yellow | banana | 2
blue | pen | 3
I want it to be:
categories.title | items.NAME | items.category_id
-----------------------------------------------------
red | apple | 1
NULL | car | 1
NULL | paper | 1
yellow | lego | 2
NULL | banana | 2
blue | pen | 3
My reasoning is that this way, I can easily loop over the results without having to do any further processing with PHP.
You can replace the values with something like this:
select
case when rownum = 1 then title else null end title,
name,
category_id
from
(
SELECT c.title,
i.name,
i.category_id,
@row:=(case when @prev=title and @precat=category_id
then @row else 0 end) + 1 as rownum,
@prev:=title ptitle,
@precat:=category_id pcat
FROM items AS i
INNER JOIN categories AS c
ON c.id = i.category_id
order by i.category_id, c.title
) src
order by category_id, rownum
See SQL Fiddle with Demo
The result is:
| TITLE | NAME | CATEGORY_ID |
---------------------------------
| red | apple | 1 |
| (null) | car | 1 |
| (null) | paper | 1 |
| yellow | lego | 2 |
| (null) | banana | 2 |
| blue | pen | 3 |
This was posted a long time ago, but I'll add my answer for future readers. There is another approach that is light and quick to understand.
You can make good use of variables. No subqueries are necessary.
SET @previous:="";
SELECT
IF(C.title=@previous, "", @previous:=C.title) AS Titles,
I.name, I.category_id
FROM items I
INNER JOIN categories AS C ON C.id = I.category_id
ORDER BY I.id, I.name
@previous is the variable that is being used.
SQL FIDDLE DEMO
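Both answers above rely on user variables that are assigned and read in the same statement; MySQL documents the evaluation order of such expressions as undefined, and assigning variables inside SELECT is deprecated in MySQL 8.0. A variable-free alternative, offered here only as a sketch, is to show the title on the row with the smallest item id of each category. The example runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, and the item ids are assumed, since the original fiddle data is not shown.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
INSERT INTO categories VALUES (1, 'red'), (2, 'yellow'), (3, 'blue');
-- Item ids are hypothetical; only names and categories appear in the question.
INSERT INTO items VALUES (1, 'apple', 1), (2, 'car', 1), (3, 'paper', 1),
    (4, 'lego', 2), (5, 'banana', 2), (6, 'pen', 3);
""")

rows = conn.execute("""
    SELECT CASE
             WHEN i.id = (SELECT MIN(i2.id)
                          FROM items i2
                          WHERE i2.category_id = i.category_id)
             THEN c.title           -- first item of the category keeps the title
           END AS title,            -- every other item gets NULL
           i.name,
           i.category_id
    FROM items AS i
    INNER JOIN categories AS c ON c.id = i.category_id
    ORDER BY i.category_id, i.id
""").fetchall()

for row in rows:
    print(row)
```

The ORDER BY has to agree with the rule in the CASE expression (here: by category, then by item id); otherwise the titled row would not come first within its category.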