Selecting records where all joined rows match - mysql

Given the following schema:
employees
id | name
employee_attributes
id | employee_id | key | value
I would like to select all employees that have the provided attributes.
The following statement works:
SELECT employees.* FROM employees
INNER JOIN employee_attributes ON employee_attributes.employee_id = employees.id
WHERE employee_attributes.key = 'foo' AND employee_attributes.value = 'bar'
but only allows me to find an employee by one attribute. How can I adapt this to retrieve employees by more than one attribute?
To be clear, if I supply two sets of attributes to match against, the query should only return employees that have at least those two attributes.
For example, if Bob has just one attribute:
key | value
===========
foo | bar
but I supply two attributes to the query (foo and bar, bin and baz), Bob should not be returned.

The following should work:
SELECT employees.id, employees.name, count(employee_attributes.id) as attribute_count FROM employees
INNER JOIN employee_attributes ON employee_attributes.employee_id = employees.id
WHERE (employee_attributes.key = 'foo' AND employee_attributes.value = 'bar') OR (employee_attributes.key = 'bin' AND employee_attributes.value = 'baz')
group by employees.id, employees.name
having attribute_count >= 2;

You can get the employee ids using aggregation:
SELECT ea.employee_id
FROM employee_attributes ea
WHERE (ea.key = 'foo' AND ea.value = 'bar') OR
(ea.key = 'bin' AND ea.value = 'baz')
GROUP BY ea.employee_id
HAVING COUNT(DISTINCT ea.key) = 2;
For the full information, you can use a JOIN:
SELECT e.*
FROM employees e JOIN
(SELECT ea.employee_id
FROM employee_attributes ea
WHERE (ea.key = 'foo' AND ea.value = 'bar') OR
(ea.key = 'bin' AND ea.value = 'baz')
GROUP BY ea.employee_id
HAVING COUNT(DISTINCT ea.key) = 2
) ea
ON ea.employee_id = e.id;

Use conditional aggregation:
SELECT employees.*
FROM employees
INNER JOIN employee_attributes
ON employee_attributes.employee_id = employees.id
GROUP BY employees.id
HAVING SUM(CASE WHEN employee_attributes.key = 'foo' AND
employee_attributes.value = 'bar' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN employee_attributes.key = 'bin' AND
employee_attributes.value = 'baz' THEN 1 ELSE 0 END) > 0
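The grouped approach from the answers above extends to any number of attribute pairs. The following is a minimal sketch, run against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The table and column names follow the question; the sample rows and the helper name employees_with_all are assumptions made for illustration, and the helper assumes each key is requested at most once.

```python
import sqlite3

def employees_with_all(conn, pairs):
    """Names of employees that have every (key, value) pair in `pairs`.

    Assumes each key appears at most once in `pairs`.
    """
    if not pairs:
        raise ValueError("at least one (key, value) pair is required")
    # One "(key = ? AND value = ?)" test per requested pair, OR-ed together.
    predicate = " OR ".join(["(ea.key = ? AND ea.value = ?)"] * len(pairs))
    sql = f"""
        SELECT e.name
        FROM employees e
        JOIN employee_attributes ea ON ea.employee_id = e.id
        WHERE {predicate}
        GROUP BY e.id, e.name
        HAVING COUNT(DISTINCT ea.key) = ?
        ORDER BY e.name
    """
    # Flatten the pairs into positional parameters, then append the pair count.
    params = [item for pair in pairs for item in pair] + [len(pairs)]
    return [row[0] for row in conn.execute(sql, params)]

# Hypothetical sample data: only Alice has both foo=bar and bin=baz.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_attributes (
        id INTEGER PRIMARY KEY, employee_id INTEGER, key TEXT, value TEXT);
    INSERT INTO employees VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO employee_attributes (employee_id, key, value) VALUES
        (1, 'foo', 'bar'), (1, 'bin', 'baz'),
        (2, 'foo', 'bar'),
        (3, 'foo', 'bar'), (3, 'bin', 'qux');
""")

print(employees_with_all(conn, [("foo", "bar"), ("bin", "baz")]))  # ['Alice']
print(employees_with_all(conn, [("foo", "bar")]))  # ['Alice', 'Bob', 'Carol']
```

Passing the values as bound parameters keeps the attribute list out of the SQL text; only the number of placeholders depends on the input.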

Related

Matching Exactly all values in IN clause

I have been searching for a solution to this problem for hours with no luck. I have a Workouts table as below. Each item in the Workouts table can have multiple target muscles, which are listed in the Target Muscles table.
Workouts table:
id
1
2
Target Muscles table:
id | muscle_key | workout_id
============================
1 | a | 1
2 | b | 1
3 | c | 1
4 | a | 2
5 | b | 2
I need to fetch all items in the workouts table which match EXACTLY ALL target muscles keys in the given set, not less and not more. For example, given the set of muscle keys:
(a,b)
The desired output would be:
id
2
The row for workout id = 1 should NOT be selected since it contains an extra muscle key (c).
I am using the following query:
SELECT id
FROM workouts
LEFT JOIN target_muscles ON workouts.id = target_muscles.workout_id
WHERE target_muscles.muscle_key IN ('a', 'b')
GROUP BY workouts.id
HAVING COUNT(DISTINCT target_muscles.muscle_key) = 2
The above query is also returning the workout id = 1, instead of only 2. How can I achieve this?
Any help is appreciated.
Skip the WHERE clause. Use HAVING to make sure exactly a and b are there.
SELECT workouts.id
FROM workouts
JOIN target_muscles ON workouts.id = target_muscles.workout_id
GROUP BY workouts.id
HAVING COUNT(DISTINCT target_muscles.muscle_key) =
COUNT(DISTINCT CASE WHEN target_muscles.muscle_key IN ('a', 'b')
THEN target_muscles.muscle_key END)
AND COUNT(DISTINCT target_muscles.muscle_key) = 2
Can also be done as:
SELECT workouts.id
FROM workouts
JOIN target_muscles ON workouts.id = target_muscles.workout_id
GROUP BY workouts.id
HAVING MIN(target_muscles.muscle_key) = 'a'
AND MAX(target_muscles.muscle_key) = 'b'
AND COUNT(DISTINCT target_muscles.muscle_key) = 2
Or, perhaps less performant:
SELECT workouts.id
FROM workouts
JOIN (SELECT workout_id FROM target_muscles WHERE muscle_key = 'a'
INTERSECT
SELECT workout_id FROM target_muscles WHERE muscle_key = 'b'
EXCEPT
SELECT workout_id FROM target_muscles WHERE muscle_key NOT IN ('a', 'b')) dt
ON workouts.id = dt.workout_id
You can remove the filtering clause and use two conditions:
count of non-(a,b) muscle_keys = 0
distinct count of (a,b) muscle_keys = 2
SELECT w.id
FROM workouts w
LEFT JOIN target_muscles ts ON w.id = ts.workout_id
GROUP BY w.id
HAVING COUNT(CASE WHEN ts.muscle_key NOT IN ('a', 'b') THEN w.id END) = 0
AND COUNT(DISTINCT CASE WHEN ts.muscle_key IN ('a', 'b') THEN ts.muscle_key END) = 2
This is another way to do it, using an inner join:
select s.id
from (
select w.id
from workouts w
inner join targets t on t.workout_id = w.id
group by workout_id
having count(1) = 2
) as s
inner join targets t on t.workout_id = s.id and t.muscle_key in ('a', 'b')
group by s.id
having count(distinct t.muscle_key) = 2;
You can try it here: https://dbfiddle.uk/nPTQlhqT
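The "exactly this set, no more and no less" condition from the answers above can be checked for any set of keys. The sketch below runs on SQLite through Python's sqlite3 module; the helper name workouts_with_exactly is hypothetical, and workout 3 is an extra row added to the question's sample data to show that a subset is rejected too.

```python
import sqlite3

def workouts_with_exactly(conn, keys):
    """IDs of workouts whose set of muscle keys equals `keys` exactly."""
    wanted = sorted(set(keys))
    if not wanted:
        raise ValueError("at least one muscle key is required")
    marks = ", ".join("?" * len(wanted))  # e.g. "?, ?" for two keys
    sql = f"""
        SELECT w.id
        FROM workouts w
        JOIN target_muscles tm ON tm.workout_id = w.id
        GROUP BY w.id
        HAVING COUNT(DISTINCT tm.muscle_key) = ?
           AND COUNT(DISTINCT CASE WHEN tm.muscle_key IN ({marks})
                                   THEN tm.muscle_key END) = ?
        ORDER BY w.id
    """
    # Parameters follow the order of the placeholders in the SQL text.
    params = [len(wanted), *wanted, len(wanted)]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workouts (id INTEGER PRIMARY KEY);
    CREATE TABLE target_muscles (
        id INTEGER PRIMARY KEY, muscle_key TEXT, workout_id INTEGER);
    INSERT INTO workouts VALUES (1), (2), (3);
    INSERT INTO target_muscles VALUES
        (1, 'a', 1), (2, 'b', 1), (3, 'c', 1),
        (4, 'a', 2), (5, 'b', 2),
        (6, 'a', 3);  -- workout 3 is not in the question; it has only 'a'
""")

print(workouts_with_exactly(conn, ["a", "b"]))  # [2]
```

The first HAVING condition rules out workouts with extra keys or missing keys by total count; the second confirms that the keys that are present are the requested ones.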

DISTINCT on one value from a group selects

I have the following SQL query:
select devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
order by devices_device.id, devices_device.start_date
I now get a list of device IDs, some of which are repeated. I want to apply a DISTINCT so that I keep only the first record for every device (and, due to the ORDER BY on start_date, that would be the most recent device record for that device).
How do I do this? If I do
select distinct devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
order by devices_device.id, devices_device.start_date
nothing happens
You can use the ROW_NUMBER() window function to identify the row you want. Then filtering out the other ones is easy.
For example:
select *
from (
select
d.id, d.start_date, d.code,
s.id as "site_id", s.name as "site_name",
row_number() over(partition by d.id order by start_date desc) as rn
from devices_device d
inner join st_site_site s on d.site_id = s.id
where d.deleted = false
) x
where rn = 1
order by id, start_date
In this query the ROW_NUMBER() value will be 1 for the latest row in each device group, so the filter at the end removes every row whose rn is greater than 1.
NOTE: If there are ties (two rows with the same most recent start_date), this query still returns a single row per device, but which of the tied rows it picks is arbitrary.
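One way to make the choice between tied rows repeatable is to add a second sort key to the window's ORDER BY. The sketch below is an illustration on SQLite (version 3.25 or later, which supports window functions) through Python's sqlite3 module; the sample rows are invented, the site join is left out, and using code as the tie-breaker is an assumption, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE devices_device (id INTEGER, code INTEGER, start_date TEXT);
    INSERT INTO devices_device VALUES
        (1, 10, '2020-10-01'), (1, 20, '2020-09-01'),
        (2, 30, '2020-08-01'), (2, 40, '2020-08-01');  -- device 2 has a tie
""")

latest = conn.execute("""
    SELECT id, code, start_date
    FROM (
        SELECT d.*,
               ROW_NUMBER() OVER (
                   PARTITION BY d.id
                   -- newest first; ties broken by the higher code (assumption)
                   ORDER BY d.start_date DESC, d.code DESC
               ) AS rn
        FROM devices_device d
    ) AS x
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(latest)  # [(1, 10, '2020-10-01'), (2, 40, '2020-08-01')]
```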
You should probably use a GROUP BY. Something like:
select distinct devices_device.id , devices_device.code, sss.id as "site_id",
sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
group by devices_device.id
order by devices_device.start_date
You could test for the min start date:
drop table if exists devices_device,st_site_site;
create table devices_device(id int,code int,site_id int,start_date date,deleted int);
create table st_site_site(id int,name varchar(10));
insert into devices_device values(1,10,1,'2020-10-01',0),(1,20,1,'2020-09-01',0);
insert into st_site_site values(1,'aaa');
select devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false and
devices_device.start_date = (select min(d1.start_date) from devices_device d1 where d1.id = devices_device.id)
order by devices_device.id;
+------+------+---------+-----------+
| id | code | site_id | site_name |
+------+------+---------+-----------+
| 1 | 20 | 1 | aaa |
+------+------+---------+-----------+
1 row in set (0.001 sec)

How to find the shops whose customers are only men?

I have this schema:
PERSON(Name, Sex)
FREQUENTS(Name, Shop)
My question is, how do I find the shops whose clients are exclusively men?
You can use a GROUP BY with HAVING:
SELECT frequents.shop
FROM frequents LEFT JOIN person ON frequents.name = person.name
GROUP BY frequents.shop
HAVING SUM(person.sex = 'female') = 0 AND SUM(person.sex = 'male') > 0
select shop
from
(select shop , (case when sex = 'Male' then 1 else 2 end)s_cnt
from frequents a11
join person a12
on a11.name = a12.name
group by shop , (case when sex = 'Male' then 1 else 2 end)
) a11
group by shop
having sum(s_cnt) = 1
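For example, with NOT EXISTS (the subquery checks every customer of the shop, not just the current row):
select distinct f.shop
from frequents f
where not exists (
select 1
from frequents f2
join person p on p.name = f2.name
where f2.shop = f.shop and p.sex = 'female'
)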
However, according to this test it may be better to use the IS NULL approach:
select distinct f.shop
from frequents f
left join (
select f2.shop
from frequents f2
join person p on p.name = f2.name and p.sex = 'female'
) x on x.shop = f.shop
where x.shop is null
If you want exclusively men, then I would think:
SELECT f.shop
FROM frequents f JOIN
person p
ON f.name = p.name
GROUP BY f.shop
HAVING MIN(p.sex) = MAX(p.sex) AND -- all sex values are the same or NULL
COUNT(p.sex) = COUNT(*) AND -- no NULL values
MIN(p.sex) = 'male' -- the value is male
This version does not assume that there are only two genders.
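The same "every row in the group must qualify" idea can also be written as a single comparison between the group size and the number of male customers. This is a sketch on SQLite through Python's sqlite3 module with invented sample data; treating a customer whose sex is NULL as "not known to be male" is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (name TEXT PRIMARY KEY, sex TEXT);
    CREATE TABLE frequents (name TEXT, shop TEXT);
    INSERT INTO person VALUES
        ('Adam', 'male'), ('Ben', 'male'), ('Cara', 'female'), ('Dana', NULL);
    INSERT INTO frequents VALUES
        ('Adam', 'Barber'), ('Ben', 'Barber'),
        ('Adam', 'Bakery'), ('Cara', 'Bakery'),
        ('Ben', 'Kiosk'), ('Dana', 'Kiosk');
""")

men_only = [row[0] for row in conn.execute("""
    SELECT f.shop
    FROM frequents f
    LEFT JOIN person p ON p.name = f.name
    GROUP BY f.shop
    -- a shop qualifies only if every one of its customers counts as male
    HAVING COUNT(*) = SUM(CASE WHEN p.sex = 'male' THEN 1 ELSE 0 END)
    ORDER BY f.shop
""")]

print(men_only)  # ['Barber']
```

Bakery is excluded because of Cara, and Kiosk because Dana's sex is unknown.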

How can those two SQL statements be combined into one?

I wrote these two SQL statements and would like to combine them; the second is based on the results of the first. I checked this post, but it does not look like it is based on results. How could I achieve this?
First sql:
SELECT
`potential`.*,
`customer`.`ID` as 'FID_customer'
FROM
`os_potential` as `potential`,
`os_customer` as `customer`
WHERE `potential`.`FID_author` = :randomID
AND `potential`.`converted` = 1
AND `potential`.`street` = `customer`.`street`
AND `potential`.`zip` = `customer`.`zip`
AND `potential`.`city` = `customer`.`city`;
Second sql:
SELECT
sum(`order`.`price_customer`) as 'Summe'
FROM
`os_order` as `order`,
`RESUTS_FROM_PREVIOUS_SQL_STATEMENT` as `results`
WHERE `order`.`FID_status` = 10
AND `results`.`FID_customer` = `order`.`FID_customer`;
I would like to get everything from first sql + the 'Summe' from second sql.
TABLES
1.Potentials:
+----+------------+-----------+--------+-----+------+
| ID | FID_author | converted | street | zip | city |
+----+------------+-----------+--------+-----+------+
2.Customers:
+----+--------+-----+------+
| ID | street | zip | city |
+----+--------+-----+------+
3.Orders:
+----+--------------+----------------+
| ID | FID_customer | price_customer |
+----+--------------+----------------+
SELECT p.*
, c.ID FID_customer
, o.summe
FROM os_potential p
JOIN os_customer c
ON c.street = p.street
AND c.zip = p.zip
AND c.city = p.city
JOIN
( SELECT FID_customer
, SUM(price_customer) Summe
FROM os_order
WHERE FID_status = 10
GROUP
BY FID_customer
) o
ON o.FID_customer = c.ID
WHERE p.FID_author = :randomID
AND p.converted = 1
;
You would just write a single query like this:
SELECT SUM(o.price_customer) AS Summe
FROM os_potential p JOIN
os_customer c
ON p.street = c.street AND p.zip = c.zip AND p.city = c.city JOIN
os_order o
ON o.FID_customer = c.ID
WHERE p.FID_author = :randomID AND p.converted = 1 AND
o.FID_status = 10;
Notes:
Never use commas in the FROM clause. Always use explicit JOIN syntax with conditions in an ON clause.
Table aliases are easier to follow when they are short. Abbreviations of the table names are commonly used.
Backticks are only necessary when the table/column name needs to be escaped. Yours don't need to be escaped.
If the first query returns one record per customer, then simply join the three tables, keep the sum, and use a GROUP BY clause:
SELECT
`potential`.*,
`customer`.`ID` as 'FID_customer',
sum(`order`.`price_customer`) as Summe
FROM
`os_potential` as `potential`
INNER JOIN
`os_customer` as `customer`
ON `potential`.`street` = `customer`.`street`
AND `potential`.`zip` = `customer`.`zip`
AND `potential`.`city` = `customer`.`city`
LEFT JOIN
`os_order` as `order`
ON `customer`.`ID` = `order`.`FID_customer`
AND `order`.`FID_status` = 10
WHERE `potential`.`FID_author` = :randomID
AND `potential`.`converted` = 1
GROUP BY `customer`.`ID`, <list all fields from potential table>
If the first query may return multiple records per customer, then you need to do the summing in a subquery:
SELECT
`potential`.*,
`customer`.`ID` as 'FID_customer',
`order`.Summe
FROM
`os_potential` as `potential`
INNER JOIN
`os_customer` as `customer`
ON `potential`.`street` = `customer`.`street`
AND `potential`.`zip` = `customer`.`zip`
AND `potential`.`city` = `customer`.`city`
LEFT JOIN
(SELECT FID_customer, sum(price_customer) as Summe
FROM `os_order`
WHERE FID_status=10
GROUP BY FID_customer
) as `order`
ON `customer`.`ID` = `order`.`FID_customer`
WHERE `potential`.`FID_author` = :randomID
AND `potential`.`converted` = 1
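The pre-aggregated subquery can be joined with either an inner or a left join, and the choice decides whether potentials without any matching order survive. The sketch below compares both on SQLite through Python's sqlite3 module. The sample rows, the FID_status column on os_order (used by the queries above but not shown in the table listing), and the helper name potentials_with_sum are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE os_potential (ID INTEGER, FID_author INTEGER,
        converted INTEGER, street TEXT, zip TEXT, city TEXT);
    CREATE TABLE os_customer (ID INTEGER, street TEXT, zip TEXT, city TEXT);
    CREATE TABLE os_order (ID INTEGER, FID_customer INTEGER,
        FID_status INTEGER, price_customer INTEGER);
    INSERT INTO os_potential VALUES
        (1, 7, 1, 'Main St', '1000', 'Bern'),
        (2, 7, 1, 'Oak Ave', '2000', 'Basel'),
        (3, 8, 1, 'Main St', '1000', 'Bern');
    INSERT INTO os_customer VALUES
        (10, 'Main St', '1000', 'Bern'), (20, 'Oak Ave', '2000', 'Basel');
    INSERT INTO os_order VALUES
        (1, 10, 10, 100), (2, 10, 10, 50), (3, 10, 5, 999), (4, 20, 5, 70);
""")

def potentials_with_sum(conn, author_id, join_kind):
    """Rows of (potential ID, customer ID, Summe) for one author."""
    if join_kind not in ("JOIN", "LEFT JOIN"):
        raise ValueError("join_kind must be 'JOIN' or 'LEFT JOIN'")
    sql = f"""
        SELECT p.ID, c.ID AS FID_customer, o.Summe
        FROM os_potential p
        JOIN os_customer c
          ON c.street = p.street AND c.zip = p.zip AND c.city = p.city
        {join_kind} (
            -- one row per customer, so the outer join cannot multiply rows
            SELECT FID_customer, SUM(price_customer) AS Summe
            FROM os_order
            WHERE FID_status = 10
            GROUP BY FID_customer
        ) o ON o.FID_customer = c.ID
        WHERE p.FID_author = :author AND p.converted = 1
        ORDER BY p.ID
    """
    return conn.execute(sql, {"author": author_id}).fetchall()

print(potentials_with_sum(conn, 7, "JOIN"))       # [(1, 10, 150)]
print(potentials_with_sum(conn, 7, "LEFT JOIN"))  # [(1, 10, 150), (2, 20, None)]
```

With the inner join, potential 2 disappears because its customer has no order with status 10; with the left join it stays, with a NULL sum.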
I think you should use a subselect, but be careful with the number of results; it is not the best for performance.
You can do something like this:
SELECT n1, n2, (select count(1) from whatever_table) as n3, n4 from whatever_table
Note that the subselect must return exactly one row; otherwise you will get an error.

MySql: How to check field has data in MySql?

Here is my code and result:
SELECT DISTINCT
CAL.CarListingId,
(SELECT
IF(REPLACE(carImage.ImageUrl,'~','') IS NULL,'asdf',REPLACE(carImage.ImageUrl,'~',''))
FROM
carImage
WHERE
IsMainImage = 1 AND Status = 1
AND CarListingId = CAL.CarListingId) AS ImageUrl,
CAL.ListingNumber,
CAL.Caption,
CAL.Year,
CAL.Km,
CAL.Color,
CAL.Price,
CONCAT((SELECT Name FROM City WHERE CityId IN (SELECT CityId FROM County WHERE CountyId = CAL.CountyId)),'/', (SELECT Name FROM County WHERE CountyId = CAL.CountyId)) AS Region,
CAL.Creation
FROM
carlisting AS CAL
INNER JOIN
User AS U ON U.UserId = CAL.CreatedBy
INNER JOIN
carlistingcategory AS CLC ON CLC.CarListingId = CAL.CarListingId
LEFT JOIN CarImage AS CI ON CI.CarListingId = CAL.CarListingId
ORDER BY CAL.Creation;
I use this query as a subquery in another query, and I need to check whether its result is `NULL`. But as you can see there is no data, so `IS NULL` returns false. How can I check whether the subquery has data?
Try this query:
SELECT DISTINCT
CAL.CarListingId,
CAL.ListingNumber,
CAL.Caption,
CAL.Year,
CAL.Km,
CAL.Color,
CAL.Price,
CONCAT((SELECT Name FROM City WHERE CityId IN (SELECT CityId FROM County WHERE CountyId = CAL.CountyId)),'/', (SELECT Name FROM County WHERE CountyId = CAL.CountyId)) AS Region,
CAL.Creation,
( case when CI.ImageUrl IS NULL then 'asdf' else REPLACE(CI.ImageUrl,'~','')
end) AS ImageUrl
FROM
carlisting AS CAL
LEFT JOIN CarImage AS CI ON CI.CarListingId = CAL.CarListingId
INNER JOIN User AS U ON U.UserId = CAL.CreatedBy
INNER JOIN carlistingcategory AS CLC ON CLC.CarListingId = CAL.CarListingId
ORDER BY CAL.Creation;
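A small sketch of the LEFT JOIN approach with the default applied outside the lookup, so that listings without a matching image still get a value. It uses SQLite through Python's sqlite3 module rather than MySQL, COALESCE in place of the CASE expression, and invented sample rows; only the columns needed for the illustration are created, and the main-image filter from the question is placed in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carlisting (CarListingId INTEGER PRIMARY KEY, Caption TEXT);
    CREATE TABLE carImage (CarListingId INTEGER, ImageUrl TEXT,
        IsMainImage INTEGER, Status INTEGER);
    INSERT INTO carlisting VALUES (1, 'Sedan'), (2, 'Coupe'), (3, 'Van');
    INSERT INTO carImage VALUES
        (1, '~/img/1.jpg', 1, 1),   -- main image, active
        (3, '~/img/3.jpg', 0, 1);   -- not the main image
""")

rows = conn.execute("""
    SELECT CAL.CarListingId,
           -- REPLACE(NULL, ...) stays NULL, so COALESCE supplies the default
           COALESCE(REPLACE(CI.ImageUrl, '~', ''), 'asdf') AS ImageUrl
    FROM carlisting AS CAL
    LEFT JOIN carImage AS CI
           ON CI.CarListingId = CAL.CarListingId
          AND CI.IsMainImage = 1
          AND CI.Status = 1
    ORDER BY CAL.CarListingId
""").fetchall()

print(rows)  # [(1, '/img/1.jpg'), (2, 'asdf'), (3, 'asdf')]
```

Keeping the image conditions in the ON clause, not in WHERE, is what preserves the listings that have no qualifying image.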