Selecting rows with multiple values from other rows with MySQL

Say I have the following table, named data:
ID foo1 foo2 foo3
1 11 22 33
2 22 17 92
3 31 33 53
4 53 22 11
5 43 23 9
I want to select all rows where either foo1, foo2 or foo3 match either of these columns in the first row. That is, I want all rows where at least one of the foos appears also in the first row. In the example above, I want to select rows 1, 2, 3 and 4. I thought that I could use something like
SELECT * FROM data WHERE foo1 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
OR foo2 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
OR foo3 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
but this does not seem to work. I can, of course, use
WHERE foo1=(SELECT foo1 FROM data WHERE ID=1)
OR foo1=(SELECT foo2 FROM data WHERE ID=1)
OR ...
but that would involve many lines, and in my real data set there are actually 16 columns, so it will really be a pain in the lower back. Is there a more sophisticated way to do so?
Also, what should I do if I also want to count the number of hits (in the example above, get 3 for row 1, 2 for row 4, and 1 for rows 2 and 3)?

SELECT data.*,
(data.foo1 IN (t.foo1, t.foo2, t.foo3))
+ (data.foo2 IN (t.foo1, t.foo2, t.foo3))
+ (data.foo3 IN (t.foo1, t.foo2, t.foo3)) AS number_of_hits
FROM data JOIN data t ON t.id = 1
WHERE data.foo1 IN (t.foo1, t.foo2, t.foo3)
OR data.foo2 IN (t.foo1, t.foo2, t.foo3)
OR data.foo3 IN (t.foo1, t.foo2, t.foo3)
See it on sqlfiddle.
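To check the self-join idea against the sample table from the question, here is a minimal sketch that runs the same query through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption made so the example is self-contained); both engines evaluate an IN test to 0 or 1, which is what makes the addition work.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ID INTEGER PRIMARY KEY, foo1 INT, foo2 INT, foo3 INT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, 11, 22, 33), (2, 22, 17, 92), (3, 31, 33, 53), (4, 53, 22, 11), (5, 43, 23, 9)],
)

# Same shape as the query above: pair every row with row 1 (alias t),
# then add up the 0/1 results of the three IN tests.
hits = conn.execute(
    """
    SELECT data.ID,
           (data.foo1 IN (t.foo1, t.foo2, t.foo3))
         + (data.foo2 IN (t.foo1, t.foo2, t.foo3))
         + (data.foo3 IN (t.foo1, t.foo2, t.foo3)) AS number_of_hits
    FROM data JOIN data t ON t.ID = 1
    WHERE data.foo1 IN (t.foo1, t.foo2, t.foo3)
       OR data.foo2 IN (t.foo1, t.foo2, t.foo3)
       OR data.foo3 IN (t.foo1, t.foo2, t.foo3)
    ORDER BY data.ID
    """
).fetchall()
print(hits)
```

Row 5 shares no value with row 1, so it is filtered out by the WHERE clause; row 1 matches itself in all three columns.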
Actually, on reflection, you might consider normalising your data:
CREATE TABLE data_new (
ID BIGINT UNSIGNED NOT NULL,
foo_number TINYINT UNSIGNED NOT NULL,
val INT,
PRIMARY KEY (ID, foo_number),
INDEX (val)
);
INSERT INTO data_new
(ID, foo_number, val)
SELECT ID, 1, foo1 FROM data
UNION ALL SELECT ID, 2, foo2 FROM data
UNION ALL SELECT ID, 3, foo3 FROM data;
DROP TABLE data;
Then you can do:
SELECT ID,
MAX(IF(foo_number=1,val,NULL)) AS foo1,
MAX(IF(foo_number=2,val,NULL)) AS foo2,
MAX(IF(foo_number=3,val,NULL)) AS foo3,
number_of_hits
FROM data_new JOIN (
SELECT d1.ID, COUNT(*) AS number_of_hits
FROM data_new d1 JOIN data_new d2 USING (val)
WHERE d2.ID = 1
GROUP BY d1.ID
) t USING (ID)
GROUP BY ID
See it on sqlfiddle.
As you can see from the execution plan, this will be considerably more efficient for large data sets.
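For anyone who wants to try the normalised layout without a MySQL server, the following is a rough sketch using Python's sqlite3 module. It deviates from the answer in a few labelled ways: SQLite is assumed as a stand-in, CASE replaces MySQL's IF(), the index name idx_val is made up, and the hit count is wrapped in MAX() so the outer GROUP BY is valid on stricter engines too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data_new (
        ID         INTEGER NOT NULL,
        foo_number INTEGER NOT NULL,
        val        INT,
        PRIMARY KEY (ID, foo_number)
    );
    CREATE INDEX idx_val ON data_new (val);  -- idx_val is a hypothetical name
    """
)

# Unpivot the wide sample rows into (ID, foo_number, val) triples.
wide_rows = [(1, 11, 22, 33), (2, 22, 17, 92), (3, 31, 33, 53), (4, 53, 22, 11), (5, 43, 23, 9)]
for row_id, *values in wide_rows:
    conn.executemany(
        "INSERT INTO data_new VALUES (?, ?, ?)",
        [(row_id, n, v) for n, v in enumerate(values, start=1)],
    )

# Inner query: count values shared with ID 1. Outer query: pivot back to wide form.
pivoted = conn.execute(
    """
    SELECT data_new.ID,
           MAX(CASE WHEN foo_number = 1 THEN val END) AS foo1,
           MAX(CASE WHEN foo_number = 2 THEN val END) AS foo2,
           MAX(CASE WHEN foo_number = 3 THEN val END) AS foo3,
           MAX(t.number_of_hits) AS number_of_hits
    FROM data_new
    JOIN (
        SELECT d1.ID AS ID, COUNT(*) AS number_of_hits
        FROM data_new d1 JOIN data_new d2 USING (val)
        WHERE d2.ID = 1
        GROUP BY d1.ID
    ) t ON t.ID = data_new.ID
    GROUP BY data_new.ID
    ORDER BY data_new.ID
    """
).fetchall()
print(pivoted)
```

With 16 columns the only part that grows is the list of pivot expressions; the matching logic in the inner query stays the same.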

There are several ways to get the result set.
Here's one approach (if you don't care which fooN gets matched with which fooN, and also want to return that "first" row).
SELECT DISTINCT d.*
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
That ON clause could also be written like this:
ON d.foo1 = f.foo
OR d.foo2 = f.foo
OR d.foo3 = f.foo
To get a "count" of the hits...
SELECT d.id
, d.foo1
, d.foo2
, d.foo3
, SUM( IFNULL(d.foo1=f.foo,0)
+IFNULL(d.foo2=f.foo,0)
+IFNULL(d.foo3=f.foo,0)
) AS count_of_hits
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
GROUP
BY d.id
, d.foo1
, d.foo2
, d.foo3
eggyal is right, as usual. Getting the count of hits is actually much simpler: we can just use a SUM(1) or COUNT(1) aggregate, no need to run all those comparisons, we've already done all the necessary comparisons.
SELECT d.id
, d.foo1
, d.foo2
, d.foo3
, COUNT(1) AS count_of_hits
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
GROUP
BY d.id
, d.foo1
, d.foo2
, d.foo3
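The UNION ALL version with COUNT(1) can be tried on the question's sample rows like this. It is only a sketch: SQLite (via Python's sqlite3 module) is assumed as a stand-in for MySQL, and the derived table is introduced with FROM so the statement parses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ID INTEGER PRIMARY KEY, foo1 INT, foo2 INT, foo3 INT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, 11, 22, 33), (2, 22, 17, 92), (3, 31, 33, 53), (4, 53, 22, 11), (5, 43, 23, 9)],
)

# f unpivots row 1 into one row per value; each (f row, d row) pair that
# satisfies the ON clause is one hit, so COUNT(1) per d row is the hit count.
counted = conn.execute(
    """
    SELECT d.ID, d.foo1, d.foo2, d.foo3, COUNT(1) AS count_of_hits
    FROM ( SELECT foo1 AS foo FROM data WHERE ID = 1
           UNION ALL
           SELECT foo2 FROM data WHERE ID = 1
           UNION ALL
           SELECT foo3 FROM data WHERE ID = 1
         ) f
    JOIN data d
      ON f.foo IN (d.foo1, d.foo2, d.foo3)
    GROUP BY d.ID, d.foo1, d.foo2, d.foo3
    ORDER BY d.ID
    """
).fetchall()
print(counted)
```

One thing to keep in mind: COUNT(1) counts matching values from row 1, while the SUM(IFNULL(...)) variant counts matching columns, so the two differ for a row that holds the same matching value in two of its columns.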

Related

SQL Join where column a in table 1 matches a value from 1 of 4 columns (a-d) in table 2

Goal:
Trying to join together two tables
Table structure:
t1:
name | id
t2:
id_a | id_b | id_c | id_d | favorite color
Problem:
I'm trying to find out the favorite color that corresponds to each name, where the t1.id is found in 1 of the 4 id fields in t2. The tricky part is that the non-matching values aren't null, so a coalesce doesn't work.
What I've tried:
Tried a case when statement in the join, but that seems to be creating some endless loop that is never finishing.
Trying a union, but that is creating some unexpected duplication.
Also tried a multi- on condition (like below), but that's not working:
WITH test AS (
SELECT
t1.*
, t2.*
FROM t1
LEFT JOIN t2
ON ( t1.id = t2.id_a
OR t1.id = t2.id_b
OR t1.id = t2.id_c
OR t1.id = t2.id_d
)
)
SELECT COUNT(*) FROM test
;
Here's an example dataset:
WITH names AS(
SELECT
1 as id , 'alfred' as name
UNION ALL SELECT 2, 'becca'
UNION ALL SELECT 3, 'charlie'
UNION ALL SELECT 4, 'dezi'
)
, color AS(
SELECT
1 as id_a, 6 as id_b, 9 as id_c, 7 as id_d, 'green' as fave_color
UNION ALL SELECT 1,2,6,5, 'orange'
UNION ALL SELECT 5,7,9,3, 'blue'
UNION ALL SELECT 9,4,6,8, 'black'
)
SELECT
n.id
, n.name
, c.fave_color
FROM color c
LEFT JOIN names n
ON n.id IN (c.id_a,c.id_b,c.id_c,c.id_d)
GROUP BY 1,2,3
ORDER BY 1,2
;
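One possible reading of the goal, sketched on the example dataset above: start from the names and join each one to every color row whose four id columns contain it. This is an illustration rather than a confirmed answer; SQLite (through Python's sqlite3 module) is assumed as the engine, and the sample rows are loaded into real tables instead of CTEs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE names (id INT, name TEXT);
    CREATE TABLE color (id_a INT, id_b INT, id_c INT, id_d INT, fave_color TEXT);
    """
)
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "alfred"), (2, "becca"), (3, "charlie"), (4, "dezi")],
)
conn.executemany(
    "INSERT INTO color VALUES (?, ?, ?, ?, ?)",
    [(1, 6, 9, 7, "green"), (1, 2, 6, 5, "orange"), (5, 7, 9, 3, "blue"), (9, 4, 6, 8, "black")],
)

# LEFT JOIN from names keeps a name even when no color row mentions its id.
matches = conn.execute(
    """
    SELECT n.id, n.name, c.fave_color
    FROM names n
    LEFT JOIN color c
      ON n.id IN (c.id_a, c.id_b, c.id_c, c.id_d)
    ORDER BY n.id, c.fave_color
    """
).fetchall()
print(matches)
```

Id 1 appears in both the green and the orange row, so alfred comes back twice. That may be the "unexpected duplication" mentioned above: it follows from the data, not from the join syntax.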

Slow query on counting items

I have a list of filters and I have to count how many items are in each filter, but the following query gets slower and slower as more filters are set:
SELECT COUNT(*) FROM (
SELECT itf.`filter_id`
FROM `item_to_filter` AS `itf`
JOIN `item_to_inventory` AS `iti` ON (itf.`item_id` = iti.`item_id` AND iti.`quantity` > 0)
WHERE 1 = 1
AND (
(itf.`filter_group_id` = 2 AND itf.`filter_id` IN (1))
OR (itf.`filter_group_id` = 4 AND itf.`filter_id` IN (55)) -- gets slower
OR (itf.`filter_group_id` = 1 AND itf.`filter_id` IN (107, 108)) -- gets much slower
)
GROUP BY itf.`item_id`
HAVING COUNT(DISTINCT itf.`filter_group_id`) = 3
) AS `total_items`
Is there any other way to write the query to count items from each filter?
Here you can see the table structure and data (are the indexes set correctly?)
Try this one: Wrap the conditions into a UNION ALL subquery. Then join it with your tables.
SELECT COUNT(*) FROM (
SELECT itf.filter_id
FROM (
SELECT 2 as filter_group_id, 1 as filter_id UNION ALL
SELECT 4 as filter_group_id, 55 as filter_id UNION ALL
SELECT 1 as filter_group_id, 107 as filter_id UNION ALL
SELECT 1 as filter_group_id, 108 as filter_id
) f -- filters
JOIN item_to_filter AS itf USING(filter_group_id, filter_id)
JOIN item_to_inventory AS iti ON itf.item_id = iti.item_id
WHERE iti.quantity > 0
GROUP BY itf.item_id
HAVING COUNT(DISTINCT itf.filter_group_id) = 3
) AS total_items
You should define an index on item_to_filter(filter_group_id, filter_id). But as I wrote in the comment, it's possible that the engine can do the same optimization if that index is present.
To make a better use of the given indexes you could rewrite your query to
SELECT COUNT(*) FROM (
SELECT item_id
FROM item_to_inventory AS iti
JOIN item_to_filter AS itf2 USING(item_id) -- itf2: group_id = 2
JOIN item_to_filter AS itf4 USING(item_id) -- itf4: group_id = 4
JOIN item_to_filter AS itf1 USING(item_id) -- itf1: group_id = 1
WHERE iti.quantity > 0
AND itf2.filter_group_id = 2 AND itf2.filter_id IN (1)
AND itf4.filter_group_id = 4 AND itf4.filter_id IN (55)
AND itf1.filter_group_id = 1 AND itf1.filter_id IN (107,108)
) AS total_items
But even here an index on item_to_filter(filter_group_id, filter_id) should improve the performance.
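Since the linked table structure is not reproduced here, the sketch below uses made-up rows (every item and quantity in it is hypothetical) to show how the UNION ALL filter table and the HAVING clause interact. SQLite via Python's sqlite3 module is assumed as a stand-in for MySQL, and the index name is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item_to_filter (item_id INT, filter_group_id INT, filter_id INT);
    CREATE TABLE item_to_inventory (item_id INT, quantity INT);
    -- Composite index suggested in the answer (name is hypothetical).
    CREATE INDEX idx_group_filter ON item_to_filter (filter_group_id, filter_id);
    """
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO item_to_filter VALUES (?, ?, ?)",
    [
        (1, 2, 1), (1, 4, 55), (1, 1, 107),               # item 1: all three groups
        (2, 2, 1), (2, 4, 55),                            # item 2: only two groups
        (3, 2, 1), (3, 4, 55), (3, 1, 108),               # item 3: all groups, out of stock
        (4, 2, 1), (4, 4, 55), (4, 1, 107), (4, 1, 108),  # item 4: two filters in group 1
    ],
)
conn.executemany("INSERT INTO item_to_inventory VALUES (?, ?)", [(1, 5), (2, 3), (3, 0), (4, 2)])

total_items = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT itf.item_id
        FROM (
            SELECT 2 AS filter_group_id, 1 AS filter_id UNION ALL
            SELECT 4, 55 UNION ALL
            SELECT 1, 107 UNION ALL
            SELECT 1, 108
        ) f
        JOIN item_to_filter AS itf USING (filter_group_id, filter_id)
        JOIN item_to_inventory AS iti ON itf.item_id = iti.item_id
        WHERE iti.quantity > 0
        GROUP BY itf.item_id
        HAVING COUNT(DISTINCT itf.filter_group_id) = 3
    ) AS total_items
    """
).fetchone()[0]
print(total_items)
```

Only items 1 and 4 are in stock and match a filter in all three groups. Item 4 matches two filters in group 1 but is still counted once, because the HAVING clause counts distinct groups per item.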

how to SELECT and Concat() column values based on several conditions SQL query?

newbie here to SQL. So I have two tables, let's take for example the two tables below.
Table A
set_num s_id s_val
100 3 AA
100 5 BB
200 3 AA
200 9 CC
Table B
s_id s_val phrase seq
1 DD 'hi' 'first'
3 AA 'hello' 'first'
6 EE 'goodnight' 'first'
5 BB 'world' 'second'
9 CC 'there' 'second'
4 FF 'bye' 'first'
I want to join Table A with Table B on two columns, like a composite key (s_id, s_val), and I want to return
set_num from Table A and the concatenation of phrases in Table B (which we will call entire_phrase, concat(...) AS entire_phrase).
The concatenation should also follow an order in which the phrases are to be concatenated. This will be determined by the seq column in Table B for each phrase: "first" indicates that the phrase needs to come first, and "second", well, comes next. I would like to do this with a SELECT query, but I'm not sure if this is possible without it getting too complex. Can I do this in a SELECT, or does this call for another approach?
Expected Output:
set_num entire_phrase
100 'hello world'
200 'hello there'
And not
set_num entire_phrase
100 'world hello'
200 'there hello'
Any help/approach will be greatly appreciated!
You can do it like this:
select temp1.set_num, concat(phrase1,' ',phrase2) as entire_phrase
from (
(
select set_num, b.phrase as phrase1
from TableA as a
join TableB as b
on a.s_id = b.s_id
and a.s_val = b.s_val
and b.seq = 'first'
) as temp1
join
(
select set_num, b.phrase as phrase2
from TableA as a
join TableB as b
on a.s_id = b.s_id
and a.s_val = b.s_val
and b.seq = 'second'
) as temp2
on temp1.set_num = temp2.set_num
)
Running here: http://sqlfiddle.com/#!9/d63ac3/1
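In case the fiddle link stops working, here is a self-contained sketch of the same two-subquery join on the sample tables, run through Python's sqlite3 module. SQLite is an assumed stand-in for MySQL, so the `||` operator is used in place of concat().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableA (set_num INT, s_id INT, s_val TEXT);
    CREATE TABLE TableB (s_id INT, s_val TEXT, phrase TEXT, seq TEXT);
    """
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(100, 3, "AA"), (100, 5, "BB"), (200, 3, "AA"), (200, 9, "CC")],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?, ?, ?)",
    [
        (1, "DD", "hi", "first"), (3, "AA", "hello", "first"), (6, "EE", "goodnight", "first"),
        (5, "BB", "world", "second"), (9, "CC", "there", "second"), (4, "FF", "bye", "first"),
    ],
)

# temp1 holds each set's 'first' phrase, temp2 its 'second' phrase;
# joining them on set_num fixes the order of concatenation.
phrases = conn.execute(
    """
    SELECT temp1.set_num, temp1.phrase1 || ' ' || temp2.phrase2 AS entire_phrase
    FROM (
        SELECT a.set_num, b.phrase AS phrase1
        FROM TableA AS a
        JOIN TableB AS b
          ON a.s_id = b.s_id AND a.s_val = b.s_val AND b.seq = 'first'
    ) AS temp1
    JOIN (
        SELECT a.set_num, b.phrase AS phrase2
        FROM TableA AS a
        JOIN TableB AS b
          ON a.s_id = b.s_id AND a.s_val = b.s_val AND b.seq = 'second'
    ) AS temp2
      ON temp1.set_num = temp2.set_num
    ORDER BY temp1.set_num
    """
).fetchall()
print(phrases)
```

This shape assumes every set has exactly one 'first' and one 'second' phrase; a set missing either one drops out of the inner join.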

SQL, add column with summed values of the same type

My current table looks like this:
ID TYPE QUANTITY
1 A1 3
2 B1 2
3 A1 2
4 B1 8
And after doing the query I want to get that:
ID TYPE QUANTITY SUM
1 A1 3 5
2 B1 2 10
3 A1 2 5
4 B1 8 10
The SUM column consist of summed quantities of items with the same type.
My approach is to use a derived table which aggregates the quantity by type first and then join this result with the original data:
select
t.id,
t.type,
t.quantity,
tmp.overall
from
table t join (
select
table.type,
sum(table.quantity) as overall
from
table
group by
table.type
) tmp on t.type = tmp.type
SELECT t.ID,t.TYPE,t.QUANTITY,x.SUM FROM TABLE t LEFT JOIN
(SELECT TYPE,SUM(QUANTITY) AS SUM FROM TABLE GROUP BY TYPE)x
ON t.type=x.type
SQL Fiddle
I haven't tried the query but see what happens if you do this:
SELECT
ID,
myTable.TYPE,
QUANTITY,
aaa.summy
FROM myTable
JOIN
(
SELECT
TYPE,
SUM(QUANTITY) summy
FROM myTable
GROUP BY TYPE
) aaa
ON aaa.TYPE = myTable.TYPE
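The last query can be checked against the sample rows with a short script. This is a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for the real database, and the table name myTable and alias summy are simply the ones used in that answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER PRIMARY KEY, TYPE TEXT, QUANTITY INT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, "A1", 3), (2, "B1", 2), (3, "A1", 2), (4, "B1", 8)],
)

# The derived table aaa has one row per TYPE with its total;
# joining it back on TYPE attaches that total to every original row.
with_sums = conn.execute(
    """
    SELECT myTable.ID, myTable.TYPE, myTable.QUANTITY, aaa.summy
    FROM myTable
    JOIN (
        SELECT TYPE, SUM(QUANTITY) AS summy
        FROM myTable
        GROUP BY TYPE
    ) aaa
      ON aaa.TYPE = myTable.TYPE
    ORDER BY myTable.ID
    """
).fetchall()
print(with_sums)
```

The derived table selects only the grouped column and the aggregate, which keeps the query valid on engines that reject non-aggregated columns in a GROUP BY query.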

Count Duplicates with same id passing in one column

Hi there, I'm trying to calculate the row count for the same value,
id,value
1 | a
2 | b
3 | c
4 | d
5 | e
and my query is
select value, count(*) as Count from mytable where id in('4','2','4','1','4') group by value having count(*) > 1
for which my expected output will be,
value,Count
d | 3
b | 1
a | 1
Thanks, any help will be appreciated
Try that:
SELECT value, count(value) AS Count
FROM mytable m
WHERE value = m.value
GROUP BY value
SELECT t.id, t.value, COUNT(t.id)
FROM
test t
JOIN
( SELECT 1 AS id
UNION ALL SELECT 3
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 1
UNION ALL SELECT 1 ) AS tmp
ON t.id = tmp.id
GROUP BY t.id
Sample on sqlfiddle.com
See also: Force MySQL to return duplicates from WHERE IN clause without using JOIN/UNION?
Of course, your IN parameter will be dynamic, and thus you will have to generate the corresponding SQL statement for the tmp table.
That's the SQL-only way to do it. Another possibility is to have the query like you have it in your question and afterwards programmatically associate the rows to the count passed to the IN parameter.
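Both routes can be sketched in a few lines of Python, using the id list from the question. Everything here is illustrative: SQLite (via the sqlite3 module) is an assumed stand-in for MySQL, and the variable names are made up. The tmp table is built with one placeholder per id, so the values are bound as parameters and not pasted into the SQL text.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
)

ids = [4, 2, 4, 1, 4]  # the list that was passed to IN (...) in the question

# SQL-only route: generate "SELECT ? AS id UNION ALL SELECT ? AS id ..."
# so that repeated ids survive as repeated rows in tmp.
tmp_sql = " UNION ALL ".join("SELECT ? AS id" for _ in ids)
counts_sql = conn.execute(
    f"""
    SELECT t.value, COUNT(*) AS cnt
    FROM mytable t
    JOIN ({tmp_sql}) AS tmp ON t.id = tmp.id
    GROUP BY t.value
    ORDER BY t.value
    """,
    ids,
).fetchall()

# Programmatic route: query each distinct id once, then attach the
# multiplicity counted on the client side.
wanted = Counter(ids)
placeholders = ", ".join("?" for _ in wanted)
rows = conn.execute(
    f"SELECT id, value FROM mytable WHERE id IN ({placeholders})", list(wanted)
).fetchall()
counts_py = sorted((value, wanted[row_id]) for row_id, value in rows)

print(counts_sql)
print(counts_py)
```

Both give d a count of 3 and a and b a count of 1. The second route keeps the SQL short when the id list is long.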