MySQL multiple inner join don't work

I have two tables A & B
A :
id
data
B :
key
value
A_id
I have a problem with my SQL query (it's hard to explain, so I created an sqlfiddle):
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '18' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
In this example I expect to get A.id = 12 and A.id = 13, but the query returns nothing.
CREATE TABLE A
(`id` int, `data` int)
;
INSERT INTO A
(`id`, `data`)
VALUES
(11, 11),
(12, 11),
(13, 12)
;
CREATE TABLE B
(`key` int, `value` int, `A_id` int)
;
INSERT INTO B
(`key`, `value`, `A_id`)
VALUES
(20, 1900, 12),
(2, 19, 11),
(11, 19, 11),
(9, 19, 11),
(18, 1950, 13),
(19, 1950, 12)
;
Any ideas?
Thanks.

First, if you code for more than 10 minutes, you will learn to despise the phrase "don't work"... "don't work" is the phrase that doesn't work.
/rant
You are trying to join tables in an effort to filter. Instead, filter accordingly. Check this out:
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.A_id = A.id
WHERE
(b1.key = '20' AND b1.value = '1900')
OR
(b1.key = '18' AND b1.value >= '1900')
That asks for what you want and joins only when necessary.

INNER JOIN returns a row only when a match exists on both sides of the join. You join B twice against the same A.id, once for key 20 and once for key 18, so a row comes back only if a single A.id has both keys in B. In your data no A.id does, so the result is empty.
If you want the rows that match either condition, I'd suggest something different:
SELECT A.id
FROM A
INNER JOIN B on A.id = B.A_id
WHERE
(B.key = '20' AND B.value = '1900') OR (B.key = '18' AND B.value >= '1900')
;

Your query uses the same A.id for both joins, so b1.A_id and b2.A_id must be equal on every row returned. The only row with key = 18 has A_id = 13, while the only row with key = 20 has A_id = 12. You are therefore asking for a row where b1.A_id = b2.A_id while one is 12 and the other is 13, which cannot exist. If you change the query to
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '19' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
you'll get a return of 12 for A.id.
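To see the difference on the sample data, here is a minimal sketch that loads the question's tables into an in-memory database and runs both shapes of the query. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made so it runs anywhere), with integer literals instead of quoted ones; the helper name `ids` is made up for the illustration.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption, so the sketch is self-contained).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER, data INTEGER);
INSERT INTO A VALUES (11, 11), (12, 11), (13, 12);
CREATE TABLE B ("key" INTEGER, value INTEGER, A_id INTEGER);
INSERT INTO B VALUES
  (20, 1900, 12), (2, 19, 11), (11, 19, 11),
  (9, 19, 11), (18, 1950, 13), (19, 1950, 12);
""")

def ids(sql):
    """Hypothetical helper: run a query, return its first column sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Two joins: a single A row must own BOTH a key-20 row and a key-18 row.
both_keys = ids("""
SELECT A.id FROM A
JOIN B b1 ON b1."key" = 20 AND b1.A_id = A.id
JOIN B b2 ON b2."key" = 18 AND b2.A_id = A.id
WHERE b1.value = 1900 AND b2.value >= 1900
""")

# One join with OR: an A row qualifies if EITHER condition matches.
either_key = ids("""
SELECT A.id FROM A
JOIN B b ON b.A_id = A.id
WHERE (b."key" = 20 AND b.value = 1900)
   OR (b."key" = 18 AND b.value >= 1900)
""")

print(both_keys)   # []
print(either_key)  # [12, 13]
```

Note that with the OR form an A row that matches both conditions would appear twice, so add DISTINCT or GROUP BY A.id if that can happen in your data.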

Related

Need help to tune a complex query

I am trying to performance tune a query and need help with that
We have a requirement to pull in data on the website based on multiple factors which resulted in a complex query which works fine but is expensive. We have all the correct indexes.
What I am stuck on is removing the use of DISTINCT in the query (which seems to be the bottleneck).
I am no SQL expert and I think I've tried everything I could.
Any help to simplify this query and remove DISTINCT (or not use GROUP BY) will be much appreciated. Thanks.
SELECT DISTINCT t2.PK
FROM categories t0
JOIN cat2catrel t1 ON t1.SourcePK = t0.PK
JOIN categories t2 ON t1.TargetPK = t2.PK
JOIN cat2catrel t3 ON t3.SourcePK = t2.PK
JOIN categories t4 ON t3.TargetPK = t4.PK
JOIN cat2prodrel t5 ON t5.SourcePK = t4.PK
JOIN products t6 ON t5.TargetPK = t6.PK
JOIN stocklevels t7 ON t7.productcode = t6.code
JOIN relativeinventory t8 ON t7.p_inventory = t8.PK
JOIN warehouses t9 ON t7.warehouse = t9.PK
JOIN pos2warehouserel t10 ON t10.TargetPK = t9.PK
JOIN pos2warehouserel t10 ON t10.TargetPK = t9.PK
JOIN pointofservice item_t11 ON t10.SourcePK = t11.PK
WHERE ( t0.code = 'code'
AND t8.nventorystatus IN (1111)
AND t11.name = 'ABC')
AND ((t0.TypePkString IN (1, 2, 3, 4)
AND (( t0.catalogversion IN (1, 2, 3)))
AND t1.TypePkString=1000
AND t2.TypePkString IN (1, 2, 3, 4)
AND (( t2.catalogversion IN (1, 2)))
AND t3.TypePkString=300
AND t4.TypePkString IN (6, 7, 8)
AND (( t4.catalogversion IN (5, 6)))
AND t5.TypePkString=500
AND t6.TypePkString=600
AND t7.TypePkString=200
AND t8.TypePkString=700
AND t9.TypePkString IN (3, 7)
AND t10.TypePkString=900
AND t11.TypePkString=750 ));
Currently, this query works fine and provides the results I want. I just don't want to use DISTINCT or GROUP BY and still get unique results.
DB: MySQL 6.3
You're only selecting t2.PK, which appears to be the primary key of the categories table. If that is correct, the duplication is produced by your joins. In general, the easiest way to get rid of DISTINCT in this scenario is to start from the table you are selecting from and make sure you INNER JOIN to it. It appears you're already doing that, but you're duplicating the join: t0 and t2 are the same table, which is a big red flag to me. Try something like this:
SELECT t0.PK
FROM categories t0
INNER JOIN cat2catrel t1 ON t1.SourcePK = t0.PK AND t1.TargetPK = t0.PK
INNER JOIN cat2catrel t3 ON t3.SourcePK = t0.PK
INNER JOIN categories t4 ON t3.TargetPK = t4.PK
INNER JOIN cat2prodrel t5 ON t5.SourcePK = t4.PK
INNER JOIN products t6 ON t5.TargetPK = t6.PK
INNER JOIN stocklevels t7 ON t7.productcode = t6.code
INNER JOIN relativeinventory t8 ON t7.p_inventory = t8.PK
INNER JOIN warehouses t9 ON t7.warehouse = t9.PK
INNER JOIN pos2warehouserel t10 ON t10.TargetPK = t9.PK
INNER JOIN pointofservice t11 ON t10.SourcePK = t11.PK
WHERE ( t0.code = 'code'
AND t8.nventorystatus IN (1111)
AND t11.name = 'ABC')
AND ((t0.TypePkString IN (1, 2, 3, 4)
AND (( t0.catalogversion IN (1, 2, 3)))
AND t1.TypePkString=1000
AND t0.TypePkString IN (1, 2, 3, 4)
AND (( t0.catalogversion IN (1, 2)))
AND t3.TypePkString=300
AND t4.TypePkString IN (6, 7, 8)
AND (( t4.catalogversion IN (5, 6)))
AND t5.TypePkString=500
AND t6.TypePkString=600
AND t7.TypePkString=200
AND t8.TypePkString=700
AND t9.TypePkString IN (3, 7)
AND t10.TypePkString=900
AND t11.TypePkString=750 ));
However, I see other tables duplicated too, such as pos2warehouserel, which your query joins twice under the same alias t10. A table alias should be declared only once.
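As a side note on where the duplicates come from: joining a parent row to a one-to-many relation repeats the parent once per child, and an EXISTS semi-join is one common way to avoid that without DISTINCT. This is a different technique from the rewrite above, shown as a minimal sketch on a made-up two-table slice of the schema (the `code` values and the product keys are invented, and SQLite via Python's sqlite3 stands in for MySQL). Whether it is actually faster on the real query depends on the optimizer, so check it with EXPLAIN.

```python
import sqlite3

# Tiny hypothetical slice of the schema; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (PK INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE cat2prodrel (SourcePK INTEGER, TargetPK INTEGER);
INSERT INTO categories VALUES (1, 'shoes'), (2, 'hats'), (3, 'empty');
-- category 1 has three products, category 2 has one, category 3 has none
INSERT INTO cat2prodrel VALUES (1, 100), (1, 101), (1, 102), (2, 200);
""")

# Plain join: each category is repeated once per related product.
joined = [r[0] for r in conn.execute("""
SELECT c.PK FROM categories c
JOIN cat2prodrel r ON r.SourcePK = c.PK
ORDER BY c.PK
""")]

# Semi-join: each category appears at most once, no DISTINCT needed.
semi = [r[0] for r in conn.execute("""
SELECT c.PK FROM categories c
WHERE EXISTS (SELECT 1 FROM cat2prodrel r WHERE r.SourcePK = c.PK)
ORDER BY c.PK
""")]

print(joined)  # [1, 1, 1, 2]
print(semi)    # [1, 2]
```

In the real query the whole chain from t3 to t11 would move inside the EXISTS subquery, correlated on t2.PK.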

MySQL select with group and one to many relations condition

For example, take this structure:
CREATE TABLE clicks
(`date` varchar(50), `sum` int, `id` int)
;
CREATE TABLE marks
(`click_id` int, `name` varchar(50), `value` varchar(50))
;
where a click can have many marks.
Example data:
INSERT INTO clicks
(`sum`, `id`, `date`)
VALUES
(100, 1, '2017-01-01'),
(200, 2, '2017-01-01')
;
INSERT INTO marks
(`click_id`, `name`, `value`)
VALUES
(1, 'utm_source', 'test_source1'),
(1, 'utm_medium', 'test_medium1'),
(1, 'utm_term', 'test_term1'),
(2, 'utm_source', 'test_source1'),
(2, 'utm_medium', 'test_medium1')
;
I need the aggregated values of clicks, grouped by date, counting only the clicks that have all of the selected marks.
My query:
select
c.date,
sum(c.sum)
from clicks as c
left join marks as m ON m.click_id = c.id
where
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1') OR
(m.name = 'utm_term' AND m.value='test_term1')
group by date
This returns 2017-01-01 = 700, but I want 100, because only click 1 has all three marks.
Or, if the condition is
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1')
I need to get 300 instead of 600.
I found a solution that first selects the distinct click_id values and then sums and groups by date with a WHERE ... IN condition, but on the real database, which is very large and uses UUIDs as ids, that query runs extremely slowly. Any advice on how to make this work properly?
You can achieve this with the queries below.
When there are three conditions, use HAVING count(*) >= 3:
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
OR (
m.NAME = 'utm_term'
AND m.value = 'test_term1'
)
GROUP BY id
HAVING count(*) >= 3
) AS t ON cc.id = t.id
GROUP BY cc.DATE
When there are two conditions, use HAVING count(*) >= 2:
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
GROUP BY id
HAVING count(*) >= 2
) AS t ON cc.id = t.id
GROUP BY cc.DATE
Demo: http://sqlfiddle.com/#!9/fe571a/35
Hope this works for you...
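For anyone who wants to try the counting approach without the fiddle, here is a minimal sketch on the question's sample data. It runs in SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption), and it counts distinct mark names rather than rows, a small variation that guards against a mark being stored twice.

```python
import sqlite3

# SQLite stands in for MySQL; the data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks (date TEXT, sum INTEGER, id INTEGER);
CREATE TABLE marks (click_id INTEGER, name TEXT, value TEXT);
INSERT INTO clicks (sum, id, date) VALUES
  (100, 1, '2017-01-01'), (200, 2, '2017-01-01');
INSERT INTO marks VALUES
  (1, 'utm_source', 'test_source1'), (1, 'utm_medium', 'test_medium1'),
  (1, 'utm_term', 'test_term1'),
  (2, 'utm_source', 'test_source1'), (2, 'utm_medium', 'test_medium1');
""")

# Keep only clicks whose matching marks cover all three requested names,
# then sum each surviving click exactly once.
rows = conn.execute("""
SELECT c.date, SUM(c.sum)
FROM clicks c
JOIN (
    SELECT click_id
    FROM marks
    WHERE (name = 'utm_source' AND value = 'test_source1')
       OR (name = 'utm_medium' AND value = 'test_medium1')
       OR (name = 'utm_term'   AND value = 'test_term1')
    GROUP BY click_id
    HAVING COUNT(DISTINCT name) = 3
) t ON t.click_id = c.id
GROUP BY c.date
""").fetchall()

print(rows)  # [('2017-01-01', 100)]
```

The number in the HAVING clause has to match the number of conditions in the WHERE clause.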
You're getting 700 because the join generates multiple rows per click. The marks table has three rows for click 1 (sum = 100) and two rows for click 2 (sum = 200), so the join yields three rows with sum = 100 and two rows with sum = 200, which add up to 700. To fix this, aggregate on click_id as well, as illustrated below:
select
c.date,
sum(c.sum)
from clicks as c
inner join (select click_id from marks
where (name = 'utm_source' AND value='test_source1')
OR (name = 'utm_medium' AND value='test_medium1')
OR (name = 'utm_term' AND value='test_term1')
group by click_id) as m
ON m.click_id = c.id
group by c.date;
DEMO SQL FIDDLE
I found the right way myself, and it works on large amounts of data.
The main goal is for the query to produce a single joined table whose conditions do not depend on the amount of data in the result, so the best way is:
select
c.date,
sum(c.sum)
from clicks as c
join marks as m1 ON m1.click_id = c.id
join marks as m2 ON m2.click_id = c.id
join marks as m3 ON m3.click_id = c.id
where
(m1.name = 'utm_source' AND m1.value='test_source1') AND
(m2.name = 'utm_medium' AND m2.value='test_medium1') AND
(m3.name = 'utm_term' AND m3.value='test_term1')
group by date
So we need as many joins as we have conditions.
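Since the number of joins follows the number of conditions, the query can be generated. Below is a minimal sketch of that idea on the question's sample data. SQLite via Python's sqlite3 module stands in for MySQL, the function name `total_for` is made up for the illustration, and the conditions sit in the ON clauses, which is equivalent to the WHERE form for inner joins.

```python
import sqlite3

# SQLite stands in for MySQL; the data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks (date TEXT, sum INTEGER, id INTEGER);
CREATE TABLE marks (click_id INTEGER, name TEXT, value TEXT);
INSERT INTO clicks (sum, id, date) VALUES
  (100, 1, '2017-01-01'), (200, 2, '2017-01-01');
INSERT INTO marks VALUES
  (1, 'utm_source', 'test_source1'), (1, 'utm_medium', 'test_medium1'),
  (1, 'utm_term', 'test_term1'),
  (2, 'utm_source', 'test_source1'), (2, 'utm_medium', 'test_medium1');
""")

def total_for(required):
    """Hypothetical helper: sum clicks per date, keeping only clicks
    that carry every (name, value) pair in `required`."""
    joins, params = [], []
    for i, (name, value) in enumerate(required):
        # One aliased join per condition; values are bound as parameters.
        joins.append(
            f"JOIN marks m{i} ON m{i}.click_id = c.id "
            f"AND m{i}.name = ? AND m{i}.value = ?"
        )
        params += [name, value]
    sql = ("SELECT c.date, SUM(c.sum) FROM clicks c "
           + " ".join(joins) + " GROUP BY c.date")
    return conn.execute(sql, params).fetchall()

source = ("utm_source", "test_source1")
medium = ("utm_medium", "test_medium1")
term = ("utm_term", "test_term1")

three = total_for([source, medium, term])
two = total_for([source, medium])
print(three)  # [('2017-01-01', 100)]
print(two)    # [('2017-01-01', 300)]
```

This relies on each (click_id, name, value) combination appearing at most once in marks; a duplicated mark would count its click twice.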

MySQL JOIN two tables and return multiple rows

I have two tables. 'First' table contains 2 ids of 'second' table. v2 and v3 are second table's IDs.
First:
`id`, `mem`, `v2`, `v3`, `v2_amt`, `v3_amt`
1, 'test', 1, 2, '10', '20'
2, 'test2', 1, 2, '10', ''
Second:
`id`, `name`
1, 'anna'
2, 'teena'
When I'm joining,
SELECT f.mem, s.name
FROM `first` f
JOIN second s
ON f.v2 = s.id
AND f.v2_amt !=""
AND (f.v3 = s.id AND f.v3_amt !='')
WHERE f.id = '1'
GROUP BY s.id
Currently it returns nothing.
Is there any way to combine both tables to get the following output?
`mem`, `name`
test, 'anna'
test, 'teena'
For fetching id 2 of the first table:
SELECT f.mem, s.name
FROM `first` f
JOIN second s
ON f.v2 = s.id
AND f.v2_amt !=""
AND (f.v3 = s.id AND f.v3_amt !='')
WHERE f.id = '2'
GROUP BY s.id
It should return the following, since v3_amt is empty:
`mem`, `name`
test2, 'anna'
You should leave the v3 column empty (NULL) on insert when v3_amt = "" (and likewise v2 when v2_amt is empty), then try this query:
Select f.v2,f.v3,f.v2_amt,f.v3_amt,s.name from first as f join second as s on
(f.v2 = s.id OR f.v3 = s.id) and (f.v2_amt!="" OR f.v3_amt!="") where f.id=2
:)
You should use OR.
SELECT f.mem, s.name FROM `first` f JOIN `second` s
ON f.v2 = s.id AND f.v2_amt !="" OR (f.v3 = s.id AND f.v3_amt !='')
WHERE f.id = '1'
You can use a LEFT JOIN with OR for this case:
select ft.mem, st.name from first_table ft
LEFT JOIN second_table st ON (ft.v2 = st.id AND ft.v2_amt !="") OR (ft.v3 = st.id AND ft.v3_amt !="")
WHERE ft.id = '1'
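Here is a minimal sketch that checks the OR-based join against the sample rows from the question. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption), the table names first_table and second_table from the last answer, and a made-up helper name `names_for`.

```python
import sqlite3

# SQLite stands in for MySQL; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table (id INTEGER, mem TEXT, v2 INTEGER, v3 INTEGER,
                          v2_amt TEXT, v3_amt TEXT);
CREATE TABLE second_table (id INTEGER, name TEXT);
INSERT INTO first_table VALUES (1, 'test', 1, 2, '10', '20'),
                               (2, 'test2', 1, 2, '10', '');
INSERT INTO second_table VALUES (1, 'anna'), (2, 'teena');
""")

def names_for(first_id):
    """Hypothetical helper: names referenced by v2 or v3 of one row,
    skipping a reference whose amount is the empty string."""
    return conn.execute("""
        SELECT f.mem, s.name
        FROM first_table f
        JOIN second_table s
          ON (f.v2 = s.id AND f.v2_amt <> '')
          OR (f.v3 = s.id AND f.v3_amt <> '')
        WHERE f.id = ?
        ORDER BY s.id
    """, (first_id,)).fetchall()

row1 = names_for(1)
row2 = names_for(2)
print(row1)  # [('test', 'anna'), ('test', 'teena')]
print(row2)  # [('test2', 'anna')]
```

The original query returned nothing because it required f.v2 = s.id and f.v3 = s.id on the same row of the second table, which cannot hold when v2 and v3 differ.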

IF statement in mysql JOIN - Adding multiple conditions if condition success

Here is the case:
I want to join two tables, say a and b:
SELECT *
FROM a
JOIN b ON a.id = b.id AND b.status = '1'
Here is the problem:
b.status = '1'
should only be added when
b.stage in (1, 3, 5, 6, 8)
How can I add such a condition in the ON clause?
Something like:
ON a.id = b.id
CASE
IF (b.stage in (1, 3, 5, 6, 8))
THEN
AND b.status = '1'
END
Your condition is logically the same as "either stage is not in the list or status is 1":
SELECT *
FROM a
JOIN b ON a.id = b.id
AND (b.stage not in (1, 3, 5, 6, 8) OR b.status = '1')
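A quick way to convince yourself of the rewrite is to run it on a few rows that cover each case. The rows below are invented for the illustration, and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# Hypothetical test rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER, stage INTEGER, status TEXT);
INSERT INTO a VALUES (1), (2), (3), (4), (5);
INSERT INTO b VALUES
  (1, 1, '1'),     -- listed stage, status 1      -> kept
  (2, 3, '0'),     -- listed stage, status not 1  -> dropped
  (3, 2, '0'),     -- unlisted stage              -> kept regardless of status
  (4, 2, '1'),     -- unlisted stage              -> kept
  (5, NULL, '0');  -- NULL stage: NOT IN yields NULL, so dropped
""")

kept = [r[0] for r in conn.execute("""
SELECT a.id FROM a
JOIN b ON a.id = b.id
      AND (b.stage NOT IN (1, 3, 5, 6, 8) OR b.status = '1')
ORDER BY a.id
""")]

print(kept)  # [1, 3, 4]
```

The last row shows the one thing to watch: if stage can be NULL, NOT IN evaluates to NULL rather than true, so such rows are joined only when status is '1'.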

Optimize multiple row count

There are two tables: question and answer. In answer I hold user_id and question_id. I want to count how many times each choice is selected.
Below is a working query, but it joins the answer table four times. Is there a faster way, i.e. one that joins the answer table only once?
SELECT question.question_id,
question.correct_choice,
COUNT(DISTINCT a.user_id) as num_of_a,
COUNT(DISTINCT b.user_id) as num_of_b,
COUNT(DISTINCT c.user_id) as num_of_c,
COUNT(DISTINCT d.user_id) as num_of_d
FROM answer a,
answer b,
answer c,
answer d,
question
WHERE a.question_id = question.question_id
AND b.question_id = question.question_id
AND c.question_id = question.question_id
AND d.question_id = question.question_id
AND a.choice = 'A'
AND b.choice = 'B'
AND c.choice = 'C'
AND d.choice = 'D'
GROUP BY question.question_id
ORDER BY question.question_id asc;
returns
273, D, 5, 2, 8, 39
274, C, 2, 14, 50, 2
277, C, 3, 5, 41, 17
278, C, 16, 9, 34, 9
279, C, 8, 30, 24, 12
280, B, 17, 21, 20, 3
284, C, 2, 3, 19, 1
286, A, 16, 3, 2, 2
287, D, 1, 2, 1, 18
289, B, 3, 18, 2, 2
290, D, 6, 9, 8, 6
This solution only does a single join... additionally, I converted your implicit joins to explicit, and rounded out your GROUP BY:
SELECT
q.question_id,
q.correct_choice,
COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) as num_of_a,
COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) as num_of_b,
COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) as num_of_c,
COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) as num_of_d
FROM
answer a
JOIN question q ON a.question_id = q.question_id
GROUP BY q.question_id, q.correct_choice
ORDER BY q.question_id asc;
This works because a CASE expression with no matching WHEN (and no ELSE) returns NULL, and NULLs are not included in the COUNT DISTINCT of user ids.
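The NULL behaviour can be checked on a handful of rows. The sketch below uses invented answers for two of the question ids, including one user who answered the same question twice, and runs in SQLite through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# Hypothetical rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (question_id INTEGER, correct_choice TEXT);
CREATE TABLE answer (user_id INTEGER, question_id INTEGER, choice TEXT);
INSERT INTO question VALUES (273, 'D'), (274, 'C');
INSERT INTO answer VALUES
  (1, 273, 'A'), (2, 273, 'A'), (3, 273, 'B'),
  (4, 273, 'D'), (4, 273, 'D'),  -- user 4 stored twice, counted once
  (1, 274, 'C'), (2, 274, 'C'), (3, 274, 'A');
""")

# One pass over answer: each CASE yields user_id for its choice, else NULL,
# and COUNT(DISTINCT ...) ignores the NULLs.
rows = conn.execute("""
SELECT q.question_id, q.correct_choice,
       COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) AS num_of_a,
       COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) AS num_of_b,
       COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) AS num_of_c,
       COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) AS num_of_d
FROM answer a
JOIN question q ON a.question_id = q.question_id
GROUP BY q.question_id, q.correct_choice
ORDER BY q.question_id
""").fetchall()

print(rows)  # [(273, 'D', 2, 1, 0, 1), (274, 'C', 1, 0, 2, 0)]
```

Unlike the four-way join in the question, this form also returns questions where some choice was never picked, with a count of 0.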
You might consider using a SELECT ... UNION SELECT style if you are concerned about performance.
That said, I agree with @benjam that you should EXPLAIN the query to see what the optimizer says, since you do not have any dependent subqueries.
Make sure that you have indexes on question.question_id, and on answer.question_id, answer.choice, and answer.user_id and your query should be just as fast as any other that does not join answer for each choice. Then use the following query:
SELECT `question`.`question_id`,
`question`.`correct_choice`,
COUNT(DISTINCT `a`.`user_id`) as `num_of_a`,
COUNT(DISTINCT `b`.`user_id`) as `num_of_b`,
COUNT(DISTINCT `c`.`user_id`) as `num_of_c`,
COUNT(DISTINCT `d`.`user_id`) as `num_of_d`
FROM `question`
LEFT JOIN `answer` AS `a`
ON (`a`.`question_id` = `question`.`question_id`
AND `a`.`choice` = 'A')
LEFT JOIN `answer` AS `b`
ON (`b`.`question_id` = `question`.`question_id`
AND `b`.`choice` = 'B')
LEFT JOIN `answer` AS `c`
ON (`c`.`question_id` = `question`.`question_id`
AND `c`.`choice` = 'C')
LEFT JOIN `answer` AS `d`
ON (`d`.`question_id` = `question`.`question_id`
AND `d`.`choice` = 'D')
GROUP BY `question`.`question_id` ;
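In MySQL 5.7 and earlier, GROUP BY also sorts the result, so the ORDER BY clause is not needed there. MySQL 8.0 no longer sorts implicitly, so keep the ORDER BY if the order matters.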