Select first row for each combination - mysql

I'm trying select the rows in which user_from exists as user_to in other row/s which have been created later.
This query gives me the rows.
SELECT A
FROM db.table A
LEFT JOIN db.table B
ON A.user_to = B.user_from
AND A.user_from = B.user_to
AND A.createdAt < B.createdAt
WHERE B.user_to IS NOT NULL AND B.user_from IS NOT NULL;
However, I want to get just the first row for each combination of user_to/user_from.
E.g. If there are some rows like:
user_to = 1, user_from = 2
user_to = 2, user_from = 1
user_to = 2, user_from = 1
user_to = 1, user_from = 2
I want to get just the first created one (defined like createdAt).
I've tried using GROUP BY user_from, but this exclude all other combinations with each user_from.

I think Use DISTINCT in the SELECT query .
SELECT DISTINCT A
FROM db.table A
LEFT JOIN db.table B
ON A.user_to = B.user_from
AND A.user_from = B.user_to
AND A.createdAt < B.createdAt
WHERE B.user_to IS NOT NULL AND B.user_from IS NOT NULL;

It seems I had earlier misunderstood the requirement.
Assuming no two 'createdAt' values are exactly the same, you could use this:
select a.* from
A a
where NOT EXISTS (SELECT *
FROM A b
WHERE (
(
(a.to_ = b.from_ AND a.from_ = b.to_)
OR
(a.to_ = b.to_ AND a.from_ = b.from_ AND a.c <> b.c)
)
AND
b.c < a.c
)
);

Related

MySQL select with group and one to many relations condition

For example have such structure:
CREATE TABLE clicks
(`date` varchar(50), `sum` int, `id` int)
;
CREATE TABLE marks
(`click_id` int, `name` varchar(50), `value` varchar(50))
;
where click can have many marks
So example data:
INSERT INTO clicks
(`sum`, `id`, `date`)
VALUES
(100, 1, '2017-01-01'),
(200, 2, '2017-01-01')
;
INSERT INTO marks
(`click_id`, `name`, `value`)
VALUES
(1, 'utm_source', 'test_source1'),
(1, 'utm_medium', 'test_medium1'),
(1, 'utm_term', 'test_term1'),
(2, 'utm_source', 'test_source1'),
(2, 'utm_medium', 'test_medium1')
;
I need to get agregated values of click grouped by date which contains all of selected values.
I make request:
select
c.date,
sum(c.sum)
from clicks as c
left join marks as m ON m.click_id = c.id
where
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1') OR
(m.name = 'utm_term' AND m.value='test_term1')
group by date
and get 2017-01-01 = 700, but I want to get 100 which means that only click 1 has all of marks.
Or if condition will be
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1')
I need to get 300 instead of 600
I found answer in getting distinct click_id by first query and then sum and group by date with condition whereIn, but on real database which is very large and has id as uuid this request executes extrimely slow. Any advices how to get it work propely?
You can achieve it using below queries:
When there are the three conditions then you have to pass the HAVING count(*) >= 3
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
OR (
m.NAME = 'utm_term'
AND m.value = 'test_term1'
)
GROUP BY id
HAVING count(*) >= 3
) AS t ON cc.id = t.id
GROUP BY cc.DATE
When there are the three conditions then you have to pass the HAVING count(*) >= 2
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
GROUP BY id
HAVING count(*) >= 2
) AS t ON cc.id = t.id
GROUP BY cc.DATE
Demo: http://sqlfiddle.com/#!9/fe571a/35
Hope this works for you...
You're getting 700 because the join generates multiple rows for the different IDs. There are 3 rows in the mark table with ID=1 and sum=100 and there are two rows with ID=2 and sum=200. On doing the join where shall have 3 rows with sum=100 and 2 rows with sum=200, so adding these sum gives 700. To fix this you have to aggregate on the click_id too as illustrated below:
select
c.date,
sum(c.sum)
from clicks as c
inner join (select * from marks where (name = 'utm_source' AND
value='test_source1') OR (name = 'utm_medium' AND value='test_medium1')
OR (name = 'utm_term' AND value='test_term1')
group by click_id) as m
ON m.click_id = c.id
group by c.date;
DEMO SQL FIDDLE
I found the right way myself, which works on large amounts of data
The main goal is to make request generate one table with subqueries(conditions) which do not depend on amount of data in results, so the best way is:
select
c.date,
sum(c.sum)
from clicks as c
join marks as m1 ON m1.click_id = c.id
join marks as m2 ON m2.click_id = c.id
join marks as m3 ON m3.click_id = c.id
where
(m1.name = 'utm_source' AND m1.value='test_source1') AND
(m2.name = 'utm_medium' AND m2.value='test_medium1') AND
(m3.name = 'utm_term' AND m3.value='test_term1')
group by date
So we need to make as many joins as many conditions we have

How to use both SELECT and UPDATE statements in one query?

I have this update query which works as well:
UPDATE tbname t CROSS JOIN ( SELECT related FROM tbname WHERE id = 5 ) x
SET AcceptedAnswer = ( id = 5 )
WHERE t.related = x.related
I also have two select statements which validates somethings. Actually I want to check these to conditions before updating:
Condition1:
(SELECT 1 FROM tbname
WHERE id = x.related AND
author_id = 29
)
Condition2:
(SELECT 1 FROM tbname
WHERE id = x.related AND
(
( amount IS NOT NULL AND
NOT EXISTS ( SELECT 1 FROM tbname
WHERE related = x.related AND
AcceptedAnswer = 1 )
) OR amount IS NULL
)
)
How can I combine those two conditions with that updating query?
Here is what I've tried so far but it doesn't work and throws this error:
UPDATE tbname CROSS JOIN ( SELECT related FROM tbname WHERE id = 5 ) x
SET AcceptedAnswer = ( id = 5 )
WHERE q.related = x.related
AND
(SELECT 1 FROM tbname
WHERE id = x.related AND
author_id = 29
) AND
(SELECT 1 FROM tbname
WHERE id = x.related AND
(
( amount IS NOT NULL AND
NOT EXISTS ( SELECT 1 FROM tbname
WHERE related = x.related AND
AcceptedAnswer = 1 )
) OR amount IS NULL
)
)
#1093 - You can't specify target table 'tbname' for update in FROM clause
Seems your update is equivalent to this
update tbname as a
inner join tbname as b on a.related = b.related and b.id = 5
set AcceptedAnswer = (id = 5)
your query seem set to true (1) the AccepetdAnswer of the row with id = 5 for the row that have acceppeted equalt to the accepted value of th row with id = 5 (false / 0) in the other case ..
for test use
select * from tbname as a
inner join tbname as b on a.related = b.related and b.id = 5
and (b.related = a.id and a.author_id = 29)
and (b.related = a.id and
(a.amont is not null and (a.related = b.related and a.AcceptedAnswer = 1)))
I'm not pretty sure what is the purpose of the SET clause (id =5)
anyway this way avoids the use of the cross join provided that you
don't use the table "x" to get something beyond the "related" items.
UPDATE tbname
SET
AcceptedAnswer = ( id = 5 )
WHERE
#THIS IS EQUIVALENT TO THE JOIN CLAUSE
id IN ( SELECT related FROM tbname WHERE id = 5 )
#THIS IS THE CONDITION 1 POINTING tnname
AND author_id = 29
#THIS IS THE CONDITION 2 POINTING tbname
AND (
( amount IS NOT NULL
AND NOT AcceptedAnswer = 1
) OR amount IS NOT NULL
)
;

How to one column full result and max of that column value?

SELECT Max(c.vendor_id),c.vendors_id FROM (SELECT distinct a.vendor_id FROM service_master a,products b,vendors v,`vendor_addresses` ad WHERE a.cat_id= 242 AND a.service_id = b.s_sid AND a.is_active =1 AND b.isproductactive = 1 AND v.vendorid = a.vendor_id AND ad.vendorchild_id = a.vendor_id AND v.isvendoractive = 1 LIMIT 10) c ORDER BY c.vendor_id
Questions:
1)I want full result in vendor_id column
2)Max(vendor_id)result
How to get result in single query?
Not tested. But please try this.
select t1.id,t2.id
from detail t1
left join(
select max(id) as id
from detail
) t2 on 1 = 1
select id, ( select max(id) from detail internal_detail
where detail.id = internal_detail.id ) as max from detail

Which table does the row come from?

Here is my query:
SELECT *
FROM messages
WHERE status = 1
AND (
poster IN (SELECT thing FROM follows WHERE follower = :uid AND type = 3)
OR
topic_id IN (SELECT thing FROM follows WHERE follower = :uid AND type = 1)
)
ORDER BY post_date DESC LIMIT 0, 20
I want to know which clause the rows come from. From the poster IN (...) part or the topic_id IN (...) part? How can I do that?
A straightforward way:
SELECT *
, CASE WHEN poster IN (SELECT thing FROM follows WHERE follower = :uid AND type = 3) THEN 'poster'
ELSE 'topic_id' END AS from_clause
FROM messages <..>
Another way :
SELECT m.*
, CASE WHEN t1.thing IS NULL THEN 'topic_id' ELSE `poster` END AS from_clause
FROM messages m
LEFT JOIN (SELECT thing FROM follows WHERE follower = :uid AND type = 3) t1 ON m.poster = t1.thing
LEFT JOIN (SELECT thing FROM follows WHERE follower = :uid AND type = 1) t2 ON m.topic_id = t2.thing
WHERE m.status = 1 AND (t1.thing IS NOT NULL OR t2.thing IS NOT NULL)

ROW Prior to the MAX row, not the MIN

I have the below query to find the row prior to MAX row. i feel like i am missing something, can somebody please help with it. I ammlooking forward to get the b.usercode_1 as row prior to a.usercode_1 not the min or any other random row but the ROW prior to the MAX.
Please suggest.
Select distinct
c.ssn
, c.controlled_group_Status CG_status
, c.last_name || ' , '|| c.first_name FULL_NAME
, a.usercode_1 Current_REG
, a.eff_date effective_since1
, b.usercode_1 PRIOR_REG
, b.eff_date effective_since2
, d.term_eff_date
from employee_eff_date c
, emp_cg_data a
, emp_cg_data b
, emp_ben_elects d
where c.control_id = 'XYZ'
and c.controlled_group_Status <> 'D'
and c.eff_date = (select max( c1.eff_date)
from emp_cg_data c1
where c.control_id = c1.control_id
and c.ssn = c1.ssn)
and a.control_id = c.control_id
and a.ssn = c.ssn
and a.eff_date = (select max(a1.eff_date )
from emp_cg_data a1
where a.control_id = a1.control_id
and a.ssn = a1.ssn)
and a.usercode_1 = 'REG26'
and b.control_id = c.control_id
and b.ssn = c.ssn
and b.eff_date = (select max( b1.eff_date)
from emp_cg_data b1
where b.control_id = b1.control_id
and b.ssn = b1.ssn
and b1.eff_date < a.eff_date)
and b.usercode_1 like 'REG%'
and d.control_id = c.control_id
and d.ssn = c.ssn
and d.life_event_date = (select max( d1.life_event_date)
from emp_ben_elects d1
where d.control_id = d1.control_id
and d.ssn = d1.ssn)
and d.le_seq_no= (select max( d1.le_seq_no)
from emp_ben_elects d1
where d.control_id = d1.control_id
and d.ssn = d1.ssn
and d.life_event_date = d1.life_event_date)
and d.term_eff_date is null
;
NOTE: this is not a complete answer... its a helpful suggestion of what you should start with.
you are doing a Cartesian Product of the four tables, filtered by a WHERE... so something like this
Implicit Join -- generally not a good practice as it can be very difficult to keep the where filters apart from the join conditions.
SELECT *
FROM tableA a, TableB b
WHERE b.id = a.id
another way to write a JOIN (the more generally accepted way)
SELECT *
FROM tableA a
JOIN tableB b ON b.id = a.id
Use the ON clause to join two tables together.
You should change your joins to this format so that others can read your query and understand it better.
suggestion to solve your problem
a fairly simple way to get the second to last row is to use a row counter.
so something like
SELECT *, #row_count := #row_count + 1
FROM tableA a
JOIN tableB b on b.id = a.id AND -- any other conditions for the join.
CROSS JOIN (SELECT #row_count := 0) t
then from here you can get the MAX row, whether thats the ID or something else. and then get the #row_num -1. aka the previous row.