If I have a table:
id1 id2 count
A A 1
A B 2
A C 1
B A 3
B B 1
B C 2
C A 3
C B 2
C C 1
What I want after deleting:
id1 id2 count
A A 1
A B 2
A C 1
B B 1
B C 2
C C 1
which means if I have A(id1) --> B(id2) then delete B(id1) --> A(id2). same as B(id1) --> C(id2) then delete the row C(id1) --> B(id2)
Thank you for ur help!
In this case we analyze Target.id1 > Target.id2 mean case like (B, A, ??) where B > A
this also ignore cases like (A, A, ??)
Then use self left join to try find another row with (A, B, ??)
If we found a match then Source.id1 IS NOT NULL and we delete
SQL Fiddle Demo
DELETE Target
FROM Table1 Target
LEFT JOIN Table1 Source
ON Target.`id1` = Source.`id2`
AND Target.`id2` = Source.`id1`
AND Target.`id1` > Target.`id2`
WHERE Source.`id1` IS NOT NULL;
OUTPUT
| id1 | id2 | count |
|-----|-----|-------|
| A | A | 1 |
| A | B | 2 |
| A | C | 1 |
| B | B | 1 |
| B | C | 2 |
| C | C | 1 |
Should be something like:
DELETE FROM 'myTable'
WHERE STRCMP(id1, id2) > 0;
STRCMP function can compare the strings and return an int. From there it should be easy - something very similar to the above. If you have further trouble let me know.
It looks like what you are saying is...
If there is a (id1,id2) tuple in the table with values e.g. (a,b), and there is another tuple (b,a) that consists of the the same values, but swapped in the columns, you want to remove one of those tuples. It looks like the one you want to remove is the one that has the "greater" value in the first column.
First, identify the "duplicate" tuples.
For now, we'll ignore the tuples where the values of id1 and id2 are the same, e.g. (a,a).
SELECT s.id1
, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
ORDER BY s.id1, s.id2
If that returns the set of rows you want to remove, we can convert that into a DELETE. To do that, we need to change that query into an inline view,
We can re-write that to be like this, verify we get equivalent results.
SELECT o.id1, o.id2
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2
ORDER BY o.id1, o.id2
Then we can convert that to a DELETE statement, replacing SELECT o.id1, o.id2 WITH DELETE o.* and removing the ORDER BY...
DELETE o.*
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2
Related
i have this query
select a.*,
GROUP_CONCAT(b.`filename`) as `filesnames`
from `school_classes` a
join classes_data.`classes_albums` b on a.`school_key` = b.`school_key`
and a.`class_key` = b.`class_key`
group by a.`ID`
the result of this query is
i want to add to it
ORDER BY b.added_date DESC LIMIT 2
so the output of filenames column only shows latest 2 files , ?
It is not clear from your question what your tables look like and how they relate but this might be what you are after.
drop table if exists school_classes;
create table school_classes (id int,school_key int, class_key int);
drop table if exists classes_albums;
create table classes_albums(id int,school_key int, class_key int, filename varchar(3),dateadded date);
insert into school_classes values
(1,1,1), (2,1,2),(3,1,3)
insert into classes_albums values
(1,1,1,'a','2017-01-01'),(1,1,1,'b','2017-02-01'),(1,1,1,'c','2017-03-01');
select a.*, b.filenames
from school_classes a
join
(
select c.school_key,c.class_key,group_concat(c.filename order by c.rn desc) filenames
from
(
select c.*,
if(concat(c.school_key,c.class_key) <> #p, #rn:=1, #rn:=#rn+1) rn,
#p:=concat(c.school_key,c.class_key) p
from classes_albums c, (select #rn:=0, #ck:=0,#sk:=0) rn
order by c.school_key,c.class_key, c.dateadded desc
) c
where c.rn < 3
group by c.school_key,c.class_key
) b on b.school_key = a.school_key and b.class_key = a.class_key
+------+------------+-----------+-----------+
| id | school_key | class_key | filenames |
+------+------------+-----------+-----------+
| 1 | 1 | 1 | b,c |
+------+------------+-----------+-----------+
1 row in set (0.02 sec)
I have two mysql tables: A, and B.
A has columns id, alpha, blabla, moreblabla, with primary key id.
B has columns id, alpha, beta, somemoreblabla, with primary key (id, alpha)
I need to select all those A.id's, for which its A.alpha is not present in any of the B.alpha's respective to every B.id = A.id
How do I do it?
Your question is not entirely clear. Do you want all A.alpha's that are not in any B.alpha? In that case a simple query like this is enough:
Select A.id from A where A.alpha NOT IN (Select B.alpha from B);
If you want to select all ID's from A that have a counterpart (an equal ID) in B but where the alpha between A and B are different it is a bit more work:
SELECT A.id FROM A
INNER JOIN B on A.id = B.id
WHERE A.alpha != B.alpha
Consider the following structure:
CREATE TABLE `A` (
`id` int(11) NOT NULL,
`alpha` varchar(255) NOT NULL
);
CREATE TABLE `B` (
`id` int(11) NOT NULL,
`alpha` varchar(255) NOT NULL
);
With inserts:
insert into A set id = 1, alpha = 'a';
insert into A set id = 2, alpha = 'b';
insert into B set id = 1, alpha = 'a';
insert into B set id = 2, alpha = 'a';
If you run the query with the join your result will be:
+----+
| id |
+----+
| 2 |
+----+
This is since ID 2 in A has a different alpha than ID 2 in B.
EDIT:
It just occurred to me that you might even mean that every A.id can occur in B multiple times. If that is what can happen you need a different approach again. Lets assume the same insert as before with an addition:
insert into A set id = 1, alpha = 'a';
insert into A set id = 2, alpha = 'b';
insert into B set id = 1, alpha = 'a';
insert into B set id = 2, alpha = 'a';
insert into B set id = 2, alpha = 'b'; <- important since there is now a 2nd 2 in B that should ensure that the record with ID 2 from A should not be returned.
insert into A set id = 3, alpha = 'c';
insert into B set id = 3, alpha = 'x'; <-- only ID 3 should now be returned due to the situation above
Our tables now look like so:
A
+----+-------+
| id | alpha |
+----+-------+
| 1 | a |
| 2 | b |
| 3 | c |
+----+-------+
B
+----+-------+
| id | alpha |
+----+-------+
| 1 | a |
| 2 | a |
| 2 | b |
| 3 | x |
+----+-------+
If this is your case the following query will do the trick:
select A.id
FROM A where A.alpha NOT IN (
select B.alpha FROM B where B.id = A.id
);
SELECT A.id
FROM
A
LEFT OUTER JOIN B ON A.id = B.id AND B.alpha = A.alpha
where B.alpha IS NULL
Here is SQL Fiddle
This will be the fastest query,better than The marked Answer in terms of query optimization.
EXPLAIN SELECT A.id
FROM
A
LEFT OUTER JOIN B ON A.id = B.id AND B.alpha = A.alpha
where B.alpha IS NULL
Here is the output :
EXPLAIN select A.id
FROM A where A.alpha NOT IN (
select B.alpha FROM B where B.id = A.id
)
Here is the EXPLAIN of Marked answer.
You can see the DIFFERENCE.
SIMPLE VS PRIMARY
SIMPLE VS DEPENDENT SUBQUERY
Hope this helps.
Use MySQL NOT IN
Try this query :-
select id from a where a.id NOT IN (select id from b)
Select A.id from table_A where A.alpha NOT IN (Select B.alpha from table_B);
The Sub-query will return a table with all the records of B.alpha from table_B.
And say it returns 1 to 5 records as result.
And say your table table_A has 1 to 10 records in A.alpha
Now your parent query will check for records in table table_A for field A.alpha which do not belongs to B.alpha (using NOT IN)
Hence the expected result is 6 to 10 as result.
I have two tables :
Table:#a
id | name
10 | a
20 | b
30 | c
40 | d
50 | e
Table:#b
id | name
10 | a
30 | a
50 | a
I want all the #a table Id which are not present in #b
The following query works :
select * from #a as a
where not exists(select * from #b as b where a.id = b.id)
But I am not able to understand why the below query does not work
select * from #a as a
where exists(select * from #b as b where a.id <> b.id)
Why does where not exists expression does not give correct output
The first query does yield the right result : select all records from A where doesnt have appropriate match in B
But the second one is logically different.
Looking at :
;with A(id,name) as
(
select 10,'a' UNION ALL
select 20,'b' UNION ALL
select 30,'c' UNION ALL
select 40,'d' UNION ALL
select 50,'e'
) ,B(id,name) as
(
select 10,'a' UNION ALL
select 30,'a' UNION ALL
select 50,'a'
)
select * from a as a
where exists(select * from b as b where a.id <> b.id)
For each record from a, show that record if exists records from b with non-matched id.
So for 10,a (in a) there ARE(!) records from B where id is not 10 , hence it does YIELDS 10,a (from a)!
Now - do you see the problem ?
I have a table tbl_entries with the following structure:
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
| 11 | a | b | c |
| 12 | d | e | a |
| 13 | a | b | c |
| 14 | X | e | 2 |
| 15 | a | b | c |
+----+------+------+------+
And another table tbl_reviewlist with the following structure:
+----+-------+------+------+------+
| id | entid | cola | colb | colc |
+----+-------+------+------+------+
| 1 | 12 | N | Y | Y |
| 2 | 13 | Y | N | Y |
| 3 | 14 | Y | N | N |
+----+-------+------+------+------+
Basically, tbl_reviewlist contains reviews about the entries in tbl_entries. However, for some known reason, the entries in tbl_entries are duplicated. I am extracting the unique records by the following query:
SELECT * FROM `tbl_entries` GROUP BY `col1`, `col2`, `col3`;
However, any one of the duplicate rows from tbl_entries will be returned no matter they have been reviewed or not. I want the query to prefer those rows which have been reviewed. How can I do that?
EDIT: I want to prefer rows which have been reviewed but if there are rows which have not been reviewed yet it should return those as well.
Thanks in advance!
Have you actually tried anything?
A hint: The SQL standard requires that every column in the result set of a query with a group by clause must be either
a grouping column
an aggregate function — sum(), count(), etc.,
a constant value/literal, or
an expression derived solely from the above.
Some broken implementations (and I believe MySQL is one of them) allow other columns to be included and offer their own...creative...behavior. If you think about it, group by essentially says to do the following:
Order this table by the grouping expressions
Partition it into subsets based on the group by sequence
Collapse each such partition into a single row computing the aggregate expressions as you go.
Once you've done that, what does it mean to ask for something that isn't uniform across the collapsed group partition?
If you have a table foo containing columns A, B, C, D and E and say something like
select A,B,C,D,E from foo group by A,B,C
per the standard, you should get a compile error. Deviant implementations [usually] treat this sort of query as the [rough] equivalent of
select *
from foo t
join ( select A,B,C
from foo
group by A,B,C
) x on x.A = t.A
and x.B = t.B
and x.C = t.C
But I wouldn't necessarily count on that without review the documentation for the specific implementation that your are using.
If you want to find just reviewed entries, then something like this:
select *
from tbl_entries t
where exists ( select *
from tbl_reviewlist x
where x.entid = t.id
)
will do you. If, however, you want to find reviewed entries that are duplicated on col1, col2 and col3 then something like this should do you:
select *
from tbl_entries t
join ( select col1,col2,col3
from tbl_entries x
group by col1,col2,col3
having count(*) > 1
) d on d.col1 = t.col1
and d.col2 = t.col2
and d.col3 = t.col3
where exists ( select *
from tbl_reviewlist x
where x.entid = t.id
)
Since your problem statement is rather unclear, another take might be something along these lines:
select t.col1 ,
t.col2 ,
t.col3 ,
t.duplicate_count ,
coalesce(x.review_count,0) as review_count
from ( select col1 ,
col2 ,
col3 ,
count(*) as duplicate_count
from tbl_entries
group by col1 ,
col2 ,
col3
) t
left join ( select cola, colb, colc , count(*) as review_count
from tbl_reviewList
group by cola, colb, colc
having count(*) > 1
) x on x.cola = t.col1
and x.colb = t.col2
and x.colc = t.col3
order by sign(coalesce(x.review_count,0)) desc ,
t.col1 ,
t.col2 ,
t.col3
This query
summarizes the entries table, developing a count of how many time seach col1/2/3 combination exists.
summarizes the review table, developing a count of reviews for each cola/b/c combination
joins them together matching cols a:1, b:2 c:3
orders them
preferring reviewed items to non-reviewed items by placing them first,
then by the col1/2/3 values.
I think there's a way with less repetition, but this should be a start:
select
tbl_entries.ID,
col1,
col2,
col3,
cola, -- ... you get the idea ...
from (
select coalesce(min(entid), min(tbl_entries.ID)) as favID
from tbl_entries left join tbl_reviewlist on entid = tbl_entries.ID
group by col1, col2, col3
) as A join tbl_entries on tbl_entries.ID = favID
left join tbl_reviewlist on entid = tbl_entries.ID
Basically you distill the desired output to a list of core ID's and then re-map back to the data...
SELECT e.col1, e.col2, e.col3,
COALESCE(MIN(r.entid), MIN(e.id)) AS id
FROM tbl_entries AS e
LEFT JOIN tbl_reviewlist AS r
ON r.entid = e.id
GROUP BY e.col1, e.col2, e.col3 ;
Tested at SQL-Fiddle
I have a table call production
factory_id | factory_name | product_id
1 | A | 1
1 | A | 2
1 | A | 3
2 | B | 3
3 | C | 1
3 | C | 2
3 | C | 3
3 | C | 4
3 | C | 5
I'm trying to develop a query that will return two factory name pair such that every product of factory1 is produced by factory2, result looked like:
factory_name_1 | factory_name_2
A | C
B | A
B | C
I have some nested self join and renames, but I can't wrap my head around how I can apply EXISTS or IN for this scenario that does "for each product produced by factory X do condition". Thanks to any help in advanced.
Update:
Sorry that I forgot to paste my query:
select t0.fname0, t1.fname1
from (
select factory_id as fid0, factory_name as fname0, product_id as pid0, count(distinct factory_id, product_id) as pnum0
from production
group by factory_id
) t0
join
(
select factory_id as fid1, factory_name as fname1, product_id as pid1, count(distinct factory_id, product_id) as pnum1
from production
group by factory_id
) t1
where t0.fid0 <> t1.fid1
and t0.pnum0 < t1.pnum1
and t0.pid0 = t1.pid1;
Update 2: production is the only table. Expected output factory1 and factory2 are just the rename of factory_name attribute.
You need to JOIN the table for each factory pairing to make sure they "join" on the same product_ids, otherwise you might end up with similar counts for DISTINCT product_ids but these will not necessarily refer to the same product_ids.
This is my take on it:
SELECT bfna,afna, pcnt FROM (
SELECT a.factory_name afna, b.factory_name bfna, COUNT(DISTINCT b.product_id) commoncnt
FROM tbl a LEFT JOIN tbl b ON b.factory_name!=a.factory_name AND b.product_id=a.product_id
GROUP BY a.factory_name, b.factory_name
) c
INNER JOIN (
SELECT factory_name fna, COUNT(DISTINCT product_id) pcnt
FROM TBL GROUP BY factory_name
) d ON fna=bfna AND commoncnt=pcnt
ORDER BY bfna,afna
You can find a demo here: https://rextester.com/JJGCK84904
It produces:
bfna afna commoncnt
A C 3
B A 1
B C 1
For simplicity I left out the column factory_id as it does not add any information here.
Fun fact: as I am using only "bare-bone" SQL expressions, the above code will run on SQL-Server too without any changes.
You can do it this way:
select A as factory_name_1 , B as factory_name_2
from
(
select A, B, count(*) as Count_
from
(
select a.factory_name as A, b.factory_name as B
from yourtable a
inner join yourtable b
on a.product_id = b.product_id and a.factory_id <> b.factory_id
)a group by A, B
)a
inner join
(select factory_name, count(*) as Count_ from yourtable group by factory_name) b
on a.A = b.factory_name and a.Count_ = b.Count_
Order by 1
Output:
factory_name_1 factory_name_2
A C
B A
B C
The other solutions just seem more complicated than necessary. This is basically a self-join with aggregation:
with t as (
select t.*, count(*) over (partition by factory_id) as cnt
from tbl t
)
select t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, count(*)
from t t1 join
t t2
on t1.product_id = t2.product_id and t1.factory_id <> t2.factory_id
group by t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, t1.cnt
having count(*) = max(t1.cnt);
Here is a db<>fiddle.