Select every field that fulfills the condition - mysql

I have a table called production
factory_id | factory_name | product_id
1 | A | 1
1 | A | 2
1 | A | 3
2 | B | 3
3 | C | 1
3 | C | 2
3 | C | 3
3 | C | 4
3 | C | 5
I'm trying to develop a query that returns pairs of factory names such that every product of factory1 is also produced by factory2. The result should look like:
factory_name_1 | factory_name_2
A | C
B | A
B | C
I have some nested self joins and renames, but I can't wrap my head around how to apply EXISTS or IN for this scenario, which amounts to "for each product produced by factory X, check a condition". Thanks in advance for any help.
Update:
Sorry that I forgot to paste my query:
select t0.fname0, t1.fname1
from (
select factory_id as fid0, factory_name as fname0, product_id as pid0, count(distinct factory_id, product_id) as pnum0
from production
group by factory_id
) t0
join
(
select factory_id as fid1, factory_name as fname1, product_id as pid1, count(distinct factory_id, product_id) as pnum1
from production
group by factory_id
) t1
where t0.fid0 <> t1.fid1
and t0.pnum0 < t1.pnum1
and t0.pid0 = t1.pid1;
Update 2: production is the only table. In the expected output, factory_name_1 and factory_name_2 are just renames of the factory_name attribute.

You need to JOIN the table for each factory pairing to make sure they "join" on the same product_ids, otherwise you might end up with similar counts for DISTINCT product_ids but these will not necessarily refer to the same product_ids.
This is my take on it:
SELECT bfna,afna, pcnt FROM (
SELECT a.factory_name afna, b.factory_name bfna, COUNT(DISTINCT b.product_id) commoncnt
FROM tbl a LEFT JOIN tbl b ON b.factory_name!=a.factory_name AND b.product_id=a.product_id
GROUP BY a.factory_name, b.factory_name
) c
INNER JOIN (
SELECT factory_name fna, COUNT(DISTINCT product_id) pcnt
FROM tbl GROUP BY factory_name
) d ON fna=bfna AND commoncnt=pcnt
ORDER BY bfna,afna
You can find a demo here: https://rextester.com/JJGCK84904
It produces:
bfna afna pcnt
A C 3
B A 1
B C 1
For simplicity I left out the column factory_id as it does not add any information here.
Fun fact: as I am using only "bare-bone" SQL expressions, the above code will run on SQL-Server too without any changes.
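The idea of joining on product_id and comparing the shared-product count with each factory's total can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory table instead of MySQL, and the names f1, f2, commoncnt and pcnt are illustrative aliases, not part of the original schema.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production (factory_id INT, factory_name TEXT, product_id INT)"
)
conn.executemany(
    "INSERT INTO production VALUES (?, ?, ?)",
    [(1, "A", 1), (1, "A", 2), (1, "A", 3), (2, "B", 3),
     (3, "C", 1), (3, "C", 2), (3, "C", 3), (3, "C", 4), (3, "C", 5)],
)

query = """
SELECT c.f1, c.f2
FROM (
    -- products that f1 and f2 have in common
    SELECT a.factory_name AS f1, b.factory_name AS f2,
           COUNT(DISTINCT a.product_id) AS commoncnt
    FROM production a
    JOIN production b
      ON b.product_id = a.product_id AND b.factory_id <> a.factory_id
    GROUP BY a.factory_name, b.factory_name
) c
JOIN (
    -- total number of products per factory
    SELECT factory_name, COUNT(DISTINCT product_id) AS pcnt
    FROM production
    GROUP BY factory_name
) d ON d.factory_name = c.f1 AND d.pcnt = c.commoncnt
ORDER BY c.f1, c.f2
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [('A', 'C'), ('B', 'A'), ('B', 'C')]
```

A pair (f1, f2) survives only when the number of shared products equals the total number of products of f1, which is exactly "every product of f1 is produced by f2".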

You can do it this way:
select A as factory_name_1 , B as factory_name_2
from
(
select A, B, count(*) as Count_
from
(
select a.factory_name as A, b.factory_name as B
from yourtable a
inner join yourtable b
on a.product_id = b.product_id and a.factory_id <> b.factory_id
)a group by A, B
)a
inner join
(select factory_name, count(*) as Count_ from yourtable group by factory_name) b
on a.A = b.factory_name and a.Count_ = b.Count_
Order by 1
Output:
factory_name_1 factory_name_2
A C
B A
B C

The other solutions just seem more complicated than necessary. This is basically a self-join with aggregation:
with t as (
select t.*, count(*) over (partition by factory_id) as cnt
from tbl t
)
select t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, count(*)
from t t1 join
t t2
on t1.product_id = t2.product_id and t1.factory_id <> t2.factory_id
group by t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, t1.cnt
having count(*) = max(t1.cnt);
Here is a db<>fiddle.
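Since the question asked specifically about EXISTS, the same result can also be expressed as a double NOT EXISTS ("there is no product of factory 1 that factory 2 does not produce"). This is a different technique from the counting answers above, shown only as a cross-check. The sketch assumes an in-memory SQLite table via Python's sqlite3 rather than MySQL; the aliases f1, f2, p1 and p2 are made up for the example.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production (factory_id INT, factory_name TEXT, product_id INT)"
)
conn.executemany(
    "INSERT INTO production VALUES (?, ?, ?)",
    [(1, "A", 1), (1, "A", 2), (1, "A", 3), (2, "B", 3),
     (3, "C", 1), (3, "C", 2), (3, "C", 3), (3, "C", 4), (3, "C", 5)],
)

query = """
SELECT DISTINCT f1.factory_name, f2.factory_name
FROM production f1
JOIN production f2 ON f2.factory_id <> f1.factory_id
WHERE NOT EXISTS (
    -- a product of f1 ...
    SELECT 1 FROM production p1
    WHERE p1.factory_id = f1.factory_id
      AND NOT EXISTS (
          -- ... that f2 does not produce
          SELECT 1 FROM production p2
          WHERE p2.factory_id = f2.factory_id
            AND p2.product_id = p1.product_id
      )
)
ORDER BY 1, 2
"""
pairs_exists = conn.execute(query).fetchall()
print(pairs_exists)  # [('A', 'C'), ('B', 'A'), ('B', 'C')]
```

This form does not depend on counts, so duplicate (factory_id, product_id) rows would not affect it.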

Related

Find missing dates of each product where multiple products are stored in the same table

I have a table structure similar to this which tracks some product data in a daily base:
product_id|columnA|columnB|my_date|
1 | a1 | a2 |2021-03-03|
1 | a1 | a2 |2021-03-04|
1 | a1 | a2 |2021-03-06|
1 | a1 | a2 |2021-03-07|
1 | a1 | a2 |2021-03-10|
2 | a1 | a2 |2021-06-01|
2 | a1 | a2 |2021-06-03|
...
(more product_id)
As you can see, |2021-03-05|, |2021-03-08| and |2021-03-09| are missing for product_id 1 and |2021-06-02| is missing for product_id 2.
I want to get all the missing dates for each product_id, the result table should look like:
product_id|missing_date|
1 |2021-03-05|
1 |2021-03-08|
1 |2021-03-09|
2 |2021-06-02|
... ....
other_ids |other_missing dates|
Use a cross join to generate all combinations of products and dates. Then remove the ones that already exist in the table. One method is a left join:
select p.product_id, d.my_date
from (select distinct product_id from t) p cross join
(select distinct my_date from t) d left join
t
on t.product_id = p.product_id and t.my_date = d.my_date
where t.product_id is null;
EDIT:
For the revised question (based on the comments), you can just calculate the date range and use that for the query:
select p.product_id, d.my_date
from (select product_id, min(my_date) as min_my_date, max(my_date) as max_my_date
from t
group by product_id
) p join
(select distinct my_date from t) d
on d.my_date between p.min_my_date and p.max_my_date left join
t
on t.product_id = p.product_id and
t.my_date = d.my_date
where t.product_id is null;
These methods assume that there is at least one row for each date in the data. If not, you need a different way to generate the dates, such as a calendar table or recursive CTE.
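WITH RECURSIVE all_dates AS (
SELECT CAST('2021-01-01' AS DATE) AS d_date
UNION ALL
SELECT d_date + INTERVAL 1 DAY
FROM all_dates
WHERE d_date < '2021-12-31')
-- pair every product with every generated date, then keep the pairs with no matching row
SELECT p.product_id, a.d_date
FROM (SELECT DISTINCT product_id FROM products) p
CROSS JOIN all_dates a
LEFT JOIN products t ON t.product_id = p.product_id AND t.date = a.d_date
WHERE t.product_id IS NULL;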
Using MySQL 8
https://dba.stackexchange.com/questions/224182/generate-dates-between-date-ranges-in-mysql
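Combining the two ideas above (a per-product date range and a recursive CTE that generates the dates) might look like the following. It is a sketch under assumptions: it runs against an in-memory SQLite database through Python's sqlite3, so the date arithmetic uses SQLite's date(d, '+1 day') where MySQL 8 would use d + INTERVAL 1 DAY, and the CTE name span and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (product_id INT, my_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2021-03-03"), (1, "2021-03-04"), (1, "2021-03-06"),
     (1, "2021-03-07"), (1, "2021-03-10"),
     (2, "2021-06-01"), (2, "2021-06-03")],
)

query = """
WITH RECURSIVE span(product_id, d, last_d) AS (
    -- anchor: first and last date seen for each product
    SELECT product_id, MIN(my_date), MAX(my_date) FROM t GROUP BY product_id
    UNION ALL
    -- step: advance one day until the product's last date is reached
    SELECT product_id, date(d, '+1 day'), last_d FROM span WHERE d < last_d
)
SELECT s.product_id, s.d
FROM span s
LEFT JOIN t ON t.product_id = s.product_id AND t.my_date = s.d
WHERE t.product_id IS NULL
ORDER BY s.product_id, s.d
"""
missing = conn.execute(query).fetchall()
print(missing)
```

Because the dates are generated rather than read from the table, this version does not need every calendar date to appear somewhere in the data.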

Check if ID from a table exists in another table, and if so, how many times

Let's say I have two tables
Table a
some_ID
1
2
3
4
Table b
some_ID
1
2
1
4
Now what I would like to receive is a table like
id amount
1 | 2
2 | 1
I tried with a following query:
SELECT COUNT(a.some_id) as id
FROM Table_a
INNER JOIN Table_b
ON Table_a.some_id = Table.b.some_id
but that only returned how many id rows there are in both tables.
Any help?
Do the grouping on table_b and then join that result set on table_a
SELECT b.* FROM
(
SELECT some_id AS id, COUNT(*) AS Cnt
FROM Table_b
GROUP BY some_id
) b
INNER JOIN Table_a a ON a.some_id = b.id
SQLFiddle
If you want the zero counts:
SELECT a.some_id AS id, count(b.some_id) as amount
FROM a LEFT JOIN b ON a.some_id = b.some_id
GROUP BY a.some_id
Result:
id | amount
1 | 2
2 | 1
3 | 0
4 | 1
If not:
SELECT a.some_id AS id, count(*) as amount
FROM a INNER JOIN b ON a.some_id = b.some_id
GROUP BY a.some_id
Result:
id | amount
1 | 2
2 | 1
4 | 1
The difference is the join type: a left outer join in the first query, an inner join in the second. Note that in the first case it is important to count with count(b.some_id). With count(*), the rows with missing b entries would be counted as 1, because count(*) counts rows while count(expression) counts non-null values.
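The difference between count(*) and count(b.some_id) on a left join is easy to see side by side. The sketch below uses Python's sqlite3 with the sample data from the question; the column aliases count_star and count_col are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (some_id INT)")
conn.execute("CREATE TABLE b (some_id INT)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany("INSERT INTO b VALUES (?)", [(1,), (2,), (1,), (4,)])

query = """
SELECT a.some_id,
       COUNT(*)         AS count_star,  -- counts rows, including the NULL-padded one
       COUNT(b.some_id) AS count_col    -- counts only non-NULL b values
FROM a LEFT JOIN b ON a.some_id = b.some_id
GROUP BY a.some_id
ORDER BY a.some_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# (1, 2, 2)
# (2, 1, 1)
# (3, 1, 0)   <- id 3 has no match in b: count(*) says 1, count(b.some_id) says 0
# (4, 1, 1)
```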
If I understand correctly, you want a histogram of histograms:
select cnt, count(*) as num_ids
from (select some_id, count(*) as cnt
from b
group by some_id
) b
group by cnt;

Find the same sets of pairs

I have such scheme in mysql:
TableA (id integer PK, pid integer, mid integer)
Ex. data:
id | pid | mid
1 | 2 | 2
2 | 2 | 4
3 | 3 | 4
4 | 4 | 2
5 | 4 | 4
6 | 3 | 2
7 | 3 | 5
For a given pid with some mids, I want to find all pids that have the same set of mids. For example, for pid=2 the answer is 2,4.
group_concat is not suitable for me
I think it should be simple, but the answer eludes me
UPD:
I have tried group_concat:
SELECT DISTINCT(b.pid) FROM (SELECT pid, group_concat(mid) as concated FROM TableA where pid=100293) as a, (select pid, group_concat(mid) as concated, COUNT(1) as count FROM TableA group by pid) as b where a.concated=b.concated;
Since you are working with integers, instead of group_concat you could generate a bitmask on distinct mid values for each pid and join on that. Then it's just math all the way down:
SELECT DISTINCT t2.pid
FROM (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t1a GROUP BY pid) as t1
INNER JOIN (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t2a GROUP BY pid) as t2
ON t1.midmask = t2.midmask
WHERE t1.pid = 2
If mid is already distinct for each pid then you can get rid of the inner-inner subqueries.
This uses @GordonLinoff's excellent single-subquery approach, where GROUP_CONCAT is only used on the main query (where it won't be so expensive). Instead of the group_concat on the inner query, we use the bitmask approach, which may be quicker.
SELECT midmask>>1, group_concat(pid)
FROM (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t1a GROUP BY pid) as t1
GROUP BY midmask;
Results:
+---------+-------------------+
| midmask | group_concat(pid) |
+---------+-------------------+
| 10 | 2,4 |
| 26 | 3 |
+---------+-------------------+
Obviously that midmask in the result set isn't super necessary, but you can pick out the values from the bitmask if you want to see the mid values that contributed to the match if you like.
I'm using the bit right-shift operator to ensure that the proper bit is set in the midmask result; otherwise you'll be off by one. If you don't care about the output of the midmask, then don't bother with the >>1 portion of the query.
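The bitmask grouping can be tried out on the sample data. This is a sketch on SQLite via Python's sqlite3, so it uses the integer shift 1 << mid in place of MySQL's pow(2, mid), and it assumes mid values are small non-negative integers (below 63) so the mask fits in a 64-bit integer. As far as I know, pow() in MySQL returns a DOUBLE, so the original expression would also lose exactness once the masks grow past 2^53.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (id INTEGER PRIMARY KEY, pid INT, mid INT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, 2, 2), (2, 2, 4), (3, 3, 4), (4, 4, 2), (5, 4, 4), (6, 3, 2), (7, 3, 5)],
)

# One mask per pid: bit number `mid` is set for every distinct mid of that pid.
query = """
SELECT SUM(1 << mid) AS midmask, pid
FROM (SELECT DISTINCT pid, mid FROM TableA)
GROUP BY pid
"""
groups = {}
for mask, pid in conn.execute(query):
    groups.setdefault(mask, []).append(pid)
groups = {mask: sorted(pids) for mask, pids in groups.items()}

print(groups)  # {20: [2, 4], 52: [3]}
```

Here 20 is 2^2 + 2^4 and 52 is 2^2 + 2^4 + 2^5; shifting them right by one gives the 10 and 26 shown in the result table above.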
You can use this query. It will give you comma separated pids.
select `mid`, group_concat(`pid`) from `tableA` group by `mid`;
In MySQL, I would approach this using group_concat():
select mids, group_concat(pid)
from (select pid, group_concat(mid order by mid) as mids
from t
group by pid
) t
group by mids;
This solves the general problem, for all pids. Solving for 1 pid is a bit tricky in MySQL (no window functions), but you can try:
select t.pid, t2.pid, count(*)
from t join
t t2
on t.mid = t2.mid and t2.pid = 2
group by t.pid, t2.pid
having count(*) = (select count(*) from t t3 where t3.pid = t.pid) and
count(*) = (select count(*) from t t4 where t4.pid = t2.pid);
For this, you want indexes on t(mid, pid) and t(pid).
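The single-pid query can be sanity-checked on the question's data. The sketch below runs on SQLite through Python's sqlite3 and uses the table name TableA from the question; the inner aliases t3 and t4 are needed so the correlated subqueries compare against the outer rows rather than against themselves. It assumes each (pid, mid) pair appears at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (id INTEGER PRIMARY KEY, pid INT, mid INT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, 2, 2), (2, 2, 4), (3, 3, 4), (4, 4, 2), (5, 4, 4), (6, 3, 2), (7, 3, 5)],
)

query = """
SELECT t.pid
FROM TableA t
JOIN TableA t2 ON t.mid = t2.mid AND t2.pid = ?
GROUP BY t.pid, t2.pid
-- the number of shared mids must equal the size of both sets
HAVING COUNT(*) = (SELECT COUNT(*) FROM TableA t3 WHERE t3.pid = t.pid)
   AND COUNT(*) = (SELECT COUNT(*) FROM TableA t4 WHERE t4.pid = t2.pid)
ORDER BY t.pid
"""
same_set = [row[0] for row in conn.execute(query, (2,))]
print(same_set)  # [2, 4]
```

pid 3 shares both mids of pid 2 but also has mid 5, so the first count check rejects it.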

delete rows in mysql

If I have a table:
id1 id2 count
A A 1
A B 2
A C 1
B A 3
B B 1
B C 2
C A 3
C B 2
C C 1
What I want after deleting:
id1 id2 count
A A 1
A B 2
A C 1
B B 1
B C 2
C C 1
In other words, if I have A(id1) --> B(id2), then delete B(id1) --> A(id2). Likewise, if I have B(id1) --> C(id2), then delete the row C(id1) --> B(id2).
Thank you for your help!
The condition Target.id1 > Target.id2 picks out rows like (B, A, ??), where B > A; it also skips rows like (A, A, ??).
A self LEFT JOIN then looks for the mirrored row (A, B, ??).
If a match is found, Source.id1 IS NOT NULL and the Target row is deleted.
SQL Fiddle Demo
DELETE Target
FROM Table1 Target
LEFT JOIN Table1 Source
ON Target.`id1` = Source.`id2`
AND Target.`id2` = Source.`id1`
AND Target.`id1` > Target.`id2`
WHERE Source.`id1` IS NOT NULL;
OUTPUT
| id1 | id2 | count |
|-----|-----|-------|
| A | A | 1 |
| A | B | 2 |
| A | C | 1 |
| B | B | 1 |
| B | C | 2 |
| C | C | 1 |
Should be something like:
DELETE FROM myTable
WHERE STRCMP(id1, id2) > 0;
STRCMP function can compare the strings and return an int. From there it should be easy - something very similar to the above. If you have further trouble let me know.
It looks like what you are saying is...
If there is a (id1,id2) tuple in the table with values e.g. (a,b), and there is another tuple (b,a) that consists of the same values, but swapped in the columns, you want to remove one of those tuples. It looks like the one you want to remove is the one that has the "greater" value in the first column.
First, identify the "duplicate" tuples.
For now, we'll ignore the tuples where the values of id1 and id2 are the same, e.g. (a,a).
SELECT s.id1
, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
ORDER BY s.id1, s.id2
If that returns the set of rows you want to remove, we can convert that into a DELETE. To do that, we need to change that query into an inline view.
We can rewrite it like this and verify that we get equivalent results.
SELECT o.id1, o.id2
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2
ORDER BY o.id1, o.id2
Then we can convert that to a DELETE statement, replacing SELECT o.id1, o.id2 with DELETE o.* and removing the ORDER BY...
DELETE o.*
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2
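The "id1 > id2 and a mirrored row exists" predicate can be exercised end to end. This sketch uses SQLite through Python's sqlite3, where a DELETE may reference its own table in a subquery directly; MySQL rejects that form, which is why the answer above wraps the subquery in an inline view. The column is named cnt instead of count, and the extra row ('D', 'A', 7) is a hypothetical addition to show that a row with no mirror is kept (a plain id1 > id2 delete would remove it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id1 TEXT, id2 TEXT, cnt INT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("A", "A", 1), ("A", "B", 2), ("A", "C", 1),
     ("B", "A", 3), ("B", "B", 1), ("B", "C", 2),
     ("C", "A", 3), ("C", "B", 2), ("C", "C", 1),
     ("D", "A", 7)],  # hypothetical row: there is no ('A', 'D') mirror
)

# Delete a row only if it is the "greater-first" half of a mirrored pair.
conn.execute("""
DELETE FROM mytable
WHERE id1 > id2
  AND EXISTS (SELECT 1 FROM mytable r
              WHERE r.id1 = mytable.id2
                AND r.id2 = mytable.id1)
""")

remaining = conn.execute(
    "SELECT id1, id2 FROM mytable ORDER BY id1, id2"
).fetchall()
print(remaining)
```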

MySQL INNER JOIN from second table (TOP10)

$stmt = $conn->prepare('SELECT a.*, c.*, SUM(a.money+b.RESULT) AS ARESULT
FROM users a
INNER JOIN bankaccounts c
ON a.id = c.owner
INNER JOIN
(
SELECT owner, SUM(amount) AS RESULT
FROM bankaccounts
GROUP BY owner
) b ON a.id = b.owner
ORDER BY ARESULT DESC LIMIT 10');
What's the problem? It wrongly shows only one record. I want to list at most 10 records, like a TOP 10 of the richest users by [money + (the sum of all their bankaccounts amounts)].
Let's say I have 2 tables.
Table: users
ID | username | money
1 | richman | 500
2 | richman2 | 600
Table: bankaccounts
ID | owner | amount
65 | 1 | 50
68 | 1 | 50
29 | 2 | 400
So it would list:
richman2 1000$
richman 600$
Try using subqueries:
$stmt = $conn->prepare('SELECT a.*,
IFNULL((SELECT SUM(amount) FROM bankaccounts b WHERE b.owner=a.id),0) AS BANK_MONEY,
(IFNULL(a.money,0) + IFNULL((SELECT SUM(amount) FROM bankaccounts c WHERE c.owner=a.id),0)) AS ARESULT
FROM users a
ORDER BY ARESULT DESC LIMIT 0, 10');
EDIT: Added a field for bank account totals
EDIT2: Added IFNULL to SQL statement in case user is not in BankAccounts table
Try this:
SELECT a.*, (a.money + b.RESULT) AS ARESULT
FROM users a
INNER JOIN (SELECT owner, SUM(amount) AS RESULT
FROM bankaccounts
GROUP BY owner
) b ON a.id = b.owner
ORDER BY ARESULT DESC
LIMIT 10
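The original query returned a single record because SUM(a.money+b.RESULT) in the outer SELECT is an aggregate with no GROUP BY, which collapses the whole result into one row; the pre-aggregated subquery already yields one total per owner, so the outer level only needs a plain addition. The sketch below checks this on SQLite through Python's sqlite3. It swaps the INNER JOIN for a LEFT JOIN with COALESCE so that users without bank accounts still appear; the user 'nobank' is a hypothetical row added to show that case, and the aliases total and aresult are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INT, username TEXT, money INT)")
conn.execute("CREATE TABLE bankaccounts (id INT, owner INT, amount INT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "richman", 500), (2, "richman2", 600),
     (3, "nobank", 100)],  # hypothetical user with no bank accounts
)
conn.executemany(
    "INSERT INTO bankaccounts VALUES (?, ?, ?)",
    [(65, 1, 50), (68, 1, 50), (29, 2, 400)],
)

query = """
SELECT u.username, u.money + COALESCE(b.total, 0) AS aresult
FROM users u
LEFT JOIN (
    -- one row per owner with the sum of all their accounts
    SELECT owner, SUM(amount) AS total
    FROM bankaccounts
    GROUP BY owner
) b ON b.owner = u.id
ORDER BY aresult DESC
LIMIT 10
"""
top = conn.execute(query).fetchall()
print(top)  # [('richman2', 1000), ('richman', 600), ('nobank', 100)]
```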