MySQL ORDER BY field created by GROUP_CONCAT()

Input: two tables, TABLE_A and TABLE_B.
TABLE_A
A_ID | A
1 | a
2 | a1
3 | a2
TABLE_B
A_ID | B
1 | b
1 | b1
2 | b2
Expected output:
TABLE_C
A_ID | A | C
3 | a2 | NULL <--- NULL if no matched A_ID in TABLE B
1 | a | b,b1 <--- Concat all rows in TABLE B with ','
2 | a1 | b2
The following query almost produces TABLE_C, except that I want to sort by field C with NULL first, then in descending order. ORDER BY C IS NULL DESC does not seem to work. Note that if C is NULL, TABLE_C should be ordered by A_ID regardless of the value in field C.
SELECT
A1.A_ID,
A1.A,
GROUP_CONCAT(B1.B SEPARATOR ',') as 'C'
FROM `TABLE_A` A1
LEFT JOIN `TABLE_B` B1
ON A1.A_ID=B1.A_ID
GROUP BY A1.A_ID, A1.A;
The following SQL gives an error:
SELECT
A1.A_ID,
A1.A,
GROUP_CONCAT(B1.B SEPARATOR ',') as 'C'
FROM `TABLE_A` A1
LEFT JOIN `TABLE_B` B1
ON A1.A_ID=B1.A_ID
GROUP BY A1.A_ID, A1.A
ORDER BY C IS NULL DESC, A1.A_ID; <--- Order by C with NULL failed.
Reference 'C' not supported (reference to group function)

To sort NULL values first and then by A1.A_ID, use:
SELECT A1.A_ID, A1.A, GROUP_CONCAT(B1.B SEPARATOR ',') as C
FROM `TABLE_A` A1
LEFT JOIN `TABLE_B` B1 ON A1.A_ID=B1.A_ID
GROUP BY A1.A_ID, A1.A
ORDER BY (CASE WHEN GROUP_CONCAT(B1.B SEPARATOR ',') IS NULL then GROUP_CONCAT(B1.B SEPARATOR ',') ELSE A1.A_ID END) ;
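Regarding the error Reference 'C' not supported (reference to group function): MySQL raises it when the alias of an aggregate is used inside an expression in ORDER BY. Quoting the alias as 'C' does not help, because 'C' is then a string literal and 'C' IS NULL has the same value for every row. Repeat the aggregate expression instead, i.e.
ORDER BY GROUP_CONCAT(B1.B SEPARATOR ',') IS NULL DESC, A1.A_ID;
and not ORDER BY C IS NULL DESC, A1.A_ID;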

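You cannot use the alias C inside an IS NULL expression in ORDER BY. Ordering by the alias itself works, and in ascending order MySQL sorts NULL values first: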
SELECT
A1.A_ID,
A1.A,
GROUP_CONCAT(B1.B SEPARATOR ',') as 'C'
FROM `TABLE_A` A1
LEFT JOIN `TABLE_B` B1
ON A1.A_ID=B1.A_ID
GROUP BY A1.A_ID, A1.A
ORDER BY C, A1.A_ID DESC
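Another option, sketched here on the assumption that the tables are exactly as in the question, is to compute the aggregate in a derived table (the alias t is only illustrative). Outside the derived table, C is an ordinary column, so it can be used in any ORDER BY expression:

```sql
-- Aggregate first, then sort the finished rows.
SELECT t.A_ID, t.A, t.C
FROM (
    SELECT A1.A_ID,
           A1.A,
           GROUP_CONCAT(B1.B SEPARATOR ',') AS C
    FROM TABLE_A A1
    LEFT JOIN TABLE_B B1 ON A1.A_ID = B1.A_ID
    GROUP BY A1.A_ID, A1.A
) AS t
-- (t.C IS NULL) is 1 for rows without a match, so DESC puts them first.
ORDER BY t.C IS NULL DESC, t.A_ID;
```

With the sample data this should list A_ID 3 first (C is NULL), followed by 1 and 2.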

Related

MySQL COUNT and GROUP BY subquery

I'm trying to find the count of posts grouped by branch and category. I'm not getting the categories with count 0.
CREATE TABLE branches
(`id` serial primary key, `name` varchar(7) unique)
;
INSERT INTO branches
(`id`, `name`)
VALUES
(1, 'branch1'),
(2, 'branch2'),
(3, 'branch3')
;
CREATE TABLE categories
(`id` serial primary key, `category` varchar(4) unique)
;
INSERT INTO categories
(`id`, `category`)
VALUES
(1, 'cat1'),
(2, 'cat2')
;
CREATE TABLE posts
(`id` serial primary key, `branch_id` int, `category_id` int, `title` varchar(6), `created_at` varchar(10))
;
INSERT INTO posts
(`id`, `branch_id`, `category_id`, `title`, `created_at`)
VALUES
(1, 1, 1, 'Title1', '2017-12-14'),
(2, 1, 2, 'Title2', '2018-01-05'),
(3, 2, 1, 'Title3', '2018-01-10')
;
Expected Output:
+---------+----------+----+----+
| branch | category | c1 | c2 |
+---------+----------+----+----+
| branch1 | cat1 | 1 | 0 |
| branch1 | cat2 | 0 | 1 |
| branch2 | cat1 | 0 | 1 |
| branch2 | cat2 | 0 | 0 |
+---------+----------+----+----+
Query tried:
SELECT b.name, x.c1, y.c2 FROM branches b
LEFT JOIN (
SELECT COUNT(id) c1 FROM posts WHERE created_at < '2018-01-01'
GROUP BY posts.branch_id, posts.category_id
) x x.branch_id = b.id
LEFT JOIN (
SELECT COUNT(id) c2 FROM posts WHERE created_at BETWEEN '2018-01-01' AND '2018-01-31'
GROUP BY posts.branch_id, posts.category_id
) y y.branch_id = b.id
GROUP BY b.id
You need to CROSS JOIN branches and categories first; then LEFT JOIN to posts and do conditional counts based on your WHERE criteria.
Generic format:
SELECT x.data, y.data
, COUNT(CASE WHEN conditionN THEN 1 ELSE NULL END) AS cN
FROM x CROSS JOIN y
LEFT JOIN z ON x.id = z.x_id AND y.id = z.y_id
GROUP BY x.data, y.data
;
Note: COUNT (like pretty much all aggregate functions) ignores NULL values.
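Applied to the tables in the question, the generic format might look like the following sketch. The aliases b, c and p are only illustrative, and the ELSE NULL branch is left out because CASE returns NULL when no branch matches:

```sql
-- Every branch/category pair, with one conditional count per date range.
SELECT b.name     AS branch,
       c.category AS category,
       COUNT(CASE WHEN p.created_at < '2018-01-01' THEN 1 END) AS c1,
       COUNT(CASE WHEN p.created_at BETWEEN '2018-01-01' AND '2018-01-31'
                  THEN 1 END) AS c2
FROM branches b
CROSS JOIN categories c
-- Pairs without posts keep NULLs in p.*, which COUNT ignores.
LEFT JOIN posts p
       ON p.branch_id = b.id
      AND p.category_id = c.id
GROUP BY b.name, c.category
ORDER BY b.name, c.category;
```

Because of the CROSS JOIN, branch3 also appears with both counts at 0, which the expected output omits; add a filter on the branch if those rows are unwanted.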
It looks like this might do what you want.
Explanation: Get each possible branch/category combination for the branches that exist in posts. Do a conditional sum to get the counts by date range and branch/category. Then join the two result sets on branch and category.
SELECT b.b_id branch,
b.category,
COALESCE(Range_Sum.C1,0) C1,
COALESCE(Range_Sum.C2,0) C2
FROM ( SELECT b.id b_id,
c.id c_id,
c.category
FROM branches b,
categories c
WHERE EXISTS
( SELECT 1
FROM posts
WHERE b.id = posts.branch_id
)
) b
LEFT
JOIN (SELECT p.branch_id,
c.id c_id,
c.category,
SUM
( CASE WHEN p.created_at < '2018-01-01' THEN 1
ELSE 0
END
) C1,
SUM
( CASE WHEN p.created_at BETWEEN '2018-01-01' AND '2018-01-31' THEN 1
ELSE 0
END
) C2
FROM posts p
INNER
JOIN categories c
ON p.category_id = c.id
GROUP
BY p.branch_id,
c.category,
c.id
) Range_Sum
ON b.b_id = Range_Sum.branch_id
AND b.c_id = Range_Sum.c_id;
Also, just a thing for writing easily readable queries - NEVER use x and y as aliases. Choose anything else that could possibly be more informative.
Maybe a little contrived...
SELECT DISTINCT x.branch_id
, y.category_id
, COALESCE(z.created_at < '2018-01-01',0) c1
, COALESCE(z.created_at BETWEEN '2018-01-01' AND '2018-01-31',0) c2
FROM posts x
JOIN posts y
LEFT
JOIN posts z
ON z.branch_id = x.branch_id
AND z.category_id = y.category_id;
http://sqlfiddle.com/#!9/8aabf2/31

Mysql join on null

I have 2 tables, each with 3 columns to join with.
table A
c1 c2 c3
10 NULL NULL
10 NULL 1
10 1 NULL
table B
c1 c2 c3
10 NULL NULL
10 NULL 1
10 1 NULL
I would like to join them so that NULL = NULL, i.e. something like:
SELECT * FROM a JOIN b ON a.c1 = b.c1 AND a.c2 = b.c2 AND a.c3 = b.c3
The join should treat NULL as matching NULL, so that in the end I get these 3 records:
table A+B
c1 c2 c3 c1 c2 c3
10 NULL NULL 10 NULL NULL
10 NULL 1 10 NULL 1
10 1 NULL 10 1 NULL
Is this possible somehow? I have also tried IFNULL but didn't get the results I expected. I would be grateful if you could point me in the right direction. Many thanks!
Use the NULL-safe equality operator:
SELECT *
FROM a JOIN
b
ON a.c1 <=> b.c1 AND a.c2 <=> b.c2 AND a.c3 <=> b.c3;
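Note, however, that with your sample data a join on the first column alone is not sufficient: every row has c1 = 10, so the following returns all nine combinations instead of three: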
SELECT *
FROM a JOIN
b
ON a.c1 = b.c1 ;
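To see why <=> is needed for the NULL columns, it may help to compare it with = on literal values. This is a small sketch that can be run on its own; the column aliases are only illustrative:

```sql
SELECT NULL = NULL   AS plain_equal,  -- NULL: the result is unknown, so a join on = never matches
       NULL <=> NULL AS null_safe,    -- 1: both sides are NULL
       1 <=> NULL    AS mixed,        -- 0: only one side is NULL
       1 <=> 1       AS both_set;     -- 1: same result as 1 = 1
```

Unlike =, the <=> operator always returns 0 or 1, never NULL, which is what lets the join condition match rows whose c2 or c3 is NULL on both sides.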

delete rows in mysql

If I have a table:
id1 id2 count
A A 1
A B 2
A C 1
B A 3
B B 1
B C 2
C A 3
C B 2
C C 1
What I want after deleting:
id1 id2 count
A A 1
A B 2
A C 1
B B 1
B C 2
C C 1
In other words, if I have A(id1) --> B(id2), then delete B(id1) --> A(id2). Likewise, if I have B(id1) --> C(id2), then delete the row C(id1) --> B(id2).
Thank you for your help!
Here the condition Target.id1 > Target.id2 selects rows like (B, A, ??), where B > A.
It also skips rows like (A, A, ??).
A self LEFT JOIN then looks for the mirrored row (A, B, ??).
If a match is found, Source.id1 IS NOT NULL and the Target row is deleted.
DELETE Target
FROM Table1 Target
LEFT JOIN Table1 Source
ON Target.`id1` = Source.`id2`
AND Target.`id2` = Source.`id1`
AND Target.`id1` > Target.`id2`
WHERE Source.`id1` IS NOT NULL;
OUTPUT
| id1 | id2 | count |
|-----|-----|-------|
| A | A | 1 |
| A | B | 2 |
| A | C | 1 |
| B | B | 1 |
| B | C | 2 |
| C | C | 1 |
It should be something like:
DELETE FROM myTable
WHERE STRCMP(id1, id2) > 0;
The STRCMP function compares two strings and returns an integer. Note that this deletes every row where id1 sorts after id2, whether or not the mirrored row exists, which happens to match the sample data. If you have further trouble let me know.
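For reference, a quick sketch of what STRCMP returns for the kinds of id pairs in the question (the column aliases are only illustrative):

```sql
SELECT STRCMP('B', 'A') AS b_vs_a,  -- 1: first argument sorts after the second
       STRCMP('A', 'A') AS a_vs_a,  -- 0: both arguments are equal
       STRCMP('A', 'B') AS a_vs_b;  -- -1: first argument sorts before the second
```

So a condition of STRCMP(id1, id2) > 0 keeps rows such as (A, A) and (A, B) and targets only rows such as (B, A).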
It looks like what you are saying is...
If there is an (id1,id2) tuple in the table with values e.g. (a,b), and there is another tuple (b,a) that consists of the same values, but swapped in the columns, you want to remove one of those tuples. It looks like the one you want to remove is the one that has the "greater" value in the first column.
First, identify the "duplicate" tuples.
For now, we'll ignore the tuples where the values of id1 and id2 are the same, e.g. (a,a).
SELECT s.id1
, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
ORDER BY s.id1, s.id2
If that returns the set of rows you want to remove, we can convert it into a DELETE. MySQL does not allow the table being deleted from to be referenced in a subquery of the same statement, so we first need to turn that query into an inline view.
We can rewrite it like this and verify that we get equivalent results:
SELECT o.id1, o.id2
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2
ORDER BY o.id1, o.id2
Then we can convert that to a DELETE statement, replacing SELECT o.id1, o.id2 with DELETE o.* and removing the ORDER BY:
DELETE o.*
FROM ( SELECT q.id1, q.id2
FROM ( SELECT s.id1, s.id2
FROM mytable s
WHERE s.id1 > s.id2
AND EXISTS ( SELECT 1
FROM mytable r
WHERE r.id1 = s.id2
AND r.id2 = s.id1
)
) q
GROUP BY q.id1, q.id2
) p
JOIN mytable o
ON o.id1 = p.id1
AND o.id2 = p.id2

MYSQL get the first value in JOIN query

I have 3 tables: A, B and C.
A has AID, B has AID and BID, and C has BID, Value and Date.
I need a query that returns AID and the first Value (by Date) from C.
What I've tried:
SELECT A.AID, Value FROM A INNER JOIN B on A.AID = B.BID
INNER JOIN C ON C.BID = B.BID GROUP BY A.AID
It gives me the last Value and not the first.
Data example:
A:
AID:
1
2
3
B:
AID BID
1 1
1 2
2 3
3 4
3 5
3 6
C:
BID Value Date
1 15 1.1.1970
1 422 1.1.1992
2 945 1.1.1975
3 149 1.1.1994
3 147 1.1.2015
4 110 1.1.2004
5 142 1.1.2005
The output should be:
AID Value
1 15
2 149
3 110
If you do not have too many records per aid (GROUP_CONCAT() output is truncated at group_concat_max_len, 1024 bytes by default) and value does not contain commas, then the group_concat()/substring_index() trick is probably the easiest way:
select b.aid,
substring_index(group_concat(c.value order by c.date), ',', 1) as first_value
from c join
b
on c.bid = b.bid
group by b.aid;
Larger amounts of data require a more complicated query. Something like:
select b.aid, c.value
from c join
b
on c.bid = b.bid
where c.date = (select min(c2.date)
from c c2 join
b b2
on c2.bid = b2.bid
where b2.aid = b.aid
);
To restrict C to just the rows with the earliest (minimum) date for each AID, you need a subquery that produces that minimum date, then join on it to limit the rows from C:
SELECT
A.AID
, C.Value
FROM A
INNER JOIN B ON A.AID = B.AID
INNER JOIN C ON B.BID = C.BID
INNER JOIN (
SELECT
B.AID
, MIN(C.Date) AS mindate
FROM B
INNER JOIN C ON B.BID = C.BID
GROUP BY
B.AID
) AS m ON A.AID = m.AID
AND C.Date = m.mindate
DROP TABLE IF EXISTS b;
CREATE TABLE b
(aid INT NOT NULL
,bid INT NOT NULL
,PRIMARY KEY(aid,bid)
);
INSERT INTO b VALUES
(1 ,1),
(1 ,2),
(2 ,3),
(3 ,4),
(3 ,5),
(3 ,6);
DROP TABLE IF EXISTS c;
CREATE TABLE c
(bid INT NOT NULL
,value INT NOT NULL
,date DATE
,PRIMARY KEY(bid,date)
);
INSERT INTO c VALUES
(1 ,15 ,'1970-01-01'),
(1 ,422 ,'1992-01-01'),
(2 ,945 ,'1975-01-01'),
(3 ,149 ,'1994-01-01'),
(3 ,147 ,'2015-01-01'),
(4 ,110 ,'2004-01-01'),
(5 ,142 ,'2005-01-01');
SELECT x.aid
, y.value
FROM b x
JOIN c y
ON y.bid = x.bid
JOIN
( SELECT b.aid
, MIN(c.date) min_date
FROM b
JOIN c
ON c.bid = b.bid
GROUP
BY b.aid
) z
ON z.min_date = y.date
AND z.aid = x.aid;
+-----+-------+
| aid | value |
+-----+-------+
| 1 | 15 |
| 2 | 149 |
| 3 | 110 |
+-----+-------+
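If window functions are available (this sketch assumes MySQL 8.0 or later, which the question does not state), the same result can be expressed with ROW_NUMBER() against the b and c tables created above. The names ranked and rn are only illustrative:

```sql
-- Number the values of each aid by date, then keep the earliest one.
SELECT ranked.aid, ranked.value
FROM (
    SELECT b.aid,
           c.value,
           ROW_NUMBER() OVER (PARTITION BY b.aid ORDER BY c.date) AS rn
    FROM b
    JOIN c ON c.bid = b.bid
) AS ranked
WHERE ranked.rn = 1
ORDER BY ranked.aid;
```

Unlike the MIN(date) join, this returns exactly one row per aid even when two rows share the same earliest date, although which of the tied rows is kept is then arbitrary.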

Select every field that fullfil the condition

I have a table called production:
factory_id | factory_name | product_id
1 | A | 1
1 | A | 2
1 | A | 3
2 | B | 3
3 | C | 1
3 | C | 2
3 | C | 3
3 | C | 4
3 | C | 5
I'm trying to develop a query that returns pairs of factory names such that every product of factory 1 is also produced by factory 2. The result should look like:
factory_name_1 | factory_name_2
A | C
B | A
B | C
I have tried some nested self joins and renames, but I can't wrap my head around how to apply EXISTS or IN to this scenario, i.e. "for each product produced by factory X, check a condition". Thanks in advance for any help.
Update:
Sorry, I forgot to paste my query:
select t0.fname0, t1.fname1
from (
select factory_id as fid0, factory_name as fname0, product_id as pid0, count(distinct factory_id, product_id) as pnum0
from production
group by factory_id
) t0
join
(
select factory_id as fid1, factory_name as fname1, product_id as pid1, count(distinct factory_id, product_id) as pnum1
from production
group by factory_id
) t1
where t0.fid0 <> t1.fid1
and t0.pnum0 < t1.pnum1
and t0.pid0 = t1.pid1;
Update 2: production is the only table. In the expected output, factory_name_1 and factory_name_2 are just renames of the factory_name attribute.
You need to JOIN the table for each factory pairing to make sure they "join" on the same product_ids, otherwise you might end up with similar counts for DISTINCT product_ids but these will not necessarily refer to the same product_ids.
This is my take on it:
SELECT bfna,afna, pcnt FROM (
SELECT a.factory_name afna, b.factory_name bfna, COUNT(DISTINCT b.product_id) commoncnt
FROM tbl a LEFT JOIN tbl b ON b.factory_name!=a.factory_name AND b.product_id=a.product_id
GROUP BY a.factory_name, b.factory_name
) c
INNER JOIN (
SELECT factory_name fna, COUNT(DISTINCT product_id) pcnt
FROM tbl GROUP BY factory_name
) d ON fna=bfna AND commoncnt=pcnt
ORDER BY bfna,afna
You can find a demo here: https://rextester.com/JJGCK84904
It produces:
bfna afna pcnt
A C 3
B A 1
B C 1
For simplicity I left out the column factory_id as it does not add any information here.
Fun fact: as I am using only "bare-bone" SQL expressions, the above code will run on SQL-Server too without any changes.
You can do it this way:
select A as factory_name_1 , B as factory_name_2
from
(
select A, B, count(*) as Count_
from
(
select a.factory_name as A, b.factory_name as B
from yourtable a
inner join yourtable b
on a.product_id = b.product_id and a.factory_id <> b.factory_id
)a group by A, B
)a
inner join
(select factory_name, count(*) as Count_ from yourtable group by factory_name) b
on a.A = b.factory_name and a.Count_ = b.Count_
Order by 1
Output:
factory_name_1 factory_name_2
A C
B A
B C
The other solutions just seem more complicated than necessary. This is basically a self-join with aggregation:
with t as (
select t.*, count(*) over (partition by factory_id) as cnt
from tbl t
)
select t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, count(*)
from t t1 join
t t2
on t1.product_id = t2.product_id and t1.factory_id <> t2.factory_id
group by t1.factory_id, t2.factory_id, t1.factory_name, t2.factory_name, t1.cnt
having count(*) = max(t1.cnt);
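Since the question asks how EXISTS could express "for each product produced by factory X", here is a sketch of the double NOT EXISTS form against the production table from the question. The aliases f1, f2, p1 and p2 are only illustrative. It reads as: keep the pair when there is no product of factory 1 that factory 2 does not also produce.

```sql
SELECT DISTINCT f1.factory_name AS factory_name_1,
                f2.factory_name AS factory_name_2
FROM production f1
JOIN production f2
  ON f1.factory_id <> f2.factory_id
WHERE NOT EXISTS (
    -- a product made by factory 1 ...
    SELECT 1
    FROM production p1
    WHERE p1.factory_id = f1.factory_id
      AND NOT EXISTS (
          -- ... that factory 2 does not make
          SELECT 1
          FROM production p2
          WHERE p2.factory_id = f2.factory_id
            AND p2.product_id = p1.product_id
      )
)
ORDER BY factory_name_1, factory_name_2;
```

With the sample data this should return the pairs (A, C), (B, A) and (B, C). It will likely be slower than the counting approaches on large tables, but it follows the wording of the requirement most directly.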