Building SQL Join for complement of data. T-SQL help needed - mysql

Assume I have data as,
create table #TableA
(
TableAID int,
TableAName varchar(10)
)
create table #TableB
(
TableBID int,
TableBName varchar(10),
TableAID int
)
insert into #TableA values
(1, 'A 1'),
(2, 'A 2'),
(3, 'A 3')
insert into #TableB values
(1, 'B 1', 1),
(2, 'B 2', 2)
I want to write a join and NOT SQL query which returns me data just as shown below,
TableAName TableBName
---------- ----------
A 3 N/A
In short, I want the complement of what an inner join returns!

This is a classic use for an OUTER JOIN, most commonly done with a LEFT OUTER JOIN (usually abbreviated to just LEFT JOIN):
SELECT A.TableAName, B.TableBName
FROM TableA A
LEFT JOIN TableB B on A.TableAID = B.TableAID
WHERE B.TableAID IS NULL
An outer join keeps rows from one table even when the other table has no match: here TableA has 3 rows but TableB has only 2. Where there is no matching row in TableB, its columns come back as NULL, so you can filter on IS NULL as shown above.
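As a minimal sketch of the left excluding join, the snippet below runs the query against the sample data. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for SQL Server, and it adds COALESCE, which is not in the answer above, to turn the NULL into the 'N/A' the question asks for.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the T-SQL tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (TableAID INTEGER, TableAName TEXT);
    CREATE TABLE TableB (TableBID INTEGER, TableBName TEXT, TableAID INTEGER);
    INSERT INTO TableA VALUES (1, 'A 1'), (2, 'A 2'), (3, 'A 3');
    INSERT INTO TableB VALUES (1, 'B 1', 1), (2, 'B 2', 2);
""")

# LEFT JOIN keeps every TableA row; the WHERE keeps only the unmatched ones.
# COALESCE replaces the NULL TableBName with the 'N/A' placeholder.
rows = conn.execute("""
    SELECT A.TableAName, COALESCE(B.TableBName, 'N/A') AS TableBName
    FROM TableA A
    LEFT JOIN TableB B ON A.TableAID = B.TableAID
    WHERE B.TableAID IS NULL
""").fetchall()

print(rows)  # [('A 3', 'N/A')]
```

COALESCE is also available in T-SQL and MySQL, so the same SELECT list should carry over, though that is not verified here.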
Please do yourself a favour: go here for a visual representation of joins and look for "Left Excluding JOIN".

Related

MySQL: filter child records, include all siblings

There are two MySQL tables:
tparent(id int, some data...)
tchild(id int, parent_id int, some data...)
I need to return all columns (parent plus all children) where at least one of the children matches some criteria.
My current solution:
-- prepare sample data
DROP TABLE IF EXISTS tparent;
DROP TABLE IF EXISTS tchild;
CREATE TABLE tparent (id int, c1 varchar(10), c2 date, c3 float);
CREATE TABLE tchild(id int, parent_id int, c4 float, c5 varchar(20), c6 date);
CREATE UNIQUE INDEX tparent_id_IDX USING BTREE ON tparent (id);
CREATE UNIQUE INDEX tchild_id_IDX USING BTREE ON tchild (id);
INSERT INTO tparent
VALUES
(1, 'a', '2021-01-01', 1.23)
, (2, 'b', '2021-02-01', 1.32)
, (3, 'c', '2021-01-03', 2.31);
INSERT INTO tchild
VALUES
(10, 1, 22.333, 'argh1', '2000-01-01')
, (20, 1, 33.222, 'argh2', '2000-01-02')
, (30, 1, 44.555, 'argh3', '2000-02-02')
, (40, 2, 33.222, 'argh4', '2000-03-02')
, (50, 3, 33.222, 'argh5', '2000-04-02')
, (60, 3, 33.222, 'argh6', '2000-05-02');
-- the query
WITH parent_filter AS
(
SELECT
parent_id
FROM
tchild
WHERE
c4>44
)
SELECT
p.*,
c.*
FROM
tparent p
JOIN tchild c ON p.id = c.parent_id
JOIN parent_filter pf ON p.id = pf.parent_id;
It returns 3 rows for parent id 1 and child ids 10, 20, 30, because child id 30 has a matching record. It does not return data for any other parent id.
However, I am querying tchild twice here (first in the CTE, then again in the main query). As both tables are relatively big (tens to hundreds of millions of rows, 2-5 child records per parent record on average), I am hitting performance / timing issues.
Is there a better way of achieving this filtering? I.e. without having to query tchild table more than once?
Did you try this version?
SELECT *
FROM tparent p
JOIN tchild c ON p.id = c.parent_id AND <criteria>
This way you limit the tchild table with the criteria before the actual join. Note that it returns only the child rows that match the criteria, not their siblings.
Perhaps you can use this instead:
select p.*, c.*
from tparent p
join tchild c
on p.id = c.parent_id
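where exists (select 1 from tchild c2 where c2.parent_id = p.id and <criteria>)
This should retrieve all parent and child rows for each parent that has at least one child record meeting the criteria. The subquery has to be correlated on parent_id; without that, it is true for every parent as soon as any child anywhere matches.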
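A small sketch of the EXISTS approach on the question's sample data follows. It assumes SQLite (via Python's sqlite3) in place of MySQL, trims the tables to the columns the filter needs, and uses c4 > 44 as the criterion. It runs a correlated subquery next to an uncorrelated one to show why the correlation matters.

```python
import sqlite3

# Assumption: SQLite in memory, with only the columns needed for the filter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tparent (id INTEGER, c1 TEXT);
    CREATE TABLE tchild (id INTEGER, parent_id INTEGER, c4 REAL);
    INSERT INTO tparent VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO tchild VALUES
        (10, 1, 22.333), (20, 1, 33.222), (30, 1, 44.555),
        (40, 2, 33.222), (50, 3, 33.222), (60, 3, 33.222);
""")

# Correlated: the subquery only looks at children of the current parent.
sibling_rows = conn.execute("""
    SELECT p.id, c.id
    FROM tparent p
    JOIN tchild c ON p.id = c.parent_id
    WHERE EXISTS (
        SELECT 1 FROM tchild c2
        WHERE c2.parent_id = p.id AND c2.c4 > 44
    )
    ORDER BY c.id
""").fetchall()

# Uncorrelated: true for every row as soon as any child matches.
unfiltered_rows = conn.execute("""
    SELECT p.id, c.id
    FROM tparent p
    JOIN tchild c ON p.id = c.parent_id
    WHERE EXISTS (SELECT 1 FROM tchild WHERE c4 > 44)
    ORDER BY c.id
""").fetchall()

print(sibling_rows)          # [(1, 10), (1, 20), (1, 30)]
print(len(unfiltered_rows))  # 6
```

This still references tchild twice, so whether it beats the CTE version on large tables depends on the optimizer and on indexes such as one on tchild.parent_id, which this sketch does not measure.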

MySQL using an anti-join to select non duplicate values between two tables

I am trying to select values from table A based on values from table B. But, I can't figure out what to use.
Goal:
A user on my website gets a whole list of check-boxes from table A on a web-page.
Then the user chooses a check-box and the value from the check-box is
inserted into table B.
At some point in time, the user returns to that web-page and only sees the checkboxes that weren't inserted into table B.
In database terms, I would use a select query that compares table A (which holds ALL of the values) and table B (which basically stores a copy of a value from table A).
Here's my query. wp_ml_skill_class is table A and wp_ml_character_skill is table B
SELECT DISTINCT
s.skill_name, s.skill_id, c.char_id, c.um_id, c.class_id
FROM
`wp_ml_skill_class` sc
JOIN
`wp_ml_skill` s
ON
(s.skill_id = sc.skill_id)
JOIN
`wp_ml_character` c
WHERE
c.class_id = 3
AND
c.char_id = 5
AND
sc.skill_id
NOT IN
(SELECT cs.skill_id FROM wp_ml_character_skill cs);
You could do this with a LEFT JOIN between table A & B, checking for NULL results from table B, which will represent the rows in table A which are not in table B.
SELECT A.*
FROM wp_ml_skill_class A
LEFT JOIN wp_ml_character_skill B
ON B.skill_id = A.skill_id
WHERE B.skill_id IS NULL
Here's a small example to demonstrate:
create table A (id int, val varchar(10));
create table B (id int, val varchar(10));
insert into A values (1, 'a'), (2, 'b'), (3, 'c');
insert into B values (2, 'b');
SELECT *
FROM A
LEFT JOIN B
ON B.id = A.id
WHERE B.id IS NULL
Output:
id val id val
1 a (null) (null)
3 c (null) (null)

Join two tables and remove duplicates

I'm trying to join two tables, where the second table has duplicates.
The tables look something like
CREATE TABLE ta
(
id int,
cno varchar(30),
d1 varchar(30),
d2 int
);
CREATE TABLE tb
(
id int,
cno varchar(30),
cn1 varchar(30),
cn2 int
);
INSERT INTO ta
(id, cno, d1, d2)
VALUES
(1, '1234','a',2),
(2, '6456','j',3),
(3, '5456','h',4),
(4, '4454','g',5);
INSERT INTO tb
(id, cno, cn1, cn2)
VALUES
(1, '1234', 'a', 21),
(1, '1234', 'a', 22),
(2, '6456', 'b', 33),
(2, '6456', 'c', 34),
(2, '6456', 'c', 35),
(3, '5456', 'c', 36),
(4, '4454', 'c', 37);
I was able to get the result http://sqlfiddle.com/#!2/b282e3/1 in MySQL. However, when I run it in PostgreSQL I get an error: http://sqlfiddle.com/#!15/b282e/4
Output should be like http://sqlfiddle.com/#!2/b282e3/1
CNO CN1 CN2 D1 D2
1234 a 21 a 2
4454 c 37 g 5
5456 c 36 h 4
6456 b 33 j 3
Are there any alternatives for this in PostgreSQL?
Use aggregate functions for columns that are not used in GROUP BY:
select t2.cno,
min(t2.cn1) as a,
min(t2.cn2) as b,
min(t1.d1) as c,
min(t1.d2) as d
from ta as t1
inner join tb as t2
on t1.cno=t2.cno
group by t2.cno
http://sqlfiddle.com/#!15/b282e/23
This query in MySQL:
select t2.cno, t2.cn1, t2.cn2, t1.d1, t1.d2
from ta t1 inner join
tb t2
on t1.cno = t2.cno
group by t2.cno;
is not valid SQL (according to the standard or other databases). The problem is that there are columns in the select that are neither in the group by nor arguments to aggregation functions (and they are not "functionally dependent" on the group by columns either). Your use of the group by extension in MySQL is officially discouraged. You can read the documentation about it here.
Ironically, Postgres has an extension called distinct on that does something similar. The syntax is:
select distinct on (t2.cno) t2.cno, t2.cn1, t2.cn2, t1.d1, t1.d2
from ta t1 inner join
tb t2
on t1.cno = t2.cno
order by t2.cno;
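distinct on takes a list of expressions in parentheses and returns one row per distinct value of that list, keeping the first row in order by order and ignoring the rest. The distinct on expressions must match the leftmost expressions in the order by, otherwise Postgres raises an error. To control which row is kept (for example, the one with the smallest cn2), add further columns to the order by.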
In most other databases, you would do something similar using row_number(). And you can use that as well in Postgres.
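As a rough sketch of the row_number() approach, the snippet below numbers the joined rows within each cno and keeps the first. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 rather than Postgres, and it assumes that "first" means the row with the smallest cn2, which is what the expected output in the question suggests.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (id INTEGER, cno TEXT, d1 TEXT, d2 INTEGER);
    CREATE TABLE tb (id INTEGER, cno TEXT, cn1 TEXT, cn2 INTEGER);
    INSERT INTO ta VALUES
        (1, '1234', 'a', 2), (2, '6456', 'j', 3),
        (3, '5456', 'h', 4), (4, '4454', 'g', 5);
    INSERT INTO tb VALUES
        (1, '1234', 'a', 21), (1, '1234', 'a', 22),
        (2, '6456', 'b', 33), (2, '6456', 'c', 34), (2, '6456', 'c', 35),
        (3, '5456', 'c', 36), (4, '4454', 'c', 37);
""")

# Number the joined rows per cno by ascending cn2, then keep row 1 of each.
dedup = conn.execute("""
    SELECT cno, cn1, cn2, d1, d2
    FROM (
        SELECT t2.cno, t2.cn1, t2.cn2, t1.d1, t1.d2,
               ROW_NUMBER() OVER (PARTITION BY t2.cno ORDER BY t2.cn2) AS rn
        FROM ta t1
        JOIN tb t2 ON t1.cno = t2.cno
    ) AS ranked
    WHERE rn = 1
    ORDER BY cno
""").fetchall()

for row in dedup:
    print(row)
# ('1234', 'a', 21, 'a', 2)
# ('4454', 'c', 37, 'g', 5)
# ('5456', 'c', 36, 'h', 4)
# ('6456', 'b', 33, 'j', 3)
```

Unlike independent min() aggregates, this keeps cn1 and cn2 from the same source row.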
select t2.cno, min (t2.cn1), min(t2.cn2), t1.d1 , t1.d2
from ta as t1
inner join tb as t2 on t1.cno=t2.cno
group by t2.cno, t1.d1 , t1.d2
WITH Queries (Common Table Expressions)
with cte as
(
select cno,cn1,cn2 from tb where cn2 in (select min(cn2) from tb group by cno)
),
cte1 as
(
select d1,d2,cno from ta where cno in (select cno from tb where cn2 in (select min(cn2) from tb group by cno))
)
select cte.cno,cn1,cn2,d1,d2 from cte inner join cte1 on cte1.cno = cte.cno
order by cte.cno

Bringing records in TableA Not in Table B Left Outer Join

I was asked the question below in an interview. I don't know whether it's possible to use a left outer join in this case.
CREATE TABLE TableA(Id INT, Name VARCHAR(255));
CREATE TABLE TableB(Id INT);
INSERT INTO TableA(Id, Name)
VALUES (1, 'Person A'),
(2, 'Person B'),
(3, 'Person C'),
(4, 'Person D'),
(5, 'Person E'),
(6, 'Person F');
INSERT INTO TableB(Id)
VALUES (1),
(2),
(3);
The output should be
Name
Person D
Person E
Person F
There are two tables, TableA and TableB. I want the names in TableA that are not in TableB. Is it possible to do this with a left outer join? With pen and paper I struggled for a few minutes and wrote a query that I later found was wrong.
Note: Please don't use a subquery. I did that, and the interviewer asked me to do it with a left outer join.
Let me know whether it's possible or not.
SQL Fiddle
Sounds easy enough
SELECT a.Name FROM TableA a
LEFT JOIN TableB b on b.Id=a.Id
WHERE b.ID is null
That should give you the rows in TableA that have no match in TableB. The trick is to realize that you're interested in the null rows on the B side.
You could also use a subquery:
SELECT Name FROM TableA
WHERE TableA.ID not in (SELECT TableB.ID From TableB)
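One difference between the two forms is worth a sketch: NOT IN returns no rows at all if the subquery yields a NULL, while the LEFT JOIN / IS NULL form is unaffected. The snippet below assumes SQLite via Python's sqlite3, and the extra NULL row in TableB is a hypothetical addition that is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Id INTEGER, Name TEXT);
    CREATE TABLE TableB (Id INTEGER);
    INSERT INTO TableA VALUES
        (1, 'Person A'), (2, 'Person B'), (3, 'Person C'),
        (4, 'Person D'), (5, 'Person E'), (6, 'Person F');
    -- Hypothetical: one NULL Id added to the question's TableB data.
    INSERT INTO TableB VALUES (1), (2), (3), (NULL);
""")

# "Id NOT IN (1, 2, 3, NULL)" is never true, so every row is filtered out.
not_in = conn.execute("""
    SELECT Name FROM TableA
    WHERE Id NOT IN (SELECT Id FROM TableB)
    ORDER BY Name
""").fetchall()

# The anti-join only asks whether a matching row exists; the NULL matches nothing.
anti_join = conn.execute("""
    SELECT a.Name
    FROM TableA a
    LEFT JOIN TableB b ON b.Id = a.Id
    WHERE b.Id IS NULL
    ORDER BY a.Name
""").fetchall()

print(not_in)     # []
print(anti_join)  # [('Person D',), ('Person E',), ('Person F',)]
```

If TableB.Id is declared NOT NULL, both queries should return the same three names.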

Help with INSERT INTO..SELECT

I'm inserting a large number of rows into Table_A. Table_A includes a B_ID column which points to Table_B.B_ID.
Table B has just two columns: Table_B.B_ID (the primary key) and Table_B.Name.
I know the value for every Table_A field I'm inserting except B_ID. I only know the corresponding Table_B.Name. So how can I insert multiple rows into Table_A?
Here's a pseudocode version of what I want to do:
REPLACE INTO Table_A (Table_A.A_ID, Table_A.Field, Table_A.B_ID) VALUES
(1, 'foo', (SELECT B_ID FROM Table_B WHERE Table_B.Name = 'A')),
(2, 'bar', (SELECT B_ID FROM Table_B WHERE Table_B.Name = 'B')),...etc
I've had to do things like this when deploying scripts to a production environment where Ids differed in environments. Otherwise it's probably easier to type out the ID's
REPLACE INTO table_a (table_a.a_id, table_a.field, table_a.b_id)
SELECT 1, 'foo', b_id FROM table_b WHERE name = 'A'
UNION ALL SELECT 2, 'bar', b_id FROM table_b WHERE name = 'B'
If the values:
(1, 'foo', 'A'),
(2, 'bar', 'B'),
come from a (SELECT ...)
you can use this:
INSERT INTO Table_A
( A_ID, Field, B_ID)
SELECT Data.A_ID
, Data.Field
, Table_B.B_ID
FROM (SELECT ...) As Data
JOIN Table_B
ON Table_B.Name = Data.Name
If not, you can insert them into a temporary table and then use the above, replacing (SELECT ...) with TemporaryTable.
CREATE TABLE HelpTable
( A_ID int
, Fld varchar(200)
, Name varchar(200)
) ;
INSERT INTO HelpTable
VALUES
(1, 'foo', 'A'),
(2, 'bar', 'B') -- , ...etc
;
INSERT INTO Table_A
( A_ID, Field, B_ID)
SELECT HelpTable.A_ID
, HelpTable.Fld
, Table_B.B_ID
FROM HelpTable
JOIN Table_B
ON Table_B.Name = HelpTable.Name
;
DROP TABLE HelpTable ;
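The helper-table approach can be sketched end to end as follows. This assumes SQLite via Python's sqlite3 instead of MySQL, and the Table_B ids 10 and 20 are made-up values for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_B (B_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table_A (A_ID INTEGER PRIMARY KEY, Field TEXT, B_ID INTEGER);
    -- Hypothetical ids; in practice these already exist in Table_B.
    INSERT INTO Table_B VALUES (10, 'A'), (20, 'B');

    -- Stage the rows with the name we know instead of the id we don't.
    CREATE TABLE HelpTable (A_ID INTEGER, Fld TEXT, Name TEXT);
    INSERT INTO HelpTable VALUES (1, 'foo', 'A'), (2, 'bar', 'B');

    -- Resolve each name to its B_ID while inserting.
    INSERT INTO Table_A (A_ID, Field, B_ID)
    SELECT HelpTable.A_ID, HelpTable.Fld, Table_B.B_ID
    FROM HelpTable
    JOIN Table_B ON Table_B.Name = HelpTable.Name;

    DROP TABLE HelpTable;
""")

inserted = conn.execute(
    "SELECT A_ID, Field, B_ID FROM Table_A ORDER BY A_ID"
).fetchall()
print(inserted)  # [(1, 'foo', 10), (2, 'bar', 20)]
```

Because this is an inner join, a staged row whose Name has no match in Table_B is silently skipped rather than inserted with a NULL B_ID.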