Select rows that are indirectly referenced in another table - mysql

PREPARATION
Consider this script to create a MySQL dummy-database:
CREATE SCHEMA `zzz_dummy` ;
CREATE TABLE `zzz_dummy`.`subtable2` (
`id` INT NOT NULL UNIQUE,
`col1` VARCHAR(45) NULL,
PRIMARY KEY (`id`));
CREATE TABLE `zzz_dummy`.`subtable1` (
`id` INT NOT NULL UNIQUE,
`ref_subtab2` INT NULL,
PRIMARY KEY (`id`));
CREATE TABLE `zzz_dummy`.`maintable` (
`id` INT NOT NULL UNIQUE,
`ref_subtab1` INT NULL,
PRIMARY KEY (`id`));
ALTER TABLE `zzz_dummy`.`maintable`
ADD INDEX `fk_subtab1_idx` (`ref_subtab1` ASC);
ALTER TABLE `zzz_dummy`.`maintable`
ADD CONSTRAINT `fk_subtab1`
FOREIGN KEY (`ref_subtab1`)
REFERENCES `zzz_dummy`.`subtable1` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION;
ALTER TABLE `zzz_dummy`.`subtable1`
ADD INDEX `fk_subtab2_idx` (`ref_subtab2` ASC);
ALTER TABLE `zzz_dummy`.`subtable1`
ADD CONSTRAINT `fk_subtab2`
FOREIGN KEY (`ref_subtab2`)
REFERENCES `zzz_dummy`.`subtable2` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION;
INSERT INTO zzz_dummy.subtable2 VALUES
(1,'ref_val_1'),
(2,'ref_val_2'),
(3,'no_ref');
INSERT INTO zzz_dummy.subtable1 VALUES
(1,'1'),
(2,'2'),
(3,'3');
INSERT INTO zzz_dummy.maintable VALUES
(1,'1'),
(2,'2'),
(3,'1'),
(4,'1'),
(5,'2'),
(6,'1');
This will produce the following tables and entries:
maintable:
+----+-------------+
| id | ref_subtab1 |
+----+-------------+
| 1 | 1 |
| 3 | 1 |
| 4 | 1 |
| 6 | 1 |
| 2 | 2 |
| 5 | 2 |
+----+-------------+
subtable1:
+----+-------------+
| id | ref_subtab2 |
+----+-------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+-------------+
subtable2:
+----+-----------+
| id | col1 |
+----+-----------+
| 1 | ref_val_1 |
| 2 | ref_val_2 |
| 3 | no_ref |
+----+-----------+
PROBLEM
As you can see, the column ref_subtab1 in maintable references id in subtable1, whose column ref_subtab2 in turn references id in subtable2. I want to select all rows in subtable2 that are indirectly referenced in this manner.
I have tried
SELECT subtable2.* FROM zzz_dummy.subtable2
INNER JOIN zzz_dummy.maintable
INNER JOIN zzz_dummy.subtable1
WHERE zzz_dummy.maintable.ref_subtab1=zzz_dummy.subtable1.id
AND zzz_dummy.subtable1.ref_subtab2=zzz_dummy.subtable2.id;
but this returns 6 results, one for every match in maintable:
+----+-----------+
| id | col1 |
+----+-----------+
| 1 | ref_val_1 |
| 1 | ref_val_1 |
| 1 | ref_val_1 |
| 1 | ref_val_1 |
| 2 | ref_val_2 |
| 2 | ref_val_2 |
+----+-----------+
I do not want redundant rows; I would like it to return:
+----+-----------+
| id | col1 |
+----+-----------+
| 1 | ref_val_1 |
| 2 | ref_val_2 |
+----+-----------+
Can this be done efficiently with a MySQL statement?

As already mentioned in the comments, use DISTINCT to get unique result rows, and move the conditions from the WHERE clause into JOIN ... ON conditions, like this:
SELECT DISTINCT subtable2.*
FROM zzz_dummy.subtable2
INNER JOIN zzz_dummy.subtable1
ON zzz_dummy.subtable1.ref_subtab2 = zzz_dummy.subtable2.id
INNER JOIN zzz_dummy.maintable
ON zzz_dummy.maintable.ref_subtab1 = zzz_dummy.subtable1.id;
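As an alternative to de-duplicating after the join, the same rows can be requested with an EXISTS test, so that each subtable2 row is returned at most once. This is only a sketch against the zzz_dummy schema from the question; the aliases s1, s2 and m are hypothetical names chosen here, and whether it is faster than DISTINCT depends on the data and the MySQL version.

```sql
-- Return each subtable2 row once if at least one maintable row
-- reaches it through subtable1.
SELECT s2.*
FROM zzz_dummy.subtable2 AS s2
WHERE EXISTS (
    SELECT 1
    FROM zzz_dummy.subtable1 AS s1
    INNER JOIN zzz_dummy.maintable AS m
        ON m.ref_subtab1 = s1.id
    WHERE s1.ref_subtab2 = s2.id
);
```

With the sample data above this returns the rows with id 1 and 2, because no maintable row points at subtable1 row 3.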

Related

Try to use Insert On Duplicate Key update command in MYSQL, but one table doesn't have PK

I have two tables:
MST table:
ID (PK INT)
Content (Varchar255)
NAME (Varchar255)
STG table:
ID (INT) * not Primary Key
Content (Varchar255)
NAME (Varchar255)
How can I insert data from the STG table into the MST table so that, if an ID already exists, the remaining columns are updated, and otherwise a new row is inserted?
It doesn't matter whether the source has a PK/UNIQUE key. Each inserted row is checked separately and independently, against the target table only.
But you must define a definite ordering of the source data, because the result depends on it.
DEMO fiddle - different ordering causes a different result.
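The point about ordering can be sketched as follows. The table definitions follow the question; the initial MST row and the choice of ORDER BY ID, Content are assumptions made purely for illustration, and the STG rows are borrowed from the sample data shown further down. VALUES() in the update clause is deprecated as of MySQL 8.0.20 but is still accepted.

```sql
-- Target table with a primary key, source table without one.
CREATE TABLE MST (ID INT PRIMARY KEY, Content VARCHAR(255), NAME VARCHAR(255));
CREATE TABLE STG (ID INT, Content VARCHAR(255), NAME VARCHAR(255));

-- Assumed starting state: ID 1 already exists in MST.
INSERT INTO MST VALUES (1, 'old', 'old name');
INSERT INTO STG VALUES (1, 'FUN', 'Swap'), (1, 'Vines', 'Swapnil'), (2, 'dint', 'Raj');

-- Each source row is checked against MST only. A row whose ID already
-- exists updates that MST row; any other row is inserted.
-- The ORDER BY decides which of the two STG rows with ID 1 is applied last.
INSERT INTO MST (ID, Content, NAME)
SELECT ID, Content, NAME
FROM STG
ORDER BY ID, Content
ON DUPLICATE KEY UPDATE
    Content = VALUES(Content),
    NAME    = VALUES(NAME);

SELECT * FROM MST ORDER BY ID;
```

Changing the ORDER BY (for example to ID, Content DESC) should leave a different row's values in MST for ID 1, which is why a definite ordering matters.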
I have a method that inserts the data into MTG without inserting duplicate IDs.
First, we create a table
create table MTG like STG;
This will create a table like STG, but won't insert the records in it.
Remember that the LIKE clause copies only the column definitions, their attributes, and the indexes, not the data.
desc MTG;
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| ID | int | YES | | NULL | |
| content | varchar(225) | YES | | NULL | |
| Name | varchar(225) | YES | | NULL | |
+---------+--------------+------+-----+---------+-------+
Then we modify the ID column of MTG to make it the primary key
alter table MTG modify id int primary key;
I have inserted some fake data in STG which is like:
select * from STG;
+------+---------+---------+
| ID | content | Name |
+------+---------+---------+
| 1 | FUN | Swap |
| 1 | Vines | Swapnil |
| 2 | dint | Raj |
+------+---------+---------+
As soon as we run the query below
insert ignore into MTG select * from STG;
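it will ignore rows whose ID is already present in MTG. Only the first row inserted for each ID is kept; later rows with the same ID are dropped rather than used to update it, as you can see: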
+----+---------+------+
| id | content | Name |
+----+---------+------+
| 1 | FUN | Swap |
| 2 | dint | Raj |
+----+---------+------+

Is there a way in MySQL to implicitly create a primary key for a table?

In MySQL, when running CREATE TABLE, is there a way for MySQL to implicitly create a column (i.e. a column not explicitly declared in the CREATE TABLE command) as the primary key of the table?
Thanks.
No, the PRIMARY KEY needs to be defined on the table.
You may be thinking about this, which applies to InnoDB engine:
If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB
internally generates a hidden clustered index named GEN_CLUST_INDEX on
a synthetic column containing row ID values. The rows are ordered by
the ID that InnoDB assigns to the rows in such a table. The row ID is
a 6-byte field that increases monotonically as new rows are inserted.
Thus, the rows ordered by the row ID are physically in insertion
order.
Below is an example that shows this index's creation for a table with no PRIMARY KEY and no UNIQUE column.
# Create the table
create table test.check_table (id int, description varchar(10)) ENGINE = INNODB;
# Verify that there is no primary or unique column
desc test.check_table;
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| description | varchar(10) | YES | | NULL | |
+-------------+-------------+------+-----+---------+-------+
# Insert some values
insert into test.check_table values(1, 'value-1');
insert into test.check_table values(2, 'value-2');
insert into test.check_table values(null, 'value-3');
insert into test.check_table values(4, null);
insert into test.check_table values(1, 'value-1');
# Verify table
select * from test.check_table;
+------+-------------+
| id | description |
+------+-------------+
| 1 | value-1 |
| 2 | value-2 |
| NULL | value-3 |
| 4 | NULL |
| 1 | value-1 |
+------+-------------+
# Verify that the GEN_CLUST_INDEX index is auto-created.
select * from INFORMATION_SCHEMA.INNODB_INDEX_STATS where TABLE_SCHEMA='test' and TABLE_NAME = 'check_table';
+--------------+-------------+-----------------+--------+--------------+-------------------+------------------+
| table_schema | table_name | index_name | fields | rows_per_key | index_total_pages | index_leaf_pages |
+--------------+-------------+-----------------+--------+--------------+-------------------+------------------+
| test | check_table | GEN_CLUST_INDEX | 1 | 5 | 1 | 1 |
+--------------+-------------+-----------------+--------+--------------+-------------------+------------------+
# Duplicate rows are still allowed (no PRIMARY KEY constraint is enforced)
insert into test.check_table values(5, 'value-5');
insert into test.check_table values(1, 'value-1');
select * from test.check_table;
+------+-------------+
| id | description |
+------+-------------+
| 1 | value-1 |
| 2 | value-2 |
| NULL | value-3 |
| 4 | NULL |
| 1 | value-1 |
| 5 | value-5 |
| 1 | value-1 |
+------+-------------+
In contrast, a table with a PRIMARY KEY specified gets an index named PRIMARY.
# Create another table
create table test.check_table_2 (id int, description varchar(10), PRIMARY KEY(id)) ENGINE = INNODB;
# Verify primary key column
desc check_table_2;
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | 0 | |
| description | varchar(10) | YES | | NULL | |
+-------------+-------------+------+-----+---------+-------+
# Verify index
select * from INFORMATION_SCHEMA.INNODB_INDEX_STATS where TABLE_SCHEMA='test' and TABLE_NAME = 'check_table_2';
+--------------+---------------+------------+--------+--------------+-------------------+------------------+
| table_schema | table_name | index_name | fields | rows_per_key | index_total_pages | index_leaf_pages |
+--------------+---------------+------------+--------+--------------+-------------------+------------------+
| test | check_table_2 | PRIMARY | 1 | 0 | 1 | 1 |
+--------------+---------------+------------+--------+--------------+-------------------+------------------+
# Primary key is enforced
insert into check_table_2 values(1,'value-1');
OK
insert into check_table_2 values(1,'value-1');
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
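Since the key has to be declared, one common way to get a key without choosing a natural one is to add a surrogate AUTO_INCREMENT column explicitly. The following is a minimal sketch, assuming the test schema used above; the table name check_table_3 and the column name row_id are hypothetical.

```sql
-- A table created without any key, containing duplicate rows.
CREATE TABLE test.check_table_3 (id INT, description VARCHAR(10)) ENGINE = INNODB;
INSERT INTO test.check_table_3 VALUES (1, 'value-1'), (1, 'value-1');

-- Explicitly add a surrogate primary key; MySQL numbers the existing rows.
ALTER TABLE test.check_table_3
    ADD COLUMN row_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;

-- The table now has an index named PRIMARY on row_id.
SHOW INDEX FROM test.check_table_3;
SELECT * FROM test.check_table_3;
```

The duplicate (id, description) pairs remain, because only row_id is constrained to be unique.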

How to filter repeated rows when joining more than two tables

I have an event table that has a one-to-many relation with more than two tables (Ticket, Guest, Sponsor). Each of them is related to the event table through a foreign key constraint.
When I use a left join with USING(EVENT_ID), records from the smaller tables repeat until the row count of the largest table's result set is reached.
Is there a way to make those repeated records from the other tables NULL instead?
These are my tables. For space and simplicity, I only show the columns that are relevant to my question.
CREATE TABLE event(
EVENT_ID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(30) NOT NULL
) ;
CREATE TABLE eventSponsor (
SPONSOR_ID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
EVENT_ID INTEGER NOT NULL,
name VARCHAR(50) NOT NULL,
FOREIGN KEY fk_key(EVENT_ID)
REFERENCES event(EVENT_ID)
ON UPDATE CASCADE
ON DELETE CASCADE
)ENGINE=InnoDB;
CREATE TABLE eventGuest(
GUEST_ID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(30) NOT NULL,
EVENT_ID INTEGER NOT NULL,
FOREIGN KEY fk_guest(EVENT_ID)
REFERENCES event(EVENT_ID)
ON UPDATE CASCADE
ON DELETE CASCADE
)ENGINE=InnoDB;
CREATE TABLE eventTicket(
TICKET_ID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
EVENT_ID INTEGER NOT NULL,
name VARCHAR(20) NOT NULL,
FOREIGN KEY fk_ticket(EVENT_ID)
REFERENCES event(EVENT_ID)
ON UPDATE CASCADE
ON DELETE CASCADE
)ENGINE=InnoDB;
This is my join:
SELECT `event`.`EVENT_ID`, `event`.`name` AS `eventName`,
`eventGuest`.`GUEST_ID`, `eventGuest`.`first_name` AS 'guestName',
`eventSponsor`.`SPONSOR_ID`, `eventSponsor`.`name` AS 'sponsorName',
`eventTicket`.`TICKET_ID`, `eventTicket`.`name` AS 'ticketName'
FROM `event`
RIGHT JOIN `eventTicket` USING(`EVENT_ID`)
LEFT JOIN `eventGuest` USING(`EVENT_ID`)
LEFT JOIN `eventSponsor` USING(`EVENT_ID`)
WHERE `event`.`EVENT_ID` = in_eventId;
and I get this result:
EVENT_ID |eventName | GUEST_ID | guestName | SPONSOR_ID | sponsorName | TICKET_ID | ticketName
1 | event1 | 1 | guest1 | 1 | sponsor1 | 1 | ticket1 |
1 | event1 | 2 | guest2 | 2 | sponsor2 | 2 | ticket2 |
1 | event1 | 1 | guest1 | 3 | sponsor3 | 3 | ticket3 |
1 | event1 | 2 | guest2 | 1 | sponsor1 | 4 | ticket4 |
Since there are 4 tickets associated with that event but only 2 guests and 3 sponsors, the guest and sponsor records repeat until they fill as many rows as there are tickets.
What I want is something like this:
EVENT_ID |eventName | GUEST_ID | guestName | SPONSOR_ID | sponsorName | TICKET_ID | ticketName
1 | event1 | 1 | guest1 | 1 | sponsor1 | 1 | ticket1 |
1 | event1 | 2 | guest2 | 2 | sponsor2 | 2 | ticket2 |
1 | event1 | null | null | 3 | sponsor3 | 3 | ticket3 |
1 | event1 | null | null | null | null | 4 | ticket4 |
Is it possible to tweak my join to get this result, or is the repetition the default behaviour? If it is, what workaround do you suggest? FYI, I'm writing this SQL inside a stored procedure.
Thank you in advance.
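One possible way to get the second shape is to number the rows of each child table per event and line them up by that number. This is only a sketch under several assumptions: it needs MySQL 8.0 or later (CTEs and window functions), the CTE names t, g, s, n and the column rn are hypothetical, and the literal 1 stands in for the in_eventId procedure parameter.

```sql
WITH
t AS (SELECT EVENT_ID, TICKET_ID, name,
             ROW_NUMBER() OVER (PARTITION BY EVENT_ID ORDER BY TICKET_ID) AS rn
      FROM eventTicket),
g AS (SELECT EVENT_ID, GUEST_ID, first_name,
             ROW_NUMBER() OVER (PARTITION BY EVENT_ID ORDER BY GUEST_ID) AS rn
      FROM eventGuest),
s AS (SELECT EVENT_ID, SPONSOR_ID, name,
             ROW_NUMBER() OVER (PARTITION BY EVENT_ID ORDER BY SPONSOR_ID) AS rn
      FROM eventSponsor),
-- Every row position that occurs in at least one child table.
n AS (SELECT EVENT_ID, rn FROM t
      UNION
      SELECT EVENT_ID, rn FROM g
      UNION
      SELECT EVENT_ID, rn FROM s)
SELECT e.EVENT_ID, e.name AS eventName,
       g.GUEST_ID, g.first_name AS guestName,
       s.SPONSOR_ID, s.name AS sponsorName,
       t.TICKET_ID, t.name AS ticketName
FROM `event` AS e
INNER JOIN n ON n.EVENT_ID = e.EVENT_ID
-- A child table with fewer rows than rn simply contributes NULLs.
LEFT JOIN g ON g.EVENT_ID = n.EVENT_ID AND g.rn = n.rn
LEFT JOIN s ON s.EVENT_ID = n.EVENT_ID AND s.rn = n.rn
LEFT JOIN t ON t.EVENT_ID = n.EVENT_ID AND t.rn = n.rn
WHERE e.EVENT_ID = 1   -- stands in for in_eventId
ORDER BY n.rn;
```

Note that the pairing within a row is purely positional: guest 1 and ticket 1 share a row only because both are first in their own table, not because they are related.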

Why does this query use a column of the parent table in the subquery?

create table teacher (
id int,
name varchar(126),
primary key(id)
)DEFAULT CHARSET=utf8 ENGINE=InnoDB;
create table class (
id int,
name varchar(126),
t_id int,
primary key(id),
foreign key (t_id) references teacher(id) on delete cascade on update cascade
)DEFAULT CHARSET=utf8 ENGINE=InnoDB;
Data in the tables:
mysql> select * from teacher ;
+----+------+
| id | name |
+----+------+
| 1 | yang |
| 2 | yan |
+----+------+
2 rows in set (0.00 sec)
mysql> select * from class ;
+----+---------+------+
| id | name | t_id |
+----+---------+------+
| 1 | math | 1 |
| 2 | english | 2 |
+----+---------+------+
2 rows in set (0.00 sec)
The SQL I execute:
mysql> select * from class where t_id = (select t_id from teacher where name = 'yang');
+----+---------+------+
| id | name | t_id |
+----+---------+------+
| 1 | math | 1 |
| 2 | english | 2 |
+----+---------+------+
How is this SQL executed, and how is t_id resolved in the subquery?
PS: this SQL is not what I intended; it is a buggy query, but I want to know how MySQL runs it.
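The behaviour follows from standard SQL name resolution: teacher has no t_id column, so the unqualified name is looked up in the outer query and resolves to class.t_id, which turns the subquery into a correlated one. A short sketch using the tables above makes this visible by qualifying every column.

```sql
-- What the original statement effectively means: for each class row the
-- subquery returns that row's own t_id, so the comparison is always true.
SELECT *
FROM class
WHERE class.t_id = (SELECT class.t_id
                    FROM teacher
                    WHERE teacher.name = 'yang');

-- Probably the intended query: compare against the teacher's id.
SELECT *
FROM class
WHERE class.t_id = (SELECT teacher.id
                    FROM teacher
                    WHERE teacher.name = 'yang');
```

With the sample data, the second query returns only the math row. Writing teacher.t_id in the subquery instead would fail with an unknown-column error, which is a quick way to catch this kind of mistake.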

How to prevent Duplicate records from my table Insert ignore does not work here

mysql> select * from emp;
+-----+---------+------+------+------+
| eno | ename | dno | mgr | sal |
+-----+---------+------+------+------+
| 1 | rama | 1 | NULL | 2000 |
| 2 | kri | 1 | 1 | 3000 |
| 4 | kri | 1 | 2 | 3000 |
| 5 | bu | 1 | 2 | 2000 |
| 6 | bu | 1 | 1 | 2500 |
| 7 | raa | 2 | NULL | 2500 |
| 8 | rrr | 2 | 7 | 2500 |
| 9 | sita | 2 | 7 | 1500 |
| 10 | dlksdgj | 2 | 2 | 2000 |
| 11 | dlksdgj | 2 | 2 | 2000 |
| 12 | dlksdgj | 2 | 2 | 2000 |
| 13 | dlksdgj | 2 | 2 | 2000 |
| 14 | dlksdgj | 2 | 2 | 2000 |
+-----+---------+------+------+------+
Here is my table. I want to prevent the insertion of duplicate records. Because the eno field is auto-increment, a whole row is never a duplicate, but the remaining columns are. How can I prevent such duplicate records from being inserted?
I tried using INSERT IGNORE and ON DUPLICATE KEY UPDATE (I think I have not used them properly).
The way I used them is:
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000);
Query OK, 1 row affected (0.03 sec)
mysql> insert ignore into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000);
Query OK, 1 row affected (0.03 sec)
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000) ON DUPLICATE KEY UPDATE eno=eno;
Query OK, 1 row affected (0.03 sec)
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000) ON DUPLICATE KEY UPDATE eno=eno;
Query OK, 1 row affected (0.04 sec)
mysql> desc emp;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| eno | int(11) | NO | PRI | NULL | auto_increment |
| ename | varchar(50) | YES | | NULL | |
| dno | int(11) | YES | | NULL | |
| mgr | int(11) | YES | MUL | NULL | |
| sal | int(11) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
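Alter the table by adding a UNIQUE constraint:
ALTER TABLE emp ADD CONSTRAINT emp_unique UNIQUE (ename,dno,mgr,sal);
This only succeeds if the table contains no rows that already violate the constraint (for example, if it is empty).
If duplicate records already exist, try adding IGNORE (ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this works only on older versions):
ALTER IGNORE TABLE emp ADD CONSTRAINT emp_unique UNIQUE (ename,dno,mgr,sal);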
UPDATE 1
Something went wrong, I guess. You only need to add a unique constraint on the column ename, since eno will always be unique due to AUTO_INCREMENT.
In order to add the unique constraint, you need to clean up your table first.
The queries below delete the duplicate records, keeping the row with the lowest eno for each ename, and then alter the table by adding a unique constraint on column ename.
DELETE a
FROM emp a
LEFT JOIN
(
SELECT ename, MIN(eno) minEno
FROM emp
GROUP BY ename
) b ON a.eno = b.minEno
WHERE b.minEno IS NULL;
ALTER TABLE emp ADD CONSTRAINT emp_unique UNIQUE (ename);
Here's a full demonstration
SQLFiddle Demo
Create a UNIQUE constraint on the columns in which you think the duplication exists, for example:
ALTER TABLE MYTABLE ADD CONSTRAINT constraint1 UNIQUE(column1, column2, column3)
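The attempts in the question had no effect because the only key on emp was the auto-increment eno, which never collides. Once a unique key covers the data columns, INSERT IGNORE and ON DUPLICATE KEY UPDATE have something to act on. The sketch below uses a scratch table so that it can be run on its own; the names emp_demo and emp_demo_unique are hypothetical.

```sql
-- Scratch copy of the emp layout with a unique key on the data columns.
CREATE TABLE emp_demo (
    eno   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ename VARCHAR(50),
    dno   INT,
    mgr   INT,
    sal   INT,
    UNIQUE KEY emp_demo_unique (ename, dno, mgr, sal)
);

-- First insert succeeds.
INSERT INTO emp_demo (ename, dno, mgr, sal) VALUES ('dlksdgj', 2, 2, 2000);

-- Same values again: skipped with a warning instead of being inserted.
INSERT IGNORE INTO emp_demo (ename, dno, mgr, sal) VALUES ('dlksdgj', 2, 2, 2000);

-- Same values again: the existing row is updated (here to itself).
INSERT INTO emp_demo (ename, dno, mgr, sal) VALUES ('dlksdgj', 2, 2, 2000)
ON DUPLICATE KEY UPDATE sal = sal;

SELECT COUNT(*) FROM emp_demo;   -- 1
```

One caveat: a unique index treats NULLs as distinct, so rows with NULL in any of the key columns (such as mgr in the sample data) are never considered duplicates of each other.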
This will work regardless of whether you clean up your table first (i.e. you can stop inserting duplicates immediately and clean up on separate schedule) and without having to add any unique constraints or altering table in any other way:
INSERT INTO
emp (ename, dno, mgr, sal)
SELECT
e.ename, 2, 2, 2000
FROM
(SELECT 'dlksdgj' AS ename) e
LEFT JOIN emp ON e.ename = emp.ename
WHERE
emp.ename IS NULL
The above query assumes you want to use ename as a "unique" field, but in the same way you could define any other fields or their combinations as unique for the purposes of this INSERT.
It works because it's an INSERT ... SELECT format where the SELECT part only produces a row (i.e. something to insert) if its left joined emp does not already have that value. Naturally, if you wanted to change which field(s) defined this "uniqueness" you would modify the SELECT and the LEFT JOIN accordingly.