MySQL Combining data from 2 tables into 3rd table - mysql

I currently have 2 tables:
CrawlData
id (autoincrement), Source, Destination, and some more columns
Nodes
id (autoincrement), URL
The Nodes table contains the distinct Source values from CrawlData
Now I would like to have a lookup table that contains the IDs from Nodes instead of the text values in Source and Destination from CrawlData.
I can get all the IDs with a SELECT query using a JOIN on URL = Source and URL = Destination, but I don't know how to combine these and then get them into a new table Edges with 2 columns:
SourceNode (= id from Nodes where CrawlData.Source = URL)
DestinationNode (= id from Nodes where CrawlData.Destination = URL)

You can INSERT the records returned by a SELECT statement using an INSERT INTO...SELECT statement.
INSERT INTO Edges(SourceNode, DestinationNode)
SELECT b.ID SourceNode,
c.ID DestinationNode
FROM CrawlData a
INNER JOIN Nodes b
ON a.Source = b.URL
INNER JOIN Nodes c
ON a.Destination = c.URL
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
For faster execution, run the following statement to add an INDEX on the join column. This avoids a FULL TABLE SCAN, which can be slow on large tables.
ALTER TABLE Nodes ADD INDEX (URL);
If all values of the Source and Destination columns are present in Nodes.URL, declare these columns as foreign keys:
ALTER TABLE CrawlData
ADD CONSTRAINT cd_fk1 FOREIGN KEY (Source) REFERENCES Nodes(URL);
ALTER TABLE CrawlData
ADD CONSTRAINT cd_fk2 FOREIGN KEY (Destination) REFERENCES Nodes(URL);
Otherwise, add normal indexes on them:
ALTER TABLE CrawlData ADD INDEX (Source);
ALTER TABLE CrawlData ADD INDEX (Destination);

You can join twice to the Nodes table. The first time, use Source to join to URL. The second time, use Destination.
Conceptually, it is like using two copies of the Nodes table, each with a different name (say "S" and "D"). You get:
select S.ID As SOURCE_ID, D.ID As DEST_ID
from CrawlData
join Nodes S on Source = S.URL
join Nodes D on Destination = D.URL
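The INSERT INTO...SELECT with two joins to Nodes can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the sample URLs and the helper name build_edges are hypothetical.

```python
import sqlite3


def build_edges(conn):
    """Fill Edges with the Nodes ids looked up for each CrawlData row."""
    conn.execute(
        """
        INSERT INTO Edges (SourceNode, DestinationNode)
        SELECT s.id, d.id
        FROM CrawlData AS c
        INNER JOIN Nodes AS s ON c.Source = s.URL
        INNER JOIN Nodes AS d ON c.Destination = d.URL
        """
    )
    return conn.execute(
        "SELECT SourceNode, DestinationNode FROM Edges "
        "ORDER BY SourceNode, DestinationNode"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CrawlData (
        id INTEGER PRIMARY KEY AUTOINCREMENT, Source TEXT, Destination TEXT);
    CREATE TABLE Nodes (id INTEGER PRIMARY KEY AUTOINCREMENT, URL TEXT UNIQUE);
    CREATE TABLE Edges (SourceNode INTEGER, DestinationNode INTEGER);

    -- Hypothetical sample data; c.example never appears as a Source.
    INSERT INTO CrawlData (Source, Destination) VALUES
        ('a.example', 'b.example'),
        ('b.example', 'a.example'),
        ('a.example', 'c.example');

    -- Nodes holds only the distinct Source values, as in the question.
    INSERT INTO Nodes (URL) VALUES ('a.example'), ('b.example');
    """
)

edges = build_edges(conn)
print(edges)
```

Because both joins are INNER JOINs, a CrawlData row whose Destination has no matching row in Nodes (here c.example) produces no edge at all. If such rows should be kept, Nodes would first need the missing Destination values added.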

Related

Delete entry that is connected to 2 tables

Table 1 is called (athlete) and table 2 is called (training_session). The primary key of table 1 is ID, and table 2 has the primary key Athelete_id.
I want to delete a person from my database by using his name, which I've called "Pet". However, he is also connected to another table which stores his training sessions. So (ID 1) in table 1 is connected to table 2 (athlete id 1).
I have struggled with this a lot. I tried using INNER JOIN:
DELETE athlete,training_session FROM athlete
INNER JOIN
training_session ON training_session.id = athlete.name
WHERE
athlete.name = "Pet;
Something is wrong with my syntax. Is it correct to use INNER JOIN, or have I misunderstood?
You should set up a foreign key constraint with cascading deletes to simplify the logic. Then all you would need is to delete from athlete, so I would suggest you add it.
For more info you can take a look at:
http://www.mysqltutorial.org/mysql-on-delete-cascade/
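As a rough illustration of the cascade suggestion, the sketch below uses Python's sqlite3 module instead of MySQL (an assumption for the sake of a runnable example). The column name athlete_id and the second athlete 'Ann' are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; in MySQL the equivalent requirement is a storage engine that enforces them, such as InnoDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE athlete (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE training_session (
        id         INTEGER PRIMARY KEY,
        athlete_id INTEGER NOT NULL
            REFERENCES athlete(id) ON DELETE CASCADE
    );
    INSERT INTO athlete (id, name) VALUES (1, 'Pet'), (2, 'Ann');
    INSERT INTO training_session (athlete_id) VALUES (1), (1), (2);
    """
)

# Deleting the parent row is enough; the child rows follow automatically.
with conn:
    conn.execute("DELETE FROM athlete WHERE name = ?", ("Pet",))

remaining_athletes = conn.execute("SELECT name FROM athlete").fetchall()
remaining_sessions = conn.execute(
    "SELECT athlete_id FROM training_session"
).fetchall()
print(remaining_athletes, remaining_sessions)
```

With the constraint in place, the application never has to remember which child tables exist; the single DELETE on athlete stays correct even if more dependent tables are added later with the same ON DELETE CASCADE rule.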

Delete data table with left join

I do not understand MySQL delete when I need to delete data in a table with data from another table that depend on it.
For example, if I want to delete a row in table 'factory', all rows in table 'room' that depend on that row in 'factory' should also be deleted.
Fac_ID is the primary key in 'factory' and foreign key in 'room'
below is my SQL code.
DELETE * FROM factory
LEFT JOIN room ON room.Fac_ID = factory.Fac_ID
WHERE factory.Fac_ID = :Fac_ID
Can any one help me?
I think you need a separate delete for this.
First is to delete foreign data
delete from room where Fac_ID = :Fac_ID
Then delete primary data
delete from factory where Fac_ID = :Fac_ID
If your foreign key is instead defined with ON DELETE CASCADE (which requires a storage engine that enforces foreign keys, such as InnoDB), you only need to delete the primary data.
There is a small mistake in your query, which I think is causing the problem.
As I understand it, you have some records in the main table and some records in the referenced table. In some cases the main table has an id for which there is no entry in the reference table, and you used a left join to handle that case.
But in your query you wrote the reference table on the left, so it takes all of the records from the reference table, which behaves like an inner join in this case.
To correct this, swap the key ids in your join condition, or use a right join to select all records from the main table.
DELETE factory, room FROM factory
LEFT JOIN room ON factory.Fac_ID = room.Fac_ID -- here you applied left join
WHERE factory.Fac_ID = :Fac_ID
MySQL allows you to delete rows from multiple tables at the same time. The syntax is:
DELETE f, r
FROM factory f LEFT JOIN
room r
ON r.Fac_ID = f.Fac_ID
WHERE f.Fac_ID = :Fac_ID;
However, this is better set up as a cascading delete foreign key relationship between the two tables.
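The two-step approach suggested above (delete the dependent rows, then the primary row) is safest when both statements run in one transaction. The following is a minimal sketch under that assumption, using Python's sqlite3 module rather than MySQL; the columns Room_ID and name and the helper delete_factory are hypothetical.

```python
import sqlite3


def delete_factory(conn, fac_id):
    """Delete a factory and its rooms as one all-or-nothing unit."""
    # "with conn" commits both deletes together, or rolls both back on error.
    with conn:
        # Child rows first, so the foreign key is never violated.
        conn.execute("DELETE FROM room WHERE Fac_ID = ?", (fac_id,))
        conn.execute("DELETE FROM factory WHERE Fac_ID = ?", (fac_id,))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific enforcement switch
conn.executescript(
    """
    CREATE TABLE factory (Fac_ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE room (
        Room_ID INTEGER PRIMARY KEY,
        Fac_ID  INTEGER NOT NULL REFERENCES factory(Fac_ID)
    );
    INSERT INTO factory (Fac_ID, name) VALUES (1, 'North'), (2, 'South');
    INSERT INTO room (Fac_ID) VALUES (1), (1), (2);
    """
)

delete_factory(conn, 1)

factories_left = [
    row[0] for row in conn.execute("SELECT Fac_ID FROM factory ORDER BY Fac_ID")
]
rooms_left = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
print(factories_left, rooms_left)
```

The order matters here because the foreign key has no cascade rule: deleting the factory first would be rejected while rooms still point at it.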

Delete all not referenced (by foreign key) records from a table in mysql

I have an address table which is referenced from 6 other tables (an address is sometimes referenced by more than one of them). Some of those tables have around half a million records (and the address table has around 750000 records). I want a query that runs periodically and deletes all address records that are not referenced from any of those tables.
The following sub-query approach is not an option, because the query never finishes; the scope is too big.
delete from address where address_id not in (select ...)
and not in (select ...) and not in (select ...) ...
What I was hoping was that I could use the foreign key constraints and simply delete all records for which no constraint stops me (because nothing references the row). I could not find a way to do this (or is there one?). Does anybody have another good idea to tackle this problem?
You can try one of these approaches:
DELETE
address
FROM
address
LEFT JOIN other_table ON (address.id = other_table.ref_field)
LEFT JOIN other_table2 ON (address.id = other_table2.ref_field)
WHERE
other_table.id IS NULL AND other_table2.id IS NULL
OR
DELETE
FROM address A
WHERE NOT EXISTS (
SELECT 1
FROM other_table B
WHERE B.a_key = A.id
)
I always use this:
DELETE FROM some_table WHERE id NOT IN (SELECT id FROM other_table)
I'd do this by first creating a TEMPORARY TABLE (t) that is a UNION of the IDs in the 6 referencing tables, then run:
DELETE x FROM x LEFT JOIN t USING (ID) WHERE t.ID IS NULL;
Where x is the address table.
See 'Multiple-table syntax' here:
http://dev.mysql.com/doc/refman/5.0/en/delete.html
Obviously, your temporary table should have its PRIMARY KEY on ID. It may take some time to query and join, but I can't see a way round it. It should be optimized, unlike the multiple sub-query version.
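The temporary-table idea can be sketched as follows. This is not the MySQL multi-table DELETE from the answer: SQLite (used here through Python's sqlite3 module so the example runs as-is) has no DELETE ... JOIN, so the anti-join is written with NOT EXISTS against the temporary table instead. The referencing tables customer and supplier and their address_id columns are hypothetical, standing in for the six real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE address  (id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, address_id INTEGER);
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, address_id INTEGER);

    INSERT INTO address (id, street) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO customer (address_id) VALUES (1), (1), (NULL);
    INSERT INTO supplier (address_id) VALUES (3);
    """
)

conn.executescript(
    """
    -- Collect every referenced id once; the primary key keeps lookups cheap.
    CREATE TEMP TABLE referenced (id INTEGER PRIMARY KEY);

    INSERT INTO referenced (id)
        SELECT address_id FROM customer WHERE address_id IS NOT NULL
        UNION
        SELECT address_id FROM supplier WHERE address_id IS NOT NULL;

    -- Remove addresses that no referencing table points at.
    DELETE FROM address
    WHERE NOT EXISTS (
        SELECT 1 FROM referenced AS r WHERE r.id = address.id
    );

    DROP TABLE referenced;
    """
)

kept_ids = [row[0] for row in conn.execute("SELECT id FROM address ORDER BY id")]
print(kept_ids)
```

UNION (rather than UNION ALL) removes duplicate ids before they reach the temporary table, and the IS NOT NULL filters keep NULL references out of it.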

Populate foreign keys for subtype table already containing data

I have tables [Moulds], [Machines] and [SpareParts] each with different attributes/columns. I would like to make them into subtypes and create a supertype table for them called [Assets] so I can reference all of them together in a maintenance scheduling application.
The [Assets] table will simply contain columns [Asset_ID], [Asset_Type] and [Description]. [Asset_ID] is an identity PK, [Asset_Type] is an int (eg. Moulds = 1, Machines = 2, etc.) and [Description] will be taken from the subtype tables. I will add a column called [Asset_FK] to each of the subtype tables as a foreign key.
My problem is that each subtype table has hundreds to thousands of rows of data already in them. It would be unreasonable to manually create PK-FK for each existing record, but I'm uncertain of the SQL required to automate it.
For populating the [Assets] table, I currently have this:
DECLARE @AssetID TABLE (ID int)
INSERT INTO Assets (Assets.Description, Assets.Asset_Type)
OUTPUT Inserted.Asset_ID INTO @AssetID
SELECT IsNull(Moulds.Description,''), 5
FROM Moulds
But, I'm not sure how to update the FK in [Moulds] in the same query, or if this is even the right approach. Specifically, I'm not sure how to identify the row in subtypes I selected which I want to update.
To summarize my question, I have a blank supertype table and filled subtype tables. I want to populate the supertype table using the subtype tables and automatically fill in the FK values for the existing subtype records to link them. How can I do this using SQL (MS SQL Server 2008r2)?
Try this:
update m
set m.fkid = a.id
from moulds m
inner join assets a
on isnull(m.description,'') = a.description and a.Asset_Type = 5
inner join @AssetID a2 on a.id = a2.id
So, based on rs.'s answer, I came up with an idea. I add a temporary column to table [Assets] that stores the PK of table [Moulds] (or some other subtype table), use it for the update operations, then drop the column. It looks like this:
USE [Maintenance]
ALTER TABLE Assets
ADD Asset_FK int null
GO
DECLARE @AssetID TABLE (ID int)
INSERT INTO Assets (Description, Asset_Type, Asset_FK)
OUTPUT Inserted.Asset_ID INTO @AssetID
SELECT IsNull(Description,''), 5, Mould_PK
FROM Moulds
UPDATE m
SET m.Asset_ID = a.Asset_ID
FROM Moulds m
INNER JOIN Assets a
ON m.Mould_PK = a.Asset_FK AND a.Asset_Type = 5
INNER JOIN @AssetID a2 ON a.Asset_ID = a2.ID
GO
ALTER TABLE Assets
DROP COLUMN Asset_FK
Probably not the most elegant answer, but it seems simple and works.
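The temporary-column technique above can be sketched in a runnable form. The example uses Python's sqlite3 module rather than SQL Server (an assumption for portability), so it has no OUTPUT clause or table variable, uses IFNULL in place of IsNull, and writes the update as a correlated subquery. The sample rows are hypothetical. Joining on the carried-over primary key, rather than on Description, keeps rows with duplicate or NULL descriptions apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Assets (
        Asset_ID    INTEGER PRIMARY KEY AUTOINCREMENT,
        Asset_Type  INTEGER,
        Description TEXT
    );
    CREATE TABLE Moulds (
        Mould_PK    INTEGER PRIMARY KEY,
        Description TEXT,
        Asset_ID    INTEGER
    );
    -- Hypothetical existing data; two moulds share a description.
    INSERT INTO Assets (Asset_Type, Description) VALUES (2, 'Existing machine');
    INSERT INTO Moulds (Mould_PK, Description) VALUES
        (10, 'Bottle mould'), (11, NULL), (12, 'Bottle mould');
    """
)

conn.executescript(
    """
    -- Temporary column that remembers which subtype row each asset came from.
    ALTER TABLE Assets ADD COLUMN Asset_FK INTEGER;

    INSERT INTO Assets (Description, Asset_Type, Asset_FK)
        SELECT IFNULL(Description, ''), 5, Mould_PK FROM Moulds;

    -- Copy the generated Asset_ID back to the matching subtype row.
    UPDATE Moulds
    SET Asset_ID = (
        SELECT a.Asset_ID FROM Assets AS a
        WHERE a.Asset_FK = Moulds.Mould_PK AND a.Asset_Type = 5
    );
    -- The temporary column could be dropped afterwards; omitted here because
    -- older SQLite versions lack ALTER TABLE ... DROP COLUMN.
    """
)

linked = conn.execute(
    """
    SELECT m.Mould_PK, m.Asset_ID, IFNULL(m.Description, ''), a.Description
    FROM Moulds AS m JOIN Assets AS a ON a.Asset_ID = m.Asset_ID
    ORDER BY m.Mould_PK
    """
).fetchall()
print(linked)
```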

How to set a database integrity check on foreign keys referenced fields

I have four Database Tables like these:
Book
ID_Book |ID_Company|Description
BookExtension
ID_BookExtension | ID_Book| ID_Discount
Discount
ID_Discount | Description | ID_Company
Company
ID_Company | Description
Any BookExtension record via foreign keys points indirectly to two different ID_Company fields:
BookExtension.ID_Book references a Book record that contains a Book.ID_Company
BookExtension.ID_Discount references a Discount record that contains a Discount.ID_Company
Is it possible to enforce in Sql Server that any new record in BookExtension must have Book.ID_Company = Discount.ID_Company ?
In a nutshell, I want the following query to always return a count of 0:
SELECT count(*) from BookExtension
INNER JOIN Book ON BookExtension.ID_Book = Book.ID_Book
INNER JOIN Discount ON BookExtension.ID_Discount = Discount.ID_Discount
WHERE Book.ID_Company <> Discount.ID_Company
or, in plain English:
I don't want that a BookExtension record references a Book record of a Company and a Discount record of another different Company!
Unless I've misunderstood your intent, the general form of the SQL statement you'd use is
ALTER TABLE FooExtension
ADD CONSTRAINT your-constraint-name
CHECK (ID_Foo = ID_Bar);
That assumes existing data already conforms to the new constraint. If existing data doesn't conform, you can either fix the data (assuming it needs fixing), or you can limit the scope (probably) of the new constraint by also checking the value of ID_FooExtension. (Assuming you can identify "new" rows by the value of ID_FooExtension.)
Later . . .
Thanks, I did indeed misunderstand your situation.
As far as I know, you can't enforce that constraint the way you want to in SQL Server, because it doesn't allow SELECT queries within a CHECK constraint. (I might be wrong about that in SQL Server 2008.) A common workaround is to wrap a SELECT query in a function, and call the function, but that's not reliable according to what I've learned.
You can do this, though.
Create a UNIQUE constraint on Book (ID_Book, ID_Company). Part of it will look like UNIQUE (ID_Book, ID_Company).
Create a UNIQUE constraint on Discount (ID_Discount, ID_Company).
Add two columns to BookExtension: Book_ID_Company and Discount_ID_Company.
Populate those new columns.
Change the foreign key constraints in BookExtension. You want BookExtension (ID_Book, Book_ID_Company) to reference Book (ID_Book, ID_Company). Make a similar change for the foreign key referencing Discount.
Now you can add a check constraint to guarantee that BookExtension.Book_ID_Company is the same as BookExtension.Discount_ID_Company.
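The steps above can be sketched as a schema and exercised with a few inserts. The example runs on SQLite through Python's sqlite3 module rather than SQL Server (an assumption so it is self-contained), and it creates the tables from scratch instead of altering existing ones. The sample rows and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific enforcement switch
conn.executescript(
    """
    CREATE TABLE Company (ID_Company INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Book (
        ID_Book     INTEGER PRIMARY KEY,
        ID_Company  INTEGER NOT NULL REFERENCES Company(ID_Company),
        Description TEXT,
        UNIQUE (ID_Book, ID_Company)
    );
    CREATE TABLE Discount (
        ID_Discount INTEGER PRIMARY KEY,
        Description TEXT,
        ID_Company  INTEGER NOT NULL REFERENCES Company(ID_Company),
        UNIQUE (ID_Discount, ID_Company)
    );
    CREATE TABLE BookExtension (
        ID_BookExtension    INTEGER PRIMARY KEY,
        ID_Book             INTEGER NOT NULL,
        Book_ID_Company     INTEGER NOT NULL,
        ID_Discount         INTEGER NOT NULL,
        Discount_ID_Company INTEGER NOT NULL,
        FOREIGN KEY (ID_Book, Book_ID_Company)
            REFERENCES Book (ID_Book, ID_Company),
        FOREIGN KEY (ID_Discount, Discount_ID_Company)
            REFERENCES Discount (ID_Discount, ID_Company),
        CHECK (Book_ID_Company = Discount_ID_Company)
    );
    INSERT INTO Company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Book VALUES (100, 1, 'Acme book');
    INSERT INTO Discount VALUES (200, 'Acme discount', 1), (201, 'Globex discount', 2);
    """
)


def try_insert(book, book_company, discount, discount_company):
    """Return True if the BookExtension row is accepted, False if rejected."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO BookExtension "
                "(ID_Book, Book_ID_Company, ID_Discount, Discount_ID_Company) "
                "VALUES (?, ?, ?, ?)",
                (book, book_company, discount, discount_company),
            )
        return True
    except sqlite3.IntegrityError:
        return False


same_company = try_insert(100, 1, 200, 1)    # book and discount both company 1
mixed_company = try_insert(100, 1, 201, 2)   # rejected by the CHECK constraint
forged_company = try_insert(100, 2, 201, 2)  # rejected by the composite foreign key
print(same_company, mixed_company, forged_company)
```

The CHECK alone would not be enough, because a row could simply claim a matching company for the book. The composite foreign keys close that gap by tying each copied company id to the one actually stored in Book and Discount.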
I'm not sure how [in]efficient this would be but you could also use an indexed view to achieve this. It needs a helper table with 2 rows as CTEs and UNION are not allowed in indexed views.
CREATE TABLE dbo.TwoNums
(
Num int primary key
)
INSERT INTO TwoNums SELECT 1 UNION ALL SELECT 2
Then the view definition
CREATE VIEW dbo.ConstraintView
WITH SCHEMABINDING
AS
SELECT 1 AS Col FROM dbo.BookExtension
INNER JOIN dbo.Book ON dbo.BookExtension.ID_Book = Book.ID_Book
INNER JOIN dbo.Discount ON dbo.BookExtension.ID_Discount = Discount.ID_Discount
INNER JOIN dbo.TwoNums ON Num = Num
WHERE dbo.Book.ID_Company <> dbo.Discount.ID_Company
And a unique index on the View
CREATE UNIQUE CLUSTERED INDEX [uix] ON [dbo].[ConstraintView]([Col] ASC)