Vertically Merge Multiple Tables in MySQL by Joint Primary Key - mysql

I've got 3 MySQL MyISAM tables: table1, table2 and table3.
Each table has an ID column (ID, ID2, ID3 respectively), and different data columns.
For example table1 has [ID, Name, Birthday, Status, ...] columns,
table2 has [ID2, Country, Zip, ...],
table3 has [ID3, Source, Phone, ...]
you get the idea.
The ID, ID2, ID3 columns are common to all three tables... if there's an ID value in table1 it will also appear in table2 and table3. The number of rows in these tables is identical, about 10m rows in each table.
What I'd like to do is create a new table that contains (most of) the columns of all three tables and merge them into it.
The dates, for instance, must be converted because right now they're in VARCHAR YYYYMMDD format. Reading the MySQL manual I figured STR_TO_DATE() would do the job, but I don't know how to write the query itself in the first place so I have no idea how to integrate the date conversion.
So basically, after I create the new table (which I do know how to do), how can I merge the three tables into it, integrating into the query the date conversion?
Update:
The only thing that's unclear to me is how I can convert the dates within the query.
As far as I understand, the query should be something like this:
INSERT INTO [new table]
SELECT table1.ID, table1.Name, table1.Birthday, table2.Country, table3.Phone
FROM table1
INNER JOIN table2 ON table1.ID = table2.ID2
INNER JOIN table3 ON table1.ID = table3.ID3;
...but how can I convert the dates within it? Or for that matter, apply any function to a field before it's inserted? For instance how can I convert the Birthday field before inserting it using STR_TO_DATE()? Where do I put it?
STR_TO_DATE(table1.Birthday, '%Y%m%d')
[I figured I could just replace "table1.Birthday" with "STR_TO_DATE(table1.Birthday, ...)". Is that correct?]

Looks like you want an INSERT SELECT query along the lines of:
INSERT INTO [new table]
SELECT [values]
FROM table1
INNER JOIN table2 on table1.ID = table2.ID2
INNER JOIN table3 ON table1.ID = table3.ID3;
Where you fill in [new table] as the name of the new table and [values] as the values you want in the new table.
Here are the relevant parts of the manual for more details.
INSERT...SELECT syntax - for details of the INSERT SELECT statement
JOIN syntax - for details on JOINing tables in queries
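On the date question in the update: an expression can stand anywhere a plain column stands in the SELECT list, so the converted value is simply written in place of table1.Birthday. The following is a minimal sketch, not the asker's real schema. It runs against an in-memory SQLite database through Python's sqlite3 module so that it is self-contained; SQLite has no STR_TO_DATE(), so a substr() expression stands in for it, and in MySQL that position would hold STR_TO_DATE(table1.Birthday, '%Y%m%d'). The table name merged and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, Name TEXT, Birthday TEXT);
    CREATE TABLE table2 (ID2 INTEGER PRIMARY KEY, Country TEXT);
    CREATE TABLE table3 (ID3 INTEGER PRIMARY KEY, Phone TEXT);
    -- "merged" is a hypothetical name for the new table.
    CREATE TABLE merged (ID INTEGER PRIMARY KEY, Name TEXT, Birthday TEXT,
                         Country TEXT, Phone TEXT);
    INSERT INTO table1 VALUES (1, 'Ann', '19800214'), (2, 'Bob', '19751130');
    INSERT INTO table2 VALUES (1, 'US'), (2, 'DE');
    INSERT INTO table3 VALUES (1, '555-0100'), (2, '555-0101');
""")

# Naming the target columns keeps the statement valid even if the
# column order of the new table differs from the SELECT list.
conn.execute("""
    INSERT INTO merged (ID, Name, Birthday, Country, Phone)
    SELECT table1.ID,
           table1.Name,
           -- SQLite stand-in; in MySQL: STR_TO_DATE(table1.Birthday, '%Y%m%d')
           substr(table1.Birthday, 1, 4) || '-' ||
           substr(table1.Birthday, 5, 2) || '-' ||
           substr(table1.Birthday, 7, 2),
           table2.Country,
           table3.Phone
    FROM table1
    INNER JOIN table2 ON table1.ID = table2.ID2
    INNER JOIN table3 ON table1.ID = table3.ID3
""")

merged_rows = conn.execute("SELECT * FROM merged ORDER BY ID").fetchall()
print(merged_rows)
```

The same pattern applies to any other function: wrap the column in the SELECT list and leave the joins untouched.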

Related

Query with CONCAT inside JOIN is slow

I need to get different values from 3 different tables (table1, table2, table3) where the common value is a reference number. This number appears on all 3 tables, except on table3 where the number is divided on three different columns. I tried to make a LEFT OUTER JOIN concatenating these three columns to make the whole reference number, but the query becomes significantly slower. This is the part of the query where the issue is found:
SELECT t1.type AS type, t2.client AS client, t3.somenumber AS somenumber, t4.anothernumber AS anothernumber
FROM table1 t1 JOIN table2 t2 ON t1.somevalue = t2.somevalue
JOIN table4 t4 ON t4.reference_number = t1.reference_number -- Some validation I need to make on another table
-- Here's the problem. table3's values 1 through 3 make the reference number found in the other tables.
-- The CONCAT makes the query significantly slow.
LEFT OUTER JOIN table3 t3 ON CONCAT(t3.value1, t3.value2, t3.value3) = t1.reference_number
WHERE t1.date BETWEEN '2022-04-01' AND '2022-05-01'
AND t1.client IN ('client1', 'client2', 'client3', 'client4', 'client5')
GROUP BY t1.reference_number -- Group by the reference number
I tried making a view to create a column where the reference number is 'stored', but the query still takes a long time to run. Is there a way to optimize this?
Running on 10.3.32-MariaDB
The GROUP BY does not need CONCAT:
GROUP BY t3.value1, t3.value2, t3.value3
I don't understand why you tacked on t1.reference_number; it is either similar info or NULL. The NULL case might lead to extra groups, but it seems like a waste. (Add it on if necessary.)
Indexes:
t1: INDEX(date)
t1: INDEX(client, date)
t2: INDEX(somevalue, client)
t3: INDEX(value1, value2, value3)
t4: INDEX(reference_number)
Was t3.value a typo for t3.value3?
Consider getting rid of t4; you are not using any values from it. The only thing it is doing is to verify that table4 has a matching row.
What version of MySQL are you using?
It may be useful to add a VIRTUAL or PERSISTENT (generated) column defined as CONCAT(value1, value2, value3) and index it.
(And I agree with the Commenters that the "reference number" is ambiguous.)
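The generated-column suggestion can be sketched as follows. This is an illustration under stated assumptions, not a tuned solution: it uses Python's sqlite3 module with an in-memory SQLite database (generated columns need SQLite 3.31 or newer), the || operator replaces CONCAT(), and STORED plays the role of MariaDB's PERSISTENT. In MariaDB the column would instead be declared with AS (CONCAT(value1, value2, value3)) PERSISTENT. The column name reference_number on table3, the index name and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (reference_number TEXT, type TEXT);
    CREATE TABLE table3 (
        value1 TEXT, value2 TEXT, value3 TEXT, somenumber INTEGER,
        -- Hypothetical generated column holding the full reference number.
        reference_number TEXT
            GENERATED ALWAYS AS (value1 || value2 || value3) STORED
    );
    -- The join can now compare against an indexed column
    -- instead of evaluating the concatenation for every row.
    CREATE INDEX idx_t3_reference ON table3 (reference_number);
    INSERT INTO table1 VALUES ('AB12X', 'typeA'), ('CD34Y', 'typeB');
    INSERT INTO table3 (value1, value2, value3, somenumber)
        VALUES ('AB', '12', 'X', 7);
""")

rows = conn.execute("""
    SELECT t1.reference_number, t1.type, t3.somenumber
    FROM table1 t1
    LEFT OUTER JOIN table3 t3 ON t3.reference_number = t1.reference_number
    ORDER BY t1.reference_number
""").fetchall()
print(rows)
```

Whether the optimizer actually uses the new index on the real data should still be confirmed with EXPLAIN on the MariaDB server.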

In SQL when we join tables, is there a name for the new table?

E.g. in Pandas, we can apply a mask and create a new dataframe and assign it a name. Similarly in SQL, once I do a LEFT JOIN of 2 tables, is there a way to refer to the new combined table ?
You can join two tables, get the result as a new combined table, and give that table a name. Try this query, and feel free to ask if anything is unclear.
MYSQL QUERY
EMP(C1, C2, CD1)
DEPT(D1, D2)
SELECT NEWTABLE.First, NEWTABLE.Third
FROM
(SELECT E.C1 AS First, E.C2 AS Second, D.D2 AS Third FROM EMP E, DEPT D WHERE
E.CD1 = D.D1) NEWTABLE
WHERE NEWTABLE.Second > 20;
We have created a virtual table called "NEWTABLE"; you can give it any name you like.
(SELECT E.C1 AS First, E.C2 AS Second, D.D2 AS Third FROM EMP E, DEPT D WHERE
E.CD1 = D.D1)
This is the subquery where the join is applied; it selects three columns from the two tables and renames them as "First", "Second" and "Third".
In case the outer query is unclear: the condition NEWTABLE.Second > 20 is applied to the new table obtained from the join.
If anything about the query is still unclear, just ask.
The values in the new table are temporary; you can use it in that query only.
If you want to store the values permanently, you have to create a new table and fill it with the result of the join.
No, that won't work in SQL, at least not directly, but you can use a subquery:
SELECT aa.*
FROM
(SELECT t1.*,t2.* FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid) aa
or a view:
CREATE VIEW v AS SELECT t1.*,t2.* FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid;
A problem arises when both tables have columns with the same name, so check for this and, where column names clash, alias the second column.
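Both approaches, including the aliasing needed when column names clash, can be sketched like this. It is a small illustration run against an in-memory SQLite database through Python's sqlite3 module rather than against MySQL; the name column shared by both tables, the aliases name1 and name2, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Both tables have a "name" column, so SELECT t1.*, t2.* would clash.
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (refid INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO table2 VALUES (1, 'child-of-alpha');

    -- A view gives the joined result a reusable name.
    CREATE VIEW v AS
        SELECT t1.id, t1.name AS name1, t2.name AS name2
        FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid;
""")

# A derived table (subquery with alias "aa") names the join for one query only.
derived_rows = conn.execute("""
    SELECT aa.id, aa.name2
    FROM (SELECT t1.id, t1.name AS name1, t2.name AS name2
          FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid) aa
    WHERE aa.name2 IS NOT NULL
""").fetchall()

view_rows = conn.execute(
    "SELECT id, name1, name2 FROM v ORDER BY id"
).fetchall()
print(derived_rows)
print(view_rows)
```

The derived table exists only for the statement that contains it, while the view stays available until it is dropped.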

Merging 2 tables preserving the ID

I have a question about merging one table with another while preserving an ID in a database (I'm using MySQL). I have 2 tables: the first has an Item ID and a category and subcategory assigned to that ID. The second has an Item ID with all its characteristics, like name and other variables. How can I merge those two tables in a way that the ID corresponds to the correct item in the new table (that's the difficult part, I think)? Is it possible?
Thank you for all the help!
It's a very basic operation called Inner Join:
Select *
from table1
inner join table2
on table1.itemid = table2.itemid;
EDIT: As the OP wants to create a new table with the fields returned by the above query and insert the data into it, the following query inserts the data once the table has been created:
Insert into tablename
Select *
from table1
natural join table2;
Note: Make sure that the order and data types of the columns in the new table match those in the result of the above select query.
I'm assuming you want to create table from the combined results. See this page for details.
Basically you write and test the SQL query then CREATE TABLE table_name AS sql_query
create table new_item_table
as
select
a.item_id,
a.category,
a.subcategory,
b.item_name,
b.item_char_1,
b.item_char_2
from
item_category a inner join item_char b on a.item_id = b.item_id;
This will do:
select a.*, b.ItemName, b.ItemChar1, b.ItemChar2 from FirstTable a join SecondTable b on a.ItemId = b.ItemId;
Use a left join if some of the records are missing from the second table.
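To make the difference between the inner join and the left join concrete, here is a minimal sketch of CREATE TABLE ... AS SELECT with both join types. It runs on an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL; the table names inner_items and left_items and the sample rows are invented for illustration, and item 2 deliberately has no row in item_char.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_category (item_id INTEGER PRIMARY KEY,
                                category TEXT, subcategory TEXT);
    CREATE TABLE item_char (item_id INTEGER PRIMARY KEY, item_name TEXT);
    INSERT INTO item_category VALUES (1, 'tools', 'hammers'),
                                     (2, 'tools', 'saws');
    INSERT INTO item_char VALUES (1, 'claw hammer');

    -- Inner join: only items present in both tables are copied.
    CREATE TABLE inner_items AS
        SELECT a.item_id, a.category, a.subcategory, b.item_name
        FROM item_category a INNER JOIN item_char b ON a.item_id = b.item_id;

    -- Left join: every item from item_category is kept,
    -- with NULL where item_char has no matching row.
    CREATE TABLE left_items AS
        SELECT a.item_id, a.category, a.subcategory, b.item_name
        FROM item_category a LEFT JOIN item_char b ON a.item_id = b.item_id;
""")

inner_rows = conn.execute("SELECT * FROM inner_items ORDER BY item_id").fetchall()
left_rows = conn.execute("SELECT * FROM left_items ORDER BY item_id").fetchall()
print(inner_rows)
print(left_rows)
```

Because the join condition matches on the item ID, each row in the new table carries the category and the characteristics of the same item.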

What is better way to join in mysql?

I want to join 3 or more tables:
table1 - 1 thousand records
table2 - 100 thousand records
table3 - 10 million records
Which of the following performs best (speed-wise)?
Note: pk and fk are the primary and foreign keys of the respective tables, and FILTER_CONDITION1 and FILTER_CONDITION2 are row-restricting conditions of the kind normally found in a WHERE clause.
Case 1: taking the smaller tables first and joining the larger one later
Select table1.*,table2.*,table3.*
from table1
join table2
on table1.fk = table2.pk and FILTER_CONDITION1
join table3
on table2.fk = table3.pk and FILTER_CONDITION2
Case 2
Select table1.*,table2.*,table3.*
from table3
join table2
on table2.fk = table3.pk and FILTER_CONDITION2
join table1
on table1.fk = table2.pk and FILTER_CONDITION1
Case 3
Select table1.*,table2.*,table3.*
from table3
join table2
on table2.fk = table3.pk
join table1
on table1.fk = table2.pk
where FILTER_CONDITION1 and FILTER_CONDITION2
The cases you show are equivalent. What you are describing is in the end the same query and will be seen by the database as such: the database will make a query plan.
The best thing you can do is use EXPLAIN and check what your query actually does: this way you can see that they will probably be run the same way, and whether there is a bottleneck in there.
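That the three cases are the same query can at least be checked at the level of results. The sketch below runs all three forms on a tiny in-memory SQLite database through Python's sqlite3 module and compares the rows they return. It says nothing about the plan MySQL would choose, which is what EXPLAIN is for. The columns active and size, the concrete conditions substituted for FILTER_CONDITION1 and FILTER_CONDITION2, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table3 (pk INTEGER PRIMARY KEY, size INTEGER);
    CREATE TABLE table2 (pk INTEGER PRIMARY KEY, fk INTEGER, active INTEGER);
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, fk INTEGER);
    INSERT INTO table3 VALUES (1, 5), (2, 50);
    INSERT INTO table2 VALUES (10, 1, 1), (11, 2, 1), (12, 2, 0);
    INSERT INTO table1 VALUES (100, 10), (101, 11), (102, 12);
""")

# Assumed stand-ins: FILTER_CONDITION1 is "table2.active = 1",
# FILTER_CONDITION2 is "table3.size > 10".
SELECT_LIST = "SELECT table1.id, table2.pk, table3.pk "
cases = {
    "case1": SELECT_LIST + """
        FROM table1
        JOIN table2 ON table1.fk = table2.pk AND table2.active = 1
        JOIN table3 ON table2.fk = table3.pk AND table3.size > 10""",
    "case2": SELECT_LIST + """
        FROM table3
        JOIN table2 ON table2.fk = table3.pk AND table3.size > 10
        JOIN table1 ON table1.fk = table2.pk AND table2.active = 1""",
    "case3": SELECT_LIST + """
        FROM table3
        JOIN table2 ON table2.fk = table3.pk
        JOIN table1 ON table1.fk = table2.pk
        WHERE table2.active = 1 AND table3.size > 10""",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in cases.items()}
print(results)
```

For inner joins, a condition gives the same rows whether it sits in the ON clause or in WHERE; this equivalence does not carry over to outer joins.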
As @Nanne noted in his answer, MySQL normally picks the right join order by itself, but in rare cases it can read the tables in the wrong order and kill query performance. In that case you can try the following approach.
If you can filter the data from your bulky tables such as table2 and table3 (suppose only 500 records remain after joining these tables and applying the filter), filter that data first and then join the filtered result with your small table. This can improve performance, but there are various possible combinations, so you have to check which join order filters out the most rows. EXPLAIN will help you find that out, and indexes will help you get the filtered data quickly.
After that, you can tell MySQL to use the join order written in your query with the "SELECT STRAIGHT_JOIN ..." syntax, in the same way that MySQL sometimes does not pick the proper index and we have to use FORCE INDEX.

Getting a string from a referenced table

I am relatively new to the SQL language. I can do a basic select, but to improve performance, I'd love to know if it is possible to merge the two queries I am doing at the moment into one.
Scenario: There are two tables. Table one has a few columns; one of them is a VARCHAR(45) named 'user', and another is an INT called 'gid'. In the second table, there is a primary key column called 'gid' (INT) and a column called 'permissions', which is a TEXT column containing values separated by ';'.
I have a user name, and want the text in the permissions column. The current way I do it is by fetching the gid from the first table, then doing a second query with the gid to get the permissions.
I've heard there are other ways to do this, and I have searched on Google, but I'm not sure what I should do.
EDIT:
Like this:
select t2.permissions
from table1 t1, table2 t2
where t1.user = '<SPECIFIED NAME>'
and t1.gid = t2.gid;
or you could use INNER JOIN syntax:
select t2.permissions
from table1 t1
inner join table2 t2 on t1.gid = t2.gid
where t1.user = '<SPECIFIED VALUE>'
To do this you use a JOIN. A join connects two tables in a select statement.
Like this
select *
from usertable u
join permissiontable p on u.gid = p.gid
This will give you all the columns from both tables with the id column joined. You can treat the joined table just like any table (eg select a sub-set of columns in the select list, add a where clause, etc).
You can read more about joins in any intro sql book or doing a google search.
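From application code, the two round trips collapse into one call. The following is a minimal sketch, assuming an in-memory SQLite database accessed through Python's sqlite3 module in place of the real MySQL server; the function name permissions_for and the sample rows are hypothetical. The user name is passed as a bound parameter rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user TEXT, gid INTEGER);
    CREATE TABLE table2 (gid INTEGER PRIMARY KEY, permissions TEXT);
    INSERT INTO table2 VALUES (1, 'read;write'), (2, 'read');
    INSERT INTO table1 VALUES ('alice', 1), ('bob', 2);
""")


def permissions_for(conn, user_name):
    """Return the user's permissions as a list, using a single joined query."""
    row = conn.execute(
        """
        SELECT t2.permissions
        FROM table1 t1
        INNER JOIN table2 t2 ON t1.gid = t2.gid
        WHERE t1.user = ?
        """,
        (user_name,),
    ).fetchone()
    # The permissions column holds ';'-separated values; unknown users get [].
    return row[0].split(";") if row else []


print(permissions_for(conn, "alice"))
print(permissions_for(conn, "nobody"))
```

With a MySQL driver the query text stays the same apart from the placeholder style, which depends on the driver.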