MySQL two table dump, merge, between two instances - mysql

Is there a way to compare the data in two tables on two different database instances and then merge them,
so that both tables end up containing everything?
Thank You.

In MySQL you can select data across tables that are in different databases on the same server simply by qualifying the table names, using a query such as:
select A.col1, B.col1
from databaseA.tableA as A
inner join databaseB.tableB as B on A.colkey1 = B.colkey1
To merge, you can use JOIN or UNION depending on your needs,
and finally you can use an INSERT ... SELECT to populate the table you need.

If your table schemas are the same, you can just take a dump of one and do a simple INSERT INTO ... SELECT ... ON DUPLICATE KEY UPDATE <your key clash logic> to the other.
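As a rough illustration of the "copy whatever is missing in each direction" idea from the answers above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it can run anywhere; the table names and sample rows are made up, and SQLite's INSERT OR IGNORE plays the role that INSERT IGNORE would play in MySQL. Note that rows whose key already exists are left untouched, which is simpler than the ON DUPLICATE KEY UPDATE logic mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables with identical schemas
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO table_a VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO table_b VALUES (2, 'beta'), (3, 'gamma');
""")

# Copy rows whose key is missing on the other side;
# rows whose key already exists there are skipped.
conn.execute("INSERT OR IGNORE INTO table_a SELECT id, name FROM table_b")
conn.execute("INSERT OR IGNORE INTO table_b SELECT id, name FROM table_a")
conn.commit()

rows_a = conn.execute("SELECT id, name FROM table_a ORDER BY id").fetchall()
rows_b = conn.execute("SELECT id, name FROM table_b ORDER BY id").fetchall()
print(rows_a)
print(rows_b)
```

If the same key can carry different values in the two tables, you would still need to decide which side wins; this sketch does not address that.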

Related

MySQL : updating a table from another table by leftjoin vs iterating

I have two tables T1 and T2 and want to update one field of T1 from T2 where T2 holds massive data.
What is more efficient?
Updating T1 in a for loop iteration over the values
or
Left join it with T2 and update.
Please note that I'm updating these tables from a shell script.
In general, a JOIN will perform much better than a loop. The table size should not be an issue if the tables are properly indexed.
There is no simple answer as to which will be more efficient; it depends on the table sizes and on how much data you are going to update in one go.
Suppose you are using the InnoDB engine and frequently update 1,000 or more rows in one go by joining two heavy tables. That is not a good idea on a production server, because the update will hold locks for some time, and other operations on the server can be affected by that locking.
Option 1: If you are updating only a few rows, based on properly indexed fields (preferably the primary key), then you can go with the join.
Option 2: If you are updating a large amount of data based on a join of multiple tables, then the following approach will be better:
Step 1: Create a stored procedure.
Step 2: Keep the results of the query below in a cursor.
Suppose you want to copy field2 of table2 into field1 of table1:
SELECT a.primary_key, b.field2 FROM table1 a JOIN table2 b ON a.primary_key = b.foreign_key WHERE [place condition here, if any...];
Step 3: Now update the rows one by one by primary key, using the values stored in the cursor.
Step 4: You can call this stored procedure from your script.
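The idea behind Option 2 is to split one big update into many small ones so that each piece holds its locks only briefly. The sketch below is not the stored procedure described above; it is a client-side variant of the same idea that updates a range of primary keys per transaction. It uses Python's sqlite3 module as a stand-in for MySQL, reuses the table1/table2 column names from the query above, and the batch size and sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (primary_key INTEGER PRIMARY KEY, field1 TEXT);
    CREATE TABLE table2 (foreign_key INTEGER, field2 TEXT);
    CREATE INDEX idx_table2_fk ON table2 (foreign_key);
""")
conn.executemany("INSERT INTO table1 VALUES (?, NULL)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(i, f"value{i}") for i in range(1, 11)])
conn.commit()

BATCH_SIZE = 4  # hypothetical value; tune it to your workload
max_key = conn.execute("SELECT MAX(primary_key) FROM table1").fetchone()[0]
updated = 0
for start in range(1, max_key + 1, BATCH_SIZE):
    cur = conn.execute(
        """
        UPDATE table1
        SET field1 = (SELECT field2 FROM table2
                      WHERE foreign_key = table1.primary_key)
        WHERE primary_key BETWEEN ? AND ?
          AND EXISTS (SELECT 1 FROM table2
                      WHERE foreign_key = table1.primary_key)
        """,
        (start, start + BATCH_SIZE - 1),
    )
    conn.commit()  # one short transaction per key range
    updated += cur.rowcount

print(updated)
```

Whether this beats a single joined UPDATE depends on your data volume and on how much concurrent traffic the tables see, so measure before committing to either approach.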

Should I check if rows exist across tables before deleting in MyISAM?

Let's say I have 5 MyISAM tables in my database. Each table has a key, let's call it "id_num" ...
"id_num" is the field which I use to connect all the tables together. A certain value of "id_num" may appear in all tables or sometimes only a subset of the tables.
If I want to delete all instances of a certain "id_num" in the database, can I just make a DELETE command on all tables or should I check to see if that value for "id_num" exists?
DELETE FROM table1 WHERE id_num = 123;
DELETE FROM table2 WHERE id_num = 123;
DELETE FROM table3 WHERE id_num = 123;
DELETE FROM table4 WHERE id_num = 123;
DELETE FROM table5 WHERE id_num = 123;
Or should I perform a SELECT command first on each table to check if these rows exist in the table before deletion? What is best practice?
(I am using MyISAM so cascading delete is not an option here.)
To answer your question about first running SELECT, there's no advantage to doing so. If there's no row in a given table, then the DELETE will simply affect zero rows. If there are matching rows, then doing the SELECT first and then the DELETE would just be doing double the work of finding the rows. So just do the DELETE and get it over with.
Are you aware that MySQL has multi-table DELETE syntax?
If you are certain that table1 has a matching row, you can use outer joins for the others:
DELETE table1.*, table2.*, table3.*, table4.*, table5.*
FROM table1
LEFT OUTER JOIN table2 USING (id_num)
LEFT OUTER JOIN table3 USING (id_num)
LEFT OUTER JOIN table4 USING (id_num)
LEFT OUTER JOIN table5 USING (id_num)
WHERE table1.id_num = 123;
I'm assuming id_num is indexed in all these tables, otherwise doing the JOIN will perform poorly. But doing the DELETE without the aid of an index to find the rows would perform poorly too.
Sounds like you need to change your design as follows - have a table with id_num as a PK and make id_num a FK in the other tables, with on-delete-cascade. This will allow you to only run a single delete statement to delete all applicable data (and this is also generally the more correct way of doing things).
The above apparently doesn't work in MyISAM, but there is a workaround using triggers (but now it does seem like a less appealing option).
But I believe your above queries should work, no need to check if something exists first, DELETE will just not do anything.
Most APIs provide you with some sort of rows affected count if you'd like to see whether data was actually deleted.
You do not need to execute a SELECT query before deleting from the table, as the SELECT would only put extra load on the server. After executing the DELETE query, however, you can check how many rows have been deleted using the mysql_affected_rows() function in PHP.
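To make the "just DELETE and look at the affected-row count" advice concrete, here is a small sketch. It uses Python's sqlite3 module rather than MySQL/PHP so that it is self-contained; the two tables and their rows are made up, and cursor.rowcount is the Python DB-API counterpart of the affected-rows count mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_num INTEGER, payload TEXT);
    CREATE TABLE table2 (id_num INTEGER, payload TEXT);
    INSERT INTO table1 VALUES (123, 'a'), (123, 'b'), (456, 'c');
    INSERT INTO table2 VALUES (456, 'd');
""")

deleted = {}
for table in ("table1", "table2"):
    # Table names come from a fixed tuple, so building the SQL string is safe here.
    cur = conn.execute(f"DELETE FROM {table} WHERE id_num = ?", (123,))
    deleted[table] = cur.rowcount  # 0 simply means there was nothing to delete
conn.commit()

print(deleted)
```

The DELETE against table2 matches nothing and raises no error, so no prior SELECT is needed.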

Moving Data from one column to another similar column in another table MYSQL

I am currently working on a web-based system using a MySQL db.
I realised that I had initially set up the columns within the tables incorrectly, and
I now need to move the data from one table column (receiptno) in table (clients) into a similar table column (receiptno) in table (revenue).
I am still quite inexperienced with MySQL and therefore I don't know the MySQL syntax to accomplish this.
Can I get some help with it?
Thanks
If you simply wanted to insert the data into new records within the revenue table:
INSERT INTO revenue (receiptno) SELECT receiptno FROM clients;
However, if you want to update existing records in the revenue table with the associated data from the clients table, you would have to join the tables and perform an UPDATE:
UPDATE revenue JOIN clients ON **join_condition_here**
SET revenue.receiptno = clients.receiptno;
Learn about SQL joins.
Same smell, different odor to eggyal's answer; this works in Oracle and Postgres, so your mileage may vary.
UPDATE revenue t1 SET receiptno = (
SELECT receiptno FROM clients t2 WHERE t2.client_id = t1.revenue_id
);
You will have to adjust the where clause to suit your needs ...
INSERT INTO newtable (field1, field2, field3)
SELECT field1, field2, field3
FROM oldtable;

Data synchronization between tables in mysql

I have two tables (let's call them A and B) with the same structure and I need to synchronize the data in them...
There's one primary key field, with the same value in both tables, and several fields that have a value in table A and null (or an obsolete value that needs to be replaced with the current value from table A) in table B... I need to copy the values from table A to table B.
Is there any easy way (other than replication) to do this in MySQL 4.1?
thanks in advance
Try this -
UPDATE table_b b, table_a a
SET b.field1 = a.field1, b.field2 = a.field2
WHERE b.primary_key = a.primary_key
add the fields as required.
Can you just do:
INSERT INTO table1 (field1,field2,field3)
SELECT field1,field2,field3
FROM table2;
Or do you actually already have data in table2 and you need to update it rather than insert new rows?

SQL: Select Keys that doesn't exist in one table

I got a table with a normal setup of auto inc. ids. Some of the rows have been deleted so the ID list could look something like this:
(1, 2, 3, 5, 8, ...)
Then, from another source (Edit: Another source = NOT in a database) I have this array:
(1, 3, 4, 5, 7, 8)
I'm looking for a query I can use on the database to get the list of ID:s NOT in the table from the array I have. Which would be:
(4, 7)
Does such a query exist? My solution right now is either creating a temporary table so that the condition "WHERE table.id IS NULL" works, or, probably worse, using the PHP function array_diff to see what's missing after having retrieved all the ids from the table.
Since the list of ids is closing in on millions of rows, I'm eager to find the best solution.
Thank you!
/Thomas
Edit 2:
My main application is a rather simple table which is populated with a lot of rows. This application is administered using a browser and I'm using PHP as the interpreter for the code.
Everything in this table is to be exported to another system (which is a 3rd party product) and there's as yet no way of doing this besides manually using the import function in that program. It's also possible to insert new rows in the other system, although the agreed routine is to never ever do this.
The problem is then that my system cannot be 100 % sure that the user did everything correctly after he/she pressed the "export" key, or that no rows have ever been created in the other system.
From the other system I can get a CSV file containing all the rows that system has. So, by comparing the CSV file and my table I can see:
* If there are any rows missing in the other system that should have been imported
* If someone has created rows in the other system
The problem isn't "solving it". It's finding the best solution to it, since there is so much data in the rows.
Thanks again!
/Thomas
We can use MySQL's NOT IN operator.
SELECT id
FROM table_one
WHERE id NOT IN ( SELECT id FROM table_two )
Edited
If you are getting the source from a CSV file, then you can simply put these values directly into the query.
I am assuming that the CSV values look like 1,2,3,...,n
SELECT id
FROM table_one
WHERE id NOT IN ( 1,2,3,...,n );
EDIT 2
Or, if you want to select the other way around, you can use mysqlimport to import the data into a temporary table in the MySQL database, retrieve the result, and then delete the table.
Like:
Create the table
CREATE TABLE my_temp_table(
ids INT
);
Load the .csv file
LOAD DATA LOCAL INFILE 'yourIDs.csv' INTO TABLE my_temp_table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(ids);
Select the records
SELECT ids FROM my_temp_table
WHERE ids NOT IN ( SELECT id FROM table_one );
Drop the table
DROP TABLE IF EXISTS my_temp_table;
What about using a left join ; something like this :
select second_table.id
from second_table
left join first_table on first_table.id = second_table.id
where first_table.id is null
You could also go with a sub-query ; depending on the situation, it might, or might not, be faster, though :
select second_table.id
from second_table
where second_table.id not in (
select first_table.id
from first_table
)
Or with a not exists :
select second_table.id
from second_table
where not exists (
select 1
from first_table
where first_table.id = second_table.id
)
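Here is the left join variant run against the numbers from the question, as a minimal sketch. It uses Python's sqlite3 module instead of MySQL/PHP so that it is self-contained; first_table stands for the existing table, second_table is a temporary table loaded with the externally supplied ids, and both names are just the placeholders used in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table that already exists, with gaps left by deleted rows.
conn.execute("CREATE TABLE first_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO first_table VALUES (?)", [(i,) for i in (1, 2, 3, 5, 8)])

# Ids coming from outside the database, loaded into a temporary table.
external_ids = [1, 3, 4, 5, 7, 8]
conn.execute("CREATE TEMP TABLE second_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO second_table VALUES (?)", [(i,) for i in external_ids])

# Anti-join: keep the external ids that have no match in first_table.
missing = [row[0] for row in conn.execute("""
    SELECT second_table.id
    FROM second_table
    LEFT JOIN first_table ON first_table.id = second_table.id
    WHERE first_table.id IS NULL
    ORDER BY second_table.id
""")]

print(missing)
```

Swapping the roles of the two tables in the join gives the ids that are in the table but not in the external list.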
The function you are looking for is NOT IN (an alias for <> ALL)
The MYSQL documentation:
http://dev.mysql.com/doc/refman/5.0/en/all-subqueries.html
An Example of its use:
http://www.roseindia.net/sql/mysql-example/not-in.shtml
Enjoy!
The problem is that T1 could have a million rows or ten million rows, and that number could change, so you don't know how many rows your comparison table, T2, the one that has no gaps, should have, for doing a WHERE NOT EXISTS or a LEFT JOIN testing for NULL.
But the question is, why do you care if there are missing values? I submit that, when an application is properly architected, it should not matter if there are gaps in an autoincrementing key sequence. Even an application where gaps do matter, such as a check-register, should not be using an autoincrementing primary key as a synonym for the check number.
Care to elaborate on your application requirement?
OK, I've read your edits/elaboration. Synchronizing two databases where the second is not supposed to insert any new rows, but might do so, sounds like a problem waiting to happen.
Neither approach suggested above (WHERE NOT EXISTS or LEFT JOIN) is air-tight and neither is a way to guarantee logical integrity between the two systems. They will not let you know which system created a row in situations where both tables contain a row with the same id. You're focusing on gaps now, but another problem is duplicate ids.
For example, if both tables have a row with id 13887, you cannot assume that database1 created the row. It could have been inserted into database2, and then database1 could insert a new row using that same id. You would have to compare all column values to ascertain that the rows are the same or not.
I'd suggest therefore that you also explore GUID as a replacement for autoincrementing integers. You cannot prevent database2 from inserting rows, but at least with GUIDs you won't run into a problem where the second database has inserted a row and assigned it a primary key value that your first database might also use, resulting in two different rows with the same id. CreationDateTime and LastUpdateDateTime columns would also be useful.
However, a proper solution, if it is available to you, is to maintain just one database and give users remote access to it, for example, via a web interface. That would eliminate the mess and complication of replication/synchronization issues.
If a remote-access web-interface is not feasible, perhaps you could make one of the databases read-only? Or does database2 have to make updates to the rows? Perhaps you could deny insert privilege? What database engine are you using?
I have the same problem: I have a list of values from the user, and I want to find the subset that does not exist in another table. I did it in Oracle by building a pseudo-table in the select statement. Try it in MySQL without the "from dual":
-- find ids from user (1,2,3) that *don't* exist in my person table
-- build a pseudo table and join it with my person table
select pseudo.id from (
select '1' as id from dual
union select '2' as id from dual
union select '3' as id from dual
) pseudo
left join person
on person.person_id = pseudo.id
where person.person_id is null