How to sync two tables that are not identical? - mysql

I have two projects using the same data. However, this data is stored in two different databases. Each of these databases has a table that is almost the same as its counterpart in the other database.
What I am looking for
I am looking for a method to synchronise two tables. Put simply: if database_one.table gets an insert, that same record needs to be inserted into database_two.table.
Database and Table One
Table Products
| product_id | name | description | price | vat | flags |
Database and Table Two
Table Articles
| articleId | name_short | name | price | price_vat | extra_info | flags |
The issue
I have never used any method of database synchronisation and wouldn't know how to. What also worries me is that the tables are not identical, so I will somehow need to map columns to one another.
For example:
database_one.Products.name -> database_two.articles.name_short
Can someone help me with this?

You can use the MERGE statement (note that the linked example is for SQL Server; MySQL itself has no MERGE, but INSERT ... ON DUPLICATE KEY UPDATE achieves the same effect):
https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/
Then create a procedure that runs at the desired frequency, or, if it needs to be instant, put the merge logic into a trigger.
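Since MySQL lacks MERGE, the procedure could use INSERT ... ON DUPLICATE KEY UPDATE to the same effect. A sketch with the question's tables, assuming articleId is the unique key and corresponds to product_id:
-- upsert every product into its Articles counterpart
INSERT INTO database_two.Articles (articleId, name_short, price, flags)
SELECT p.product_id, p.name, p.price, p.flags
FROM database_one.Products p
ON DUPLICATE KEY UPDATE
  name_short = p.name,
  price      = p.price,
  flags      = p.flags;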

One possible method is to use triggers. You would create triggers for insert, update, and delete on database_one.table that perform the corresponding operation on database_two.table. I don't expect any problems doing insert/update/delete across both databases, and with triggers you can very easily map columns.
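For example, the insert trigger with the column mapping from the question might look like this (a minimal sketch; only the columns whose mapping is stated or obvious are copied):
-- map product_id -> articleId and name -> name_short, per the question
CREATE TRIGGER products_ai AFTER INSERT ON database_one.Products
FOR EACH ROW
  INSERT INTO database_two.Articles (articleId, name_short, price, flags)
  VALUES (NEW.product_id, NEW.name, NEW.price, NEW.flags);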
However, you need to consider the pros and cons of using triggers. From my experience, performance matters a lot: if you have a heavily loaded DB, it is not a good idea to use triggers for data replication.


In MySQL trigger, is it possible to set a user variable with NEW.col and use that in update query?

Thanks in advance for the help. I looked all over and couldn't find an example quite like what I need help with. I'm creating a trigger to update a table after an insert, but I don't know the name of the table to update until after the insert happens. This is the code I'm trying to use, but I get an error.
BEGIN
  SET @ven = NEW.`ven_code`;
  SET @ventable = CONCAT('pp_ven_', @ven);
  UPDATE @ventable SET `stock_qty` = NEW.`endingStock` WHERE `iin` = NEW.`iin`;
END
This is not possible as described: it requires dynamic SQL via a prepared statement, and MySQL raises Error Code: 1336 (Dynamic SQL is not allowed in stored function or trigger) the moment you even attempt the CREATE TRIGGER.
About the closest you could get to automation is CREATE EVENT. Events are scheduled stored programs that run on the schedule/interval of your choosing. The supported intervals are:
interval:
quantity {YEAR | QUARTER | MONTH | DAY | HOUR | MINUTE |
WEEK | SECOND | YEAR_MONTH | DAY_HOUR | DAY_MINUTE |
DAY_SECOND | HOUR_MINUTE | HOUR_SECOND | MINUTE_SECOND}
You could set a "flag", so to speak, on a row, for instance in the table you are depicting above that has the AFTER INSERT trigger. The event could then build the prepared statement dynamically and execute it.
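A sketch of that flag-and-event pattern (the stock_log staging table and its columns are hypothetical; the trigger would insert a pending row there instead of updating the pp_ven_* table directly):
DELIMITER //
CREATE EVENT ev_apply_stock_updates
ON SCHEDULE EVERY 1 MINUTE
DO
BEGIN
  DECLARE v_ven VARCHAR(32);
  DECLARE v_iin VARCHAR(32);
  DECLARE v_qty INT;

  -- pick one row the AFTER INSERT trigger flagged as pending
  SELECT ven_code, iin, endingStock INTO v_ven, v_iin, v_qty
  FROM stock_log WHERE processed = 0 LIMIT 1;

  IF v_ven IS NOT NULL THEN
    -- dynamic SQL is allowed here, unlike inside a trigger
    SET @sql = CONCAT('UPDATE pp_ven_', v_ven,
                      ' SET stock_qty = ', v_qty,
                      ' WHERE iin = ', QUOTE(v_iin));
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    UPDATE stock_log SET processed = 1
    WHERE ven_code = v_ven AND iin = v_iin;
  END IF;
END//
DELIMITER ;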
See my answer here on Event Management.
I have to say that even if run in an event, what you are proposing is almost always the sign of a poor schema design that would not hold up well to peer review.
One reason dynamic SQL and prepared statements are disallowed is that a trigger needs to be fast, and DDL could be snuck into the string and executed. DDL statements such as ALTER TABLE are disallowed in triggers (they could literally take hours to run).
Your schema could just as well have one shared table with a ven_code column as the differentiator. Instead you chose to create a new table for each ven_code. That is typically a poor choice for both design and performance.
If you need help with schema design, I am happy to chat about it with you in a chat room.
You should look into "prepared statements".
This answer might be helpful:
How to select from MySQL where Table name is Variable
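In case it helps, the basic prepared-statement pattern for a variable table name looks like this (a sketch run outside a trigger, since triggers disallow it; 'ABC' and the values stand in for real data):
-- build the table name, then the full statement, as strings
SET @ventable = CONCAT('pp_ven_', 'ABC');
SET @sql = CONCAT('UPDATE ', @ventable,
                  ' SET stock_qty = 5 WHERE iin = ''X1''');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;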

Optimize Mysql Query (rawdata to processed table)

Hi everyone, so my question is this: I have a file that reads in roughly 3,000 rows of data via the LOAD DATA LOCAL INFILE command. After that, a trigger on the table that was inserted into copies three columns from the updated table and two columns from a table that already exists in the database (if this is unclear, the structures are below). From there, only combinations that have unique glNumbers are entered into the processed table. This normally takes over a minute and a half, which I find pretty long. Is this normal for what I'm doing (I can't believe that's true), or is there a way to optimize the queries so it goes faster?
The tables being inserted into are named after the first three letters of each month. Here is the default structure.
RawData Structure
| idjan | glNumber | journel | invoiceNumber | date | JT | debit | credit | descriptionDetail | totalDebit | totalCredit |
(Sorry for the poor formatting; there doesn't seem to be a good way to do this.)
After Insert Trigger Query
delete from processedjan;
insert into processedjan(glNumber,debit,credit,bucket1,bucket2)
select a.glNumber, a.totalDebit, a.totalCredit, b.bucket1, b.bucket2
from jan a inner join bucketinformation b on a.glNumber = b.glNumber
group by glNumber;
Processed Datatable Structure
| glNumber | bucket1 | bucket2 | credit | debit |
Also, I guess it helps to know that bucket1 and bucket2 come from another table where they are matched against the glNumber. That table is roughly 800 rows, with three columns for the glNumber and the two buckets.
While PostgreSQL has statement-level triggers, MySQL only has row-level triggers. From the MySQL reference:
A trigger is defined to activate when a statement inserts, updates, or deletes rows in the associated table. These row operations are trigger events. For example, rows can be inserted by INSERT or LOAD DATA statements, and an insert trigger activates for each inserted row.
So while you are managing to load 3,000 rows in one operation, 3,000 more queries are unfortunately executed by the triggers. And given the complex nature of your transaction, it sounds like you might actually be performing 2-3 queries per row. That is the real reason for the slowdown.
You can speed things up by dropping the trigger (MySQL cannot disable triggers) and carrying out an INSERT ... SELECT after the LOAD DATA INFILE. You can automate this with a small script.
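For example (a sketch; the trigger name and file path are stand-ins, and the INSERT ... SELECT is the trigger body from the question, run once instead of 3,000 times):
DROP TRIGGER IF EXISTS jan_after_insert;  -- stand-in trigger name

LOAD DATA LOCAL INFILE '/path/to/jan.csv'  -- stand-in path
INTO TABLE jan
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- now run the former trigger body exactly once
DELETE FROM processedjan;
INSERT INTO processedjan (glNumber, debit, credit, bucket1, bucket2)
SELECT a.glNumber, a.totalDebit, a.totalCredit, b.bucket1, b.bucket2
FROM jan a
INNER JOIN bucketinformation b ON a.glNumber = b.glNumber
GROUP BY a.glNumber;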

Finding Out Which Tables Are Included in a MySQL Merge Table

I have a couple of MRG_MyISAM tables that merge a bunch of other tables in a MySQL database. I would like to figure out programmatically which tables are included in each merge table.
I know I could run SHOW CREATE TABLE and then parse the UNION=(tbl1, tbl2) part of the statement, but that seems a little hacky. Is there a better way?
In an ideal world, I'm looking for something like this:
SELECT * FROM ?? WHERE merge_table = 'merge_table_1'
That would return rows that each contain the name of a table that's included in "merge_table_1":
--------------
| table_name |
--------------
| tbl1 |
--------------
| tbl2 |
--------------
I don't think there is any data in INFORMATION_SCHEMA to list the members of a MERGE table.
If your application has direct access to the data directory on your database server, you can simply read the .MRG file for the merge table. It is a human-readable file that simply lists the tables in the merge, and any other merge table options.
You really shouldn't be using MERGE tables anymore. You should use MySQL's PARTITIONING engine, which is much more flexible. With partitioned tables, you can query the INFORMATION_SCHEMA.PARTITIONS table to find information on each partition.
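With a partitioned table, the lookup the question asks for becomes an ordinary query (a sketch; 'mydb' is a stand-in schema name):
-- list each partition of the table, roughly the "members" query desired above
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_NAME = 'merge_table_1';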
In fact, you shouldn't be using MyISAM tables either. InnoDB is more scalable, and MyISAM doesn't support any of the properties of ACID.
SHOW CREATE TABLE table_name; -- see if this gives you the information

Copy column data from one table to another

I have two databases, one old and one new. I need to copy one particular column from the old one to the new one. Structure-wise they are totally identical, although the new table is significantly larger than the old one, and the only way I can connect these two tables with a foreign key is the uni_id, which is just a normal integer field, but it's unique.
So this is basically the structure of the table:
| id | name | name_pseudo | uni_id |
I want to compare each row of new_db.mytable with old_db.mytable by uni_id and copy old_db.mytable.name_pseudo into new_db.mytable.name_pseudo.
Can such an expression be constructed in pure MySQL?
From the MySQL manual on UPDATE
You can also perform UPDATE operations covering multiple tables. However, you cannot use ORDER BY or LIMIT with a multiple-table UPDATE. The table_references clause lists the tables involved in the join. Its syntax is described in Section 13.2.9.2, “JOIN Syntax”. Here is an example:
UPDATE items,month SET items.price=month.price
WHERE items.id=month.id;
Which in your case should read like:
UPDATE new_db.mytable AS new, old_db.mytable AS old
SET new.name_pseudo=old.name_pseudo
WHERE old.uni_id=new.uni_id;

Triggers with complex configurable conditions

Some background
We have a system which optionally integrates with several other systems; it shuffles data from a MySQL database to them. Different customers want different data transferred. To avoid triggering unnecessary transfers (when no relevant data has changed), we have an "export" table which contains all the information any customer is interested in, plus a service which runs SQL queries defined in a file to compare the data in the export table against the data in the other tables and update the export table as appropriate. We are not really happy with this solution, for several reasons:
No customer uses more than a fraction of these columns, although each column is used by at least one customer.
As the database grows, the service is causing increasing amounts of strain on the system. Some servers completely freeze while this service compares data, which may take up to 2 minutes (!) even though no customer has particularly large amounts of data (~15000 rows across all relevant tables, max). We fear what might happen if we ever get a customer with very large amounts of data. Performance could be improved by creating some indexes and improving the SQL queries, but we feel like that's attacking the problem from the wrong direction.
It's neither very flexible nor scalable. Having to add new columns every time a customer wants to transfer data that no other customer has been interested in before (which happens a lot) just feels... icky. I don't know how much it really matters, but we're up to 37 columns in this table at the moment, and it keeps growing.
What we want to do
Instead, we would like to have a very slimmed down "export" table which only contains the bare minimum information, i.e. the table and primary key of the row that was updated, the system this row should be exported to, and some timestamps. A trigger in every relevant table would then update this export table whenever a column that has been configured to warrant an update is updated. This configuration should be read from another table (which, sometime in the future, could be configured from our web GUI), looking something like this:
+--------+--------+-----------+
| system | table  | column    |
+--------+--------+-----------+
| 'sys1' | 'tbl1' | 'column1' |
| 'sys2' | 'tbl1' | 'column2' |
+--------+--------+-----------+
Now, the trigger in tbl1 will read from this table when a row is updated. The configuration above should mean that if column1 in tbl1 has changed, then an export row for sys1 should be updated, if column2 has changed too, then an export row for sys2 should also be updated, etc.
So far, it all seems doable, although a bit tricky when you're not an SQL genius. However, we would preferably like to be able to define a little bit more complex conditions, at least something like "column3 = 'Apple' OR column3 = 'Banana'", and this is kind of the heart of the question...
So, to sum it up:
What would be the best way to allow for triggers to be configured in this way?
Are we crazy? Are triggers the right way to go here, or should we just stick to our service, smack on some indexes and suck it up? Or is there a third alternative?
How much of a performance increase could we expect to see? (Is this all worth it?)
This turned out to be impossible as described, because dynamic SQL is not supported in MySQL triggers. We therefore came up with reading the config table from PHP and generating "static" triggers. We'll try having two tables, one for columns and one for conditions, like so:
Columns
+--------+--------+-----------+
| system | table  | column    |
+--------+--------+-----------+
| 'sys1' | 'tbl1' | 'column1' |
| 'sys2' | 'tbl1' | 'column2' |
+--------+--------+-----------+
Conditions
+--------+--------+-------------------------------------------+
| system | table  | condition                                 |
+--------+--------+-------------------------------------------+
| 'sys1' | 'tbl1' | 'column3 = "Apple" OR column3 = "Banana"' |
+--------+--------+-------------------------------------------+
Then just build a statement like this in PHP (pseudo-code):
DROP TRIGGER IF EXISTS `tbl1_AUPD`;
CREATE TRIGGER `tbl1_AUPD` AFTER UPDATE ON tbl1 FOR EACH ROW
BEGIN
  IF (*sys1 columns changed*) AND (*sys1 condition1*) THEN
    CALL updateExportTable('sys1', 'tbl1', NEW.primary_key, NEW.timestamp);
  END IF;
  IF (*sys2 columns changed*) THEN
    CALL updateExportTable('sys2', 'tbl1', NEW.primary_key, NEW.timestamp);
  END IF;
END;
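Filled in with the example configuration above, the generated trigger might look like this (a sketch; the null-safe <=> operator detects changed values, and updateExportTable is assumed to be a stored procedure):
DROP TRIGGER IF EXISTS `tbl1_AUPD`;
DELIMITER //
CREATE TRIGGER `tbl1_AUPD` AFTER UPDATE ON tbl1 FOR EACH ROW
BEGIN
  -- sys1: column1 changed AND the configured condition holds
  IF NOT (NEW.column1 <=> OLD.column1)
     AND (NEW.column3 = 'Apple' OR NEW.column3 = 'Banana') THEN
    CALL updateExportTable('sys1', 'tbl1', NEW.primary_key, NEW.timestamp);
  END IF;
  -- sys2: column2 changed, no extra condition configured
  IF NOT (NEW.column2 <=> OLD.column2) THEN
    CALL updateExportTable('sys2', 'tbl1', NEW.primary_key, NEW.timestamp);
  END IF;
END//
DELIMITER ;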
This seems to be the best solution for us, maybe even better than what I was asking for, but if anyone has a better suggestion I'm all ears!