I've set up MySQL replication for only a specific database on the master.
If I connect to the master and don't specify a database (e.g. in the connection string or with the 'use database' command), the statement is not sent to the slave. Is this a bug? Why does this happen?
Example 1
With no default database selected so far: won't replicate
insert into exampledb.mytable values(1,2,3);
Example 2
Replicates
use exampledb;
insert into mytable values(1,2,3);
Not a bug. This behavior is documented in the MySQL docs:
The main reason for this “check just the default database” behavior is that it is difficult from the statement alone to know whether it should be replicated (for example, if you are using multiple-table DELETE or multiple-table UPDATE statements that go across multiple databases). It is also faster to check only the default database rather than all databases if there is no need.
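In practice, if the filter on the master was set per database (shown below as an assumption, since the question doesn't say which option was used), statement-based replication decides what to log based solely on the default database, so Example 1 never even reaches the binary log:

# master my.cnf -- assumption: the filter was configured with binlog-do-db
[mysqld]
binlog-do-db = exampledb

The fix is simply to always select the database first (use exampledb;), as in Example 2. Note that with row-based replication the filter is applied based on the database the changed table actually lives in, so this caveat is specific to statement-based replication.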
Related
We are currently using AWS DMS to replicate data from our Aurora MySQL database to S3. This gives us a low-latency data lake we can use to get the lineage of all changes occurring and to build additional data pipelines off of. However, when a change is made via the pt-online-schema-change script, the modified table stops replicating entirely. Is there any reason why this would happen?
After running the change, the logs show that the schema of the source table no longer matches what DMS is expecting, and the CDC changes are skipped. The only possible reason I can see is that DMS is not properly tracking the DML statements.
1. Table alter triggered with Percona (in this case, ADD COLUMN)
2. New table synced by AWS DMS
3. Trigger adds throw warnings in AWS DMS as not supported
4. Table is renamed

DMS then logs:

Table column count does not match, ignoring extra columns.
Table column size mismatch, skipping.
Notably, all the DML statements used by Percona (outside of the triggers) are supported by AWS DMS with S3 as a target. Does anyone else have experience with this situation or combination of tools?
Edit:
Here's an example of the command used to make these changes with Percona:
pt-online-schema-change --host=<host> \
--user=<user> \
--ask-pass \
--execute \
--no-drop-old-table \
--no-check-alter \
--alter="ADD COLUMN db_row_update_stamp TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3)" \
D=<db>,t=<REPLACE_TABLE_NAME_HERE>
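For context, the step that trips DMS up is the cut-over at the end of the run: pt-online-schema-change swaps the original and the rebuilt table with a single atomic RENAME, roughly like this (a sketch following Percona's _old/_new naming convention, not the tool's literal output):

RENAME TABLE <db>.<table>      TO <db>._<table>_old,
             <db>._<table>_new TO <db>.<table>;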
So, looking at the DETAILED_DEBUG logs for this task, I tested a RENAME scenario in AWS DMS manually. This resulted in the following:
2021-02-27T00:38:43:255381 [SOURCE_CAPTURE ]T: Event timestamp '2021-02-27 00:38:38' (1614386318), pos 54593835 (mysql_endpoint_capture.c:3293)
2021-02-27T00:38:43:255388 [SOURCE_CAPTURE ]D: > QUERY_EVENT (mysql_endpoint_capture.c:3306)
2021-02-27T00:38:43:255394 [SOURCE_CAPTURE ]T: Default DB = 'my_db' (mysql_endpoint_capture.c:1713)
2021-02-27T00:38:43:255399 [SOURCE_CAPTURE ]T: SQL statement = 'RENAME TABLE test_table TO _test_table_old' (mysql_endpoint_capture.c:1720)
2021-02-27T00:38:43:255409 [SOURCE_CAPTURE ]T: DDL DB = '', table = '', verb = 0 (mysql_endpoint_capture.c:1734)
2021-02-27T00:38:43:255414 [SOURCE_CAPTURE ]T: >>> Unsupported or commented out DDL: 'RENAME TABLE test_table TO _test_table_old' (mysql_endpoint_capture.c:1742)
It seems that this version of DMS does not properly read RENAME statements, despite the documentation claiming support for RENAMEs.
I am looking into opening a bug report on AWS's side. This impacted AWS DMS server version 3.4.3.
I will be testing against previous versions and will post an update if I find a specific version where this is fixed, until it is resolved in a newer release. I can't claim with 100% certainty that it's a bug in DMS, but taking Percona out of the picture I was able to reproduce the problem.
The only option here to fix the broken replication is to "Reload table data".
In the AWS DMS console, select your migration task, go to the "Table statistics" tab and
select the table that was altered (renamed) under the hood by the Percona tool.
In a nutshell, the "Reload table data" action refreshes the replication instance, cleans up caches and creates a new snapshot of your data in S3.
Creating a new snapshot in S3 will leave your replication out of sync for some time. The recovery time depends linearly on the volume of data in the table and on the performance of the chosen replication instance.
Unfortunately, the RENAME TABLE statement isn't supported by DMS, and I assume no one else could support it properly either, as it breaks data checksums (or checkpoints in AWS).
We are using MySQL, with 10 databases making up a single project.
My problem is to auto-merge the tables of the 10 databases into a single database using replication.
For example:
MasterDatabases
database1
....table1
....table2
database2
....table21
....table22
database3
....table31
....table33
Replication Database
slavedatabase
....table1
....table2
....table21
....table22
....table31
....table33
You can use --replicate-rewrite-db for that.
Tells the slave to create a replication filter that translates the default database (that is, the one selected by USE) to to_name if it was from_name on the master. Only statements involving tables are affected (not statements such as CREATE DATABASE, DROP DATABASE, and ALTER DATABASE), and only if from_name is the default database on the master. To specify multiple rewrites, use this option multiple times. The server uses the first one with a from_name value that matches. The database name translation is done before the --replicate-* rules are tested. You can also create such a filter by issuing a CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB statement.
Read more about it here.
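For the layout above, the slave's my.cnf could look like this (a sketch using the example names; remember the rewrite only applies when the matching database is the default database on the master):

# slave my.cnf -- one rewrite rule per source database
[mysqld]
replicate-rewrite-db = "database1->slavedatabase"
replicate-rewrite-db = "database2->slavedatabase"
replicate-rewrite-db = "database3->slavedatabase"

Or, on MySQL 5.7+, the same filter can be created at runtime (the replication threads must be stopped first):

STOP SLAVE;
CHANGE REPLICATION FILTER
  REPLICATE_REWRITE_DB = ((database1, slavedatabase),
                          (database2, slavedatabase),
                          (database3, slavedatabase));
START SLAVE;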
I'm getting data from an MSSQL DB ("A") and inserting it into a MySQL DB ("B") using the date created in the MSSQL DB. I'm doing it with simple logic, but there has got to be a faster and more efficient way of doing this. Below is the sequence of steps involved:
1. Create one connection for the MSSQL DB and one connection for the MySQL DB.
2. Grab all of the data from A that meets the date range criterion provided.
3. Check to see which of the data obtained are not present in B.
4. Insert these new data into B.
As you can imagine, step 2 is basically a loop, which can easily max out the time limit on the server, and I feel like there must be a way of doing this much faster, ideally while the first query is being made. Can anyone point me in the right direction? Can you make "one" connection to both of the DBs and do something like the following?
SELECT * FROM A.some_table_in_A.some_column WHERE
"it doesn't exist in" B.some_table_in_B.some_column
A linked server might suit this
A linked server allows for access to distributed, heterogeneous queries against OLE DB data sources. After a linked server is created, distributed queries can be run against this server, and queries can join tables from more than one data source. If the linked server is defined as an instance of SQL Server, remote stored procedures can be executed.
Check out this HOWTO as well
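As a sketch of what that enables (assuming a linked server named MYSQL_LINK has been created over the MySQL ODBC driver; table, column and date values are illustrative):

-- T-SQL on the MSSQL side: push rows from A that are missing in B,
-- all in one distributed statement
DECLARE @start datetime = '2021-01-01', @end datetime = '2021-02-01';

INSERT INTO OPENQUERY(MYSQL_LINK, 'SELECT id, col1, col2 FROM some_table_in_B')
SELECT a.id, a.col1, a.col2
FROM some_table_in_A AS a
WHERE a.date_created BETWEEN @start AND @end
  AND NOT EXISTS (SELECT 1
                  FROM OPENQUERY(MYSQL_LINK, 'SELECT id FROM some_table_in_B') AS b
                  WHERE b.id = a.id);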
If I understand your question right, you're just trying to move things from the MSSQL DB into the MySQL DB. I'm also assuming there is some sort of filter criterion you're using to do the migration. If this is correct, you might try using a stored procedure in MSSQL that can query the MySQL database with a distributed query. You can then use that stored procedure to do the loops or checks on the database side, and the front-end server will only need to make one connection.
If the MySQL database has a primary key defined, you can at least skip step 3 ("Check to see which of the data obtained are not present in B"). Use INSERT IGNORE INTO ... and it will attempt to insert all the records, silently skipping the ones whose primary key already exists.
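A minimal sketch (assuming id is B's primary key and the rows from A have been staged in a hypothetical table staged_from_a):

INSERT IGNORE INTO some_table_in_B (id, col1, col2)
SELECT id, col1, col2
FROM staged_from_a;  -- rows whose id already exists in B are silently skipped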
Recently I have been working on replication between heterogeneous DBs with Tungsten Replicator. We have a MySQL master and an Oracle slave. According to the docs, such a setup should work. I am using tungsten-replicator-2.0.5. I call
$TUNGSTEN_HOME/tools/configure \
--verbose \
--home-directory=$INSTALL_HOME \
--cluster-hosts=$MA_HOST,$SL_HOST \
on the master node to create a basic installation on both nodes. Note: using the installer (as recommended) fails due to the heterogeneous setup, since the installer cannot find a MySQL instance on the slave node. The replicator instances are configured by adding static-$SERVICENAME.properties to the conf directory and modifying conf/services.properties (replicator.host=$HOSTNAME, replicator.masterListenPortStart=12112, replicator.rmi_port=20000).
Launching the replicators resulted in an ORA-01850 when an UPDATE statement was issued against trep_commit_seqno in the tungsten schema, due to a missing 'timestamp' keyword in the SQL statement. Just to get past this error, I altered the datatype of update_timestamp and extract_timestamp to varchar. The replicators now start up, and some simple inserts were replicated, but when the test script issues a
DROP TABLE IF EXISTS table1;
replication fails with an ORA-00933 because of the 'IF EXISTS' clause. I am not sure whether this is an error in my configuration or whether Tungsten in general has problems with the differences in DDL statements between those two products. Has somebody successfully set up a similar replication using Tungsten?
The Tungsten documentation has some useful guidance. In particular, this point from the "Advanced Principles of Operation" is relevant: "Also, DDL statements beyond the simplest CREATE TABLE expressions are rarely at all portable." In your case, DROP TABLE IF EXISTS table1; is not valid Oracle DDL.
Read it here.
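For comparison, the usual Oracle workaround for a conditional drop is a PL/SQL block along these lines (a sketch; -942 corresponds to ORA-00942, "table or view does not exist"):

BEGIN
  EXECUTE IMMEDIATE 'DROP TABLE table1';
EXCEPTION
  WHEN OTHERS THEN
    IF SQLCODE != -942 THEN  -- re-raise anything other than "table does not exist"
      RAISE;
    END IF;
END;
/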
For anybody who is interested: up to now, Tungsten does not perform any transformation of DDL statements in a heterogeneous environment (as MithunSasidharan wrote). I have now written a custom filter that skips DDL statements using regular expressions. For synchronizing the schema definition, we will use Apache DdlUtils, which works quite well for transforming a schema definition between MySQL and Oracle. I assume it works similarly well for other vendors. Thanks.
Is it possible to notify a database from another database, where the two databases are on different servers and of different types? E.g., I have a DB2 database on one server, and when some records are updated in this database I need to notify another DB and modify some tables in that DB (in my case, MySQL).
Thanks in advance
You can create a link between the databases and, using triggers, update the corresponding data in the other database.
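A conceptual sketch of that approach on the DB2 side (everything here is an assumption: mysql_orders would be a federated nickname created beforehand over the remote MySQL table, and whether a trigger may write through a nickname depends on your DB2 version and federation configuration):

-- push each status change to the MySQL side through the federated nickname
CREATE TRIGGER push_update
  AFTER UPDATE ON orders
  REFERENCING NEW AS n
  FOR EACH ROW MODE DB2SQL
  UPDATE mysql_orders
     SET status = n.status
   WHERE order_id = n.order_id;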
I would investigate replication to see if MySQL can be a replication target for DB2.
Read the introduction to SQL Replication.