MySQL: Strategy to move data to another server

So my situation is as follows:
There is a single Master-Slave Replication on a MySQL 5.5 basis.
The master uses a small SSD as its data partition.
Therefore I want to clean out a certain InnoDB table (let's call it MasterA) and move old rows (datediff < -2) to another database on the slave (SlaveA), which has more space on its SATA HDD.
The problem gets interesting as in some cases I need to access data from SlaveA.
So I think it would be best if an event triggered a transaction like this:
INSERT INTO SlaveA SELECT * FROM MasterA WHERE datediff(created, now()) < -2;
DELETE FROM MasterA WHERE datediff(created, now()) < -2;
But how can I access SlaveA from the master? I already tried the FEDERATED engine, but it gets stuck because of the read_only option activated on the slave and the SUPER privilege required for the user accessing the federated table.
Maybe the event should only call the copy query on the slave, but then how do I delete the rows on the master afterwards?
There should be other options than installing MySQL 5.6 and using a separate partition for the SlaveA table on the master.
Thanks in advance!

An external daemon process (with handles to both databases) could accomplish what you are looking for, but it is not a very clean solution.
If you did have a single handle with access to both databases, a trigger would be a viable solution. I would change your code to use a MySQL user-defined variable, setting it in the first statement and reusing it in the second.
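A minimal sketch of that suggestion, assuming a single connection that can reach both tables; the @cutoff variable is illustrative, and the 2-day window only approximates the original datediff(created, now()) < -2 condition:

SET @cutoff := NOW() - INTERVAL 2 DAY;
INSERT INTO SlaveA SELECT * FROM MasterA WHERE created < @cutoff;
DELETE FROM MasterA WHERE created < @cutoff;

Capturing the boundary once also avoids the race where NOW() advances between the two statements, which could otherwise delete rows that were never copied.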
On the other hand, I would question why you think you need the write master on an SSD. Insert queries are normally a lot cheaper than delete queries. If you make sure all the reads go against the slaves, the master should have very minimal latency. I would recommend putting it on a SATA HDD and not running delete queries against it. Then you don't have to create a custom trigger; MySQL's built-in replication should work just fine.

Related

General log: move to another table

Using MySQL, I want to record data from the general_log table on server A into a table on server B instantly, as each row arrives, and delete the data from server A at the end of the day. I tried to use a trigger for this, but MySQL does not allow triggers on general_log because it is a system table. Alternatively, when I use a Federated table, deleting the data on server A also deletes it on server B. Thanks in advance for your help.
I would recommend the following strategy:
First, partition the data in general_log by date. You can learn about table partitioning in the documentation.
Second, set up replication so server B is identical to server A in real time. Once again, you may need to refer to the documentation.
Third, set up a job to remove the previous day's partition from A shortly after midnight.
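A sketch of that idea, assuming the log rows are first copied into a regular InnoDB table, since general_log itself is a system table and cannot be partitioned; the table, column, and partition names are illustrative:

CREATE TABLE log_archive (
    event_time DATETIME NOT NULL,
    user_host VARCHAR(255),
    argument TEXT
)
PARTITION BY RANGE (TO_DAYS(event_time)) (
    PARTITION p20240101 VALUES LESS THAN (TO_DAYS('2024-01-02')),
    PARTITION p20240102 VALUES LESS THAN (TO_DAYS('2024-01-03'))
);

-- the job that runs shortly after midnight drops the oldest partition,
-- which is far cheaper than a DELETE over the same rows:
ALTER TABLE log_archive DROP PARTITION p20240101;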
To be honest, if you don't understand table partitioning and replication, you should get a DBA involved. In fact, if you are trying to coordinate multiple database servers, you should have a DBA involved, who would understand these concepts and how best to implement them in your environment.
I recommend developing an ETL job to move the data every day and delete it from the old server.

How to re-replicate ignored tables

I'm currently thinking about the following problem:
A customer has set up a simple master/slave replication between two MariaDB systems. For unknown reasons they have set the flag "Replicate_Wild_Ignore_Table" to skip "logdb.%". Now they have decided to stop skipping that database and want logdb to be included in the replication again.
I'm curious now, is it possible to somehow remove that flag and have the database in question be replicated as the rest or is there no way to circumvent the "stop slave, dump master, import dump, recreate replication based on current logpos, start slave" procedure?
You can't assume that the master still has all the relevant binlogs that once contained updates to the logdb.% tables. That is, even if you could re-apply those updates, do you have enough history to account for all changes to the tables?
Another risk arises with statement-based replication: if there were ever statements that referenced both a table in logdb.% and a table in another database, the replication filter skipped them. For example:
INSERT INTO mydb.mytable SELECT * FROM logdb.othertable;
Therefore even the tables that are not in logdb.% might be compromised. The point is you don't know for sure.
The bottom line is that you should definitely reinitialize the replica now by taking a current backup of the master, and avoid using replication filters in the future.
If you use InnoDB tables, you might consider using Percona XtraBackup to make the process easier. See https://www.percona.com/doc/percona-xtrabackup/2.3/howtos/setting_up_replication.html
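For reference, a hedged sketch of that reinitialization; the host, user, password, and binlog coordinates are placeholders, and the real coordinates should be taken from the dump created with --master-data:

-- shell, on the master:
--   mysqldump --all-databases --single-transaction --master-data=2 > full.sql
-- load full.sql on the replica, remove the replicate-wild-ignore-table
-- setting from its configuration, then point it at the recorded position:
STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'master.example.com',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = 'replica-password',
    MASTER_LOG_FILE = 'mysql-bin.000123',
    MASTER_LOG_POS = 4;
START SLAVE;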

Syncing/Maintaining updated data in 2 database tables (MySQL)

I have 2 Databases
Database 1,
Database 2
Now each database has a table, say Table 1 (in Database 1) and Table 2 (in Database 2).
Table 1 is basically a copy of Table 2 (just for backup).
How can I sync Table 2 if Table 1 is updated?
I am using MySQL with the InnoDB storage engine, and PHP for the back-end programming.
Further, I can check for updates every 15 minutes using a PHP script, but that takes too much time because each table has 51,000 rows.
So, how can I achieve something like this: if an administrator/superuser updates Table 1, that update should be reflected immediately in Table 2?
Also, is there a way for bi-directional updates to work, i.e. both tables acting as masters?
Instead of Table 1 being the only master, could both Table 1 and Table 2 be masters, so that if an update is done on either table the other one updates accordingly?
If I'm not wrong, what you are looking for is replication, which does this exact thing for you. If you configure transactional replication, every DML operation will get cascaded automatically to the mirrored DB, so there is no need for continuous polling from your application.
Quoted from the MySQL Replication documentation:
Replication enables data from one MySQL database server (the master) to be replicated to one or more MySQL database servers (the slaves). Replication is asynchronous - slaves need not be connected permanently to receive updates from the master. This means that updates can occur over long-distance connections and even over temporary or intermittent connections such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.
Per your comment: yes, bi-directional replication can also be configured.
See Configuring Bi-Directional Replication
As Rahul stated, what you are looking for is replication.
The standard replication in MySQL is master -> slave, which means that one of the databases is the "master" and the rest are slaves. All changes must be written to the master DB and will then be copied to the slaves. More info can be found in the MySQL documentation on replication.
There is also an excellent guide on the DigitalOcean community forums on master <-> master replication setup.
If the requirements for "Administrator/Superuser" weren't in your question, you could use the mysql's Replication functions on the databases.
If you want the data to be synced to Table 2 immediately upon inserting into Table 1, you could use a trigger on the table. In that trigger you can check which user submitted the data (if you have a column in that table specifying which user inserted it). If the user is an admin, have the trigger duplicate the data; if the user is a normal user, do nothing.
For normal users entering data, you could additionally keep a counter on each row, increasing by 1 for each new 'normal' user's row. In the same trigger, you could check what the counter has reached; say, once it hits 10, duplicate all the pending rows to the other table, then reset the counter and remove the old counter values from the just-duplicated rows.
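A minimal sketch of the admin-only part of that trigger, assuming both tables live on the same MySQL server (triggers cannot reach across servers) and that Table1 carries an inserted_by column; all names are illustrative:

DELIMITER //
CREATE TRIGGER copy_admin_rows
AFTER INSERT ON Table1
FOR EACH ROW
BEGIN
    -- duplicate the row into the backup table only when an admin inserted it
    IF NEW.inserted_by = 'admin' THEN
        INSERT INTO db2.Table2 (id, payload, inserted_by)
        VALUES (NEW.id, NEW.payload, NEW.inserted_by);
    END IF;
END//
DELIMITER ;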

MySQL replication without delete statements

I have been looking for a way to prevent MySQL DELETE statements from being processed by the slave. I'm working on a data warehousing project, and I would like to delete data from the production server after it has been replicated to the slave.
What is the best way to get this done?
Thank you
There are several ways to do this.
1. Run SET SQL_LOG_BIN=0; for the relevant session on the master before executing your delete. That way the statement is not written to the binary log.
2. Implement a BEFORE DELETE trigger on the slave to ignore the deletes.
I tend to use approach #1 for statements that I don't want to replicate. It requires SUPER privilege.
I have not tried #2, but it should be possible.
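A minimal sketch of approach #1, with a hypothetical table and retention window (as noted above, setting SQL_LOG_BIN requires the SUPER privilege):

SET SQL_LOG_BIN = 0;  -- statements in this session no longer go to the binary log
DELETE FROM warehouse.staging_rows WHERE created < NOW() - INTERVAL 30 DAY;
SET SQL_LOG_BIN = 1;  -- re-enable binary logging for this session

Because the DELETE never reaches the binlog, the slave keeps the rows while the master reclaims the space.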
You'll only be able to achieve this with a hack, and it will likely cause problems. MySQL replication isn't designed for this.
Imagine you insert a record in your master, it replicates to the slave. You then delete from the master, but it doesn't delete from the slave. If someone adds a record with the same unique key, there will be a conflict on the slave.
Some alternatives:
If you are looking to make a backup, I would do it by other means. You could do a periodic backup with a cron job that runs mysqldump, but this assumes you don't want to save EVERY record, only to create periodic restore points.
Triggers to update a second, mirror database. This can't cross servers, though; you'd have to recreate each table with a different name. Also, the computational cost would be high, and restoring from this backup would be difficult.
Don't actually delete anything; instead create a Status field that is either Active or Disabled, then hide Disabled rows from the users. This has issues as well: for example, ON DELETE CASCADE couldn't be used, so everything would have to be done manually in code.
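A sketch of that last alternative (table and column names are illustrative):

ALTER TABLE orders ADD COLUMN status ENUM('Active','Disabled') NOT NULL DEFAULT 'Active';
UPDATE orders SET status = 'Disabled' WHERE id = 42;   -- instead of DELETE
SELECT * FROM orders WHERE status = 'Active';          -- hide disabled rows from users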
Perhaps if you provide the reason you want this mirror database without deletes, I could give you a more targeted solution.

Copying data from PostgreSQL to MySQL

I currently have a PostgreSQL database, because one of the pieces of software we're using only supports this particular database engine. I then have a query which summarizes and splits the data from the app into a more useful format.
In my MySQL database, I have a table which contains an identical schema to the output of the query described above.
What I would like to develop is an hourly cron job which will run the query against the PostgreSQL database, then insert the results into the MySQL database. During the hour period, I don't expect to ever see more than 10,000 new rows (and that's a stretch) which would need to be transferred.
Both databases are on separate physical servers, continents apart from one another. The MySQL instance runs on Amazon RDS - so we don't have a lot of control over the machine itself. The PostgreSQL instance runs on a VM on one of our servers, giving us complete control.
The duplication is, unfortunately, necessary because the PostgreSQL database only acts as a collector for the information, while the MySQL database has an application running on it which needs the data. For simplicity, we're wanting to do the move/merge and delete from PostgreSQL hourly to keep things clean.
To be clear - I'm a network/sysadmin guy - not a DBA. I don't really understand all of the intricacies necessary in converting one format to the other. What I do know is that the data being transferred consists of 1xVARCHAR, 1xDATETIME and 6xBIGINT columns.
The closest guess I have for an approach is to use some scripting language to make the query, convert results into an internal data structure, then split it back out to MySQL again.
In doing so, are there any particular good or bad practices I should be wary of when writing the script? Or - any documentation that I should look at which might be useful for doing this kind of conversion? I've found plenty of scheduling jobs which look very manageable and well-documented, but the ongoing nature of this script (hourly run) seems less common and/or less documented.
Open to any suggestions.
Use the same database system on both ends and use replication
If your remote end was also PostgreSQL, you could use streaming replication with hot standby to keep the remote end in sync with the local one transparently and automatically.
If the local end and remote end were both MySQL, you could do something similar using MySQL's various replication features like binlog replication.
Sync using an external script
There's nothing wrong with using an external script. In fact, even if you use DBI-Link or similar (see below), you probably have to use an external script (or psql) from a cron job to initiate replication, unless you're going to use PgAgent to do it.
Either accumulate rows in a queue table maintained by a trigger procedure, or make sure you can write a query that always reliably selects only the new rows. Then connect to the target database and INSERT the new rows.
If the rows to be copied are too big to comfortably fit in memory, you can use a cursor and read them with FETCH.
I'd do the work in this order:
Connect to PostgreSQL
Connect to MySQL
Begin a PostgreSQL transaction
Begin a MySQL transaction. If your MySQL is using MyISAM, go and fix it now.
Read the rows from PostgreSQL, possibly via a cursor or with DELETE FROM queue_table RETURNING *
Insert them into MySQL
DELETE any rows from the queue table in PostgreSQL if you haven't already.
COMMIT the MySQL transaction.
If the MySQL COMMIT succeeded, COMMIT the PostgreSQL transaction. If it failed, ROLLBACK the PostgreSQL transaction and try the whole thing again.
The PostgreSQL COMMIT is incredibly unlikely to fail because it's a local database, but if you need perfect reliability you can use two-phase commit on the PostgreSQL side (sketched after this list), where you:
PREPARE TRANSACTION in PostgreSQL
COMMIT in MySQL
then either COMMIT PREPARED or ROLLBACK PREPARED in PostgreSQL depending on the outcome of the MySQL commit.
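A compact sketch of that sequence on the PostgreSQL side; the transaction identifier is arbitrary:

BEGIN;
-- ... read the rows to be transferred ...
PREPARE TRANSACTION 'sync_batch_42';  -- durable on disk, but not yet visible
-- COMMIT the MySQL transaction here; then, back in PostgreSQL:
COMMIT PREPARED 'sync_batch_42';            -- if the MySQL commit succeeded
-- or: ROLLBACK PREPARED 'sync_batch_42';   -- if it failed

Note that PREPARE TRANSACTION only works when max_prepared_transactions is set above zero in postgresql.conf.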
This is likely too complicated for your needs, but is the only way to be totally sure the change happens on both databases or neither, never just one.
BTW, seriously, if your MySQL is using MyISAM table storage, you should probably remedy that. It's vulnerable to data loss on crash, and it can't be transactionally updated. Convert to InnoDB.
Use DBI-Link in PostgreSQL
Maybe it's because I'm comfortable with PostgreSQL, but I'd do this using a PostgreSQL function that uses DBI-Link via PL/PerlU to do the job.
When replication should take place, I'd run a PL/PgSQL or PL/Perl procedure that uses DBI-Link to connect to the MySQL database and insert the data in the queue table.
Many examples exist for DBI-Link, so I won't repeat them here. This is a common use case.
Use a trigger to queue changes and DBI-link to sync
If you only want to copy new rows and your table is append-only, you could write a trigger procedure that appends all newly INSERTed rows into a separate queue table with the same definition as the main table. When you want to sync, your sync procedure can then in a single transaction LOCK TABLE the_queue_table IN EXCLUSIVE MODE;, copy the data, and DELETE FROM the_queue_table;. This guarantees that no rows will be lost, though it only works for INSERT-only tables. Handling UPDATE and DELETE on the target table is possible, but much more complicated.
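A sketch of that queue pattern on the PostgreSQL side; main_table stands in for the real table, and all other names are illustrative:

CREATE TABLE the_queue_table (LIKE main_table INCLUDING ALL);

CREATE OR REPLACE FUNCTION queue_new_row() RETURNS trigger AS $$
BEGIN
    -- append every newly inserted row to the queue
    INSERT INTO the_queue_table SELECT NEW.*;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER queue_on_insert
AFTER INSERT ON main_table
FOR EACH ROW EXECUTE PROCEDURE queue_new_row();

-- at sync time, in a single transaction:
BEGIN;
LOCK TABLE the_queue_table IN EXCLUSIVE MODE;
-- copy the queued rows to MySQL here, then:
DELETE FROM the_queue_table;
COMMIT;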
Add MySQL to PostgreSQL with a foreign data wrapper
Alternately, for PostgreSQL 9.1 and above, I might consider using the MySQL Foreign Data Wrapper, ODBC FDW or JDBC FDW to allow PostgreSQL to see the remote MySQL table as if it were a local table. Then I could just use a writable CTE to copy the data.
WITH moved_rows AS (
    DELETE FROM queue_table RETURNING *
)
INSERT INTO mysql_table
SELECT * FROM moved_rows;
In short, you have two scenarios:
1) Make the destination pull the data from the source into its own structure
2) Make the source push out the data from its structure to the destination
I'd rather try the second one: look around and find a way to create a PostgreSQL trigger or some special "virtual" table, or maybe a PL/pgSQL function. Then, instead of an external script, you'll be able to run the procedure by executing a query from cron, or possibly from inside Postgres, which has some options for scheduling operations.
I'd choose the second scenario because Postgres is much more flexible and lets you manipulate data in special, DIY ways; you will simply have more possibilities.
An external script probably isn't a good solution, e.g. because you will need to treat binary data with special care, or convert dates and times from DATE to VARCHAR and then back to DATE again. Inside an external script, the various text-stored values will probably be plain strings, and you will need to quote them as well.