Can a MySQL slave instance have different row values for the same ID when binlog_format is set to STATEMENT and we insert something like:
insert into foo values(CURRENT_TIMESTAMP)
As I understand it, the slave reads the SQL statement and executes it; thus, if replication is lagging, this could lead to different values for the same row. Right or wrong?
How can I avoid this situation?
Thank you.
Your approach is perfectly safe in statement level replication. The TIMESTAMP is written to the binary log, so the value for CURRENT_TIMESTAMP will be consistent across the master and the slave even if the slave is behind. You can also use the NOW() function safely for the same reason.
The function to avoid is SYSDATE(), which will not use the TIMESTAMP from the binary log, and therefore the slave's value will represent when the statement ran on the slave, rather than when the statement ran on the master.
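A quick way to see the difference in a mysql client session is sketched below. It relies on the documented behavior that NOW() returns the time at which the statement began executing and honors SET TIMESTAMP, whereas SYSDATE() returns the time at which it is actually evaluated. The sleep duration and the fixed timestamp are arbitrary illustration values.

```sql
-- NOW() is fixed at statement start, so both columns show the same value.
SELECT NOW() AS before_sleep, SLEEP(2) AS slept, NOW() AS after_sleep;

-- SYSDATE() is evaluated when it executes, so the two values differ by about 2 seconds.
SELECT SYSDATE() AS before_sleep, SLEEP(2) AS slept, SYSDATE() AS after_sleep;

-- The binary log carries the statement's timestamp, and the slave replays the
-- statement with that timestamp in effect. NOW() and CURRENT_TIMESTAMP follow
-- it; SYSDATE() ignores it.
SET TIMESTAMP = UNIX_TIMESTAMP('2015-01-01 12:00:00');
SELECT NOW(), CURRENT_TIMESTAMP, SYSDATE();
SET TIMESTAMP = DEFAULT;  -- return to the real clock
```

If SYSDATE() cannot be removed from existing code, starting the server with --sysdate-is-now makes it behave as an alias for NOW(); the option would need to be set on both master and slave.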
We currently use a trigger on our MySQL database that sets a "last-modified" timestamp to CURRENT_TIMESTAMP. It is called on update.
We also need to use statement-based replication for the cluster.
Is there a way to modify the trigger so that the propagated CURRENT_TIMESTAMP is identical on every cluster instance?
Currently, statement-based replication executes the statement on every cluster instance, resulting in slightly different timestamps.
You must switch to the MIXED binlog format, so that for non-deterministic writes the binlog stores the row data and not only the statement.
You can do that without service disruption with:
SET GLOBAL binlog_format = 'MIXED';
Run this on the master server that generates the binlog.
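To confirm the change, the current values can be inspected as sketched below. One thing to keep in mind: SET GLOBAL only affects connections opened after the change. Sessions that are already connected keep their previous format until they reconnect, and the setting is lost on restart unless binlog_format is also set in my.cnf.

```sql
-- Global value: applies to sessions created from now on.
SELECT @@GLOBAL.binlog_format;

-- Value in effect for the current session.
SELECT @@SESSION.binlog_format;

-- Equivalent check using SHOW.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
```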
On MySQL, I have two databases, "parque_test" and "tabelas_temporais", and binary logging is enabled.
Every action that modifies an InnoDB table belonging to "parque_test" is recorded on the binary log. However, "parque_test" has stored procedures that use temporary tables to retrieve a result (they are not used to perform update, delete or insert).
To avoid recording the temporary-table activity in the binary log, I have configured "/etc/mysql/my.cnf" so that MySQL logs all activity on "parque_test" but ignores "tabelas_temporais".
cat /etc/mysql/my.cnf
...
#log_bin = /var/log/mysql/mysql-bin.log
log_bin=/mysql-log/bin-log
binlog_do_db=parque_test
binlog_do_db=parque_prod
expire_logs_days = 10
max_binlog_size = 100M
#binlog_do_db = include_database_name
#binlog_ignore_db = include_database_name
binlog_ignore_db=tabelas_temporais
...
All the temporary tables are created in the "tabelas_temporais" schema; however, the binary log still records activity on "tabelas_temporais" when, for example, a stored procedure from "parque_test" is executed that contains a command such as
DROP TEMPORARY TABLE IF EXISTS tabelas_temporais.temp_mod_user;
Any help would be much appreciated!
mysql Ver 14.14 Distrib 5.5.40, for debian-linux-gnu (x86_64) using readline 6.2
Database filtering in the MySQL binary log can be somewhat unexpected if you don't know exactly how it works. From the manual:
When using statement-based logging, the following example does not work as you might expect. Suppose that the server is started with --binlog-ignore-db=sales and you issue the following statements:
USE prices;UPDATE sales.january SET amount=amount+1000;
The UPDATE statement is logged in such a case because --binlog-ignore-db applies only to the default database (determined by the USE statement). Because the sales database was specified explicitly in the statement, the statement has not been filtered. However, when using row-based logging, the UPDATE statement's effects are not written to the binary log, which means that no changes to the sales.january table are logged; in this instance, --binlog-ignore-db=sales causes all changes made to tables in the master's copy of the sales database to be ignored for purposes of binary logging.
In short: it seems you might want to look into ROW based logging instead of STATEMENT or MIXED. However:
You should keep in mind that the format used to log a given statement may not necessarily be the same as that indicated by the value of binlog_format. For example, DDL statements such as CREATE TABLE and ALTER TABLE are always logged as statements, without regard to the logging format in effect, so the following statement-based rules for --binlog-ignore-db always apply in determining whether or not the statement is logged.
DROP is also a DDL statement, so it gets logged. So, does that mean there's no way? On the contrary:
.... temporary tables are logged only when using statement-based replication, whereas with row-based replication they are not logged. With mixed replication, temporary tables are usually logged; exceptions happen with user-defined functions (UDFs) and with the UUID() function....
So, in short, for 'normal' tables this becomes next to impossible while working in a schema that is logged. TEMPORARY tables, however, are not logged at all under ROW-based replication. This means: switch to ROW-based replication, and you don't need to use a different schema for true temporary tables.
However, if you do switch from STATEMENT / MIXED to ROW-based replication, check how it performs. If you often do bulk updates (a lot of rows affected), your binlogs will be quite a bit larger, because every changed row is logged rather than the single 'simple' UPDATE statement that caused the changes.
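One way to check what actually ends up in the binary log after changing the format is sketched below. The log file name is a placeholder (its prefix is taken from the log_bin=/mysql-log/bin-log setting in the question); use whatever SHOW MASTER STATUS reports on your server.

```sql
-- Switch new sessions to row-based logging (run on the master).
SET GLOBAL binlog_format = 'ROW';

-- Find the binary log file currently being written.
SHOW MASTER STATUS;

-- List the first 20 events in that file. 'bin-log.000042' is a placeholder.
-- Row-based changes show up with event types such as Table_map and Write_rows,
-- while statement-logged changes show up as Query events containing the SQL text.
SHOW BINLOG EVENTS IN 'bin-log.000042' LIMIT 20;
```

Comparing the size of the binary log files before and after a typical bulk update gives a rough idea of how much larger the logs become.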
I have one master and 2 slaves and have replication set at statement level.
I want to purge records older than a specific date on the master and on one of the slaves. I created a procedure to do that.
Is there any way I can skip the procedure call on the 2nd slave but still execute it on the 1st slave?
Please note that I want to execute all other statements and I want to schedule purge procedure call as MySQL event.
Thanks and best regards,
Santosh!
Define the procedure as some sort of no-op on the 2nd slave. The statement will still get sent and processed but won't do anything.
I have a number of MySQL servers running version 5.1.63, and whilst running some queries against the slave earlier this week, I noticed some data on the slave that should have been removed by an update statement on the master.
My initial thoughts were:
someone on the team was updating the slave, which I have since disproved
that the column being updated had changed
So I investigated by running SHOW TABLE STATUS against a test database on each of the servers to compare the data length. In a lot of cases the data length differed between servers, but on eyeballing the data I could see it was the same, so I couldn't use this method to detect differences; it appears to be prone to error.
Next I ran a simple (across all dbs) row count for each table to confirm the row count was the same - it was.
I then started looking in the bin logs for replication. I could see the update statements that should have run clearly visible in the logs, but the update never ran.
What I need to know is:
is replication broken? I'm assuming it is
if I create new slave servers, will I encounter the same issue?
how do I find out the extent of the issue on my servers?
Any help is appreciated.
If you are using statement based replication then it is easily possible to end up with different results on master and slave due to badly constructed INSERT statements.
An INSERT ... SELECT without an ORDER BY, or with an ORDER BY that still leaves the row order non-deterministic, can cause the slaves to diverge from the master.
From the MySQL site http://dev.mysql.com/doc/refman/5.1/en/insert-select.html
The order in which rows are returned by a SELECT statement with no
ORDER BY clause is not determined. This means that, when using
replication, there is no guarantee that such a SELECT returns rows in
the same order on the master and the slave; this can lead to
inconsistencies between them. To prevent this from occurring, you
should always write INSERT ... SELECT statements that are to be
replicated as INSERT ... SELECT ... ORDER BY column. The choice of
column does not matter as long as the same order for returning the
rows is enforced on both the master and the slave. See also Section
16.4.1.15, “Replication and LIMIT”.
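As a sketch of what the manual describes, compare the two statements below. The tables orders and orders_archive and their columns are hypothetical; orders_archive is assumed to have its own AUTO_INCREMENT key, which is where a different row order becomes visible.

```sql
-- Risky under statement-based replication: the SELECT may return rows in a
-- different order on master and slave, so the auto-increment values assigned
-- in orders_archive can differ between the two servers.
INSERT INTO orders_archive (customer_id, total)
SELECT customer_id, total
FROM orders
WHERE created_at < '2012-01-01';

-- Deterministic: ordering by a unique column (here the primary key) forces
-- the same row order on every server.
INSERT INTO orders_archive (customer_id, total)
SELECT customer_id, total
FROM orders
WHERE created_at < '2012-01-01'
ORDER BY id;
```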
If this has happened, then your replicas have diverged, and the only safe way to bring them back in line is to rebuild them from a recent backup of the master DB. The worst part is that the error may never cause replication to fail, yet the results are inconsistent. Normally replication fails when an UPDATE or DELETE statement affects a different number of rows than it did on the master. This is confusing, because it was not the UPDATE that actually caused the error, and the only way I know to fix the issue is to inspect every INSERT query in the code base!
The status details come from information_schema, which collates statistics for the MySQL instance, and they do not stay the same from one execution to the next. Treat the data and index lengths as rough estimates in bytes, never as exact values; they are useful for estimation but not for cross-checking. For replication, check whether the slave I/O and SQL threads are running against the master. In the relay log info you can also see the corresponding log coordinates of the master and of the slave.
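The thread check mentioned above can be done from the mysql client roughly as follows. Note that both threads running only shows that replication has not stopped; it does not prove the data is identical.

```sql
-- On the slave: Slave_IO_Running and Slave_SQL_Running should both be 'Yes',
-- and Last_Error should be empty. Seconds_Behind_Master indicates the lag.
SHOW SLAVE STATUS\G

-- On the master: current binary log file and position, for comparison with
-- Master_Log_File / Read_Master_Log_Pos reported by the slave.
SHOW MASTER STATUS;
```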
Of course, (1) running COUNT(*) on the tables at end of day indicates whether the data on master and slave is consistent. To be more accurate, (2) pick random field values and cross-check them between master and slave. If that still doesn't satisfy you, (3) dump the tables to an outfile and take a diff or checksum. I prefer (1) and (2). If (1) is not possible, (2) still convinces me. ;)
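A minimal sketch of checks (1) and (2), plus a built-in checksum as a lighter variant of (3). The table test.customers and the id values are hypothetical. Run the same statements on master and slave while the slave is caught up and the table is not being written to, then compare the output.

```sql
-- (1) Row count.
SELECT COUNT(*) FROM test.customers;

-- (2) Spot-check a few arbitrary rows by primary key.
SELECT * FROM test.customers WHERE id IN (17, 4242, 90001);

-- (3) Checksum over the table contents.
CHECKSUM TABLE test.customers;
```

CHECKSUM TABLE results are only comparable when both servers use the same MySQL version and the same table definition, so a mismatch is a reason to look closer rather than proof of divergence.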
There is a tool for verifying replication called pt-table-checksum (part of Percona Toolkit).
OK, I'm running a setup with a single master and a number of slaves. All writes go through the master and are replicated down to the slaves, which are used strictly for reads.
Now I have a stored procedure (not function) which is called by a trigger on an insert. According to the MySQL docs, for replication triggers log the call to the trigger while stored procedures actually log the result of the stored procedure.
So my question is: when my trigger gets fired, will it replicate both the trigger and the results of the procedure that the trigger calls (resulting in the procedure effectively being run twice)? Or will it simply replicate the trigger and have the slaves re-run the stored procedure on their own?
Thanks
In MySQL 5.0 (and MySQL 5.1 with statement based binary logging), only the calling query is logged, so in your case, the INSERT would be logged.
On the slave, the INSERT will be executed and then the trigger will be re-run on the slave. So the trigger needs to exist on the slave, and assuming it does, then it will be executed in exactly the same way as the master.
In MySQL 5.1, there is row-based binary logging, which will log only the rows being changed, so the trigger would not be re-fired on the slave, but all rows that changed would still be propagated.
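To make the two cases concrete, here is a small sketch with a trigger that calls a procedure. All table, procedure, and trigger names are hypothetical.

```sql
-- Hypothetical schema for illustration.
CREATE TABLE items (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE item_audit (item_id INT, created_at DATETIME);

DELIMITER //

CREATE PROCEDURE log_new_item(IN p_item_id INT)
BEGIN
  INSERT INTO item_audit (item_id, created_at) VALUES (p_item_id, NOW());
END//

CREATE TRIGGER items_after_insert AFTER INSERT ON items
FOR EACH ROW
BEGIN
  CALL log_new_item(NEW.id);
END//

DELIMITER ;

-- Statement-based logging: only this INSERT is written to the binary log.
-- The slave replays it, its own copy of the trigger fires, and the procedure
-- runs there once.
-- Row-based logging: the row changes to both items and item_audit are logged,
-- and the slave applies them without firing the trigger.
INSERT INTO items (name) VALUES ('widget');
```

In both cases item_audit receives one row per insert on the slave, so the procedure's effect is not applied twice.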
In addition to Harrison's excellent answer:
Assuming the databases are in sync (schema, data, same version) to start with, it should just work.
If it doesn't, then it may be that you're using something non deterministic in your queries or trigger. Fix that.
Regardless of how you use replication, you need to have monitoring to check that the slaves are always in sync. Without any monitoring, they will become out of sync (subtly) and you won't notice. MySQL has no automatic built-in feature for checking this or fixing it.