I have a number mysql servers running version 5.1.63 and whilst running some queries against the slave earlier this week, I noticed some data on the slave that should have been removed using an update statement on the master.
My initial thoughts were:
someone on the team was updating the slave, which I have since disproved
that the column being updated had changed
So, I investigated by running a mysql show status "table" query. This was run against a test database on each of the servers to see what the data length was, in a lot of cases it was showing me the data length differed between servers, but on an eyeball look at the data I could see the data was the same, so I couldn't use this method to see if there were any differences as it appears to be prone to error.
Next I ran a simple (across all dbs) row count for each table to confirm the row count was the same - it was.
I then started looking in the bin logs for replication. I could see the update statements that should have run clearly visible in the logs, but the update never ran.
What I need to know is:
is replication broken? I'm assuming it is
if I create new slave servers, will I encounter the same issue?
how do I find out the extent of the issue on my servers?
Any help is appreciated.
If you are using statement based replication then it is easily possible to end up with different results on master and slave due to badly constructed INSERT statements.
INSERT SELECT without ORDER BY, or where the ORDER BY can leave non deterministic results will cause the slaves to diverge from master.
From the MySQL site http://dev.mysql.com/doc/refman/5.1/en/insert-select.html
The order in which rows are returned by a SELECT statement with no
ORDER BY clause is not determined. This means that, when using
replication, there is no guarantee that such a SELECT returns rows in
the same order on the master and the slave; this can lead to
inconsistencies between them. To prevent this from occurring, you
should always write INSERT ... SELECT statements that are to be
replicated as INSERT ... SELECT ... ORDER BY column. The choice of
column does not matter as long as the same order for returning the
rows is enforced on both the master and the slave. See also Section
16.4.1.15, “Replication and LIMIT”.
If this has happened then your replicas have diverged and the only safe way to bring them back in line is to rebuild them from a recent backup of the master DB. The worst part of this is the error may never cause the replication to fail, yet the results are inconsistent. Normally replication fails when an UPDATE or DELETE statement affects a different number of rows than on master, this is confusing as it was not the UPDATE that actually caused the error and the only way I know to fix the issue is to inspect every INSERT query in the code base!
Status details are from information_schema which collates data from databases statistics for Mysql instance and it never remained the same at every execution. It can be considered as just a rough estimation of data sizes in bytes but never an exact value as for index and data length. It can be used for estimations but not for cross check. For replication you may check the slave io and sql against the master is running or not. And relay-info you might see the corresponding log details from master and that of slave.
Of,course (1) way of doing is count(*) of tables EOD ensures the data in tables on master and slave are consistent or not. But to be accurate either (2) take random value fields and cross check with master and slave. Also if you aren't satisfied with it, (3) you may take them into outfile and take diff or checksum. I prefer (1) and (2). If (1) is not possible (2) still convinces me. ;)
There is a tool to verify replication named pt-table-checksum
Related
I'm currently thinking about the following problem:
A customer has set up a simple master/slave replication between two mariaDB systems. For unknown reasons they have set the flag "Replicate_Wild_Ignore_Table" to skip "logdb.%". Obviously, they decided to skip the skipping of that database and want the logdb to be included in the replication again.
I'm curious now, is it possible to somehow remove that flag and have the database in question be replicated as the rest or is there no way to circumvent the "stop slave, dump master, import dump, recreate replication based on current logpos, start slave" procedure?
You can't assume that the master still has all relevant binlogs that once contained updates to the logdb.% tables. That is, even if you could re-apply those updates, do you have enough history to account for all changes to the tables?
Another risk is if you use statement-based replication, if there were ever statements that referenced both a table in logdb.% and a table in another database, the replication filter has skipped that statement. So for example:
INSERT INTO mydb.mytable SELECT * FROM logdb.othertable;
Therefore even the tables that are not in logdb.% might be compromised. The point is you don't know for sure.
The bottom line is that you should definitely reinitialize the replica now by taking a current backup of the master, and avoid using replication filters in the future.
If you use InnoDB tables, you might consider using Percona XtraBackup to make the process easier. See https://www.percona.com/doc/percona-xtrabackup/2.3/howtos/setting_up_replication.html
We have a one master and two VIP slave database servers. We changed data type of column from VARCHAR(255) to TEXT on master.
The application is currently configured to use master only for writing operations and configured slaves for reading operation.
After changing the data type on master server using ALTER TABLE command the slave server becomes unresponsive.
We are using Mariadb 10.0
[PROCESSES INFORMATION]
Id User Host Db Command Time(sec) State Info
-----------------------------------------------------------------------
203739 repl slave1 Binlog Dump 75,143,121 Master has sent all binlog to slave; waiting for binlog to be updated
203740 repl slave2 Binlog Dump 75,143,121 Master has sent all binlog to slave; waiting for binlog to be updated
The slave instance becomes very slow due to slow queries.
number of sessions: 1590
thread_pool_max_thread=500
Current value =648
After performing ALTER TABLE on Master server, it was replicating to slave server and in the same time number of sessions were get increased rapidly on slave server.
I think slaves becomes unresponsive because of slow queries.
But I don't know why this queries became so slow and slaves got unresponsive.
The DBA's saying that after executing ANALYZE TABLE command, the issue has been solved.
But I don't understand why this happened because ANALYZE TABLE only update the statistic information.
It would be helpful if anyone comment on this why it happened?
How to avoid such issues in future.
There is one minor case where TEXT is slower than VARCHAR. When a SELECT needs to build a temporary table (often for sorting due to GROUP BY or ORDER BY), it first tries to build a MEMORY table. But, TEXT and BLOB prevent it from using such, so it uses MyISAM instead. This is slower (but gets the job done).
I say this is a "minor case" because users rarely identify it with phrases like "very slow" and "becomes unresponsive". I would guess that a SELECT might run twice as slow.
Also, the ANALYZE TABLE discussion does not hold water. Again it may be coincidence, not causation.
So, the change to TEXT may be a 'red herring'. Instead, let's discover what is being slow by using the slowlog. See this for what I like to work from.
I came across this problem a few days ago and have been tinkering with- and pondering about several different approaches, but I cannot seem to find a good answer:
I have two MySQL servers, one master/hot and one slave/archive. All write requests go to the master, and shall also (eventually) be replicated/copied to the slave. However, certan data in the master grows "stale" after a while (say a week) and shall then be purged, so to keep the master's tables short. This purge should however not affect the slave. How can I go about achiving this?
Essentially, my master database acts sort of like a "hot" database, where data is fresh and is purged once it goes old. It should contain data that users might need quickly, and thus we want to keep the tables small. My slave on the other hand works more like an archive, which should contain all data, regardless of "hotness". Queries to the slave doesn't need to execute quickly, and the slaves data can lag behind a few minutes, but it needs to contain all records since our beginning of time.
My initial thought was to utilize ordinary replication, but can I somehow filter certain queries to not affect the slave? I was thinking of creating a purge query, which removes old data from the master but doesn't effect the slave. From reading the MySQL documentation, it seems that this filtering can only be done on Database or Tabel level.
Another thought was to do this via an external application, and manually SELECT data from the master and INSERT it into the slave, and then use some clever logic to decide what data to select. This works good for log-tables, which will only ever add data, but it doesn't work good for tables that represents states, such as user settings. This approach will probably also include a lot of special cases, as I cannot find a good, consistent way of describing all tables in our database (there are log-tables, state-tables, config-tables and a few which I cannot really categorize).
None of these approaches seem to solve the problem in a simple fashion, but I feel I cannot be the first to have this problem. Any ideas are welcome, and thanks in advance.
If more info is needed, feel free to comment and I'll edit it in
Just use regular replication. When you delete data on the master you do in the same session
SET sql_log_bin = 0;
DELETE FROM my_table WHERE whatever = true;
SET sql_log_bin = 1;
This prevents that those statements are written to the binary log. And therefore it won't be replicated to the slave.
read more about it here
SO, we are trying to run a Report going to screen, which will not change any stored data.
However, it is complex, so needs to go through a couple of (TEMPORARY*) tables.
It pulls data from live tables, which are replicated.
The nasty bit when it comes to take the "eligible" records from
temp_PreCalc
and populate them from the live data to create the next (TEMPORARY*) table output
resulting in effectively:
INSERT INTO temp_PostCalc (...)
SELECT ...
FROM temp_PreCalc
JOIN live_Tab1 ON ...
JOIN live_Tab2 ON ...
JOIN live_Tab3 ON ...
The report is not a "definitive" answer, expectation is that is merely a "snapshot" report and will be out-of-date as soon as it appears on screen.
There is no order or reproducibility issue.
So Ideally, I would turn my TRANSACTION ISOLATION LEVEL down to READ COMMITTED...
However, I can't because live_Tab1,2,3 are replicated with BIN_LOG STATEMENT type...
The statement is lovely and quick - it takes hardly any time to run, so the resource load is now less than it used to be (which did separate selects and inserts) but it waits (as I understand it) because of the SELECT that waits for a repeatable/syncable lock on the live_Tab's so that any result could be replicated safely.
In fact it now takes more time because of that wait.
I'd like to SEE that performance benefit in response time!
Except the data is written to (TEMPORARY*) tables and then thrown away.
There are no live_ table destinations - only sources...
these tables are actually not TEMPORARY TABLES but dynamically created and thrown away InnoDB Tables, as the report Calculation requires Self-join and delete... but they are temporary
I now seem to be going around in circles finding an answer.
I don't have SUPER privilege and don't want it...
So can't SET BIN_LOG=0 for this connection session (Why is this a requirement?)
So...
If I have a scratch Database or table wildcard, which excludes all my temp_ "Temporary" tables from replication...
(I am awaiting for this change to go through at my host centre)
Will MySQL allow me to
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO temp_PostCalc (...)
SELECT ...
FROM temp_PreCalc
JOIN live_Tab1 ON ...
JOIN live_Tab2 ON ...
JOIN live_Tab3 ON ...
;
Or will I still get my
"Cannot Execute statement: impossible to write to binary log since
BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine
limited to row-based logging..."
Even though its not technically true?
I am expecting it to, as I presume that the replication will kick in simply because it sees the "INSERT" statement, and will do a simple check on any of the tables involved being replication eligible, even though none of the destinations are actually replication eligible....
or will it pleasantly surprise me?
I really can't face using an unpleasant solution like
SELECT TO OUTFILE
LOAD DATA INFILE
In fact I dont think I could even use that - how would I get unique filenames? How would I clean them up?
The reports are run on-demand directly by end users, and I only have MySQL interface access to the server.
or streaming it through the PHP client, just to separate the INSERT from the SELECT so that MySQL doesnt get upset about which tables are replication eligible....
So, it looks like the only way appears to be:
We create a second Schema "ScratchTemp"...
Set the dreaded replication --replicate-ignore-db=ScratchTemp
My "local" query code opens a new mysql connection, and performs a USE ScratchTemp;
Because I have selected the default database of the "ignore"d one - none of my queries will be replicated.
So I need to take huge care not to perform ANY real queries here
Reference my scratch_ tables and actual data tables by prefixing them all on my queries with the schema qualified name...
e.g.
INSERT INTO LiveSchema.temp_PostCalc (...) SELECT ... FROM LiveSchema.temp_PreCalc JOIN LiveSchema.live_Tab1 etc etc as above.
And then close this connection just as soon as I can, as it is frankly dangerous to have a non-replicated connection open....
Sigh...?
I have 2 servers set up using MySQL. It's using a standard replication setup, with one slave, no circular replication.
Is there a way to programmatically tell how far behind the slave is in reading the data from the binary log?
If I run the statement:
SHOW MASTER STATUS;
On the master, and run
SHOW SLAVE STATUS;
on the slave, I can compare the Position column from master status, and the Read_Master_Log_Pos column from slave status to determine how far behind the slave is.
However, this only works if the slave is reading from the same file the master is writing to. So if the slave is still reading a previous log file, because it is running behind, I can't figure out how to determine how much data is left until it catches up to the current position that the master is at. A solution using only SQL would be optimal, but I'm open to other solutions. Hopefully not one that requires reading the directory containing the log files.
I like to use the 'Seconds_behind_master' field from SHOW SLAVE STATUS in order to determine if the MASTER-SLAVE servers are caught up. As a secondary guard I also to a COUNT(*) query on a specific table (ie one that gets updated frequently) on both servers and then compare the record counts.