I want to write a listener that detects DML changes on a table and performs some actions. This listener cannot be embedded in the application; it runs separately.
My idea was to have the application write to a BLACKHOLE table and to detect the changes from the binary log file.
But in the docs I found that enabling binary logging slows MySQL down slightly. That's why I was wondering: is there a way to make the MySQL master log only the changes related to a specific table?
Thanks!
SQL is the best way to track DML changes and call a function based on them. But since you want to explore other options, you could try:
Writing a cron job against the general query log. Note that this log also includes SELECT / SHOW statements, which you don't need.
The binary log, read with mysqlbinlog. Binary logging slows performance down just a little, but it is necessary for point-in-time recovery and replication.
Suggestions:
On a prod environment, the MySQL binary log must be enabled and the general query log must be disabled: the general query log records almost everything, fills up very quickly, and can exhaust disk space if not rotated properly.
On a dev/QA environment, the general query log can be enabled with a proper rotation policy.
I have a database shared by two completely separate server applications, and those applications cannot communicate with one another at all. Let's say those two applications are called A and B. Whenever A updates a table in the shared DB, B should quickly know that there was a change somehow (remember, A and B cannot communicate with each other). Also, I want to avoid a setInterval type of approach where I query every x seconds. Initially I thought there would be a way to 'watch' changes within MySQL itself, but it seems there isn't. What would be the best approach to achieve this? I'm using Node.js, MySQL Workbench, and PHP.
TL;DR:
I'm trying to find the best way to 'watch' any table changes and trigger an action (maybe an HTTP request) whenever a change is detected. I'm using MySQL Workbench and Node.js. I really want to avoid a setInterval type of approach. Any recommendation?
What you want is a Change Data Capture (CDC) feature. In MySQL, the feature is the binary log.
Some tools like Debezium are designed to watch and filter the binary log, and transform it into events on a message queue (e.g. Kafka).
Some comments above suggest using triggers, but this is a problematic idea, because triggers fire during a data change, when the transaction for that change is not yet committed. If you try to invoke an http request or any other application action when a trigger fires, then you risk having the action execute even if the data change is subsequently rolled back. This will really confuse people.
Also there isn't a good way to run application actions from triggers. They are for making subordinate data changes, not actions that are outside transaction scope.
Using the binary log as a record of changes is safer, because changes are not written to the binary log until they are committed. Also the binary log contains all changes to all tables.
With a trigger solution, by contrast, you would have to create three triggers (INSERT, UPDATE, and DELETE) for each table. Also, MySQL does not support triggers for DDL statements (CREATE, ALTER, DROP, TRUNCATE, etc.).
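The rollback hazard described above can be reproduced in a few lines. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, purely because SQLite makes it easy for a trigger to call back into application code in a self-contained script. The table `orders`, the trigger name, and the `notify` function are hypothetical.

```python
import sqlite3


def run_demo():
    """Fire an 'external action' from a trigger, then roll the change back."""
    notifications = []  # side effects performed outside the database

    def notify(row_id):
        # Stand-in for an HTTP request or any other external action.
        notifications.append(row_id)

    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL so the trigger can call it.
    conn.create_function("notify", 1, notify)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    conn.execute(
        "CREATE TRIGGER orders_after_insert AFTER INSERT ON orders "
        "BEGIN SELECT notify(NEW.id); END"
    )
    conn.commit()

    # The trigger fires during the INSERT, before the transaction ends.
    conn.execute("INSERT INTO orders (id, item) VALUES (1, 'widget')")
    conn.rollback()  # the row is undone, the side effect is not

    remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return notifications, remaining


if __name__ == "__main__":
    print(run_demo())
```

After the rollback the table is empty, yet the notification for row 1 has already been recorded. A consumer that reads committed changes from a log would never have seen that row.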
Is there any way to see the queries that are run against my MySQL database?
For example:
I have an application (OTRS) that lets you generate reports according to the frames that I desire. I would like to know which queries the application runs against the database, because I will use them to integrate with other reporting software.
Is this possible?
Yes, you can enable logging in your MySQL server. There are several types of logs you can use, depending on what you want to log: from errors only, or slow queries, up to logs that record everything done on your server.
See the full doc here
Although, as Nir says, MySQL can log all queries (you should be looking at the general log, or the slow log configured with a threshold of 0 seconds), this will show every query being run; on a production system it may prove difficult to match what you are doing in your browser with specific entries in the log.
The reason I suggest using the slow query log is that there are tools available which will remove the parameters from the queries, allowing you to see what SQL code is running more frequently.
If you have some proficiency in Perl, it should be straightforward to output the queries from the application itself, since all queries are processed via an abstraction layer.
(Presumably you are aware that the schema is published)
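The parameter stripping mentioned above (tools such as mysqldumpslow do something along these lines) can be sketched roughly as follows. This is a simplified illustration with regular expressions, not the algorithm any particular tool uses; the function name `fingerprint` and the sample queries are made up.

```python
import re
from collections import Counter

# Quoted string literals, allowing backslash-escaped characters inside.
_STRING = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")
# Standalone numbers (digits inside identifiers such as table1 are left alone).
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_SPACE = re.compile(r"\s+")


def fingerprint(sql):
    """Replace literal values with placeholders so similar queries group together."""
    sql = _STRING.sub("'S'", sql)
    sql = _NUMBER.sub("N", sql)
    return _SPACE.sub(" ", sql).strip()


def most_frequent(queries, top=3):
    """Count queries by fingerprint and return the most common shapes."""
    return Counter(fingerprint(q) for q in queries).most_common(top)


if __name__ == "__main__":
    sample = [
        "SELECT * FROM ticket WHERE id = 42",
        "SELECT * FROM ticket  WHERE id = 7",
        "SELECT * FROM users WHERE name = 'bob'",
    ]
    for shape, count in most_frequent(sample):
        print(count, shape)
```

Grouping by shape rather than by exact text is what makes it practical to spot which statements a report generates, even when every run uses different values.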
I have an app whose MySQL DB is a slave of a remote master DB, and I use memcache to cache some of the DB data.
My slave DB can be updated whenever there are updates in the master DB. So in my application I want to know when my local (slave) DB is updated, so that I can invalidate the related cached data and display the fresh data I got from the master.
Is there any way to run some program when the slave MySQL DB is updated? I would then filter the query and work out whether I need to clear the cache or not.
Thanks
First of all, you are looking for a solution similar to what Facebook did in their DB architecture (as I remember, they patched MySQL for this).
You can build your own solution based on one of these techniques:
Parse the replication log on the slave side and remove the cache entry when you see an update of that data in the log.
Load a UDF (user-defined function) for memcached and attach triggers on the replica side (they will call the UDF remove function) to the tables of interest inside MySQL.
Please note that this configuration is complicated to support and maintain. If you can tolerate stale data in the cache, a small TTL may help.
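The small-TTL fallback can be sketched as follows. memcached accepts an expiry time when a value is stored; the in-memory class below only stands in for it so the example runs on its own. `TTLCache`, its methods, and the key names are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller re-reads from the replica.
            del self._items[key]
            return None
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=5)
    cache.set("user:1", "alice")
    print(cache.get("user:1"))
```

With this approach the cache may serve data that is up to one TTL old after the replica changes, which is the trade-off mentioned above: no invalidation machinery in exchange for bounded staleness.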
As Kirugan says, it's as simple as writing your own SQL parser, ensuring that you also provide an indexed lookup keyed to the underlying data for anything you insert into the cache, and then cross-referencing the datasets for any DML you apply to the database. Of course, this will be a lot simpler if you create a simplified, abstract syntax to represent the DML, though you thereby lose the flexibility of SQL and, of course, have to re-implement any legacy code using your new syntax. Apart from fixing the existing code, it should only take a year or two to get this working right. Basing your syntax on MySQL's handler API rather than SQL will probably save a lot of pain later in the project.
Of course, if you need full cache consistency, then you need to ensure that a logical transaction now spans all the relevant datacentres, which will have something of an adverse impact on your performance (certainly much slower than just referencing the master directly).
For a company like Facebook, with hundreds of thousands of servers and terabytes of data (and no requirement for cache consistency), such an approach to solving the problem leads to massive savings. If you only have 2 servers, a better solution would be to switch to multi-master replication, possibly add another database node, optimize the storage (e.g. switching to SSDs / adding fast bcache), make sure you have session affinity to the DBMS from the application (but not sticky sessions), and spend some time tuning your DBMS, particularly its cache performance.
Last week I noticed after a crash that my MySQL log file had become so large that it consumed the disk (not a massive disk). I recently implemented a new helpdesk/ticketing system, which was adopted by the entire company much quicker than anticipated, hence a log file that is 99% SELECTs.
So my question is this: can I retain MySQL logging but exclude SELECT statements? Furthermore, can I keep SELECT statements but exclude certain databases (i.e. helpdesk)?
Thanks for any response
You can't restrict the MySQL general log to certain databases or to certain kinds of statements. It logs everything executed on your MySQL server, and of course that is an overhead on a MySQL server in a production environment.
I suggest you turn off the general log on the production server and enable the slow query log with appropriate settings, so that only the problematic queries that need attention are logged; later you can optimize those queries to achieve better MySQL performance.
If you still need the general log enabled, then make sure a logrotate script is used for the general log file, which will keep its size within a certain limit.
http://www.thegeekstuff.com/2010/07/logrotate-examples/
I am researching the possibility to log all the changes made to a MySQL database including DDL statements that may occur and use that information so it can be synchronized with a remote database.
The application itself is written in C# so the best synchronization technology that I have seen so far to be available is Microsoft Sync Framework. This framework itself proposes a solution to track changes made to the DB by adding triggers and additional tables to store the deleted rows.
This does not seem to be a great idea in my case, since it involves changing the schema of a standard DB used by more than 4 products. This method also effectively doubles the number of tables (by adding a new table for the deleted rows of each table), which does not feel too good either.
On the other hand, MySQL has this great thing, the binlog, which tracks all the changes. It can use the so-called mixed mode, which logs statements in most cases (so they can be executed again on the remote DB to replicate the data) and the raw row data when a statement is unsafe to replay, for example when it calls a non-deterministic function such as UUID(), so that the updated data is the same in both places.
Also there seems to be 2 standard ways to retrieve this data:
1) The mysqlbinlog utility
2) Calling 'SHOW BINLOG EVENTS'
Option 2 seems better to me, since it does not require calling an external application or running an application on the DB machine, BUT it does not include the actual data for the logged ROW-format events (only stuff like: table_id: 47 flags: STMT_END_F, which tells me nothing).
So finally my questions are:
Is there a better way to track the changes made to a MySQL DB without changing the whole structure and adding a ton of triggers and tables? I can change the product to log its changes too, but then we would have to change all the products using this DB to be sure we log everything ... and I think it's almost impossible to convince everyone.
Can I get all the information about the changes made using SHOW BINLOG EVENTS? Including the ROW data.
P.S. I researched MySQL Proxy too, but the problem in logging statements in all cases is that the actual data in non deterministic functions is not included.
Option 3 would be to parse the bin log yourself from within your app - that way you get total control of how often you check etc, and you can see all the statements with the actual values used.