Alternative to a timestamp column for syncing purposes - MySQL

I have a local MySQL db and a production MySQL db. I make changes locally and use a third-party tool to sync them to the live server. The tool uses a checksum feature to identify the rows that changed. My db structure is simple: one VARCHAR(200) field (which acts as the primary key) and a TEXT field.
The problem is that the sync takes ages since there are thousands of rows. I believe adding a timestamp field would help the tool compute checksums quickly and identify the rows to be synced, but this created further problems: the timestamp field differs between the local and prod servers because of the timezone difference.
I am looking for a useful idea, or an alternative to a timestamp, that gets changed whenever a row is modified.
PS: I posted a similar question but didn't get any useful answers. I don't want to rely on additional tables.

My tip: don't use the TIMESTAMP datatype; use DATETIME. They hold the same kind of data, but the difference is that TIMESTAMP is updated every time you touch the row: even if you don't set that column explicitly, it gets set to "now", including on insert.
This means that when you use TIMESTAMP, you can never truly sync the two databases - that column will always be different. If you use DATETIME, you can preserve that column's data.
If you can't change your applications to update the DATETIME column with "now", simply create a trigger that will do it for you.
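A minimal sketch of such triggers, assuming a table named mytable with a DATETIME column named last_modified (both names are illustrative):

-- Keep a DATETIME column current on both insert and update.
CREATE TRIGGER mytable_touch_ins BEFORE INSERT ON mytable
FOR EACH ROW SET NEW.last_modified = NOW();

CREATE TRIGGER mytable_touch_upd BEFORE UPDATE ON mytable
FOR EACH ROW SET NEW.last_modified = NOW();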

You could do several things:
Add a "dirty" column to the source table. Make it a single BIT that you flip when the row is changed and flip back when it gets synced. If the row id is a primary key, this is a simple INSERT ... ON DUPLICATE KEY UPDATE (see the sketch after this list).
Store all your times as GMT, so there is no more fighting over time zones. This is standard practice anywhere time is stored anyway.
Set up replication between the two servers so MySQL does the copying and updating for you. This is precisely what it is designed for, and it works well.
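For the first suggestion, a rough sketch of the dirty flag, assuming the key/text schema described in the question (table and column names are illustrative):

-- Add the flag; new and changed rows start out dirty.
ALTER TABLE mytable ADD COLUMN dirty BIT NOT NULL DEFAULT 1;

-- Upsert that marks the row dirty whether it is inserted or updated.
INSERT INTO mytable (id, content, dirty)
VALUES ('some-key', 'new text', 1)
ON DUPLICATE KEY UPDATE content = VALUES(content), dirty = 1;

-- After a successful sync, clear the flags.
UPDATE mytable SET dirty = 0 WHERE dirty = 1;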

Timestamping the rows is in fact a bad idea: by the time the client records its own latest sync time, many rows may have been updated on the server, and you would miss those. Instead, use a counter that increases by one every time a row is added or modified on the server; the client syncs and stores the latest counter value it has seen. The client may not catch the very latest value (e.g. a row gets updated while the client is requesting the update), but it is guaranteed to catch up at the next sync.
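A rough MySQL sketch of this counter approach; mytable, rev, and sync_counter are illustrative names, and a matching BEFORE INSERT trigger would be needed as well:

-- One global counter row, bumped on every modification.
CREATE TABLE sync_counter (id INT PRIMARY KEY, value BIGINT NOT NULL);
INSERT INTO sync_counter VALUES (1, 0);

ALTER TABLE mytable ADD COLUMN rev BIGINT NOT NULL DEFAULT 0;

DELIMITER //
CREATE TRIGGER mytable_rev BEFORE UPDATE ON mytable
FOR EACH ROW
BEGIN
  UPDATE sync_counter SET value = value + 1 WHERE id = 1;
  SET NEW.rev = (SELECT value FROM sync_counter WHERE id = 1);
END//
DELIMITER ;

-- The client remembers the highest rev it has seen and asks for anything newer:
SELECT * FROM mytable WHERE rev > 1234;  -- 1234 = client's last-seen counter value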

Subscribe to changes in MySQL database without polling

I have a MySQL database that is updated by a different application, and I want to subscribe to it for changes from my node.js server. Is it possible to monitor the database for updates without long-polling all the rows/columns for changes to their values?
One potential solution I have seen is to use Redis to subscribe to the database to listen for changes, and then have it inform my client (which will be my server in this case). How do I subscribe Redis to a MySQL database, if this is possible?
Could you not just add an updated column to your tables?
You could add ON UPDATE CURRENT_TIMESTAMP to the column, which would automatically store the current time every time the row is updated. The rule is applied in the database itself, so you don't need to update any other clients that use the database; it works automatically.
Any client can then make queries based on the last time it checked for updates. You just need to SELECT rows based on the updated field.
You're only checking one column that way, and it's quite a fast query.
You could index the datetime field too, which would probably make the queries very fast indeed.
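A minimal sketch of that column and query, assuming a table named mytable (names are illustrative):

-- Auto-maintained change timestamp, indexed for fast range scans.
ALTER TABLE mytable
  ADD COLUMN updated TIMESTAMP NOT NULL
      DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  ADD INDEX idx_updated (updated);

-- The client asks for everything changed since its last check.
SELECT * FROM mytable WHERE updated > '2013-06-01 12:00:00';  -- last check time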

SQL: insert row with expiration

Is there a way to insert a row into SQL with an expiration (cf. Memcached, where you can insert a new key that expires in a minute)?
The context is that I want an integration test to insert rows into a database, but I'd prefer not to delete them myself, as the database is shared by many. Those delete queries would have to be manual, so they might not get run, or they might have disastrous typos, etc. I'd prefer the system to do it for me if it can (i.e. automatically, efficiently, and well-tested).
(I assume this is not part of the SQL standard and the answer is no.)
related: SQL entries that expire after 24 hours
related: What is the best way to delete old rows from MySQL on a rolling basis?
CONTEXT: I can't make any changes to the database schema, or any of the associated infrastructure.
If you were doing unit testing, I would suggest wrapping each unit test in a BEGIN TRAN / ROLLBACK.
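A minimal T-SQL sketch of that pattern (table and values are illustrative):

BEGIN TRAN;
-- run the test's inserts and assertions here
INSERT INTO dbo.TestData (Id, Payload) VALUES (42, 'integration-test row');
-- ... assertions ...
ROLLBACK;  -- nothing persists after the test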
Since you are doing integration testing, you probably need the data to live outside the scope of a single transaction. SQL Agent would work fine here, except that it would not distinguish between test data and real data. However, you could get around this by INSERTing some identifier into the specific records to be deleted upon expiration. That could all be done in a single stored proc.
You might be able to accomplish this by using SQL Server Service Broker. I have not worked with the service broker, but maybe there is a way to delay message processing until a specific time has passed.
Add an expiration date column to your table(s), and create a job that deletes data past its expiration on some schedule (say, nightly).
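A rough T-SQL sketch of that approach; dbo.TestData and the column names are illustrative:

-- Expiration column: test rows get a deadline, real rows leave it NULL.
ALTER TABLE dbo.TestData ADD ExpiresAt datetime2 NULL;

INSERT INTO dbo.TestData (Id, Payload, ExpiresAt)
VALUES (42, 'integration-test row', DATEADD(MINUTE, 10, SYSDATETIME()));

-- Scheduled cleanup, e.g. from a nightly SQL Agent job:
DELETE FROM dbo.TestData
WHERE ExpiresAt IS NOT NULL AND ExpiresAt < SYSDATETIME();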

Can I use a "last update" timestamp to select MySQL records for update?

I have a MySQL database with about 30,000 rows. I push this database's updates to a remote server nightly, but no more than 50 rows are ever updated at a time. I am the only one who updates the database. I would like to develop a method in which only CHANGED rows are exported to the remote server.
To save space in the database and to save time when I export to the remote server, I have built "archive" tables (on the remote server) with records that will no longer be updated and which do not reside in the local database. But I know splitting this data up into multiple tables is bad design that could lead to problems if the structure of the tables ever needs to change.
So I would like to rebuild the database so that ALL the records with similar table structures are in a single table (as they were when the database was much smaller). The resulting table (with all archived records) would exceed 80,000 rows, much too large to export as a whole-database package.
To do this, I would like to
(1) Update a "last updated" timestamp in each row when the row is added or modified
(2) Select only rows in tables for export when their "last update" timestamp is greater than the timestamp of the last export operation
(3) Write a query that builds the export .sql file with only new and updated rows
(4) Update the timestamp for the export operation to be used for comparison during the next export
Has anyone ever done this? If so, I would be grateful for some guidance on how to accomplish this.
Steve
If you add a column with the timestamp datatype, for example last_updated timestamp, it can be updated to now() automatically every time the row changes (older MySQL versions do this by default for the first TIMESTAMP column; otherwise declare it with ON UPDATE CURRENT_TIMESTAMP).
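A minimal sketch of adding that column, assuming a table named mytable:

-- Auto-updating change timestamp; MySQL maintains it on INSERT and UPDATE.
ALTER TABLE mytable
  ADD COLUMN last_updated TIMESTAMP NOT NULL
      DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;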
Then early every day, simply ship yesterday's changes:
select * from mytable
where last_updated >= subdate(CURDATE(), 1)
  and last_updated < CURDATE()  -- half-open range; between would catch midnight rows twice
Why not just set up the remote server as a replication slave? MySQL will only send the updated rows in that situation, and very quickly and efficiently at that.
Using an official replication strategy is generally advisable rather than rolling your own. You'll have lots of examples to work from and lots of people who understand what's going on if you run into problems.
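If you go the replication route, the slave is pointed at the master with something like the following (host, credentials, and binlog coordinates are placeholders, and the master needs binary logging and a unique server-id configured):

CHANGE MASTER TO
  MASTER_HOST = 'master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 4;
START SLAVE;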

SQL Server 2008 - How to implement a "Watch Dog Service" which woofs when too many insert statements on a table

As my title describes: how can I implement something like a watchdog service in SQL Server 2008 with the following task: alerting or taking an action when too many inserts are committed on a table.
For instance: in a normal situation the error table gets 10 error messages in one second. If there are more than 100 error messages (100 inserts) in one second, then: ALERT!
Would appreciate it if you could help me.
P.S.: No, SQL Jobs are not an option, because the watchdog should be live and woof on the fly :-)
Integration Services? Are there easier ways to implement such a service?
Kind regards,
Sani
I don't understand your problem exactly, so I'm not entirely sure whether my answer actually solves anything or just makes an underlying problem worse. Especially if you are facing performance or concurrency problems, this may not work.
If you can update the original table, just add a datetime2 field like
InsertDate datetime2 NOT NULL DEFAULT GETDATE()
Preferably, put an index on the column, and then, at whatever interval fits, poll the table by seeing how many rows have an InsertDate > GETDATE() - X.
For this particular case, you might benefit from making the polling process read uncommitted (or use WITH NOLOCK), although one has to be careful when doing so.
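A rough sketch of such a polling query; dbo.ErrorTable and the one-second window are illustrative:

-- Count inserts in the last second; NOLOCK accepts dirty reads for a rough rate check.
SELECT COUNT(*) AS recent_inserts
FROM dbo.ErrorTable WITH (NOLOCK)
WHERE InsertDate > DATEADD(SECOND, -1, SYSDATETIME());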
If you can't modify the table itself and you can't or won't make another process or job monitor the relevant variables, I'd suggest the following:
Make a 'counter' table that just has one datetime2 column.
On the original table, create an AFTER INSERT trigger that (a rough sketch follows these steps):
Deletes all rows where the datetime field is older than X seconds.
Inserts one row with the current time.
Counts to see whether too many rows are now present in the counter table.
Acts if necessary - i.e. by executing a procedure that will signal a listener/throw an exception/send mail/whatever.
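A rough T-SQL sketch of those steps; ErrorLog, InsertCounter, and dbo.RaiseWatchdogAlert are illustrative names:

CREATE TABLE InsertCounter (InsertedAt datetime2 NOT NULL);
GO
CREATE TRIGGER trg_ErrorLog_Watchdog ON ErrorLog AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- 1. drop entries older than the one-second window
    DELETE FROM InsertCounter
    WHERE InsertedAt < DATEADD(SECOND, -1, SYSDATETIME());
    -- 2. record one entry per inserted row
    INSERT INTO InsertCounter (InsertedAt)
    SELECT SYSDATETIME() FROM inserted;
    -- 3./4. count and act if the threshold is exceeded
    IF (SELECT COUNT(*) FROM InsertCounter) > 100
        EXEC dbo.RaiseWatchdogAlert;  -- illustrative alerting procedure
END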
If you can modify the original table, add the datetime column to that table instead and make the trigger count all rows that aren't yet X seconds old, and act if necessary.
I would also look into getting another process (i.e. a SQL Agent job or a homemade service) to do all the housekeeping: deleting old rows, counting rows, and acting on the count. Keeping this as the work of the trigger is not a good design and will probably cause problems in the long run, so if at all possible, have some other process do the housekeeping.
Update: a better solution would probably be to have the trigger insert notifications (i.e. datetimes) into a queue; if you then have something listening on that queue, you can write logic there to determine whether your threshold has been exceeded. However, that requires moving some of your logic to another process, which I initially understood was not an option.

MySQL table modified timestamp

I have a test server that uses data from a test database. When I'm done testing, it gets moved to the live database.
The problem is, I have other projects that rely on the data now in production, so I have to run a script that grabs the data from the tables I need, deletes the data in the test DB and inserts the data from the live DB.
I have been trying to figure out a way to improve this model. The problem isn't so much in the migration, since the data only gets updated once or twice a week (without any action on my part). The problem is having the migration take place only when it needs to. I would like to have my migration script include a quick check against the live tables and the test tables and, if need be, make the move. If there haven't been updates, the script quits.
This way, I can include the update script in my other scripts and not have to worry if the data is in sync.
I can't use timestamps. For one, I have no control over the tables on the live side once they go live, and it also seems a bit silly to bulk up the tables merely for convenience.
I tried doing a "SHOW TABLE STATUS FROM livedb" but because the tables are all InnoDB, there is no "Update Time", plus, it appears that the "Create Time" was this morning, leading me to believe that the database is backed up and re-created daily.
Is there any other property in the table that would show which of the two is newer? A "Newest Row Date" perhaps?
In short: Make the development-live updating first-class in your application. Instead of depending on the database engine to supply you with the necessary information to enable you to make a decision (to update or not to update ... that is the question), just implement it as part of your application. Otherwise, you're trying to fit a round peg into a square hole.
Without knowing what your data model is, and without understanding at all what your synchronization model is, you have a few options:
Match primary keys between the live database and the test database. When the test IDs are greater than the live IDs, do an update.
Use timestamps in a table to determine if it needs to be updated
Use the md5 hash of a database table and modification date (UTC) to determine if a table has changed.
Long story short: Database synchronization is very hard. Implement a solution which is specific to your application. There is no "generic" solution which will work ideally.
If you have an autoincrement in your tables, you could compare the maximum autoincrement values to see if they're different.
But which version of MySQL are you using?
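A couple of quick comparisons along those lines (schema and table names are illustrative):

-- Compare AUTO_INCREMENT counters between the two schemas:
SELECT TABLE_NAME, AUTO_INCREMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA IN ('livedb', 'testdb');

-- Or compare whole-table checksums:
CHECKSUM TABLE livedb.mytable, testdb.mytable;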
Rather than rolling your own, you could use a preexisting solution for keeping databases in sync. I've heard good things about SQLYog's SJA (see here). I've never used it myself, but I've been very impressed with their other programs.