SQL Server 2008 R2 - Merge Replication Identity Management Issue

Accidental DBA here. I am managing a merge replication setup, and automatic identity range management is not handing out new ranges; I constantly have to reseed or run sp_adjustpublisheridentityrange. I have noticed the following oddities:
When running DBCC CHECKIDENT([tablename]), the current identity value is a lot smaller than the current column value.
When running DBCC CHECKIDENT([tablename]), sometimes the current identity value is reset to under 100 even though the current column value is 500,000 plus, and it will do this more than once with the same number.
The MSarticles table in the distribution database is empty, as is MSrepl_identity_range (I kind of thought that one should have the ranges in it, no?).
Sometimes it's the application code that throws the error that the identity range is incorrect, and a simple refresh fixes it.
The sync runs every 4 hours (which seems a bit long to me, as I was used to running the sync more often at my last job).
I was thinking I could run sp_adjustpublisheridentityrange in a SQL Agent job every hour in between syncs to fix this, but that seems too easy.
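To make that concrete, here is roughly what I run by hand today; the table and publication names below are placeholders, not my real ones, and the Agent-job idea would just wrap the same EXEC in an hourly job step:

    -- Report the current identity seed without changing anything (NORESEED = report only)
    DBCC CHECKIDENT ('dbo.MyMergeTable', NORESEED);

    -- Run at the publisher when a range has run out: allocate new identity
    -- ranges for the articles in the publication
    EXEC sp_adjustpublisheridentityrange @publication = N'MyPublication';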
Any ideas? Thanks in advance.
Also, if anyone knows a really great book on replication, troubleshooting replication, and fixing replication, that would be helpful.

Related

MySQL AutoIncrement causes slow MySQL connection

This may be an open-ended question, which could have many answers, but I will ask anyway...
I am in the process of moving our MySQL databases to AWS while developing a new web app. For now I try to keep the current MySQL PROD updated every time I change something in the DEV AWS system, so when I sync DEV with PROD weekly, the table structures stay consistent.
I am also using the Sequelize Node.js package, which by default works best with an id field when hooking tables/models to one another, and I would rather avoid writing raw SQL statements. Therefore I needed to create an id field on an existing table in place of the previous primary key, which consisted of 3 columns. When I add the id column, make it the primary key with auto-increment, and remove the other 3 columns from the primary key, the server opens very slowly in MS Access, which is what the company currently uses while I develop the web application and we slowly move away from the current database.
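Roughly the change I made, sketched with a made-up table name ('orders') since the real schema doesn't matter here:

    -- Illustrative only: drop the old 3-column primary key and add a surrogate
    -- AUTO_INCREMENT id as the new primary key. The three old key columns remain
    -- in the table as ordinary columns, but they no longer have the primary key index.
    ALTER TABLE orders
        DROP PRIMARY KEY,
        ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;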
Why would my change cause this slowness? And how can I keep the id field as an auto-increment field without causing MySQL to open up slowly? I have verified that this change is the cause by undoing what I did, after which MS Access can connect to the db in 1-2 minutes again.
Apparently, once MS Access starts and finally connects to MySQL, everything runs fine afterwards; it is the initial connection that now takes around 15 minutes, where before it was taking 1-2 minutes.
I have done a lot of research, focused more on MS Access and MySQL configuration, with no luck, rather than on what it was that I did that could have caused this, which I do not quite understand.
Thanks in advance.

AWS DMS to replicate transactional data to data warehouse on an ongoing basis

I'm hoping someone can tell me if I'm absolutely crazy before I go too far down this path. I have an application with MySQL as the backend. I needed to create more robust reporting, and I opted to build a data warehouse in pgsql. The challenge is that I don't want the DW to be updated just once or twice a day; I'd like it to be near real time (some lag is expected and not a problem).
I looked at AWS Glue and a few other options and finally settled on DMS as a method of replicating the data from the MySQL source to the pgsql target db for staging. I then set up trigger functions that basically manipulate the inserted/updated data in the pgsql db, landing it in the data warehouse. The application is also connected to the DW and can pull reports and dashboard metrics from it as needed.
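For context, the trigger functions look something like this, heavily simplified; the staging and warehouse table/column names are placeholders rather than my actual schema, the upsert assumes PostgreSQL 9.5+ for ON CONFLICT, and it assumes a unique constraint on the warehouse key:

    -- Simplified sketch of one staging trigger (placeholder names throughout)
    CREATE OR REPLACE FUNCTION staging.load_orders_to_dw()
    RETURNS trigger AS $$
    BEGIN
        -- Upsert the changed row into the warehouse fact table
        -- (assumes a unique constraint on dw.fact_orders(order_id))
        INSERT INTO dw.fact_orders (order_id, customer_id, amount, updated_at)
        VALUES (NEW.id, NEW.customer_id, NEW.amount, now())
        ON CONFLICT (order_id)
        DO UPDATE SET customer_id = EXCLUDED.customer_id,
                      amount      = EXCLUDED.amount,
                      updated_at  = EXCLUDED.updated_at;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER trg_orders_to_dw
    AFTER INSERT OR UPDATE ON staging.orders
    FOR EACH ROW EXECUTE PROCEDURE staging.load_orders_to_dw();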
I've built a proof of concept and it seems to work, but it's really only me hitting the application at the moment, so I'm not sure if it will hold up if I were to proceed with this idea and put it in production.
I currently have a dms.t2.small replication instance (engine version 2.4.4) running at about 15-20% CPU utilization. I don't have it configured for Multi AZ currently.
I'm seeing combined CDCLatencyTarget/CDCLatencySource values averaging about 9 seconds. I think if that holds true it wouldn't be unbearable, although the less time the better. I'd say if it gets up over a minute we may start to see complaints.
I know that DMS is more meant for migrations, so I'd like to know if I'm just doing this in a really stupid way, or if this is a more or less valid use case? Are there issues with DMS that I am unaware of that will cause me to later regret this decision?
Also, I'd love any ideas you have for safeguards I can put in place to ensure that the source and target stay synced, or that at least make me aware of it when they don't, or something that would allow it to self-heal.
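One crude safeguard I've been toying with is a periodic row-count comparison run on the PostgreSQL side; everything below is hypothetical (the source_counts table would be populated on a schedule with counts pulled from the MySQL source, and staging.orders is a placeholder table):

    -- Hypothetical drift check: compare a recently captured source row count
    -- against the current staging row count and flag any large difference.
    SELECT s.table_name,
           s.row_count AS source_rows,
           (SELECT count(*) FROM staging.orders) AS target_rows
    FROM   source_counts s
    WHERE  s.table_name = 'orders'
      AND  abs(s.row_count - (SELECT count(*) FROM staging.orders)) > 10;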

RDS Read Replica Considerations

We hired an intern and want to let him play around with our data to generate useful reports. Currently we just took a database snapshot and created a new RDS instance that we gave him access to. But that is out of date almost immediately due to changes on the production database.
What we'd like is a live (or close-to-live) mirror of our actual database that we can give him access to without worrying about him modifying any real data or accidentally bringing down our production database (e.g. by running a silly query like SELECT * FROM ourbigtable or a really slow join).
Would a read replica be suitable for this purpose? It looks like it would at least stay up to date, but I'm not clear on what would happen if a read replica went down, what would happen if data was accidentally changed on it, or what other potential liabilities there are.
The only thing I could find related to this was this SO question, and this has me a bit worried (emphasis mine):
If you're trying to pre-calculate a lot of data and otherwise modify what's on the read replica you need to be really careful you're not changing data -- if the read is no longer consistent then you're in trouble :)
TL;DR Don't do it unless you really know what you're doing and you understand all the ramifications.
And bluntly, MySQL replication can be quirky in my experience, so even knowing what is supposed to happen and what does happen if there's a conflict as the master tries to write updated data to a slave you've also updated.... who knows.
Is there any risk to the production database if we let an intern have at it on an unreferenced read replica?
We've been running read-replicas of our production databases for a couple of years now without any significant issues. All of our sales, marketing, etc. people who need the ability to run queries are given access to the replica. It has worked quite well and has been stable for the most part. The production databases are locked down so that only our applications can connect to them, and the read-replicas are accessible only via SSL from our office. Setting up the security is pretty important, since you would be creating all the user accounts on the master database and they'd then get replicated to the read-replica.
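To give a rough idea, the reporting accounts are just ordinary read-only users created on the master; the user and database names below are made up for illustration, and depending on your MySQL version you can additionally require SSL for the account:

    -- Hypothetical reporting user, created on the master; the account definition
    -- replicates to the read-replica, which is where these users actually connect.
    CREATE USER 'reporting_user'@'%' IDENTIFIED BY 'a-strong-password';
    GRANT SELECT ON ourappdb.* TO 'reporting_user'@'%';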
I think we once saw a read-replica get into a bad state due to a hardware-related issue. The great thing about read-replicas, though, is that you can simply terminate one and create a new one any time you want or need to. As long as the new replica has the exact same instance name as the old one, its DNS, etc. will remain unchanged, so aside from being briefly unavailable everything should be pretty much transparent to the end users. Once or twice we've also simply rebooted a stuck read-replica and it was eventually able to catch up on its own.
There's no way that data on the read-replica can be updated by any method other than processing commands sent from the master database. RDS simply won't allow you to run something like an insert, update, etc. on a read-replica no matter what permissions the user has. So you don't need to worry about data changing on the read-replica causing things to get out of sync with the master.
Occasionally the replica can get a bit behind the production database if somebody submits a long running query, but it typically catches back up fairly quickly once the query completes. In all our production environments we have a few monitors set up to keep an eye on replication and to also check for long running queries. We make use of the pmp-check-mysql-replication-delay command in the Percona Toolkit for MySQL to keep an eye on replication. It's run every few minutes via Nagios. We also have a custom script that's run via cron that checks for long running queries. It basically parses the output of the "SHOW FULL PROCESSLIST" command and sends out an e-mail if a query has been running for a long period of time along with the username of the person running it and the command to kill the query if we decide we need to.
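If it helps, the long-running-query check conceptually boils down to a query like the one below; the 10-minute threshold and the excluded accounts are just examples, and the e-mail and kill plumbing live in the surrounding script:

    -- Roughly what our custom check looks for: any active statement that has
    -- been running on the replica for longer than 10 minutes.
    SELECT id, user, host, db, time AS seconds_running, LEFT(info, 100) AS query_snippet
    FROM   information_schema.processlist
    WHERE  command <> 'Sleep'
      AND  user NOT IN ('rdsadmin', 'system user')  -- assumed RDS/replication system accounts
      AND  time > 600
    ORDER  BY time DESC;

    -- Replication delay itself can also be eyeballed on the replica with
    -- SHOW SLAVE STATUS (look at Seconds_Behind_Master).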
With those checks in place we've had very little problem with the read-replicas.
MySQL replication works in such a way that what happens on the slave has no effect on the master.
A replication slave asks for the history of events that happened on the master and applies them locally. The master never writes anything on the slaves: the slaves read from the master and do the writing themselves. If a slave fails to apply the events it read from the master, it will stop with an error.
The problematic part of this style of data replication is that if you modify the slave and later modify the master, you might end up with a different value on the slave than on the master. This can be avoided by turning on the global read_only variable.
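For example, on a self-managed slave (RDS read replicas already have this set for you through the instance's read_only parameter):

    -- Block writes from ordinary accounts on the slave; note that accounts with
    -- the SUPER privilege can still write, so grant it sparingly.
    SET GLOBAL read_only = ON;

    -- Verify the setting
    SELECT @@global.read_only;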

Is there a definitive answer to what causes catastrophic failure in Delphi?

I have read a few/lots of things on this but they don't seem to help much.
I have an app (it's called "TieUp", but that is irrelevant) that I run manually every day to collate data from several locations.
It is using as sources:
A) Data from a remote SOAP source and loaded into an in-memory TClientDataset via an XMLtransform setup.
B) CSV files downloaded daily and loaded into an in-memory TClientDataset
C) A MySQL database on the same computer as the program (it's a restored backup of the live source)
D) A remote MS-SQL (SQL Server 2008) database
E) A MySQL database on a remote server
Data is only read from sources A, B, C and D
Data source E is updated with the consolidated data.
There are between 800 and 2,000 records daily, so the datasets are not vast, although the target (E) has grown to around 150,000 records and is increasing daily.
I can normally run this all happily and everything works as expected (if a little slowly, because of all the individual remote lookups to the MS-SQL system), but some days it really screws up and the error is always "Catastrophic Failure!".
The failure does not occur during any particular phase or operation that I can see. The steps are:
1) Get the SOAP(A) data first.
2) Tie in with CSV/In Memory data(B).
3) Lookup References data on Sources C and D to collate
4) Write the consolidated data to source E
After reading the data into the in-memory datasets, everything is in TClientDatasets accessed via DataSetProviders linked to TSQLQueries (they are all on the same servers currently, but I did it that way to keep some flexibility for a future where it might go to a true three-tier setup). All queries are contained within the TSQLQuery components, as they are actually quite simple - it's just a matter of tying things together.
I am using completely standard components from Delphi 2009 Enterprise. All updates and database update packs have been applied. Each data source has its own DataModule, and these are auto-created at startup.
There is obviously quite a lot of data access going on here, but when it crashes (with catastrophic failure) it gets stuck, completely stuck. Windows can't end the task from the normal "TieUp has stopped working" dialog; I have to go to the process and kill it.
There is so much going on and as this only happens once a week or so I really don't know where to start looking.
The reason for asking the question is twofold: 1) I am trying to eliminate any manual steps and fully automate this, but I can't rely on it if it bombs every week or so; 2) if it happens during the update phase to E, I have to manually delete the new records for the day and start again, as I do not have (or haven't written yet) a mechanism to restart from an arbitrary point, and I would still have to query the DB manually to establish that point for certain.
My next step is to install Delphi on another computer and always run it under the debugger until I can catch it, if it does not freeze first. But that introduces yet another network connection (instead of the localhost one).
So: "Is there a definite answer?" or what is the most likely offending component/connection? Where is the favoured place to start looking?
Thanks in advance...

MySQL Table Sync

I am looking for suggestions on the best way to sync MySQL tables (MyISAM) between 2 different databases.
Currently we use Navicat to sync tables from our production server to our test server, but we have been running into many problems. Just about every day we run into a sync failure on a table.
We get the error below a lot of the time, not to mention that Navicat spams our e-mail with both successful and unsuccessful syncs (is there any way to receive only the unsuccessful ones?). I also know that altering a table in any way will cause a failure to sync, so any alteration must be done on the master first (this makes sense, but is there any way around it?).
-[Sync] Finished - Unsuccessful Synchronization: List index out of bounds (0)
Is there any reason not to use the Navicat sync? My boss suggested using MySQL replication instead, but my first concern is finding out why we have so many problems, because it seems like we are just misusing the sync tool and that is what is giving us all these problems.
Thanks.
sync tables from our production server to our test server
It sounds like you're trying to replicate your production environment in your test environment, right?
A common pattern to follow in this situation is using a tool like mysqldump to create a backup of the entire database, then later import the backup into the test environment. By doing a complete backup and restore, you're not only ensuring that you have at least one backup method that's known to work, you're also ensuring that the test database can never contain modifications that a sync tool might miss. (Sync tools generally require a primary or unique key on each table to operate effectively.)
The backup and reimport process should be an easy thing for you to automate. At my workplace, we perform a mysqldump-based database dump every night, and perform optional imports into each developer's personal copy of the dev environment early the following morning.