MySQL group replication multi-primary mode with multiple bootstrapped servers

I have 7 MySQL servers in different locations. All servers have the same database with the same structure. All tables use UUID-based primary keys (no auto-increment values).
1 (central) server is always connected to the network (internet).
The other 6 servers can connect to and disconnect from the network at any time.
All 6 servers must be able to work individually (read/write) and locally when not connected to the internet.
They must replicate with each other when connected to the network.
Once all databases are completely replicated, all databases must have the same data (including the main server).
I referred to one server as the main server here, but there is really no main server. It acts as the main server when the other 6 are not connected, because the head office uses it to query past reports.
I have read about MySQL group replication (multi-primary mode). Is it possible to use it for my requirement? Please advise if someone already has experience with this.

Group replication assumes all servers will contain the same data, and when you join a new server it will fetch from the group the data it is missing.
However, if the server has more data than the group, it won't be able to join.
So, in theory, your setup will only work if these 6 servers don't receive writes and diverge while "offline": if they do, you can no longer add them back to the group (without extra reconciliation operations).
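For illustration, here is a minimal sketch of bootstrapping and joining a group, assuming MySQL 8.0 with the group replication plugin installed and the group name/seeds already configured in my.cnf:

    -- On the first (seed) member only: bootstrap the group.
    SET GLOBAL group_replication_bootstrap_group = ON;
    START GROUP_REPLICATION;
    SET GLOBAL group_replication_bootstrap_group = OFF;

    -- On every other member: join, and let distributed recovery fetch
    -- whatever transactions this member is missing.
    START GROUP_REPLICATION;

A joining member whose GTID set contains transactions the group has never seen is refused, which is exactly why offline writes on the 6 satellite servers break this setup.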

Related

Sync multiple local databases to one remote database

I need to create a system with local webservers on Raspberry Pi 4 running Laravel for API calls, websockets, etc. Each RPi will be installed at multiple customers' places.
For this project I want the ability to save/sync the database to a remote server (when the local system is connected to the internet).
Multiple locale databases => one remote, customer-based database
The question is how to synchronize the databases, properly identify each customer's data, and render it in a shared remote dashboard.
My first thought was to set a customer_id or a team_id on each table, but it seems dirty.
The other way is to create multiple databases on the remote server for the synchronization, plus one extra database to hold customer ids and database connection information...
Has someone already experimented with something like that? Is there a reliable and clean way to do this?
You refer to locale but I am assuming you mean local.
From what you have said, you have two options at the central site. The central database can either store information from the remote databases in a single table with an additional column that indicates which remote site it came from, or you can set up a separate table (or database) for each remote site.
How do you want to use the data?
If you only ever want to work with the data from one remote site at a time, it doesn't really matter: in both scenarios you need to identify which data you want to work with and build your SQL statement to either filter by the appropriate column or direct it to the appropriate table(s).
If you want to work on data from multiple remote sites at the same time, then using different tables requires UNION queries to extract the data, and this is unlikely to scale well. In that case you would be better off using a column to mark each record with the remote site it references, as in the sketch below.
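As a hypothetical illustration (the table and column names are mine, not from the question), compare the two shapes:

    -- Per-site tables force UNION queries that grow with every new site:
    SELECT reading, recorded_at FROM site_a_readings
    UNION ALL
    SELECT reading, recorded_at FROM site_b_readings;

    -- One shared table with a site column scales without query changes:
    SELECT reading, recorded_at
    FROM readings
    WHERE site_id IN ('site_a', 'site_b');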
I recommend that you consider using UUIDs as primary keys - it may be that key collision will not be an issue in your scenario, but if it becomes one, trying to alter the design retrospectively is likely to be quite a bit of work.
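A minimal sketch of such a key in MySQL, assuming version 8.0.13+ for expression defaults (names are illustrative):

    CREATE TABLE orders (
        -- random UUID stored compactly; no collisions between sites
        id BINARY(16) NOT NULL DEFAULT (UUID_TO_BIN(UUID())),
        site_id VARCHAR(16) NOT NULL,   -- which remote site the row came from
        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (id)
    );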
You also asked about how to synchronize the databases. That will depend on what type of connection you have between the sites and the capabilities of your software, but typically you would have the local system periodically talk to a webservice at the central site. Assuming you are collecting sensor data or some such, the dialogue would be something like:
Client - Hello Server, my last sensor reading is timestamped xxxx
Server - Hello Client, [ send me sensor readings from yyyy | I don't need any data ]
You can include things like a signature check (for example an MD5 sum of the records within a time period) if you want, but that may be overkill.
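In SQL terms the exchange might look like the sketch below; the table and column names are assumptions, and the webservice transport is omitted (mind group_concat_max_len if you hash large windows):

    -- Client: report its newest reading (the "timestamped xxxx" above).
    SELECT MAX(reading_time) FROM sensor_readings;

    -- Client, when the server answers "send me readings from yyyy":
    SELECT sensor_id, reading_time, value
    FROM sensor_readings
    WHERE reading_time > ?;   -- ? = yyyy

    -- Optional signature check: both sides hash the same time window
    -- and compare results.
    SELECT MD5(GROUP_CONCAT(sensor_id, reading_time, value
                            ORDER BY sensor_id, reading_time))
    FROM sensor_readings
    WHERE reading_time BETWEEN ? AND ?;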

Database replication: multiple geographical locations with local database, one main remote database

I've got a very specific use case and because I'm not too familiar with database replication, I am open to suggestions and ideas about how to accomplish the following in the best possible way:
A web application + database is running on a remote server. Let's call this set-up R for remote.
Now suppose there are 3 separate geographical locations which need read+write access to the database. I will call these locations L1, L2 and L3.
The main problem: the remote server might be unavailable or the internet connection of one of the locations might not always work, rendering the remote application unavailable; but we want the application to work as a high availability solution (on-site) even when the remote server is down or when there is an internet connection problem.
Partial solution: So I was thinking about giving each geographical location its own server with a local copy of the web application. The web application itself can get updated when needed from a version control system automatically (for example using git hooks).
So far so good... (at least I believe so?)
But what about our data? The really tricky part seems to be the database replication. Let's assume no DNS or IP failover, and assume that the user first tries to access the remote server directly; if this does not work, the user can still use the local on-site server instead. This all happens inside a web browser (or similar client).
One possible (but unsatisfactory) solution would be to use master-slave replication from R (master) to L1, L2 and L3 (slaves). Done asynchronously, this should be quite fast? I think this is a viable solution for temporary, local, read-only database access when the main server is broken or can't be accessed - something like the sketch below.
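A rough sketch of what that could look like in MySQL, assuming GTID-based replication (host, user and password are placeholders; newer versions spell this CHANGE REPLICATION SOURCE TO). The MASTER_SSL option also covers the SSL requirement mentioned further down:

    -- Run on each of L1/L2/L3, making it an asynchronous replica of R:
    CHANGE MASTER TO
        MASTER_HOST = 'r.example.com',      -- placeholder for R's address
        MASTER_USER = 'repl',               -- placeholder replication account
        MASTER_PASSWORD = 'repl-password',  -- placeholder
        MASTER_AUTO_POSITION = 1,           -- GTID auto-positioning
        MASTER_SSL = 1;                     -- encrypt replication traffic
    START SLAVE;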
But... what about read-write support? I suppose we would need multi-master replication in that case, but I am afraid that synchronous replication using something like (for example) MySQL Cluster or Galera would slow things down, especially since L1, L2 and L3 are on lower-bandwidth connections and are connected through WAN. (Also, L1, L2 or L3 might not always be online.)
The real question: How would you tackle this specific use case? At the moment I am leaning towards multi-master replication, if it doesn't slow things down too much. The application itself will mainly be used by employees on-site, but by some external people over WAN as well. Would multi-master replication work well? What if, for example, L1 is down for 24 hours and suddenly comes back online? What if R can't be accessed?
EXTRA: not my main question, but I also need the synchronized data to be sent securely over SSL; if possible, please take this into account in your answer.
Perhaps I am still forgetting some necessary details; if so, please respond with some feedback and I will try to update my question accordingly.
Please note that I haven't decided on a database yet and the database schema will be developed from scratch, so ideas using other databases or database engines are welcome as well. (At the moment I have most experience with MySQL and PostgreSQL)
As you are still undecided, I would strongly recommend that you have a look at MS-SQL merge replication. It is strong, highly reliable, replicates over LAN and HTTPS (so-called web replication), and is not that expensive.
The terminology differs from the MySQL master/slave idea. Here we are talking about one publisher and multiple subscribers. All changes made at the subscriber level are collected and sent to the publisher, then redistributed to all subscribers (with, if needed, fancy options like 'filtered subscriptions').
The standard architecture will then be:
a publisher, somewhere on a server, which collects and redistributes changes between subscribers. The publisher might not be accessed by end users.
other subscriber database servers, for either local or web access, replicating with the publisher. Subscribers are accessed by end users. (A rough sketch of the publisher side follows this list.)
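For flavour, a very rough T-SQL sketch of setting up the publisher; the names are placeholders, merge replication must already be enabled on the server, and many required options are omitted:

    -- Create a merge publication; @retention is how many days a subscriber
    -- may stay offline before it must be reinitialised.
    EXEC sp_addmergepublication
        @publication = N'SitePublication',
        @retention   = 14;

    -- Publish one table as an article of that publication.
    EXEC sp_addmergearticle
        @publication   = N'SitePublication',
        @article       = N'Orders',
        @source_object = N'Orders';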
We have been using this architecture for years, including:
one subscriber for internet access
one subscriber for intranet access
tens of subscribers for local access: some subscribers are on our construction projects, somewhere in the desert ....
Such an architecture is not available "off the shelf" with MySQL. I guess it could be built, but it would then certainly be a lot more expensive than just buying the corresponding MS-SQL licenses. Do not forget that the free SQL Express edition of MS-SQL can be a subscriber.
Be careful: if you are planning to go with such a configuration, I would (really) strongly advise you to have all primary keys set to the uniqueidentifier data type, and randomly generated. This avoids the typical replication pitfall where PKs are set to int with automatic increment, and independent servers generate identical primary keys between two replications. (MS-SQL offers a tool to avoid such problems, where you can allocate PK ranges per server, but that solution is a real PITA ...)
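A minimal T-SQL sketch of that advice (the table name is illustrative; merge replication tracks rows through a ROWGUIDCOL column, which this key doubles as):

    CREATE TABLE Orders (
        -- random GUID primary key: no collisions between offline servers
        OrderId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL
            CONSTRAINT DF_Orders_OrderId DEFAULT NEWID()
            CONSTRAINT PK_Orders PRIMARY KEY,
        CreatedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
    );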

2 tables in 2 different databases, with different structures but the same type of data, to be synced

My problem: I have a website that customers place orders on. That information goes into orders, ordersProducts, etc. tables. I have a reporting database on a DIFFERENT server, from which my staff will process the orders. The tables on that server will need the order information AND additional columns so staff can add extra information and update current information.
What is the best way to get information from the one server (order website) to the other (reporting website) efficiently and without the risk of data loss? Also, I do not want the reporting database connecting to the website to pull information; I would like to implement a solution on the order website that PUSHES data.
THOUGHTS
MySQL replication - Problem: replicated tables are strictly for reporting, not manipulation. For example, what if a customer's address changes? What if products need to be added to an order? This would mess up the replicated table.
Double inserts - Insert into the local tables and then insert into the reporting database. Problem: if for whatever reason the reporting database goes down, there is a chance I lose data, because the MySQL connection won't be able to push it. Implement some sort of query log?
Both servers use MySQL and PHP.
MySQL replication sounds exactly like what you are looking for; I'm not sure I understand what you've listed as the disadvantage there.
The solution to me sounds like a master with a read-only slave, where the slave is the reporting database. If your concern is that changes on the master would put the slave out of sync, this shouldn't be much of an issue: all changes will be synced over. In the event of a loss of connectivity, the slave tracks how many seconds it is behind the master and executes the changes until the two are back in sync.
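On the reporting slave you can watch that catch-up happening; a quick check (older MySQL syntax, later renamed SHOW REPLICA STATUS):

    -- Seconds_Behind_Master in the output shows the current lag;
    -- 0 means the reporting database has caught up.
    SHOW SLAVE STATUS\G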

MySQL Replication Question

Currently I have this scenario:
multiple desktop clients with a MySQL DB installed on their Windows machines.
they need to sync over to one server hosted on the web, for reporting purposes.
I just need a one-way sync (client to web).
the client IP is always changing, since they use standard ADSL with no fixed IP.
each client DB will sync to one standalone DB on the server (hosted on the web).
Can this syncing run on a scheduler, say every 3 hours?
I'm thinking of using MySQL replication, but I have some questions on how to set this up: shall I set it up as master to slave, or master to master?
I assume the client will be the master and the server will be the slave, since the server is only used for reporting purposes. But reading up on MySQL replication, it seems like the replication is initiated from the slave (I see settings like master-host=ip in the slave server configuration). This defeats the purpose, since the server doesn't know the client's IP...
Perhaps this is totally off the mark given some of the items you're mentioning (slave/master/etc.), but in an app I am developing I have a similar architecture, with a single source feeding multiple clients of unknown/dynamic IP. My solution was to include an extra column holding a timestamp of when each row was last updated. To sync, the clients search their local DB for the MAX of that column and send it as a variable to a webservice, which returns all rows with a more recent timestamp. The client then parses the response data and runs REPLACE INTO against its local DB, so that old data is overwritten.
One detail I did not address (as my scenario does not need it) is how to communicate that an item has been deleted... perhaps when a row is deleted, an entry is made in another table with the row's primary id and the timestamp of deletion, and the webservice could then also return all rows of that table with a more recent timestamp.
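A sketch of both ideas in SQL, with assumed table and column names (the webservice transport is left out):

    -- Client: find the newest row it already has; sent to the webservice.
    SELECT MAX(updated_at) FROM items;

    -- Webservice: return every row changed since that watermark.
    SELECT id, value, updated_at FROM items WHERE updated_at > ?;

    -- Client: overwrite stale local rows with the response.
    REPLACE INTO items (id, value, updated_at) VALUES (?, ?, ?);

    -- Tombstone table for the deletion problem: the webservice reports
    -- recently deleted ids the same way it reports recently updated rows.
    CREATE TABLE deleted_items (
        id CHAR(36) NOT NULL PRIMARY KEY,   -- primary id of the deleted row
        deleted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    );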

Database access related info on clustered environment

My understanding of database clusters is limited because I have not worked with them. I have the questions below.
A database cluster has two instances, DB server 1 and DB server 2. Each instance will have a copy of the databases; say a database contains table A.
Normally a query request will be handled by only one of the servers, decided at random.
Question 1: Given the access, can we explicitly tell which server should process a query?
Question 2: Given the access, can a particular server, say DB server 2, be queried directly from outside?
This is for either an Oracle or a MySQL database.
There are many different ways to implement a cluster. Both MySQL and Oracle provide solutions out of the box - but very different ones. And there's always the option of implementing different clustering on top of the DBMS itself.
It's not possible to answer your question unless you can be specific about what cluster architecture and DBMS you are talking about.
In Oracle RAC (Real Application Clusters), the data storage (i.e. the disks on which the data is stored) is shared, so it's not really true to say there is more than one copy of the data... there is only one copy. The two servers just access the storage separately (albeit with some co-operation).
From an Oracle perspective:
cagcowboy is correct; in an Oracle RAC system there is but one database (the set of files on disk), and multiple database instances (executing programs) on different logical or physical servers access those same files.
In Oracle, a query being executed in parallel can perform work using the resources of any member of the cluster.
One can "logically" partition the cluster so that a particular application prefers to connect to Member 1 of the cluster instead of Member 2 through the use of service names. However, if you force an application to always connect to a particular member of the cluster, you have eliminated a primary justification to cluster - high availability. Similarly, if the application connects to a functionally random member of the cluster, different database sessions with read and/or write interest in the same Oracle rows can significantly degrade performance.