I have a MySQL database on server A, a PostgreSQL database on server B, and another MySQL database on server C. I need a way to join tables from the three servers to get a combined result. Is there a way to do this in Ruby? If not Ruby, any other language will also suffice.
I need to join somewhere around a few thousand rows of data. The joined data needs to be pushed to Elasticsearch; I was going to use the _bulk API in Elasticsearch to push it.
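For scale, a few thousand rows is small enough to join client-side. A minimal sketch of that plan in Python (since any language will do); the hostnames, credentials, table names, and join key below are placeholders rather than real details, and it assumes the mysql-connector-python, psycopg2, and elasticsearch packages:

    # Sketch only: servers, credentials, tables, and the join key are placeholders.
    import mysql.connector                               # pip install mysql-connector-python
    import psycopg2                                      # pip install psycopg2-binary
    from elasticsearch import Elasticsearch, helpers     # pip install elasticsearch

    mysql_a = mysql.connector.connect(host="server-a", user="app", password="...", database="db_a")
    pg_b    = psycopg2.connect(host="server-b", user="app", password="...", dbname="db_b")
    mysql_c = mysql.connector.connect(host="server-c", user="app", password="...", database="db_c")

    def fetch(conn, sql):
        cur = conn.cursor()
        cur.execute(sql)
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]

    users    = fetch(mysql_a, "SELECT id, name FROM users")
    profiles = fetch(pg_b,    "SELECT user_id, city FROM profiles")
    orders   = fetch(mysql_c, "SELECT user_id, total FROM orders")

    # Hash-join the three result sets in memory (fine for a few thousand rows).
    # Rows with no matching user are silently dropped in this sketch.
    by_user = {u["id"]: dict(u) for u in users}
    for p in profiles:
        by_user.get(p["user_id"], {}).update(city=p["city"])
    for o in orders:
        by_user.get(o["user_id"], {}).setdefault("orders", []).append(o["total"])

    # Push the joined documents to Elasticsearch via the _bulk API.
    es = Elasticsearch("http://localhost:9200")
    actions = ({"_index": "joined_users", "_id": uid, "_source": doc}
               for uid, doc in by_user.items())
    helpers.bulk(es, actions)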
If it's a one-off, just download the data from two of the DBs and upload it to the third. Do your work there.
If it's a regular thing it might be worthwhile putting the effort into linking the databases properly. PostgreSQL offers Foreign Data Wrapper plugins that let you talk to a different database (PostgreSQL or others). One is the mysql_fdw.
You define an entry for each remote server, user mappings between the local user and the remote user, and then describe each table you want to access. After that you can treat them as local tables (although performance will generally be much worse, of course).
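As a rough sketch of what that setup looks like (run here through psycopg2 to keep the examples in Python; the server names, credentials, and the orders table are placeholders, not anything from the question):

    # Sketch: runs the usual mysql_fdw DDL from Python via psycopg2.
    # Hosts, users, passwords, and table definitions are placeholders.
    import psycopg2

    conn = psycopg2.connect(host="server-b", dbname="db_b", user="postgres", password="...")
    conn.autocommit = True
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS mysql_fdw;")
    cur.execute("""
        CREATE SERVER mysql_a FOREIGN DATA WRAPPER mysql_fdw
            OPTIONS (host 'server-a', port '3306');
    """)
    cur.execute("""
        CREATE USER MAPPING FOR postgres SERVER mysql_a
            OPTIONS (username 'app', password '...');
    """)
    cur.execute("""
        CREATE FOREIGN TABLE orders_a (
            id       integer,
            user_id  integer,
            total    numeric
        ) SERVER mysql_a OPTIONS (dbname 'db_a', table_name 'orders');
    """)

    # From now on orders_a can be joined with local PostgreSQL tables
    # (assumes a local users table exists):
    cur.execute("SELECT u.name, o.total FROM users u JOIN orders_a o ON o.user_id = u.id;")
    print(cur.fetchall())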
In theory you could write your own FDW plugin to link to Elasticsearch too, but there doesn't seem to be one currently available.
You can do it with LINQ and Entity Framework in Microsoft .NET.
I need to create a system with local web servers on Raspberry Pi 4 running Laravel for API calls, WebSockets, etc. These RPis will be installed at multiple customers' sites.
For this project I want the ability to save/sync the database to a remote server (when the local system is connected to the internet).
Multiple local databases => one remote, customer-based database.
The question is how to synchronize the databases, properly identify each customer's data, and render it in a shared remote dashboard.
My first thought was to set a customer_id or a team_id on each table, but it seems dirty.
The other way is to create multiple databases on the remote server for the synchronization, plus one extra database to store customer IDs and database connection information...
Has someone already experimented with something like that? Is there a reliable and clean way to do this?
You refer to locale but I am assuming you mean local.
From what you have said you have two options at the central site. The central database can either store information from the remote databases in a single table with an additional column that indicates which remote site it's from, or you can set up a separate table (or database) for each remote site.
How do you want to use the data?
If you only ever want to work with the data from one remote site at a time it doesn't really matter - in both scenarios you need to identify what data you want to work with and build your SQL statement to either filter by the appropriate column or target the appropriate table(s).
If you want to work on data from multiple remote sites at the same time, then using different tables requires that you use UNION queries to extract the data, and this is unlikely to scale well. In that case you would be better off using a column to mark each record with the remote site it references.
I recommend that you consider using UUIDs as primary keys. It may be that key collision will not be an issue in your scenario, but if it becomes one, trying to alter the design retrospectively is likely to be quite a bit of work.
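As an illustration of the "one table, one site column, UUID keys" shape, here is a minimal SQLAlchemy sketch (the stack is not specified, so the model, table, and column names are assumptions; an Eloquent migration on the Laravel side would be the equivalent):

    # Illustrative only: table and column names are assumptions, not from the question.
    import uuid
    from sqlalchemy import Column, String, Integer, DateTime, create_engine
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class SensorReading(Base):
        __tablename__ = "sensor_readings"

        # UUID primary key: generated on the Pi, safe to merge centrally
        # without risk of key collisions between sites.
        id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))

        # One central table for all sites; this column identifies the origin.
        customer_id = Column(Integer, nullable=False, index=True)

        recorded_at = Column(DateTime, nullable=False)
        value = Column(Integer, nullable=False)

    # Central queries then just filter (no UNION over per-site tables):
    #   SELECT * FROM sensor_readings WHERE customer_id = 42;
    engine = create_engine("sqlite:///central.db")   # placeholder engine
    Base.metadata.create_all(engine)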
You also asked about how to synchronize the databases. That will depend on what type of connection you have between the sites and the capabilities of your software, but typically you would have the local system periodically talk to a webservice at the central site. Assuming you are collecting sensor data or some such the dialogue would be something like:
Client - Hello Server, my last sensor reading is timestamped xxxx
Server - Hello Client, [ send me sensor readings from yyyy | I don't need any data ]
You can include things like a signature check (for example an MD5 sum of the records within a time period) if you want to but that may be overkill.
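A hedged sketch of the client half of that dialogue, assuming a hypothetical /sync endpoint at the central site and a local SQLite table of readings; the MD5 digest is the optional signature check mentioned above:

    # Sketch of the periodic sync from the Pi. The /sync endpoint, JSON shape,
    # and local schema are assumptions for illustration.
    import hashlib
    import json
    import sqlite3
    import requests

    CENTRAL = "https://central.example.com/api/sync"   # hypothetical endpoint

    def run_sync(customer_id: int) -> None:
        db = sqlite3.connect("local.db")

        # "Hello Server, my last sensor reading is timestamped xxxx"
        last_local = db.execute("SELECT MAX(recorded_at) FROM sensor_readings").fetchone()[0]
        resp = requests.post(CENTRAL, json={"customer_id": customer_id,
                                            "latest": last_local}).json()

        # "Hello Client, send me sensor readings from yyyy | I don't need any data"
        if not resp.get("needs_from"):
            return

        rows = db.execute(
            "SELECT id, recorded_at, value FROM sensor_readings WHERE recorded_at > ?",
            (resp["needs_from"],),
        ).fetchall()
        payload = [{"id": r[0], "recorded_at": r[1], "value": r[2]} for r in rows]

        # Optional signature so the server can verify the batch arrived intact.
        digest = hashlib.md5(json.dumps(payload, sort_keys=True).encode()).hexdigest()
        requests.post(CENTRAL + "/readings",
                      json={"customer_id": customer_id,
                            "readings": payload,
                            "md5": digest})

    run_sync(customer_id=42)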
I want to fetch data from multiple mysql databases which are on multiple servers.
I'm using phpMyAdmin (MySQL). All the databases are MySQL databases (same vendor) on multiple servers. First I want to connect to those server databases, then fetch data from them, and then put the result into a central database.
For example: remote_db_1 on server 1, remote_db_2 on server 2, remote_db_3 on server 3, and I have a central database where I want to store the data that comes from the different databases.
Query: select count(user) from user where profile != 2; the same query will be run against all the databases.
central_db.school_distrct_info_table:

id   school_district_id   total_user
1    2                    50
2    55                   100
3    100                  200
I've tried the FEDERATED engine but it doesn't fit our requirements. What can be done in this situation: any tool, any alternative method, or anything else?
In the future, the number of databases on different servers will increase. It might be 50, 100, maybe more, and exporting the tables from the source servers and then loading them into the central DB will be a hard task. So I'm also looking for some kind of ETL tool which can fetch data directly from multiple source databases and then send it to the destination database. In the central DB the table structure, datatypes, and columns will all be different, and sometimes we might need to add an extra column to store some data. I know this can be achieved with an ETL tool; in the past I've used SSDT, which works with SQL Server, but here it is MySQL.
The easiest way to handle this problem is with federated servers. But, you say that won't work for you.
So, your next best way to handle the problem is to export the tables from the source servers and then load them into your central server. But that's much harder. This sort of operation is sometimes called extract / transform / load or ETL.
You'll write a program in the programming language of your choice (Python, PHP, Java, Perl, Node.js?) to connect to each database separately, query it, then put the information into a central database.
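A minimal sketch of such a program in Python, using the count query from the question; the host names, credentials, and the mapping from source server to school_district_id are placeholders:

    # Sketch ETL: pull one count per source server and upsert it into the central
    # table. Credentials and the server -> district mapping are placeholders, and
    # the upsert assumes a unique key on school_district_id.
    import pymysql   # pip install pymysql

    SOURCES = [
        {"host": "server1", "db": "remote_db_1", "school_district_id": 2},
        {"host": "server2", "db": "remote_db_2", "school_district_id": 55},
        {"host": "server3", "db": "remote_db_3", "school_district_id": 100},
    ]

    central = pymysql.connect(host="central-host", user="etl", password="...",
                              database="central_db", autocommit=True)

    for src in SOURCES:
        conn = pymysql.connect(host=src["host"], user="etl", password="...",
                               database=src["db"])
        with conn.cursor() as cur:
            cur.execute("SELECT COUNT(user) FROM user WHERE profile != 2")
            total_user = cur.fetchone()[0]
        conn.close()

        with central.cursor() as cur:
            cur.execute(
                """INSERT INTO school_distrct_info_table (school_district_id, total_user)
                   VALUES (%s, %s)
                   ON DUPLICATE KEY UPDATE total_user = VALUES(total_user)""",
                (src["school_district_id"], total_user),
            )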
Getting this working properly is, sad to say, incompatible with "really urgent". It's tricky to get working and to test.
May I suggest you write another question explaining why server federation won't meet your needs, and asking for help? Maybe somebody can help you configure it so it does. Then you'll have a chance to finish this project promptly.
Google says NO triggers, NO stored procedures, No views. This means the only thing I can dump (or import) is just a SHOW TABLES and SELECT * FROM XXX? (!!!).
Which means for a database with 10 tables and 100 triggers, stored procedures and views I have to recreate, by hand, almost everything? (either for import or for export).
(My boss thinks I am tricking him. He cannot understand how my previous employers did that replication to a bunch of computers with two clicks, while I personally need hours (or even days) to do this with an internet giant like Google.)
EDIT:
We have applications which are being created on local computers, where we use our local MySQL. These applications use MySQL DBs which consist of, say, n tables and 10*n triggers. For the moment we cannot even evaluate google-cloud-sql, since that means almost everything (except the n almost-empty tables) must be "uploaded" by hand. And we also cannot evaluate using a google-cloud-sql DB, since that means almost everything (except the n almost-empty tables) must be "downloaded" by hand.
Until now we do these "up-down"-loads by taking a decent mysqldump from the local or the "cloud" MySQL.
It's unclear what you are asking for. Do you want "replication" or "backups" because these are different concepts in MySQL.
If you want to replicate data to another MySQL instance, you can set up replication. This replication can be from a Cloud SQL instance, or to a Cloud SQL instance using the external master feature.
If you want to back up data to or from the server, check out these pages on importing data and exporting data.
As far as I understood, you want to Create Cloud SQL Replicas. There are a bunch of replica options in the docs; use the one that fits you best.
However, if by "replica" you meant cloning a Cloud SQL instance, you can follow the steps to clone your instance into a new and independent instance.
Some of these tutorials are done by using the GCP Console and can be scheduled.
There's a project I'm working on, kind of a distributed Database thing.
I started by creating the conceptual schema, and I've partitioned the tables such that I may require to perform joins between tables in MySQL and PostgreSQL.
I know I can write some sort of middleware that will break down the SQL queries and issue sub-queries targeting the individual DBs, and then merge the results, but I'd like to do this using SQL if possible.
My search so far has yielded this (the FEDERATED storage engine for MySQL), but it seems to work only between MySQL databases.
If it's possible, I'd appreciate some pointer's on what to look at, preferably in Python.
Thanks.
It might take some time to set up, but PrestoDB is a valid OpenSource solution to consider.
see https://prestodb.io/
You connect to Presto with JDBC and send it the SQL; it interprets the different connections, dispatches to the different sources, then does the final work on the Presto node before returning the result.
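Since the asker preferred Python, here is the same idea using the presto-python-client package instead of JDBC (an assumption on my part; the catalog, schema, and table names depend entirely on how the Presto catalogs are configured):

    # Sketch: same idea as the JDBC route, but via presto-python-client
    # (pip install presto-python-client). Catalog/schema/table names are placeholders.
    import prestodb

    conn = prestodb.dbapi.connect(
        host="presto-coordinator", port=8080, user="etl",
        catalog="mysql", schema="db_a",
    )
    cur = conn.cursor()

    # Presto resolves each catalog.schema.table against the matching connector,
    # so a single statement can join MySQL and PostgreSQL tables.
    cur.execute("""
        SELECT u.name, o.total
        FROM mysql.db_a.users AS u
        JOIN postgresql.db_b.orders AS o ON o.user_id = u.id
    """)
    for row in cur.fetchall():
        print(row)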
From the Postgres side, you can try using a foreign data wrapper such as mysql_fdw (example). Queries with joins can then be run through various Postgres clients, such as psql, pgAdmin, psycopg2 (for Python), etc.
This is not possible with SQL.
One option is to write your own "middleware", as you hinted at. To do that in Python, you would use the standard DB-API drivers for both databases, write individual queries, and then merge their results. An ORM like SQLAlchemy will go a long way to help with that; a sketch follows below.
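A small sketch of that middleware approach, using SQLAlchemy engines and a pandas merge; the connection strings and table names are placeholders:

    # Sketch: one query per database, then merge the results client-side.
    # Connection strings and table/column names are placeholders.
    import pandas as pd
    from sqlalchemy import create_engine

    mysql_engine = create_engine("mysql+pymysql://app:...@mysql-host/db_a")
    pg_engine    = create_engine("postgresql+psycopg2://app:...@pg-host/db_b")

    users  = pd.read_sql("SELECT id, name FROM users", mysql_engine)
    orders = pd.read_sql("SELECT user_id, total FROM orders", pg_engine)

    # The cross-database "join" happens in Python, not in SQL.
    joined = users.merge(orders, left_on="id", right_on="user_id", how="inner")
    print(joined.head())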
The other option is to use an integration layer. There are many options out there; however, none that I know of are written in Python. Mule ESB, Apache ServiceMix, WSO2, and JBoss MetaMatrix are some of the more popular ones.
You can colocate the data on a single RDBMS node (either PostgreSQL or MySQL for example).
Two main approaches:
Readonly - You might want to use read replicas of both source systems, then use a process to copy the data to a new writable converged node; OR
Primary - You might choose one of the two databases as the primary, and move the data from the other into it using a conversion process (e.g. ETL or off-the-shelf table-level replication).
Then you can just run the query on the one RDBMS with JOINs as usual.
BONUS: You can also do log-based capture from an RDBMS and ship the change logs through Kafka. You can make it as complex as required.
I am working with a client who is syncing between SQL Server and MySQL, which contain the exact same schema and data. We want to centralize that data into one database. Other than performance and maintainability issues, what else is bad about the original design?
You can create a linked server in SQL Server that points to the MySQL instance.
Despite being completely proprietary, one of the nice connectivity features offered in SQL Server is the ability to query other servers through a Linked Server. Essentially, a linked server is a method of directly querying another RDBMS; this often happens through the use of an ODBC driver installed on the server.
Refer to this article for the step-by-step process: SQL Server Linked Server to MySQL.
Provided you grant proper permissions to the MySQL user you connect on behalf of, you can also write to the MySQL instance. So you can update your stored procedures to do an additional step that inserts records into MySQL.
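A hedged sketch of both the read and the write-back through such a linked server, driven from Python via pyodbc; the linked server name MYSQL_LINK, the DSN, and the table are assumptions, and the linked-server setup itself is covered in the article above:

    # Sketch: query and write the MySQL linked server from SQL Server via pyodbc.
    # "MYSQL_LINK", the connection string, and the table/columns are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=sqlserver-host;"
        "DATABASE=app_db;UID=etl;PWD=...", autocommit=True)
    cur = conn.cursor()

    # Read through the linked server: OPENQUERY runs the inner SQL on MySQL.
    cur.execute("SELECT * FROM OPENQUERY(MYSQL_LINK, 'SELECT id, name FROM db_a.users')")
    print(cur.fetchall())

    # Write back, as the answer suggests (the extra step a stored procedure
    # could also perform after its local insert).
    cur.execute("""
        INSERT INTO OPENQUERY(MYSQL_LINK, 'SELECT id, name FROM db_a.users')
        VALUES (?, ?)
    """, 1001, 'Alice')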
A much easier solution is to use a commercial application: Omega Sync from Spectral Core.
Omega Sync can compare and synchronize both database schema and table data. You can even synchronize data of heterogeneous databases (for example, compare your local SQL Server database with a MySQL replica on your web site - and synchronize all the differences in just a few minutes).
On the other hand, I think you've already mentioned the possible problems you may encounter when synchronizing two DBs at the same time. Aside from those two, I think another issue would be resources: since there are different RDBMSes serving the application, each would also need its own resources. For example, when I update a particular record of a user, the application still needs to check which resource it really exists in. But I'd love to hear more from other people out there; this is a really interesting topic to discuss. ;)