How to import a JSON file to Cloud Firestore

I developed a web app on a MySQL database and now I am switching to Android mobile development, but I have a large amount of data to export into Firebase's Cloud Firestore. I could not find a way to do so; I have the MySQL data stored in JSON and CSV.
Do I have to write a script? If yes, can you share the script, or is there some sort of tool?

I have a large amount of data to be exported into Firebase's Cloud Firestore; I could not find a way to do so
If you're looking for a "magic" button that can convert your data from a MySQL database to a Cloud Firestore database, please note that there isn't one.
Do I have to write a script?
Yes, you have to write code to convert your existing MySQL database into a Cloud Firestore database. Please note that the two types of databases are built on different concepts. For instance, a Cloud Firestore database is composed of collections and documents. There are no tables in the NoSQL world.
So, I suggest you read the official documentation regarding Get started with Cloud Firestore.
If yes, can you share the script, or is there some sort of tool?
There is no ready-made script and no tool for that. You'll need to create your own migration mechanism.
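If it helps, here is a minimal sketch of what such a script could look like in Python with the firebase_admin SDK, assuming one MySQL table has already been exported to a JSON file containing a list of row objects. The file names, the users collection and the id field are placeholders to adapt to your own data model.

```python
# Minimal one-off import sketch: JSON export of a MySQL table -> Firestore.
# "serviceAccountKey.json", "users.json", the "users" collection and the
# "id" field are placeholders for your own files and schema.
import json

import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate("serviceAccountKey.json")  # key from the Firebase console
firebase_admin.initialize_app(cred)
db = firestore.client()

with open("users.json", encoding="utf-8") as f:
    rows = json.load(f)  # expected: a list of row objects

# Firestore limits a batch to 500 writes, so commit in chunks.
batch = db.batch()
for i, row in enumerate(rows, start=1):
    doc_ref = db.collection("users").document(str(row["id"]))
    batch.set(doc_ref, row)
    if i % 500 == 0:
        batch.commit()
        batch = db.batch()
batch.commit()
print(f"Imported {len(rows)} documents into 'users'")
```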

Related

Insert JSON data into SQL DB using Airflow/python

I extracted data from an API using Airflow.
The data is extracted from the API and saved on cloud storage in JSON format.
The next step is to insert the data into an SQL DB.
I have a few questions:
Should I do it in Airflow or using another ETL tool like AWS Glue/Azure Data Factory?
How do I insert the data into the SQL DB? I googled "how to insert data into SQL DB using python" and found a solution that loops over all the JSON records and inserts the data one record at a time.
That is not very efficient. Is there another way I can do it?
Any other recommendations and best practices for inserting the JSON data into the SQL server?
I haven't decided on a specific DB so far, so feel free to pick the one you think fits best.
thank you!
You can use Airflow just as a scheduler to run Python/bash scripts at defined times with some dependency rules, but you can also take advantage of the operators and hooks provided by the Airflow community.
For the ETL part, Airflow isn't an ETL tool. If you need ETL pipelines, you can run and manage them using Airflow, but you need an ETL service/tool to create them (Spark, Athena, Glue, ...).
To insert data into the DB, you can create your own Python/bash script and run it, or use the existing operators. There are generic operators and hooks for the different databases (MySQL, Postgres, Oracle, MSSQL), and there are other optimized operators and hooks for each cloud service (AWS RDS, GCP Cloud SQL, GCP Spanner, ...). If you want to use one of the managed/serverless services, I recommend using its operators; if you want to deploy your database on a VM or a K8s cluster, you need to use the generic ones.
Airflow supports almost all the popular cloud services, so try to choose your cloud provider based on cost, performance, team knowledge and the other needs of your project, and you will surely find a good way to achieve your goal with Airflow.
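For illustration, a batched insert with the community PostgresHook could look roughly like this, instead of one INSERT per record. This assumes a recent Airflow 2.x with the Postgres provider installed; the connection id, file path, table and column names are placeholders.

```python
# Rough sketch: load a JSON extract into Postgres in batches from an Airflow task.
# "my_postgres", "/tmp/api_extract.json", "api_data" and the columns are placeholders.
import json

import pendulum
from airflow.decorators import dag, task
from airflow.providers.postgres.hooks.postgres import PostgresHook


@dag(schedule=None, start_date=pendulum.datetime(2023, 1, 1), catchup=False)
def load_json_to_postgres():

    @task
    def insert_records():
        with open("/tmp/api_extract.json", encoding="utf-8") as f:
            records = json.load(f)  # expected: a list of flat JSON objects

        hook = PostgresHook(postgres_conn_id="my_postgres")
        rows = [(r["id"], r["name"], r["created_at"]) for r in records]
        # insert_rows batches the statements instead of one round trip per record.
        hook.insert_rows(
            table="api_data",
            rows=rows,
            target_fields=["id", "name", "created_at"],
            commit_every=1000,
        )

    insert_records()


load_json_to_postgres()
```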
You can use Azure Data Factory or Azure Synapse Analytics to move data from a JSON file to SQL Server. Azure Data Factory supports 90+ connectors as of now. (Refer to the MS doc Connector overview - Azure Data Factory & Azure Synapse for more details about the connectors supported by Data Factory.)
[Image 1: some of the connectors supported by ADF]
Refer to the MS docs on prerequisites and required permissions to connect Google Cloud Storage with ADF.
Use Google Cloud Storage as the source connector in the copy activity. Reference: Copy data from Google Cloud Storage - Azure Data Factory & Azure Synapse | Microsoft Learn.
Use the SQL DB connector for the sink.
ADF supports the Auto create table option when no table exists in the Azure SQL database. Also, you can map the source and sink columns in the mapping settings.

Connecting a database with Thingsboard

Will you please help me with one more important thing? I need to store a dashboard's data in a database. According to my study, ThingsBoard supports 3 databases at the moment: NoSQL, MySQL and hybrid (PostgreSQL + Cassandra). I have researched a lot but could not find any way to send my telemetry data to a database. I know it is possible because the ThingsBoard docs themselves say so, but how do I do that? I checked the PostgreSQL database that I created during the ThingsBoard installation, but only the default relations are present. I need to store my project's data in a database, just like in AWS we store IoT Core's data in DynamoDB or in IoT Analytics.
ThingsBoard does not seem to provide any DB-related node in its rule engine, so how do I build a rule chain to transfer my project's data to a database server? I installed pgAdmin 4 to inspect the database graphically but found nothing useful. The documentation and Stack Overflow answers say to configure the thingsboard.yml file, located in a monolithic installation on Linux at /etc/thingsboard/conf/thingsboard.conf, which has the Cassandra, MySQL and PostgreSQL configuration, but how do I configure it properly? I tried to access the default PostgreSQL database named thingsboard that was created at installation time, but when I list its contents it only shows ThingsBoard's default relations. If I create a device on ThingsBoard, why does it not show up in the database? I really could use your help. Please show me a way to connect my ThingsBoard with a database.
See my attached images: everything is default, nothing that I created on ThingsBoard.
That's wrong; ThingsBoard currently supports 3 database setups: Postgres only, hybrid Postgres + Cassandra (telemetry only), and Postgres + Timescale. So there is no MySQL database used anywhere.
https://thingsboard.io/docs/user-guide/install/ubuntu/#step-3-configure-thingsboard-database
Find guides for connecting your devices to Thingsboard here, e.g. via MQTT:
https://thingsboard.io/docs/reference/mqtt-api/
If you would like to forward the stored telemetry from ThingsBoard to different databases, this is not possible directly with rule chains (there is only one node to store data in a Cassandra table).
One way to achieve this would be fetching the data with an external microservice/program via the HTTP API and persisting it in the database of your choice. You could use a Python script, for example (see the sketch below).
https://thingsboard.io/docs/reference/rest-api/
Alternatively, you can send the data to a message queue like Kafka instead of fetching it via the HTTP API, but that still requires additional tooling to store the data in external databases outside ThingsBoard.
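For the HTTP API approach, a rough Python sketch could look like this: log in to the ThingsBoard REST API, pull a device's latest timeseries values and write them into your own Postgres table. The host, credentials, device id, telemetry keys and table schema are all placeholders.

```python
# Rough sketch of an external script that pulls device telemetry from the
# ThingsBoard REST API and stores it in your own Postgres table.
# Host, credentials, device id, keys and table name are placeholders.
import requests
import psycopg2

TB_HOST = "http://localhost:8080"
DEVICE_ID = "your-device-uuid"

# 1. Log in to get a JWT token.
login = requests.post(
    f"{TB_HOST}/api/auth/login",
    json={"username": "tenant@thingsboard.org", "password": "tenant"},
)
token = login.json()["token"]
headers = {"X-Authorization": f"Bearer {token}"}

# 2. Fetch the latest timeseries values for the device.
resp = requests.get(
    f"{TB_HOST}/api/plugins/telemetry/DEVICE/{DEVICE_ID}/values/timeseries",
    params={"keys": "temperature,humidity"},
    headers=headers,
)
telemetry = resp.json()  # e.g. {"temperature": [{"ts": ..., "value": ...}], ...}

# 3. Persist the values in your own database.
conn = psycopg2.connect("dbname=iot user=iot password=secret host=localhost")
with conn, conn.cursor() as cur:
    for key, points in telemetry.items():
        for point in points:
            cur.execute(
                "INSERT INTO telemetry (device_id, key, ts, value) VALUES (%s, %s, %s, %s)",
                (DEVICE_ID, key, point["ts"], point["value"]),
            )
conn.close()
```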

Saving one record using Google Cloud Functions and then overwriting it

I'm working with the Gmail API and need to save the historyId to determine the changes that happened in the email from the Pub/Sub events.
However, I don't need to store all the historyIds; I just need to pull the old historyId, use it in my function, and overwrite it with the new one.
I'm wondering what kind of architecture would be best for this. I can't use the temporary storage of Google Cloud Functions because it would not be persistent.
Using Google Sheets requires extra authorization within the Cloud Function. Do I really need to make a new Cloud Storage bucket for one text file?
It seems like Cloud Datastore would be your best alternative to Cloud Storage if your use case is to persist, retrieve and update the historyId as log data at low cost. Using Google Cloud Functions and Cloud Datastore would be like a serverless log system.
Datastore is a NoSQL document database built for automatic scaling, high performance, and ease of application development. It can handle large amounts of non-relational data at a relatively low price. It also has a user-friendly web console.
I found a very useful web tutorial which you can use to help you architect a Cloud Functions + Cloud Datastore solution like so:
Create Cloud Datastore
Obtain Datastore Credential
Update your Code
Deploy your Cloud Function
Send a Request to Cloud Function
Check the Log on Datastore
Take a look at the full tutorial here. Hope this helps you.
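As a rough illustration (not taken from the tutorial), a Pub/Sub-triggered Cloud Function that reads the previous historyId from Datastore and overwrites it with the new one could look like this in Python; the entity kind, key name and message format are assumptions to adapt.

```python
# Sketch of a Pub/Sub-triggered Cloud Function that reads the previously
# stored Gmail historyId from Datastore, uses it, and overwrites it with
# the new one. The entity kind/key name and message format are assumptions.
import base64
import json

from google.cloud import datastore

client = datastore.Client()
KEY = client.key("GmailSync", "historyId")  # a single, well-known entity


def on_gmail_notification(event, context):
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    new_history_id = payload["historyId"]

    entity = client.get(KEY)
    old_history_id = entity["value"] if entity else None

    if old_history_id:
        # ... call users.history.list with startHistoryId=old_history_id here ...
        pass

    # Overwrite the stored value with the newest historyId.
    if entity is None:
        entity = datastore.Entity(key=KEY)
    entity["value"] = new_history_id
    client.put(entity)
```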

What is the most efficient way to export data from Azure Mysql?

I have searched high and low, but it seems like mysqldump and "select ... into outfile" are both intentionally blocked by not granting file permissions to the DB admin. Wouldn't it save a lot more server resources to allow file permissions than to disallow them? Any other import/export method I can find executes much slower, especially with tables that have millions of rows. Does anyone know a better way? I find it hard to believe Azure left no good way to do this common task.
You did not list the other options you found to be slow, but have you thought about using Azure Data Factory:
Use Data Factory, a cloud data integration service, to compose data storage, movement, and processing services into automated data pipelines.
It supports exporting data from Azure MySQL and MySQL:
You can copy data from MySQL database to any supported sink data store. For a list of data stores that are supported as sources/sinks by the copy activity, see Supported data stores and formats
Azure Data Factory allows you to define mappings (optional!), and / or transform the data as needed. It has a pay per use pricing model.
You can start an export manually or on a schedule using the .NET or Python SDK, the REST API, or PowerShell.
It seems you are looking to export the data to a file, so Azure Blob Storage or Azure Files are likely to be a good destination. FTP or the local file system are also possible.
"SELECT INTO ... OUTFILE" we can achieve this using mysqlworkbench
1.Select the table
2.Table Data export wizard
3.export the data in the form of csv or Json
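If you'd rather script the export yourself, one workaround for the blocked file permissions is to stream the rows client-side with an unbuffered cursor and write the file locally. Here is a rough Python sketch using PyMySQL; the host, credentials, table name and CA file path are placeholders.

```python
# Sketch of a client-side export that streams rows out of Azure Database
# for MySQL into a local CSV file, since server-side OUTFILE is blocked.
# Host, credentials, table name and CA path are placeholders.
import csv

import pymysql
import pymysql.cursors

conn = pymysql.connect(
    host="yourserver.mysql.database.azure.com",
    user="youradmin",
    password="yourpassword",
    database="yourdb",
    ssl={"ca": "/path/to/DigiCertGlobalRootCA.crt.pem"},  # Azure requires SSL
    cursorclass=pymysql.cursors.SSCursor,  # unbuffered, streams rows
)

cur = conn.cursor()
cur.execute("SELECT * FROM big_table")

with open("big_table.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cur.description])  # header row
    for row in cur:  # rows are fetched incrementally, not all at once
        writer.writerow(row)

cur.close()
conn.close()
```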

How to migrate MySql Database to Firestore

I'm searching for the best way to migrate my MySQL database to Firebase's new Cloud Firestore.
What are the steps? First of all, I'm trying to convert the tables and relations in my relational DB to a document logic.
I read about the Cloud Firestore REST API because I have more experience using REST than sockets, but I'm not sure if that's the way to go.
Is it a good idea to create a script starting from this sample and run it on Node.js?
Has anyone already done this?
Thanks
I couldn't find an existing way to do that. What I suggest is using a programming language (Python is preferable) to make a script that turns all the data in your MySQL database into a JSON file, structuring the data in the way you want to store it in Firestore. Then read the Firestore API to populate the documents. This should work for sure!
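For illustration, the MySQL-to-JSON step suggested above could look roughly like this in Python; the credentials and the table list are placeholders, and each table ends up in its own JSON file ready to be pushed to Firestore with the Admin SDK or the REST API.

```python
# Small sketch of the "MySQL -> JSON file" step. Credentials and the table
# list are placeholders; the output is one JSON file per table, each
# containing a list of row objects.
import json

import pymysql
import pymysql.cursors

conn = pymysql.connect(
    host="localhost",
    user="root",
    password="secret",
    database="mydb",
    cursorclass=pymysql.cursors.DictCursor,  # rows come back as dicts
)

for table in ["users", "orders"]:  # tables you want to migrate
    with conn.cursor() as cur:
        cur.execute(f"SELECT * FROM {table}")
        rows = cur.fetchall()
    with open(f"{table}.json", "w", encoding="utf-8") as f:
        json.dump(rows, f, default=str, indent=2)  # default=str handles dates

conn.close()
```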
You can convert your MySQL database to a CSV file, and then convert that CSV to JSON.