Which database should I choose? MySQL or mongoDB? [closed] - mysql

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I'm working on a project that is somewhat similar to WhatsApp, except that I'm not implementing the chat function.
The only thing I need to store in the database is the users' information, which won't be very large. I also need an offline database in the mobile app, which should be synced with the server database.
Currently I use MySQL for my server database, and I'm thinking of using JSON for syncing between the mobile app and the server. I then found that MongoDB has native support for JSON, which made me wonder whether I should switch to MongoDB.
So here are my questions:
Should I switch to MongoDB or stay with MySQL? The data for each user won't be too large, and there are some requirements for data consistency. But MongoDB's JSON support is somewhat attractive.
I'm not familiar with the syncing procedure. I did some digging and it appears that JSON is a good choice, but what data should I put into the JSON files?

I actually flagged this as too broad and as attracting primarily opinion based answers but I'll give it a go anyhow and hope to stay objective.
First of all you have 2 separate questions here.
What database system should I use.
How do I sync between app and server.
The first one is easily answered because it doesn't really matter. Both are good options for storing data. MySQL is mature and stable, and MongoDB, although newer, is well regarded, and I don't know of any problems that would prevent it from being used. So take the database which you find easiest to use.
Now for the second, I'll first add a disclaimer: entire books have been written about synchronizing data between multiple entities, and after all this time it is still the subject of PhDs.
I would advise against synchronizing directly between the mobile app and the database, because that requires the database credentials to be contained within the app. Mobile apps can and will be decompiled and the credentials extracted, which would compromise your entire database. So you'll probably want to create an API which first does device/user authentication and then changes the database.
This already means that switching to MongoDB just for the sake of syncing would probably be a bad idea.
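To make the idea of an API in front of the database concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: `SERVER_SECRET`, `issue_token`, `handle_update` and the `users` table are hypothetical names, the HMAC token is a stand-in for whatever authentication scheme you actually pick, and an in-memory SQLite database stands in for MySQL. The point is only that the app sends JSON plus a token, while the database credentials stay on the server.

```python
import hashlib
import hmac
import json
import sqlite3

# Hypothetical server-side secret; it never ships inside the mobile app.
SERVER_SECRET = b"server-side-secret"


def issue_token(user_id: str) -> str:
    """Sign a user id so the server can later verify who is calling."""
    return hmac.new(SERVER_SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def handle_update(db: sqlite3.Connection, user_id: str, token: str, body: str) -> int:
    """Authenticate the caller, then apply a JSON payload to the database.

    Returns an HTTP-like status code.
    """
    expected = issue_token(user_id)
    if not hmac.compare_digest(token.encode(), expected.encode()):
        return 401  # reject callers that cannot prove their identity
    payload = json.loads(body)
    # Parameterized query: the payload is never spliced into the SQL text.
    db.execute(
        "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
        (user_id, payload["name"]),
    )
    db.commit()
    return 200


# In-memory SQLite stands in for the real server database in this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
```

Whether the server stores the data in MySQL or MongoDB is invisible to the app here, which is why the JSON transport format does not have to drive the choice of database.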
Now JSON itself is just a format of representing data with some structure, just as XML. As such it's not a method of synchronization but transport.
For synchronizing data it's important that you know the source of truth.
If you have 1 device <-> 1 record it's easy because the device will be the source of truth, after all the only mutations that take place are presumably done by the user on the device.
If you have n devices <-> 1 record then it becomes a whole lot more annoying. If you want to allow a device to change the state when offline you'll need to do some tricks to synchronize the data when the device comes back online. But this is probably a question too complex and situation dependent to answer on SO.
If, however, you force the device to always propagate changes to the database immediately, then the database will always contain the most up-to-date record, or truth. The downside is that part of the app will not be functional when offline.
If offline updates don't change the state but merely add new records then you can push those to the server when it comes online. But keep in mind you won't be able to order these events.
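The last case (offline updates that only add new records) can be sketched roughly as follows. This is an assumption-laden illustration, not a complete sync protocol: `OfflineQueue` and `Server` are hypothetical names, the server is an in-process stand-in for an authenticated API, and client-generated UUIDs are one possible way to make a repeated upload harmless.

```python
import json
import uuid


class OfflineQueue:
    """Client side: collect new records while the device is offline."""

    def __init__(self) -> None:
        self.pending = []

    def add(self, data: dict) -> None:
        # A client-generated id lets the server recognise retried uploads.
        self.pending.append({"id": str(uuid.uuid4()), "data": data})

    def flush(self, server: "Server") -> None:
        """Send everything as one JSON document; drop only what was accepted."""
        accepted = server.receive(json.dumps(self.pending))
        self.pending = [r for r in self.pending if r["id"] not in accepted]


class Server:
    """Server side: stand-in for the API in front of the database."""

    def __init__(self) -> None:
        self.records = {}

    def receive(self, body: str) -> set:
        accepted = set()
        for record in json.loads(body):
            # setdefault makes a repeated upload of the same id a no-op.
            self.records.setdefault(record["id"], record["data"])
            accepted.add(record["id"])
        return accepted
```

If the connection drops after the server has stored the records but before the client hears back, the client simply sends the same batch again and nothing is duplicated. Note that the ids say nothing about the order in which the records were created.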

Related

migration from mysql to nosql database in production without code change and mysql without foreign keys and indexes [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have two scenarios here:
Migrating a MySQL database to NoSQL without code changes (no ORMs are used).
Using no foreign keys and indexes in MySQL (because they want to migrate to a different database in the future).
All of this should be done with very few code changes.
These questions were asked by my team lead, and I don't have a proper answer to give him. It seems very unusual to me to run MySQL with no indexes and foreign keys, and if they don't intend to stay on MySQL, why did they choose it in the first place?
I want to know whether people in the software industry often work like this, or whether they choose the database that fits their needs.
They say that foreign key validation is done at the API level, not at the MySQL level.
I don't understand their reasoning because I have little experience. Please give me some insight into whether this is good practice or not.
I don't think it will be possible without adding code: you need to implement how your data is managed by your NoSQL DB engine in some way. If the project is coded with a clear separation of business logic and database code, it's a simple matter of using the new database implementation instead of the old one. If that is not the case and your DB implementation has leaked into your business logic, then it will not be possible to switch without changing code. Depending on the size of the code base, it might be (and most likely will be) too expensive.
If you want to see an example of a clean separation of DB logic from business logic, have a look at this repository: https://github.com/fathersson/money-transfer
(this is not my repository, I just stumbled upon it today)
If you want to learn and understand the principles driving that design, start by looking for "clean architecture" and/or "Domain Driven Design" - the first one is easier to understand in my opinion and there are some talks on YouTube by Robert C. Martin that you can have a look at before buying some books.
Edit: The project I'm working on at the moment did change from PostgreSQL running on RDS to DynamoDB by using a different repository, without changing any existing business logic. It saves a lot of money that way. So yes, changing the DB backend does happen and is driven by requirements.
In addition to that, when I start working on a new feature set/micro service/bounded context I usually start with a simple in memory repository implementation that's using a map. After I'm done with the initial set of use cases, I know more about the db requirements and choose the db engine based on these and the general requirement to limit the number of different technologies in use.
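A rough sketch of what such a separation can look like, with an in-memory repository backed by a map and a second implementation behind the same interface. All names here (`AccountRepository`, `deposit`, the `accounts` table) are made up for illustration and are not taken from the linked repository; SQLite stands in for a real database engine.

```python
import sqlite3
from typing import Optional, Protocol


class AccountRepository(Protocol):
    """The only thing the business logic knows about storage."""

    def save(self, account_id: str, balance: int) -> None: ...

    def find(self, account_id: str) -> Optional[int]: ...


class InMemoryAccountRepository:
    """Starting point: a plain dict, no database engine chosen yet."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, account_id: str, balance: int) -> None:
        self._rows[account_id] = balance

    def find(self, account_id: str) -> Optional[int]:
        return self._rows.get(account_id)


class SqliteAccountRepository:
    """Later replacement: same interface, different backend."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts (id TEXT PRIMARY KEY, balance INTEGER)"
        )

    def save(self, account_id: str, balance: int) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO accounts (id, balance) VALUES (?, ?)",
            (account_id, balance),
        )

    def find(self, account_id: str) -> Optional[int]:
        row = self._conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        return row[0] if row else None


def deposit(repo: AccountRepository, account_id: str, amount: int) -> int:
    """Business logic: unchanged no matter which repository is passed in."""
    balance = (repo.find(account_id) or 0) + amount
    repo.save(account_id, balance)
    return balance
```

Swapping the backend then means writing one new repository class, while `deposit` stays untouched. If SQL strings are scattered through functions like `deposit` instead, every one of them has to change.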

Database Design: When should I use multiple database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
Background:
I'm trying to build the backend services of an app. This app has rooms that a user can join. When a user joins a room, s/he can exchange some data with other users through a socket. Outside of the room, the user can view the processed data from the transactions that happened inside the rooms s/he joined. The room list, room information, and the data transactions inside the room should be stored in a database.
My idea is to create a project with one database; however, an experienced developer suggested splitting the database into two:
One that uses MongoDB for storing data transactions happening inside the room.
One that uses MySQL for storing and returning room list, room information and analytics of what happened inside a room.
Problem I see with using multiple databases:
I did some research and, from what I understand, using multiple databases is not recommended but could be done if the data are unrelated. Displaying analytics will require processing the data transactions that happened inside a room and also displaying the room's information. If I use the two-database approach, I will need to retrieve data from both databases in order to achieve this.
Question:
I personally think it's easier to use a single-database approach, since I don't see the data inside and outside of the room as 'unrelated'. Am I missing an important point about when to use multiple databases?
Thanks in advance. Have a good day.
You can look at this problem from two perspectives; technical and practical.
Technically, if your back-end is expected to become very complex or scaled, it is recommended to break it down into multiple microservices, each in charge of a tiny task. Ideally, these small services should help you achieve separation of concerns, so each service only works with one piece of the data. If other services need to read and/or modify that piece of data, they have to go through the service in charge.
In this case, depending on the data each service is dealing with, you can pick the proper database. For instance, if you have transactional data, you can use MySQL, MongoDB for large schemaless content, or Elasticsearch if you want to perform a text search.
Practically, it is recommended to start small with one service and database (if you prefer to develop your app sooner), and then, over time break it down into multiple services as you need to add and/or improve features.
There are multiple points to keep in mind. First, if you expect to have a large user base, you should start development with the right architecture from the beginning to avoid scaling issues. Second, sometimes one database cannot perform the task you need. For example, it would be very inefficient to do a text search in MySQL. Finally, there is no absolute right way of doing things. However you do one thing, another friend might show up tomorrow and ask you why you did not do it his/her way. The most important thing is to start doing and then, learning and improving along the way. Facebook was started with MySQL and it worked fine initially. Could it survive one database type today? I suspect the answer is no, but would it have made sense for them to add all the N databases that they have now back then?
Good luck developing!

Microservice shared database [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I know that this question has been discussed a lot already, but I would like to describe my situation.
As far as I know, there are techniques and best practices for dealing with a shared database in microservice architectures (event sourcing, CQRS...), but all of that seems too complex for my case. Let me explain.
I built a REST API using Node.js. This API allows you to fetch, using a GET request, the data stored in a MySQL database.
Now I need to import a lot of data into the same database (creating a new table every time). The first solution could be to add a new endpoint (POST request) to the existing microservice to create the new table and add the new data.
But I was thinking about creating a separate Node.js microservice (an import service), because the import feature could be very CPU-intensive and Node.js is single-threaded; I don't want a user to have to wait to fetch data because another user is importing new data.
The problem with that solution is that I have to share the same database between the two microservices. Using the typical approaches (event sourcing, CQRS) could be the best solution, but it complicates the architecture too much (for this project I don't need to address the data consistency problem).
There are two other solutions that I can use:
Create a common library to access the DB and use it in both microservices.
Have the "import microservice" use the REST API of the other service to post the new data as soon as it is ready to be imported, instead of accessing the database directly.
What is the best solution? Do you know other possible ways to address this problem?
Thank you very much
In the microservices world, services should be divided according to the business domain that each service represents, not by specific technical functionality as you are proposing. This is an architectural principle that I don't recommend bypassing just to work around a specific technical issue.
The way to solve your problem is not to split the microservice into two services that belong to the same business domain.
Your problem is a performance issue. Performance issues are generally solved by scaling. It is totally normal to replicate your services by deploying multiple containers. Docker gives you this option out of the box: when deploying a service, you can specify the number of replicas to deploy.

Do stored procedures improve performance? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I am a newbie to web development.
I have an application server where my ASP.NET code resides. My application server communicates with a MySQL instance which is on a different server.
I was wondering whether it is good practice to move the computation from the application server to the database server by using stored procedures with views, or whether I should keep all the logic in the application server and query the database only to retrieve data from tables directly, without stored procedures and views.
I am a strong advocate of putting database logic into the database and not splitting it between the application and the server. This means that I prefer to wrap all database calls in stored procedures and views.
The driving reasons are maintenance, security, and functionality, not performance, although performance is often better on the server side.
The number one reason is to isolate the application from changes in the underlying data structure. So, if the data structure changes, the application does not (always) break.
Other reasons that come to mind:
The same logic gets used for the same thing. That is, one piece of code doesn't define "foobar" one way and another "foobar" another way.
Auditing and logging are implemented within stored procedures rather than using triggers.
Database tables are off-limits to all users, unless they go through the defined interface.
A newer version and older version can often co-exist.
Admittedly, for a one-off, quick-and-dirty application these issues may not be important. However, I think it is a good idea to have well defined interfaces (APIs) between different components of a system, and databases and the application layer are a prime example where such APIs are quite useful.
I agree with Gordon on separating out a "layer" of code between the application and the actual database. I dispute how practical Stored Routines are as such.
PHP (etc) is far more expressive than SProcs.
One SProc can execute multiple queries faster because it is closer to the server. This can be an overwhelming performance gain if the client and server are on opposite sides of the country.
Error checking is clumsy in SProcs.
PHP recompiles only when the code changes; SProcs recompile once per connection; Perl always recompiles; etc.
VIEWs are sometimes poorly optimized, so I avoid them.
The secret to a good design for the "layer" is in the compromise between the forces tugging on either side. One example: Can you completely hide a schema change from the app? Even if you split one table into two?
A really bad example was when the UI did pagination by using page numbers. The layer thought in terms of OFFSET and LIMIT, and fed that to the MySQL back-end. Then came an item with 216K pages (yes, that many!). They found out that OFFSET+LIMIT is not a good way to implement "next page", but fixing it required changes to all layers of the system.
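To illustrate why OFFSET+LIMIT is a poor fit for "next page", here is a small sketch that contrasts it with a commonly used alternative, often called keyset (or "seek") pagination, where the client remembers the last id it saw. This is not necessarily the fix that team chose; the `items` table and function names are hypothetical, and SQLite stands in for MySQL (the two queries are plain SQL that MySQL also accepts).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (id, title) VALUES (?, ?)",
    [(i, f"item {i}") for i in range(1, 101)],
)

PAGE_SIZE = 10


def page_by_offset(page_number: int) -> list:
    """OFFSET pagination: the engine still has to step over every skipped row."""
    offset = (page_number - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT id, title FROM items ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()


def page_after(last_seen_id: int) -> list:
    """Keyset pagination: use the primary key index to start after the last row shown."""
    return conn.execute(
        "SELECT id, title FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, PAGE_SIZE),
    ).fetchall()
```

Both functions return the same rows for a given page, but the interface differs: the UI has to pass "last id seen" instead of a page number. That difference is exactly why such a change tends to ripple through every layer.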

I would like to create a database with the goal of populating this database with comprehensive inventory information obtained via a shell script [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I would like to create a database with the goal of populating this database with comprehensive inventory information obtained via a shell script from each client machine. My shell script currently writes this information to a single csv file located on a server through an ssh connection. Of course, if this script were to be run on multiple machines at once it would likely cause issues as each client potentially would try to write to the csv at the same time.
In the beginning, the inventory was all I was after; however, after more thought I began to wonder whether much more could be possible once I had gathered this information. If this information were contained in a database, I might be able to use it to initialize other processes based on the information of a specific machine or a group of "like" machines. It is important to note that I am already managing a multitude of processes by identifying specific machine information. However, pulling that information from a database after matching a unique identifier could (in my mind) greatly improve efficiency. It would also allow for more of a server-side approach, cutting down on the majority of client-side scripting. (Instead of gathering this information from the client machine at each startup, I would already have it in a central database, allowing a server to use the information and kick off specific events.)
I am completely new to SQL and am not certain if it is 100% necessary. Is it necessary? For now I have decided to download and install both PostgreSQL and MySQL on separate Macs for testing. I am also fairly new to Stack Overflow and apologize upfront if this is an inappropriate question or style of question. Any help, including a redirection, would be greatly appreciated.
I do not expect a step by step answer by any means, rather am just hoping for a generic "proceed..." "this indeed can be done..." or "don't bother there is a much easier solution."
As I come from the PostgreSQL world, I highly recommend using it for its strong enterprise-level features and high standards compliance.
I always prefer to have a database for each project that I'm doing for the following benefits:
Normalized data is easier to process and build reports on;
Performance of database queries will be much better due to the caching done by the DB engine, indexes on your data, optimized query paths;
You can greatly improve machine data processing by using SQL/MED, which allows querying external data sources from the database directly. You can have a look at the Multicorn project and the examples they provide.
Should it be required to deliver any kinds of reports to your management, DB will be your friend, while doing this outside the DB will be overly complicated.
In short: go for the database!
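As a rough idea of what the inventory could look like in a database, here is a minimal sketch. The table layout, column names and sample values are assumptions made up for illustration, and SQLite is used only so the example runs anywhere; with PostgreSQL or MySQL each client would send its row to the server, which handles concurrent writes for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE machines (
        serial     TEXT PRIMARY KEY,  -- unique identifier per client machine
        model      TEXT NOT NULL,
        os_version TEXT NOT NULL
    )
    """
)


def report(serial: str, model: str, os_version: str) -> None:
    """Record a machine's inventory; a repeat report replaces the old row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO machines (serial, model, os_version) "
            "VALUES (?, ?, ?)",
            (serial, model, os_version),
        )


def machines_like(model: str) -> list:
    """Return the serials of all machines of one model, e.g. to target a task."""
    rows = conn.execute(
        "SELECT serial FROM machines WHERE model = ? ORDER BY serial", (model,)
    )
    return [row[0] for row in rows]
```

Each machine ends up as exactly one row keyed by its identifier, so a server-side process can select a group of "like" machines with a single query instead of parsing a shared CSV file.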