I have a MySQL database as my basic datastore for the master data. For complex multilevel queries, similar to friends-of-friends, I have a Neo4j graph datastore. The problem I am facing is maintaining transactions: I have to insert a user record in MySQL and a user node in Neo4j, and I want both of them to succeed. What I have accomplished so far is that if the Neo4j insert is successful, I then insert the MySQL user record. But if the MySQL insert fails, I have an inconsistent state. The same can happen if I insert into MySQL first and then Neo4j. Is there any way I can accomplish a transaction across MySQL and Neo4j? Would I need to maintain some kind of failed-transaction log and replay it later?
Thanks! I know many won't agree with the combined approach of Neo4j and MySQL, but I find it the best solution at the moment, if I can get away with consistent datastores.
Polyglot persistence (multi-database systems) is becoming commonplace, and this is a common challenge. You won't find any built-in mechanism for transactions across disparate databases. While there's no single "right answer" for your scenario (and I'll do my best to keep my answer objective), think about your System of Record: which database holds the truth and needs to be correct? My guess is that it's the MySQL database.
So: now you have your MySQL database, properly taken care of, saving content transactionally. And there's your Neo4j database, which is being used for ancillary functions (searching for friends, in your case). Why not add your Neo4j graph nodes (and relationships) after the fact, as a separate operation? Will it really impact your system's operation if the insert into Neo4j is delayed? Is there anything detrimental about an eventually consistent update to your graph database? I suspect that, even with a follow-on operation, the delay will be minimal. That said: only you know your app, and whether this synchronization between databases needs to be absolute.
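A minimal sketch of that follow-on approach, assuming the mysql-connector-python and neo4j driver packages; the connection details, the users table, and the neo4j_retry_log table are all hypothetical:

```python
# Sketch: commit the user to MySQL (the system of record) first, then
# create the Neo4j node as a separate, retryable follow-on step.
import mysql.connector
from neo4j import GraphDatabase

mysql_conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="appdb")
neo4j_driver = GraphDatabase.driver(
    "bolt://localhost:7687", auth=("neo4j", "secret"))

def create_user(user_id, name):
    # 1. Transactional insert into the system of record.
    cursor = mysql_conn.cursor()
    cursor.execute(
        "INSERT INTO users (id, name) VALUES (%s, %s)", (user_id, name))
    mysql_conn.commit()

    # 2. Follow-on graph insert; on failure, record it for later replay
    #    instead of rolling back MySQL (eventual consistency).
    try:
        with neo4j_driver.session() as session:
            session.run(
                "MERGE (u:User {id: $id}) SET u.name = $name",
                id=user_id, name=name)
    except Exception:
        cursor.execute(
            "INSERT INTO neo4j_retry_log (user_id) VALUES (%s)", (user_id,))
        mysql_conn.commit()
```

Using MERGE instead of CREATE keeps the graph write idempotent, so a background job can safely replay anything recorded in the retry log.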
Related
I have a MySQL database on my server, and a Windows WPF application from which my clients will be inserting and deleting rows corresponding to their data. There may be hundreds of users working on the application at the same time, and they will be inserting or deleting rows in the db.
My question is whether all the database operations can execute successfully under this load, or whether I should adopt some other alternative.
PS: There won't be any clashes on rows during insertion/deletion, as each user will only be able to add/remove his/her own data.
My question is whether all the database operations can execute successfully ...
Yes, like most other relational database systems, MySQL supports concurrent inserts, updates and deletes, so this shouldn't be an issue provided the operations don't conflict with each other.
If they do, you need to find a way to manage concurrency; one common pattern is sketched below.
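A minimal sketch of pessimistic row locking with SELECT ... FOR UPDATE, using the mysql-connector-python package; the items table and the connection details are made up:

```python
# Sketch: lock the row until COMMIT so two clients can't apply
# conflicting updates to the same row at once.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="appdb")

def adjust_quantity(item_id, delta):
    cursor = conn.cursor()
    conn.start_transaction()
    try:
        # A second client running the same SELECT ... FOR UPDATE will
        # block here instead of clobbering our update.
        cursor.execute(
            "SELECT quantity FROM items WHERE id = %s FOR UPDATE", (item_id,))
        (quantity,) = cursor.fetchone()
        cursor.execute(
            "UPDATE items SET quantity = %s WHERE id = %s",
            (quantity + delta, item_id))
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```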
MySQL concurrency, how does it work and do I need to handle it in my application
I have an application where I need to maintain an audit log of the operations performed on the collection. I am currently using MongoDB for storage, which has worked well so far.
Now, for the audit log, I am thinking of using a MySQL database, for these reasons:
1. Using Mongo's implicit audit filter degrades performance.
2. Storage will be huge if I also store the logs in MongoDB, which will impact replication of nodes in the cluster.
The logs are viewed only rarely in the application, so I am thinking of storing them outside the main storage. I am unsure about using MongoDB together with MySQL; is this the right choice for the future?
Also, is MySQL a good choice for storing the audit log, or would another database serve better for storage and conditional queries later?
Moving to a completely different database system just for this purpose is not guaranteed to improve performance.
My first attempt at separation would be to create a new database within your current database system and forward the logs there, or even to use a plain text file; a sketch of the first option follows.
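A minimal sketch of that first option using pymongo; the database and collection names and the shape of the audit document are assumptions:

```python
# Sketch: keep audit entries in a separate database on the same
# MongoDB deployment, so the main collections stay lean.
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
main_db = client["appdb"]
audit_db = client["appdb_audit"]   # separate database, same server

def insert_with_audit(doc, user):
    result = main_db["orders"].insert_one(doc)
    audit_db["logs"].insert_one({
        "collection": "orders",
        "doc_id": result.inserted_id,
        "op": "insert",
        "user": user,
        "at": datetime.now(timezone.utc),
    })
    return result
```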
Feedback welcome.
Earlier in our database design, we used to create mandatory fields for each table; a few important fields were:
created_by
created_time
created_by_ip
updated_by
updated_time
updated_by_ip
Now it's the era of schemaless design. We prefer MongoDB or some other write-oriented database.
My questions here are:
Is it good practice to maintain logs in a separate database?
Do we need to create a separate log table for each MySQL table, or is it okay to have a single MongoDB audit collection for all MySQL tables?
What needs to be considered when querying the results from MongoDB?
What should the structure of the MongoDB audit collection be?
Are there any other alternatives for storing logs?
Consider the situation where we want to delete a registered user who has not authenticated within a specified time (a maximum of 48 hours). If all the time logs are kept in MongoDB, how can we query them from MySQL?
You usually want this (audit?) data next to the real data, and definitely not in a different DB engine, as the number of partial errors to support becomes quite a nightmare (e.g. someone registers, but you fail to insert the audit data: is this OK? Should the account become orphaned? What happens if the app goes down halfway through?).
Systems that have this separation usually use messaging and 2 different listeners are responsible for storing the data and storing the audit (e.g. one in a relational DB and the other in an event store). In this way you have a higher chance of achieving eventual consistency.
Edit
There are a few options around using messaging, and the assumption here is that both sources of data must be in sync (or as close as possible). Please bear in mind that I still think storing data + audit together is by far the simplest and most sensible approach.
Using messaging, your app can emit a message on certain events (e.g. user created). Then 2 different listeners react to this message: one listener stores the data in one DB engine, another listener stores the audit data (a toy sketch follows). The problem with this approach is that you might need to ensure ordering of the messages, which makes it really slow.
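A toy, in-process sketch of the two-listener idea, using only Python's standard library in place of a real broker; in production the queues would be something like RabbitMQ or Kafka, and each listener would be a separate process:

```python
# Toy sketch: one event fans out to two independent listeners.
import queue
import threading

data_q = queue.Queue()
audit_q = queue.Queue()

def emit(event):
    # "Publish": hand the event to both listeners.
    data_q.put(event)
    audit_q.put(event)

def data_listener():
    while True:
        event = data_q.get()
        print("storing data:", event)    # e.g. INSERT into the relational DB
        data_q.task_done()

def audit_listener():
    while True:
        event = audit_q.get()
        print("storing audit:", event)   # e.g. append to the event store
        audit_q.task_done()

threading.Thread(target=data_listener, daemon=True).start()
threading.Thread(target=audit_listener, daemon=True).start()

emit({"type": "user_created", "user_id": 42})
data_q.join()
audit_q.join()
```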
Another (scary) approach is to use distributed (XA) transactions between MySQL and a messaging system (as Mongo doesn't support transactions). The data written to MySQL and the message would then be committed together, and a listener can receive the audit data and store it in Mongo.
I need to emphasize that the 2 approaches above are horrible and should never be implemented.
There are more sensible approaches, but they might require a different tech stack. For example, using Event Sourcing + CQRS you can store the events (with the audit data) and build the final read models without the audit data; a tiny sketch follows.
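A tiny self-contained sketch of that Event Sourcing idea; the event shapes and the projection are invented for illustration:

```python
# Toy sketch: the event log IS the audit trail; the read model is a
# projection that simply drops the audit fields.
events = []  # append-only event store (doubles as the audit log)

def record(event):
    events.append(event)

record({"type": "user_created", "user_id": 1,
        "by": "admin", "ip": "10.0.0.5", "at": "2016-01-01T10:00:00Z"})
record({"type": "user_renamed", "user_id": 1, "name": "Alice",
        "by": "alice", "ip": "10.0.0.9", "at": "2016-01-02T09:30:00Z"})

def project_users(events):
    # Read model without audit data: fold events into current state.
    users = {}
    for e in events:
        if e["type"] == "user_created":
            users[e["user_id"]] = {"name": None}
        elif e["type"] == "user_renamed":
            users[e["user_id"]]["name"] = e["name"]
    return users

print(project_users(events))  # {1: {'name': 'Alice'}}
```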
I am new to server-side programming and am trying to understand relational databases a little better. Whenever I read about MySQL vs SQLite, people always talk about SQLite not being able to have multiple users. However, when I program with the Django framework I am able to create multiple users in the SQLite database. Can someone explain what people mean by multi-user? Thanks!
When people talk about multiple users in this context, they are talking about simultaneous connections to the database. The users in this case are threads in the web server that are accessing the database.
Different databases have different solutions for handling multiple connections working with the database at once. Generally, reading is not a problem, as multiple read operations can overlap without disturbing each other, but only one connection can write data in a specific unit at a time.
The concurrency difference between databases comes down to how large a unit they lock when someone is writing. MySQL has an advanced system where records, blocks or tables can be locked depending on the need, while SQLite has a simpler system where it only locks the entire database.
The impact of this difference is seen when you have multiple threads in the web server, where some threads want to read data and others want to write data. MySQL can read from one table and write into another at the same time without problems. SQLite has to suspend all incoming read requests whenever someone wants to write something, wait for all current reads to finish, do the write, and then open up for read operations again. You can reproduce this behaviour directly, as in the sketch below.
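A demonstration with Python's built-in sqlite3 module: two connections to the same file, where the second writer times out waiting for the whole-database lock:

```python
# Sketch: SQLite locks the whole database file for a write, so a second
# writer gets "database is locked" once its busy timeout expires.
import sqlite3

# isolation_level=None -> no implicit transactions; we issue BEGIN ourselves.
a = sqlite3.connect("demo.db", timeout=1, isolation_level=None)
b = sqlite3.connect("demo.db", timeout=1, isolation_level=None)
a.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")

a.execute("BEGIN IMMEDIATE")      # connection A takes the write lock
a.execute("INSERT INTO t VALUES (1)")

try:
    b.execute("BEGIN IMMEDIATE")  # B waits ~1 second, then gives up
except sqlite3.OperationalError as e:
    print(e)                      # "database is locked"

a.execute("COMMIT")               # releases the lock; B could now write
```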
As you can read here, SQLite supports multiple users, but it locks the whole database.
SQLite is usually used for development, but MySQL is a better choice for production, because it has much better support for concurrent reads and writes, while SQLite doesn't.
Hope this helps.
SQLite concurrency is explained in detail here.
In a nutshell, SQLite doesn't have the fine-grained concurrency mechanisms that MySQL does. When someone tries to write to a MySQL database, the MySQL database will only lock what it needs to lock, usually a single record, sometimes a table.
When a user writes to a SQLite database, the entire database file is momentarily locked. As you might imagine, this limits SQLite's ability to handle many concurrent users.
Multi-user means that many tasks (possibly on many separate computers) can have open connections to the database at the same time.
A multi-user database provides things like locks to allow these tasks to update the database safely.
Look at ScimoreDB. It's an embedded database that supports multi-process (or user) read and write access. It also can work as a client-server database.
Does it make sense to use a combination of MySQL and MongoDB? What I'm trying to do, basically, is use MySQL as a "raw data backup" type thing, where all the data is stored there but never read from there.
The data is also stored in MongoDB at the same time, and reads happen only from MongoDB, because I don't have to do joins and such.
For example, assume we are building Netflix:
In MySQL I have tables for Comments and Movies. When a comment is made, I just add it to the Comments table in MySQL, and in MongoDB I update the movie's document to hold this new comment.
Then, when I want to get movies and comments, I just grab the document from MongoDB. (A sketch of this dual write is below.)
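A minimal sketch of the dual write described above, assuming mysql-connector-python and pymongo; the table, collection, and field names are made up:

```python
# Sketch: write the comment to MySQL (raw backup) and push it into the
# movie's MongoDB document, which is the only side that gets read.
import mysql.connector
from pymongo import MongoClient

mysql_conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="appdb")
mongo = MongoClient("mongodb://localhost:27017")["appdb"]

def add_comment(movie_id, user_id, text):
    # 1. Raw backup copy in MySQL.
    cur = mysql_conn.cursor()
    cur.execute(
        "INSERT INTO comments (movie_id, user_id, text) VALUES (%s, %s, %s)",
        (movie_id, user_id, text))
    mysql_conn.commit()

    # 2. Read-side copy: embed the comment in the movie document.
    mongo.movies.update_one(
        {"_id": movie_id},
        {"$push": {"comments": {"user_id": user_id, "text": text}}})

def get_movie(movie_id):
    # Reads come only from MongoDB; no joins needed.
    return mongo.movies.find_one({"_id": movie_id})
```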
My main concern is how "new" MongoDB is compared to MySQL. In case something unexpected happens in Mongo, we have a MySQL backup, so we can quickly make the app fall back to MySQL and memcached.
On paper it may sound like a good idea, but there are a lot of things you will have to take into account. This will make your application way more complex than you may think. I'll give you some examples.
Two different systems
You'll be dealing with two different systems, each with its own behavior. These different behaviors will make it quite hard to keep everything synchronized.
What will happen when a write in MongoDB fails, but succeeds in MySQL?
Or the other way around, when a column constraint in MySQL is violated, for example?
What if a deadlock occurs in MySQL?
What if your schema changes? One migration is painful, but you'll have to do two migrations.
You'd have to deal with some of these scenarios in your application code. Which brings me to the next point.
Two data access layers
Your application needs to interact with two external systems, so you'll need to write two data access layers.
These layers both have to be tested.
Both have to be maintained.
The rest of your application needs to communicate with both layers.
Abstracting away both layers will introduce yet another layer, which will further increase complexity (see the sketch below).
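A bare-bones sketch of what that extra layer can look like; the interface and the implementations are invented for illustration:

```python
# Sketch: one abstract data-access interface, two concrete layers to
# write, test, and maintain, plus the fan-out layer on top of them.
from abc import ABC, abstractmethod

class CommentStore(ABC):
    @abstractmethod
    def add_comment(self, movie_id, user_id, text): ...

class MySQLCommentStore(CommentStore):
    def add_comment(self, movie_id, user_id, text):
        ...  # INSERT INTO comments ...

class MongoCommentStore(CommentStore):
    def add_comment(self, movie_id, user_id, text):
        ...  # movies.update_one(..., {"$push": ...})

class DualCommentStore(CommentStore):
    """The extra layer: fans every write out to both stores."""
    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def add_comment(self, movie_id, user_id, text):
        self.primary.add_comment(movie_id, user_id, text)
        # The partial-failure scenarios listed above all live here.
        self.secondary.add_comment(movie_id, user_id, text)
```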
Chance of cascading failure
Should MongoDB fail, the application will fall back to MySQL and memcached. But at that point memcached will be empty, so every request right after MongoDB fails will hit the database. If you have a high-traffic site, this can easily take down MySQL as well; the read path sketched below shows why.
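A sketch of that fallback read path, assuming the pymemcache client; with an empty cache, every request takes the cold path and hits MySQL at once:

```python
# Sketch: cache-aside read path. Right after a failover the cache is
# empty, so every get() misses and every request stampedes MySQL.
import json
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def get_movie(movie_id, mysql_conn):
    key = "movie:%d" % movie_id
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # warm-cache path

    # Cold-cache path: this is what hammers MySQL after a failover.
    cur = mysql_conn.cursor(dictionary=True)
    cur.execute("SELECT * FROM movies WHERE id = %s", (movie_id,))
    movie = cur.fetchone()
    cache.set(key, json.dumps(movie), expire=300)
    return movie
```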
Word of advice
Identify all the possible ways in which you think 'something unexpected' can happen with MongoDB. Then use the simplest solution for each individual case. For example, if it's data loss you're worried about, use replication. If it's data corruption, use delayed replication (a sketch follows).
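As a concrete example of the delayed-replication option, here is a hedged sketch that uses pymongo to add a hidden, delayed secondary to a replica set; the host names are made up, and on MongoDB versions before 5.0 the field is called slaveDelay rather than secondaryDelaySecs:

```python
# Sketch: add a hidden secondary that stays one hour behind the primary,
# giving you a window to recover from data corruption.
from pymongo import MongoClient

client = MongoClient("mongodb://primary.example.com:27017")
config = client.admin.command("replSetGetConfig")["config"]
config["version"] += 1
config["members"].append({
    "_id": len(config["members"]),
    "host": "delayed.example.com:27017",
    "priority": 0,                 # never eligible to become primary
    "hidden": True,                # invisible to normal client reads
    "secondaryDelaySecs": 3600,    # apply the oplog one hour late
})
client.admin.command("replSetReconfig", config)
```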