Combination MySQL and MongoDB

I am building a web application that needs to be scalable. In a nutshell:
We have users, and users have friends, so each user has a friend list. Users can create messages, and messages from your friends are displayed on the homepage. Each message is linked to a location, and messages can be filtered by date. For example, I want to display all the messages from my friends that were posted yesterday, or all messages from location X.
I am now building the application fully in MongoDB, but I am running into trouble. For example:
On the main page we have the message list of the user's friends. No problem, we use:
$db->messages->find(array('users._id' => array('$in' => $userFriendListGoesHere)));
So then we have our messages. However, each message has a location, so I have to loop through all messages and get the location from another collection. Also, multiple users can be bound to a single message, so we have to get all the user data from another collection as well. In MySQL this is simply a join query; in MongoDB it is two loops. This is my first question: is this a problem? Does all this looping require a lot of resources?
So my idea is to split the data between MySQL and MongoDB: I would use MongoDB to store all the locations (since there are over 350,000 of them and I use lat/long calculations) and MySQL for the messages, users, and friend lists. Second question: can you help me with my decision? Should I keep using MongoDB with the loops, or use a combination?
Thanks for reading and your time.

.. in MySQL this is simply a join query; in MongoDB it is two loops. This is my first question: is this a problem?
This is par for the course with MongoDB; in fact, it's a core MongoDB trade-off.
MongoDB is based on the precept that joins do not scale, so it has no joins and leaves you to "roll your own". Some libraries, like Morphia (for Java), provide built-in logic for loading references.
PHP has the Doctrine project, which should help with some of this.
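In practice, the "roll your own" join usually means a fixed number of batched queries rather than one query per message. Here is a minimal sketch using the same legacy PHP driver style as the question; the location_id field and the locations collection are assumptions, not something from the original post:

    // Fetch the friends' messages in one query, as in the question.
    $messages = iterator_to_array(
        $db->messages->find(array('users._id' => array('$in' => $userFriendListGoesHere)))
    );

    // Collect the referenced location ids, then fetch all locations in ONE
    // extra query instead of one query per message.
    $locationIds = array();
    foreach ($messages as $msg) {
        $locationIds[] = $msg['location_id'];
    }
    $locationsById = array();
    foreach ($db->locations->find(array('_id' => array('$in' => $locationIds))) as $loc) {
        $locationsById[(string) $loc['_id']] = $loc;
    }

    // Stitch each location onto its message in memory.
    foreach ($messages as &$msg) {
        $msg['location'] = $locationsById[(string) $msg['location_id']];
    }
    unset($msg);

Done this way, the "two loops" cost two round trips to the database in total, not one per message.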
Does all this looping require a lot of resources?
Kind of? It really depends on the implementation.
It's obviously going to involve a bunch of back and forth with the DB, but it may be less network traffic than the SQL version. You will need memory space for all of the data coming back. But again, that's not terribly different from SQL.
Really, it's up to you to make all of the trade-offs about how this is implemented and who is keeping what in memory.

Should I keep using MongoDB with the loops?
MongoDB is a great idea when your data is not inherently relational.
In the example you provided, it kinda seems like your data is relational. MySQL and other relational DBs (such as Postgres) are better data stores than MongoDB for relational data. This blog post covers this topic in more detail.
In summary, I'd recommend the following:
Please spend some time analyzing whether your data is inherently relational or not.
If it is not, then MongoDB can give you benefits over using MySQL.
If it is relational, then MySQL is the better solution.
Using both is, of course, possible - but it will create additional work & complexity for you. In the long term - is that worth the effort? Only you will know the answer.
Best of luck with your web app!

Related

How to migrate data from MongoDB to MySQL?

I am currently working on an analytics-like application. It has an AngularJS app which communicates with a Spring REST client app, in which the user creates a token (trackingID) and puts a generated script with this ID on his website to collect information about visitors' actions through another Spring REST tracking app. For the tracking app I am using MongoDB to collect visitor actions and visitor info, because inserts are fast; the REST client app uses MySQL for user/account details.
My question is how to migrate the Mongo data from the tracking app to MySQL, perhaps to gain the possibility of joins, as an easy and fast way to analyze the data with any kind of filter from the AngularJS client app. Should I manually create workers that periodically transfer the data from the last checkpoint to the present state from Mongo to MySQL, or are there existing tools that can be set up for this transfer?
There is no official library to do this.
But you can use mongoexport from MongoDB to export the data in CSV format and mysqlimport to import it into MySQL.
Here are links to the documentation: MySQL import and MongoDB export.
One more method: you can write a program in your favorite language that reads from MongoDB and writes into MySQL.
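A hedged sketch of that last approach in PHP, using the legacy Mongo driver and PDO; every name here (the tracking.actions collection, the DSN, the columns) is a placeholder for your own schema:

    // Stream documents out of MongoDB and insert them into MySQL.
    $mongo   = new MongoClient();
    $actions = $mongo->tracking->actions;

    $pdo  = new PDO('mysql:host=localhost;dbname=analytics', 'user', 'secret');
    $stmt = $pdo->prepare(
        'INSERT INTO actions (tracking_id, action, created_at) VALUES (?, ?, ?)'
    );

    // Wrap the inserts in one transaction so large batches commit quickly.
    $pdo->beginTransaction();
    foreach ($actions->find() as $doc) {
        $stmt->execute(array(
            $doc['trackingId'],
            $doc['action'],
            date('Y-m-d H:i:s', $doc['createdAt']->sec), // MongoDate -> DATETIME
        ));
    }
    $pdo->commit();

For the periodic "from the last point" workers you mention, you would remember the last migrated timestamp or _id and add it as a $gt condition to the find().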
MySQL 5.7 has a new JSON data type, which can be very convenient.
You can create a table in MySQL to receive the JSON messages as is, and then either query it with SQL directly or post-process it to load the data into a structured set of tables.
Check this out: https://dev.mysql.com/doc/refman/5.7/en/json.html
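For illustration, a small PHP/PDO sketch of that "load it as is, query it later" pattern; the table and key names are made up, and the ->> operator needs MySQL 5.7.13+ (on older 5.7 use JSON_UNQUOTE(JSON_EXTRACT(...))):

    $pdo = new PDO('mysql:host=localhost;dbname=analytics', 'user', 'secret');

    // One JSON column receives the Mongo documents verbatim.
    $pdo->exec('CREATE TABLE IF NOT EXISTS raw_events (
        id  INT AUTO_INCREMENT PRIMARY KEY,
        doc JSON NOT NULL
    )');

    $ins = $pdo->prepare('INSERT INTO raw_events (doc) VALUES (?)');
    $ins->execute(array('{"trackingId": "abc123", "action": "click"}'));

    // Query inside the JSON with the 5.7 JSON functions.
    $rows = $pdo->query(
        "SELECT doc->>'$.action' AS action
         FROM raw_events
         WHERE doc->>'$.trackingId' = 'abc123'"
    )->fetchAll(PDO::FETCH_ASSOC);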
I realise this question is a few years old - but recently I've had a number of people enquiring whether a tool I developed (https://virtual.blue/apps/json-converter) can do exactly what the OP is asking (convert MongoDB to SQL) so I am guessing it is still something people want. Keep reading to find out why I am honestly not surprised by this.
The short answer to whether the tool can help you is: perhaps. If your existing data relationships are not too complicated, and your database is not enormous, it may well be worth a try.
However, I thought it might help to try to explain what the issues are with this kind of conversion, since all the answers I have seen so far are along the lines of "try tool X" or "first convert to format Y and then you can slurp it into MySQL using utility Z", i.e. with no thought to whether what you get at the end is going to make sense in terms of data relationships and integrity.
For example, you could just stick your entire database dump in a single field of a single SQL table (ok space limitations might prevent this in reality, but hopefully you get my point). Then your database would be "in MySQL format", but it would be absolutely no use to anyone.
The point is, what you actually want is a fully defined database model, correctly encapsulating all of the intrinsic data relationships. ("Database normalization" as it is known.) If your conversion process gets those relationships wrong, then you have a broken model, and any queries you try to run over it are likely to return nonsense. Unfortunately there is no magic tool that is just going to "know" the best way to represent your data in MySQL, and closing your eyes and shovelling it into a bunch of random tools is unlikely to miraculously get you what you want.
And herein lies the fundamental problem with the "NoSQL" philosophy (fad). They sold people the bogus notion of "non-relational data". My first thought when I heard this was, "How does that work? Surely all data is relational?" By the looks of things we are steadily getting more and more evidence that my instincts were right. ("NoSQL? Why stop there? I go with 'NoDatabase'. It returns no results at all, but it sure is fast!")
The NoSQL madness throws several important fundamental engineering principles to the wind. We shouted "don't hard code!", "DRY!" (Don't Repeat Yourself) because these actions infuse inflexibility into systems. Traditional wisdom makes precisely the same flexibility argument when it advises "create a fully described model with all the data relationships represented". Then you can execute any arbitrary query over it and expect meaningful results. "Yes but there are a whole bunch of queries we are never going to need to run," says the NoSQL proponent. But surely we learnt our lesson on things we are "never going to need to do"? ("I hard code liberally, because I know I am never going to want to change my code." Hmm...)
The arguments about speed are largely moot. Say it turns out you are frequently doing a complex 9 table join, with unsurprisingly sluggish performance. So create an index. Cache it. Swap some disk space for speed. The NoSQL philosophy is to swap data integrity for speed, which makes no sense at all.
When you generate your fast lookup index (cache/table/map/whatever) what you are really doing is creating a view over your model. If your model changes, you can readily update your view. Going from a model to a view is easy - it's a one to many operation and you are on the right side of entropy.
However, when you went with MongoDB you effectively decided to create views without bothering to describe your fundamental model. Now you discover there are queries you want to run, but can't - and so it's no wonder you want to move over to SQL and actually have your data modelled correctly. The problem is you now want to go from a view to a model. Now you're on the wrong side of entropy. Your view is a lossy representation of the model's fundamental relationships. You can't expect a tool to "translate" your database, because you are asking it to insert new relationships which were not originally defined. These are real world relationships that are not machine-guessable. The tool cannot know what relationships were intended.
In short the only way you can do this reliably is to get your hands dirty. An intelligent human, with complete understanding of the system you are modelling needs to sit down and carefully come up with (possibly a substantial amount of) code which effectively picks through the data and resolves all of the insufficiently represented data relationships. If your data is complex then it's going to be a headache and there is no way to cheat.
If your data is still relatively simple then I would suggest making the conversion as soon as possible, before it becomes difficult. In this case my tool (https://virtual.blue/apps/json-converter) may be able to help.
(They really should have asked a Physicist before they came up with all this nonsense...!)
You can download a trial version of Studio 3T for Mongo and export your database to SQL (or JSON) directly.

Using MySQL and MongoDB together

I have never used NoSQL before; generally the applications I write require relations. However, I have encountered something that I don't know how to approach. So far I am only designing the database, and for now my main logic is in the MySQL database. I have static content that I will be hosting through a CDN, but I also have dynamic content that will be updated very rarely yet read on almost every request - like phone number, email address, address, and additional info.

This data will not be used for searching, and it is unstructured: a user can have multiple email addresses, phone numbers, and addresses, and they would be needed across multiple tables. So a relational database fails my needs here (I don't want to create an Entity-Attribute-Value table for this), and since I know this data doesn't affect the logic - it's only used as "metadata" - I want to keep it in a JSON format. After Googling for some time, I found that MongoDB stores "documents" as JSON, which sounded like the perfect solution.

However, I have one question regarding this. How do I connect these databases together? Do I just add a user_id or organization_id "column"/field to a document on create/update and do a "select" query (or whatever the MongoDB equivalent is) to retrieve the metadata? Or is there a different way?
I'll present my opinion here. What you're trying to do is called "polyglot persistence". If you introduce Mongo, you'll have two architectures for storing your data, different in strengths, API, design, what not, and this has its price.
MongoDB is a great product - I've used it myself with great success - but you have to understand that it doesn't provide all the features you would expect from an RDBMS like MySQL. For example, it totally lacks transactions.
Moreover, if you store data in both MySQL and Mongo, you'll have to take care of data integrity yourself (what happens if, as part of a logical transaction, the MySQL transaction succeeds but Mongo fails to store the data?). There is no rollback...
I believe you've got my point.
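To make that failure mode concrete, here is one common workaround sketched in PHP: write the Mongo side first, then compensate by hand if the MySQL transaction fails. All names are illustrative, and this is an ordering strategy, not a real cross-store transaction (the cleanup can itself fail):

    $mongo = new MongoClient();
    $meta  = $mongo->app->user_meta;
    $pdo   = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

    // 1. Write the Mongo document first and remember its id.
    $metaId = new MongoId();
    $meta->insert(array('_id' => $metaId, 'emails' => $emails));

    // 2. Then write MySQL; on failure, delete the Mongo document by hand,
    //    since no rollback spans both stores.
    try {
        $pdo->beginTransaction();
        $stmt = $pdo->prepare('UPDATE users SET meta_id = ? WHERE id = ?');
        $stmt->execute(array((string) $metaId, $userId));
        $pdo->commit();
    } catch (Exception $e) {
        $meta->remove(array('_id' => $metaId));
        throw $e;
    }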
Yes, Mongo really does allow you to query by various JSON parameters; in fact, it features a whole query language, which resembles SQL to some extent. But it's not really a "relational" query engine, because Mongo is not a relational database, so you don't have JOINs, for example. But you've said yourself that you are not going to search by these fields, so I don't quite understand what benefit you would get from using Mongo. Maybe this is only about terminology, but that statement confuses me a little.
Where Mongo really shines is when you have a lot of data (it's a big data product, after all); then you get features like replica sets and sharding. But the question is whether you really need them: do you really have "big data", a really huge number of objects to store?
As an alternative, I think you can use a text column for storing the JSON "as is". I mean, you might have a column that simply stores the JSON.
Some databases even have "JSON" as a native type; I'm not sure whether MySQL supports it.
In that case you can even do some operations on these JSONs (like append, partial update, and so forth).
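(For what it's worth, MySQL 5.7 and later do ship a native JSON type. A quick hedged sketch of a partial update, with illustrative table and column names:)

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

    // Update one key inside the stored JSON without rewriting the whole blob.
    $stmt = $pdo->prepare(
        "UPDATE users SET meta = JSON_SET(meta, '$.phone', ?) WHERE id = ?"
    );
    $stmt->execute(array('+1-555-0100', 42));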
Of course the choice is yours. All I'm saying is that you should think about whether two persistence engines give you more benefits, or just make your project more complicated.
Hope this helps.

Neo4j or Neo4j+MySQL for a partial graph dataset

Even though I read another question here advising not to use both Neo4j and MySQL (neo4j - graph database along with a relational database?), I was wondering what approach would be best for a dataset that has some data which can be modeled as a graph while the rest looks relational. For some reasons, I can't post the kind of data I'm using.
I can shoehorn the relational part into Neo4j, but it looks ugly and complex, something I want to avoid.
On the other hand, if I use both together, I'll have to run double the number of queries to get a result, decreasing performance (assume the DBs are in the cloud on separate machines).
I can't use MySQL alone because one of the queries requires a traversal depth of around 20-30, which I assume can't be handled by MySQL.
Have any of you encountered such a situation before? If so, how did you solve it?
As everyone else says: "give us a better idea of what data you are trying to model so we can best give you a suggestion".
That being said, dealing with two DBs is not an issue; it's more common than people think. Oftentimes you use a full-text store for searches, get back a list of document IDs, and then hit the relational DB for additional metadata. Or you hit Redis to get a list of IDs and then query the relational DB for more data.
I proof-of-concepted a Neo4j+MySQL system for targeted searching based on your social network ("show me all restaurants my network has recommended, ordered by depth", e.g. first-level friend recommendations weighted higher than second-level, and so on), and it didn't feel awkward. But I also didn't take it to scale.
You will have to keep both datastores in sync. So in my case, when a user recommends a place on the web app (which inserts it into MySQL), you then need to turn around and do the same insert into Neo4j. You probably want to do this asynchronously as well, so you'll need to set up a message queue with workers.
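A rough sketch of that async pattern in PHP, using a Redis list as the queue; the table, queue name, and job shape are all placeholders:

    $pdo   = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $redis = new Redis();
    $redis->connect('127.0.0.1');

    // 1. The authoritative write goes to MySQL.
    $stmt = $pdo->prepare('INSERT INTO recommendations (user_id, place_id) VALUES (?, ?)');
    $stmt->execute(array($userId, $placeId));

    // 2. Enqueue a job; a separate worker pops these and replays the same
    //    change into Neo4j, keeping the stores eventually consistent.
    $redis->rPush('neo4j_sync', json_encode(array(
        'type'     => 'recommend',
        'user_id'  => $userId,
        'place_id' => $placeId,
    )));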

MongoDB, MySQL and relationships

I'm creating an online chat.
Context (if needed):
So far I have been using PHP/MySQL and AJAX to do the job, but this is not a healthy solution, as I'm stuck with a "pull"-type application and have concerns about scalability.
I read about the "push" method alternatives, and it seems that my choices are limited and exclude PHP.
WebSockets could be a very interesting option if they were integrated into every browser, but that's not the case (and it seems that most browsers implementing them have them disabled by default).
Long polling would also be a candidate, but it involves other issues, like the number of concurrent open connections, which may kill your web app too.
This is why, against my will, I think that my only viable option is to use server-side javascript (node.js + now.js would be my choice then).
This said, I may need to rethink the use of a database too.
I need to store data for each user and link these users to their submitted messages.
In case of a chat engine driven by a push system, would MySQL still be a valuable choice then?
I read about NoSQL data management and it seems that MongoDB would be a good addition to node.js.
My two questions:
Is there a reason I'm better off moving to a NoSQL system (which I would need to learn from scratch) instead of MySQL (which I already know) for a real-time web app?
Let's say that in MySQL:
I have a table called user (user_id_p, username)
I have a table called messages (message_id, message, user_id_f)
I want to make a single query to get all the messages associated with the username "omgtheykilledkenny".
Simple enough but how can I achieve that with MongoDB and its collections philosophy?
Thank you for your help.
Working with node.js/MongoDB is cool because Mongo's document structure is already JSON-ish, so you don't have to convert your queries to JSON. If you already know JavaScript, you have a head start learning MongoDB. Mongo scales for writes and reads pretty easily, and the speed is pretty awesome, although I've seen some MySQL benchmarks on a single system that compare well to Mongo; it really shines when you start needing multiple boxes.
Assuming you have a separate messages collection and you already know the id of the user, you could just do: db.messages.find({user_id: ObjectId(...)});
Update: If you don't know the user id, then yes, you need to do two queries (unless you use an embedded array, as recommended in the other answer; I would advise against that for this sort of use case, though, because you'll end up fetching the entire document/list of messages even to display just a subset). Depending on your use case, if you have the username, you could also keep the user id handy for situations like this. If the username comes from client input, that wouldn't work.
Update 2: If you have unique usernames, you could make the username the _id of the users collection to avoid this issue. Most people would probably advise against this, and it has some definite drawbacks, such as making it harder to change a username.
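For completeness, the two-query version looks like this (sketched with the legacy PHP driver to match the rest of the thread; the field names are assumptions):

    // 1. Look up the user by username.
    $user = $db->users->findOne(array('username' => 'omgtheykilledkenny'));

    // 2. Fetch that user's messages by the stored user id.
    $messages = $db->messages->find(array('user_id' => $user['_id']));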
You can't perform joins in MongoDB, so you can't achieve your second requirement directly. The Mongo way to do this would be either to nest messages within the user collection:
{ username: 'abc', messages: [...]}
Or use refIds, which are a kind of halfway house between joins and nested documents:
http://uk3.php.net/manual/en/class.mongodbref.php
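A short sketch of such a reference with the legacy PHP driver (collection and field names assumed): MongoDBRef just builds the conventional {$ref, $id} sub-document, and resolving it is an explicit second query.

    // Store a reference to a user inside a message document.
    $ref = MongoDBRef::create('users', $userId);
    $db->messages->insert(array('message' => 'hi', 'user' => $ref));

    // Resolving the reference later takes a second, explicit query.
    $msg  = $db->messages->findOne();
    $user = MongoDBRef::get($db, $msg['user']);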
In terms of switching from MySQL to Mongo, you don't necessarily need to ditch MySQL entirely. There are use cases where one is more appropriate than the other, and you could use both for different parts of the system if it makes sense to do so. Personally, I've used MySQL for a lot of things in the past, and I'm using MongoDB for a big project at the moment. I found the move very easy to make, because the MongoDB driver is so easy to use, and the MongoDB site is very good for documentation on the whole.
You can convert to and from JSON with json_encode and json_decode on the front end, and you query and insert/update with arrays using MongoDB's PHP driver, so it's arguably more intuitive and easier to use than MySQL. It's just a question of getting used to it.
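For example (a tiny sketch; the field names are invented):

    // Driver results are plain PHP arrays, so shipping one to the browser
    // is a one-liner.
    $doc = $db->users->findOne(array('username' => 'abc'));
    echo json_encode($doc);

    // Coming back the other way: decode to an array and store it.
    $incoming = json_decode($jsonFromClient, true);
    $db->users->update(
        array('username' => 'abc'),
        array('$set' => array('profile' => $incoming))
    );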

MongoDB - proper use of collections?

In Mongo, my understanding is that you can have databases and collections. I'm working on a social-type app that will have blogs and comments (among other things), and I had previously been using MySQL with pretty heavy partitioning in an attempt to limit possible concurrency issues.
With MySQL I've stuffed all my user data into a _user database, with several tables to further partition the data (blogs, pages, etc.).
My immediate reaction with Mongo would be to create a 'users' database with one collection per user. That way, user 'zach's blog entries would go into the 'zach' collection, with associated comments and such becoming sub-objects in the same collection. Basically like dynamically creating one table per user in MySQL, but apparently without the complexity and limitations that might impose.
Of course, since I haven't really used Mongo before, I'm having trouble gauging the (ahem..) quality of this idea and the potential problems it might cause down the road.
I'd like user data to be treated a lot like a user directory in a *nix environment, where user-created/non-shared content (mostly) gets put into one place (currently with MySQL that would be the appname_users database mentioned above).
Most of the user data will be specific to the user's page(s). Some of the user data that is queried across all site users (searchable user profiles) is currently kept in a separate database/table, and I expect things like this could be put into an appname_system database and broken up into collections and/or application-specific databases (appname_profiles).
Anyway, since the available documentation on this is currently a little thin and my experience is extremely limited, I thought I might find a little guidance from someone with a better working understanding of the system.
On the plus side, I'd already been attempting to treat MySQL as a schema-less document store, and doing this with Mongo seems much more intuitive/sane/rational, so I'm really looking forward to getting started.
Thanks,
Zach
I have the same kind of application.
Some things to consider: you can cross-query between collections but not between databases.
So it's probably better to have one database with all your data, and then a collection for each type of object.
Each document can then contain any kind and number of fields.
I tried to avoid embedding arrays because I had trouble querying my objects properly (it was working fine, but only because the architecture of my system was designed for that use).
And a database can be shared between several servers automatically, so space is not an issue (if you have more than one server).
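Concretely, the layout suggested above might look like this (a hedged sketch with the legacy PHP driver; all names are illustrative):

    $m  = new MongoClient();
    $db = $m->appname;  // one database for everything

    // One collection per object type; every document is tagged by user.
    $db->blogs->insert(array(
        'user'  => 'zach',
        'title' => 'First post',
        'body'  => '...',
    ));

    // All of zach's blog entries: one collection, one query.
    $zachBlogs = $db->blogs->find(array('user' => 'zach'));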