Use NoSQL? And if yes, how? - mysql

I read and heard a lot (podcasts, Stack Overflow questions...) about NoSQL databases and I am really curious to use them, but...
Although I read a lot of things like how-to-sql-or-nosql or what-scalability-problems-have-you-solved-using-a-nosql-data-store, I am still not certain which kind of DB to use.
The problem is: for a (school) project we (my project group) need to implement quite a big database (that should serve a REST server, probably written in Erlang, with lots of clients).
We are quite good at designing data models for relational databases, so we started to do that.
Now I played around with some NoSQL databases and was really impressed by the performance.
So: is it a good idea to use a NoSQL database? Our data model has lots of relations and the queries would have lots of joins (or at least use joined views).
I sometimes read that this means I should go with a relational database, and in other places I read that this means I could easily redesign it in NoSQL style to lose the overhead of relations.
Should I use NoSQL and, if yes, which of the systems would you suggest?
Are things like HandlerSocket for MySQL an option?
And how can I easily redesign a relational data model in NoSQL style?

The answer to your question is: it totally depends on your data and requirements. In a real-world project you would analyze the benefits of various NoSQL databases (HBase, Cassandra, MongoDB, CouchDB, Riak, ...) for your specific project. Then you would weigh these against the benefits of a classical RDBMS like MySQL.
In a school project like yours, choosing a NoSQL database is mainly a matter of taste, as your project will probably never benefit from typical NoSQL advantages like schemalessness or sharding.
Redesigning a relational data model can be a very tricky task, as you have to wrap your mind around the different database model of the chosen NoSQL database. Joins are not necessarily a problem if your business data fits that model. Sometimes join-intensive relational models are a lot easier to implement in some NoSQL databases (e.g. a document-oriented database like MongoDB), because rows that are always read together can be embedded in a single document.
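To make the redesign idea a bit more concrete, here is a minimal sketch of folding a one-to-many join into nested documents. The `author`/`post` schema and the `to_documents` helper are made up for illustration, and SQLite only stands in for whatever relational database you currently use; the resulting dictionaries are the kind of shape you might then store in a document database.

```python
import json
import sqlite3

# Hypothetical relational schema: authors and their posts (one-to-many).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title TEXT NOT NULL
    );
    INSERT INTO author VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO post VALUES (1, 1, 'Hello'), (2, 1, 'Joins'), (3, 2, 'Erlang');
""")


def to_documents(conn):
    """Fold the author/post join into one nested document per author."""
    docs = {}
    rows = conn.execute("""
        SELECT a.id, a.name, p.id, p.title
        FROM author AS a
        LEFT JOIN post AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """)
    for author_id, name, post_id, title in rows:
        doc = docs.setdefault(
            author_id, {"_id": author_id, "name": name, "posts": []}
        )
        if post_id is not None:  # authors without posts keep an empty list
            doc["posts"].append({"id": post_id, "title": title})
    return list(docs.values())


documents = to_documents(conn)
print(json.dumps(documents, indent=2))
```

Whether embedding is the right choice depends on how the data is read: it works well when posts are always shown with their author, and less well when posts are queried on their own.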
If you really want to try out NoSQL, go with MongoDB, as it is very well documented and a good first entry point.
Since you are a German speaker (greetings to Switzerland from Berlin), I recommend reading the following German-language book, which explains the main reasons for using a NoSQL database and the main steps to get started with the most popular ones: NoSQL: Einstieg in die Welt nichtrelationaler Web 2.0 Datenbanken

Please keep in mind that you are not required to use just one data storage engine. You can use SQL and NoSQL solutions in parallel.
Just remember to document your database/NoSQL structures properly.
-daniel

If you want to do joins in NoSQL, you could use playOrm, which does joins on partitions. In this way, you can have a table with 1 trillion rows where one partition of that table may only be 100,000 rows, and you can join that partition with another one. playOrm also gives you all the familiar Hibernate relationships.
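The general idea of joining one partition with another can be sketched in plain Python. This is not playOrm's API; the partition layout, the field names, and the `join_partition` helper are all assumptions made up for illustration. The point is only that the join touches two small partitions and never the full tables.

```python
from collections import defaultdict

# Hypothetical data: each "table" is stored as partitions keyed by account id.
trades_by_partition = {
    "acct-1": [
        {"trade_id": 1, "security_id": 10},
        {"trade_id": 2, "security_id": 11},
    ],
    "acct-2": [{"trade_id": 3, "security_id": 10}],
}
securities_by_partition = {
    "acct-1": [
        {"security_id": 10, "symbol": "AAA"},
        {"security_id": 11, "symbol": "BBB"},
    ],
    "acct-2": [{"security_id": 10, "symbol": "AAA"}],
}


def join_partition(left_rows, right_rows, key):
    """Hash join of two partitions: index the right side, then probe it."""
    index = defaultdict(list)
    for row in right_rows:
        index[row[key]].append(row)
    return [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[key], [])
    ]


# Only the "acct-1" partitions are read; the other partitions are never touched.
joined = join_partition(
    trades_by_partition["acct-1"],
    securities_by_partition["acct-1"],
    "security_id",
)
print(joined)
```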

Related

RDBMS vs NoSQL for message boards site

I'm in the process of designing and planning a new website.
It is mainly a message board site.
I have past experience with MySQL, but I hear many voices (not in my head)
telling me that NoSQL can be as good a solution as an RDBMS.
The main claim for NoSQL is performance. What do you think about it?
So,
I need a scalable database technology for my website.
If I go with NoSQL, I know there are a couple of technologies in this area
(document store, key-value store, etc.). How do I choose?
What do you think is more suitable for a message board website:
NoSQL or MySQL?
Thanks,
socksocket
Both SQL and NoSQL can be used for your purpose. The two main reasons to go with NoSQL are that you really have a lot of traffic (and your SQL solution is not keeping up performance-wise) or that you have a lot of unstructured and changing data that benefits from being schema-less.
Personally, I believe a significant factor for you to consider is maintainability.
If you create anything using NoSQL, you are going to have less than 10% of the audience for maintaining it compared to SQL.
It is common for programmers to want to use the technically 'best' solution without factoring in maintainability and cost, especially when they consider the solution 'simple'.
For your purposes, I think a NoSQL database is probably a better choice than MySQL. You should check out something like MongoDB or CouchDB; both are open-source, scalable NoSQL DBs (and, as already mentioned, there are other NoSQL DBs and file storage systems commercially available).
Basically, message boards do not really need a relational DBMS. In a relational DBMS, query processing can be slower than in a NoSQL DB, and message boards can have a high volume of traffic as well as data that does not necessarily have a fixed schema. The flexibility of NoSQL with regard to data structure makes it easier to use sharding, partitioning, indexing and other techniques.
Although performance is one of the key elements, it is not a feature of NoSQL as such; it is more a consequence of design. What I think is THE feature is the flexibility of its data structure and the possibility of storing information in a single row, avoiding multiple round trips when you work with records that are closely related (take a look at this post http://djondb.com/blog to get a better understanding of what I'm talking about).
For any website that needs to change its model on a daily basis, it's wise to choose a DB which can keep up with this flexibility.
I'm a little bit biased because I'm the author of a NoSQL document store, but I suggest you give a NoSQL document store a try. You'll be surprised at how fast you can create solutions using that kind of easy-to-store approach.
Have you looked at Redis (http://redis.io/)?
You can model almost everything you have in your RDBMS with Redis. In most cases you will get 10x performance, and it is supported by a great and very active community.
I suggest that you detail your needs in the Redis forum, and you will probably get the most honest and professional responses; some of them may suggest that you use other NoSQL technologies for different parts of your architecture.

Are there any advantages to using mongodb over mysql if said mongo db were used without embedded documents?

I'm using a PHP framework with a MongoDB adapter that doesn't currently comprehend embedded documents as a Model/association relationship. After reading about MongoDB for a few days it seems that you should use embedded documents for objects that are most often displayed together. This makes a lot of sense to me. It was said during one MongoDB schema talk that a collection of many small documents can negate some of the advantages of MongoDB over an RDBMS.
In searching Stack Overflow and beyond, I can't seem to see what advantages exist, if any, when deploying MongoDB into an environment where it is implemented with a reasonably normalized schema like you'd find in a traditional RDBMS.
Are there still advantages to using MongoDB when used in this way? Scaling? Performance?
If by "reasonably normalized" you mean that you need information from one table to filter the information from another table (i.e. a join), then mongo is going to work against you. In a SQL database you can easily get the info from multiple tables with a single query. In mongo you'll need multiple queries to get data from multiple collections. Any speed advantage mongo gives you in pulling from a single collection will quickly be negated by making multiple round trips to the database.
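The difference between one joined query and several round trips can be sketched roughly like this. SQLite is used for both variants only so the example runs on its own; the `users`/`orders` schema and both helper functions are hypothetical. The second function mimics what application code tends to look like when the two data sets live in separate collections and the join has to happen in the client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        total REAL NOT NULL
    );
    INSERT INTO users VALUES (1, 'CH'), (2, 'DE'), (3, 'CH');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 2, 20.0), (3, 3, 30.0);
""")


def orders_single_query(conn, country):
    """One round trip: the database performs the join."""
    rows = conn.execute(
        "SELECT o.id FROM orders AS o "
        "JOIN users AS u ON u.id = o.user_id "
        "WHERE u.country = ? ORDER BY o.id",
        (country,),
    )
    return [row[0] for row in rows]


def orders_two_queries(conn, country):
    """Two round trips: fetch matching user ids, then filter orders by them."""
    user_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM users WHERE country = ?", (country,)
        )
    ]
    if not user_ids:
        return []
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT id FROM orders WHERE user_id IN ({placeholders}) ORDER BY id",
        user_ids,
    )
    return [row[0] for row in rows]


print(orders_single_query(conn, "CH"))
print(orders_two_queries(conn, "CH"))
```

Both functions return the same order ids; the second one simply pays for an extra trip to the database, and that cost grows with every additional collection involved.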
Here are some advantages that MongoDB might give you (depending on your use case):
Schemaless: More flexible if the document structure is modified later.
Performance: MongoDB utilizes the available RAM very well, making it very performant.
Easy replication: Replication is easy to set up.
Sharding/Clustering: MongoDB is designed with sharding in mind. It is easy to set up and doesn't require experts.
Map/Reduce: If you happen to need this, there is built-in support.
JavaScript: Intuitive to use if you already know JavaScript (and who doesn't nowadays :) )
The MongoDB website has a good list of case studies of production deployments.
MongoDB has replication and sharding built in.
These are things that can also be done with MySQL.
The downside is the learning curve and lack of programmers that know it.
If it's just for you, it would be fun as a learning project.
If this is for a larger project, you'll need to weigh the lack of MongoDB programmers and the learning curve against the popularity of MySQL.
I developed my university dissertation project with MySQL first, then thought I would give MongoDB a shot to improve performance. Rewriting the code was really easy and straightforward with Jongo. Production has been really smooth.
Unfortunately, performance was terrible. I am not particularly skilled with MongoDB queries, but I believe I did quite a lot of research: I used map-reduce, the aggregation framework, $limit and all that stuff... When at some stage I got the message "request heap use exceeded 10% of physical RAM", I really gave up and delivered the MySQL version.
For me it's really a shame, because I was working so hard to make it work the best way possible with MongoDB (as a university project stands out if you do something different). However, I think I will continue to study MongoDB in the future, but for the moment I will stick to what performs (or rather, what I can make perform).
I hope my comment will not offend MongoDB fans, but this is my experience.

Using both Mongodb and Mysql in one project

I have been working for one week to learn MongoDB effectively in order to use it for my project. In my project, I will store a huge amount of geolocation data and I think MongoDB is the most appropriate place to store this information. In addition, speed is very important for me and MongoDB responds faster than MySQL.
However, I will use some joins for some parts of the project, and I'm not sure whether I should store users' information in MongoDB or not. I heard some issues can occur in MongoDB during the writing process. Should I use only MongoDB with collections (instead of joins) or both of them?
In most situations I would recommend choosing one DB for a project, if the project is not huge. On really big projects (or enterprises in general), I think that, long term, organizations will use a combination of:
RDBMS for highly transactional OLTP
NoSQL
a datawarehousing/BI project
But for things of more reasonable scope, just pick the one that does the core of the use case, and use it for everything.
IMO storing user data in mongodb is fine -- you can do atomic operations on single BSON documents so operations like "allocate me this username atomically" are doable. With redo logs (--journal) (v1.8+), replication, slavedelayed replication, it is possible to have a pretty high degree of data safety -- as high as other db products on paper. The main argument against safety would be the product is new and old software is always safer.
If you need to do very complex ACID transactions -- such as accounting -- use an RDBMS.
Also, if you need to do a lot of reporting, MySQL may be better at the moment, especially if the data set fits on one server. The SQL GROUP BY statement is quite powerful.
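As a small illustration of the kind of reporting GROUP BY makes easy, here is a sketch using SQLite as a stand-in for MySQL. The `checkins` table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE checkins (
        id INTEGER PRIMARY KEY,
        city TEXT NOT NULL,
        visitors INTEGER NOT NULL
    );
    INSERT INTO checkins (city, visitors) VALUES
        ('Berlin', 3), ('Zurich', 5), ('Berlin', 7), ('Zurich', 1), ('Bern', 2);
""")

# One statement produces a per-city summary, sorted by total visitors.
report = conn.execute("""
    SELECT city, COUNT(*) AS checkin_count, SUM(visitors) AS total_visitors
    FROM checkins
    GROUP BY city
    ORDER BY total_visitors DESC, city
""").fetchall()

for city, checkin_count, total_visitors in report:
    print(city, checkin_count, total_visitors)
```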
You won't be JOINing between MongoDB and MySQL.
I'm not sure I agree with all of your statements. Relative speed is something that's best benchmarked with your use case.
What you really need to understand is what the relative strengths and weaknesses of the two databases are:
MySQL supports the relational model, sets, and ACID; MongoDB does not.
MongoDB is better suited for document-based problems that can afford to forego ACID and transactions.
Those should be the basis for your choice.
MongoDB has some nice features to support geolocation work. It is not, however, necessarily faster out of the box than MySQL. There have been numerous benchmarks indicating that MySQL in many instances outperforms MongoDB (e.g. http://mysqlha.blogspot.com/2010/09/mysql-versus-mongodb-yet-another-silly.html).
Having said that, I've yet to have a problem with MongoDB losing information during writing. I would suggest that if you want to use MongoDB, you use it for the users as well, which will avoid having to do cross-database 'associations', and only migrate the users to MySQL if it becomes necessary.

Can redis fully replace mysql?

Simple question, could I conceivably use redis instead of mysql for all sorts of web applications: social networks, geo-location services etc?
Nothing is impossible in IT. But some things might get extremely complicated.
Using key-value storage for things like full-text search might be extremely painful.
Also, as far as I can see, it lacks support for large, clustered databases: on MySQL you have no problems if you grow beyond hundreds of GB of data, and on Redis... well, it will require more effort :-)
So use it for what it was developed for: storing simple things which just need to be retrieved by id.
ACID compliance is a must if data integrity is important. Medical records and financial transactions would be examples. Most NoSQL solutions, including Redis, are fast because they trade ACID properties for speed.
Sometimes data is simply more convenient to represent in a relational database, and the queries are simpler.
Also, thanks to foreign keys and constraints in relational databases, your data is more likely to be correct. Keeping data in sync in NoSQL solutions is more difficult.
So, no I don't think we can talk about full replacement. They are different tools for different jobs. I wouldn't trade my hammer for a screwdriver.
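To illustrate the point about foreign keys and constraints, here is a minimal sketch with SQLite standing in for MySQL. The `patient`/`record` schema and the `insert_record` helper are hypothetical. The database itself refuses a row that points at a patient who does not exist, which is the kind of check you would otherwise have to write in application code on top of a plain key-value store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE record (
        id INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(id),
        note TEXT NOT NULL
    );
    INSERT INTO patient VALUES (1, 'Alice');
""")


def insert_record(conn, patient_id, note):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO record (patient_id, note) VALUES (?, ?)",
                (patient_id, note),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_record(conn, 1, "checkup")
rejected = insert_record(conn, 99, "points at a missing patient")
print(accepted, rejected)
```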

Cassandra or MySQL/PostgreSQL?

I have a huge database (kind of like WordNet) and want to know if it's easier to use Cassandra instead of MySQL or PostgreSQL.
All my life I have been using MySQL and PostgreSQL and I can easily think in terms of relational algebra, but several weeks ago I learned about Cassandra and that it's used at Facebook and Twitter.
Is it more convenient?
What DBMSs are usually used nowadays to store social network data, relationships between objects, or something like WordNet?
There is no silver bullet; every system is built to solve a specific problem and has its own pros and cons. It is up to you to decide what your problem statement is and which solution fits it best. Whether you use Cassandra (NoSQL) or MySQL (RDBMS) should be driven by your system's requirements. Below are some inputs that will help you make a better decision when choosing a database.
Why to Use NoSQL
With RDBMS databases, making a choice is quite easy, because almost all the databases in this category, like MySQL, Oracle, MS SQL and PostgreSQL, offer much the same kind of solution, oriented around the ACID properties. When it comes to NoSQL, the decision becomes difficult, because every NoSQL database offers a different solution and you have to understand which one is best suited to your app/system requirements. For example, MongoDB fits use cases where your system demands a schema-less document store. HBase might fit search engines, log data analysis, or any place where scanning huge, two-dimensional, join-less tables is a requirement. Redis is built to provide in-memory storage for a variety of data structures such as lists, sets, sorted sets and hashes, and can be a good fit for real-time leaderboards or pub-sub kinds of systems. Similarly, there are other databases in this category (including Cassandra) which fit different problems. Now let's move to the original questions and answer them one by one.
When to use Cassandra
Being part of the NoSQL family, Cassandra offers a solution for problems where you have a very write-heavy system and want a quite responsive reporting system on top of the stored data. Consider the use case of web analytics, where log data is stored for each request and you want to build an analytical platform around it to count hits by hour, by browser, by IP, etc. in real time. You can refer to this blog post (http://blogs.shephertz.com/2015/04/22/why-cassandra-excellent-choice-for-realtime-analytics-workload/) to understand more about the use cases where Cassandra fits.
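The web analytics pattern described above boils down to incrementing pre-aggregated counters on every write, so that reports only read counters instead of scanning raw logs. A rough sketch of that idea follows. The in-memory `Counter` objects are only a stand-in for counters kept in the database, and the log entries and the `record_hit` helper are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical request log entries: (ISO timestamp, browser, client IP).
requests = [
    ("2015-04-22T10:05:00", "Firefox", "10.0.0.1"),
    ("2015-04-22T10:40:00", "Chrome", "10.0.0.2"),
    ("2015-04-22T11:15:00", "Firefox", "10.0.0.1"),
]

hits_by_hour = Counter()
hits_by_browser = Counter()
hits_by_ip = Counter()


def record_hit(timestamp, browser, ip):
    """Write path: bump one counter per reporting dimension."""
    hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
    hits_by_hour[hour] += 1
    hits_by_browser[browser] += 1
    hits_by_ip[ip] += 1


for entry in requests:
    record_hit(*entry)

# Read path: a report is just a lookup, no scan over the raw requests.
print(hits_by_hour["2015-04-22 10:00"], hits_by_browser["Firefox"])
```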
When to Use an RDBMS instead of Cassandra/NoSQL
Cassandra is a NoSQL database and does not provide full ACID transactions or relational features such as joins. If you have a strong requirement for ACID properties (for example, financial data), Cassandra would not be a fit. Obviously, you can make it work, but you will end up writing lots of application code to handle ACID properties and will lose badly on time to market. Managing that kind of system with Cassandra would also be complex and tedious for you.
There are many different flavours of "NoSQL" databases. If your application is really like Wordnet perhaps you should look at a graph database such as Neo4j.
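To show why a WordNet-like data set is naturally a graph problem, here is a small sketch of the kind of traversal a graph database does for you. It is plain Python and not Neo4j's API; the `hypernyms` edges and the `ancestors` helper are invented for the example.

```python
from collections import deque

# Hypothetical WordNet-like "is a kind of" edges.
hypernyms = {
    "poodle": ["dog"],
    "dog": ["canine"],
    "canine": ["mammal"],
    "cat": ["feline"],
    "feline": ["mammal"],
    "mammal": ["animal"],
}


def ancestors(word):
    """Breadth-first walk over hypernym edges, returning ancestors in visit order."""
    seen = {word}
    order = []
    queue = deque([word])
    while queue:
        current = queue.popleft()
        for parent in hypernyms.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


print(ancestors("poodle"))
```

In a relational schema, each hop of this walk tends to become another self-join (or a recursive query), which is where a graph model starts to feel more natural.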
I would suggest analysing your requirements.
If you are going to run on many clusters or machines, take NoSQL.
If your data model is complicated and requires efficient structures, take NoSQL (no limits on column types).
If you fit on a few machines without scaling out, you don't need very high performance for many concurrent requests (as, for example, in a social network where lots of users send HTTP requests), and you don't expect scalability to become an issue, take an RDBMS (Postgres has some good functions and structures you can use, like the array column type).
Cassandra should work better with large volumes of data and multiple purposes.
Neo4j would be better for special structures such as graphs.
Cassandra and other NoSQL stores are being used for social based sites because of their need for massive write based operations. Not that MySQL and Postgres can't achieve this but NoSQL requires far less time and money, generally speaking.
Sounds like you may want to look at Neo4j though, just in terms of your object model needs.
These are all different products, and they all have their pros and cons. What kind of problem do you have to solve?
Huge, as in TBs?