What database to use for a clicks stream application RELATIONAL OR NOSQL? [closed] - mysql

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
I need to create a database for handling a click stream (from ~240 subdomains). I use JavaScript to grab information such as (Host, Page, Date, userID, Referer, HostName, RefererPath, uniqueUserID) for each click and then insert the data into the database through a Java dynamic web application. There are about 9 million new records each day, and I have to insert new records every minute. Another application needs to be able to retrieve information about pageviews/unique visitors/etc. for a certain article/subdomain in the last (10min, 20min, 30min, 1hour...24 hours). I only need to keep records for the last 3 months.
Initially I thought about using MySQL, as I'm only interested in open source, but I'm also thinking about NoSQL solutions. The problem is that I only have experience with relational databases and can't really tell whether NoSQL would be a better solution here. Also, which database should I use if I choose to go with NoSQL, and would a key-value store be the best way to go?

I'm guessing data consistency isn't critical here (these are statistics), so you could indeed give up a bit of consistency. NoSQL seems a good choice, and a key-value store would also be my pick. Now the real question is: which one is best suited?
I'd consider Redis and Riak (which are among the most well-known ones):
Riak (AP system) :
Fault-tolerant (masterless with partitioning and replication)
Map reduce
Full text search
BASE
Redis (CP system) :
Really fast
In-memory: you need RAM! That also means you want replication so you don't lose everything on a crash. Redis can also persist to disk (RDB snapshots and an append-only file).
Master/Slave with reelection
BASE
Both have a lot more features, so you should read the documentation for gotchas. Redis is primarily used as a cache since it's fast, whereas Riak focuses on fault tolerance. Given your scalability requirements, both can satisfy your needs, so you must choose according to the trade-offs above.
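To make the key-value idea concrete, here is a minimal sketch of how the "last 10 min ... 24 hours" queries could be served from per-minute buckets. It is plain in-memory Python standing in for whichever store you pick; the class name `ClickWindow`, its methods, and the example host and page are all hypothetical, not part of Redis or Riak.

```python
from collections import defaultdict


class ClickWindow:
    """Per-minute buckets of page views and visitor IDs, keyed by (host, page)."""

    def __init__(self, retention_minutes=24 * 60):
        self.retention = retention_minutes
        # minute index -> (host, page) -> [view count, set of visitor IDs]
        self.buckets = defaultdict(dict)

    def record(self, host, page, visitor_id, ts):
        """Register one click; ts is a Unix timestamp in seconds."""
        minute = int(ts // 60)
        slot = self.buckets[minute].setdefault((host, page), [0, set()])
        slot[0] += 1
        slot[1].add(visitor_id)
        # Drop buckets that have aged out of the retention window.
        for old in [m for m in self.buckets if m <= minute - self.retention]:
            del self.buckets[old]

    def stats(self, host, page, window_minutes, now):
        """Return (pageviews, unique visitors) for the last window_minutes."""
        current = int(now // 60)
        views, visitors = 0, set()
        for minute in range(current - window_minutes + 1, current + 1):
            slot = self.buckets.get(minute, {}).get((host, page))
            if slot:
                views += slot[0]
                visitors |= slot[1]
        return views, len(visitors)


w = ClickWindow()
w.record("a.example.com", "/article/1", "u1", ts=0)
w.record("a.example.com", "/article/1", "u1", ts=30)
w.record("a.example.com", "/article/1", "u2", ts=700)
print(w.stats("a.example.com", "/article/1", 10, now=700))  # (1, 1)
print(w.stats("a.example.com", "/article/1", 20, now=700))  # (3, 2)
```

In Redis the same shape would probably map onto one counter key and one HyperLogLog key per minute (INCR, PFADD, PFCOUNT) with an EXPIRE on each, but treat that as a direction to investigate rather than a tested design. For scale, 9 million clicks a day is roughly 104 writes per second on average.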

Related

Is Mongo DB better than Mysql DB for notification storage and retrieval? [closed]

Closed 4 years ago.
I have a mobile app backend written by someone else in Node (Express). He manages all the data in a MySQL DB but stores notifications (for new customer signups, etc.) in MongoDB. Is there a performance gain, or should I use one DB throughout the project?
MySQL is valued for its flexibility, high performance, reliable data protection, and ease of managing data. Proper indexing can resolve your performance issue, make querying easier, and ensure robustness.
But if your data is unstructured and complex to handle, or if predefining a schema doesn't come easily, you would do better to opt for MongoDB. What’s more, if you need to handle a large volume of data and store it as documents, MongoDB will help you a lot!
The result: One isn’t necessarily better than the other. MongoDB and MySQL both serve in different niches.
Reference
MongoDB is good for handling large amounts of unstructured data. The best thing about MongoDB is that it is not bound to a fixed schema design.
You can use MongoDB to store notifications, since they can number in the billions or trillions, so MongoDB could be the choice for storing and retrieving that data. In my experience, data retrieval is faster in MongoDB than in MySQL.
Checkout the link MongoDB v/s MySql

What engine type would be better in this scenario? [closed]

Closed 7 years ago.
I’m writing an Android app that will sync with a MySQL db on my web server (there will also be a website reading from/writing to the same db). The Android app will store a copy of the data locally in a SQLite db to provide access while offline. If the user creates a row while offline, that record will be uploaded to the server the next time a data connection is available. I’m designing the app and website myself, so I can set it up as I see fit (meaning it doesn’t have to conform to someone else’s server).
The sqlite db will have a column for id (which will represent the id as stored on the server) and a localID column. When the server receives the data, it will acknowledge the new data by returning an array (in json format) of the id numbers as stored on the server.
What would be better for this type of scenario: a transaction-safe engine or non-transaction-safe (such as isam)? It’s my understanding that isam would be faster and take less space to store but I can’t deal with losing data. I’m thinking that if the android app doesn’t receive the confirmation, it would resubmit the data. It seems like that would prevent data loss but I need a second (more-experienced) opinion. If you would go with a transaction-safe db, which would you recommend as I’ve never worked with one?
TIA!
A real database should be your default choice until you've seen that it's not fast enough.
Consider using UUIDs to generate IDs on the client; for practical purposes they will not collide with IDs generated elsewhere, so they can be used as-is on the server.
Have you thought about how you would handle updates from multiple devices that both had off-line changes? You should consider some known patterns for dealing with this kind of synchronization.
Stack Overflow question
Data Replication book
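As a rough illustration of the UUID suggestion combined with the "resubmit if no confirmation arrives" idea from the question, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for the server database, and the `notes` table and the function names are hypothetical. Because the ID is chosen on the device, a repeated upload of the same row is simply ignored instead of creating a duplicate.

```python
import sqlite3
import uuid


def make_server_db():
    """Create a throwaway in-memory database standing in for the server."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    return db


def new_local_row(body):
    """Build a row on the device; the ID is assigned while offline."""
    return {"id": str(uuid.uuid4()), "body": body}


def upload(db, rows):
    """Insert rows, skipping any already stored; return the acknowledged IDs."""
    with db:  # one transaction: commits on success, rolls back on error
        db.executemany(
            "INSERT OR IGNORE INTO notes (id, body) VALUES (:id, :body)", rows
        )
    return [row["id"] for row in rows]


server = make_server_db()
row = new_local_row("written while offline")
upload(server, [row])
upload(server, [row])  # the retry after a lost acknowledgement is harmless
print(server.execute("SELECT COUNT(*) FROM notes").fetchone()[0])  # 1
```

On MySQL the closest equivalents are INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE on a transactional engine such as InnoDB. Note that this only covers newly created rows; conflicting edits from several devices still need one of the synchronization patterns mentioned above.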

What storage system to use for a real time messaging? [closed]

Closed 9 years ago.
I am developing a Real Time messaging application (such as WhatsApp and co) and I am facing a big question.
The application itself is not as complicated as what exists on the market. However, I am not sure what storage system I should use. I have several ideas, but I don't know which one is better than the others:
A simple mysql database with relations between messages/conversations/conversations
A MongoDB store with a copy of each conversation for every user in the conversation
A Redis store with a copy of each conversation for every user in the conversation
I don't know which one is better for what I want to do. If you have some advice, it would help me choose the right solution (or if there is a solution I haven't listed which is even better :) ).
Note: my API is developed in Ruby on Rails (if this helps with the decision)
Data volume and the number of reads/writes should be the key factors in the decision. If neither is going to be huge, MySQL will do. I believe a few TB of data with a few hundred reads/writes per minute is SQL database territory; beyond that, it is the NoSQL world. However, if you choose a NoSQL solution, you should be ready to deal with the increased complexity of non-SQL data store design, query implementation, and eventual consistency. All the best!

Which database engine is best for node.js apps? [closed]

Closed 9 years ago.
I am looking for a database engine which is the best for storing thousands of records. I first wanted to use MySQL, because I know it best, but I'd like to have strong answer.
I need predefined columns, database can be as small as 10 MB or as "big" as 10 GB of data and it would be cool if that engine is fast for reads (insertions may be a bit slower). I don't need fast-fulltext-search or regexp searching. To give you an example - selecting items via slug extracted from link.
I saw this site before but I still don't know what is best option for me.
So here is my question: Which database engine is best for uses like mine?
You should look at the MEAN stack. Personally, I like MongoDB, and I use an object-mapping tool such as Mongoose, which speeds up development considerably. The one thing I really like about having Node.js, Express with body-parser, MongoDB, and Mongoose is that I handle everything on the server side in one language, JavaScript, and I expose REST services that can be consumed by a web application (typically Angular, the A in the MEAN stack, or Backbone).
database can be as small as 10 MB or as "big" as 10 GB of data
At that size, you could use virtually any database you want. Remember, 10 GB of data is small enough to fit into memory on a modern server.
I need predefined columns...
Sounds like SQL. Take your pick: MySQL, PostgreSQL, SQLite... at that size it will barely matter, so just use what you like.
The performance difference on a "few gigs" of data will be negligible.
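For what it's worth, the "select an item by the slug from the link" case from the question looks roughly like this in any SQL database. This is a small sketch using Python's built-in sqlite3; the `items` table, its columns, and the sample rows are made up for illustration. The point is that a unique index on the slug column makes the lookup a single indexed read.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE items (
        id    INTEGER PRIMARY KEY,
        slug  TEXT NOT NULL UNIQUE,  -- UNIQUE creates an index used for lookups
        title TEXT NOT NULL
    )
    """
)
db.executemany(
    "INSERT INTO items (slug, title) VALUES (?, ?)",
    [("hello-world", "Hello, world"), ("second-post", "Second post")],
)


def find_by_slug(slug):
    """Return (id, title) for the given slug, or None if there is no match."""
    return db.execute(
        "SELECT id, title FROM items WHERE slug = ?", (slug,)
    ).fetchone()


print(find_by_slug("hello-world"))  # (1, 'Hello, world')
print(find_by_slug("missing"))      # None
```

The same schema and query carry over to MySQL or PostgreSQL with minor syntax changes.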
Look at MongoDB.
And don't forget to look at TokuMX - it's very promising!

When does a project get too big for mysql [closed]

Closed 3 years ago.
I know Google uses its own Big Tables (is that right?) and Facebook and Twitter use Cassandra, but when does your everyday project outgrow MySQL (if ever)?
If you were starting out on a potentially massive-scale web application, would you use MySQL as the engine or begin with an alternative?
I think the only way to know that MySQL isn't good enough is when you start to see performance issues or feel like you're fighting to keep it going. If you already know that your application is potentially huge, then you should use the right tools from the start; otherwise, migrating at a later date is a huge headache.
There is no simple metric which will tell you the answer - it depends not only on the amount of data, number of transactions but also the nature of the replication - number of replicated sites, required speed of replication etc.
Yes, a large-scale NoSQL cluster can outperform a MySQL cluster built for the same budget for OLTP. However, it's called NoSQL for a reason: when you need to start doing something useful with the data, the relational model and the SQL language make slicing and dicing the data much easier. OTOH, at some point OLAP overtakes the relational model in terms of performance, but I think it would be rather difficult to use a data warehouse for transaction processing.
So it's quite possible that the functional requirements of an application will outgrow the capabilities of a NoSQL database much faster than its performance requirements would outgrow a relational database.
I'd start with an alternative (PostgreSQL), but not because of scaling issues, but because MySQL's support for transactions and referential integrity is worthless.