Anybody use NoSQL / InnoDB with Memcached?
How stable is it? I set it up yesterday and I'm going to test it today, but maybe you can share some knowledge as well?
Not sure what you mean by NoSQL/InnoDB - InnoDB is a storage engine used by MySQL tables and isn't really related to NoSQL key/value stores like Mongo, Redis or CouchDB. If you mean a comparison between the two, here is a basic benchmark of an update statement between Mongo, a major NoSQL platform, and MySQL tables using the InnoDB engine.
http://mysqlha.blogspot.com/2010/09/mysql-versus-mongodb-update-performance.html
That said, most of the NoSQL alternatives have fairly stable libraries at this point. An application my team worked on used memcached alongside Mongo via their Python APIs in a search app, storing query data to train the search results on later. Basically, memcached hashes were stored alongside query data and then called after a result set was picked by the user, in order to refine the results for those searches. We haven't had any problems using the two together, and implementation was a snap.
Most NoSQL engines now use some form of serialized key-value data, commonly a variant of the JSON spec. This generally makes things even easier than the old RDBMS approach of constructing your objects from across multiple tables and running numerous updates for your persistence tier. In the case of Mongo, we handed the whole serialized BSON doc returned from Mongo to memcached for temporary storage, and there were no hiccups at all.
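To make the pattern concrete, here's a minimal sketch of the cache-by-query-hash idea. A plain dict stands in for the memcached client (a real client exposes the same get/set shape), and every name below is illustrative rather than taken from the actual app:

```python
import hashlib
import json

# Stand-in for a memcached client; all names here are illustrative.
cache = {}

def cache_key(query: str) -> str:
    # Hash the raw query so arbitrary text maps to a fixed-length key.
    return "query:" + hashlib.sha1(query.encode("utf-8")).hexdigest()

def store_result(query: str, doc: dict) -> None:
    # Hand the whole serialized document to the cache, as described above.
    cache[cache_key(query)] = json.dumps(doc)

def fetch_result(query: str):
    raw = cache.get(cache_key(query))
    return json.loads(raw) if raw is not None else None

store_result("python books", {"hits": 42, "refined": False})
print(fetch_result("python books"))
```

With a real memcached client you would swap the dict for the client object and set an expiry on each key; the hashing and serialization stay the same.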
This NoSQL thing is pretty cool for those already working with the object paradigm.
I'm doing a project where I have to store data for a NodeJS Express server. It's not a LOT of data, but I have to save it somewhere.
I always hear that a database is good for that kinda stuff, but I thought about just saving all the data in objects in NodeJS and back them up as JSON to disk every minute (or 5 minutes). Would that be a good idea?
What I'm thinking here is that the response time from objects like that is way faster than from a database, and saving them is easy. But then I heard that there are in-memory databases as well, so my question is:
Are in-memory databases faster than javascript objects? Are JSON-based data backups a good idea in this aspect? Or should I simply go with a normal database because the performance doesn't really matter in this case?
Thanks!
If this is nothing but a school assignment or toy project with very simple models and access patterns, then sure, rolling your own data persistence might make sense.
However, I'd advocate for using a database if:
you have a lot of objects or different types of objects
you need to query or filter objects by various criteria
you need more reliable data persistence
you need multiple services to access the same data
you need access controls
you need any other database feature
Since you ask about speed, for trivial stuff, in-memory objects will likely be faster to access. But, for more complicated stuff (lots of data, object relations, pagination, etc.), a database could start being faster.
You mention in-memory databases but those would only be used if you want the database features without the persistence and would be closer to your in-memory objects but without the file writing. So it just depends on if you care about keeping the data or not.
Also if you haven't ever worked with any kind of database, now's a perfect time to learn :).
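If you do go the roll-your-own route, the periodic JSON backup the question describes can be sketched like this (Python here for brevity; the same shape applies in Node, and every name is illustrative):

```python
import json
import os
import tempfile

# In-memory "database": plain objects, backed up to disk as JSON.
state = {"users": [{"name": "alice"}], "visits": 3}

def save(path: str) -> None:
    # Write to a temp file first, then rename, so a crash mid-write
    # never leaves a half-written backup (the rename is atomic on POSIX).
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

def load(path: str) -> dict:
    with open(path) as f:
        return json.load(f)

save("backup.json")
print(load("backup.json") == state)  # → True
```

The write-then-rename step is the part people usually forget: dumping straight onto the live backup file means a crash during the minute-by-minute save can corrupt the only copy.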
What I'm thinking here is that the response time from objects like that is way faster than from a database, and saving them is easy.
That's not true. Databases are persistent storage; there will always be I/O latency. I would recommend MySQL for a SQL database and MongoDB or Cassandra for NoSQL.
An in-memory database is definitely faster, but again, you need persistent storage for that data. Redis is a very popular in-memory database.
MongoDB stores data in BSON, a binary JSON-like format, so it would be a good choice in your case.
This is more of a concept/database architecture related question. In order to maintain data consistency, instead of a NoSQL data store, I'm just storing JSON objects as strings/text in MySQL. So a MySQL row will look like this:
ID, TIME_STAMP, DATA
I'll store JSON data in the DATA field. I won't be updating any rows; instead I'll add new rows with the current timestamp. So when I want the latest data, I just fetch the row with max(timestamp). I'm using Tornado with the Python MySQLdb driver as my primary backend application.
I find this approach very straight forward and less prone to errors. The JSON objects are fairly simple and are not nested heavily.
Is this approach fundamentally wrong? Are there any issues with storing JSON data as text in MySQL, or should I use a file-system-based storage such as HDFS? Please let me know.
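The append-only scheme described above can be sketched with sqlite3 standing in for MySQL/MySQLdb (the SQL shape is the same; table and helper names are illustrative):

```python
import json
import sqlite3

# sqlite3 as a stand-in for MySQL; the table mirrors ID, TIME_STAMP, DATA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshots (id INTEGER PRIMARY KEY, ts REAL, data TEXT)")

def append(ts: float, doc: dict) -> None:
    # Never update: always insert a new row with the current timestamp.
    conn.execute("INSERT INTO snapshots (ts, data) VALUES (?, ?)",
                 (ts, json.dumps(doc)))

def latest() -> dict:
    # Fetch the row with the max timestamp, as described above.
    row = conn.execute(
        "SELECT data FROM snapshots ORDER BY ts DESC LIMIT 1").fetchone()
    return json.loads(row[0])

append(1.0, {"state": "old"})
append(2.0, {"state": "new"})
print(latest())  # → {'state': 'new'}
```

An index on the timestamp column keeps the `ORDER BY ts DESC LIMIT 1` lookup fast as the table grows.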
MySQL, as you probably know, is a relational database management system. It is designed to be used in a way where data is related through keys, forming relations which can then be used for complex retrieval of data. Your method will technically work (and be quite fast), but will probably considerably impair your ability to leverage the technology you're using, should you expand the scope of your project.
I would recommend a database like Redis or MongoDB, as they are designed for key-value and document storage rather than relational architectures.
That said, if you find the approach works fine for what you're building, just go ahead. You might face some blockers up ahead if you want to add complexity to your solution but either way, you'll learn something new! Good luck!
Pradeeb, to help answer your question, you need to analyze your use case. What kind of data are you storing? For me, this would be the deciding factor: every technology has a specific use case where it excels.
I think it is safe to assume that you use JSON because your data structure needs to be very flexible, compared to a traditional relational DB. There are certain data stores that natively support such data structures, such as MongoDB (they call it "binary JSON", or BSON), as Phil pointed out. This could give you improved storage and/or improved search capabilities. Again, the utility depends entirely on your use case.
If you are looking for something like a job queue, horizontal scalability is not an issue, and you just need fast access to the latest data, you could use Redis, an in-memory key-value store that has a hash (associative array) data type and lists for this kind of thing. Alternatively, since you mentioned HDFS and horizontal scalability may very well be an issue, I can recommend queue systems like Apache ActiveMQ or RabbitMQ.
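The Redis-list-as-queue idea can be sketched with a deque standing in for the Redis client (LPUSH and RPOP are real Redis commands; everything else here is illustrative):

```python
from collections import deque

# deque standing in for a Redis list: LPUSH ~ appendleft, RPOP ~ pop.
jobs = deque()

def lpush(item: str) -> None:
    # Push new jobs onto the left end of the list.
    jobs.appendleft(item)

def rpop() -> str:
    # Pop the oldest job off the right end.
    return jobs.pop()

lpush("job-1")
lpush("job-2")
print(rpop())  # → 'job-1' (FIFO: oldest job comes out first)
```

With a real Redis client the calls become `r.lpush("jobs", item)` and `r.rpop("jobs")`, and the blocking variant BRPOP lets workers wait for new jobs instead of polling.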
Lastly, if you are writing heavily, and you are not client-limited but your data storage is your bottleneck: look into distributed, flexible-schema data stores like HBase or Cassandra. They offer flexible data schemas, are heavily write-optimized, and data can be appended and remains in chronological order, so you can fetch the newest data efficiently.
Hope that helps.
This is not a problem. You can also use the memcached storage engine in modern MySQL, which would be perfect, although I have never tried that.
Another approach is to use memcached as a cache. Write everything to both memcached and MySQL. When you go to read data, try reading from memcached first. If it does not exist there, read from MySQL. This is a common technique for reducing database bottlenecks.
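A minimal sketch of that read-through pattern, with plain dicts standing in for the memcached and MySQL clients (all names are illustrative):

```python
# Dicts stand in for the memcached and MySQL clients.
memcached = {}
mysql = {"user:1": "alice"}  # pretend backing store

def write(key, value):
    # Write-through: update both the cache and the database.
    mysql[key] = value
    memcached[key] = value

def read(key):
    # Try the cache first; on a miss, fall back to the database
    # and populate the cache for next time.
    if key in memcached:
        return memcached[key]
    value = mysql.get(key)
    if value is not None:
        memcached[key] = value
    return value

print(read("user:1"))     # → 'alice' (cache miss, then cached)
print("user:1" in memcached)  # → True
```

In production you would also set an expiry on cached keys and decide how to invalidate them when rows change; the basic miss-then-populate flow stays the same.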
I'm creating a search engine for deals, discounts and coupons. First, my engine collects deals from some sites and writes those deals into a database. So the records look like:
records: name, discount, price, latitude, longitude
Right now I'm using MySQL, but will my search engine be faster if I use MongoDB, since all the results are in a similar JSON format?
What is the better solution if I have 1,000,000 records, MySQL or MongoDB? I need faster searching.
http://test.pluspon.com
For your use case MongoDB would indeed be faster.
You can easily implement processing with multiple mongos processes in a sharded environment; there would not be any blocking, and you'd get even more performance for your use case.
But keep in mind that speed benchmarks and fast data processing are not the only things you should care about. MongoDB is still very young compared to more mature enterprise databases. But for your use case I would advise going with it.
Also, as commented, there are other NoSQL databases that could serve you even better in some cases. Read up on this blog for more understanding.
I have been working to learn MongoDB for one week in order to use it for my project. In my project, I will store a huge amount of geolocation data, and I think MongoDB is the most appropriate store for this information. In addition, speed is very important for me, and MongoDB responds faster than MySQL.
However, I will need some joins for some parts of the project, and I'm not sure whether to store users' information in MongoDB or not. I heard some issues can occur in MongoDB during the writing process. Should I use only MongoDB with collections (instead of joins), or both of them?
In most situations I would recommend choosing one DB for a project, if the project is not huge. On really big projects (or in enterprises in general), I think long term organizations will use a combination of:
RDBMS for highly transactional OLTP
NoSQL
a data warehousing/BI project
But for things of more reasonable scope, just pick the one that does the core of the use case, and use it for everything.
IMO storing user data in MongoDB is fine -- you can do atomic operations on single BSON documents, so operations like "allocate me this username atomically" are doable. With redo logs (--journal) (v1.8+), replication, and slave-delayed replication, it is possible to have a pretty high degree of data safety -- as high as other db products on paper. The main argument against its safety would be that the product is new, and old software is always safer.
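The "allocate me this username atomically" idea can be illustrated without a running MongoDB; here a dict's setdefault plays the role of the atomic single-document insert (in real MongoDB you would back this with a unique index on the username field):

```python
# setdefault stands in for MongoDB's atomic insert-if-absent: it only
# stores user_id if the name is not already taken, in one step.
usernames = {}

def allocate(name: str, user_id: int) -> bool:
    # Returns True only for the caller that actually claimed the name
    # (or for a repeat call by the same owner).
    return usernames.setdefault(name, user_id) == user_id

print(allocate("alice", 1))  # → True  (claimed)
print(allocate("alice", 2))  # → False (already taken)
```

The point is that check and set happen as one operation, so two concurrent signups can never both win the same name.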
If you need to do very complex ACID transactions -- such as accounting -- use an RDBMS.
Also, if you need to do a lot of reporting, MySQL may be better at the moment, especially if the data set fits on one server. The SQL GROUP BY statement is quite powerful.
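A small example of the kind of GROUP BY reporting meant here, using sqlite3 as a stand-in for MySQL (the aggregate syntax is the same; table and column names are made up):

```python
import sqlite3

# sqlite3 stand-in: aggregate rows per group with SUM and GROUP BY.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10), ("north", 5), ("south", 7)])

totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # → [('north', 15), ('south', 7)]
```

This single statement replaces the map/reduce-style aggregation you would have had to write by hand in MongoDB at the time.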
You won't be JOINing between MongoDB and MySQL.
I'm not sure I agree with all of your statements. Relative speed is something that's best benchmarked with your use case.
What you really need to understand is what the relative strengths and weaknesses of the two databases are:
MySQL supports the relational model, sets, and ACID; MongoDB does not.
MongoDB is better suited for document-based problems that can afford to forego ACID and transactions.
Those should be the basis for your choice.
MongoDB has some nice features in to support geo-location work. It is not however necessarily faster out of the box than MySQL. There have been numerous benchmarks run that indicate that MySQL in many instances outperforms MongoDB (e.g. http://mysqlha.blogspot.com/2010/09/mysql-versus-mongodb-yet-another-silly.html).
Having said that, I've yet to have a problem with MongoDB losing information during writes. I would suggest that if you want to use MongoDB, you use it for the users as well, which will avoid having to do cross-database 'associations', and then only migrate the users to MySQL later if it becomes necessary.
I am new to MySQL and I am looking for some answers to the following questions:
a) Can MySQL community server be leveraged for a key-value pair type database?
b) Which MySQL engine is best suited for a key-value pair type database?
c) Is MySQL Cluster a must for horizontal scaling of a key-value-based datastore, or can it be achieved using MySQL replication?
d) Are there any docs or whitepapers on best practices for implementing a key-value datastore on MySQL?
e) Are there any known big implementations other than friendfeed doing key-value pair using MySQL?
Any relational database can provide a key-value store, but it's not what they're for, and they aren't good at it when compared to native key-value databases like Cassandra.
If your requirements aren't extreme, your best bet would be MyISAM, as it's probably the fastest, and transaction support is not (high) on the priority list for key-value workloads.
If you're only doing key-value stuff, you might want to check out HandlerSocket for MySQL. I haven't used it, but this overview shows installation and usage. Basically, it strips out all the relational stuff (query parsing, joins, etc.) and uses MySQL's storage engine directly, making it extremely fast (but suitable only for key-value access).
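The key-value access pattern itself can be sketched in plain SQL; sqlite3 stands in for MySQL here, and note that MySQL's own upsert syntax is `INSERT ... ON DUPLICATE KEY UPDATE` rather than the `ON CONFLICT` form below (all names are illustrative):

```python
import sqlite3

# A relational engine reduced to a single key-value table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

def put(k: str, v: str) -> None:
    # Upsert: overwrite the value if the key already exists.
    conn.execute("INSERT INTO kv (k, v) VALUES (?, ?) "
                 "ON CONFLICT(k) DO UPDATE SET v = excluded.v", (k, v))

def get(k: str):
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (k,)).fetchone()
    return row[0] if row else None

put("session:1", "abc")
put("session:1", "xyz")
print(get("session:1"))  # → 'xyz'
```

HandlerSocket's speedup comes from serving exactly these primary-key get/put operations while skipping SQL parsing and optimization entirely.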
I know I am late to the game here, but MySQL 5.6 actually has changes that allow it to act as a key-value store with memcached built in. It looks really slick:
NoSQL Interface via memcached